PR Submission #88

Open
reddymh opened this issue Sep 1, 2022 · 3 comments
Comments

@reddymh

reddymh commented Sep 1, 2022

Hi Team,

I have made code changes for the following tasks:

  1. Pods that are stuck in the Terminating state and need a graceful delete, selected by age/time
  2. Pods that are in the Error/ContainerStatusUnknown/OOMKilled/Terminated/Completed state (a running pod sometimes changes to Completed because of node re-creation or preemptible nodes), selected by age/time
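
To make the two selection rules concrete, here is a minimal sketch of age-based filtering. It is not the code from the proposed PR: `PodInfo`, its fields, and the function names are hypothetical, and how a pod's display state is derived from its real status is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# States named in task 2 of the issue.
CLEANUP_STATES = {"Error", "ContainerStatusUnknown", "OOMKilled", "Terminated", "Completed"}


@dataclass
class PodInfo:
    """Hypothetical, simplified view of a pod (not a Kubernetes API type)."""
    name: str
    state: str                                     # display state, e.g. "Completed"
    state_since: datetime                          # when the pod entered that state
    deletion_requested: Optional[datetime] = None  # set once a delete was requested


def is_stuck_terminating(pod, now, max_age):
    """Task 1: a delete was requested more than max_age ago and the pod still exists."""
    return pod.deletion_requested is not None and now - pod.deletion_requested > max_age


def is_stale(pod, now, max_age):
    """Task 2: the pod has been in one of the cleanup states for more than max_age."""
    return pod.state in CLEANUP_STATES and now - pod.state_since > max_age


def select_for_cleanup(pods, now, terminating_age, stale_age):
    """Return the names of pods matched by either rule, in input order."""
    return [p.name for p in pods
            if is_stuck_terminating(p, now, terminating_age) or is_stale(p, now, stale_age)]


now = datetime(2022, 9, 5, 12, 0, tzinfo=timezone.utc)
pods = [
    PodInfo("job-a", "Completed", now - timedelta(hours=3)),
    PodInfo("job-b", "Completed", now - timedelta(minutes=5)),
    PodInfo("web-1", "Running", now - timedelta(days=2)),
    PodInfo("web-2", "Running", now - timedelta(days=2),
            deletion_requested=now - timedelta(hours=1)),
]
print(select_for_cleanup(pods, now, timedelta(minutes=30), timedelta(hours=1)))
# ['job-a', 'web-2']
```

Keeping the two rules as separate predicates with separate age thresholds means each one can be enabled or tuned on its own.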

Can I raise a PR for these?

Thanks,
Raj

@reddymh
Author

reddymh commented Sep 5, 2022

@lwolf Can I raise a PR for the above use cases?

@lwolf
Owner

lwolf commented Sep 5, 2022

Hi,
AFAIR pods stuck in weird states like Terminating/Unknown can't be deleted without a force delete, otherwise they wouldn't be "stuck". Using "force" usually hides the real issue, so I'd prefer not to have things that may result in an inconsistent state of the cluster.
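
For readers unfamiliar with the distinction, the difference between a graceful and a force delete comes down to the grace period sent with the delete request. The sketch below builds a `DeleteOptions`-style request body as a plain dict; the helper name `delete_options` is hypothetical, and the body is only illustrative of the `gracePeriodSeconds` field, where zero means "delete immediately".

```python
def delete_options(force=False, grace_period_seconds=None):
    """Build a DeleteOptions-style body (hypothetical helper, plain dict only).

    - force=True sets gracePeriodSeconds to 0, i.e. delete immediately.
    - Otherwise the field is left unset unless a value is given, so the pod's
      own termination grace period applies.
    """
    if grace_period_seconds is not None and grace_period_seconds < 0:
        raise ValueError("grace_period_seconds must be non-negative")
    body = {"kind": "DeleteOptions", "apiVersion": "v1"}
    if force:
        body["gracePeriodSeconds"] = 0
    elif grace_period_seconds is not None:
        body["gracePeriodSeconds"] = grace_period_seconds
    return body


print(delete_options())                         # graceful, default grace period
print(delete_options(grace_period_seconds=30))  # graceful, explicit grace period
print(delete_options(force=True))               # force delete
```

A pod that is already Terminating has had a graceful delete requested once, which is why repeating that request may not be enough to remove it.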

Removing Completed pods sounds reasonable.
Regarding the others, I'm not really sure, but if you have already done some coding, please share it and we can talk more about it.

@reddymh
Author

reddymh commented Sep 5, 2022

@lwolf Recently we faced an issue where pods were stuck in Terminating status and some high-priority-class pods, such as the calico daemon set, were left in Pending state (because of the pod limit per node). We then updated the cleanup operator to take care of stuck Terminating pods by age, using a graceful delete.

As for other statuses like Completed: some pods move to other nodes due to autoscaling up/down, but because of some issue the pods go into the Completed/Error state.

I will raise a PR for the second use case. The first use case would be helpful when there are scheduler issues or pods do not terminate properly, and we can put it behind a flag so that it can be enabled only when required.
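
One way the opt-in flag could look is sketched below. The flag names are hypothetical and are not existing options of the cleanup operator; the sketch only shows the idea of keeping Terminating-pod cleanup off unless a positive age is configured.

```python
import argparse


def build_parser():
    """Command-line parser with hypothetical flags for the two use cases."""
    parser = argparse.ArgumentParser(description="pod cleanup sketch (hypothetical flags)")
    parser.add_argument(
        "--delete-terminating-after-minutes", type=int, default=0,
        help="delete pods stuck in Terminating for longer than this; 0 disables (default)")
    parser.add_argument(
        "--delete-completed-after-minutes", type=int, default=60,
        help="delete Completed/Error pods older than this")
    return parser


def terminating_cleanup_enabled(args):
    """Terminating-pod cleanup is opt-in: only active when a positive age is given."""
    return args.delete_terminating_after_minutes > 0


default_args = build_parser().parse_args([])
opted_in = build_parser().parse_args(["--delete-terminating-after-minutes", "30"])
print(terminating_cleanup_enabled(default_args))  # False
print(terminating_cleanup_enabled(opted_in))      # True
```

Defaulting the riskier behavior to off keeps existing deployments unchanged while letting affected clusters enable it explicitly.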
