
KEP-5032: Container log Split and Rotation to avoid Disk pressure #5022

Open · wants to merge 13 commits into base: master

Conversation


@Zeel-Patel Zeel-Patel commented Jan 7, 2025

  • One-line PR description: Rotate and clean container logs when there is disk pressure on the kubelet host.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 7, 2025
@k8s-ci-robot
Contributor

Welcome @Zeel-Patel!

It looks like this is your first PR to kubernetes/enhancements 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/enhancements has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 7, 2025
@k8s-ci-robot
Contributor

Hi @Zeel-Patel. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory label Jan 7, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Zeel-Patel
Once this PR has been reviewed and has the lgtm label, please assign dchen1107 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 7, 2025
@Zeel-Patel
Author

/sig-node


## Design Details

Define 2 new flags `logRotateDiskCheckInterval`, `logRotateDiskPressureThreshold` in kubelet config.
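For illustration, a minimal Go sketch of how the two proposed settings could sit alongside the existing log-rotation settings; the field names and types are assumptions, not the actual `KubeletConfiguration` API:

```go
// Hypothetical illustration only: the field names and types below are
// assumptions, not the kubelet's actual KubeletConfiguration API.
package config

import "time"

type ContainerLogRotationSettings struct {
	// Existing behaviour: rotate the active log once it exceeds this size (e.g. "200Mi").
	ContainerLogMaxSize string
	// Existing behaviour: keep at most this many log files per container.
	ContainerLogMaxFiles int32

	// Proposed: how often the ContainerLogManager checks disk usage on the kubelet host.
	LogRotateDiskCheckInterval time.Duration
	// Proposed: disk usage fraction (0.0-1.0) above which a forced rotation/cleanup is triggered.
	LogRotateDiskPressureThreshold float64
}
```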
Contributor

I was thinking that when there is disk pressure in nodefs.available/imagefs.available we can add rotateLogs to our list of functions we run in case of disk pressure.

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/eviction/helpers.go#L1195-#L1230
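A rough sketch of that idea, using simplified stand-in types rather than the real eviction-manager plumbing in helpers.go; it only illustrates where a log-cleanup step could be appended to the reclaim functions that run on nodefs/imagefs pressure:

```go
// Simplified stand-ins; the kubelet's real nodeReclaimFunc plumbing differs.
package eviction

import "context"

// nodeReclaimFunc frees node-level resources before pod eviction is considered.
type nodeReclaimFunc func(ctx context.Context) error

// buildReclaimFuncs sketches appending a log-cleanup step so it runs whenever
// nodefs/imagefs disk pressure is detected, before any pods are evicted.
func buildReclaimFuncs(deleteUnusedContainers, deleteUnusedImages, rotateOversizedLogs nodeReclaimFunc) []nodeReclaimFunc {
	return []nodeReclaimFunc{
		deleteUnusedContainers, // existing: container garbage collection
		deleteUnusedImages,     // existing: image garbage collection
		rotateOversizedLogs,    // proposed: rotate/clean container logs over their configured budget
	}
}
```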

Author
@Zeel-Patel Zeel-Patel Jan 8, 2025

That's better. Thanks.
But won't that use kubelet eviction thresholds to identify disk pressure?

I suppose this log cleanup is another form of eviction to make space, like we do with images and containers.

So, adding that to the nodefs.available / imagefs.available handling would logically be a great addition.

Contributor

big +1, this was my initial thinking as well. Detecting disk pressure should be a trigger for log rotation in addition to the timer-based approach.

Author
@Zeel-Patel Zeel-Patel Jan 9, 2025

Sure.

Considering this
#5022 (comment)

If we rely on the same kubelet eviction thresholds, log rotation may still fail. We have experienced this ourselves: creating the log copy fails.
Open to suggestions.

Author

@ffromani @harshanarayana @kannon92

Listing down a few options, let me know your thoughts:

  • We use the same disk eviction thresholds. Every time there is disk pressure, check the size of the .gz log files of all containers. If the combined file size (for a particular container) exceeds containerLogMaxSize * containerLogMaxFiles, simply delete older .gz files until the remaining size is within this limit. No log rotation. So if we have containerLogMaxSize = 20MiB and containerLogMaxFiles = 6, we ensure that the combined log file size stays within 120MiB. It could also happen that a single .gz file ends up taking all 120MiB after cleanup (see the sketch after this list).
  • Blindly do log rotation as proposed in the KEP (but using the disk pressure thresholds as suggested by you all).
  • Make the change suggested in option 1 in the ContainerLogManager periodic log rotation itself.
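A minimal sketch of option 1 (a hypothetical helper, not kubelet code): for one container's log directory, delete the oldest rotated `.gz` files until the combined size fits within `containerLogMaxSize * containerLogMaxFiles`:

```go
// Illustrative sketch only (hypothetical helper, not kubelet code): delete the
// oldest rotated (.gz) logs of one container until their combined size fits
// within containerLogMaxSize * containerLogMaxFiles.
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

func pruneRotatedLogs(containerLogDir string, maxSize, maxFiles int64) error {
	paths, err := filepath.Glob(filepath.Join(containerLogDir, "*.gz"))
	if err != nil {
		return err
	}
	type entry struct {
		path string
		info os.FileInfo
	}
	var entries []entry
	var total int64
	for _, p := range paths {
		info, err := os.Stat(p)
		if err != nil {
			continue // the file may have been removed concurrently
		}
		entries = append(entries, entry{p, info})
		total += info.Size()
	}
	// Oldest first, so the least recent logs are dropped first.
	sort.Slice(entries, func(i, j int) bool {
		return entries[i].info.ModTime().Before(entries[j].info.ModTime())
	})
	budget := maxSize * maxFiles
	for _, e := range entries {
		if total <= budget {
			break
		}
		if err := os.Remove(e.path); err != nil {
			return err
		}
		total -= e.info.Size()
	}
	return nil
}

func main() {
	// Example budget: 20 MiB per file * 6 files = 120 MiB (path is hypothetical).
	if err := pruneRotatedLogs("/var/log/pods/example-pod/app", 20<<20, 6); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```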

Contributor
@kannon92 kannon92 Jan 10, 2025

My suggestion is that when disk pressure is hit (by our eviction settings), we would trigger your logic for forcing rotations on any logs that exceed the limits.

Though the fact that they exceed the limits seems to be a bug..

The goal would be to preemptively try to clean up disk space without evicting.

Author

Looking at this goal, this seems good to me:

We use the same disk eviction thresholds. Every time there is disk pressure, check the size of the .gz log files of all containers. If the combined file size (for a particular container) exceeds containerLogMaxSize * containerLogMaxFiles, simply delete older .gz files until the remaining size is within this limit. No log rotation. So if we have containerLogMaxSize = 20MiB and containerLogMaxFiles = 6, we ensure that the combined log file size stays within 120MiB. It could also happen that a single .gz file ends up taking all 120MiB after cleanup.

@kannon92
Contributor

kannon92 commented Jan 7, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 7, 2025

@harshanarayana harshanarayana left a comment

Leaving behind a few comments. Feel free to ignore them if you consider them irrelevant or invalid.


Define 2 new flags `logRotateDiskCheckInterval`, `logRotateDiskPressureThreshold` in kubelet config.

- `logRotateDiskCheckInterval` is the time interval at which the ContainerLogManager will check disk usage on the kubelet host.

Here are my two cents.

I was the one who added a similar config to the log rotation workflow, where it can take a configurable number of workers with a monitoring duration and rotate logs asynchronously, to help reduce log files growing past their size limit when the container log generation rate is way too high.

That definitely helps, but it never really prevents the issue. The safest way to deal with this would be to truncate and cap the size while the logs are being written, instead of monitor, rotate and clean up. That would be the most foolproof way to do this.

That being said, in the absence of that workflow, I like your suggestion.

it will rotate logs of all the containers of the kubelet.

However, I am not sure all services/containers should be taxed because of one service's misbehaviour, given how the kubectl logs output would look after one such global rotation.

As far as I can recall, the .gz files are ignored when kubectl logs is used. So, forcing a global rotation can lead to logs going missing from kubectl logs output out of nowhere.

We should still clean up only those logs that have exceeded the configured threshold in this loop, so that we let the other services behave the way they do.

Author

Yes, that's what we are proposing.
Currently, log files are exceeding the set thresholds anyway, which causes high disk usage.
Instead of evicting pods, we can first try rotating the logs of such containers (containers whose log size exceeds the containerLogMaxSize threshold).



Comment on lines 170 to 173
```
containerLogMaxSize = 200M
containerLogMaxFiles = 6
```

FYI, there are also two other fields to configure the workers and the monitoring interval used to perform the cleanup:

containerLogMaxWorkers and containerLogMonitorInterval respectively.

Comment on lines 259 to 262


If the pod generates logs at gigabyte scale with minimal delay, it can cause disk pressure on the kubelet host, which can affect other pods running on the same node.

This can also cause subsequent container log rotations to fail, making the problem even worse.

Case in point, an issue we ran into very recently:

Dec 11 21:35:31 tardis-node-192-168-4-51 kubelet[8204]: E1211 21:35:31.794811 8204 container_log_manager.go:296] "Failed to rotate log for container" err="failed to compress log "/var/log/pods/tardis-pod-2-5bb1ae31-tm-78475757b8-2mbl9_f53b5f87-5b48-4748-a919-9eb702bb5f0c/taskmanager/2.log.20241211-202335": failed to create temporary log "/var/log/pods/tardis-pod-2-5bb1ae31-tm-78475757b8-2mbl9_f53b5f87-5b48-4748-a919-9eb702bb5f0c/taskmanager/2.log.20241211-202335.tmp": open /var/log/pods/tardis-pod-2-5bb1ae31-tm-78475757b8-2mbl9_f53b5f87-5b48-4748-a919-9eb702bb5f0c/taskmanager/2.log.20241211-202335.tmp: disk quota exceeded" path="/var/log/pods/tardis-pod-2-5bb1ae31-tm-78475757b8-2mbl9_f53b5f87-5b48-4748-a919-9eb702bb5f0c/taskmanager/2.log" containerID="2c7de6fdcbda4498543ab3cc68d59dcee487e08ad7d37751f9ea5c366975f784"

Author
@Zeel-Patel Zeel-Patel Jan 8, 2025

Yes, we have faced this issue under disk pressure as well, but the periodic log rotation is already failing anyway.
If we choose `logRotateDiskPressureThreshold` wisely, it would help.
wdyt?

It should definitely help. But choosing the right size and monitoring frequency is trial and error, and I can't think of an effective way to determine the number. That is beyond the scope of this KEP anyway.


### Goals

- Rotate and Clean all container logs on kubelet Disk pressure

Would there be a configuration required to help identify which logs are to be selected for such cleanup?

Author

Containers for which log size is exceeding containerLogMaxSize will be selected for cleanup.

@Zeel-Patel Zeel-Patel changed the title KEP-4819: Container log rotation on Disk perssure KEP-129447: Container log rotation on Disk perssure Jan 8, 2025
editor: "@Zeel-Patel"
creation-date: 2025-01-08
last-updated: 2025-01-08
reviewers:
Contributor

please add @kannon92 and @ffromani for starters

Comment on lines 1 to 35
<!--
**Note:** When your KEP is complete, all of these comment blocks should be removed.

To get started with this template:

- [ ] **Pick a hosting SIG.**
Make sure that the problem space is something the SIG is interested in taking
up. KEPs should not be checked in without a sponsoring SIG.
- [ ] **Create an issue in kubernetes/enhancements**
When filing an enhancement tracking issue, please make sure to complete all
fields in that template. One of the fields asks for a link to the KEP. You
can leave that blank until this KEP is filed, and then go back to the
enhancement and add the link.
- [ ] **Make a copy of this template directory.**
Copy this template into the owning SIG's directory and name it
`NNNN-short-descriptive-title`, where `NNNN` is the issue number (with no
leading-zero padding) assigned to your enhancement above.
- [ ] **Fill out as much of the kep.yaml file as you can.**
At minimum, you should fill in the "Title", "Authors", "Owning-sig",
"Status", and date-related fields.
- [ ] **Fill out this file as best you can.**
At minimum, you should fill in the "Summary" and "Motivation" sections.
These should be easy if you've preflighted the idea of the KEP with the
appropriate SIG(s).
- [ ] **Create a PR for this KEP.**
Assign it to people in the SIG who are sponsoring this process.
- [ ] **Merge early and iterate.**
Avoid getting hung up on specific details and instead aim to get the goals of
the KEP clarified and merged quickly. The best way to do this is to just
start with the high-level sections and fill out details incrementally in
subsequent PRs.

Just because a KEP is merged does not mean it is complete or approved. Any KEP
marked as `provisional` is a working document and subject to change. You can
denote sections that are under active debate as follows:
Contributor

feel free to remove comments when no longer needed, to make review (and reading in general) easier


Rotate container logs when there is disk pressure on the kubelet host.

## Motivation
Contributor

we will need something here

Comment on lines 269 to 276
<!--
This is where we get down to the specifics of what the proposal actually is.
This should have enough detail that reviewers can understand exactly what
you're proposing, but should not include things like API designs or
implementation. What is the desired outcome and how do we measure success?.
The "Design Details" section below is for the real
nitty-gritty.
-->
Contributor

we will need some details here

Contributor

Brainstorming, let's evaluate some ideas:

  • If the log rotation manager detects that the previous X rotations (1? 2? N?) made the rotated logs exceed the configured max size, because containers are producing many logs continuously, should this cause the kubelet to report disk pressure?
  • Do we want or need to review log retention also? If a rotated log is, say, 500 megs while it is supposed to be 100 megs (random example numbers), should that cause the kubelet to report disk pressure? Arguably, we are consuming more disk space than the user configured, and thus expects.

I am not sure these should trigger a disk pressure report. That could have side effects, right?

However, this could definitely be turned into a warning event at the node level to indicate that something is going wrong. Would that lead to an event storm in some cases? It depends on how frequently we generate the event.

if the log rotation manager detects that the previous X rotations (1? 2? N?) made the rotated logs exceed the configured max size

This problem becomes even more concerning with a high value set for containerLogMaxFiles.


editor: "@Zeel-Patel"
creation-date: 2025-01-08
last-updated: 2025-01-08
reviewers:
Contributor

please also add @harshanarayana


### Risks and Mitigations

No identified risk.
Contributor

let's think a bit more about this, especially if we start from beta. What can go wrong?

Comment on lines 347 to 348
- Will enabling / disabling the feature require downtime or reprovisioning
of a node? Yes
Contributor

will it? why exactly? could you please elaborate?

  1. I suppose this is a function of what your cloud provider lets you do in terms of tuning the kubelet configuration.
  2. For on-prem deployments, this is definitely just a kubelet restart away. So, not much of a downtime/re-provisioning concern.

Contributor

The cloud provider use case is interesting indeed, but from the project's perspective this looks like the bare-metal use case anyway. I think a good first step is just adding more detail to the answer here.

@kannon92
Contributor

kannon92 commented Jan 8, 2025

@Zeel-Patel Please create an issue in kubernetes/enhancements that tracks this KEP. That would be the title of the KEP.

The k/k issue is used as a discussion to create the KEP. Sorry to be a pain about this.


### Goals

- Rotate and Clean all container logs on kubelet Disk pressure

Suggested change
- Rotate and Clean all container logs on kubelet Disk pressure
- Rotate and Clean all container logs on kubelet Disk pressure that has exceeded the configured log retention quota


### What would you like to be added?

The [ContainerLogManager](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/logs/container_log_manager.go#L52-L60), responsible for log rotation and cleanup of log files of containers periodically, should also rotate logs of all containers in case of disk pressure on host.

Suggested change
The [ContainerLogManager](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/logs/container_log_manager.go#L52-L60), responsible for log rotation and cleanup of log files of containers periodically, should also rotate logs of all containers in case of disk pressure on host.
The [ContainerLogManager](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/logs/container_log_manager.go#L52-L60), responsible for log rotation and cleanup of log files of containers periodically, should also rotate logs of all containers that has exceeded the configured log retention quota in case of disk pressure on host.

Comment on lines 267 to 276
## Proposal

<!--
This is where we get down to the specifics of what the proposal actually is.
This should have enough detail that reviewers can understand exactly what
you're proposing, but should not include things like API designs or
implementation. What is the desired outcome and how do we measure success?.
The "Design Details" section below is for the real
nitty-gritty.
-->

Some of the details from the KEP should go into this section.

No

###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
ContainerLogManager of kubelet will use more CPU cycle then now.

Suggested change
ContainerLogManager of kubelet will use more CPU cycle then now.
CPU cycles usage of ContainerLogManager of kubelet will increase.

@Zeel-Patel Zeel-Patel changed the title KEP-129447: Container log rotation on Disk perssure KEP-5032: Container log rotation on Disk perssure Jan 9, 2025
@Zeel-Patel Zeel-Patel requested a review from ffromani January 10, 2025 17:31
@sftim

sftim commented Jan 13, 2025

AIUI there is no safe way to have a process ship logs; it is always possible to have a log volume that fills either local storage or a local buffer. There's no safe means because the rate of application log writes is formally not bounded, whereas buffers and local storage are always finite.

If we provide a way for container log writes to go to a socket or other file descriptor not backed by local storage, there is no need to rotate. That approach doesn't prevent log entries being dropped but it is proof against any local storage exhaustion.

@harshanarayana

harshanarayana commented Jan 13, 2025

@pohly

This is only safe if the process which writes the log data is also the one which rotates because it knows the message boundaries.

Absolutely. Couldn't agree more on this.

I noticed that PR a couple of days ago, but how does that help for services that do not really use that binary to run their process? For those, it is containerd that is doing the write, correct? https://github.com/containerd/containerd/blob/main/internal/cri/io/logger.go (is my understanding of this incorrect?)

For example, containerd exposes a knob max_container_log_line_size; would it make this better if the rotation were done there? Would that give us slightly more precision on how accurately we meet the log-size-per-file configuration?

If we provide a way for container log writes to go to a socket or other file descriptor not backed by local storage, there is no need to rotate.

@sftim Most definitely, but this might not always be possible for everyone to configure, even if provided as an option. I still remember the days when docker had a whole slew of plugins one could configure for log management to help avoid writing to local storage. Or you could create one yourself.

@pohly
Contributor

pohly commented Jan 13, 2025

I noticed that PR a couple of days ago but how does that help for service that do not really use that bin to run their process?

It doesn't. An app has to be configured explicitly to redirect output to its own files and then not write anything to stdout/stderr... which breaks kubectl logs. I still need to review that PR again in more detail; your comments on the design and usage would be very welcome.

@sftim

sftim commented Jan 13, 2025

whole slew of plugins one can configure for log management

I'd definitely support a KEP to make container logging more extensible / pluggable. The story we have now is OK but not great.

@Zeel-Patel
Author

Zeel-Patel commented Jan 13, 2025

Honestly, the safest I can think of for mitigating this would be to find a mechanism to truncate at the source ensuring we do not run into partial logs being written and then kubelet can just do the count based cleanup.

  • On disk pressure, analyse log paths of all containers. Logs of containers with existing combined log size exceeding containerLogMaxSize * containerLogMaxFiles should be deleted till the combined log size is within containerLogMaxSize * containerLogMaxFiles.

what do you think about this proposal?

@Zeel-Patel
Author

@sftim @pohly @harshanarayana @kannon92 @ffromani
Hi folks, does this KEP need any change to get approval or are we not planning to make any change?

@Zeel-Patel Zeel-Patel changed the title KEP-5032: Container log rotation on Disk pressure KEP-5032: Container log eviction on Disk pressure Jan 23, 2025
@harshanarayana

harshanarayana commented Jan 24, 2025

On disk pressure, analyse log paths of all containers. Logs of containers with existing combined log size exceeding containerLogMaxSize * containerLogMaxFiles should be deleted till the combined log size is within containerLogMaxSize * containerLogMaxFiles.

@Zeel-Patel Yes. But that could also mean that we might end up removing all logs from some container. Right?

Say a service is generating 10G of data per second and the configured retention was 3 files with 100M each. So, if your log files are all above 300M (which is very likely to happen with such a high log write rate), you will end up removing all 3 files, which might be against the expected behavior. How do we protect against such cleanup? Or should we even protect against such cleanup?

@harshanarayana

I'd definitely support a KEP to make container logging more extensible / pluggable. The story we have now is OK but not great.

@sftim Would this be a kubelet-side configuration or a containerd-side config? It would fit more on the containerd side, right? Say, a configurable log write destination that is not just file based, or a more controlled way to write to a file-based destination with some more guard rails?

@sftim

sftim commented Jan 24, 2025

For example, kubelet could send a memfd to the container runtime per container, which gets used as a ring buffer.

Handily, the kubelet can pass a read-only capability for the same memfd to a log helper daemon, with a second memfd used to mark the ring locations.

There are lots of viable designs.

@Zeel-Patel
Author

Zeel-Patel commented Jan 27, 2025

On disk pressure, analyse log paths of all containers. Logs of containers with existing combined log size exceeding containerLogMaxSize * containerLogMaxFiles should be deleted till the combined log size is within containerLogMaxSize * containerLogMaxFiles.

@Zeel-Patel Yes. But that could also mean that we might end up removing all logs from some container. Right?

Say a service is generating 10G of data per second and the configured retention was 3 files with 100M each. So, if your log files are all above 300M (which is very likely to happen with such a high log write rate), you will end up removing all 3 files, which might be against the expected behavior. How do we protect against such cleanup? Or should we even protect against such cleanup?

We can split the large file and keep only the data that is under the limit.

@Zeel-Patel Zeel-Patel changed the title KEP-5032: Container log eviction on Disk pressure KEP-5032: Container log Split and Rotation to avoid Disk pressure Jan 31, 2025
@kannon92
Contributor

Hey @Zeel-Patel!

Thank you for leading this effort. I think the story around log management in Kubelet needs some thought. We have had quite a few proposals around improving this.

Some contributors have asked that we disable log rotation entirely (ie kubernetes/kubernetes#123279 from @rikatz). We also had the work from @harshanarayana to add parallel workers to help maintain log rotations. And now we have your request on adding log rotation on disk pressure. I know that there is also interest in moving logs to a separate partition from kubelet but not sure if there is tracking work for that.

We briefly talked about this in our sig-node planning call with @dchen1107 that our story around log management in Kubernetes is not that clear. I think this can be a topic for sig-node and we should discuss our stance on this.

@kannon92
Contributor

/hold

Let's meet and discuss #5022 (comment) before we continue on with this KEP.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 31, 2025
@Zeel-Patel
Author

Zeel-Patel commented Feb 4, 2025

/hold

Let's meet and discuss #5022 (comment) before we continue on with this KEP.

kubernetes/kubernetes#129922
As a short term fix, we can think about something like this. I have updated the KEP accordingly

@dchen1107
Member

cc/ @SergeyKanzhelev can we treat this as part of node reliability enhancement project? cc/ @yujuhong for vis. too.

@harshanarayana

xref: kubernetes/kubernetes#129975

More items related to the log rotate workflows.

@kannon92
Contributor

kannon92 commented Feb 5, 2025

@Zeel-Patel is the KEP up-to-date with your current thinking?

Design details talk about imagefs/nodefs.

@yujuhong
Contributor

yujuhong commented Feb 5, 2025

Adding @leonzz to help review the potential impact on the log collection pipeline.

@Zeel-Patel
Author

@Zeel-Patel is the KEP up-to-date with your current thinking?

Design details talk about imagefs/nodefs.

Hi @kannon92, sorry, I missed that. I have updated the KEP and added more details. Please re-review.

3. It performs cleanup to delete all non-rotated files (rotated logs have a time suffix in their name) and any .tmp files generated (and not deleted) during the last log rotation.
4. Then it rotates unrotated files (it does not rotate the active file) and deletes the oldest rotated files until containerLogMaxFiles-2 files are left. This is because the non-rotated active file will be rotated and a new active file will be created, which adds up to containerLogMaxFiles.
5. Before doing that step, in the new design, it will split the large active log file into pieces of containerLogMaxSize and name them <active-log-file-name>.part<i> (a sketch follows this list).
6. Say it created n parts; after this, it can rotate n-1 parts, keeping the last (nth) part active, and then do the delete (basically step 4).
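A minimal sketch of steps 5-6 (a hypothetical helper; it ignores concurrent writes from the runtime and log-line boundaries, which a real implementation would need to handle):

```go
// Illustrative sketch only: split an oversized active log into maxSize-sized
// parts named <activeLog>.part<i>. It ignores concurrent writers and log-line
// boundaries, which a real implementation would have to handle.
package main

import (
	"fmt"
	"io"
	"os"
)

func splitActiveLog(activeLog string, maxSize int64) ([]string, error) {
	src, err := os.Open(activeLog)
	if err != nil {
		return nil, err
	}
	defer src.Close()

	var parts []string
	for i := 1; ; i++ {
		partName := fmt.Sprintf("%s.part%d", activeLog, i)
		dst, err := os.Create(partName)
		if err != nil {
			return parts, err
		}
		n, copyErr := io.CopyN(dst, src, maxSize)
		dst.Close()
		if n > 0 {
			parts = append(parts, partName)
		} else {
			os.Remove(partName) // nothing left to copy into this part
		}
		if copyErr == io.EOF {
			return parts, nil // source exhausted
		}
		if copyErr != nil {
			return parts, copyErr
		}
	}
}

func main() {
	// Example: split into 200 MiB pieces (path and size are hypothetical).
	parts, err := splitActiveLog("/var/log/pods/example-pod/app/0.log", 200<<20)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
	fmt.Println("created parts:", parts)
}
```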
@leonzz leonzz Feb 7, 2025

Adding @leonzz to help review the potential impact on the log collection pipeline.

keeping last nth part active

If you truncate an 11M container.log in place to 1M and copy out the first 10M to container.log.part1, it could potentially mess up the read position for some log collectors. For example, fluentbit would re-read the 1M content after the truncation, resulting in duplicated logs, because it always reads from the beginning when truncation is detected (see the discussion in fluent/fluent-bit#375 (comment)).

For log collectors to behave correctly, it's better to just create a new empty active file but this means the nth part of the last active file could be much smaller than containerLogMaxSize

Author

@leonzz
Can we think about keeping the tailing logs in the last part itself as long as its size stays around containerLogMaxSize/2, and creating a new part for the tailing logs only if they exceed containerLogMaxSize/2?

And always create an entirely new active file.


Create a new part for tailing logs only if it is more than containerLogMaxSize/2.

To me this sounds like a reasonable tradeoff to always create new active files.

So we could have file sizes anywhere between 0.5 and 1.5 times containerLogMaxSize, which I assume can still resolve the problems you encounter with super large files?
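A tiny sketch of the tail-handling rule being discussed (hypothetical helper, illustration only):

```go
// Hypothetical helper for the tradeoff discussed above; not kubelet code.
package main

import "fmt"

// keepTailAsSeparatePart reports whether the tail left over after splitting
// should become its own part (true) or be folded into the previous part (false),
// accepting part sizes between roughly 0.5x and 1.5x containerLogMaxSize.
func keepTailAsSeparatePart(tailSize, containerLogMaxSize int64) bool {
	return tailSize > containerLogMaxSize/2
}

func main() {
	const maxSize = 200 << 20 // 200 MiB, example value
	fmt.Println(keepTailAsSeparatePart(60<<20, maxSize))  // false: fold a 60 MiB tail into the previous part
	fmt.Println(keepTailAsSeparatePart(150<<20, maxSize)) // true: keep a 150 MiB tail as its own part
}
```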

@Zeel-Patel Zeel-Patel requested a review from leonzz February 9, 2025 13:18

@sftim sftim left a comment

Here's some feedback. I'm not sure what quality level we expect for provisional KEPs, but I hope it's helpful even if it's optional to accept.


## Motivation

- We manage a Kubernetes ecosystem at our organization. A lot of our kubelet hosts experienced disk pressure because a certain set of pods was generating a very high volume of logs, around 3-4GiB in 15 minutes. We had the kubelet configs containerLogMaxSize set to 200MiB and containerLogMaxFiles set to 6, but the .gz files (compressed rotated log files of pods) were around 500-600GiB in size. We observed that container log rotation was too slow for us.

This is very specific and doesn't represent Kubernetes as an organization. @Zeel-Patel you should reword this motivation to cover the Kubernetes project motivation to make this change.


### What would you like to be added?

We expect that the log file size is always under set kubelet config limit (i.e., containerLogMaxSize) which can help such disk pressure issues in the future.

In this paragraph, who does “we” refer to?

Comment on lines +80 to +82
### Spec 1

Continuously generating 10Mib with 0.1 sec sleep in between

Suggested change
### Spec 1
Continuously generating 10Mib with 0.1 sec sleep in between
### Example 1
This manifest defines a Job that continuously generates 10MiB of logs with 0.1 sec sleep in between (100MiB/s):

Comment on lines +122 to +124
### Spec 2

Continuously generating 10Mib with 10 sec sleep in between

Suggested change
### Spec 2
Continuously generating 10Mib with 10 sec sleep in between
### Example 2
This manifest defines a Job that continuously generates 10MiB of logs with 10 sec sleep in between (1MiB/s mean):


### Goals

- There is a ContainerLogManager in every kubelet. It runs an infinite goroutine and checks the size of the active log file (on which all container log reads and writes happen). If that exceeds the above-mentioned limit (containerLogMaxSize), it starts parallel workers. Each worker creates a compressed archive of the active file, deletes archives until there are containerLogMaxFiles files in total, and creates a new active file for the container.

This doesn't sound like a goal, maybe that belongs in the Proposal section.

Comment on lines +231 to +232



You need to answer this.

of a node? No, restart of kubelet with updated configurations and version should work.

###### Does enabling the feature change any default behavior?
No

Is this true?

### Monitoring Requirements

###### How can an operator determine if the feature is in use by workloads?
Emit cleanup logs.

This doesn't explain how an operator can tell whether the feature is in use.

@@ -0,0 +1,18 @@
title: Log rotate on Disk pressure

To match the issue, try:

Suggested change
title: Log rotate on Disk pressure
title: Container log Split and Rotate to avoid Disk pressure

or

retitle the issue

## Alternatives

### Only Rotation on disk pressure
Define 2 new flags `logRotateDiskCheckInterval`, `logRotateDiskPressureThreshold` in kubelet config.

(nit)

Suggested change
Define 2 new flags `logRotateDiskCheckInterval`, `logRotateDiskPressureThreshold` in kubelet config.
Define 2 new parameters `logRotateDiskCheckInterval`, `logRotateDiskPressureThreshold` in kubelet config.
