Update ece-manage-capacity for Podman limitation #1418

Merged on Jun 16, 2025 (9 commits)

Conversation

frconil (Contributor) commented May 15, 2025

CPU quotas are not updated, and cannot be set to -1 (which disables the CPU hard limit), when the underlying container platform is Podman.
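
For readers unfamiliar with the `-1` value mentioned above, here is a minimal sketch of the standard Linux CFS bandwidth semantics (cgroup v1 `cpu.cfs_quota_us` and `cpu.cfs_period_us`), where a quota of `-1` means no hard limit. The function name and the constant are hypothetical, and whether ECE applies the value in exactly this way is an assumption, not something confirmed in this thread.

```python
from typing import Optional

# In cgroup v1, cpu.cfs_quota_us == -1 means "no hard CPU limit".
CFS_UNLIMITED = -1


def effective_cpu_limit(quota_us: int, period_us: int = 100_000) -> Optional[float]:
    """Return the hard CPU limit in cores, or None when no limit applies.

    Hypothetical helper for illustration; 100_000 us is the usual default period.
    """
    if quota_us == CFS_UNLIMITED:
        return None
    if quota_us <= 0 or period_us <= 0:
        raise ValueError("quota and period must be positive (or quota == -1)")
    return quota_us / period_us


print(effective_cpu_limit(50_000))  # quota is half of the period
print(effective_cpu_limit(-1))      # hard limit disabled
```

Under these semantics, a container whose quota cannot be rewritten keeps whatever limit it was created with.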

cc @kunisen

@eedugon eedugon self-assigned this May 22, 2025
@eedugon eedugon requested a review from a team May 22, 2025 09:00
@eedugon eedugon added the Team:Admin Issues owned by the Admin Docs Team label May 22, 2025
eedugon (Contributor) commented May 23, 2025

Thanks a lot for raising this, @frconil, and for privately sharing the reasoning behind why CPU quotas cannot be customized on Podman-based installations.

As discussed, I'll tweak the changes a bit and share them with you before merging. I'd like to:

Update: both actions have been discarded, because:

  • Resource overrides might not be affected by this (the system creates new containers, so this limitation might not apply there; pending confirmation).
  • The banner at the change capacity API step (in the memory section) is in the right place, as that is the action that triggers CPU quotas to be recalculated.

@eedugon eedugon force-pushed the frconil-patch-2 branch from cab411b to cb23414 Compare May 27, 2025 10:55
eedugon (Contributor) commented May 27, 2025

@kunisen, @frconil, I have a question about this change.

As far as I can see in the private Slack conversation shared by @frconil, what we want to explain in the docs is that disabling the CPU hard limit on Podman systems is not possible.

But we don't have any document that describes how to tweak the CPU hard limit (which is something we can do only in the advanced editor, if I'm not mistaken).

Why are we proposing this change in the doc that covers allocator capacity configuration?
I was also thinking about the Resource Overrides doc, but that doc focuses on the instance overrides the UI allows, and the UI doesn't show the CPU hard limit enable / disable option.

I'm totally OK with adding a message like:
CPU quotas cannot be disabled (CPU hard limit) or updated when running ECE on Podman.

But we need to ensure the message doesn't create confusion, as a reader running ECE on Docker might ask: how can they be disabled or updated when ECE runs on Docker?

Any proposal to make this clearer?

Wouldn't this perhaps fit better in the ECE Known Issues instead of the docs? For example, explaining that disabling the CPU hard limit doesn't really work on Podman systems?

eedugon (Contributor) commented May 27, 2025

Oh, I think the note does apply to the allocator capacity change that @frconil was suggesting: if the user changes the capacity of an allocator, the quotas won't be automatically updated for existing instances on Podman systems (is my understanding right?).

eedugon (Contributor) commented May 28, 2025

I've created a separate PR for a different issue on the same page. We might get a small conflict when merging.

On this one, we are still waiting for responses from dev to better understand the implications of this limitation.

The text that I'm planning to propose will be similar to:

When running ECE on Podman, CPU quotas for existing instances cannot be removed (CPU hard limit) or updated. As a result, changing an allocator’s capacity won’t affect the CPU quotas of already running containers.

But we need confirmation that this is the expected outcome, in particular whether new or future instances really get updated quota calculations.
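
To illustrate why a capacity change matters for quotas, here is a rough sketch. It assumes the CPU quota is the instance's share of the allocator's memory capacity (instance RAM divided by allocator capacity). The function name and the numbers are hypothetical, and the formula is an assumption for illustration, not something confirmed in this thread.

```python
def cpu_quota_share(instance_ram_gb: float, allocator_capacity_gb: float) -> float:
    """Fraction of the allocator's CPU assigned to an instance.

    Assumed formula for illustration only: instance RAM / allocator capacity.
    """
    if allocator_capacity_gb <= 0:
        raise ValueError("allocator capacity must be positive")
    return instance_ram_gb / allocator_capacity_gb


# A 32 GB instance on an allocator whose capacity is raised from 128 GB to 256 GB.
before = cpu_quota_share(32, 128)
after = cpu_quota_share(32, 256)

print(f"quota before capacity change: {before:.3f}")
print(f"quota after recalculation:    {after:.3f}")
```

Under the limitation described above, an already running container on Podman would keep the `before` value, because its quota is not rewritten when the capacity changes.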

@eedugon eedugon changed the title Update ece-manage-capacity.md Update ece-manage-capacity for Podman limitation May 29, 2025
@eedugon eedugon added the ece Elastic Cloud Enterprise label May 29, 2025
eedugon (Contributor) commented Jun 3, 2025

I'd like to merge #1511 and #1507 first, before deciding on the final text for this one.

eedugon added a commit that referenced this pull request Jun 16, 2025
This PR introduces the CPU hard limit configuration (enable / disable)
within the `Resource overrides` page, and describes that it doesn't
have any effect on Podman hosts.

I have also refined the introduction of the page.

Preview:
- [Resource Overrides](https://docs-v3-preview.elastic.dev/elastic/docs-content/pull/1511/deploy-manage/deploy/cloud-enterprise/resource-overrides)

The reasons for adding this are:

- This is available in the ECE UI for all users and we didn't have it
documented anywhere (thanks @frconil for pointing this out). So there's
no point in `hiding` this UI button.
- We have some public KB articles that also mention this setting (CPU
hard limit toggling).
- We now want to introduce a banner saying that this doesn't work on
Podman, but we didn't have the setting documented at all.

I have added some warnings about the implications of using this
setting, recommending that it be used only under guidance from Support.

This PR will also be aligned with #1418 for the Podman limitation.

---------

Co-authored-by: shainaraskas <[email protected]>

eedugon (Contributor) left a comment

LGTM

eedugon (Contributor) commented Jun 16, 2025

@frconil: please take a look at this final proposal. If you agree with it, I'll ping devs for a quick final review here.

Note that I've moved the ECE 3.5.0 statement about CPU quotas to the CPU quotas heading instead of leaving it in the memory-related section. In the memory section it also felt repetitive, as the section already explains that for ECE versions before 3.5.0 users must reinstall rather than update the capacity.

matt-elastic (Contributor) left a comment

LGTM

@eedugon eedugon merged commit 31031da into main Jun 16, 2025
7 checks passed
@eedugon eedugon deleted the frconil-patch-2 branch June 16, 2025 12:08
Labels: ece (Elastic Cloud Enterprise), Team:Admin (Issues owned by the Admin Docs Team)
3 participants