Problem with operating systems that use cgroup v2 related to cpu speed.  #6744

@correajl

ISSUE TYPE
  • Improvement Request
COMPONENT NAME
CloudStack management server and CloudStack agent.
CLOUDSTACK VERSION
4.17.0.1
CONFIGURATION

2 management servers, 2 databases, advanced network, everything working fine.

OS / ENVIRONMENT

Ubuntu Server 22.04 LTS, KVM, libvirt 8.0.0-1ubuntu7.1, cgroup2.

SUMMARY

The value of "CPU (in MHz)" used in some compute offering definitions is mapped to the 'shares' element in the 'cputune' section of the domain definition file (XML). According to the libvirt documentation, the value should be in the range [2, 262144]. However, for operating systems using cgroup v2, the maximum value is 10000. I know that Ubuntu 22.04, which I'm using here, is not supported yet, but this will become an issue as other OSs adopt cgroup v2 too, so I think this parameter deserves attention. If the value of (number of CPUs) * CPU (in MHz) is greater than 10000, I get "Value specified in CPUWeight is out of range" on the hypervisor.
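To make the arithmetic concrete, here is a minimal sketch of the check described above. It assumes the shares value is simply the number of CPUs multiplied by the CPU (in MHz) value, as stated in this issue; the function names and constant are hypothetical and are not taken from the CloudStack code base.

```python
# Upper bound for cgroup v2, as quoted in this issue.
CGROUP_V2_MAX_SHARES = 10000


def cpu_shares(vcpus: int, mhz: int) -> int:
    """Shares value as described in this issue: CPU count times CPU (in MHz)."""
    return vcpus * mhz


def fits_cgroup_v2(shares: int) -> bool:
    """True if the value does not exceed the cgroup v2 maximum."""
    return shares <= CGROUP_V2_MAX_SHARES


if __name__ == "__main__":
    # A 1000 MHz offering with 10 CPUs is at the limit; 11 CPUs exceeds it.
    for vcpus in (10, 11):
        shares = cpu_shares(vcpus, 1000)
        print(vcpus, shares, fits_cgroup_v2(shares))
```

With a 1000 MHz offering, 10 CPUs gives exactly 10000 and 11 CPUs gives 11000, which is the first value above the cgroup v2 limit.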

As a workaround, I configured the service offering with 1 MHz. This implies that VMs with more CPUs have a much better chance of getting CPU time than VMs with fewer CPUs: besides having more CPUs, each of their CPUs is also more likely to get one of the host's CPUs.

VM1 10 CPU * 1 MHz -> shares = 10
VM2 80 CPU * 1 MHz -> shares = 80

If we look at the 1st CPU of VM2, it will have 8 times more chance of getting a host CPU than the 1st CPU of VM1.

Shouldn't all virtual CPUs have the same chance of using one host CPU?

STEPS TO REPRODUCE

Using Ubuntu 22.04 or another OS that uses cgroup v2, create a compute offering with a CPU (in MHz) value of 1000. Try to launch an instance with more than 10 CPUs. A configuration file will be generated for the new instance with shares > 10000 inside the cputune section. The hypervisor can't launch the instance and fails with the "Value specified in CPUWeight is out of range" error.

EXPECTED RESULTS

Create a way for all virtual CPUs to have an equal chance of using a host CPU.
Check cgroup version and try to not generate values bigger than 10000.
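One possible way to do the second point is sketched below. This is only an illustration under stated assumptions, not how CloudStack or libvirt implement it: the helper names are hypothetical, cgroup v2 is detected by the presence of `cgroup.controllers` at the root of the cgroup mount, and values are scaled linearly from the [2, 262144] range into the cgroup v2 range rather than simply clamped, so that the ratio between instances is roughly preserved.

```python
import os

CGROUP_V1_MAX = 262144  # upper bound quoted from the libvirt documentation
CGROUP_V2_MAX = 10000   # upper bound for cgroup v2, as quoted in this issue


def is_cgroup_v2(root: str = "/sys/fs/cgroup") -> bool:
    """Assumption: on a unified (v2) hierarchy, cgroup.controllers exists at the mount root."""
    return os.path.exists(os.path.join(root, "cgroup.controllers"))


def scale_shares(shares: int, v2: bool) -> int:
    """Return a shares value that fits the detected cgroup version (hypothetical helper)."""
    if not v2:
        # cgroup v1: only clamp into the documented range.
        return min(max(shares, 2), CGROUP_V1_MAX)
    # cgroup v2: map the v1 range linearly onto the v2 range, then clamp.
    scaled = shares * CGROUP_V2_MAX // CGROUP_V1_MAX
    return max(1, min(scaled, CGROUP_V2_MAX))


if __name__ == "__main__":
    requested = 11 * 1000  # 11 CPUs at 1000 MHz, the failing case from this issue
    print(scale_shares(requested, is_cgroup_v2()))
```

Linear scaling loses resolution for small values (anything below about 27 maps to the minimum of 1), so a real fix would need to decide how to handle very small offerings.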

ACTUAL RESULTS

When the CPU (in MHz) value is mapped to the 'shares' element in this way, instances with fewer CPUs are always penalized. Sometimes hosts can't launch instances at all.
