Feedback on how to make custom changes to slurm config file #6648

Open
@adebayoj

Hi, we are currently using AWS ParallelCluster to run a Slurm cluster backed by a capacity reservation of 3 p4de.24xlarge instances. We've run into a few issues and couldn't find clear guidance on how to address them, so we wanted to ask here. I have included our cluster config YAML file and would appreciate feedback.

Problem 1: Custom Changes to config files

We would like to change the Slurm config files to enable certain behavior. Specifically, we would like to set the following:

# temp environment changes
PrologFlags             = Alloc,Contain,X11
JobContainerType        = job_container/tmpfs

# this might help make it so that nvidia-smi is isolated
ConstrainDevices        = yes
ConstrainRAMSpace       = yes

# For OOM containment
JobAcctGatherType       = jobacct_gather/cgroup
JobAcctGatherParams     = NoOverMemoryKill

# make salloc call srun for interactive jobs (one of the two values below)
LaunchParameters        = use_interactive_step
# LaunchParameters      = use_interactive_step,enable_nss_slurm

However, we've found that we can't set these parameters through the CustomSlurmSettings option. For the temp environments (job_container/tmpfs), it seems we would also need a custom job_container.conf file, but we currently see no way to create one via the cluster config file.
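To make the request concrete, here is a rough sketch of how we understand these settings to be split across Slurm's files. As far as we can tell, ConstrainDevices and ConstrainRAMSpace are cgroup.conf options rather than slurm.conf ones, and job_container/tmpfs reads its settings from job_container.conf. The BasePath value and the helper names below are hypothetical, and nothing here reflects how ParallelCluster itself generates these files.

```python
# Sketch only: group the settings we want by the Slurm file each one lives in.
# Values are copied from the list above; nothing is written to disk.

SLURM_CONF = {
    "PrologFlags": "Alloc,Contain,X11",
    "JobContainerType": "job_container/tmpfs",
    "JobAcctGatherType": "jobacct_gather/cgroup",
    "JobAcctGatherParams": "NoOverMemoryKill",
    "LaunchParameters": "use_interactive_step",
}

# cgroup.conf options (not valid in slurm.conf).
CGROUP_CONF = {
    "ConstrainDevices": "yes",
    "ConstrainRAMSpace": "yes",
}

# job_container.conf for job_container/tmpfs.
# BasePath is a hypothetical location, chosen only for illustration.
JOB_CONTAINER_CONF = {
    "AutoBasePath": "true",
    "BasePath": "/scratch/slurm-tmpfs",
}


def render(settings):
    """Render a dict as 'Key=Value' lines, one per setting."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


def render_all():
    """Return the text each config file would need to contain."""
    return {
        "slurm.conf": render(SLURM_CONF),
        "cgroup.conf": render(CGROUP_CONF),
        "job_container.conf": render(JOB_CONTAINER_CONF),
    }


if __name__ == "__main__":
    for filename, text in render_all().items():
        print(f"# --- {filename} ---")
        print(text)
```

If this split is right, only the first group could even in principle go through CustomSlurmSettings, which is part of why we are asking about editing the files by hand.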

Question: can we manually enable all of these options ourselves without repercussions? What would you suggest?

Problem 2: Separate partition for root (/) and how to enable usrquota.

We would like to mount root (/) on a separate file system, through Lustre or something else. However, as far as we can see, only a single Lustre file system can be used as part of a given installation. Secondly, we would like to cap each user's home directory at a particular size. Can you share how we can enable this programmatically? We could do this manually as described here: http://www.yolinux.com/TUTORIALS/LinuxTutorialQuotas.html, but we are wondering whether there are any other alternatives.
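For context, the manual route we have in mind looks roughly like the sketch below: build one setquota command per user and run it from a bootstrap script. It assumes /home sits on a file system already mounted with usrquota and that the standard Linux quota tools are installed; the user names, the 50 GiB limit, and the function name are hypothetical. The script only prints the commands and does not execute them.

```python
# Sketch only: build `setquota` command lines that cap each user's home usage.
# setquota takes block limits in 1 KiB blocks:
#   setquota -u USER BLOCK_SOFT BLOCK_HARD INODE_SOFT INODE_HARD FILESYSTEM

KIB_PER_GIB = 1024 * 1024


def build_setquota(user, hard_limit_gib, filesystem):
    """Return the argv list for a per-user block quota (inode limits unset)."""
    hard_kib = hard_limit_gib * KIB_PER_GIB
    soft_kib = hard_kib * 9 // 10  # soft limit at 90% of the hard limit
    return [
        "setquota", "-u", user,
        str(soft_kib), str(hard_kib),
        "0", "0",  # 0 means no inode limit
        filesystem,
    ]


if __name__ == "__main__":
    # Hypothetical users and limit, for illustration only.
    for user in ["alice", "bob"]:
        print(" ".join(build_setquota(user, 50, "/home")))
```

We would prefer a supported way to express this in the cluster config rather than maintaining a script like this ourselves.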

Thanks for the help.
