
Commit 6b6b48c

Updates to allow submitters to use multiple ParallelCluster clusters … (#140)
* Updates to allow submitters to use multiple ParallelCluster clusters at the same time.
  Update the slurm config to add the ClusterName to the path.
  Enable multiple clusters to be used by submitters.
  Change the mount point to include the cluster name so it is unique between clusters.
  Update scripts and config paths to include the cluster name in the paths.
  Add a symbolic link so the head and compute nodes have access to the same path as the submitter.
  Resolves #139
* Add scripts and cron jobs to update users_groups.json.
  Add the commands to configure/deconfigure an instance that has access to the users and groups so that the users_groups.json file can be created and updated each hour.
* Fix fstab path, modulefiles config.
  Also rename the CfnOutput names so commands are in the order that they are expected to run.
* Add LOCALDOMAIN to the modulefile.
  Fixes DNS resolution of cluster hostnames so srun doesn't fail.
* Update documentation.
  Put legacy docs at end.
1 parent aa8b9dc commit 6b6b48c

45 files changed: +1060 −161 lines

README.md

Lines changed: 13 additions & 11 deletions
@@ -2,7 +2,7 @@

This repository contains an AWS Cloud Development Kit (CDK) application that creates a Slurm cluster that is suitable for running production EDA workloads on AWS.

-The original version of this repo used a custom Python plugin to integrate Slurm with AWS.
+The original (legacy) version of this repo used a custom Python plugin to integrate Slurm with AWS.
The latest version of the repo uses AWS ParallelCluster for the core Slurm infrastructure and AWS integration.
The big advantage of moving to AWS ParallelCluster is that it is a supported AWS service.
Currently, some of the features of the legacy version are not supported in the ParallelCluster version, but
@@ -16,29 +16,32 @@ Key features supported by both versions are:
* Handling of spot terminations
* Handling of insufficient capacity exceptions
* Batch and interactive partitions (queues)
-* Managed tool licenses as a consumable resource
+* Manages tool licenses as a consumable resource
* User and group fair share scheduling
* Slurm accounting database
* CloudWatch dashboard
* Job preemption
* Manage on-premises compute nodes
-* Configure partitions (queues) and nodes that are always on to support reserved instances RIs and savings plans.
+* Configure partitions (queues) and nodes that are always on to support reserved instances (RIs) and savings plans (SPs).

Features in the legacy version and not in the ParallelCluster version:

-* Multi-AZ support. Supported by ParallelCluster, but not implemented.
+* Heterogeneous clusters with mixed OSes and CPU architectures on compute nodes.
+* Multi-AZ support. Supported by ParallelCluster, but not currently implemented.
+* Multi-region support
* AWS Fault Injection Simulator (FIS) templates to test spot terminations
-* Heterogenous cluster with mixed OSes and CPU architectures on compute nodes.
* Support for MungeKeySsmParameter
* Multi-cluster federation
-* Multi-region support

ParallelCluster Limitations

-* Number of "Compute Resources" is limited to 50 which limits the number of instance types allowed in a cluster.
+* Number of "Compute Resources" (CRs) is limited to 50, which limits the number of instance types allowed in a cluster.
+ParallelCluster can have multiple instance types in a CR, but with memory-based scheduling enabled, they must all have the same number of cores and amount of memory.
* All Slurm instances must have the same OS and CPU architecture.
* Stand-alone Slurm database daemon instance. Prevents federation.
-* Multi-region support. This is unlikely to change because multi-region services run against our archiectural philosophy. Federation may be a better option
+* Multi-region support. This is unlikely to change because multi-region services run against our architectural philosophy.
+Federation may be an option, but its current implementation limits scheduler performance and doesn't allow cluster prioritization, so jobs land on random clusters.

Slurm Limitations

@@ -96,14 +99,13 @@ Legacy:
* Rocky Linux 8 and arm64
* Rocky Linux 8 and x86_64

-Note that in the ParallelCluster version all compute nodes must have the same OS and architecture.
+Note that in the ParallelCluster version, all compute nodes must have the same OS and architecture.

## Documentation

[View on GitHub Pages](https://aws-samples.github.io/aws-eda-slurm-cluster/)

-To view the docs locally, clone the repository and run mkdocs:
-
+You can also view the docs locally.
The docs are in the docs directory. You can view them in an editor or using the mkdocs tool.

I recommend installing mkdocs in a Python virtual environment.
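
For illustration only (not part of the diff), a minimal sketch of that local preview, assuming the repository keeps a mkdocs.yml at its root:

```
# Sketch only; assumes mkdocs.yml exists at the repository root.
python3 -m venv ~/mkdocs-venv       # dedicated Python virtual environment for mkdocs
source ~/mkdocs-venv/bin/activate
pip install mkdocs                  # install mkdocs into the venv
cd aws-eda-slurm-cluster
mkdocs serve                        # preview the docs at http://127.0.0.1:8000
```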

docs/delete-cluster.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-# Delete Cluster
+# Delete Cluster (legacy)

Most of the resources can be deleted by simply deleting the cluster's CloudFormation stack.
However, there are a couple of resources that must be manually deleted:
Lines changed: 77 additions & 8 deletions
@@ -1,9 +1,11 @@
-# Deploy the Cluster
+# Deploy Legacy Cluster

The original (legacy) version used a custom Slurm plugin for orchestrating the EC2 compute nodes.
The latest version uses ParallelCluster to provision the core Slurm infrastructure.
-When using ParallelCluster, a ParallelCluster configuration will be generated and use to create a ParallelCluster slurm cluster.
+When using ParallelCluster, a ParallelCluster configuration will be generated and used to create a ParallelCluster Slurm cluster.
The first supported ParallelCluster version is 3.6.0.
+Version 3.7.0 is the recommended minimum version because it supports compute node weighting that is proportional to instance type
+cost, so that the least expensive instance types that meet job requirements are used.

## Prerequisites

@@ -15,6 +17,10 @@ You will need AWS credentials that provide admin access to deploy the cluster.

Clone or download the aws-eda-slurm-cluster repository to your system.

+```
+git clone git@github.com:aws-samples/aws-eda-slurm-cluster.git
+```
+
### Make sure required packages are installed

```
@@ -38,24 +44,25 @@ remediation or create a support ticket.

## Deploy Using ParallelCluster

-### Create ParallelCluster UI
+### Create ParallelCluster UI (optional but recommended)

It is highly recommended to create a ParallelCluster UI to manage your ParallelCluster clusters.
A different UI is required for each version of ParallelCluster that you are using.
The versions are listed in the [ParallelCluster Release Notes](https://docs.aws.amazon.com/parallelcluster/latest/ug/document_history.html).
The minimum required version is 3.6.0, which adds support for RHEL 8 and increases the number of allowed queues and compute resources.
-The suggested version is at least 3.7.0 because it adds configurate compute node weights which we use to prioritize the selection of
+The suggested version is at least 3.7.0 because it adds configurable compute node weights, which we use to prioritize the selection of
compute nodes by their cost.

The instructions are in the [ParallelCluster User Guide](https://docs.aws.amazon.com/parallelcluster/latest/ug/install-pcui-v3.html).

### Create ParallelCluster Slurm Database

The Slurm Database is required for configuring Slurm accounts, users, groups, and fair share scheduling.
-It you need these and other features then you will need to create ParallelCluster Slurm Database.
+If you need these and other features, then you will need to create a ParallelCluster Slurm Database.
+You do not need to create a new database for each cluster; multiple clusters can share the same database.
Follow the directions in this [ParallelCluster tutorial to configure slurm accounting](https://docs.aws.amazon.com/parallelcluster/latest/ug/tutorials_07_slurm-accounting-v3.html#slurm-accounting-db-stack-v3).

-### Configuration File
+### Create Configuration File

The first step in deploying your cluster is to create a configuration file.
A default configuration file is found in [source/resources/config/default_config.yml](https://github.com/aws-samples/aws-eda-slurm-cluster/blob/main/source/resources/config/default_config.yml).
@@ -170,10 +177,72 @@ with command line arguments, however it is better to specify all of the parameters
./install.sh --config-file <config-file> --cdk-cmd create
```

-This will create the ParallelCuster configuration file, store it in S3, and use it to create a cluster.
+This will create the ParallelCluster configuration file, store it in S3, and then use a Lambda function to create the cluster.
+
+If you look in CloudFormation you will see 2 new stacks when deployment is finished.
+The first is the configuration stack and the second is the cluster.
+
+## Create users_groups.json
+
+Before you can use the cluster you must configure the Linux users and groups for the head and compute nodes.
+One way to do that would be to join the cluster to your domain.
+But joining each compute node to a domain effectively creates a distributed denial of service (DDoS) attack on the domain controller
+when the cluster rapidly scales out or in and each node tries to join or leave the domain.
+This can lead to domain controller timeouts and widespread havoc in your environment.
+
+To solve this problem, a script runs on a server that is joined to the domain and writes a JSON file with all
+of the non-privileged users and groups and their respective uids and gids.
+A script and cron job on the head and compute nodes read this JSON file to create local users and groups that match the domain-joined servers.
+
+Select the server that you want to use to create and update the JSON file.
+The outputs of the configuration stack have the commands required.
+
+| Config Stack Output                      | Description |
+|------------------------------------------|-------------|
+| Command01SubmitterMountHeadNode          | Mounts the Slurm cluster's shared file system and adds it to /etc/fstab. |
+| Command02CreateUsersGroupsJsonConfigure  | Creates /opt/slurm/{{ClusterName}}/config/users_groups.json and creates a cron job to refresh it hourly. |
+
+Before deleting the cluster, you can undo the configuration by running the commands in the following outputs.
+
+| Config Stack Output                        | Description |
+|--------------------------------------------|-------------|
+| command10CreateUsersGroupsJsonDeconfigure  | Removes the crontab that refreshes users_groups.json. |
+
+Now the cluster is ready to be used by sshing into the head node or a login node, if you configured one.
+
+If you configured extra file systems for the cluster that contain the users' home directories, then they should be able to ssh
+in with their own ssh keys.
+
+## Configure submission hosts to use the cluster
+
+ParallelCluster was built assuming that users would ssh into the head node or login nodes to execute Slurm commands.
+This can be undesirable for a number of reasons.
+First, users shouldn't be given ssh access to critical infrastructure like the cluster head node.
+With ParallelCluster 3.7.0 you can configure login nodes, but if you have already provisioned desktop nodes then
+it's wasteful to have to provision login nodes as well.
+Second, it's just inconvenient to have to use ssh to access and use the cluster.
+
+Fortunately, you can configure any server as a submission host so that users can run Slurm commands.
+These commands must be run by an administrator that has root access to the submission host.
+The commands could also be run to create a custom AMI for user desktops so that they can access the clusters.
+The commands to configure submission hosts are in the outputs of the configuration CloudFormation stack.
+Run them in the following order:
+
+| Config Stack Output                      | Description |
+|------------------------------------------|-------------|
+| Command01SubmitterMountHeadNode          | Mounts the Slurm cluster's shared file system and adds it to /etc/fstab. |
+| Command03SubmitterConfigure              | Configures the submission host so it can directly access the Slurm cluster. |
+
+The first command simply mounts the head node's NFS file system so you have access to the Slurm commands and configuration.
+
+The second command runs an Ansible playbook that configures the submission host so that it can run the Slurm commands for the cluster.
+It also configures the modulefile that sets up the environment to use the Slurm cluster.
+
+The clusters have been configured so that a submission host can use more than one cluster by simply changing the modulefile that is loaded.

+On the submission host, just open a new shell, load the modulefile for your cluster, and you can access Slurm.

-### Customize the compute node AMI
+## Customize the compute node AMI

The easiest way to create a custom AMI is to find the default ParallelCluster AMI in the UI.
Create an instance using the AMI and make whatever customizations you require, such as installing packages and

docs/deploy-parallel-cluster.md

Lines changed: 156 additions & 0 deletions
@@ -0,0 +1,156 @@
# Deploy ParallelCluster

The original (legacy) version used a custom Slurm plugin for orchestrating the EC2 compute nodes.
The latest version uses ParallelCluster to provision the core Slurm infrastructure.
When using ParallelCluster, a ParallelCluster configuration will be generated and used to create a ParallelCluster Slurm cluster.
The first supported ParallelCluster version is 3.6.0.
Version 3.7.0 is the recommended minimum version because it supports compute node weighting that is proportional to instance type
cost, so that the least expensive instance types that meet job requirements are used.

## Prerequisites

See the [Deployment Prerequisites](deployment-prerequisites.md) page.

The following are prerequisites that are specific to ParallelCluster.

### Create ParallelCluster UI (optional but recommended)

It is highly recommended to create a ParallelCluster UI to manage your ParallelCluster clusters.
A different UI is required for each version of ParallelCluster that you are using.
The versions are listed in the [ParallelCluster Release Notes](https://docs.aws.amazon.com/parallelcluster/latest/ug/document_history.html).
The minimum required version is 3.6.0, which adds support for RHEL 8 and increases the number of allowed queues and compute resources.
The suggested version is at least 3.7.0 because it adds configurable compute node weights, which we use to prioritize the selection of
compute nodes by their cost.

The instructions are in the [ParallelCluster User Guide](https://docs.aws.amazon.com/parallelcluster/latest/ug/install-pcui-v3.html).

### Create ParallelCluster Slurm Database

The Slurm Database is required for configuring Slurm accounts, users, groups, and fair share scheduling.
If you need these and other features, then you will need to create a ParallelCluster Slurm Database.
You do not need to create a new database for each cluster; multiple clusters can share the same database.
Follow the directions in this [ParallelCluster tutorial to configure slurm accounting](https://docs.aws.amazon.com/parallelcluster/latest/ug/tutorials_07_slurm-accounting-v3.html#slurm-accounting-db-stack-v3).

## Create the Cluster

To install the cluster, run the install script. You can override some parameters in the config file
with command line arguments; however, it is better to specify all of the parameters in the config file.

```
./install.sh --config-file <config-file> --cdk-cmd create
```

This will create the ParallelCluster configuration file, store it in S3, and then use a Lambda function to create the cluster.

If you look in CloudFormation you will see 2 new stacks when deployment is finished.
The first is the configuration stack and the second is the cluster.
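
If you prefer the command line to the CloudFormation console, a sketch of checking that both stacks finished; the stack names below are placeholders, not values defined by this repo:

```
# Illustrative only; replace the placeholders with the stack names shown
# in the CloudFormation console for your deployment.
aws cloudformation describe-stacks \
    --stack-name <config-stack-name> \
    --query 'Stacks[0].StackStatus'
aws cloudformation describe-stacks \
    --stack-name <cluster-stack-name> \
    --query 'Stacks[0].StackStatus'
```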

## Create users_groups.json

Before you can use the cluster you must configure the Linux users and groups for the head and compute nodes.
One way to do that would be to join the cluster to your domain.
But joining each compute node to a domain effectively creates a distributed denial of service (DDoS) attack on the domain controller
when the cluster rapidly scales out or in and each node tries to join or leave the domain.
This can lead to domain controller timeouts and widespread havoc in your environment.

To solve this problem, a script runs on a server that is joined to the domain and writes a JSON file with all
of the non-privileged users and groups and their respective uids and gids.
A script and cron job on the head and compute nodes read this JSON file to create local users and groups that match the domain-joined servers.

Select the server that you want to use to create and update the JSON file.
The outputs of the configuration stack have the commands required.

| Config Stack Output                      | Description |
|------------------------------------------|-------------|
| Command01SubmitterMountHeadNode          | Mounts the Slurm cluster's shared file system and adds it to /etc/fstab. |
| Command02CreateUsersGroupsJsonConfigure  | Creates /opt/slurm/{{ClusterName}}/config/users_groups.json and creates a cron job to refresh it hourly. |

Before deleting the cluster, you can undo the configuration by running the commands in the following outputs.

| Config Stack Output                        | Description |
|--------------------------------------------|-------------|
| command10CreateUsersGroupsJsonDeconfigure  | Removes the crontab that refreshes users_groups.json. |

Now the cluster is ready to be used by sshing into the head node or a login node, if you configured one.

If you configured extra file systems for the cluster that contain the users' home directories, then they should be able to ssh
in with their own ssh keys.
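
As a quick sanity check (illustrative only; `<ClusterName>` is a placeholder and which account's crontab holds the refresh job depends on how the configure command installed it):

```
# Illustrative checks, run on the domain-joined server configured above.
# <ClusterName> is a placeholder for your cluster's name.
head /opt/slurm/<ClusterName>/config/users_groups.json   # spot-check the generated users and groups
crontab -l | grep users_groups                           # confirm the hourly refresh job exists
```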

## Configure submission hosts to use the cluster

ParallelCluster was built assuming that users would ssh into the head node or login nodes to execute Slurm commands.
This can be undesirable for a number of reasons.
First, users shouldn't be given ssh access to critical infrastructure like the cluster head node.
With ParallelCluster 3.7.0 you can configure login nodes, but if you have already provisioned desktop nodes then
it's wasteful to have to provision login nodes as well.
Second, it's just inconvenient to have to use ssh to access and use the cluster.

Fortunately, you can configure any server as a submission host so that users can run Slurm commands.
These commands must be run by an administrator that has root access to the submission host.
The commands could also be run to create a custom AMI for user desktops so that they can access the clusters.
The commands to configure submission hosts are in the outputs of the configuration CloudFormation stack.
Run them in the following order:

| Config Stack Output                      | Description |
|------------------------------------------|-------------|
| Command01SubmitterMountHeadNode          | Mounts the Slurm cluster's shared file system and adds it to /etc/fstab. |
| Command03SubmitterConfigure              | Configures the submission host so it can directly access the Slurm cluster. |

The first command simply mounts the head node's NFS file system so you have access to the Slurm commands and configuration.

The second command runs an Ansible playbook that configures the submission host so that it can run the Slurm commands for the cluster.
It also configures the modulefile that sets up the environment to use the Slurm cluster.

The clusters have been configured so that a submission host can use more than one cluster by simply changing the modulefile that is loaded.

On the submission host, just open a new shell, load the modulefile for your cluster, and you can access Slurm.
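
For example (a sketch only; `eda-cluster1` and `eda-cluster2` are hypothetical cluster names), switching a shell between two clusters is just a matter of swapping modulefiles:

```
# Hypothetical cluster names; use your own ClusterName values.
module load eda-cluster1      # point this shell at the first cluster
squeue                        # Slurm commands now talk to eda-cluster1

module unload eda-cluster1    # stop using the first cluster
module load eda-cluster2      # point the shell at the second cluster
squeue                        # now talks to eda-cluster2
```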

## Customize the compute node AMI

The easiest way to create a custom AMI is to find the default ParallelCluster AMI in the UI.
Create an instance using the AMI and make whatever customizations you require, such as installing packages and
configuring users and groups.

Custom file system mounts can be configured in the aws-eda-slurm-cluster config file, which will add them to the
ParallelCluster config file so that ParallelCluster can manage them for you.

When you are done, create a new AMI and wait for the AMI to become available.
After it is available, you can add the custom AMI to the aws-eda-slurm-cluster config file.

```
slurm:
  ParallelClusterConfig:
    ComputeNodeAmi: ami-0fdb972bda05d2932
```

Then update your aws-eda-slurm-cluster stack by running the install script again.
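
If you script this step, the AWS CLI can block until the new image is ready before you edit the config (illustrative; the AMI ID below is a placeholder):

```
# Illustrative only; substitute the AMI ID returned when you created the image.
aws ec2 wait image-available --image-ids ami-xxxxxxxxxxxxxxxxx
```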

## Run Your First Job

Run the following command in a shell to configure your environment to use your Slurm cluster.

```
module load {{ClusterName}}
```

To submit a job, run the following command.

```
sbatch /opt/slurm/$SLURM_CLUSTER_NAME/test/job_simple_array.sh
```

To check the status, run the following command.

```
squeue
```

To open an interactive shell on a Slurm node:

```
srun --pty /bin/bash
```
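
Once jobs have run, and assuming you deployed the Slurm accounting database described earlier, you can also review completed jobs (the job ID is a placeholder):

```
# Requires the Slurm accounting database; <jobid> is a placeholder.
sacct -j <jobid> --format=JobID,JobName,State,Elapsed,ExitCode
```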

## Slurm Documentation

[https://slurm.schedmd.com](https://slurm.schedmd.com)
