The issue
Starting with ParallelCluster 3.9.0, tightly coupled MPI workloads running on large clusters may experience performance degradation.
The root cause is a process that we introduced on the compute nodes to support in-place cluster updates on compute and login nodes. These updates allow shared storage to be mounted or unmounted without replacing the nodes. Although the process is lightweight, it runs periodically and may affect the performance of some specific workloads.
Affected ParallelCluster versions, OSes and schedulers
All ParallelCluster versions from 3.9.0 to 3.14.0 on all OSes.
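As a quick way to check a given release against the range above, the following is a minimal sketch. It assumes the range is inclusive at both ends and that versions are plain `MAJOR.MINOR.PATCH` strings; the names `parse_version` and `is_affected` are hypothetical helpers, not part of ParallelCluster.

```python
from typing import Tuple

# Assumption: the affected range is inclusive at both ends, as stated above.
FIRST_AFFECTED = (3, 9, 0)
LAST_AFFECTED = (3, 14, 0)


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse a 'MAJOR.MINOR.PATCH' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.strip().split("."))
    return major, minor, patch


def is_affected(version: str) -> bool:
    """Return True if the version falls inside the affected range."""
    # Tuples compare element by element, so 3.10.0 correctly sorts after 3.9.0.
    return FIRST_AFFECTED <= parse_version(version) <= LAST_AFFECTED


if __name__ == "__main__":
    for candidate in ("3.8.0", "3.9.0", "3.11.1", "3.14.0", "3.15.0"):
        print(candidate, is_affected(candidate))
```

Comparing integer tuples rather than raw strings avoids the pitfall where "3.10.0" sorts before "3.9.0" lexicographically. The version string itself would come from wherever you track your installed release, for example the output of `pcluster version`.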
Mitigation
You can find a detailed explanation and the mitigation of the problem here.