NCI's announcement today re. the mid-cycle upgrade to Gadi included (emphasis mine):
The new system will contain 74,880 cores from Intel's newest fourth-generation Sapphire Rapids processors. A total of 720 nodes, each containing two 52-core CPUs, make up this latest upgrade. ... Users are recommended to use the latest versions of their software to maximise compatibility with the new hardware. Recompiling using the latest versions of the Intel compilers is also recommended to get the most out of this new CPU architecture.
As far as I can see /g/data/ik11/inputs/mom6/bin/symmetric_FMS2-e7d09b7 was compiled with intel-compiler/2019.3.199, which is actually the oldest one on NCI - it's now up to intel-compiler-llvm/2022.2.0 (or intel-compiler/2021.7.0 if we don't want llvm).
Should we recompile for the 1/40° run?
And while we're at it, should we upgrade the dependencies? E.g. we're using openmpi/4.1.2 and netcdf/4.7.4p, but there are modules for openmpi/4.1.4 and netcdf/4.9.0p available.
Rui gave an awesome presentation today (March 27) on profiling: 3-month, 15-day and 5-day tests of global and panan on Cascade Lake and Sapphire Rapids nodes with different layouts and different compilers. He has a big spreadsheet of numbers/data. Could it be posted somewhere, please?
MPI wait time is the big time-waster in MOM6. A lot of wait time means load imbalance. Different layouts don't seem to improve the load imbalance. I/O needs to be looked at with a different tool. Sea ice is in there as well, so each rank has both ocean and sea ice, which complicates the diagnostics.
This analysis includes the model's initialisation time. Maybe we should evaluate just the main loop for MPI imbalance. But 15-day vs 5-day runs don't dramatically change the MPI wait times, so initialisation is not a big deal.
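To make the "wait time = load imbalance" point concrete, here is a minimal sketch of one common imbalance metric, assuming every rank idles at a synchronisation point until the slowest rank arrives. The function name and the per-rank timings are hypothetical and are not taken from Rui's spreadsheet.

```python
from statistics import mean


def load_imbalance(busy_seconds):
    """Fraction of aggregate rank-time spent waiting, assuming all ranks
    wait for the slowest one: (max - mean) / max."""
    slowest = max(busy_seconds)
    return (slowest - mean(busy_seconds)) / slowest


# Hypothetical per-rank compute times in seconds (illustrative only).
busy = [8.0, 9.0, 10.0, 7.0]
print(f"load imbalance: {load_imbalance(busy):.3f}")
```

A value of 0 means perfectly balanced ranks. If changing the layout leaves this number roughly unchanged, the imbalance is probably coming from the work distribution itself (e.g. ocean plus sea ice on each rank) rather than from the decomposition shape.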
Found the best compiler flags 🙂 but not a big difference.
Better performance on the new Sapphire Rapids nodes: 8% better. The MPI wait time problems are the same though, e.g. 50% MPI time in total, and 30% of that time is wasted.
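As a quick worked number, reading "30% of that time" as 30% of the MPI time (an assumption about what the profiler reported), the wasted share of total runtime comes out as follows. The function name is hypothetical.

```python
def wasted_fraction(mpi_share, wasted_share_of_mpi):
    """Share of total runtime lost to waiting, given the MPI share of
    runtime and the wasted share of that MPI time."""
    return mpi_share * wasted_share_of_mpi


# 50% of runtime in MPI, 30% of that wasted (figures quoted above).
print(f"{wasted_fraction(0.50, 0.30):.0%} of total runtime")
```

So under that reading roughly 15% of the run is pure waiting, which is larger than the 8% gained from the new nodes.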
Rui also showed AVX256 is fastest; older non-LLVM compiler generates faster code; Sapphire Rapids is faster, probably due to larger cache memory and DDR5
I believe in the end the conclusion is that there's no need to recompile the code. So maybe we can close this issue? Note that there's another issue (#20) specifically for the scaling and optimization