
Ban libgomp.so at GCCcore level #4951


Open: wants to merge 3 commits into base: develop
Conversation

ocaisa (Member) commented Jul 4, 2025

Fixes #4535

Will need work for easyconfigs that exist with GCCcore and use OpenMP (see #4535 (comment))

@@ -35,6 +35,9 @@
from easybuild.tools import LooseVersion
from easybuild.tools.toolchain.toolchain import SYSTEM_TOOLCHAIN_NAME

# At GCCcore level we do not allow OpenMP
GCCCORE_BANNED_LIBRARIES = ['libgomp.so']
Member

Has there been a discussion of what version of GCCcore we should add this check for? I do not think it is worthwhile to apply this to all GCCcore versions we have.

Probably start with one of:

  • 14.2.0 (2025a generation)
  • 14.3.0 (2025b generation)

Member Author

That's true, I think it only really matters once we introduce modern LLVM (although it was probably also a potential issue for Intel compilers since we introduced GCCcore)
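
A minimal sketch of how the ban could be gated on the GCCcore version, as raised in the review comment above. The function name `banned_libs_for`, the helper `version_tuple`, and the cutoff constant are hypothetical and not part of the framework; the cutoff value is only one of the two candidates mentioned (14.2.0 or 14.3.0), since no decision is recorded in this thread.

```python
# The list name mirrors the diff above; everything else here is illustrative.
GCCCORE_BANNED_LIBRARIES = ['libgomp.so']

# Assumption: one of the two candidate cutoffs from the review comment.
GCCCORE_BAN_MIN_VERSION = '14.3.0'


def version_tuple(version):
    """Turn a purely numeric dotted version like '14.3.0' into (14, 3, 0)."""
    return tuple(int(part) for part in version.split('.'))


def banned_libs_for(toolchain_name, toolchain_version,
                    min_version=GCCCORE_BAN_MIN_VERSION):
    """Return the banned libraries for a toolchain, or [] if the ban does not apply."""
    if toolchain_name != 'GCCcore':
        return []
    if version_tuple(toolchain_version) < version_tuple(min_version):
        return []
    return list(GCCCORE_BANNED_LIBRARIES)
```

With this shape, existing GCCcore 13.2.0 easyconfigs such as the ImageMagick one would be unaffected, while the check would apply from the chosen generation onwards.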

ocaisa (Member Author) commented Jul 4, 2025

Works as expected

(eb_devel) ocaisa@~/EasyBuild$ eb --rebuild ImageMagick-7.1.1-34-GCCcore-13.2.0.eb
== Temporary log file in case of crash /tmp/eb-ui9xrft8/easybuild-ie7x0fgk.log
...
== sanity checking...
  >> file 'bin/magick' found: OK
  >> (non-empty) directory 'etc/ImageMagick-7' found: OK
  >> (non-empty) directory 'include/ImageMagick-7' found: OK
  >> (non-empty) directory 'lib' found: OK
  >> (non-empty) directory 'share' found: OK
  >> loading modules: ImageMagick/7.1.1-34-GCCcore-13.2.0...
  >> running command 'magick --help' ...
  >> result for command 'magick --help': OK
== ... (took 9 secs)
== FAILED: Installation ended unsuccessfully: Sanity check failed: Check for banned/required shared libraries failed for
/home/ocaisa/eessi/versions/2023.06/software/linux/x86_64/intel/icelake/software/ImageMagick/7.1.1-34-GCCcore-13.2.0/bin/import,
/home/ocaisa/eessi/versions/2023.06/software/linux/x86_64/intel/icelake/software/ImageMagick/7.1.1-34-GCCcore-13.2.0/bin/mogrify,
/home/ocaisa/eessi/versions/2023.06/software/linux/x86_64/intel/icelake/software/ImageMagick/7.1.1-34-GCCcore-13.2.0/bin/display,
...

and switching this up to GCC toolchain

(eb_devel) ocaisa@~/EasyBuild$ eb ImageMagick-7.1.1-34-GCC-13.2.0.eb
== Temporary log file in case of crash /tmp/eb-qd917_bz/easybuild-tnbqj387.log
...
== COMPLETED: Installation ended successfully (took 21 mins 13 secs)
== Results of the build can be found in the log file(s)
/home/ocaisa/eessi/versions/2023.06/software/linux/x86_64/intel/icelake/software/ImageMagick/7.1.1-34-GCC-13.2.0/easybuild/easybuild-ImageMagick-7.1.1-34-20250704.144727.log.bz2

== Build succeeded for 1 out of 1
== Summary:
   * [SUCCESS] ImageMagick/7.1.1-34-GCC-13.2.0

Now to see why the tests are failing...

ocaisa (Member Author) commented Jul 4, 2025

For some reason this change now triggers

det_flexiblas_backend_libs()

which is causing the test failures (since the flexiblas command does not exist)

Micket (Contributor) commented Jul 4, 2025

== FAILED: Installation ended unsuccessfully: Sanity check failed: Check for banned/required shared libraries failed for
/home/ocaisa/eessi/versions/2023.06/software/linux/x86_64/intel/icelake/software/ImageMagick/7.1.1-34-GCCcore-13.2.0/bin/import,

This is way too strict. It's perfectly fine for binaries to link to libgomp (it's also fine if they are Fortran codes).
I can't think of any reason to check any binary for any banned linking.

Thyre (Contributor) commented Jul 5, 2025

This is way too strict. It's perfectly fine for binaries to link to libgomp (it's also fine if they are Fortran codes). I can't think of any reason to check any binary for any banned linking.

The reason was discussed in #4535 before.
Having multiple OpenMP runtimes at the same time can have all sorts of side-effects. As an example, multiple OpenMP runtimes can yield undefined results for OpenMP runtime calls like omp_in_parallel().

I've encountered several cases already where multiple OpenMP runtimes were causing havoc on the performance tool side (though this was with Cray's OpenMP & LLVM OpenMP runtime, LLVM OpenMP + NVHPC OpenMP), e.g. trying to initialize data structures twice. Switching to libgomp.so for all of them might not always work, and means that features like OpenMP target offload will not be available. That's also not ideal.
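
To make the "multiple OpenMP runtimes" concern concrete, here is a rough sketch that extracts the `NEEDED` entries from `readelf -d` output and reports which OpenMP runtime families appear. The function names and the sample text are illustrative assumptions; the runtime list covers only GNU (`libgomp`), LLVM (`libomp`) and Intel (`libiomp5`) and is not exhaustive. It only sees direct dependencies, so a runtime pulled in transitively would need the same scan on each dependency.

```python
import re

# Assumption: a non-exhaustive list of OpenMP runtime name prefixes.
OPENMP_RUNTIME_PREFIXES = ('libgomp.so', 'libomp.so', 'libiomp5.so')

# Matches lines such as:
#  0x0000000000000001 (NEEDED)   Shared library: [libgomp.so.1]
NEEDED_RE = re.compile(r'\(NEEDED\)\s+Shared library:\s+\[([^\]]+)\]')

# Hand-written sample in the format printed by `readelf -d`.
SAMPLE = """
 0x0000000000000001 (NEEDED)             Shared library: [libgomp.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libomp.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000c (INIT)               0x1000
"""


def needed_libraries(readelf_output):
    """Return the directly needed shared libraries, in order of appearance."""
    return NEEDED_RE.findall(readelf_output)


def openmp_runtimes(libs):
    """Return the sorted OpenMP runtime families present among the libraries."""
    found = set()
    for lib in libs:
        for prefix in OPENMP_RUNTIME_PREFIXES:
            if lib.startswith(prefix):
                found.add(prefix)
    return sorted(found)


def has_conflicting_runtimes(libs):
    """True when more than one OpenMP runtime family is linked."""
    return len(openmp_runtimes(libs)) > 1
```

In the sample, both `libgomp.so` and `libomp.so` are reported, which is exactly the mixed-runtime situation described above.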

As more of a general question though:
Even if we ban libgomp.so from GCCcore, what about binary packages which either bundle their own OpenMP runtime, or e.g. use the one from GCC as well?

Thyre (Contributor) commented Jul 5, 2025

Oh, I misread the binary part.
You're right, binaries shouldn't be a problem, as long as we only have one OpenMP runtime. Totally agreed there.

ocaisa (Member Author) commented Jul 6, 2025

== FAILED: Installation ended unsuccessfully: Sanity check failed: Check for banned/required shared libraries failed for
/home/ocaisa/eessi/versions/2023.06/software/linux/x86_64/intel/icelake/software/ImageMagick/7.1.1-34-GCCcore-13.2.0/bin/import,

This is way too strict. It's perfectly fine for binaries to link to libgomp (it's also fine if they are Fortran codes). I can't think of any reason to check any binary for any banned linking.

It's not that I disagree, but this would require a total rework of how the check is done in the framework, as no distinction is currently made between executables and shared libraries (and it looks like additional requirements would be needed to check in Python whether something is an executable as opposed to a shared lib).
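
For illustration, a standard-library-only sketch of one possible way to tell executables from shared libraries by reading the ELF header. This is not how the framework does it, and all names are hypothetical. It handles only 64-bit little-endian ELF, and the `PT_INTERP` test is a heuristic: position-independent executables are `ET_DYN` just like shared libraries, and a few shared libraries (such as the C library) also carry an interpreter segment.

```python
import struct

ET_EXEC = 2    # classic non-PIE executable
ET_DYN = 3     # shared object, also used for PIE executables
PT_INTERP = 3  # program header that names the dynamic loader


def classify_elf(data):
    """Classify raw ELF bytes (64-bit little-endian only in this sketch)."""
    if len(data) < 64 or data[:4] != b"\x7fELF":
        return "not_elf"
    if data[4] != 2 or data[5] != 1:
        return "unsupported"  # not ELFCLASS64 or not little-endian
    e_type = struct.unpack_from("<H", data, 16)[0]
    if e_type == ET_EXEC:
        return "executable"
    if e_type != ET_DYN:
        return "other"
    e_phoff = struct.unpack_from("<Q", data, 32)[0]
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for index in range(e_phnum):
        offset = e_phoff + index * e_phentsize
        if offset + 4 > len(data):
            break  # truncated input, stop scanning
        if struct.unpack_from("<I", data, offset)[0] == PT_INTERP:
            return "executable"  # heuristic: PIE with a dynamic loader
    return "shared_library"


def fake_elf(e_type, segment_types):
    """Build a synthetic ELF64 header plus program headers, for demonstration only."""
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    header = struct.pack("<HHIQQQIHHHHHH", e_type, 62, 1, 0, 64, 0, 0,
                         64, 56, len(segment_types), 0, 0, 0)
    segments = b"".join(struct.pack("<I", t) + bytes(52) for t in segment_types)
    return ident + header + segments
```

On a real file one would pass something like `open(path, 'rb').read()`; reading only the header and program header table would be enough, but the sketch keeps it simple.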

Micket (Contributor) commented Jul 16, 2025

After scanning through all my libraries and binaries at GCCcore level, use of OpenMP was actually far less prevalent than I thought it would be (though I haven't built everything, and of course, things can change).

I'm worried about cascading effects: some small library that happens to be a dependency of something very core (a compression library, for example) could force everything that depends on it to move from GCCcore to GCC + intel-compilers + clang + NVHPC, which would be a massive pain.

A quick scan of my tree found things we would be banning due to binaries:
  • gettext
  • Kalign
  • SoX
  • speech_tools
  • UCX

What really worries me are things like gettext (or UCX). Yikes, that would really suck to have to move.

We would of course just fix those cases rather than moving them (for example by disabling OpenMP or statically linking them) where possible, to avoid the cascade of moved dependencies. So maybe in practice this is a non-issue.


what about binary packages which either bundle their own OpenMP runtime, or e.g. use the one from GCC as well?

I don't think there is any hope that we can do something about the ABI compatibility of things we don't control ourselves. So, as with any commercial binary install, I think it is just install and hope.


When Python imports foo from a lib/site-packages/foo.cpython-313-x86_64-linux-gnu.so, it can also pull in banned linked libraries (which will be in a global scope if I'm not mistaken, since it's just a dlopen). Looking at the framework code, we don't check for this currently (it's not recursively checking those directories, just lib/ and bin/).
From a quick check in my test tree, it doesn't look like we are breaking this anywhere I've tested yet, so that's good (in my software tree it is only used by GPAW and rpy2, which are at foss level).
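
A rough sketch of what a recursive scan for shared objects could look like, so that extension modules under site-packages are also found. This is not the framework's implementation; the function names and the filename pattern are assumptions, and matching on names alone will miss shared objects with unusual names.

```python
import os
import re

# Assumption: treat "*.so" and versioned "*.so.N[.N...]" names as shared objects.
SHARED_OBJECT_RE = re.compile(r'\.so(\.\d+)*$')


def is_shared_object_name(path):
    """True if the file name looks like a shared object."""
    return bool(SHARED_OBJECT_RE.search(os.path.basename(path)))


def find_shared_objects(root):
    """Walk an installation directory recursively and collect shared objects."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if is_shared_object_name(name):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Each path returned by `find_shared_objects` could then be passed through the same banned-library check that is applied to files directly under lib/ and bin/.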

Thyre (Contributor) commented Jul 16, 2025

gettext uses OpenMP in msgmerge at one single place. One would need to check how large the impact of not using OpenMP would be.

IIRC, UCX only uses OpenMP for a (performance) test. Could be fine to not build that test, or disable OpenMP for that...

But like you said earlier, binaries using OpenMP shouldn't necessarily be a blocker for an installation, only shared libraries

Development

Successfully merging this pull request may close these issues.

The GCC OpenMP runtime (libgomp) should be a banned library for GCCcore (only)
4 participants