-
Notifications
You must be signed in to change notification settings - Fork 14.5k
[libc][cmake][linux] require new LLVM_LIBC_USE_HOST_KERNEL_HEADERS or LIBC_KERNEL_HEADERS #123820
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-github-workflow @llvm/pr-subscribers-libc Author: Nick Desaulniers (nickdesaulniers) ChangesWhen cross compiling, we DO NOT want to use the host's kernel headers. Adding We already support setting -DLIBC_KERNEL_HEADERS=/path/to/kernel/headers. Add a Setting LIBC_KERNEL_HEADERS currently produces a few unit test failures that I Add some checks to ensure that at least one of these flags are set, that Full diff: https://github.com/llvm/llvm-project/pull/123820.diff 3 Files Affected:
diff --git a/libc/CMakeLists.txt b/libc/CMakeLists.txt
index 6f1c180a3f192e..11cb7ced4e6073 100644
--- a/libc/CMakeLists.txt
+++ b/libc/CMakeLists.txt
@@ -46,7 +46,11 @@ set(LIBC_BUILD_DIR ${CMAKE_CURRENT_BINARY_DIR})
set(LIBC_ENABLE_USE_BY_CLANG OFF CACHE BOOL "Whether or not to place libc in a build directory findable by a just built clang")
-set(LIBC_KERNEL_HEADERS "/usr/include" CACHE STRING "Path to Linux kernel headers")
+# LLVM_LIBC_USE_HOST_KERNEL_HEADERS is brittle, and frowned upon (prefer
+# setting LIBC_KERNEL_HEADERS), but is simpler from a hello world linux host
+# build.
+set(LLVM_LIBC_USE_HOST_KERNEL_HEADERS OFF CACHE BOOL "Add /usr/include to the header seach path")
+set(LIBC_KERNEL_HEADERS "" CACHE STRING "Path to Linux kernel headers")
# Defining a global namespace to enclose all libc functions.
set(default_namespace "__llvm_libc")
diff --git a/libc/cmake/modules/LLVMLibCArchitectures.cmake b/libc/cmake/modules/LLVMLibCArchitectures.cmake
index fbb1091ddabab4..8cf4dfac70e498 100644
--- a/libc/cmake/modules/LLVMLibCArchitectures.cmake
+++ b/libc/cmake/modules/LLVMLibCArchitectures.cmake
@@ -216,6 +216,27 @@ if (LIBC_TARGET_OS_IS_WINDOWS AND LLVM_LIBC_FULL_BUILD)
message(FATAL_ERROR "Windows does not support full mode build.")
endif ()
+if (LIBC_TARGET_OS_IS_LINUX)
+ if (NOT LLVM_LIBC_USE_HOST_KERNEL_HEADERS AND NOT LIBC_KERNEL_HEADERS STREQUAL "")
+ message(FATAL_ERROR "MUST specify either LIBC_KERNEL_HEADERS or LLVM_LIBC_USE_HOST_KERNEL_HEADERS")
+ endif()
+ if (LLVM_LIBC_USE_HOST_KERNEL_HEADERS AND NOT LIBC_KERNEL_HEADERS STREQUAL "")
+ message(FATAL_ERROR "LLVM_LIBC_USE_HOST_KERNEL_HEADERS and LIBC_USE_HOST_KERNEL_HEADERS are mutually exclusive")
+ endif()
+ if(LIBC_TARGET_TRIPLE AND LLVM_LIBC_USE_HOST_KERNEL_HEADERS)
+ # This is because the syscall numbers are frequently different between
+ # different target architectures. Code may compile, but you'll get spooky
+ # runtime failures.
+ message(FATAL_ERROR "LLVM_LIBC_USE_HOST_KERNEL_HEADERS should not be set when using LIBC_TARGET_TRIPLE, set LIBC_KERNEL_HEADERS instead")
+ endif()
+ if (NOT LIBC_KERNEL_HEADERS STREQUAL "" AND NOT IS_DIRECTORY LIBC_KERNEL_HEADERS)
+ message(FATAL_ERROR "LIBC_KERNEL_HEADERS should be a path to an existing directory")
+ endif()
+ if (LLVM_LIBC_USE_HOST_KERNEL_HEADERS)
+ set(LIBC_KERNEL_HEADERS "/usr/include" CACHE STRING "Path to Linux kernel headers" FORCE)
+ endif()
+endif()
+
message(STATUS
"Building libc for ${LIBC_TARGET_ARCHITECTURE} on ${LIBC_TARGET_OS} with
diff --git a/libc/docs/getting_started.rst b/libc/docs/getting_started.rst
index 51c295f384a06f..1f62d94d3c3fd5 100644
--- a/libc/docs/getting_started.rst
+++ b/libc/docs/getting_started.rst
@@ -21,6 +21,7 @@ Install dependencies first:
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_C_COMPILER=clang \
-DLLVM_LIBC_FULL_BUILD=ON \
+ -DLLVM_LIBC_USE_HOST_KERNEL_HEADERS=ON \
-DLLVM_LIBC_INCLUDE_SCUDO=ON \
-DCOMPILER_RT_BUILD_SCUDO_STANDALONE_WITH_LLVM_LIBC=ON \
-DCOMPILER_RT_BUILD_GWP_ASAN=OFF \
|
Alternatively, we could just always require folks to provide linux kernel headers. As I said, it's quite trivial to build them from source. |
# setting LIBC_KERNEL_HEADERS), but is simpler from a hello world linux host | ||
# build. | ||
set(LLVM_LIBC_USE_HOST_KERNEL_HEADERS OFF CACHE BOOL "Add /usr/include to the header seach path") | ||
set(LIBC_KERNEL_HEADERS "" CACHE STRING "Path to Linux kernel headers") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, I wonder if we should be using option
instead? I see that used elsewhere in our cmake. I'm not familiar with option
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two ways to define a boolean flag in CMake:
set(FOO true CACHE BOOL "description")
option(FOO "description" ON)
The difference between the two is that the option
command does not create a CACHE
variable if a normal variable with the same name exists, but using set
with CACHE
variable named FOO
when FOO
exists as a normal variable will yield two copies of FOO
, a normal variable and a CACHE
one.
Which is one more appropriate really depends on your intention. There are places in LLVM build system where we intentionally rely on the set
semantics (e.g. we would use the CACHE
variable as the default value to initialize the normal variable).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so it sounds like cmake has distinct "namespace-esque" differentiation between "normal variables" and "cache variables."
For cmake variables that we encourage folks to use to configure llvm-libc, I think we may want option
. That seems very common in libcxx/CMakeLists.txt
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -216,6 +216,27 @@ if (LIBC_TARGET_OS_IS_WINDOWS AND LLVM_LIBC_FULL_BUILD) | |||
message(FATAL_ERROR "Windows does not support full mode build.") | |||
endif () | |||
|
|||
if (LIBC_TARGET_OS_IS_LINUX) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to automatically set LLVM_LIBC_USE_HOST_KERNEL_HEADERS
for overlay builds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worthwhile to consider this. Let me jot down my thoughts.
Doing so would effectively change the implicit default value of LIBC_USE_HOST_KERNEL_HEADERS
based on whether we were doing full build (off) vs overlay build (on).
Two invariants I'd like to maintain:
- In the spirit of [libc][docs] reorder docs to be more beginner friendly #122376, I would like us to consider overlay builds the "more advanced" of the build configurations relative to full builds (due to the ABI compatibility concerns). So if we require folks doing overlay builds to add one more cmake flag, in this case
-DLIBC_USE_HOST_KERNEL_HEADERS=ON
, then I don't feel too bad. - We encourage folks targeting linux to just build the kernel headers; it's pretty straightforward. If they can't do this, then forcing them to admit to bad behavior via another cmake var is somewhat part of the intent of this PR. Perhaps that's being malevolent. Implying
LIBC_USE_HOST_KERNEL_HEADERS
for overlay builds removes the point that folks own up to looking at /usr/include. (Besides, it hurts our hermeticity; once we add /usr/include to the directory search path, we MAY end up including other non-kernel headers since kernel and non-kernel headers are all mixed together in /usr/include. This alone gives me heartburn).
Thinking through the case of wanting to do an overlay build, is there ever a case where we'd want to use newer kernel headers (or perhaps older kernel headers) than what was available on the host? A hypothetical use case for newer headers: you want to build on one host (that has old headers) but support newer syscalls. The result can only run on other machines with newer kernels (when newer syscalls are made at runtime). A hypothetical use case for older headers: you want to build on one host (that has newer headers), but run the libc on another target that is running an older kernel. Both of those hypotheticals seem to be a stretch in my mind. Perhaps overlay mode should (or does) imply "literally the result will only run on my host machine+kernel?"
An inspiration for LIBC_USE_HOST_KERNEL_HEADERS
is -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang
which was a flag in clang that was necessary to use -ftrivial-auto-var-init=
for some time. Basically, it was intended as a way to not outright block people from using -ftrivial-auto-var-init=
, but make it such that there was one more hoop for them to jump through to use it.
In the same vein, I don't want to outright prevent people from looking at /usr/include
for linux kernel headers (indeed, maybe on the buildbots we'll skip building kernel headers for each run), but I would like to gradually move us away from doing so.
By adding LIBC_USE_HOST_KERNEL_HEADERS
, I think that strikes the right balance where folks (and build bots) can transition to LIBC_KERNEL_HEADERS
over time, rather than forcing them to do so right now (and not providing any option otherwise).
So I guess, yes, it would make sense to automatically set it. But that betrays an intent in this PR to add one more hoop for folks who refuse to just build the kernel headers from source to sign on the X that they are potentially including more than just kernel headers when they add /usr/include to the compiler's header search path. Is that being too malicious?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main concern is around default configurations. Currently, LLVM_LIBC_FULL_BUILD
defaults to off, so overlay is the default. We should probably change that at some point, but that's a separate issue. The important thing is that our cmake shouldn't error if you don't specify any flags.
Currently, if you run cmake ../runtimes -G Ninja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DLLVM_ENABLE_RUNTIMES="libc"
you get an overlay mode build with no allocator. It's not a particularly useful configuration, but it will build and pass our tests. This is important for two reasons:
- First impressions are important, a build failure looks bad, and lots of people don't read the docs.
- A surprising number of people download the
llvm-project
repo, set the enabled projects/runtimes to "all", and runmake install
. They then file bugs when they run into a build error, even if it's for a project they didn't even plan to use. This is annoying and wastes both their time and ours, so I'd like to avoid it if possible.
The other issue is that for overlay mode we do actually need to look at the other libc's headers. All of our type/macro headers in libc/hdr/
use the system's headers in overlay mode, and a couple places (specifically in printf, assert, and a few other places iirc) even call functions from the system's libc. Overlay mode can't be hermetic, and I don't think we can practically make it independent of the system it's targeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other issue is that for overlay mode we do actually need to look at the other libc's headers.
D'oh, you're right!
My main concern is around default configurations.
This is something I hadn't even considered. I can get behind wanting an optionless configure to build out of the box. Let me think more about this over lunch.
Currently, LLVM_LIBC_FULL_BUILD defaults to off, so overlay is the default. We should probably change that at some point, but that's a separate issue.
I agree. Filed #124973.
… LIBC_KERNEL_HEADERS When cross compiling, we DO NOT want to use the host's kernel headers. Adding all of /usr/include via `-dirafter` also isn't very hermetic since it can pull in more than just kernel headers. But peeking at /usr/include simplifies setting up the libc, since then the kernel headers don't have to be built from source. (Building kernel headers from source is quite trivial though). We already support setting -DLIBC_KERNEL_HEADERS=/path/to/kernel/headers. Add a new boolean cmake flag, LLVM_LIBC_USE_HOST_KERNEL_HEADERS, which indicates that the user is opting into just using the hosts kernel headers. Existing host builds may break and need to set this. Setting LIBC_KERNEL_HEADERS currently produces a few unit test failures that I still need to debug, but ideally users targeting linux will always pass LIBC_KERNEL_HEADERS in the future. Add some checks to ensure that at least one of these flags are set, that they're mutually exclusive, that the headers are required when cross compiling, and that the headers path is a directory.
bcfb656
to
d4c49a6
Compare
precommit errors:
|
$> make LLVM=1 INSTALL_HDR_PATH=/path/to/sysroot -C /tmp/linux headers_install | ||
|
||
The headers can be built to target non-host architectures by adding the | ||
``ARCH={arm|arm64|i386}`` to the above invocation of ``make``. | ||
``ARCH={arm|arm64|i386}`` to the above invocation of ``make``. Then you should | ||
set the cmake variable `-DLIBC_KERNEL_HEADERS=/path/to/sysroot` when |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for consistency, would it make sense to use /path/to/kernel/headers/
instead of /path/to/sysroot
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, you're right about the inconsistency. I think I will resolve it the other way, since existing docs talk about setting
$> SYSROOT=/path/to/sysroot # Remember to set this!
above. So I will use /path/to/sysroot/
more consistently and get rid of /path/to/kernel/headers
.
Done in db5b2f9.
@@ -216,6 +216,27 @@ if (LIBC_TARGET_OS_IS_WINDOWS AND LLVM_LIBC_FULL_BUILD) | |||
message(FATAL_ERROR "Windows does not support full mode build.") | |||
endif () | |||
|
|||
if (LIBC_TARGET_OS_IS_LINUX) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main concern is around default configurations. Currently, LLVM_LIBC_FULL_BUILD
defaults to off, so overlay is the default. We should probably change that at some point, but that's a separate issue. The important thing is that our cmake shouldn't error if you don't specify any flags.
Currently, if you run cmake ../runtimes -G Ninja -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DLLVM_ENABLE_RUNTIMES="libc"
you get an overlay mode build with no allocator. It's not a particularly useful configuration, but it will build and pass our tests. This is important for two reasons:
- First impressions are important, a build failure looks bad, and lots of people don't read the docs.
- A surprising number of people download the
llvm-project
repo, set the enabled projects/runtimes to "all", and runmake install
. They then file bugs when they run into a build error, even if it's for a project they didn't even plan to use. This is annoying and wastes both their time and ours, so I'd like to avoid it if possible.
The other issue is that for overlay mode we do actually need to look at the other libc's headers. All of our type/macro headers in libc/hdr/
use the system's headers in overlay mode, and a couple places (specifically in printf, assert, and a few other places iirc) even call functions from the system's libc. Overlay mode can't be hermetic, and I don't think we can practically make it independent of the system it's targeting.
Ah! Forgot to drop the |
probably need to update .github/workflows/libc-overlay-tests.yml, too. |
Talked more with @michaelrj-google about this after lunch. Taking a step back, I think what I care more about is requiring the use of |
When cross compiling, we DO NOT want to use the host's kernel headers. Adding
all of /usr/include via
-dirafter
also isn't very hermetic since it can pullin more than just kernel headers. But peeking at /usr/include simplifies
setting up the libc, since then the kernel headers don't have to be built from
source. (Building kernel headers from source is quite trivial though).
We already support setting -DLIBC_KERNEL_HEADERS=/path/to/kernel/headers. Add a
new boolean cmake flag, LLVM_LIBC_USE_HOST_KERNEL_HEADERS, which indicates that
the user is opting into just using the hosts kernel headers. Existing host
builds may break and need to set this.
Setting LIBC_KERNEL_HEADERS currently produces a few unit test failures that I
still need to debug, but ideally users targeting linux will always pass
LIBC_KERNEL_HEADERS in the future.
Add some checks to ensure that at least one of these flags are set, that
they're mutually exclusive, that the headers are required when cross compiling,
and that the headers path is a directory.