-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Glibc SIGSYS patch for sandboxed localtime_r() #2527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The LocalTime::now() function is used by the logger for some messages (such as warnings). If the first case of calling a log message that included the time happened in a sandboxed thread, glibc would potentially read from the host filesystem on that thread in response to the log message. This effect arises from the implementation of how localtime_r() obtains time zone information if the "TZ" environment variable is not set. A function called __tzfile_read() is called: https://sources.debian.org/src/glibc/2.28-10/time/tzset.c/#L584 Which triggers an fopen(): https://sources.debian.org/src/glibc/2.28-10/time/tzfile.c/#L165 Because this information is cached, it appears possible to work around the problem by making an early call to LocalTime::now() during initialization. Then if calls pertaining to logging come later on the seccomp filtered thread the caching is done already and problems don't occur. This permits the alpine demo from the quick start to appear to run with glibc instead of musl, and built with the standard Rust on an Ubuntu 20.04 installation (inside of VirtualBox 6.1.20). Signed-off-by: Cerul Alain <[email protected]>
@AE1020 Great dive deep and thank you for the detailed explanation ! Personally I'm a bit reluctant about including such workarounds in the code. How long would the timezone be cached ? Would it be possible for it to be evicted from the cache and for the VM to crash unpredictably later ? |
Hi @AE1020, there is a similar issue with musl as well, and we chose was to allow the required syscall in the seccomp filter:
For glibc, can you please tell us what the syscall that triggers the sigsys is? Firecracker usually logs the number of the offending syscall. Firecracker's glibc support is considered only experimental, and the recommended, production libc for Firecracker is musl, as mentioned in the documentation: https://github.com/firecracker-microvm/firecracker/blob/a367796e66eeac42d9ce1294c0fbbca6191e9cf3/docs/getting-started.md#alternative-building-firecracker-using-glibc. As a development workaround, you may run with seccomp disabled, ( Moreover, we're currently working on a feature that will enable users to provide custom file-based seccomp filters to Firecracker at runtime. Here is the feature branch where this is being developed: https://github.com/firecracker-microvm/firecracker/tree/feature/file_based_seccomp. I'll be closing this PR for now, for the above reasons |
I got the problem running guest Ubuntu 20.04 (glibc 2.31) in a Windows 10 VirtualBox 6.1.20 host. /etc/timezone exists and TZ environment variable not set. It was a fresh install created expressly to try Firecracker with, and try running Valgrind on ("good first issue").
I pointed out the specific fopen() call. The path requested was "/". The SIGSYS is generated by openat: https://sources.debian.org/src/glibc/2.31-11/sysdeps/unix/sysv/linux/openat64_nocancel.c/#L43 UPDATE: Looking at it with a debug build of glibc, the "/" may have been a weakness in the vscode + rust debugger showing the leading character only of a longer path. Because the debug build of glibc shows the file path as "/dbg/etc/localtime"
Perhaps it is different as the issue itself occurs during logging. The musl build successfully ran the demo, and the log had these lines in it:
That was where the glibc build output gave:
Then it terminated.
This may work (presuming the "/" request for file was a red herring based on broken debugging). Though... it may be that as a cloud-focused offering, just reducing variability by using UTC for all times makes the most sense...or only using time zones if TZ set in the environment or passed as a config option, etc. Doing UTC unless a time zone is in the config may make the most sense, as if you want any time zone when reading logs, it's the client's.
Yes (and design-wise I absolutely prefer musl, always good to see people use it). But building with mainstream defaults can offer advantages for instrumentation and tooling not tailored to musl builds. And realistically speaking, if one is forced to be running glibc for other reasons it is saves space overall and is faster. Plus multiple implementations mean it can draw attention to gaps in understandings. I think in the area of these seccomp features specifically one wouldn't want to disable it in a variant implementation, to get any insights. Anyway, PRs for glibc support were accepted and it seemed close to working. This change actually did allow it to work in my case. A PR with data and sourcing strikes me as more informative than a bug report. 🤷♂️ |
With the machine I tried, with glibc 2.26, this is the process for reading the local timezone (I ran using strace):
This is a perfect example of why introducing glibc-specific changes does not scale well. They may vary across versions. I also don't see what the concern would be with
There are examples of PRs for glibc support similar to this one that were closed specifically because they were addressing a seccomp violation. We added the experimental flag to glibc later on and developed the said seccomp file-based feature specifically so that we can handle such cases gracefully. We appreciate your effort and indeed you made a very informative PR. However, as glibc support is experimental, and you are trying to run Firecracker with glibc for development/testing purposes, is running with |
I'm gonna reopen this PR for now, to offer clear exposure of the discussion |
Note that all of this is new to me (Firecracker, seccomp, Rust...), so there's a lot I don't understand. But I'd have thought the difference would be that you are not having the call trigger at the same time--not that glibc has changed much. Are you sure your comparison case has that fopen() being called inside of My specific problem happens when
Thanks! I was just reacting a bit to hearing it as "Hey, it's experimental, can't you read?", plus being closed while I was mid-response to the other questions. I can read! :-) So I appreciate you reopening it. But I don't have any specific needs... really it's just a learning experience for me. I was looking to get an idea of Rust from projects that are up my alley of interest rather than slog through a book. Hmmm, maybe I can't read. :-P |
It really wasn't meant like that :) It's just helpful to link documentation whenever you're referring it.
Right, so, in the scenario I've tested, I too get a warning in the logs about the keyboard interrupt, so this is not the problem. Whether the call is made at the same time in this case doesn't matter because I ran using strace, so any system call was recorded, and the code I've pasted in the previous comment is what gets triggered for the specific glibc version I've used. The rust code for
And we appreciate your effort. We're looking forward to your next contributions. As far as I understand, your progress on other issues is unblocked by running with Thanks |
I'm not clear on what you mean by this (e.g. it clearly mattered in my case) If something about your situation caused a warning earlier--such as closer to startup--it would have had the same effect as my workaround. The problem is very specific to when the fopen() happened--not that it happened at all. I won't argue that it could be due to a glibc difference (e.g. yours could cache the time zone on library startup). I'm just saying I do not understand what about your environment would be different if it's permitting the access to "/etc/localtime" in the protected thread, when mine did not allow that access. I've named a possible cause: e.g. you have some other condition (another warning or configuration option) that causes a time request to be made earlier than that moment. Then this would not be related to a difference in the glibc--and it would be a matter of the timing of calls into glibc instead. I'd still think something like this would be more likely than a glibc change, and might be worth pinning down for the sake of comprehension. |
The thing that matters in the context of seccomp is whether a faulty syscall has happened before the In this sense, the timing of when exactly this was called does not matter, because in terms of seccomp, the only thing that matters is whether the filter was installed or not yet. The glibc I'm using isn't caching the timezone at process startup time. Having run with strace, I can see exactly what system calls my code is running, so I'm sure of this. To get a very clear understanding of what is happening on your side, you could run Firecracker under strace, like so: The thing that must differ from my reproduction scenario vs yours is the libc version. If your glibc version was calling I can also paste the strace output for the version of glibc I'm using, if you'd like |
Okay--yes it's calling
But I now gather that for the specific feature of host filesystem protection, you are foregoing seccomp features...in order to use your own more configurable "jailer" solution. Hence seccomp is protecting against other non-filesystem oriented syscalls (so attackers who manage to trick your Rust code can't make it execute new processes, etc). So if one just deletes line 108 in filters.rs firecracker/src/vmm/src/default_syscalls/filters.rs Lines 106 to 109 in ffbc2ef
But I don't know why your glibc 2.26 wouldn't need that change too. Because articles dating back to 2017--speaking about the impacts on seccomp specifically--say that glibc 2.26 was when it happened. ❓ Other than that point, it makes sense now. But perhaps you see why it would be confusing. The seccomp feature is explained as having the purpose of blocking access to anything other than already-open file descriptors, unless you are using a non-standard filtering extension. So I thought rejecting fopen() was the right thing to do, and not the bug. |
Linux has two types of seccomp filtering. It has a default seccomp filter ( Have a look here for more details: https://man7.org/linux/man-pages/man2/seccomp.2.html I can see where your confusion was coming from, no worries. Hope I could help a bit. |
Do you plan to add One might add it with a comment saying it was needed by GLIBC 2.26 and above, because that's when the internet says Glibc changed, But your Glibc 2.26 worked, suggesting it did not (...or perhaps you were running on aarch64?). |
Reason for This PR
This permits the alpine demo from the quick start to appear to
run with glibc instead of musl...when some warnings are being
emitted by the logger that include timestamps. Previously the
warning emission in a glibc build would cause SIGSYS.
Description of Changes
The LocalTime::now() function is used by the logger for some
messages (such as warnings). If the first case of calling
a log message that included the time happened in a sandboxed
thread, glibc would potentially read from the host filesystem
on that thread in response to the log message.
This effect arises from the implementation of how localtime_r()
obtains time zone information if the "TZ" environment variable
is not set. A function called __tzfile_read() is called:
https://sources.debian.org/src/glibc/2.28-10/time/tzset.c/#L584
Which triggers an fopen():
https://sources.debian.org/src/glibc/2.28-10/time/tzfile.c/#L165
Because this information is cached, it appears possible to work
around the problem by making an early call to LocalTime::now()
during initialization. Then if calls pertaining to logging
come later on the seccomp filtered thread the caching is done
already and problems don't occur.
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license.
PR Checklist
git commit -s
).unsafe
code is properly documented.firecracker/swagger.yaml
.CHANGELOG.md
.RE: "[ ] All added/changed functionality is tested", I am running this in VirtualBox as I do not have a non-virtual Linux machine at present to apply. The musl-based version does not pass all the tests with that either. This change improves the mentioned issue, without seeming to make anything worse.