-
Notifications
You must be signed in to change notification settings - Fork 8
n-acd: decide if ebpf is used at runtime #9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
n-acd is compiled either with functions that implement ebpf functionality using syscall or with dummy functions that always return success and set bpf file descriptor to invalid value. It would be better to decide if the ebpf functionality is used at runtime. Always compile n-acd project with functions implementing ebpf functionality. If syscall to kernel fails on n_acd_bpf_map_create ignore the failure and continue withous using ebpf.
Can you elaborate on your motivation? What is your use-case? Why do you need to check for eBPF at runtime? The current version of n-acd avoids checking for eBPF at runtime, and thus guaranteeing that it will use eBPF if configured with ebpf-support. Thus, it simplifies debugging broken systems, because you know up-front whether it uses eBPF or not. Ideally, we would just hard-code eBPF support and require it. The fallback is currently available for older systems which do not ship the required headers and are unlikely to have runtime support for it. Can you explain why you expect eBPF support to be available at build-time, but the binary be deployed to systems with kernels too old for eBPF? |
there are various cases where BPF fails. I think the main reason is if you run as non-root (e.g. only having CAP_SYS_ADMIN) or in another namespace (where you are root inside the namespace, but not root in the main namespace). This affects for example the unit tests of NetworkManager. But it also affects if you try to run NetworkManager in a system-container. Due to that, we currently build NetworkManager on RHEL and Fedora with BPF disabled. See also I think BPF is mainly a performance optimization to avoid wakeups. But if kernel says it doesn't work, I think we should just proceed and do the best we can (that is, filter packets in user-space). See also https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1049 . Btw, Vojtech is our intern at Red Hat :) |
Thanks for the background! I completely forgot about the privilege-requirements. I don't see any other way than making it runtime-optional in that case.
Welcome to Red Hat! I will have a look at the patches. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments inline. I think we also need to check whether we want to support ENOSYS for unavailable epf-syscalls. Maybe @thom311 knows whether NetworkManager supports old systems like this?
@@ -208,8 +208,6 @@ static int n_acd_probe_link(NAcdProbe *probe) { | |||
* entry. | |||
*/ | |||
r = n_acd_ensure_bpf_map_space(probe->acd); | |||
if (r) | |||
return r; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please keep this error-handler there, otherwise we will suppress real errors from the bpf-helpers.
@@ -238,7 +236,7 @@ static int n_acd_probe_link(NAcdProbe *probe) { | |||
/* | |||
* Add the ip address to the map, if it is not already there. | |||
*/ | |||
if (n_acd_probe_is_unique(probe)) { | |||
if (probe->acd->fd_bpf_map >= 0 && n_acd_probe_is_unique(probe)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would rather prefer for n_acd_bpf_map_add()
to return success early if the passed map is negative. The reason is that we run in a process we do not control, so technically EUID can change during runtime and the bpf-calls might suddenly succeed. If our state always reflects the state as if the BPF map was there, then any call that suddenly succeeds in creating the BPF-map will correctly add all the IPs and have n_bpf_map
reflect the real required state.
@@ -261,7 +259,7 @@ static void n_acd_probe_unlink(NAcdProbe *probe) { | |||
* If this is the only probe for a given IP, remove the IP from the | |||
* kernel BPF map. | |||
*/ | |||
if (n_acd_probe_is_unique(probe)) { | |||
if (probe->acd->fd_bpf_map >= 0 && n_acd_probe_is_unique(probe)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above. I would rather make bpf_map_remove()
bail out early with success if the passed map-fd is negative.
@@ -296,7 +296,7 @@ int n_acd_ensure_bpf_map_space(NAcd *acd) { | |||
|
|||
r = n_acd_bpf_compile(&fd_prog, fd_map, (struct ether_addr*) acd->mac); | |||
if (r) | |||
return r; | |||
fd_prog = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will suppress any real error from n_acd_bpf_compile()
. Why not add N_ACD_E_DENIED
after N_ACD_E_DROPPED
in n-acd-private.h
and make n_acd_bpf_compile()
return that in the exact circumstances where it matters? Then make all callers of it handle that error gracefully, just like you did here.
r = n_acd_bpf_compile(&fd_bpf_prog, acd->fd_bpf_map, (struct ether_addr*) acd->mac); | ||
if (r) | ||
fd_bpf_prog = -1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, I think this should handle a possible N_ACD_E_DENIED
. So something like this:
fd_bpf_prog = -1;
r = n_acd_bpf_map_create(...);
if (!r)
r = n_acd_bpf_compile(...);
if (r && r != N_ACD_E_DENIED)
return r;
n-acd is compiled either with functions that implement ebpf functionality using syscall or with dummy functions that always return success and set bpf file descriptor to invalid value.
It would be better to decide if the ebpf functionality is used at runtime. This can be done by ignoring failure of syscall
made in
n_acd_bpf_map_create
and moving on without using the bpf map. In this case bpf file descriptor should alsobe set to invalid value to prevent other bpf functions from running.