-
Notifications
You must be signed in to change notification settings - Fork 126
bpf: Mitigate Spectre v1 using barriers #8817
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: bpf-next_base
Are you sure you want to change the base?
Conversation
Upstream branch: 8582d9a |
a1272ec
to
e95f7e7
Compare
9728a8b
to
27c7945
Compare
Upstream branch: 5709be4 |
e95f7e7
to
cfe0b88
Compare
27c7945
to
4f05866
Compare
Upstream branch: be2fea9 |
cfe0b88
to
d358f9f
Compare
4f05866
to
e9b7177
Compare
Upstream branch: 5cffad0 |
d358f9f
to
80285ab
Compare
e9b7177
to
f106256
Compare
Upstream branch: 53ebef5 |
80285ab
to
c74fc67
Compare
f106256
to
8b59228
Compare
Upstream branch: 6aca583 |
c74fc67
to
e71ad5a
Compare
8b59228
to
9c93586
Compare
Upstream branch: b9c09fb |
e71ad5a
to
46dc21f
Compare
9c93586
to
ca1daab
Compare
Upstream branch: 60400cd |
46dc21f
to
19a5df1
Compare
ca1daab
to
cad9039
Compare
Upstream branch: 4cc2048 |
19a5df1
to
139060e
Compare
cad9039
to
c7849f3
Compare
Upstream branch: 46eb012 |
139060e
to
1ed4582
Compare
c7849f3
to
a041a61
Compare
Upstream branch: 91dbac4 |
1ed4582
to
857cb7c
Compare
a041a61
to
b2666a3
Compare
Currently, __xlated_unpriv and __jited_unpriv do not work because the BPF syscall will overwrite info.jited_prog_len and info.xlated_prog_len with 0 if the process is not bpf_capable(). This bug was not noticed before, because there is no test that actually uses __xlated_unpriv/__jited_unpriv. To resolve this, simply restore the capabilities earlier (but still after loading the program). Adding this here unconditionally is fine because the function first checks that the capabilities were initialized before attempting to restore them. This will be important later when we add tests that check whether a speculation barrier was inserted in the correct location. Signed-off-by: Luis Gerhorst <[email protected]> Fixes: 9c9f733 ("selftests/bpf: allow checking xlated programs in verifier_* tests") Fixes: 7d743e4 ("selftests/bpf: __jited test tag to check disassembly after jit")
This is required to catch the errors later and fall back to a nospec if on a speculative path. Eliminate the regs variable as it is only used once and insn_idx is not modified in-between the definition and usage. Still pass insn simply to match the other check_*() functions. As Eduard points out [1], insn is assumed to correspond to env->insn_idx in many places (e.g, __check_reg_arg()). Move code into do_check_insn(), replace * "continue" with "return 0" after modifying insn_idx * "goto process_bpf_exit" with "return PROCESS_BPF_EXIT" * "do_print_state = " with "*do_print_state = " [1] https://lore.kernel.org/all/[email protected]/ Signed-off-by: Luis Gerhorst <[email protected]> Acked-by: Henriette Herzog <[email protected]> Cc: Maximilian Ott <[email protected]> Cc: Milan Stephan <[email protected]>
Mark these cases as non-recoverable to later prevent them from being caught when they occur during speculative path verification. Eduard writes [1]: The only pace I'm aware of that might act upon specific error code from verifier syscall is libbpf. Looking through libbpf code, it seems that this change does not interfere with libbpf. [1] https://lore.kernel.org/all/[email protected]/ Signed-off-by: Luis Gerhorst <[email protected]> Reviewed-by: Eduard Zingerman <[email protected]> Acked-by: Henriette Herzog <[email protected]> Cc: Maximilian Ott <[email protected]> Cc: Milan Stephan <[email protected]>
This prevents us from trying to recover from these on speculative paths in the future. Signed-off-by: Luis Gerhorst <[email protected]> Reviewed-by: Eduard Zingerman <[email protected]> Acked-by: Henriette Herzog <[email protected]> Cc: Maximilian Ott <[email protected]> Cc: Milan Stephan <[email protected]>
JITs can set bpf_jit_bypass_spec_v1/v4() if they want the verifier to skip analysis/patching for the respective vulnerability. For v4, this will reduce the number of barriers the verifier inserts. For v1, it allows more programs to be accepted. The primary motivation for this is to not regress unpriv BPF's performance on ARM64 in a future commit where BPF_NOSPEC is also used against Spectre v1. This has the user-visible change that v1-induced rejections on non-vulnerable PowerPC CPUs are avoided. For now, this does not change the semantics of BPF_NOSPEC. It is still a v4-only barrier and must not be implemented if bypass_spec_v4 is always true for the arch. Changing it to a v1 AND v4-barrier is done in a future commit. As an alternative to bypass_spec_v1/v4, one could introduce NOSPEC_V1 AND NOSPEC_V4 instructions and allow backends to skip their lowering as suggested by commit f5e81d1 ("bpf: Introduce BPF nospec instruction for mitigating Spectre v4"). Adding bpf_jit_bypass_spec_v1/v4() was found to be preferable for the following reason: * bypass_spec_v1/v4 benefits non-vulnerable CPUs: Always performing the same analysis (not taking into account whether the current CPU is vulnerable), needlessly restricts users of CPUs that are not vulnerable. The only use case for this would be portability-testing, but this can later be added easily when needed by allowing users to force bypass_spec_v1/v4 to false. * Portability is still acceptable: Directly disabling the analysis instead of skipping the lowering of BPF_NOSPEC(_V1/V4) might allow programs on non-vulnerable CPUs to be accepted while the program will be rejected on vulnerable CPUs. With the fallback to speculation barriers for Spectre v1 implemented in a future commit, this will only affect programs that do variable stack-accesses or are very complex. For PowerPC, the SEC_FTR checking in bpf_jit_bypass_spec_v4() is based on the check that was previously located in the BPF_NOSPEC case. For LoongArch, it would likely be safe to set both bpf_jit_bypass_spec_v1() and _v4() according to commit a6f6a95f2580 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode"). This is omitted here as I am unable to do any testing for LoongArch. Signed-off-by: Luis Gerhorst <[email protected]> Cc: Henriette Herzog <[email protected]> Cc: Maximilian Ott <[email protected]> Cc: Milan Stephan <[email protected]>
This changes the semantics of BPF_NOSPEC (previously a v4-only barrier) to always emit a speculation barrier that works against both Spectre v1 AND v4. If mitigation is not needed on an architecture, the backend should set bpf_jit_bypass_spec_v4/v1(). As of now, this commit only has the user-visible implication that unpriv BPF's performance on PowerPC is reduced. This is the case because we have to emit additional v1 barrier instructions for BPF_NOSPEC now. This commit is required for a future commit to allow us to rely on BPF_NOSPEC for Spectre v1 mitigation. As of this commit, the feature that nospec acts as a v1 barrier is unused. Commit f5e81d1 ("bpf: Introduce BPF nospec instruction for mitigating Spectre v4") noted that mitigation instructions for v1 and v4 might be different on some archs. While this would potentially offer improved performance on PowerPC, it was dismissed after the following considerations: * Only having one barrier simplifies the verifier and allows us to easily rely on v4-induced barriers for reducing the complexity of v1-induced speculative path verification. * For the architectures that implemented BPF_NOSPEC, only PowerPC has distinct instructions for v1 and v4. Even there, some insns may be shared between the barriers for v1 and v4 (e.g., 'ori 31,31,0' and 'sync'). If this is still found to impact performance in an unacceptable way, BPF_NOSPEC can be split into BPF_NOSPEC_V1 and BPF_NOSPEC_V4 later. As an optimization, we can already skip v1/v4 insns from being emitted for PowerPC with this setup if bypass_spec_v1/v4 is set. Vulnerability-status for BPF_NOSPEC-based Spectre mitigations (v4 as of this commit, v1 in the future) is therefore: * x86 (32-bit and 64-bit), ARM64, and PowerPC (64-bit): Mitigated - This patch implements BPF_NOSPEC for these architectures. The previous v4-only version was supported since commit f5e81d1 ("bpf: Introduce BPF nospec instruction for mitigating Spectre v4") and commit b7540d6 ("powerpc/bpf: Emit stf barrier instruction sequences for BPF_NOSPEC"). * LoongArch: Not Vulnerable - Commit a6f6a95 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode") is the only other past commit related to BPF_NOSPEC and indicates that the insn is not required there. * MIPS: Vulnerable (if unprivileged BPF is enabled) - Commit a6f6a95f2580 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode") indicates that it is not vulnerable but this contradicts the kernel and Debian documentation. Therefore I assume that there exist vulnerable MIPS CPUs (but maybe not from Loongson?). In the future, BPF_NOSPEC could be implemented for MIPS based on the GCC speculation_barrier [1]. For now, we rely on unprivileged BPF being disabled by default. * Other: Unknown - To the best of my knowledge there is no definitive information available that indicates that any other arch is vulnerable. They are therefore left untouched (BPF_NOSPEC is not implemented, but bypass_spec_v1/v4 is also not set). I did the following testing to ensure the insn encoding is correct: * ARM64: * 'dsb nsh; isb' was successfully tested with the BPF CI in [2] * 'sb' locally using QEMU v7.2.15 -cpu max (emitted sb insn is executed for example with './test_progs -t verifier_array_access') * PowerPC: The following configs were tested locally with ppc64le QEMU v8.2 '-machine pseries -cpu POWER9': * STF_BARRIER_EIEIO + CONFIG_PPC_BOOK32_64 * STF_BARRIER_SYNC_ORI (forced on) + CONFIG_PPC_BOOK32_64 * STF_BARRIER_FALLBACK (forced on) + CONFIG_PPC_BOOK32_64 * CONFIG_PPC_E500 (forced on) + STF_BARRIER_EIEIO * CONFIG_PPC_E500 (forced on) + STF_BARRIER_SYNC_ORI (forced on) * CONFIG_PPC_E500 (forced on) + STF_BARRIER_FALLBACK (forced on) * CONFIG_PPC_E500 (forced on) + STF_BARRIER_NONE (forced on) Most of those cobinations should not occur in practice, but I was not able to get an PPC e6500 rootfs (for testing PPC_E500 without forcing it on). In any case, this should ensure that there are no unexpected conflicts between the insns when combined like this. Individual v1/v4 barriers were already emitted elsewhere. [1] https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=29b74545531f6afbee9fc38c267524326dbfbedf ("MIPS: Add speculation_barrier support") [2] #8576 Signed-off-by: Luis Gerhorst <[email protected]> Cc: Henriette Herzog <[email protected]> Cc: Maximilian Ott <[email protected]> Cc: Milan Stephan <[email protected]>
This is made to clarify that this flag will cause a nospec to be added after this insn and can therefore be relied upon to reduce speculative path analysis. Signed-off-by: Luis Gerhorst <[email protected]> Cc: Henriette Herzog <[email protected]> Cc: Maximilian Ott <[email protected]> Cc: Milan Stephan <[email protected]>
This implements the core of the series and causes the verifier to fall back to mitigating Spectre v1 using speculation barriers. The approach was presented at LPC'24 [1] and RAID'24 [2]. If we find any forbidden behavior on a speculative path, we insert a nospec (e.g., lfence speculation barrier on x86) before the instruction and stop verifying the path. While verifying a speculative path, we can furthermore stop verification of that path whenever we encounter a nospec instruction. A minimal example program would look as follows: A = true B = true if A goto e f() if B goto e unsafe() e: exit There are the following speculative and non-speculative paths (`cur->speculative` and `speculative` referring to the value of the push_stack() parameters): - A = true - B = true - if A goto e - A && !cur->speculative && !speculative - exit - !A && !cur->speculative && speculative - f() - if B goto e - B && cur->speculative && !speculative - exit - !B && cur->speculative && speculative - unsafe() If f() contains any unsafe behavior under Spectre v1 and the unsafe behavior matches `state->speculative && error_recoverable_with_nospec(err)`, do_check() will now add a nospec before f() instead of rejecting the program: A = true B = true if A goto e nospec f() if B goto e unsafe() e: exit Alternatively, the algorithm also takes advantage of nospec instructions inserted for other reasons (e.g., Spectre v4). Taking the program above as an example, speculative path exploration can stop before f() if a nospec was inserted there because of Spectre v4 sanitization. In this example, all instructions after the nospec are dead code (and with the nospec they are also dead code speculatively). On x86_64, this depends on the following property of lfence [3]: An LFENCE instruction or a serializing instruction will ensure that no later instructions execute, even speculatively, until all prior instructions complete locally. [...] Inserting an LFENCE instruction after a bounds check prevents later operations from executing before the bound check completes. Regarding the example, this implies that `if B goto e` will not execute before `if A goto e` completes. Once `if A goto e` completes, the CPU should find that the speculation was wrong and continue with `exit`. If there is any other path that leads to `if B goto e` (and therefore `unsafe()`) without going through `if A goto e`, then a nospec will still be needed there. However, this patch assumes this other path will be explored separately and therefore be discovered by the verifier even if the exploration discussed here stops at the nospec. This patch furthermore has the unfortunate consequence that Spectre v1 mitigations now only support architectures which implement BPF_NOSPEC. Before this commit, Spectre v1 mitigations prevented exploits by rejecting the programs on all architectures. Because some JITs do not implement BPF_NOSPEC, this patch therefore may regress unpriv BPF's security to a limited extent: * The regression is limited to systems vulnerable to Spectre v1, have unprivileged BPF enabled, and do NOT emit insns for BPF_NOSPEC. The latter is not the case for x86 64- and 32-bit, arm64, and powerpc 64-bit and they are therefore not affected by the regression. According to commit a6f6a95 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode"), LoongArch is not vulnerable to Spectre v1 and therefore also not affected by the regression. * To the best of my knowledge this regression may therefore only affect MIPS. This is deemed acceptable because unpriv BPF is still disabled there by default. As stated in a previous commit, BPF_NOSPEC could be implemented for MIPS based on GCC's speculation_barrier implementation. * It is unclear which other architectures (besides x86 64- and 32-bit, ARM64, PowerPC 64-bit, LoongArch, and MIPS) supported by the kernel are vulnerable to Spectre v1. Also, it is not clear if barriers are available on these architectures. Implementing BPF_NOSPEC on these architectures therefore is non-trivial. Searching GCC and the kernel for speculation barrier implementations for these architectures yielded no result. * If any of those regressed systems is also vulnerable to Spectre v4, the system was already vulnerable to Spectre v4 attacks based on unpriv BPF before this patch and the impact is therefore further limited. As an alternative to regressing security, one could still reject programs if the architecture does not emit BPF_NOSPEC (e.g., by removing the empty BPF_NOSPEC-case from all JITs except for LoongArch where it appears justified). However, this will cause rejections on these archs that are likely unfounded in the vast majority of cases. In the tests, some are now successful where we previously had a false-positive (i.e., rejection). Change them to reflect where the nospec should be inserted (using __xlated_unpriv) and modify the error message if the nospec is able to mitigate a problem that previously shadowed another problem (in that case __xlated_unpriv does not work, therefore just add a comment). Define SPEC_V1 to avoid duplicating this ifdef whenever we check for nospec insns using __xlated_unpriv, define it here once. This also improves readability. PowerPC can probably also be added here. However, omit it for now because the BPF CI currently does not include a test. Briefly went through all the occurrences of EPERM, EINVAL, and EACCESS in the verifier in order to validate that catching them like this makes sense. [1] https://lpc.events/event/18/contributions/1954/ ("Mitigating Spectre-PHT using Speculation Barriers in Linux eBPF") [2] https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and Precise Spectre Defenses for Untrusted Linux Kernel Extensions") [3] https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/runtime-speculative-side-channel-mitigations.html ("Managed Runtime Speculative Execution Side Channel Mitigations") Signed-off-by: Luis Gerhorst <[email protected]> Acked-by: Henriette Herzog <[email protected]> Cc: Maximilian Ott <[email protected]> Cc: Milan Stephan <[email protected]>
This is based on the gadget from the description of commit 9183671af6db ("bpf: Fix leakage under speculation on mispredicted branches"). Signed-off-by: Luis Gerhorst <[email protected]>
Insert a nospec before the access to prevent it from ever using an index that is subject to speculative scalar-confusion. The access itself can either happen directly in the BPF program (reads only, check_stack_read_var_off()) or in a helper (read/write, check_helper_mem_access()). This relies on the fact that the speculative scalar confusion that leads to the variable-stack access going OOBs must stem from a prior speculative store or branch bypass. Adding a nospec before the variable-stack access will force all previously bypassed stores/branches to complete and cause the stack access to only ever go to the stack slot that is accessed architecturally. Alternatively, the variable-offset stack access might be a write that can itself be subject to speculative store bypass (this can happen in theory even if this code adds a nospec /before/ the variable-offset write). Only indirect writes by helpers might be affected here (e.g., those taking ARG_PTR_TO_MAP_VALUE). (Because check_stack_write_var_off() does not use check_stack_range_initialized(), in-program variable-offset writes are not affected.) If the in-helper write can be subject to Spectre v4 and the helper writes/overwrites pointers on the BPF stack, they are already a problem for fixed-offset stack accesses and should be subject to Spectre v4 sanitization. Signed-off-by: Luis Gerhorst <[email protected]> Acked-by: Henriette Herzog <[email protected]> Cc: Maximilian Ott <[email protected]> Cc: Milan Stephan <[email protected]>
ALU sanitization was introduced to ensure that a subsequent ptr access can never go OOB, even under speculation. This is required because we currently allow speculative scalar confusion. Spec. scalar confusion is possible because Spectre v4 sanitization only adds a nospec after critical stores (e.g., scalar overwritten with a pointer). If we add a nospec before the ALU op, none of the operands can be subject to scalar confusion. As an ADD/SUB can not introduce scalar confusion itself, the result will also not be subject to scalar confusion. Therefore, the subsequent ptr access is always safe. We directly fall back to nospec for the sanitization errors REASON_BOUNDS, _TYPE, _PATHS, and _LIMIT, even if we are not on a speculative path. For REASON_STACK, we return the error -ENOMEM directly now. Previously, sanitize_err() returned -EACCES for this case but we change it to -ENOMEM because doing so prevents do_check() from falling back to a nospec if we are on a speculative path. This would not be a serious issue (the verifier would probably run into the -ENOMEM again shortly on the next non-speculative path and still abort verification), but -ENOMEM is more fitting here anyway. An alternative would be -EFAULT, which is also returned for some of the other cases where push_stack() fails, but this is more frequently used for verifier-internal bugs. Signed-off-by: Luis Gerhorst <[email protected]> Acked-by: Henriette Herzog <[email protected]> Cc: Maximilian Ott <[email protected]> Cc: Milan Stephan <[email protected]>
Upstream branch: 224ee86 |
857cb7c
to
06daf6b
Compare
Pull request for series with
subject: bpf: Mitigate Spectre v1 using barriers
version: 2
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=955213