bpf: Mitigate Spectre v1 using barriers #8817

This is required to catch the errors later and fall back to a nospec if on a speculative path.

Eliminate the regs variable as it is only used once and insn_idx is not modified in-between the definition and usage.

Still pass insn simply to match the other check_*() functions. As Eduard points out [1], insn is assumed to correspond to env->insn_idx in many places (e.g., __check_reg_arg()).

Move code into do_check_insn(), replace
* "continue" with "return 0" after modifying insn_idx
* "goto process_bpf_exit" with "return PROCESS_BPF_EXIT"
* "do_print_state = " with "*do_print_state = "

[1] https://lore.kernel.org/all/[email protected]/

Signed-off-by: Luis Gerhorst <[email protected]>
Acked-by: Henriette Herzog <[email protected]>
Cc: Maximilian Ott <[email protected]>
Cc: Milan Stephan <[email protected]>
Acked-by: Kumar Kartikeya Dwivedi <[email protected]>
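The control-flow rewrite described in this commit message can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: apart from PROCESS_BPF_EXIT and the do_print_state out-parameter, every name here (the toy_* types, opcodes, and functions) is hypothetical and does not reflect the real verifier structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the name PROCESS_BPF_EXIT comes from the commit message. */
enum { PROCESS_BPF_EXIT = 1 };
enum toy_op { TOY_NOP, TOY_JUMP, TOY_EXIT, TOY_BAD };

struct toy_insn {
	enum toy_op op;
	int off; /* relative jump offset, only used by TOY_JUMP */
};

struct toy_env {
	const struct toy_insn *insns;
	size_t len;
	size_t insn_idx;
};

/*
 * Check a single instruction. Returns 0 to continue at env->insn_idx,
 * PROCESS_BPF_EXIT when the current path ends, or a negative error.
 */
static int toy_do_check_insn(struct toy_env *env, bool *do_print_state)
{
	const struct toy_insn *insn;

	assert(env->insn_idx < env->len);
	insn = &env->insns[env->insn_idx];

	switch (insn->op) {
	case TOY_JUMP:
		env->insn_idx += insn->off + 1;
		*do_print_state = true; /* was "do_print_state = " in the loop */
		return 0;               /* was "continue" in the loop */
	case TOY_EXIT:
		return PROCESS_BPF_EXIT; /* was "goto process_bpf_exit" */
	case TOY_BAD:
		return -1;
	default:
		env->insn_idx++;
		return 0;
	}
}

/* The loop now sees every error in one place and can decide what to do with it. */
static int toy_do_check(struct toy_env *env)
{
	bool do_print_state = false;

	while (env->insn_idx < env->len) {
		int err = toy_do_check_insn(env, &do_print_state);

		if (err < 0)
			return err;
		if (err == PROCESS_BPF_EXIT)
			return 0;
	}
	return -1; /* fell off the end without an exit instruction */
}
```

Because all per-instruction errors surface as the return value of one call, a caller can later intercept them (for example, to insert a barrier) without touching each individual check.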

Mark these cases as non-recoverable to later prevent them from being caught when they occur during speculative path verification.

Eduard writes [1]:

  The only place I'm aware of that might act upon specific error code from verifier syscall is libbpf. Looking through libbpf code, it seems that this change does not interfere with libbpf.

[1] https://lore.kernel.org/all/[email protected]/

Signed-off-by: Luis Gerhorst <[email protected]>
Reviewed-by: Eduard Zingerman <[email protected]>
Acked-by: Henriette Herzog <[email protected]>
Cc: Maximilian Ott <[email protected]>
Cc: Milan Stephan <[email protected]>
Acked-by: Kumar Kartikeya Dwivedi <[email protected]>

This prevents us from trying to recover from these on speculative paths in the future.

Signed-off-by: Luis Gerhorst <[email protected]>
Reviewed-by: Eduard Zingerman <[email protected]>
Acked-by: Henriette Herzog <[email protected]>
Cc: Maximilian Ott <[email protected]>
Cc: Milan Stephan <[email protected]>
Acked-by: Kumar Kartikeya Dwivedi <[email protected]>

JITs can set bpf_jit_bypass_spec_v1/v4() if they want the verifier to skip analysis/patching for the respective vulnerability. For v4, this will reduce the number of barriers the verifier inserts. For v1, it allows more programs to be accepted.

The primary motivation for this is to not regress unpriv BPF's performance on ARM64 in a future commit where BPF_NOSPEC is also used against Spectre v1.

This has the user-visible change that v1-induced rejections on non-vulnerable PowerPC CPUs are avoided.

For now, this does not change the semantics of BPF_NOSPEC. It is still a v4-only barrier and must not be implemented if bypass_spec_v4 is always true for the arch. Changing it to a v1 AND v4-barrier is done in a future commit.

As an alternative to bypass_spec_v1/v4, one could introduce NOSPEC_V1 AND NOSPEC_V4 instructions and allow backends to skip their lowering as suggested by commit f5e81d1 ("bpf: Introduce BPF nospec instruction for mitigating Spectre v4"). Adding bpf_jit_bypass_spec_v1/v4() was found to be preferable for the following reasons:

* bypass_spec_v1/v4 benefits non-vulnerable CPUs: Always performing the same analysis (not taking into account whether the current CPU is vulnerable) needlessly restricts users of CPUs that are not vulnerable. The only use case for this would be portability-testing, but this can later be added easily when needed by allowing users to force bypass_spec_v1/v4 to false.

* Portability is still acceptable: Directly disabling the analysis instead of skipping the lowering of BPF_NOSPEC(_V1/V4) might allow programs on non-vulnerable CPUs to be accepted while the program will be rejected on vulnerable CPUs. With the fallback to speculation barriers for Spectre v1 implemented in a future commit, this will only affect programs that do variable stack-accesses or are very complex.

For PowerPC, the SEC_FTR checking in bpf_jit_bypass_spec_v4() is based on the check that was previously located in the BPF_NOSPEC case.

For LoongArch, it would likely be safe to set both bpf_jit_bypass_spec_v1() and _v4() according to commit a6f6a95f2580 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode"). This is omitted here as I am unable to do any testing for LoongArch.

Signed-off-by: Luis Gerhorst <[email protected]>
Cc: Henriette Herzog <[email protected]>
Cc: Maximilian Ott <[email protected]>
Cc: Milan Stephan <[email protected]>
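As a rough illustration of how per-arch bypass hooks could feed the verifier's decision, consider the following stand-alone model. It is a sketch only: the toy_* names, the explicit CPU-status struct, and the "privileged" shortcut are all assumptions made for the example, and the real bpf_jit_bypass_spec_v1/v4() hooks are not claimed to have these signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of what an arch backend knows about the running CPU. */
struct toy_cpu {
	bool vulnerable_v1;
	bool vulnerable_v4;
};

/* Toy arch hooks: "true" means the verifier may skip analysis/patching. */
static bool toy_jit_bypass_spec_v1(const struct toy_cpu *cpu)
{
	return !cpu->vulnerable_v1;
}

static bool toy_jit_bypass_spec_v4(const struct toy_cpu *cpu)
{
	return !cpu->vulnerable_v4;
}

/* Hypothetical per-program verifier flags. */
struct toy_env {
	bool bypass_spec_v1;
	bool bypass_spec_v4;
};

/*
 * Assumption for this sketch: privileged loaders bypass both mitigations,
 * otherwise the decision is delegated to the arch hooks.
 */
static void toy_env_init(struct toy_env *env, const struct toy_cpu *cpu,
			 bool privileged)
{
	assert(env != NULL && cpu != NULL);
	env->bypass_spec_v1 = privileged || toy_jit_bypass_spec_v1(cpu);
	env->bypass_spec_v4 = privileged || toy_jit_bypass_spec_v4(cpu);
}
```

With this shape, a CPU that is not vulnerable to v1 gets no v1-induced rejections, while v4 barriers are still inserted if that CPU is vulnerable to v4.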

This changes the semantics of BPF_NOSPEC (previously a v4-only barrier) to always emit a speculation barrier that works against both Spectre v1 AND v4. If mitigation is not needed on an architecture, the backend should set bpf_jit_bypass_spec_v4/v1().

As of now, this commit only has the user-visible implication that unpriv BPF's performance on PowerPC is reduced. This is the case because we have to emit additional v1 barrier instructions for BPF_NOSPEC now.

This commit is required for a future commit to allow us to rely on BPF_NOSPEC for Spectre v1 mitigation. As of this commit, the feature that nospec acts as a v1 barrier is unused.

Commit f5e81d1 ("bpf: Introduce BPF nospec instruction for mitigating Spectre v4") noted that mitigation instructions for v1 and v4 might be different on some archs. While this would potentially offer improved performance on PowerPC, it was dismissed after the following considerations:

* Only having one barrier simplifies the verifier and allows us to easily rely on v4-induced barriers for reducing the complexity of v1-induced speculative path verification.

* For the architectures that implemented BPF_NOSPEC, only PowerPC has distinct instructions for v1 and v4. Even there, some insns may be shared between the barriers for v1 and v4 (e.g., 'ori 31,31,0' and 'sync'). If this is still found to impact performance in an unacceptable way, BPF_NOSPEC can be split into BPF_NOSPEC_V1 and BPF_NOSPEC_V4 later. As an optimization, we can already skip v1/v4 insns from being emitted for PowerPC with this setup if bypass_spec_v1/v4 is set.

Vulnerability-status for BPF_NOSPEC-based Spectre mitigations (v4 as of this commit, v1 in the future) is therefore:

* x86 (32-bit and 64-bit), ARM64, and PowerPC (64-bit): Mitigated - This patch implements BPF_NOSPEC for these architectures. The previous v4-only version was supported since commit f5e81d1 ("bpf: Introduce BPF nospec instruction for mitigating Spectre v4") and commit b7540d6 ("powerpc/bpf: Emit stf barrier instruction sequences for BPF_NOSPEC").

* LoongArch: Not Vulnerable - Commit a6f6a95 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode") is the only other past commit related to BPF_NOSPEC and indicates that the insn is not required there.

* MIPS: Vulnerable (if unprivileged BPF is enabled) - Commit a6f6a95f2580 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode") indicates that it is not vulnerable but this contradicts the kernel and Debian documentation. Therefore I assume that there exist vulnerable MIPS CPUs (but maybe not from Loongson?). In the future, BPF_NOSPEC could be implemented for MIPS based on the GCC speculation_barrier [1]. For now, we rely on unprivileged BPF being disabled by default.

* Other: Unknown - To the best of my knowledge there is no definitive information available that indicates that any other arch is vulnerable. They are therefore left untouched (BPF_NOSPEC is not implemented, but bypass_spec_v1/v4 is also not set).

I did the following testing to ensure the insn encoding is correct:

* ARM64:
  * 'dsb nsh; isb' was successfully tested with the BPF CI in [2]
  * 'sb' locally using QEMU v7.2.15 -cpu max (emitted sb insn is executed for example with './test_progs -t verifier_array_access')

* PowerPC: The following configs were tested locally with ppc64le QEMU v8.2 '-machine pseries -cpu POWER9':
  * STF_BARRIER_EIEIO + CONFIG_PPC_BOOK3S_64
  * STF_BARRIER_SYNC_ORI (forced on) + CONFIG_PPC_BOOK3S_64
  * STF_BARRIER_FALLBACK (forced on) + CONFIG_PPC_BOOK3S_64
  * CONFIG_PPC_E500 (forced on) + STF_BARRIER_EIEIO
  * CONFIG_PPC_E500 (forced on) + STF_BARRIER_SYNC_ORI (forced on)
  * CONFIG_PPC_E500 (forced on) + STF_BARRIER_FALLBACK (forced on)
  * CONFIG_PPC_E500 (forced on) + STF_BARRIER_NONE (forced on)

  Most of those combinations should not occur in practice, but I was not able to get a PPC e6500 rootfs (for testing PPC_E500 without forcing it on). In any case, this should ensure that there are no unexpected conflicts between the insns when combined like this. Individual v1/v4 barriers were already emitted elsewhere.

[1] https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=29b74545531f6afbee9fc38c267524326dbfbedf ("MIPS: Add speculation_barrier support")
[2] #8576

Signed-off-by: Luis Gerhorst <[email protected]>
Cc: Henriette Herzog <[email protected]>
Cc: Maximilian Ott <[email protected]>
Cc: Milan Stephan <[email protected]>

This is done to clarify that this flag will cause a nospec to be added after this insn and can therefore be relied upon to reduce speculative path analysis.

Signed-off-by: Luis Gerhorst <[email protected]>
Cc: Henriette Herzog <[email protected]>
Cc: Maximilian Ott <[email protected]>
Cc: Milan Stephan <[email protected]>
Acked-by: Kumar Kartikeya Dwivedi <[email protected]>

This implements the core of the series and causes the verifier to fall back to mitigating Spectre v1 using speculation barriers. The approach was presented at LPC'24 [1] and RAID'24 [2].

If we find any forbidden behavior on a speculative path, we insert a nospec (e.g., lfence speculation barrier on x86) before the instruction and stop verifying the path. While verifying a speculative path, we can furthermore stop verification of that path whenever we encounter a nospec instruction.

A minimal example program would look as follows:

	A = true
	B = true
	if A goto e
	f()
	if B goto e
	unsafe()
e:	exit

There are the following speculative and non-speculative paths (`cur->speculative` and `speculative` referring to the value of the push_stack() parameters):

- A = true
- B = true
- if A goto e
  - A && !cur->speculative && !speculative
    - exit
  - !A && !cur->speculative && speculative
    - f()
    - if B goto e
      - B && cur->speculative && !speculative
        - exit
      - !B && cur->speculative && speculative
        - unsafe()

If f() contains any unsafe behavior under Spectre v1 and the unsafe behavior matches `state->speculative && error_recoverable_with_nospec(err)`, do_check() will now add a nospec before f() instead of rejecting the program:

	A = true
	B = true
	if A goto e
	nospec
	f()
	if B goto e
	unsafe()
e:	exit

Alternatively, the algorithm also takes advantage of nospec instructions inserted for other reasons (e.g., Spectre v4). Taking the program above as an example, speculative path exploration can stop before f() if a nospec was inserted there because of Spectre v4 sanitization.

In this example, all instructions after the nospec are dead code (and with the nospec they are also dead code speculatively).

On x86_64, this depends on the following property of lfence [3]:

	An LFENCE instruction or a serializing instruction will ensure that no later instructions execute, even speculatively, until all prior instructions complete locally. [...] Inserting an LFENCE instruction after a bounds check prevents later operations from executing before the bound check completes.

Regarding the example, this implies that `if B goto e` will not execute before `if A goto e` completes. Once `if A goto e` completes, the CPU should find that the speculation was wrong and continue with `exit`.

If there is any other path that leads to `if B goto e` (and therefore `unsafe()`) without going through `if A goto e`, then a nospec will still be needed there. However, this patch assumes this other path will be explored separately and therefore be discovered by the verifier even if the exploration discussed here stops at the nospec.

This patch furthermore has the unfortunate consequence that Spectre v1 mitigations now only support architectures which implement BPF_NOSPEC. Before this commit, Spectre v1 mitigations prevented exploits by rejecting the programs on all architectures. Because some JITs do not implement BPF_NOSPEC, this patch therefore may regress unpriv BPF's security to a limited extent:

* The regression is limited to systems that are vulnerable to Spectre v1, have unprivileged BPF enabled, and do NOT emit insns for BPF_NOSPEC. The latter is not the case for x86 64- and 32-bit, arm64, and powerpc 64-bit and they are therefore not affected by the regression. According to commit a6f6a95 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode"), LoongArch is not vulnerable to Spectre v1 and therefore also not affected by the regression.

* To the best of my knowledge this regression may therefore only affect MIPS. This is deemed acceptable because unpriv BPF is still disabled there by default. As stated in a previous commit, BPF_NOSPEC could be implemented for MIPS based on GCC's speculation_barrier implementation.

* It is unclear which other architectures (besides x86 64- and 32-bit, ARM64, PowerPC 64-bit, LoongArch, and MIPS) supported by the kernel are vulnerable to Spectre v1. Also, it is not clear if barriers are available on these architectures. Implementing BPF_NOSPEC on these architectures therefore is non-trivial. Searching GCC and the kernel for speculation barrier implementations for these architectures yielded no result.

* If any of those regressed systems is also vulnerable to Spectre v4, the system was already vulnerable to Spectre v4 attacks based on unpriv BPF before this patch and the impact is therefore further limited.

As an alternative to regressing security, one could still reject programs if the architecture does not emit BPF_NOSPEC (e.g., by removing the empty BPF_NOSPEC-case from all JITs except for LoongArch where it appears justified). However, this will cause rejections on these archs that are likely unfounded in the vast majority of cases.

In the tests, some are now successful where we previously had a false-positive (i.e., rejection). Change them to reflect where the nospec should be inserted (using __xlated_unpriv) and modify the error message if the nospec is able to mitigate a problem that previously shadowed another problem (in that case __xlated_unpriv does not work, therefore just add a comment).

To avoid duplicating this ifdef whenever we check for nospec insns using __xlated_unpriv, define SPEC_V1 here once. This also improves readability. PowerPC can probably also be added here. However, omit it for now because the BPF CI currently does not include a test.

Briefly went through all the occurrences of EPERM, EINVAL, and EACCES in the verifier in order to validate that catching them like this makes sense.

[1] https://lpc.events/event/18/contributions/1954/ ("Mitigating Spectre-PHT using Speculation Barriers in Linux eBPF")
[2] https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and Precise Spectre Defenses for Untrusted Linux Kernel Extensions")
[3] https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/runtime-speculative-side-channel-mitigations.html ("Managed Runtime Speculative Execution Side Channel Mitigations")

Signed-off-by: Luis Gerhorst <[email protected]>
Acked-by: Henriette Herzog <[email protected]>
Cc: Maximilian Ott <[email protected]>
Cc: Milan Stephan <[email protected]>
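The fallback rule from this commit message (recover with a nospec only on speculative paths, and stop exploring a speculative path at any nospec) can be modeled in a few lines. This is a hedged sketch, not the verifier's code: the toy_* names are hypothetical, and the set of recoverable errors (-EPERM, -EACCES, -EINVAL) is an assumption derived from the error codes the message says were reviewed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-instruction bookkeeping. */
struct toy_insn_aux {
	bool nospec; /* a speculation barrier is emitted before this insn */
};

/* Assumption: only these error codes are treated as recoverable. */
static bool toy_error_recoverable_with_nospec(int err)
{
	return err == -EPERM || err == -EACCES || err == -EINVAL;
}

enum toy_verdict { TOY_CONTINUE, TOY_STOP_PATH, TOY_REJECT };

/*
 * Decide what to do at one instruction. "err" is the result that checking
 * the instruction would produce (0 on success); it is ignored when a
 * speculative path hits an already-present barrier.
 */
static enum toy_verdict toy_check_step(struct toy_insn_aux *aux,
				       bool speculative, int err)
{
	assert(aux != NULL);

	/* A barrier ends speculation, so nothing behind it needs checking. */
	if (speculative && aux->nospec)
		return TOY_STOP_PATH;

	/* Unsafe only under speculation: insert a barrier instead of rejecting. */
	if (speculative && toy_error_recoverable_with_nospec(err)) {
		aux->nospec = true;
		return TOY_STOP_PATH;
	}

	/* Architecturally reachable errors and non-recoverable errors reject. */
	if (err < 0)
		return TOY_REJECT;

	return TOY_CONTINUE;
}
```

Applied to the example program, the step at f() on the `!A` path would set the barrier and end that path, while the same error on a non-speculative path would still reject the program.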

This is based on the gadget from the description of commit 9183671af6db ("bpf: Fix leakage under speculation on mispredicted branches").

Signed-off-by: Luis Gerhorst <[email protected]>

Insert a nospec before the access to prevent it from ever using an index that is subject to speculative scalar-confusion.

The access itself can either happen directly in the BPF program (reads only, check_stack_read_var_off()) or in a helper (read/write, check_helper_mem_access()).

This relies on the fact that the speculative scalar confusion that leads to the variable-stack access going OOB must stem from a prior speculative store or branch bypass. Adding a nospec before the variable-stack access will force all previously bypassed stores/branches to complete and cause the stack access to only ever go to the stack slot that is accessed architecturally.

Alternatively, the variable-offset stack access might be a write that can itself be subject to speculative store bypass (this can happen in theory even if this code adds a nospec /before/ the variable-offset write). Only indirect writes by helpers might be affected here (e.g., those taking ARG_PTR_TO_MAP_VALUE). (Because check_stack_write_var_off() does not use check_stack_range_initialized(), in-program variable-offset writes are not affected.) If the in-helper write can be subject to Spectre v4 and the helper writes/overwrites pointers on the BPF stack, they are already a problem for fixed-offset stack accesses and should be subject to Spectre v4 sanitization.

Signed-off-by: Luis Gerhorst <[email protected]>
Acked-by: Henriette Herzog <[email protected]>
Cc: Maximilian Ott <[email protected]>
Cc: Milan Stephan <[email protected]>

ALU sanitization was introduced to ensure that a subsequent ptr access can never go OOB, even under speculation. This is required because we currently allow speculative scalar confusion. Spec. scalar confusion is possible because Spectre v4 sanitization only adds a nospec after critical stores (e.g., scalar overwritten with a pointer).

If we add a nospec before the ALU op, none of the operands can be subject to scalar confusion. As an ADD/SUB can not introduce scalar confusion itself, the result will also not be subject to scalar confusion. Therefore, the subsequent ptr access is always safe.

We directly fall back to nospec for the sanitization errors REASON_BOUNDS, _TYPE, _PATHS, and _LIMIT, even if we are not on a speculative path.

For REASON_STACK, we return the error -ENOMEM directly now. Previously, sanitize_err() returned -EACCES for this case but we change it to -ENOMEM because doing so prevents do_check() from falling back to a nospec if we are on a speculative path. This would not be a serious issue (the verifier would probably run into the -ENOMEM again shortly on the next non-speculative path and still abort verification), but -ENOMEM is more fitting here anyway. An alternative would be -EFAULT, which is also returned for some of the other cases where push_stack() fails, but this is more frequently used for verifier-internal bugs.

Signed-off-by: Luis Gerhorst <[email protected]>
Acked-by: Henriette Herzog <[email protected]>
Cc: Maximilian Ott <[email protected]>
Cc: Milan Stephan <[email protected]>
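The handling of sanitization failures described above can be summarized as a small decision function. This is a sketch under assumptions: the TOY_REASON_* enumerators only mirror the REASON_* names from the message (their values are arbitrary here), toy_sanitize_fallback() is a hypothetical helper, and returning -EFAULT for an unknown reason is a choice made for this example only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the sanitization failure reasons named in the message. */
enum toy_reason {
	TOY_REASON_BOUNDS,
	TOY_REASON_TYPE,
	TOY_REASON_PATHS,
	TOY_REASON_LIMIT,
	TOY_REASON_STACK,
};

/*
 * Returns 0 and requests a barrier before the ALU op when the failure can be
 * mitigated by a nospec, or a negative errno when verification must abort.
 */
static int toy_sanitize_fallback(enum toy_reason reason, bool *nospec)
{
	assert(nospec != NULL);

	switch (reason) {
	case TOY_REASON_BOUNDS:
	case TOY_REASON_TYPE:
	case TOY_REASON_PATHS:
	case TOY_REASON_LIMIT:
		/* Applies even on non-speculative paths. */
		*nospec = true;
		return 0;
	case TOY_REASON_STACK:
		/* Not recoverable with a barrier: resource exhaustion. */
		return -ENOMEM;
	}
	/* Assumption for this sketch: unknown reasons indicate an internal bug. */
	return -EFAULT;
}
```

Keeping the stack-exhaustion case on a distinct error code is what prevents a generic "recover with nospec" check from swallowing it.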

kernel-patches-daemon-bpf · 2025-05-09T18:37:02Z

Upstream branch: cb4a119
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=958739
version: 3

kernel-patches-daemon-bpf · 2025-05-09T18:41:15Z

At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=958739 irrelevant now. Closing PR.

kernel-patches-daemon-bpf bot added new bpf-next V2 labels Apr 21, 2025

kernel-patches-daemon-bpf bot force-pushed the series/943643=>bpf-next branch from a1272ec to e95f7e7 Compare April 21, 2025 13:11

kernel-patches-daemon-bpf bot added the V2-ci-pass label Apr 21, 2025

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from 9728a8b to 27c7945 Compare April 21, 2025 15:09

kernel-patches-daemon-bpf bot force-pushed the series/943643=>bpf-next branch from e95f7e7 to cfe0b88 Compare April 21, 2025 15:11

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from 27c7945 to 4f05866 Compare April 22, 2025 21:35

kernel-patches-daemon-bpf bot force-pushed the series/943643=>bpf-next branch from cfe0b88 to d358f9f Compare April 22, 2025 21:37

kernel-patches-daemon-bpf bot added V2-ci-fail and removed V2-ci-pass labels Apr 22, 2025

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from 4f05866 to e9b7177 Compare April 23, 2025 00:17

kernel-patches-daemon-bpf bot force-pushed the series/943643=>bpf-next branch from d358f9f to 80285ab Compare April 23, 2025 00:20

kernel-patches-daemon-bpf bot added V2-ci-pass and removed V2-ci-fail labels Apr 23, 2025

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from e9b7177 to f106256 Compare April 23, 2025 17:27

kernel-patches-daemon-bpf bot force-pushed the series/943643=>bpf-next branch from 80285ab to c74fc67 Compare April 23, 2025 17:30

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from f106256 to 8b59228 Compare April 23, 2025 18:13

kernel-patches-daemon-bpf bot force-pushed the series/943643=>bpf-next branch from c74fc67 to e71ad5a Compare April 23, 2025 18:16

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from 8b59228 to 9c93586 Compare April 23, 2025 18:32

kernel-patches-daemon-bpf bot force-pushed the series/943643=>bpf-next branch from e71ad5a to 46dc21f Compare April 23, 2025 18:35

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from 9c93586 to ca1daab Compare April 23, 2025 23:52

kernel-patches-daemon-bpf bot force-pushed the series/943643=>bpf-next branch from 350c538 to 98f6619 Compare May 6, 2025 21:11

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from e4e98c9 to 8b8229f Compare May 8, 2025 18:43

kernel-patches-daemon-bpf bot force-pushed the series/943643=>bpf-next branch from 98f6619 to 2f8c385 Compare May 8, 2025 18:44

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from 8b8229f to 4b2ad00 Compare May 9, 2025 17:17

kernel-patches-daemon-bpf bot force-pushed the series/943643=>bpf-next branch from 2f8c385 to ad80bc4 Compare May 9, 2025 17:20

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from 4b2ad00 to 13a3aab Compare May 9, 2025 18:04

kernel-patches-daemon-bpf bot force-pushed the series/943643=>bpf-next branch from ad80bc4 to c9a126f Compare May 9, 2025 18:07

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from 13a3aab to f1ad525 Compare May 9, 2025 18:30

kernel-patches-daemon-bpf bot force-pushed the series/943643=>bpf-next branch from c9a126f to 380f3dd Compare May 9, 2025 18:32

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from f1ad525 to d99abbe Compare May 9, 2025 18:35

luisgerhorst added 10 commits May 9, 2025 11:37

selftests/bpf: Add test for Spectre v1 mitigation

6276cf8

This is based on the gadget from the description of commit 9183671af6db ("bpf: Fix leakage under speculation on mispredicted branches").

Signed-off-by: Luis Gerhorst <[email protected]>

kernel-patches-daemon-bpf bot force-pushed the series/943643=>bpf-next branch from 380f3dd to bd5e5cc Compare May 9, 2025 18:37

kernel-patches-daemon-bpf bot added the accepted label May 9, 2025

kernel-patches-daemon-bpf bot closed this May 9, 2025

kernel-patches-daemon-bpf bot deleted the series/943643=>bpf-next branch May 9, 2025 18:41
