[TSan] Clarify and enforce shadow end alignment #144648

Camsyn · 2025-06-18T07:13:27Z

In TSan, every k bytes of application memory (where k = 8) maps to a single shadow/meta cell. This design leads to two distinct outcomes when calculating the end of a shadow range using MemToShadow(addr_end), depending on the alignment of addr_end:

Exclusive End: If addr_end is aligned (addr_end % k == 0), MemToShadow(addr_end) points to the first shadow cell past the intended range. This address is an exclusive boundary marker, not a cell to be operated on.
Inclusive End: If addr_end is not aligned (addr_end % k != 0), MemToShadow(addr_end) points to the last shadow cell that is part of the range (i.e., the same cell as MemToShadow(addr_end - 1)).

Different TSan functions have different expectations for whether the shadow end should be inclusive or exclusive. However, these expectations are not always explicitly enforced, which can lead to subtle bugs or reliance on unstated invariants.

The core of this patch is to ensure that functions ONLY requiring an exclusive shadow end behave correctly.

Enforcing Existing Invariants:
For functions like MetaMap::MoveMemory and MapShadow, the assumption is that the end address is always k-aligned. While this holds true in the current codebase (e.g., due to some external implicit conditions), this invariant is not guaranteed by the function's internal context. We add explicit assertions to make this requirement clear and to catch any future changes that might violate this assumption.
Fixing Latent Bugs:
In other cases, unaligned end addresses are possible, representing a latent bug. This was the case in UnmapShadow. The size of a memory region being unmapped is not always a multiple of k. When this happens, UnmapShadow would fail to clear the final (tail) portion of the shadow memory.

This patch fixes UnmapShadow by rounding up the size to the next multiple of k before clearing the shadow memory. This is safe because the underlying OS unmap operation is page-granular, and the page size is guaranteed to be a multiple of k.

Notably, this fix makes UnmapShadow consistent with its inverse operation, MemoryRangeImitateWriteOrResetRange, which already performs a similar size round-up.

In summary, this PR:

Adds assertions to MetaMap::MoveMemory and MapShadow to enforce their implicit requirement for k-aligned end addresses.
Fixes a latent bug in UnmapShadow by rounding up the size to ensure the entire shadow range is cleared. A new test case is added to cover this scenario.
Removes a redundant assertion in __tsan_java_move.
Fixes an incorrect shadow end calculation introduced in commit 4052de6. The previous logic, while fixing an overestimation issue, did not properly account for kShadowCell alignment and could lead to underestimation.

This is an improvement that enhances the robustness of the code. Previously, the correct calculation of exclusive EndShadow relied on the assumption that `addr_end % kShadowCell == 0`; however, in many current usages, this assumption was not strictly guaranteed (although it did in fact meet). In addition, computing EndShadow does not require the corresponding address to be AppMem; for example, HighAppEnd is not AppMem, but can still be used to calculate EndShadow. For example, for the AppMem range [0, 1), `s = MemToShadow(0)` is equal to `MemToShadow(1)`. The previous logic would incorrectly deduce an empty shadow range [s, s) while the correct shadow range should be [s, s + kShadowSize * kShadowCnt) to cover all the related shadow memory for the accessed cell. This commit addresses this in two ways: 1. It introduces a dedicated utility function, i.e., `MemToEndShadow`, to correctly calculate the end of a shadow memory range, accounting for the memory cell granularity. 2. It replaces existing (and potentially incorrect) calculations of the shadow end with this new utility function. Additionally, the previous commit 4052de6 resolved a problem with overestimating the shadow end; it did not consider `kShadowCell` and could therefore lead to underestimates. This is also corrected by utilizing the `MemToEndShadow` function.

llvmbot · 2025-06-18T07:13:58Z

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Kunqiu Chen (Camsyn)

Changes

This is an improvement that enhances the robustness of the code.

Previously, the correct calculation of exclusive EndShadow relied on the assumption that addr_end % kShadowCell == 0; however, in many current usages, this assumption was not strictly guaranteed (although it did in fact meet).

In addition, computing EndShadow does not require the corresponding address to be AppMem; for example, HighAppEnd is not AppMem, but can still be used to calculate EndShadow.

For example, for the AppMem range [0, 1), s = MemToShadow(0) is equal to MemToShadow(1). The previous logic would incorrectly deduce an empty shadow range [s, s).
The correct shadow range should be [s, s + kShadowSize * kShadowCnt) to cover all the shadow memory for the accessed cell.

This commit addresses this in two ways:

It introduces a dedicated utility function, i.e., MemToEndShadow, to correctly calculate the end of a shadow memory range, accounting for the memory cell granularity.
It replaces existing (and potentially incorrect) calculations of the shadow end with this new utility function.

Additionally, the previous commit 4052de6 resolved a problem with overestimating the shadow end; it did not consider kShadowCell and could therefore lead to underestimates.
This is also corrected by utilizing the MemToEndShadow function.

Full diff: https://github.com/llvm/llvm-project/pull/144648.diff

5 Files Affected:

(modified) compiler-rt/lib/tsan/rtl/tsan_interface_java.cpp (+1-2)
(modified) compiler-rt/lib/tsan/rtl/tsan_platform.h (+18)
(modified) compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp (+1-1)
(modified) compiler-rt/lib/tsan/rtl/tsan_rtl.cpp (+3-3)
(modified) compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp (+4-5)

diff --git a/compiler-rt/lib/tsan/rtl/tsan_interface_java.cpp b/compiler-rt/lib/tsan/rtl/tsan_interface_java.cpp
index 7c15a16388268..3e324eef9cb8d 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interface_java.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interface_java.cpp
@@ -122,7 +122,6 @@ void __tsan_java_move(jptr src, jptr dst, jptr size) {
   DCHECK_GE(dst, jctx->heap_begin);
   DCHECK_LE(dst + size, jctx->heap_begin + jctx->heap_size);
   DCHECK_NE(dst, src);
-  DCHECK_NE(size, 0);
 
   // Assuming it's not running concurrently with threads that do
   // memory accesses and mutex operations (stop-the-world phase).
@@ -132,7 +131,7 @@ void __tsan_java_move(jptr src, jptr dst, jptr size) {
   // We used to move shadow from src to dst, but the trace format does not
   // support that anymore as it contains addresses of accesses.
   RawShadow *d = MemToShadow(dst);
-  RawShadow *dend = MemToShadow(dst + size);
+  RawShadow *dend = MemToEndShadow(dst + size);
   ShadowSet(d, dend, Shadow::kEmpty);
 }
 
diff --git a/compiler-rt/lib/tsan/rtl/tsan_platform.h b/compiler-rt/lib/tsan/rtl/tsan_platform.h
index 354f6da6a64a1..5b589df482256 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_platform.h
+++ b/compiler-rt/lib/tsan/rtl/tsan_platform.h
@@ -968,6 +968,24 @@ RawShadow *MemToShadow(uptr x) {
   return reinterpret_cast<RawShadow *>(SelectMapping<MemToShadowImpl>(x));
 }
 
+struct MemToEndShadowImpl {
+  template <typename Mapping>
+  static uptr Apply(uptr x) {
+    return (((x + kShadowCell - 1) &
+             ~(Mapping::kShadowMsk | (kShadowCell - 1))) ^
+            Mapping::kShadowXor) *
+               kShadowMultiplier +
+           Mapping::kShadowAdd;
+  }
+};
+
+// If addr % kShadowCell == 0, then MemToEndShadow(addr) == MemToShadow(addr)
+// Otherwise, MemToEndShadow(addr) == MemToShadow(addr) + kShadowCnt
+ALWAYS_INLINE
+RawShadow *MemToEndShadow(uptr x) {
+  return reinterpret_cast<RawShadow *>(SelectMapping<MemToEndShadowImpl>(x));
+}
+
 struct MemToMetaImpl {
   template <typename Mapping>
   static u32 *Apply(uptr x) {
diff --git a/compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp b/compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp
index 2c55645a15479..dbf583b362359 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp
@@ -195,7 +195,7 @@ static NOINLINE void MapRodata(char* buffer, uptr size) {
         !segment.IsWritable() && IsAppMem(segment.start)) {
       // Assume it's .rodata
       char *shadow_start = (char *)MemToShadow(segment.start);
-      char *shadow_end = (char *)MemToShadow(segment.end);
+      char *shadow_end = (char *)MemToEndShadow(segment.end);
       for (char *p = shadow_start; p < shadow_end;
            p += marker.size() * sizeof(RawShadow)) {
         internal_mmap(
diff --git a/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp b/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
index c83efec8eaca2..4afb84e89936a 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
@@ -532,7 +532,7 @@ static void StopBackgroundThread() {
 
 void DontNeedShadowFor(uptr addr, uptr size) {
   ReleaseMemoryPagesToOS(reinterpret_cast<uptr>(MemToShadow(addr)),
-                         reinterpret_cast<uptr>(MemToShadow(addr + size)));
+                         reinterpret_cast<uptr>(MemToEndShadow(addr + size)));
 }
 
 #if !SANITIZER_GO
@@ -588,12 +588,12 @@ void MapShadow(uptr addr, uptr size) {
   // CHECK_EQ(addr, addr & ~((64 << 10) - 1));  // windows wants 64K alignment
   const uptr kPageSize = GetPageSizeCached();
   uptr shadow_begin = RoundDownTo((uptr)MemToShadow(addr), kPageSize);
-  uptr shadow_end = RoundUpTo((uptr)MemToShadow(addr + size), kPageSize);
+  uptr shadow_end = RoundUpTo((uptr)MemToEndShadow(addr + size), kPageSize);
   if (!MmapFixedNoReserve(shadow_begin, shadow_end - shadow_begin, "shadow"))
     Die();
 #else
   uptr shadow_begin = RoundDownTo((uptr)MemToShadow(addr), (64 << 10));
-  uptr shadow_end = RoundUpTo((uptr)MemToShadow(addr + size), (64 << 10));
+  uptr shadow_end = RoundUpTo((uptr)MemToEndShadow(addr + size), (64 << 10));
   VPrintf(2, "MapShadow for (0x%zx-0x%zx), begin/end: (0x%zx-0x%zx)\n",
           addr, addr + size, shadow_begin, shadow_end);
 
diff --git a/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp b/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp
index cf07686d968dc..9652cd0267a79 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp
@@ -684,16 +684,15 @@ void MemoryAccessRangeT(ThreadState* thr, uptr pc, uptr addr, uptr size) {
     DCHECK(IsShadowMem(shadow_mem));
   }
 
-  RawShadow* shadow_mem_end = reinterpret_cast<RawShadow*>(
-      reinterpret_cast<uptr>(shadow_mem) + size * kShadowMultiplier - 1);
-  if (!IsShadowMem(shadow_mem_end)) {
-    Printf("Bad shadow end addr: %p (%p)\n", shadow_mem_end,
+  RawShadow* shadow_mem_end = MemToEndShadow(addr + size);
+  if (size > 0 && !IsShadowMem(shadow_mem_end - 1)) {
+    Printf("Bad shadow end addr: %p (%p)\n", shadow_mem_end - 1,
            (void*)(addr + size - 1));
     Printf(
         "Shadow start addr (ok): %p (%p); size: 0x%zx; kShadowMultiplier: "
         "%zx\n",
         shadow_mem, (void*)addr, size, kShadowMultiplier);
-    DCHECK(IsShadowMem(shadow_mem_end));
+    DCHECK(IsShadowMem(shadow_mem_end - 1));
   }
 #endif

Camsyn · 2025-06-18T07:14:16Z

If it is not appropriate to add a new utility function in tsan_platform.h, we can follow the example of MemoryRangeSet or MemoryResetRange, and

add an assertion to guarantee addr_end % kShadowCell == 0; or
use MemToShadow(RoundUpTo(addr_end, kShadowCell)) instead of MemToShadow(addr_end)

for shadow end computation.
However, due to the DCHECK(IsAppMemImpl::Apply<Mapping>(x)) for MemToShadow(x), it still throws an unreasonable error for MemToShadow(AppMem::end), which should be a valid shadow end.

melver · 2025-06-18T07:28:54Z

Is there a bug? If there's a bug, this needs a test. Otherwise, it's unclear what this is improving.

Camsyn · 2025-06-18T08:26:44Z

Is there a bug? If there's a bug, this needs a test. Otherwise, it's unclear what this is improving.

I acknowledge that "This is an improvement that enhances the robustness of the code", because the potential 'bugs' can hardly be triggered under the current implementation.

In tsan_interface_java.cpp:

  DCHECK_EQ(dst % kHeapAlignment, 0);
  DCHECK_EQ(size % kHeapAlignment, 0);
  ...
  // Duplicated assertion
  DCHECK_NE(size, 0);
  ...
  RawShadow *dend = MemToShadow(dst + size);

It should be a bug if kHeapAlignment != kShadowCell, although they are both equal to 8 in the current implementation.

In tsan_rtl_access.cpp:

  RawShadow* shadow_mem_end = reinterpret_cast<RawShadow*>(
      reinterpret_cast<uptr>(shadow_mem) + size * kShadowMultiplier - 1);
  if (!IsShadowMem(shadow_mem_end)) {

The calculation of shadow_mem_end did not consider kShadowCell and could therefore lead to underestimates.
E.g., for app mem [0, 1), it gives a shadow end shadow_of(0) + 1, while the real shadow end is shadow_of(0) + 16.

What's more, if size == 0 (which is not allowed in its caller MemoryAccessRange but allowed in MemoryAccessRangeT), the current calculation of end is problematic.

In tsan_platform_linux.cpp:

TSan iterates the proc map and for each segment,
char *shadow_end = (char *)MemToShadow(segment.end); is correct as the assumption of segment.end % kShadowCell == 0 always holds. But do we need to guarantee this condition more explicitly?

In tsan_rtl.cpp:

void DontNeedShadowFor(uptr addr, uptr size) {
  ReleaseMemoryPagesToOS(reinterpret_cast<uptr>(MemToShadow(addr)),
                         reinterpret_cast<uptr>(MemToShadow(addr + size)));
}

DontNeedShadowFor(addr, size) is used in ThreadFinish to release the shadow related to the thread stack, and used in unmap(p, size) to release the shadow allocated for mmap.
What if size % kShadowCell != 0? I think it might lead to a missing cleanup of the tail shadow.

E.g., for mmap(addr, size, ..), TSan rounds up the size to poison the shadow, while in munmap, TSan does not (see UnmapShadow). If MemToShadow(addr + size) % kPageSize == 0 and MemToEndShadow(addr + size) > MemToShadow(addr + size), the page of the tail shadow cannot be released theoretically.

What's more, I found another potential bug in DontNeedShadowFor while analyzing: DontNeedShadowFor receives [beg, end] as parameters while TSan transfers [beg, end) as arguments, which might lead to page overreleasing.
This is just a ReleaseMemoryPagesToOS comment description error, which mistakes [beg, end) as [beg, end].

In addition, another interface MapShadow(addr, size) shares the same problem.

By the way, this problem seems can be triggered by a well-designed testcase. If necessary, I will design one when I have time.

thurstond · 2025-06-18T16:23:24Z

Thanks for the patch!

If it is not appropriate to add a new utility function in tsan_platform.h, we can follow the example of MemoryRangeSet or MemoryResetRange, and

add an assertion to guarantee addr_end % kShadowCell == 0; or

If the assertion is added, do all the tests pass, and does it fix all the problems that adding MemToEndShadow would? If yes, then that is the more elegant solution (besides, I'd probably personally forget to use MemToEndShadow instead of MemToShadow when updating the code).
(If no: are there are some minor code cleanups that would make the assertion approach work?)

Because [beg, beg + PageSize - 1] obviously contains a full page, but under the current implementation, if end = beg + PageSize - 1, the page is not processed. So I concluded that the original description of include-end was wrong and should be changed to exclude-end

We don't always need to get the real shadow end, for example, some shadow clear, if using the real shadow end, it will cause overcleaning (i.e., clear [0, 1) makes [1, 8) inaccessible when kShadowCell == 8). So when we do need to get the real end, use RoundUp in place. For example, `unmap(addr, sz)` makes `[addr + sz, addr + PageSize)` inaccessible, so we can safely clean up the full shadow (by getting the real shadow end).

github-actions · 2025-06-21T13:03:29Z

✅ With the latest revision this PR passed the C/C++ code formatter.

This reverts commit 4c59a82. Reason: move this unrelated change to another NFC PR.

Camsyn · 2025-06-21T14:20:40Z

Thanks for the patch!

If it is not appropriate to add a new utility function in tsan_platform.h, we can follow the example of MemoryRangeSet or MemoryResetRange, and

add an assertion to guarantee addr_end % kShadowCell == 0; or

If the assertion is added, do all the tests pass, and does it fix all the problems that adding MemToEndShadow would? If yes, then that is the more elegant solution (besides, I'd probably personally forget to use MemToEndShadow instead of MemToShadow when updating the code). (If no: are there are some minor code cleanups that would make the assertion approach work?)

I also really think it's not good to add a new utility function in tsan_platform.h for this scenario.
Adhering to your suggestion, I have removed the new utility function MemToEndShadow and rewritten the PR description.

thurstond · 2025-06-22T02:42:43Z

compiler-rt/test/tsan/java_heap_init2.cpp

@@ -0,0 +1,32 @@
+// RUN: %clangxx_tsan -O1 %s -o %t && %run %t 2>&1 | FileCheck %s


Please add these test cases to a separate pull request. Since your fix is not committed yet, the tests would need an "XFAIL" annotation. Then, after the test cases are landed, this fix PR would update the test by removing the XFAIL. This two-step procedure makes it clear that 1) TSan was broken 2) this PR fixes it.

Camsyn requested review from fmayer, melver and vitalybuka June 18, 2025 07:13

llvmbot added compiler-rt compiler-rt:tsan Thread sanitizer compiler-rt:sanitizer labels Jun 18, 2025

melver requested a review from dvyukov June 18, 2025 07:28

melver requested review from thurstond June 18, 2025 08:44

Camsyn added 4 commits June 21, 2025 15:37

Add two test cases related to incorrect shadow-end calculation

0b868ad

add some assertions

faea01f

Revert "Fix the wrong description."

1d9f632

This reverts commit 4c59a82. Reason: move this unrelated change to another NFC PR.

Camsyn changed the title ~~[TSan] Fix potentially problematic shadow end calculations~~ [TSan] Clarify and enforce shadow end alignment Jun 21, 2025

Make formatter happy

2f7f865

Remove all MemToEndShadow usages

bb37bf3

thurstond reviewed Jun 22, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[TSan] Clarify and enforce shadow end alignment #144648

[TSan] Clarify and enforce shadow end alignment #144648

Camsyn commented Jun 18, 2025 •

edited

Loading

Uh oh!

llvmbot commented Jun 18, 2025

Uh oh!

Camsyn commented Jun 18, 2025 •

edited

Loading

Uh oh!

melver commented Jun 18, 2025

Uh oh!

Camsyn commented Jun 18, 2025 •

edited

Loading

Uh oh!

thurstond commented Jun 18, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Jun 21, 2025 •

edited

Loading

Uh oh!

Camsyn commented Jun 21, 2025

Uh oh!

thurstond Jun 22, 2025

Uh oh!

Uh oh!

		@@ -0,0 +1,32 @@
		// RUN: %clangxx_tsan -O1 %s -o %t && %run %t 2>&1 \| FileCheck %s

[TSan] Clarify and enforce shadow end alignment #144648

Are you sure you want to change the base?

[TSan] Clarify and enforce shadow end alignment #144648

Conversation

Camsyn commented Jun 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Jun 18, 2025

Uh oh!

Camsyn commented Jun 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

melver commented Jun 18, 2025

Uh oh!

Camsyn commented Jun 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

thurstond commented Jun 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Jun 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Camsyn commented Jun 21, 2025

Uh oh!

thurstond Jun 22, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Camsyn commented Jun 18, 2025 •

edited

Loading

Camsyn commented Jun 18, 2025 •

edited

Loading

Camsyn commented Jun 18, 2025 •

edited

Loading

thurstond commented Jun 18, 2025 •

edited

Loading

github-actions bot commented Jun 21, 2025 •

edited

Loading