Skip to content

[OpenMP] Fix various alignment issues #142376

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

rorth
Copy link
Collaborator

@rorth rorth commented Jun 2, 2025

When running the openmp testsuite on 32-bit SPARC, several tests FAIL apparently randomly, but always with the same kind of error:

# error: command failed with exit status: -11

The tests die with SIGBUS, as can be seen in truss output:

26461/1:            Incurred fault #5, FLTACCESS  %pc = 0x00010EAC
26461/1:              siginfo: SIGBUS BUS_ADRALN addr=0x0013D12C
26461/1:            Received signal #10, SIGBUS [default]
26461/1:              siginfo: SIGBUS BUS_ADRALN addr=0x0013D12C

i.e. the code is trying an unaligned access which cannot work on SPARC, a strict-alignment target which enforces natural alignment on access. This explains the apparent randomness of the failures: if the memory happens to be aligned appropriately, the tests work, but fail if not.

A Debug build reveals much more:

  • __kmp_alloc currently aligns to sizeof(void *), which isn't enough on strict-alignment targets when the data are accessed as types requiring larger alignment. Therefore, this patch increases alignment to SizeQuant.

  • 32-bit Solaris/sparc libc guarantees 8-byte alignment from malloc, so this patch adjusts SizeQuant to match.

  • There's a SIGBUS in

    __kmpc_fork_teams (loc=0x112f8, argc=0, 
        microtask=0x16cc8 <__omp_offloading_ffbc020a_4b1abe_main_l9_debug__.omp_outlined>)
        at openmp/runtime/src/kmp_csupport.cpp:573
    573	  *(kmp_int64 *)(&this_thr->th.th_teams_size) = 0L;
    

    Casting to a pointer to a type requiring 64-bit alignment when that isn't guaranteed is wrong. Instead, this patch uses memset instead.

  • There's another SIGBUS in

    0xfef8cb9c in __kmp_taskloop_recur (loc=0x10cb8, gtid=0, task=0x23cd00, 
        lb=0x23cd18, ub=0x23cd20, st=1, ub_glob=499, num_tasks=100, grainsize=5, 
        extras=0, last_chunk=0, tc=500, num_t_min=20, 
        codeptr_ra=0xfef8dbc8 <__kmpc_taskloop(ident_t*, int, kmp_task_t*, int, kmp_uint64*, kmp_uint64*, kmp_int64, int, int, kmp_uint64, void*)+240>, 
        task_dup=0x0)
        at openmp/runtime/src/kmp_tasking.cpp:5147
    5147	  p->st = st;
    

    p->st doesn't currently guarantee the 8-byte alignment required by kmp_int64 st. p is set in

     __taskloop_params_t *p = (__taskloop_params_t *)new_task->shareds;
    

    but shareds_offset is currently aligned to sizeof(void *) only. Increasing it to sizeof(kmp_uint64) to match its use fixes the SIGBUS.

With these fixes I get clean openmp test results on 32-bit SPARC (both Solaris and Linux), with one unrelated exception.

Tested on sparc-sun-solaris2.11, sparcv9-sun-solaris2.11, sparc-unknown-linux-gnu, sparc64-unknown-linux-gnu, i386-pc-solaris2.11, amd64-pc-solaris2.11, i686-pc-linux-gnu, and x86_64-pc-linux-gnu.

When running the `openmp` testsuite on 32-bit SPARC, several tests `FAIL`
apparently randomly, but always with the same kind of error:
```
# error: command failed with exit status: -11
```
The tests die with `SIGBUS`, as can be seen in `truss` output:
```
26461/1:            Incurred fault llvm#5, FLTACCESS  %pc = 0x00010EAC
26461/1:              siginfo: SIGBUS BUS_ADRALN addr=0x0013D12C
26461/1:            Received signal llvm#10, SIGBUS [default]
26461/1:              siginfo: SIGBUS BUS_ADRALN addr=0x0013D12C
```
i.e. the code is trying an unaligned access which cannot work on SPARC, a
strict-alignment target which enforces natural alignment on access.  This
explains the apparent randomness of the failures: if the memory happens to
be aligned appropriately, the tests work, but fail if not.

A `Debug` build reveals much more:

- `__kmp_alloc` currently aligns to `sizeof(void *)`, which isn't enough on
  strict-alignment targets when the data are accessed as types requiring
  larger alignment.  Therefore, this patch increases `alignment` to
  `SizeQuant`.

- 32-bit Solaris/sparc `libc` guarantees 8-byte alignment from `malloc`, so
  this patch adjusts `SizeQuant` to match.

- There's a `SIGBUS` in
  ```
  __kmpc_fork_teams (loc=0x112f8, argc=0, 
      microtask=0x16cc8 <__omp_offloading_ffbc020a_4b1abe_main_l9_debug__.omp_outlined>)
      at openmp/runtime/src/kmp_csupport.cpp:573
  573	  *(kmp_int64 *)(&this_thr->th.th_teams_size) = 0L;
  ```
  Casting to a pointer to a type requiring 64-bit alignment when that isn't
  guaranteed is wrong.  Instead, this patch uses `memset` instead.

- There's another `SIGBUS` in
  ```
  0xfef8cb9c in __kmp_taskloop_recur (loc=0x10cb8, gtid=0, task=0x23cd00, 
      lb=0x23cd18, ub=0x23cd20, st=1, ub_glob=499, num_tasks=100, grainsize=5, 
      extras=0, last_chunk=0, tc=500, num_t_min=20, 
      codeptr_ra=0xfef8dbc8 <__kmpc_taskloop(ident_t*, int, kmp_task_t*, int, kmp_uint64*, kmp_uint64*, kmp_int64, int, int, kmp_uint64, void*)+240>, 
      task_dup=0x0)
      at openmp/runtime/src/kmp_tasking.cpp:5147
  5147	  p->st = st;
  ```
  `p->st` doesn't currently guarantee the 8-byte alignment required by
  `kmp_int64 st`.  `p` is set in
  ```
  __taskloop_params_t *p = (__taskloop_params_t *)new_task->shareds;
  ```
  but `shareds_offset` is currently `sizeof(void *)` only.  Increasing it
  to `sizeof(kmp_uint64)` to match its use fixes the `SIGBUS`.

With these fixes I get clean `openmp` test results on 32-bit SPARC (both
Solaris and Linux), with one unrelated exception.

Tested on `sparc-sun-solaris2.11`, `sparcv9-sun-solaris2.11`,
`sparc-unknown-linux-gnu`, `sparc64-unknown-linux-gnu`,
`i386-pc-solaris2.11`, `amd64-pc-solaris2.11`, `i686-pc-linux-gnu`, and
`x86_64-pc-linux-gnu`.
@rorth rorth requested review from mjklemm and TerryLWilmarth June 2, 2025 12:31
@llvmbot llvmbot added the openmp:libomp OpenMP host runtime label Jun 2, 2025
@brad0 brad0 requested a review from shiltian June 7, 2025 06:44
@@ -73,7 +73,7 @@ static void bectl(kmp_info_t *th, bget_compact_t compact,
/* On IA-32 architecture with Linux* OS, malloc() does not
ensure 16 byte alignment */

#if KMP_ARCH_X86 || !KMP_HAVE_QUAD
#if KMP_ARCH_X86 || KMP_ARCH_SPARC || !KMP_HAVE_QUAD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to add anything to the comment above this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this is complete mess: glibc sysdeps/i386/malloc-alignment. does #define MALLOC_ALIGNMENT 16, contrary to the comment (haven't checked if this has changed historically). Besides, KMP_HAVE_QUAD is a complete misnomer: it's only defined on x86 targets, so should probably be renamed to KMP_X86_HAVE_QUAD to reflect that.

@shiltian
Copy link
Contributor

shiltian commented Jun 7, 2025

Casting to a pointer to a type requiring 64-bit alignment when that isn't guaranteed is wrong.

kmp_teams_size_t is a struct of two kmp_int32 so its size is 64-bit, and the default alignment is its size, which would also be 64-bit. What could be wrong here?

@rorth
Copy link
Collaborator Author

rorth commented Jun 11, 2025

Casting to a pointer to a type requiring 64-bit alignment when that isn't guaranteed is wrong.

kmp_teams_size_t is a struct of two kmp_int32 so its size is 64-bit, and the default alignment is its size, which would also be 64-bit. What could be wrong here?

No, its alignment is the alignment of the member with the strictest alignment requirement (i.e. alignof(kmp_int32), which is alignof(int), so just 4. So casting the address to a kmp_int64 * extends the alignment requirement, and this is exactly what I saw: an insn requiring 8-byte alignment accessing 4-byte-aligned data.

@brad0
Copy link
Contributor

brad0 commented Jun 19, 2025

@shiltian

Copy link
Contributor

@shiltian shiltian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I'd like @jprotze @jpeyton52 to have a second look.

Copy link
Contributor

@mjklemm mjklemm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @jpeyton52 please add your blessing, if you're OK with this PR.

@@ -1861,7 +1861,7 @@ typedef struct kmp_mem_desc { // Memory block descriptor
void *ptr_align; // Pointer to aligned memory, returned
kmp_allocator_t *allocator; // allocator
} kmp_mem_desc_t;
static int alignment = sizeof(void *); // align to pointer size by default
static int alignment = SizeQuant;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot find any assignment to this variable. What is the purpose of having this as a static?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @jprotze notes we can get rid of this global and make it a constexpr size_t:

Suggested change
static int alignment = SizeQuant;
constexpr size_t alignment = SizeQuant;

It is only used once down below in the __kmp_alloc() function (line 1929).

Copy link
Contributor

@jpeyton52 jpeyton52 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest looks good to me.

@@ -1861,7 +1861,7 @@ typedef struct kmp_mem_desc { // Memory block descriptor
void *ptr_align; // Pointer to aligned memory, returned
kmp_allocator_t *allocator; // allocator
} kmp_mem_desc_t;
static int alignment = sizeof(void *); // align to pointer size by default
static int alignment = SizeQuant;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @jprotze notes we can get rid of this global and make it a constexpr size_t:

Suggested change
static int alignment = SizeQuant;
constexpr size_t alignment = SizeQuant;

It is only used once down below in the __kmp_alloc() function (line 1929).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
openmp:libomp OpenMP host runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants