Skip to content

Fix signed version of radix_sort #3724

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
May 20, 2025
Merged

Conversation

schnellerhase
Copy link
Contributor

@schnellerhase schnellerhase commented May 7, 2025

Fixes bug in radix_sort, reported by @jorgensd, for signed integers where bitwise comparison failed and resulted in either infinite loops or unsorted result.

Fixes it by radix sort only working on unsigned ints internally and inputs with signed getting shifted to corresponding unsigned int ranges.

@schnellerhase schnellerhase marked this pull request as ready for review May 7, 2025 13:21
@IgorBaratta
Copy link
Member

Why not define the projection outside and only apply it when there's signed integers?
Is there any performance overhead introduced by always performing this conversion?

otherwise, LGTM.

@schnellerhase
Copy link
Contributor Author

Good catch - allows for making the logic quite a bit more expressive, thanks.

@schnellerhase
Copy link
Contributor Author

Reusing schnellerhase#24.

On main:

Benchmark                               Time             CPU   Iterations
-------------------------------------------------------------------------
BM_create_box/10/iterations:5        4.08 ms         4.04 ms            5
BM_create_box/50/iterations:5         530 ms          519 ms            5
BM_create_box/100/iterations:5       4640 ms         4627 ms            5

On this branch:

-------------------------------------------------------------------------
Benchmark                               Time             CPU   Iterations
-------------------------------------------------------------------------
BM_create_box/10/iterations:5        4.72 ms         4.71 ms            5
BM_create_box/50/iterations:5         551 ms          551 ms            5
BM_create_box/100/iterations:5       5038 ms         5033 ms            5

Has some impact.

@schnellerhase
Copy link
Contributor Author

Best guess would be that we now always max out the iteration count max_value since we set the most significant bit. Any ideas how to avoid that?

@schnellerhase
Copy link
Contributor Author

schnellerhase commented May 12, 2025

We only need to do the sign flip in case we detect a negative number, maybe check at runtime and use the unmodified projection if no casting is necessary?

@schnellerhase
Copy link
Contributor Author

After optimisation for first bit (ignore for order if all set)

-------------------------------------------------------------------------
Benchmark                               Time             CPU   Iterations
-------------------------------------------------------------------------
BM_create_box/10/iterations:5        3.94 ms         3.90 ms            5
BM_create_box/50/iterations:5         479 ms          479 ms            5
BM_create_box/100/iterations:5       4496 ms         4495 ms            5

@jorgensd
Copy link
Member

@IgorBaratta is this ready to me merged?

@IgorBaratta
Copy link
Member

@IgorBaratta is this ready to me merged?

LGTM!

@jorgensd jorgensd added this pull request to the merge queue May 20, 2025
Merged via the queue into FEniCS:main with commit 1473fe4 May 20, 2025
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants