
Introduce BlockSize #3716


Open
wants to merge 19 commits into
base: main

Conversation

schnellerhase
Contributor

@schnellerhase schnellerhase commented Apr 27, 2025

In performance-critical code, some block sizes are optimized by compiling explicit versions in which the block size is a compile-time constant. At the same time, general runtime block sizes are supported through a function argument.

This causes

  1. code duplication: one path for runtime block sizes and one for compile-time block sizes, and
  2. duplicate input of the block size: once as a template argument and once as a function argument (that the two match is only asserted, so nothing is raised in release builds, to avoid the performance impact).

This PR introduces a BlockSize concept that holds either a runtime int or a compile-time std::integral_constant<int, bs>. This allows code paths to be generated explicitly for certain sizes while keeping a single shared implementation for both cases.

  • form packing optimizes for block sizes 1, 2, 3, while vector assembly optimizes for 1 and 3: is this mismatch intentional?
  • matrix operation routines

@jhale
Member

jhale commented Apr 27, 2025

Looks very nice. Could we review the basic approach before you spend lots more time on it?

@schnellerhase
Contributor Author

Sure thing. It should be good to go as is and can be extended further once approved. One neat byproduct of these changes is that they would allow operations on MatrixCSR with block sizes that are not fixed at compile time, which we are currently missing.

@schnellerhase schnellerhase marked this pull request as ready for review April 27, 2025 18:50
@chrisrichardson chrisrichardson self-requested a review April 28, 2025 15:26
Contributor

@chrisrichardson chrisrichardson left a comment


Looking good.

@garth-wells
Member

Looks really neat.

  • Should the name be more generic? It's basically a runtime or templated integer. I can think of applications outside of block size, e.g. geometric dimension, where it could be useful.
  • Should it support different integer types?
  • Could tests be added to check that, when it's a compile-time integer, it really is a compile-time integer?

@schnellerhase
Contributor Author

Points 1 and 2 should be no problem. How about ConstexprType as the name for the general concept?

Regarding 3: the interface to retrieve the value (here block_size) needs to be able to produce both a runtime value and a compile-time value. Therefore it cannot be marked constexpr. Testing for inlining of the compile-time variant is also not straightforward, as this remains a compiler decision in all cases. The best way to check for its effect, I assume, would be a benchmark of those cases.

@garth-wells
Member

> Regarding 3: the interface to retrieve the value (here block_size) needs to be able to produce both a runtime value and a compile-time value. Therefore it cannot be marked constexpr. Testing for inlining of the compile-time variant is also not straightforward, as this remains a compiler decision in all cases. The best way to check for its effect, I assume, would be a benchmark of those cases.

I don't like relying on the compiler to inline things that we know are known at compile time. We have avoided this in the past and preferred being explicit over relying on the compiler and then not knowing what the compiler does.

@schnellerhase
Contributor Author

It would be best if the block_size/value function were constexpr in the compile-time case. I will see whether I can recover that behaviour.

@schnellerhase
Contributor Author

schnellerhase commented Apr 30, 2025

I think I have a fix: value(ConstexprType<T, V>) is now constexpr when is_compile_v<T, V> is true, and not otherwise. The test case shows that we can now assert at compile time. (Block size is not yet adapted.)


4 participants