`size` field: allow `Integer` values of singleton type as elements #107

nsajko · 2025-02-12T16:40:01Z

Currently only Int values may be contained in the size field. This PR additionally allows values of any singleton type that subtypes Integer.

Fixes #54

Currently only `Int` values may be contained in the `size` field. This PR additionally allows values of any singleton type that subtypes `Integer`. Fixes JuliaArrays#54

codecov · 2025-02-12T16:41:58Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.12%. Comparing base (c4d238f) to head (77ee0ca).
Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #107      +/-   ##
==========================================
- Coverage   96.74%   96.12%   -0.63%     
==========================================
  Files           3        3              
  Lines         246      284      +38     
==========================================
+ Hits          238      273      +35     
- Misses          8       11       +3

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

nsajko · 2025-02-12T16:42:41Z

@PatrickHaecker you were interested in this functionality, would you like to try this out, or provide representative benchmarks?

I didn't benchmark anything yet, but this should speed up indexing when some of the dimensions are static as-is. There's probably more things that could be optimized on top of the current state of the PR.

nsajko · 2025-02-12T16:46:16Z

Here's a usage example using my package TypeDomainNaturalNumbers. There are other packages for static integers, too.

julia> using TypeDomainNaturalNumbers, FixedSizeArrays

julia> siz = (TypeDomainInteger(3), TypeDomainInteger(5))
(3, 5)

julia> Base.issingletontype(typeof(siz))
true

julia> siz === size(FixedSizeArray{Float32}(undef, siz))
true

julia> siz === size(reshape(FixedSizeArray{Float32}(undef, 15), siz))
true

giordano · 2025-02-12T18:26:48Z

src/FixedSizeArray.jl

+    x
+end
+
+struct FixedSizeArray{T,N,Mem<:DenseVector{T},Size<:NTuple{N,Integer}} <: DenseArray{T,N}


Isn't this turning FixedSizeArray too close to StaticArray by having the shape in the type domain?

There are some similarities with both HybridArrays.jl (allows specifying an arbitrary mix of statically and dynamically sized axes) and SizedArray from StaticArrays.jl (allows specifying statically sized axes for a wrapped dynamically-sized array). The main difference (both a benefit and limitation) over both HybridArrays.jl and StaticArrays.jl is that FixedSizeArray is dense (subypes DenseArray). And StaticArrays requires all axes to be statically sized, of course.

In case it's not clear, this PR doesn't remove any existing features, it just generalizes the current design.

I wasn't concerned about removing features but about putting too much information in the type domain. That's a recipe for skyrocketing compile times.

For what it's worth I think the test suite takes about the same time to execute as before. So, as long as all axes are dynamically sized, I don't expect compile time regressions.

I'm not sure you'd see much of a difference in tests here where we mostly do different operations. The problem is more when you do same operations on arrays of different sizes: the generated code is very similar, but each of them triggers full recompilation since the type is different. Of course putting more information in the type domain is more flexible (you can even dispatch on the exact shape of the array which in certain cases can be very convenient) but that comes at a cost. The constant size field isn't as flexible, but it still enables some compile-time optimisations (e.g. when constant-propagation does its job).

The problem is more when you do same operations on arrays of different sizes: the generated code is very similar, but each of them triggers full recompilation since the type is different.

That's definitely true, with this PR it's possible to create lots of distinct types, causing lots of work for the compiler. However in my mind if that situation happens it is almost surely what the user wants, no? The potential to unintentionally construct a FixedSizeArray with a statically sized axis seems small, given that static integers are rarely used.

Do I understand correctly that you're worried about this unintentional construction/type proliferation? If that's it, it should be possible to further diminish the possibility of unintentionally constructing an array with a statically sized axis: we could, for example, require such construction to specify the Size type parameter explicitly. That is, a constructor type that explicitly features a Size type parameter with singleton type constituents would be the only way to construct a FixedSizeArray with a non-Int-sized axis.

Does that seem like an improvement, and perhaps acceptable?

Update: Sorry, you can ignore this comment. I understood that Size can really be used to store the size information and I need to think this through.

Isn't this turning FixedSizeArray too close to StaticArray by having the shape in the type domain?

I have the feeling, that there is a misunderstanding, because I misread the code change in the beginning. Even after this change Size does not contain the size information, i.e. FixedSizeArray does not contain the shape in the type domain. This is in contrast to StaticArrays, although StaticArrays uses the same identifier Size for the shape information in its documentation.

What Size in this PR contains is the type of the size field for each dimension, e.g. only (a Tuple containing) Int8 for a short vector if the user specifies this. And while I now see the logic in the PR between the identifiers size and Size, I suggest to rename Size into SizeType or SizeT or something similar to avoid this confusion.

And while lifting size in the type domain could indeed trigger a lot of additional compilations (and would be best done in a new type in my opinion if anyone planned this), I think putting its type information into the type domain is reasonable. As default Int will be used as before, so there are no additional method compilations for existing users.

I expect only few different types which will be used in a typical program, i.e. a lot of combinations like {Int128, UInt8, UInt32} are unlikely to be ever seen in reality. ~~But it's great that they are available if need arises without any cost when not used.~~ As an NTuple is used, this type is not supported, so it would only be, e.g., {Int8, Int8, Int8}.

Ok, I just realised I did misunderstand the new type parameter: it isn't the size tuple itself, but its type:

julia> FixedSizeArray{Float64}(undef, 0, 0) 0×0 FixedSizeArray{Float64, 2, Memory{Float64}, Tuple{Int64, Int64}}

My concern was unwarranted (I read the code wrong and thought it'd move the tuple itself to the type domain), but I'd like @oscardssmith opinion on this. The type parameters are becoming a little unwieldy here.

I think we probably want to keep the size as Int. the overhead should be low.

PatrickHaecker · 2025-02-14T13:24:33Z

Ok, this is definitely genius in my opinion. I guess the only question might be if it's too genius. ;-)

I originally had expected that there could be two types like FixedSizeArrays (much simplified):

Type parameter for size type

struct FixedSizeVector1{T, SizeType<:Integer} # with `isprimitivetype(SizeType)`
    mem::Memory{T}
    size::SizeType
end

i.e. containing a size field with a type which is a concrete, primitive subtype of Integer. This has the advantage that its size is run-time defined and not parse-time defined and it's relatively compile-friendly with the disadvantage of needing some additional space per object for the size variable.

Type parameter for size

struct FixedSizeVector2{T, Size} # with `Size isa Int`
    mem::Memory{T}
end

i.e. containing the size information in the type. This can be compile-heavy and needs to be defined at parse-time, but saves some space for the size variable.

The problem of FixedSizeVector1 is that due to the 8 byte memory reference (focusing here on amd64 architectures) and the alignment rules, even FixedSizeVector1{Int8, UInt8} |> sizeof is 16 bytes due to padding. So there is only a benefit for more than one dimension, e.g. FixedSizeArray1{Int8, Tuple{UInt8, UInt8}} |> sizeof is 16 and not 24.

However, as long as Memory uses 8 bytes both for :length and for :ptr and inlining of mutable structs is not possible the memory saving is indeed limited. With FixedSizeVector2 or the implementation of this PR, the memory saving can be a bit larger, but again the larger advantages would probably be only there if inlining would be supported and differently sized small arrays are needed.

Questions:

What is the advantage of this PR compared to FixedSizeVector2?
What is the reason that the size tuple fields must be either singleton type or Int but no other primitive subtype of Integer?
Wouldn't this type be better implemented as a separate type with both the existing and the new type subtyping something like AbstractFixedSizeArray?

nsajko · 2025-02-16T18:55:04Z

An issue is that Base.OneTo, used by axes, doesn't handle a length of singleton type well. The solution is to use something like GenericOneTo{Int} here instead of Base.OneTo:

const Supertype = supertype(Base.OneTo)  # in anticipation of `AbstractOneTo`
struct GenericOneTo{ElementType <: Integer, Length <: Integer} <: Supertype{ElementType}
    length::Length
end

EDIT: xref:

nsajko · 2025-02-16T18:55:46Z

I'll answer all the comments in the future BTW.

nsajko · 2025-05-14T18:49:38Z

Closing, as we want to release, and I'm not sure how easy this approach is. To be clear, the PR already manages to move shape info to the type domain in a way controlled by the user, however it's not really clear how to automatically translate that into a performance improvement for all operations supported by fixed size arrays.

nsajko added 2 commits February 12, 2025 17:37

size field: allow Integer values of singleton type as elements

d7a3613

Currently only `Int` values may be contained in the `size` field. This PR additionally allows values of any singleton type that subtypes `Integer`. Fixes JuliaArrays#54

reshape

cee5106

This comment was marked as outdated.

Sign in to view

giordano reviewed Feb 12, 2025

View reviewed changes

nsajko marked this pull request as draft February 12, 2025 19:11

do check_size_type in collect_as, test error path

77ee0ca

nsajko marked this pull request as ready for review February 12, 2025 23:28

nsajko marked this pull request as draft February 16, 2025 18:50

nsajko closed this May 14, 2025

nsajko deleted the size branch May 14, 2025 18:49

size field: allow Integer values of singleton type as elements #107

size field: allow Integer values of singleton type as elements #107

Uh oh!

Conversation

nsajko commented Feb 12, 2025

Uh oh!

codecov bot commented Feb 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

nsajko commented Feb 12, 2025

Uh oh!

nsajko commented Feb 12, 2025

Uh oh!

This comment was marked as outdated.

giordano Feb 12, 2025

Choose a reason for hiding this comment

Uh oh!

nsajko Feb 12, 2025

Choose a reason for hiding this comment

Uh oh!

giordano Feb 12, 2025

Choose a reason for hiding this comment

Uh oh!

nsajko Feb 12, 2025

Choose a reason for hiding this comment

Uh oh!

giordano Feb 12, 2025

Choose a reason for hiding this comment

Uh oh!

nsajko Feb 12, 2025

Choose a reason for hiding this comment

Uh oh!

PatrickHaecker Feb 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

giordano Feb 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

oscardssmith Feb 13, 2025

Choose a reason for hiding this comment

Uh oh!

PatrickHaecker commented Feb 14, 2025

Uh oh!

nsajko commented Feb 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nsajko commented Feb 16, 2025

Uh oh!

nsajko commented May 14, 2025

Uh oh!

Uh oh!

`size` field: allow `Integer` values of singleton type as elements #107

`size` field: allow `Integer` values of singleton type as elements #107

codecov bot commented Feb 12, 2025 •

edited

Loading

PatrickHaecker Feb 13, 2025 •

edited

Loading

giordano Feb 13, 2025 •

edited

Loading

nsajko commented Feb 16, 2025 •

edited

Loading