Skip to content

implement sortperm #773

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 16, 2020
Merged

implement sortperm #773

merged 1 commit into from
Jun 16, 2020

Conversation

stev47
Copy link
Contributor

@stev47 stev47 commented Apr 24, 2020

fixes #772

  • Is using similar_type for the index array a good idea?
  • Do we want to keep stability semantics from Base or get more performance?

@c42f
Copy link
Member

c42f commented Apr 24, 2020

Great questions, and very neat PR, thanks!

For a combination of both stability and speed, sorting a vector of Tuple like I did in my first attempt at #772 seemed quite fast. So perhaps there's a way to adapt that approach to the fully featured version?

For similar_type another good question. I feel like it should be the similar_type of the axes, if anything? But SOneTo is not a StaticArray so it's going to turn into SVector in any case. So I feel like just returning SVector{N,Int} would be fine.

@stev47 stev47 force-pushed the feature/sortperm branch from c6d97cf to aa340e2 Compare April 24, 2020 13:10
@stev47
Copy link
Contributor Author

stev47 commented Apr 24, 2020

Thank you for the comments!
I added the lexicographic sorting and it seems to be faster indeed, I wonder if that should be done in Base as well.
Your implementation is additionally pairing the elements in advance and thus trading movements for indexing operations. I have mixed benchmarking results for that against the updated pull request.

@inline sortperm2(v::StaticVector) = last.(sort(tuple.(v,SVector{length(v)}(1:length(v)))))
v = rand(SVector{4,Int})
@btime sortperm($(Ref(v))[]) # 9.158 ns (0 allocations: 0 bytes)
@btime sortperm2($(Ref(v))[]) # 5.737 ns (0 allocations: 0 bytes)
v = rand(SVector{19,Int})
@btime sortperm($(Ref(v))[]) # 105.775 ns (0 allocations: 0 bytes)
@btime sortperm2($(Ref(v))[]) # 284.212 ns (0 allocations: 0 bytes)

Can you confirm? I'm not sure about what approach is better.

@stev47 stev47 mentioned this pull request Apr 24, 2020
@andyferris
Copy link
Member

Is using similar_type for the index array a good idea?

I'm not sure what the answer is. Keep in mind that the output of sortperm is often used as an indexer like v[sortperm(v)] or v1[sortperm(v2)]. What do you want those to be like? Indexing should do similar_type on the indexer not the indexee (somewhat simplified but v1[v2] is basically map(i -> v1[i], v2) which is "similar" to v2).

@stev47
Copy link
Contributor Author

stev47 commented May 11, 2020

I agree, as @c42f originally mentioned similar_type(axes(a)) is conceptually cleaner. Should we go for that instead of the more concrete SVector{length(a),Int}(a)? (though with the current SOneTo they should be equivalent)

@andyferris
Copy link
Member

axes won’t return anything to distinguish it other than that it’s a StaticArray.

I really don’t mind. SVector seems fine to me. I’d be happy to merge like that and iterate on the “similar” approach in the future.

@c42f
Copy link
Member

c42f commented Jun 5, 2020

So, with the tuple-based implementation (rather than the default lt(::Perm)) it turns out we have an inconsistency with Base:

julia> mutable struct S
           i::Int
           id::Symbol
       end

julia> Base.isless(a::S, b::S) = isless(a.i, b.i)

julia> sort([S(1,:a), S(1,:b)], rev=true)
2-element Array{S,1}:
 S(1, :a)
 S(1, :b)

julia> sort(SA[S(1,:a), S(1,:b)], rev=true)
2-element SArray{Tuple{2},S,1,2} with indices SOneTo(2):
 S(1, :b)
 S(1, :a)

In base an explicit < is used for the indices, regardless of rev=true or false. I think that is correct, so I think the tuple is not the problem. Rather, it may be the double comparison which is used in Base (a < b & !(b < a)) which is a pretty annoying speed penalty for a small amount of correctness :-(

Is there some slightly different way to formulate this so that it's both fast and consistent with Base?

@c42f
Copy link
Member

c42f commented Jun 5, 2020

I added the lexicographic sorting and it seems to be faster indeed, I wonder if that should be done in Base as well.

This is an interesting question. While I think lexicographic ordering with respect to lt isn't quite right (for the reasons identified above) I think you're correct that something fishy is going on here, and it's possible that with some tweaking the Base implementation could be made faster.

@c42f
Copy link
Member

c42f commented Jun 5, 2020

By the way @stev47, the following seems interesting and possibly relevant

https://blog.reverberate.org/2020/05/29/hoares-rebuttal-bubble-sorts-comeback.html

@stev47
Copy link
Contributor Author

stev47 commented Jun 6, 2020

Good catch! The additional comparison on the index in Base was actually introduced as a workaround for the floating point comparison quirks (see JuliaLang/julia@d427efd). I couldn't come up with a faster version than what is currently in Base though.
Should we use lt() from Base for correctness (~2x performance penalty), go with the current lexicographic approach or come up with something else (see below)?

The natural way would be to just have lt(p::Perm, a, b) = lt(p.order, p.data[a], p.data[b]) (as originally in Base) and use a stable sorting algorithm for permutations by default instead of forcing that into lt(). One would invoke JuliaLang/julia#6177, but I believe that issue should probably be solved in some way specific to floating point types. Still, bitonic sort is not stable, so one would need to implement some other immutable algorithm.

Thank you for the link. Sorting networks are actually already branchless by design, vectorization on the other hand would definitely be nice-to-have.

@c42f
Copy link
Member

c42f commented Jun 7, 2020

Well I think as a general purpose library we need to be correct first, and fast second. Having a closer look, it seems to me that we can't use the tuple trick in any case because lt is some general ordering which may not be defined for tuples.

The Tuple comparison based version may be faster because it assumes a logical relationship between isequal and isless (isequal(a,b) is same as !isless(a,b) && !isless(b,a)) which may lead to better code in the end. It's possible that shows a way to optimize lt(::Perm) in the particular common case that lt is <.

@stev47 stev47 force-pushed the feature/sortperm branch from aa340e2 to e35a4e1 Compare June 7, 2020 13:47
@c42f
Copy link
Member

c42f commented Jun 16, 2020

Beautiful, thanks for continuing to push this through.

@c42f c42f merged commit 34b84ca into JuliaArrays:master Jun 16, 2020
@tpapp tpapp mentioned this pull request Dec 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

implement sortperm
3 participants