implement sortperm #773

stev47 · 2020-04-24T11:06:34Z

fixes #772

Is using similar_type for the index array a good idea?
Do we want to keep stability semantics from Base or get more performance?

c42f · 2020-04-24T11:32:05Z

Great questions, and very neat PR, thanks!

For a combination of both stability and speed, sorting a vector of Tuple like I did in my first attempt at #772 seemed quite fast. So perhaps there's a way to adapt that approach to the fully featured version?

For similar_type another good question. I feel like it should be the similar_type of the axes, if anything? But SOneTo is not a StaticArray so it's going to turn into SVector in any case. So I feel like just returning SVector{N,Int} would be fine.

stev47 · 2020-04-24T13:19:01Z

Thank you for the comments!
I added the lexicographic sorting and it seems to be faster indeed, I wonder if that should be done in Base as well.
Your implementation is additionally pairing the elements in advance and thus trading movements for indexing operations. I have mixed benchmarking results for that against the updated pull request.

@inline sortperm2(v::StaticVector) = last.(sort(tuple.(v,SVector{length(v)}(1:length(v)))))
v = rand(SVector{4,Int})
@btime sortperm($(Ref(v))[]) # 9.158 ns (0 allocations: 0 bytes)
@btime sortperm2($(Ref(v))[]) # 5.737 ns (0 allocations: 0 bytes)
v = rand(SVector{19,Int})
@btime sortperm($(Ref(v))[]) # 105.775 ns (0 allocations: 0 bytes)
@btime sortperm2($(Ref(v))[]) # 284.212 ns (0 allocations: 0 bytes)

Can you confirm? I'm not sure about what approach is better.

andyferris · 2020-05-11T00:43:05Z

Is using similar_type for the index array a good idea?

I'm not sure what the answer is. Keep in mind that the output of sortperm is often used as an indexer like v[sortperm(v)] or v1[sortperm(v2)]. What do you want those to be like? Indexing should do similar_type on the indexer not the indexee (somewhat simplified but v1[v2] is basically map(i -> v1[i], v2) which is "similar" to v2).

stev47 · 2020-05-11T07:57:38Z

I agree, as @c42f originally mentioned similar_type(axes(a)) is conceptually cleaner. Should we go for that instead of the more concrete SVector{length(a),Int}(a)? (though with the current SOneTo they should be equivalent)

andyferris · 2020-05-11T08:53:42Z

axes won’t return anything to distinguish it other than that it’s a StaticArray.

I really don’t mind. SVector seems fine to me. I’d be happy to merge like that and iterate on the “similar” approach in the future.

c42f · 2020-06-05T03:13:11Z

So, with the tuple-based implementation (rather than the default lt(::Perm)) it turns out we have an inconsistency with Base:

julia> mutable struct S
           i::Int
           id::Symbol
       end

julia> Base.isless(a::S, b::S) = isless(a.i, b.i)

julia> sort([S(1,:a), S(1,:b)], rev=true)
2-element Array{S,1}:
 S(1, :a)
 S(1, :b)

julia> sort(SA[S(1,:a), S(1,:b)], rev=true)
2-element SArray{Tuple{2},S,1,2} with indices SOneTo(2):
 S(1, :b)
 S(1, :a)

In base an explicit < is used for the indices, regardless of rev=true or false. I think that is correct, so I think the tuple is not the problem. Rather, it may be the double comparison which is used in Base (a < b & !(b < a)) which is a pretty annoying speed penalty for a small amount of correctness :-(

Is there some slightly different way to formulate this so that it's both fast and consistent with Base?

c42f · 2020-06-05T03:16:20Z

I added the lexicographic sorting and it seems to be faster indeed, I wonder if that should be done in Base as well.

This is an interesting question. While I think lexicographic ordering with respect to lt isn't quite right (for the reasons identified above) I think you're correct that something fishy is going on here, and it's possible that with some tweaking the Base implementation could be made faster.

c42f · 2020-06-05T03:33:41Z

By the way @stev47, the following seems interesting and possibly relevant

https://blog.reverberate.org/2020/05/29/hoares-rebuttal-bubble-sorts-comeback.html

stev47 · 2020-06-06T17:27:59Z

Good catch! The additional comparison on the index in Base was actually introduced as a workaround for the floating point comparison quirks (see JuliaLang/julia@d427efd). I couldn't come up with a faster version than what is currently in Base though.
Should we use lt() from Base for correctness (~2x performance penalty), go with the current lexicographic approach or come up with something else (see below)?

The natural way would be to just have lt(p::Perm, a, b) = lt(p.order, p.data[a], p.data[b]) (as originally in Base) and use a stable sorting algorithm for permutations by default instead of forcing that into lt(). One would invoke JuliaLang/julia#6177, but I believe that issue should probably be solved in some way specific to floating point types. Still, bitonic sort is not stable, so one would need to implement some other immutable algorithm.

Thank you for the link. Sorting networks are actually already branchless by design, vectorization on the other hand would definitely be nice-to-have.

c42f · 2020-06-07T12:47:03Z

Well I think as a general purpose library we need to be correct first, and fast second. Having a closer look, it seems to me that we can't use the tuple trick in any case because lt is some general ordering which may not be defined for tuples.

The Tuple comparison based version may be faster because it assumes a logical relationship between isequal and isless (isequal(a,b) is same as !isless(a,b) && !isless(b,a)) which may lead to better code in the end. It's possible that shows a way to optimize lt(::Perm) in the particular common case that lt is <.

fixes JuliaArrays#772

c42f · 2020-06-16T04:32:49Z

Beautiful, thanks for continuing to push this through.

stev47 force-pushed the feature/sortperm branch from c6d97cf to aa340e2 Compare April 24, 2020 13:10

stev47 mentioned this pull request Apr 24, 2020

implement sortperm #772

Closed

andyferris approved these changes May 11, 2020

View reviewed changes

implement sortperm

e35a4e1

fixes JuliaArrays#772

stev47 force-pushed the feature/sortperm branch from aa340e2 to e35a4e1 Compare June 7, 2020 13:47

c42f merged commit 34b84ca into JuliaArrays:master Jun 16, 2020

tpapp mentioned this pull request Dec 3, 2020

sortperm is unstable #858

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

implement sortperm #773

implement sortperm #773

Uh oh!

stev47 commented Apr 24, 2020

Uh oh!

c42f commented Apr 24, 2020

Uh oh!

stev47 commented Apr 24, 2020 •

edited

Loading

Uh oh!

andyferris commented May 11, 2020

Uh oh!

stev47 commented May 11, 2020

Uh oh!

andyferris commented May 11, 2020

Uh oh!

c42f commented Jun 5, 2020

Uh oh!

c42f commented Jun 5, 2020

Uh oh!

c42f commented Jun 5, 2020

Uh oh!

stev47 commented Jun 6, 2020

Uh oh!

c42f commented Jun 7, 2020

Uh oh!

c42f commented Jun 16, 2020

Uh oh!

Uh oh!

implement sortperm #773

implement sortperm #773

Uh oh!

Conversation

stev47 commented Apr 24, 2020

Uh oh!

c42f commented Apr 24, 2020

Uh oh!

stev47 commented Apr 24, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

andyferris commented May 11, 2020

Uh oh!

stev47 commented May 11, 2020

Uh oh!

andyferris commented May 11, 2020

Uh oh!

c42f commented Jun 5, 2020

Uh oh!

c42f commented Jun 5, 2020

Uh oh!

c42f commented Jun 5, 2020

Uh oh!

stev47 commented Jun 6, 2020

Uh oh!

c42f commented Jun 7, 2020

Uh oh!

c42f commented Jun 16, 2020

Uh oh!

Uh oh!

stev47 commented Apr 24, 2020 •

edited

Loading