Skip to content

Improve mapping over keys for IntMap and IntSet #1148

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

meooow25
Copy link
Contributor

@meooow25 meooow25 commented Jun 15, 2025

  • Implement the functions in terms of the builders instead of converting to and from a list. This reduces time and allocations.
  • Document that they take linear time for monotonic functions.

Closes #1027


Benchmarks with GHC 9.10.1:

IntSet:

Name            Time - - - - - - - -    Allocated - - - - -
                     A       B     %         A       B     %
map:asc          36 μs   22 μs  -37%    456 KB  136 KB  -70%
map:random      547 μs  513 μs   -6%    2.9 MB  2.6 MB  -10%
mapMonotonic     35 μs   23 μs  -34%    456 KB  136 KB  -70%

IntMap:

Name                    Time - - - - - - - -    Allocated - - - - -
                             A       B     %         A       B     %
mapKeys:asc             101 μs   68 μs  -32%    960 KB  608 KB  -36%
mapKeys:random          594 μs  600 μs   +1%    3.0 MB  2.7 MB  -11%
mapKeysMonotonic         94 μs   63 μs  -32%    960 KB  608 KB  -36%
mapKeysWith:asc:dups    129 μs   61 μs  -52%    1.2 MB  512 KB  -58%

meooow25 added 2 commits June 15, 2025 12:55
* Implement the functions in terms of the builders instead of converting
  to and from a list. This reduces time and allocations.
* Document that they take linear time for monotonic functions.
Copy link
Contributor

@treeowl treeowl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

@meooow25
Copy link
Contributor Author

Looks like we can get faster (~15%) if we work with the non-empty part of the builder separately, since we have, e.g.

data MonoState
  = MSNada
  | MSPush {-# UNPACK #-} !Int {-# UNPACK #-} !BitMap !Stack

If we get to first entry separately to avoid the MSNada, the MSPush can be unpacked in the rest of the fold. This isn't something GHC can figure out, though it is a bit like SpecConstr.

The downside is that we'll need a few more functions operating on non-empty MonoState, but it shouldn't be too messy.

@meooow25
Copy link
Contributor Author

It turns out that SpecConstr can do it! This is really neat. The problem here is that foldlWithKey'

foldlWithKey' :: (a -> Key -> b -> a) -> a -> IntMap b -> a
foldlWithKey' f z = \t -> -- Use lambda t to be inlinable with two arguments only.
case t of
Bin p l r
| signBranch p -> go (go z r) l -- put negative numbers before
| otherwise -> go (go z l) r
_ -> go z t
where
go !z' Nil = z'
go z' (Tip kx x) = f z' kx x
go z' (Bin _ l r) = go (go z' l) r
{-# INLINE foldlWithKey' #-}

has go z' Nil = z' which disables it, probably to preserve the object.

But Nil can only occur at the top-level, so we can replace that RHS with error and move the Nil check to the top-level, which does the trick.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

mapKeys can be more efficient
2 participants