Commit 880fa44

Improve ListUtils docs
* Tighten the performance bounds. The final size of the set is the number of *distinct* elements in the list, not the total number of elements.
* Fix missing parenthesis.
* Add example.
1 parent 6462ef3 commit 880fa44
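
The tightened bound in the commit message can be illustrated with a minimal sketch. It assumes a plain `Data.Set`-based nub; `nubOrdWithSeen` is a hypothetical helper written for this illustration, not a function from `containers`, and not necessarily how `nubOrd` is implemented. It returns the deduplicated list together with the final size of the set of seen elements, which equals d (distinct elements) rather than n (total elements).

```haskell
import qualified Data.Set as Set

-- Hypothetical helper (not part of containers): an Ord-based nub that also
-- reports how large the set of seen elements is once the list is consumed.
nubOrdWithSeen :: Ord a => [a] -> ([a], Int)
nubOrdWithSeen = go Set.empty
  where
    go seen [] = ([], Set.size seen)
    go seen (x : xs)
      | x `Set.member` seen = go seen xs            -- duplicate: drop it
      | otherwise =
          let (rest, k) = go (Set.insert x seen) xs -- first occurrence: keep it
          in  (x : rest, k)

main :: IO ()
main = do
  let xs = concat (replicate 1000 "abc")  -- n = 3000, d = 3
  print (length xs)                       -- 3000
  print (nubOrdWithSeen xs)               -- ("abc",3)
```

Every membership test and insertion works on a set of at most d elements, so each of the n steps costs O(log d), which is where a bound of the form O(n log d) comes from.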

1 file changed, +14 -8 lines changed

Data/Containers/ListUtils.hs

Lines changed: 14 additions & 8 deletions
@@ -13,6 +13,10 @@
 -- Portability : portable
 --
 -- This module provides efficient containers-based functions on the list type.
+--
+-- In the documentation, \(n\) is the number of elements in the list while
+-- \(d\) is the number of distinct elements in the list. \(W\) is the number
+-- of bits in an 'Int'.
 -----------------------------------------------------------------------------
 
 module Data.Containers.ListUtils (
@@ -33,19 +37,20 @@ import GHC.Exts ( build )
 -- *** Ord-based nubbing ***
 
 
--- | \( O(n \log n \). The @nubOrd@ function removes duplicate elements from a list.
--- In particular, it keeps only the first occurrence of each element. By using a
--- 'Set' internally it has better asymptotics than the standard 'Data.List.nub'
--- function.
+-- | \( O(n \log d) \). The @nubOrd@ function removes duplicate elements from a
+-- list. In particular, it keeps only the first occurrence of each element. By
+-- using a 'Set' internally it has better asymptotics than the standard
+-- 'Data.List.nub' function.
 --
 -- ==== Strictness
 --
 -- @nubOrd@ is strict in the elements of the list.
 --
 -- ==== Efficiency note
 --
--- When applicable, it is almost always better to use 'nubInt' or 'nubIntOn' instead
--- of this function. For example, the best way to nub a list of characters is
+-- When applicable, it is almost always better to use 'nubInt' or 'nubIntOn'
+-- instead of this function, although it can be a little worse in certain
+-- pathological cases. For example, to nub a list of characters, use
 --
 -- @ nubIntOn fromEnum xs @
 nubOrd :: Ord a => [a] -> [a]
@@ -114,7 +119,7 @@ constNubOn x _ = x
 -- *** Int-based nubbing ***
 
 
--- | \( O(n \min(n,W)) \). The @nubInt@ function removes duplicate 'Int'
+-- | \( O(n \min(d,W)) \). The @nubInt@ function removes duplicate 'Int'
 -- values from a list. In particular, it keeps only the first occurrence
 -- of each element. By using an 'IntSet' internally, it attains better
 -- asymptotics than the standard 'Data.List.nub' function.
@@ -130,7 +135,8 @@ nubInt = nubIntOn id
 
 -- | The @nubIntOn@ function behaves just like 'nubInt' except it performs
 -- comparisons not on the original datatype, but a user-specified projection
--- from that datatype.
+-- from that datatype. For example, @nubIntOn 'fromEnum'@ can be used to
+-- nub characters and typical fixed-with numerical types efficiently.
 --
 -- ==== Strictness
 --
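
As a usage sketch of the functions documented in this diff, the example below calls `nubOrd`, `nubInt`, and `nubIntOn fromEnum` on small inputs. The sample lists are made up for illustration, and the example assumes a `containers` version that ships `Data.Containers.ListUtils`. In each case only the first occurrence of an element is kept, and the original order is preserved.

```haskell
import Data.Containers.ListUtils (nubInt, nubIntOn, nubOrd)
import Data.Word (Word8)

main :: IO ()
main = do
  -- Ord-based nubbing: works for any type with an Ord instance.
  print (nubOrd ["b", "a", "b", "c", "a"])    -- ["b","a","c"]

  -- Int-based nubbing: backed by an IntSet.
  print (nubInt [3, 1, 3, 2, 1])              -- [3,1,2]

  -- Projecting characters to Int with fromEnum, as the docs suggest.
  putStrLn (nubIntOn fromEnum "mississippi")  -- misp

  -- The same projection applied to a fixed-width numeric type.
  print (nubIntOn fromEnum ([7, 7, 255, 0, 7] :: [Word8]))  -- [7,255,0]
```

The `fromEnum` projection is only safe when it is injective for the type in question, which holds for `Char` and for fixed-width types no wider than `Int`.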
