Skip to content

Data.List: Replace nub; add nubOrd, nubInt, nubWith

Everyone always complains about nub, but nobody ever does anything about it except (map head . group . sort), which isn't lazy and isn't always faster. :-)

I've implemented a new function nubWith that takes a "stop list" as an argument and filters its target list against the stop list. I've then re-implemented nub and implemented nubOrd and nubInt in terms of nubWith: the stop list is a typeclass, so these implementations are trivial and new implementations are easily added. nubBy is left alone, since there's nothing obvious to be done about it. All of the nubs are still fully lazy.

Basic !QuickCheck tests are provided, and pass.

Performance benchmarking code is provided. The performance of my nub implementation is quite comparable to that of the standard one. My nubOrd and nubInt implementations are dramatically faster, since they use a Set and !IntSet respectively for the stop list. In particular, they are performant on long lists with long nubs, unlike the basic nub.

My implementation is available via git at

git://svcs.cs.pdx.edu/git/nub.git

or can be browsed at

http://svcs.cs.pdx.edu/gitweb?p=nub.git;a=tree and has a maybe-outdated tarball at http://svcs.cs.pdx.edu/haskell/nub.tar.gz

The Nub.hs file itself is attached to this proposal. If the proposal is accepted, I will prepare a patch against current GHC library top-of-tree, but for now it seems easier for everyone to just look at the bits in their current natural habitat.

Trac metadata
Trac field Value
Version
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component libraries/base
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system Multiple
Architecture Multiple
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information