Skip to content

Proposal: Further performance improvements of Data.Set

This proposal depends on #4280 (closed).

This proposal further improves the performance of Data.Set. It consists of five patches:

  1. making the set datatype to store the elements evaluated (by using a bang). This is in accordance with a poll on libraries@haskell.org, see point 1) of http://article.gmane.org/gmane.comp.lang.haskell.libraries/13273
  2. improvements to union and difference implementation (by eliminating high-order function)
  3. improvements to balance function which is used by nearly all methods modificating a set (making one monolithic function which allows perform only one pattern-match on the tree nodes)
  4. correction of the test (the Arbitrary instance could generate unbalanced trees, which the patch 3. discovered)
  5. added benchmark of union, difference and intersection methods

On i386 Intel Core2 and GHC 6.12.1, the improvements are

delete       13.46
deleteMax    13.58
deleteMin    15.93
difference   26.17
insert       12.20
intersection 2.21
member       9.52
union        27.59

The repository of the containers package with these patches (and also several others) is at http://fox.auryn.cz/darcs/containers/.

The patches are also attached (including #4280 (closed), which are not commited yet).

Trac metadata
Trac field Value
Version 6.12.3
Type Task
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component libraries (other)
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information