Skip to content

Proposal: Significant performance improvements for Data.Map

Milan Straka's recent Haskell Symposium paper (PDF) shed light on the containers:Data.Map library, indicating there were both algorithmic and stylistic performance improvements to be made.

This proposal provides a patch that dramatically improves performance across the API. Three standard techniques are applied to the code:

Three complementary, but orthogonal patches are provided.

  1. Add a testsuite, with coverage data (currently 91% of top level functions, and all core functions).
  2. Add a micro-benchmark suite based on Criterion, for empirical evidence of improvements to each function.
  3. The optimization patch itself.

All 3 patches should be applied, under this proposal.

The benchmark results are very strong:

  • An average speedup across the core api of 47% (on intel i7 / linux 64 bit), and;
  • 36% on Mac / intel core 2 duo 32 bit).

[[Image(http://i.imgur.com/05BWe.png, 500px)]]

No bugs were identified in the development of the comprehensive testsuite, which includes a large set of unit tests from Kazu Yamamoto

A fork of the containers package, with the 3 patches (and a version bump to make it easier to test out) is available:

darcs get http://code.haskell.org/~dons/code/containers/

Separately, the patches are attached to this ticket.

The consideration period is 3 weeks (before ICFP).


Work carried out over 28/29th August, 2010 in Utrecht, NL, by Johan Tibell and Don Stewart.

Trac metadata
Trac field Value
Version 6.12.3
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component libraries (other)
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information