Skip to content

Review containers changes

The recent containers changes have caused an allocation regression in GHC:

in T1969:
   containers-0.3,                allocations: 246,402,492
   containers + dons/tibbe/milan, allocations: 283,370,180 (+15%)

in T3294:
   containers-0.3,                allocations: 832,307,880
   containers + dons/tibbe/milan, allocations: 950,107,000 (+14%)

Some of the patches were also pushed in a hurry in the buildup to the 7.0 release. We should review them.

Fri Sep 24 16:49:46 BST 2010  Milan Straka <fox@ucw.cz>
  * Fix warnings in Data.Map and Data.Set.
  Ignore-this: cb2c0a8ecf0a57acc5941ba841aa7c40

  Only trivial changes.

Fri Sep 24 16:33:53 BST 2010  Milan Straka <fox@ucw.cz>
  * Finish the started worker/wrapper transformation.
  Ignore-this: baeb24573242beb56c3bbe7ca67f5ff7

  Some methods (insert, lookup) were not modified as the rest
  (like insertWith, delete, ...). Also the `seq` were missing
  sometimes.

Fri Sep 24 16:26:42 BST 2010  Milan Straka <fox@ucw.cz>
  * Merge all the OPTIONS and LANGUAGE module pragmas.
  Ignore-this: 86067abf13f0501f29c13ec7c877533c

Fri Sep 24 16:20:08 BST 2010  Milan Straka <fox@ucw.cz>
  * Remove most INLINE from Map, Set, IntMap and IntSet.
  Ignore-this: c88c4ede21c06bfda20af131c232a720

  Because of a code bloat the INLINEs cause, remove most of
  them. The only INLINEs left are the following:
  - only in Set and Map, because in IntMap and IntSet the specialization
    does not help
  - only on functions which need Ord
  - only on 'small' functions, namely member, notMember, lookup*,
    insert*, delete*, adjust*, alter*, update*

  All other functions of Map, Set, IntMap and IntSet are marked INLINABLE,
  even if they are recursive.

  The INLINEs left are only a short-term solution. In the long run the
  auto-specialization of INLINABLE methods seems a good way (maybe
  SPECIALIZABLE).

Fri Sep 24 12:07:05 BST 2010  Milan Straka <fox@ucw.cz>
  * Comment tests and benchmarks on foldlWithKey' which
  Ignore-this: 71b988389e6ae9a78ea3b0e20156ca2f
  was commented recently by Ian Lynagh.

Thu Sep 23 13:56:04 BST 2010  Milan Straka <fox@ucw.cz>
  * Worker/wrapper transformation for Data.IntSet.
  Ignore-this: b0228582818f7bfb690d0853022a7809

Tue Sep 21 12:58:21 BST 2010  Milan Straka <fox@ucw.cz>
  * Compile only the benchmark source, not the Data/*.hs.
  Ignore-this: f94d9e3ffe126cd057d23490c973a4e9

Tue Sep 21 11:32:25 BST 2010  Milan Straka <fox@ucw.cz>
  * Add criterion-based benchmark for IntSet.hs.
  Ignore-this: 3d31a820830c7382748626bc9a1ba54

  The benchmark is nearly identical copy of Set.hs benchmark.

Tue Sep 21 11:28:02 BST 2010  Milan Straka <fox@ucw.cz>
  * Add a testsuite for Data.IntSet.
  Ignore-this: e55484ee185e71915452bdf2a7b2a2b3

Tue Sep 21 10:18:28 BST 2010  Milan Straka <fox@ucw.cz>
  * Further improve Data.Set balance function.
  Ignore-this: f23be37859224e9bbe919a3c0a71fdc6

  As suggested by Kazu Yamamoto, we split balance to balanceL and
  balanceR, which handle only one-sided inbalance, but need fewer
  tests than balance.

  As nearly all functions modifying the structure use balance, this
  results in speedup of many functions. On my 32-bit GHC 6.12.1,
  11% speedup for insert, 12% speedup for delete.

Tue Sep 21 10:15:47 BST 2010  Milan Straka <fox@ucw.cz>
  * Further improve Data.Map balance function.
  Ignore-this: 8abfd027142a5183b2b5282e96ccb414

  As suggested by Kazu Yamamoto, we split balance to balanceL and
  balanceR, which handle only one-sided inbalance, but need fewer
  tests than balance.

  As nearly all functions modifying the structure use balance, this
  results in speedup of many functions. On my 32-bit GHC 6.12.1,
  20% speedup for insert, 7% speedup for delete, 5% speedup for update.

Tue Sep 21 10:05:07 BST 2010  Milan Straka <fox@ucw.cz>
  * Changing delta to 3 in Data.Set.
  Ignore-this: a47d0c542ed9cee99ad6b17c52c977a1

  Only possible values are 3 and 4. The value 3 has much faster inserts,
  value 4 slightly faster deletes, so choosing 3.

  Also changed the inequalities to rebalance only when one subtree
  is _strictly_ larger than delta * the other one, to mimic the behaviour
  from the proof (both from the Adams' and from the one to come).

Tue Sep 21 10:03:58 BST 2010  Milan Straka <fox@ucw.cz>
  * Changing delta to 3 in Data.Map.
  Ignore-this: 85f733f836b65b2b1038383ddb92e8e1

  Only possible values are 3 and 4. The value 3 has much faster inserts,
  value 4 slightly faster deletes, so choosing 3.

  Also changed the inequalities to rebalance only when one subtree
  is _strictly_ larger than delta * the other one, to mimic the behaviour
  from the proof (both from the Adams' and from the one to come).

Tue Sep 14 16:04:42 BST 2010  Milan Straka <fox@ucw.cz>
  * Correct Data.Set Arbitrary instance never to return unbalanced trees.
  Ignore-this: b5c70fa98a56f225b8eb5faf420677b0

  The previous instance sometimes returned unbalanced trees,
  which broke the tests.

  Also the new instance mimics Data.Map instance more closely in the shape
  of the generated trees.

Tue Sep 14 15:58:41 BST 2010  Milan Straka <fox@ucw.cz>
  * Correct Data.Map Arbitrary instance never to return unbalanced trees.
  Ignore-this: 114bbcc63acdb16b77140ea56aeb0a95

  The previous instance sometimes returned unbalanced trees,
  which broke the tests.

Tue Sep 14 15:20:10 BST 2010  Milan Straka <fox@ucw.cz>
  * Improve Data.Set benchmark.
  Ignore-this: 9b878ae3aa5a43ef083abfd7f9b22513

  Add union, difference and intersection to Data.Set benchmark.

Tue Sep 14 15:17:07 BST 2010  Milan Straka <fox@ucw.cz>
  * Improve benchmark infrastructure and Data.Map benchmark.
  Ignore-this: 67e8dafcb4abcb9c726b9b29c7c320fd

  Renamed Benchmarks.hs to Map.hs, as it only benchmarks Data.Map.
  Improve the Makefile to work with multiple benchmarks.
  Add union, difference and intersection to Data.Map benchmark.

Tue Sep 14 15:04:17 BST 2010  Milan Straka <fox@ucw.cz>
  * Improve the performance of Data.Set balance function.
  Ignore-this: 577c511c219695b8d483af546c7387e8

  The balance function is now one monolithic function, which allows
  to perform all pattern-matches only once.

  Nearly all functions modifying Data.Map use balance.
  The improvements are 12% for insert, 14% for delete (GHC 6.12.1).

Tue Sep 14 15:02:17 BST 2010  Milan Straka <fox@ucw.cz>
  * Improve the performance of Data.Map balance function.
  Ignore-this: 951181e035fcac90674dff3300350a1

  The balance function is now one monolithic function, which allows
  to perform all pattern-matches only once.

  Nearly all functions modifying Data.Map use balance.
  The improvements are 7-11% for various insert*, delete*, alter,
  update or intersection functions (GHC 6.12.1).

Tue Sep 14 14:57:25 BST 2010  Milan Straka <fox@ucw.cz>
  * Improve performance of Data.Set union and difference operations.
  Ignore-this: 6dc4a186ea060b9cdb9e783db71ca280

  Use datatype storing evaluated bound instead of high-order functions.
  The improvements are over 25% for both union and difference (GHC 6.12.1).

Tue Sep 14 14:46:14 BST 2010  Milan Straka <fox@ucw.cz>
  * Improve performance of Data.Map union* and difference* operations.
  Ignore-this: 35b23a40ef33e9fa14eb81fdee4b152d

  Use datatype storing evaluated bound instead of high-order functions.
  The improvements are 22% for union and 20% for difference (GHC 6.12.1).

Mon Sep 13 17:51:32 BST 2010  Milan Straka <fox@ucw.cz>
  * Make the Set store the elements evaluated (bang added).
  Ignore-this: b3f230db5bf30d93d3fddf2c81c5f3b4

Tue Aug 31 13:43:52 BST 2010  Johan Tibell <johan.tibell@gmail.com>
  * Improved performance of Data.Set
  Ignore-this: 38a304a0408d29a2956aa9a1fc0ce755

  Performance improvements are due to manually applying the
  worker/wrapper transformation and strictifying the keys.

  Average speed-up is 32% on a 2GHz Core 2 Duo on OS X 10.5.8

Tue Aug 31 13:42:25 BST 2010  Johan Tibell <johan.tibell@gmail.com>
  * Added benchmarks for Data.Set
  Ignore-this: fcacf88761034b8c534d936f0b336cc0

Tue Aug 31 13:40:30 BST 2010  Johan Tibell <johan.tibell@gmail.com>
  * Added a test suite for Data.Set
  Ignore-this: f430dc302c0fcb8b5d62db2272a1d6f7

  Expression coverage: 74%

Thu Sep 23 13:08:38 BST 2010  simonpj@microsoft.com
  * Remove use of lambdas with irrefutable patterns
  Ignore-this: c36e90a0258c0d5262684c585c321419

Wed Sep 15 14:51:03 BST 2010  Ian Lynagh <igloo@earth.li>
  * Revert the recent contentious changes
  Ignore-this: fe4f71ff1ade51c11421dc9974aa0fda
  These will probably be tidied up and reinstated later, but this gets
  the package back to a releasable state.

Tue Aug 31 12:45:55 BST 2010  Simon Marlow <marlowsd@gmail.com>
  * fix warnings
  Ignore-this: 53df71bc054a779b8ad2dad89c09e02d

Tue Aug 31 10:34:46 BST 2010  Don Stewart <dons@galois.com>
  * Missing MagicHash for IntSet
  Ignore-this: d075f760adb9a2aa0ee04676e38a07cc

Tue Aug 31 10:33:16 BST 2010  Don Stewart <dons@galois.com>
  * Performance improvements for Data.IntMap (worker/wrapper and inlining)
  Ignore-this: 206036448558d270f0eb85ef4cd55368

Tue Aug 31 10:32:40 BST 2010  Don Stewart <dons@galois.com>
  * Add criterion-based benchmarking for IntMap
  Ignore-this: d7d85b9afb513532cc30f5b51a3f825e

Tue Aug 31 10:32:02 BST 2010  Don Stewart <dons@galois.com>
  * Add comprehensive testsuite for IntMap
  Ignore-this: d455fedbc615e5b63ac488e605550557

Tue Aug 31 10:29:56 BST 2010  Don Stewart <dons@galois.com>
  * -O2 -fregs-graph is a uniform 10% improvements for IntMap
  Ignore-this: 2372cf4be945fe7939d0af94e32c567f

Sun Aug 29 13:26:28 BST 2010  Don Stewart <dons@galois.com>
  * Major bump (new functions, clarified strictness properties, vastly better performance)
  Ignore-this: 9bfbc58ecaa24a86be37b8c4cb043457

Sun Aug 29 13:21:47 BST 2010  Don Stewart <dons@galois.com>
  * Add two new functions: foldlWithKey' and insertLookupWithKey'
  Ignore-this: a2f112653ba38737fe1b38609e06c314

  These two functions use strict accumulators, compared to their existing
  counterparts (which are lazy left folds, that appear not to be useful).
  Performance is significantly better.


Sun Aug 29 12:46:11 BST 2010  Don Stewart <dons@galois.com>
  * Add a criterion-based benchmark suite for Data.Map
  Ignore-this: ec61668f5bcb78bd15b72e2728c01c19

  This adds a criterion-based micro-benchmarking suite for Data.Map. It
  can be used to measure performance improvements for individual top-level
  functions.

  Examples here: http://is.gd/eJHIE


Sun Aug 29 17:33:29 BST 2010  Don Stewart <dons@galois.com>
  * Missed base case for updateAt worker. Spotted by Jan-Willem Maessen
  Ignore-this: b8daf1c55c163c16f50c3b54cca2dba1

Sun Aug 29 13:02:45 BST 2010  Don Stewart <dons@galois.com>
  * Performance improvements to Data.Map
  Ignore-this: b4830cddfa6d62e4883f4e0f58ac4e57

  Applied several standard transformations to improve the performance of
  code:

      * Worker/wrapper of all recursive functions with constant arguments
      * Inlining of all (non-recursive) wrappers
      * Consistent use of strict keys

  Average performance improvements across common API (with GHC 6.12.3):

      * Linux / x86_64 / 2.6Ghz i7        : 48%
      * Mac OSX 10.5 / x86 / 2 Ghz Xeon   : 36%

  Graphs and raw data: http://is.gd/eJHIE

  This patch is (mostly) orthogonal to the algorithmic changes suggested
  by Milan Straka in his HW 2010 paper:

      http://research.microsoft.com/~simonpj/papers/containers/containers.pdf

  Those changes could be added separately, for some additional improvments.

  Work carried out over 28/29th August, 2010 in Utrecht, NL, by Johan Tibell
  and Don Stewart.


Sun Aug 29 12:35:45 BST 2010  Don Stewart <dons@galois.com>
  * Add a comprehensive testsuite for Data.Map
  Ignore-this: 891e7fe6bac3523868714ac1ff51c0a3

  This patch adds a joint quickcheck2 / hunit testsuite, with coverage of
  91% of top level functions (remaining features are mostly in instances).

  The coverage data is here:

      http://code.haskell.org/~dons/tests/containers/hpc_index.html

  No bugs were found. It includes unit tests for known past bugs
  (balancing).
Trac metadata
Trac field Value
Version 7.1
Type Bug
TypeOfFailure OtherFailure
Priority high
Resolution Unresolved
Component libraries (other)
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information