Skip to content

Deep stacks make execution time go through the roof

In the code below, switching from mapM_ to mapM causes a 20x change in execution time. The code being executed is very very similar (try -ddump-simpl), and the number of bytes allocated is not dissimilar. But the GC time is dramatically more (presumably because we are repeatedly scanning a very deep stack) and, more puzzlingly, the MUT time is much more too.

It all seems to be to do with the stack depth. Not only that, but I think the effect is non-linear: doubling n from 200k to 400k nearly quadruples runtime. Doubling again to 800k makes it say 'Killed'.

module Main(main) where

import System.IO
import System.IO (openFile, IOMode(..), hPutStr)

n = 200000::Int

main = main_fast


main_slow = do	-- Use mapM
  h <- openFile "bardump2" WriteMode
  mapM (do_it h) testlst

main_fast = do	-- Use mapM_
  h <- openFile "bardump2" WriteMode
  mapM_ (do_it h) testlst

do_it h x = hPutStr h "yes"

testlst = [1..n]

Thanks to Ben for identifying this. Below are some figures

--------------------------------------
    main = main_fast, N= 300,000

251,371,168 bytes allocated in the heap
 14,041,824 bytes copied during GC (scavenged)
      1,504 bytes copied during GC (not scavenged)
  8,413,184 bytes maximum residency (5 sample(s))

        448 collections in generation 0 (  2.00s)
          5 collections in generation 1 (  0.01s)

         17 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    1.17s  (  1.18s elapsed)
  GC    time    2.01s  (  2.03s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    3.18s  (  3.22s elapsed)

  %GC time      63.3%  (63.1% elapsed)

  Alloc rate    215,201,590 bytes per MUT second

  Productivity  36.7% of total user, 36.2% of total elapsed

--------------------------------------
    main = main_slow, N= 300,000

227,615,792 bytes allocated in the heap
     49,224 bytes copied during GC (scavenged)
      1,440 bytes copied during GC (not scavenged)
     40,960 bytes maximum residency (1 sample(s))

        435 collections in generation 0 (  0.00s)
          1 collections in generation 1 (  0.00s)

          1 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.15s  (  0.15s elapsed)
  GC    time    0.00s  (  0.00s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    0.15s  (  0.16s elapsed)

  %GC time       0.0%  (1.8% elapsed)

  Alloc rate    1,537,851,022 bytes per MUT second

  Productivity 100.0% of total user, 95.4% of total elapsed

---------------------------------------------
          Non-linear effect

n      fast/slow	Mut	GC	Tot
400k	fast		0.19	0	0.2
	slow		2.14	3.64	5.78

200k	fast		0.09	0	0.09
	slow		0.48	0.96	1.44
Trac metadata
Trac field Value
Version 6.8.2
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC midfield@gmail.com
Operating system Unknown
Architecture Unknown
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information