Deep stacks make execution time go through the roof
In the code below, switching from mapM_
to mapM
causes a 20x change in execution time. The code being executed is very very similar (try -ddump-simpl
), and the number of bytes allocated is not dissimilar. But the GC time is dramatically more (presumably because we are repeatedly scanning a very deep stack) and, more puzzlingly, the MUT time is much more too.
It all seems to be to do with the stack depth. Not only that, but I think the effect is non-linear: doubling n from 200k to 400k nearly quadruples runtime. Doubling again to 800k makes it say 'Killed'.
module Main(main) where
import System.IO
import System.IO (openFile, IOMode(..), hPutStr)
n = 200000::Int
main = main_fast
main_slow = do -- Use mapM
h <- openFile "bardump2" WriteMode
mapM (do_it h) testlst
main_fast = do -- Use mapM_
h <- openFile "bardump2" WriteMode
mapM_ (do_it h) testlst
do_it h x = hPutStr h "yes"
testlst = [1..n]
Thanks to Ben for identifying this. Below are some figures
--------------------------------------
main = main_fast, N= 300,000
251,371,168 bytes allocated in the heap
14,041,824 bytes copied during GC (scavenged)
1,504 bytes copied during GC (not scavenged)
8,413,184 bytes maximum residency (5 sample(s))
448 collections in generation 0 ( 2.00s)
5 collections in generation 1 ( 0.01s)
17 Mb total memory in use
INIT time 0.00s ( 0.00s elapsed)
MUT time 1.17s ( 1.18s elapsed)
GC time 2.01s ( 2.03s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 3.18s ( 3.22s elapsed)
%GC time 63.3% (63.1% elapsed)
Alloc rate 215,201,590 bytes per MUT second
Productivity 36.7% of total user, 36.2% of total elapsed
--------------------------------------
main = main_slow, N= 300,000
227,615,792 bytes allocated in the heap
49,224 bytes copied during GC (scavenged)
1,440 bytes copied during GC (not scavenged)
40,960 bytes maximum residency (1 sample(s))
435 collections in generation 0 ( 0.00s)
1 collections in generation 1 ( 0.00s)
1 Mb total memory in use
INIT time 0.00s ( 0.00s elapsed)
MUT time 0.15s ( 0.15s elapsed)
GC time 0.00s ( 0.00s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 0.15s ( 0.16s elapsed)
%GC time 0.0% (1.8% elapsed)
Alloc rate 1,537,851,022 bytes per MUT second
Productivity 100.0% of total user, 95.4% of total elapsed
---------------------------------------------
Non-linear effect
n fast/slow Mut GC Tot
400k fast 0.19 0 0.2
slow 2.14 3.64 5.78
200k fast 0.09 0 0.09
slow 0.48 0.96 1.44
Trac metadata
Trac field | Value |
---|---|
Version | 6.8.2 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Compiler |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | midfield@gmail.com |
Operating system | Unknown |
Architecture | Unknown |