CallArity taking 20% of compile time
The CallArity analysis can, apparently, take about 20% of compile time (18.4% of time and 25.6% of allocation in the HEAD profile below). That's a lot for one analysis that is on by default with -O.
Michael Terepeta writes: Out of curiosity I had a look at compiling haskell-src-exts since that takes quite a while. I've used ghc HEAD and 7.8.4 (both built with BuildFlavour=prof & bootstrapped with a standard ghc 7.8.4) and it's interesting -- the current HEAD takes quite a bit longer and allocates way more than 7.8.4. One of the main things that stands out is the CallArity analysis (which IIRC was not there in 7.8.4). So unless I messed something up with measuring, the analysis seems to be pretty expensive.
HEAD
Sun Apr 12 15:52 2015 Time and Allocation Profiling Report (Final)
ghc +RTS -p -RTS [...]
total time = 147.84 secs (147841 ticks @ 1000 us, 1 processor)
total alloc = 172,378,600,408 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
SimplTopBinds SimplCore 32.4 28.8
CallArity SimplCore 18.4 25.6
lintAnnots CoreLint 4.5 4.6
CoreTidy HscMain 4.5 5.1
pprNativeCode AsmCodeGen 3.2 3.4
OccAnal SimplCore 3.2 3.1
occAnalBind.assoc OccurAnal 2.6 2.5
StgCmm HscMain 2.3 1.9
Simplify SimplCore 2.1 0.2
RegAlloc AsmCodeGen 2.1 2.4
FloatOutwards SimplCore 2.0 1.6
regLiveness AsmCodeGen 1.9 1.9
tc_rn_src_decls TcRnDriver 1.8 1.3
sink CmmPipeline 1.7 1.5
NewStranal SimplCore 1.3 1.5
genMachCode AsmCodeGen 1.1 1.0
layoutStack CmmPipeline 1.0 1.0
HEAD with -fno-call-arity
Sun Apr 12 18:16 2015 Time and Allocation Profiling Report (Final)
ghc +RTS -p -RTS [...] -fno-call-arity
total time = 113.71 secs (113714 ticks @ 1000 us, 1 processor)
total alloc = 121,884,896,720 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
SimplTopBinds SimplCore 37.2 36.6
CoreTidy HscMain 6.0 7.3
lintAnnots CoreLint 5.8 6.5
pprNativeCode AsmCodeGen 4.1 4.8
OccAnal SimplCore 3.6 3.8
occAnalBind.assoc OccurAnal 2.9 3.2
StgCmm HscMain 2.9 2.6
RegAlloc AsmCodeGen 2.6 3.4
FloatOutwards SimplCore 2.6 2.3
regLiveness AsmCodeGen 2.5 2.8
tc_rn_src_decls TcRnDriver 2.4 1.9
Simplify SimplCore 2.4 0.3
sink CmmPipeline 2.1 2.2
NewStranal SimplCore 1.7 2.1
genMachCode AsmCodeGen 1.4 1.4
layoutStack CmmPipeline 1.4 1.4
NativeCodeGen CodeOutput 1.1 1.2
FloatInwards SimplCore 1.1 1.4
do_block Hoopl.Dataflow 1.0 0.6
Digraph.scc Digraph 0.8 1.3
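The run above passes `-fno-call-arity` on the command line. As a possible stopgap, the same flag can presumably be scoped to a single expensive module with an `OPTIONS_GHC` pragma. The sketch below is only an illustration: `sumSquares` is a hypothetical placeholder and is not taken from haskell-src-exts.

```haskell
{-# OPTIONS_GHC -fno-call-arity #-}
-- The pragma above disables the CallArity pass for this file only;
-- other modules in the package are still compiled with it under -O.

-- Hypothetical placeholder for the real contents of a module where
-- CallArity dominates compile time.
sumSquares :: [Int] -> Int
sumSquares = foldr (\x acc -> x * x + acc) 0

main :: IO ()
main = print (sumSquares [1 .. 10])
```

Whether disabling the pass costs runtime performance for a given module would need to be measured separately.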
GHC 7.8.4
Sun Apr 12 15:41 2015 Time and Allocation Profiling Report (Final)
ghc +RTS -p -RTS [...]
total time = 93.11 secs (93112 ticks @ 1000 us, 1 processor)
total alloc = 103,135,975,120 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
SimplTopBinds SimplCore 38.5 37.4
pprNativeCode AsmCodeGen 6.2 7.2
StgCmm HscMain 3.9 4.2
RegAlloc AsmCodeGen 3.7 5.1
occAnalBind.assoc OccurAnal 3.3 3.6
OccAnal SimplCore 3.3 3.6
regLiveness AsmCodeGen 3.1 3.4
FloatOutwards SimplCore 2.9 2.4
sink CmmPipeline 2.8 2.8
Simplify SimplCore 2.6 0.3
tc_rn_src_decls TcRnDriver 2.4 2.1
genMachCode AsmCodeGen 1.9 2.0
NewStranal SimplCore 1.8 2.1
layoutStack CmmPipeline 1.8 1.8
Core2Core HscMain 1.3 1.2
deSugar HscMain 1.1 1.1
do_block Hoopl.Dataflow 1.1 0.7
CoreTidy HscMain 1.0 1.1
CorePrep HscMain 1.0 1.1
Digraph.scc Digraph 0.9 1.5
versioninfo MkIface 0.9 1.0
zonkEvBndr_zonkTcTypeToType TcHsSyn 0.6 1.4
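To put the three reports side by side, the totals can be compared directly. The following is a minimal sketch that only redoes the arithmetic on the `total time` and `total alloc` lines quoted above; the names `Run`, `pctSaved` and `pctIncrease` are made up for this illustration.

```haskell
import Text.Printf (printf)

-- | Totals copied from the three profiling reports in this ticket.
data Run = Run
  { runSecs  :: Double  -- ^ "total time" in seconds
  , runBytes :: Double  -- ^ "total alloc" in bytes
  }

headRun, headNoCallArity, ghc784 :: Run
headRun         = Run 147.84 172378600408
headNoCallArity = Run 113.71 121884896720
ghc784          = Run  93.11 103135975120

-- | Percentage of @whole@ that goes away when it shrinks to @rest@.
pctSaved :: Double -> Double -> Double
pctSaved whole rest = (whole - rest) / whole * 100

-- | Percentage by which @new@ exceeds @base@.
pctIncrease :: Double -> Double -> Double
pctIncrease base new = (new - base) / base * 100

main :: IO ()
main = do
  printf "HEAD, saved by -fno-call-arity: %.1f%% time, %.1f%% alloc\n"
    (pctSaved (runSecs headRun) (runSecs headNoCallArity))
    (pctSaved (runBytes headRun) (runBytes headNoCallArity))
  printf "HEAD -fno-call-arity vs 7.8.4:  +%.1f%% time, +%.1f%% alloc\n"
    (pctIncrease (runSecs ghc784) (runSecs headNoCallArity))
    (pctIncrease (runBytes ghc784) (runBytes headNoCallArity))
  printf "HEAD vs 7.8.4:                  +%.1f%% time, +%.1f%% alloc\n"
    (pctIncrease (runSecs ghc784) (runSecs headRun))
    (pctIncrease (runBytes ghc784) (runBytes headRun))
```

On these numbers, turning CallArity off removes about 23% of the time and 29% of the allocation of the HEAD run, somewhat more than the 18.4% and 25.6% attributed to the `CallArity` cost centre itself. HEAD without CallArity is still roughly 22% slower than 7.8.4, so CallArity does not appear to account for the whole regression. These are single runs of profiled compilers, so the figures are indicative only.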
Trac metadata
Trac field | Value |
---|---|
Version | 7.10.1 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Compiler |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture | |