Profiled program runs 2.5x faster than non-profiled
I was looking at a benchmarks game program (attached as fasta.ghc-2.hs). I have found that with the flags given there, this program on GHC 8.2.1 runs in about a second with -prof -fprof-auto and about 2.5 seconds without!
To run without profiling:
ghc --make -Wall -fforce-recomp -fllvm -O2 -XBangPatterns -threaded -rtsopts -XOverloadedStrings fasta.ghc-2.hs -o fasta.ghc-2.ghc_run && ./fasta.ghc-2.ghc_run +RTS -N4 -s -RTS 250000 > /dev/null
To run the same program with profiling:
ghc --make -fforce-recomp -fllvm -prof -fprof-auto -O2 -XBangPatterns -threaded -rtsopts -XOverloadedStrings fasta.ghc-2.hs -o fasta.ghc-2.ghc_run && ./fasta.ghc-2.ghc_run +RTS -N4 -p -s -RTS 250000 > /dev/null
I also attach Core outputs for both the profiled and unprofiled versions.
To me this seems very strange: the profiled version is somehow faster. Perhaps what's worse is that this means there's some optimisation GHC performs when profiling is off that makes the program a lot slower than it could be!
This program is not minimised.
Trac metadata
Trac field | Value |
---|---|
Version | 8.2.1 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Compiler |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture | |