Performance with -O0 is much better than the default or with -O2; runghc performs the best
In this particular case, -O2 or the default is 2x slower than -O0, and -O0 is 2x slower than runghc. Please see the GitHub repo https://github.com/harendra-kumar/ghc-perf to reproduce the issue; the README file in the repo has the instructions.
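For readers who want a rough idea of the comparison before opening the repo, here is a minimal sketch of how the three cases could be timed. It is not taken from the repo's README, which remains the authoritative source for the reproduction steps. The entry point name Main.hs and the helper names build_variant and compare_runs are assumptions made for illustration only.

```shell
#!/bin/sh
# Sketch only: Main.hs is a hypothetical entry point, not necessarily the
# file name used in the ghc-perf repo.

# Compile $2 at optimisation level $1 (O0 or O2) into its own binary and
# build directory, so the two builds do not share object files.
build_variant() {
    level="$1"
    src="$2"
    ghc "-$level" -fforce-recomp -outputdir "build-$level" -o "main-$level" "$src"
}

# Build both variants, then time each binary and the interpreted run.
compare_runs() {
    src="$1"
    build_variant O0 "$src" || return 1
    build_variant O2 "$src" || return 1
    time "./main-O0"
    time "./main-O2"
    time runghc "$src"
}

# Usage: compare_runs Main.hs
```

Keeping a separate -outputdir per optimisation level, together with -fforce-recomp, is meant to rule out stale object files from one build being reused by the other.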
The issue seems to occur when the code is placed in a different module. When all the code is in the same module, the problem does not occur: in that case -O2 or the default is faster than -O0. However, when the code is split into two modules, the performance gets inverted.
Also, it does not always occur: when I tried to simplify the code to get a smaller repro, the problem did not occur.