Skip to content

Program runs faster with profiling than without

This program, reported by Doaitse, runs nearly twice as fast when it is profiled than when it isn't. You need to install the Utrecht library uulib:

cabal install -p uulib

Now compile and run UULib.hs with and without -prof -auto-all. (It makes little difference whether or not you use -O.) When you run, here are the stats for GHC 7.0.3:

----------------  NOT PROFILED ----------------
bash-3.1$ ./uulib +RTS -s
c:\tmp\uulib.exe +RTS -s 
138000
      51,077,556 bytes allocated in the heap
      19,153,500 bytes copied during GC
       4,914,980 bytes maximum residency (5 sample(s))
         519,828 bytes maximum slop
              12 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:    93 collections,     0 parallel,  4.71s,  4.70s elapsed
  Generation 1:     5 collections,     0 parallel,  2.73s,  2.76s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.05s  (  0.11s elapsed)
  GC    time    7.44s  (  7.45s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    7.49s  (  7.56s elapsed)

  %GC time      99.4%  (98.6% elapsed)

  Alloc rate    1,091,377,449 bytes per MUT second

  Productivity   0.6% of total user, 0.6% of total elapsed


------------- PROFILED ------------------------
bash-3.1$ ./uulib +RTS -s
c:\tmp\uulib.exe +RTS -s 
138000
      79,692,548 bytes allocated in the heap
      31,874,676 bytes copied during GC
       7,774,720 bytes maximum residency (5 sample(s))
         257,912 bytes maximum slop
              19 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:   147 collections,     0 parallel,  2.65s,  2.66s elapsed
  Generation 1:     5 collections,     0 parallel,  1.33s,  1.34s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.09s  (  0.16s elapsed)
  GC    time    3.98s  (  3.99s elapsed)
  RP    time    0.00s  (  0.00s elapsed)
  PROF  time    0.00s  (  0.00s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    4.07s  (  4.15s elapsed)

  %GC time      97.7%  (96.1% elapsed)

  Alloc rate    851,407,014 bytes per MUT second

  Productivity   2.3% of total user, 2.3% of total elapsed

The profiled one:

  • Allocates more
  • But runs nearly twice as fast

Looking at the stats, though, you can see that in both cases 99% of execution time spent in the garbage collector. I don't think I've ever seen a program spend that much time in GC.

So I don't really know what's going on, but it's clearly a rather amazing program.

Trac metadata
Trac field Value
Version 7.0.3
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC doaitse@uu.nl, scpmw@leeds.ac.uk
Operating system Unknown/Multiple
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information