A benchmark compiled with ghc-9.5.20220830 is 14% slower and allocates 1% more than with 9.0.2

This is the second half of #21715 (closed), split off with updated repro instructions and with known causes ruled out. In particular, the repro uses GHC HEAD (ghc-9.5.20220830) plus !7847 (closed), along with some manual SPECIALIZE pragmas (added despite -fexpose-all-unfoldings -fspecialise-aggressively), to rule out simple specialization problems.

IMHO, this is the least actionable of the 3 issues stemming from #21715 (closed), given the long timespan between 9.0.2 and now, the small size of the regression, and the fact that I can't reproduce it on any other branch of my project, whereas the other regressions crop up often, in various variants and magnitudes.

To reproduce:

  1. git clone git@github.com:Mikolaj/horde-ad.git
  2. cd horde-ad && git checkout ghc-report-specialize
  3. cabal bench mnist -w ghc-9.0.2 --enable-optimization --constraint "vector < 0.13" --allow-newer --benchmark-options='-n 1 -m prefix "2-hidden-layer MNIST nn with samples: 500/500" +RTS -s'
  4. cabal bench mnist -w ~/r/ghc/_build/stage1/bin/ghc --enable-optimization --constraint "vector < 0.13" --allow-newer --benchmark-options='-n 1 -m prefix "2-hidden-layer MNIST nn with samples: 500/500" +RTS -s'
  5. Compare the +RTS -s output of the two runs (total time and bytes allocated in the heap).
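
For step 5, here is a minimal Python sketch of one way to compare two runs, assuming each +RTS -s summary has been saved as text. The helper names parse_rts_stats and relative_change are hypothetical and not part of horde-ad or GHC, and the regexes only target the summary layout shown in this report.

```python
import re


def parse_rts_stats(text):
    """Extract a few headline numbers from a GHC `+RTS -s` summary.

    Returns a dict that may contain allocated_bytes, mut_seconds,
    gc_seconds and total_seconds (CPU time, not elapsed time).
    """
    stats = {}
    m = re.search(r"([\d,]+) bytes allocated in the heap", text)
    if m:
        stats["allocated_bytes"] = int(m.group(1).replace(",", ""))
    for label in ("MUT", "GC", "Total"):
        # Matches e.g. "  MUT     time    7.949s  (  7.947s elapsed)".
        # The "%GC" line is skipped because it starts with '%'.
        m = re.search(rf"^\s*{label}\s+time\s+([\d.]+)s", text, re.MULTILINE)
        if m:
            stats[f"{label.lower()}_seconds"] = float(m.group(1))
    return stats


def relative_change(baseline, candidate):
    """Fractional change of each shared metric, e.g. 0.14 means 14% more."""
    return {
        key: candidate[key] / baseline[key] - 1.0
        for key in baseline
        if key in candidate
    }
```

Calling relative_change(parse_rts_stats(old_log), parse_rts_stats(new_log)) on the two captured outputs gives the slowdown and the extra allocation as fractions of the 9.0.2 baseline.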

Here are the results I'm getting, with runs on 9.2.4 and 9.4.2 included for reference:

~/r/horde-ad$ cabal bench mnist -w ghc-9.0.2 --enable-optimization --constraint "vector < 0.13" --allow-newer --benchmark-options='-n 1 -m prefix "2-hidden-layer MNIST nn with samples: 500/500" +RTS -s'
Resolving dependencies...
Build profile: -w ghc-9.0.2 -O1
In order, the following will be built (use -v for more details):
 - horde-ad-0.1.0.0 (bench:mnist) (first run)
Preprocessing benchmark 'mnist' for horde-ad-0.1.0.0..
Building benchmark 'mnist' for horde-ad-0.1.0.0..
Running 1 benchmarks...
Benchmark mnist: RUNNING...
benchmarking 2-hidden-layer MNIST nn with samples: 500/500|150 s469160 v0 m0=469160
  29,993,613,568 bytes allocated in the heap
  36,535,841,440 bytes copied during GC
     119,503,400 bytes maximum residency (253 sample(s))
       2,890,432 bytes maximum slop
             268 MiB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     22918 colls,     0 par   11.906s  11.920s     0.0005s    0.0017s
  Gen  1       253 colls,     0 par    1.966s   1.967s     0.0078s    0.0306s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    7.949s  (  7.947s elapsed)
  GC      time   13.872s  ( 13.887s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time   21.821s  ( 21.834s elapsed)

  %GC     time       0.0%  (0.0% elapsed)

  Alloc rate    3,773,309,905 bytes per MUT second

  Productivity  36.4% of total user, 36.4% of total elapsed

Benchmark mnist: FINISH

~/r/horde-ad$ cabal bench mnist -w ghc-9.2.4 --enable-optimization --constraint "vector < 0.13" --allow-newer --benchmark-options='-n 1 -m prefix "2-hidden-layer MNIST nn with samples: 500/500" +RTS -s'
Resolving dependencies...
Build profile: -w ghc-9.2.4 -O1
In order, the following will be built (use -v for more details):
 - horde-ad-0.1.0.0 (bench:mnist) (first run)
Preprocessing benchmark 'mnist' for horde-ad-0.1.0.0..
Building benchmark 'mnist' for horde-ad-0.1.0.0..
Running 1 benchmarks...
Benchmark mnist: RUNNING...
benchmarking 2-hidden-layer MNIST nn with samples: 500/500|150 s469160 v0 m0=469160
  73,220,505,408 bytes allocated in the heap
  36,244,304,376 bytes copied during GC
     119,595,768 bytes maximum residency (198 sample(s))
       2,917,136 bytes maximum slop
             277 MiB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     16083 colls,     0 par   14.166s  14.186s     0.0009s    0.0026s
  Gen  1       198 colls,     0 par    3.142s   3.145s     0.0159s    0.0313s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time   26.487s  ( 26.497s elapsed)
  GC      time   17.309s  ( 17.331s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time   43.796s  ( 43.828s elapsed)

  %GC     time       0.0%  (0.0% elapsed)

  Alloc rate    2,764,370,213 bytes per MUT second

  Productivity  60.5% of total user, 60.5% of total elapsed

Benchmark mnist: FINISH

~/r/horde-ad$ cabal bench mnist -w ghc-9.4.2 --enable-optimization --constraint "vector < 0.13" --allow-newer --benchmark-options='-n 1 -m prefix "2-hidden-layer MNIST nn with samples: 500/500" +RTS -s'
Resolving dependencies...
Build profile: -w ghc-9.4.2 -O1
In order, the following will be built (use -v for more details):
 - horde-ad-0.1.0.0 (bench:mnist) (first run)
Preprocessing benchmark 'mnist' for horde-ad-0.1.0.0..
Building benchmark 'mnist' for horde-ad-0.1.0.0..
Running 1 benchmarks...
Benchmark mnist: RUNNING...
benchmarking 2-hidden-layer MNIST nn with samples: 500/500|150 s469160 v0 m0=469160
  30,633,303,424 bytes allocated in the heap
  34,157,584,608 bytes copied during GC
     115,291,200 bytes maximum residency (176 sample(s))
       2,864,464 bytes maximum slop
             268 MiB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      5609 colls,     0 par   12.459s  12.471s     0.0022s    0.0062s
  Gen  1       176 colls,     0 par    3.195s   3.197s     0.0182s    0.0294s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    7.686s  (  7.674s elapsed)
  GC      time   15.654s  ( 15.668s elapsed)
  EXIT    time    0.000s  (  0.008s elapsed)
  Total   time   23.340s  ( 23.350s elapsed)

  %GC     time       0.0%  (0.0% elapsed)

  Alloc rate    3,985,560,410 bytes per MUT second

  Productivity  32.9% of total user, 32.9% of total elapsed

Benchmark mnist: FINISH

~/r/horde-ad$ cabal bench mnist -w ~/r/ghc/_build/stage1/bin/ghc --enable-optimization --constraint "vector < 0.13" --allow-newer --benchmark-options='-n 1 -m prefix "2-hidden-layer MNIST nn with samples: 500/500" +RTS -s'
Resolving dependencies...
Build profile: -w ghc-9.5.20220830 -O1
In order, the following will be built (use -v for more details):
 - horde-ad-0.1.0.0 (bench:mnist) (first run)
Preprocessing benchmark 'mnist' for horde-ad-0.1.0.0..
Building benchmark 'mnist' for horde-ad-0.1.0.0..
Running 1 benchmarks...
Benchmark mnist: RUNNING...
benchmarking 2-hidden-layer MNIST nn with samples: 500/500|150 s469160 v0 m0=469160
  30,345,381,232 bytes allocated in the heap
  36,181,972,552 bytes copied during GC
     115,757,168 bytes maximum residency (176 sample(s))
       2,859,000 bytes maximum slop
             267 MiB total memory in use (0 MiB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      5609 colls,     0 par   13.610s  13.623s     0.0024s    0.0063s
  Gen  1       176 colls,     0 par    3.459s   3.465s     0.0197s    0.0311s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    7.862s  (  7.850s elapsed)
  GC      time   17.069s  ( 17.088s elapsed)
  EXIT    time    0.000s  (  0.002s elapsed)
  Total   time   24.931s  ( 24.940s elapsed)

  %GC     time       0.0%  (0.0% elapsed)

  Alloc rate    3,859,837,479 bytes per MUT second

  Productivity  31.5% of total user, 31.5% of total elapsed

Benchmark mnist: FINISH
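
As a worked check of the headline figures, the following Python sketch recomputes the ratios relative to 9.0.2. The inputs are the bytes allocated, MUT time, GC time and Total time copied by hand from the four runs above. Each number comes from one run, so small differences may well be noise. The names RUNS, ratios_vs and format_table are only illustrative.

```python
# Figures copied by hand from the +RTS -s summaries above.
# Columns: bytes allocated, MUT seconds, GC seconds, Total seconds (CPU time).
RUNS = {
    "9.0.2":        (29_993_613_568,  7.949, 13.872, 21.821),
    "9.2.4":        (73_220_505_408, 26.487, 17.309, 43.796),
    "9.4.2":        (30_633_303_424,  7.686, 15.654, 23.340),
    "9.5.20220830": (30_345_381_232,  7.862, 17.069, 24.931),
}
FIELDS = ("alloc", "MUT", "GC", "Total")


def ratios_vs(baseline, runs=RUNS):
    """Divide every metric of every run by the same metric of the baseline."""
    base = runs[baseline]
    return {
        ghc: tuple(value / ref for value, ref in zip(values, base))
        for ghc, values in runs.items()
    }


def format_table(ratios):
    """Render the ratios as a fixed-width text table, one compiler per row."""
    lines = [f"{'ghc':<14}" + "".join(f"{name:>8}" for name in FIELDS)]
    for ghc, row in ratios.items():
        lines.append(f"{ghc:<14}" + "".join(f"{r:>8.3f}" for r in row))
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_table(ratios_vs("9.0.2")))
```

With these inputs, HEAD comes out at roughly 1.14x the Total time and 1.01x the allocation of 9.0.2, matching the title. MUT time is about 0.99x while GC time is about 1.23x, which would suggest that the extra time in this run is on the GC side.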