9.4.4 -> 9.6.1-alpha1 - runtime performance regression (speff)

Taken from: #22758 (comment 477321)

In case it helps: the same regression also shows up in speff's test suite.

GHC 9.4.4:

  countdown
    10000
      sp.shallow:    OK (0.17s)
        320  μs ±  25 μs, 843 KB allocated, 173 B  copied, 6.0 MB peak memory
      sp.deep:       OK (0.34s)
        326  μs ±  21 μs, 853 KB allocated, 185 B  copied, 6.0 MB peak memory
      ev.shallow:    OK (0.42s)
        410  μs ±  22 μs, 2.4 MB allocated, 560 B  copied, 6.0 MB peak memory
      ev.deep:       OK (0.12s)
        463  μs ±  46 μs, 2.4 MB allocated, 1.1 KB copied, 6.0 MB peak memory
      freer.shallow: OK (0.23s)
        413  μs ±  23 μs, 1.8 MB allocated, 337 B  copied, 9.0 MB peak memory
      freer.deep:    OK (0.20s)
        1.45 ms ± 106 μs,  10 MB allocated, 2.0 KB copied,  12 MB peak memory
      mtl.shallow:   OK (0.20s)
        699  μs ±  45 μs, 3.8 MB allocated, 699 B  copied,  12 MB peak memory
      mtl.deep:      OK (0.21s)
        6.22 ms ± 421 μs,  34 MB allocated, 8.2 KB copied,  12 MB peak memory
      fused.shallow: OK (0.15s)
        2.06 ms ± 198 μs, 8.9 MB allocated, 1.7 KB copied,  21 MB peak memory
      fused.deep:    OK (0.27s)
        15.6 ms ± 1.2 ms,  63 MB allocated,  15 KB copied,  28 MB peak memory
      sem.shallow:   OK (0.39s)
        2.56 ms ± 170 μs,  14 MB allocated, 3.3 KB copied,  30 MB peak memory
      sem.deep:      OK (0.72s)
        5.16 ms ± 392 μs,  37 MB allocated,  10 KB copied,  33 MB peak memory
  pyth
    32
      sp.shallow:    OK (0.30s)
        2.00 ms ± 142 μs, 5.0 MB allocated,  26 KB copied,  33 MB peak memory
      sp.deep:       OK (0.19s)
        4.47 ms ± 412 μs,  10 MB allocated,  52 KB copied,  33 MB peak memory
      ev.shallow:    OK (0.21s)
        1.18 ms ±  89 μs, 4.0 MB allocated, 3.0 KB copied,  33 MB peak memory
      ev.deep:       OK (0.21s)
        2.58 ms ± 194 μs,  13 MB allocated,  11 KB copied,  33 MB peak memory
      freer.shallow: OK (0.51s)
        7.37 ms ± 483 μs,  12 MB allocated, 5.4 MB copied,  51 MB peak memory
      freer.deep:    OK (0.72s)
        10.9 ms ± 178 μs,  40 MB allocated, 6.3 MB copied,  52 MB peak memory
      fused.shallow: OK (0.23s)
        1.46 ms ±  98 μs, 3.3 MB allocated, 3.3 KB copied,  52 MB peak memory
      fused.deep:    OK (0.20s)
        11.0 ms ± 851 μs,  42 MB allocated,  34 KB copied,  52 MB peak memory
      sem.shallow:   OK (0.47s)
        13.9 ms ± 488 μs,  41 MB allocated, 5.2 MB copied,  52 MB peak memory
      sem.deep:      OK (0.46s)
        14.1 ms ± 428 μs,  42 MB allocated, 5.2 MB copied,  52 MB peak memory
  catch
    10000
      sp.shallow:    OK (0.33s)
        1.10 ms ±  53 μs, 4.1 MB allocated, 469 KB copied,  52 MB peak memory
      sp.deep:       OK (0.44s)
        1.48 ms ± 100 μs, 4.9 MB allocated, 996 KB copied,  52 MB peak memory
      fused.shallow: OK (0.50s)
        3.36 ms ± 315 μs, 9.2 MB allocated, 1.3 MB copied,  53 MB peak memory
      fused.deep:    OK (0.83s)
        24.9 ms ± 2.4 ms,  64 MB allocated, 6.1 MB copied,  71 MB peak memory
      sem.shallow:   OK (0.52s)
        3.55 ms ± 107 μs,  11 MB allocated, 1.6 MB copied,  71 MB peak memory
      sem.deep:      OK (0.57s)
        16.6 ms ± 1.2 ms,  68 MB allocated,  10 MB copied,  77 MB peak memory
  local
    10000
      sp.shallow:    OK (0.22s)
        514  μs ±  44 μs, 2.3 MB allocated, 317 B  copied,  77 MB peak memory
      sp.deep:       OK (0.38s)
        541  μs ±  34 μs, 3.1 MB allocated, 465 B  copied,  77 MB peak memory
      fused.shallow: OK (0.25s)
        1.48 ms ± 111 μs, 5.3 MB allocated, 208 KB copied,  77 MB peak memory
      fused.deep:    OK (0.30s)
        7.72 ms ± 343 μs,  27 MB allocated, 639 KB copied,  77 MB peak memory
      sem.shallow:   OK (1.59s)
        5.75 ms ± 343 μs,  16 MB allocated, 4.6 MB copied,  77 MB peak memory
      sem.deep:      OK (0.44s)
        12.4 ms ± 878 μs,  39 MB allocated,  10 MB copied,  78 MB peak memory

GHC 9.6.1-alpha1:

  countdown
    10000
      sp.shallow:    OK (0.18s)
        337  μs ±  26 μs, 843 KB allocated, 174 B  copied, 6.0 MB peak memory
      sp.deep:       OK (0.17s)
        332  μs ±  23 μs, 844 KB allocated, 189 B  copied, 6.0 MB peak memory
      ev.shallow:    OK (0.18s)
        690  μs ±  59 μs, 2.4 MB allocated, 571 B  copied, 6.0 MB peak memory  [!]
      ev.deep:       OK (0.21s)
        815  μs ±  80 μs, 2.4 MB allocated, 1.1 KB copied, 6.0 MB peak memory  [!]
      freer.shallow: OK (0.28s)
        525  μs ±  38 μs, 1.8 MB allocated, 337 B  copied, 9.0 MB peak memory
      freer.deep:    OK (0.48s)
        1.77 ms ± 123 μs,  10 MB allocated, 2.0 KB copied,  12 MB peak memory 
      mtl.shallow:   OK (0.41s)
        1.51 ms ±  46 μs, 3.7 MB allocated, 675 B  copied,  12 MB peak memory  [!]
      mtl.deep:      OK (0.48s)
        15.1 ms ± 1.4 ms,  29 MB allocated, 6.9 KB copied,  12 MB peak memory  [!]
      fused.shallow: OK (0.22s)
        3.14 ms ± 257 μs, 6.7 MB allocated, 1.2 KB copied,  17 MB peak memory  [!]
      fused.deep:    OK (0.18s)
        23.1 ms ± 1.4 ms,  52 MB allocated,  13 KB copied,  23 MB peak memory  [!]
      sem.shallow:   OK (0.19s)
        4.79 ms ± 397 μs,  14 MB allocated, 3.3 KB copied,  25 MB peak memory  [!]
      sem.deep:      OK (0.32s)
        9.01 ms ± 576 μs,  37 MB allocated, 9.8 KB copied,  28 MB peak memory  [!]
  pyth
    32
      sp.shallow:    OK (0.18s)
        2.02 ms ± 185 μs, 5.1 MB allocated,  26 KB copied,  28 MB peak memory
      sp.deep:       OK (0.18s)
        4.67 ms ± 385 μs,  10 MB allocated,  52 KB copied,  28 MB peak memory
      ev.shallow:    OK (0.21s)
        1.23 ms ±  87 μs, 4.0 MB allocated, 3.0 KB copied,  28 MB peak memory
      ev.deep:       OK (0.22s)
        2.83 ms ± 262 μs,  13 MB allocated,  11 KB copied,  28 MB peak memory
      freer.shallow: OK (0.38s)
        11.2 ms ± 948 μs,  15 MB allocated, 5.6 MB copied,  42 MB peak memory  [!]
      freer.deep:    OK (0.25s)
        15.0 ms ± 745 μs,  43 MB allocated, 6.9 MB copied,  44 MB peak memory  [!]
      fused.shallow: OK (0.16s)
        2.19 ms ± 173 μs, 3.2 MB allocated, 3.3 KB copied,  44 MB peak memory  [!]
      fused.deep:    OK (0.17s)
        21.8 ms ± 1.5 ms,  42 MB allocated,  34 KB copied,  44 MB peak memory  [!]
      sem.shallow:   OK (0.24s)
        30.2 ms ± 1.5 ms,  41 MB allocated, 2.8 MB copied,  44 MB peak memory  [!]
      sem.deep:      OK (0.23s)
        29.5 ms ± 1.3 ms,  41 MB allocated, 2.8 MB copied,  44 MB peak memory  [!]
  catch
    10000
      sp.shallow:    OK (0.34s)
        1.14 ms ±  80 μs, 4.1 MB allocated, 469 KB copied,  44 MB peak memory
      sp.deep:       OK (0.82s)
        1.43 ms ±  68 μs, 4.9 MB allocated, 995 KB copied,  44 MB peak memory
      fused.shallow: OK (0.57s)
        8.37 ms ± 610 μs,  11 MB allocated, 3.5 MB copied,  44 MB peak memory  [!]
      fused.deep:    OK (1.26s)
        39.7 ms ± 1.4 ms,  57 MB allocated,  14 MB copied,  49 MB peak memory  [!]
      sem.shallow:   OK (0.58s)
        4.33 ms ± 350 μs, 9.1 MB allocated, 1.3 MB copied,  49 MB peak memory  [!]
      sem.deep:      OK (0.68s)
        20.7 ms ± 649 μs,  59 MB allocated, 8.7 MB copied,  49 MB peak memory  [!]
  local
    10000
      sp.shallow:    OK (0.22s)
        626  μs ±  50 μs, 2.3 MB allocated, 317 B  copied,  49 MB peak memory  [!]
      sp.deep:       OK (0.23s)
        672  μs ±  55 μs, 3.1 MB allocated, 476 B  copied,  49 MB peak memory  [!]
      fused.shallow: OK (0.22s)
        2.81 ms ± 220 μs, 5.8 MB allocated, 225 KB copied,  49 MB peak memory  [!]
      fused.deep:    OK (0.20s)
        11.7 ms ± 849 μs,  24 MB allocated, 665 KB copied,  49 MB peak memory  [!]
      sem.shallow:   OK (0.48s)
        6.95 ms ± 614 μs,  13 MB allocated, 3.6 MB copied,  49 MB peak memory
      sem.deep:      OK (0.43s)
        12.9 ms ± 1.1 ms,  32 MB allocated, 7.6 MB copied,  49 MB peak memory

Regressions are marked with [!].
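
To put rough numbers on a few of the marked entries, here is a minimal sketch that divides the 9.6.1-alpha1 mean time by the 9.4.4 mean time. The mean times are copied from the two listings above and converted to microseconds. The `slowdown` helper and the choice of sample rows are illustrative assumptions, not part of speff or its benchmark harness, and the sketch ignores the reported ± spread.

```python
# Mean times copied from the two listings above, in microseconds.
# Each entry: (benchmark, GHC 9.4.4 mean, GHC 9.6.1-alpha1 mean)
samples = [
    ("countdown/ev.shallow", 410.0, 690.0),
    ("countdown/mtl.deep", 6220.0, 15100.0),
    ("pyth/sem.shallow", 13900.0, 30200.0),
    ("catch/fused.shallow", 3360.0, 8370.0),
    ("local/sp.shallow", 514.0, 626.0),
]


def slowdown(old_us, new_us):
    """Hypothetical helper: ratio of new mean time to old mean time (> 1 means slower)."""
    return new_us / old_us


for name, old_us, new_us in samples:
    print(f"{name}: {slowdown(old_us, new_us):.2f}x")
```

A ratio close to 1 (as for the sp rows in countdown and pyth) is within the reported noise, so only ratios well above the ± spread should be read as regressions.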