Skip to content
  • Herbert Valerio Riedel's avatar
    Optimise `Ord Version` instance · aabee7c3
    Herbert Valerio Riedel authored
    Profiling showed that there are quite a few 'Ord' calls (~70k calls)
    which result in 'versionNumbers' being evaluated because the need to
    compare with Versions not fitting in a Word64 becomes necessary, mostly
    when inserting into a `Map`. So I've optimised this a bit more.  After
    some experimentation, the new `compare` implemetation results in faster
    comparisions (`cmpOpt` is the new optimised impl, `compare` is the
    previous naive one):
    
    comparing two PV1 constructors:
    
        benchmarking compare 1.2.3.4.5 1.2.3.4.5
        time                 42.33 ns   (42.17 ns .. 42.47 ns)
                             1.000 R²   (1.000 R² .. 1.000 R²)
        mean                 42.17 ns   (42.13 ns .. 42.28 ns)
        std dev              205.1 ps   (128.7 ps .. 325.7 ps)
    
        benchmarking cmpOpt  1.2.3.4.5 1.2.3.4.5
        time                 33.31 ns   (33.23 ns .. 33.40 ns)
                             1.000 R²   (1.000 R² .. 1.000 R²)
        mean                 33.37 ns   (33.29 ns .. 33.46 ns)
        std dev              288.6 ps   (242.9 ps .. 315.8 ps)
    
        benchmarking compare [1.2.3.4.5] [1.2.3.4.5]
        time                 31.92 ns   (31.89 ns .. 31.94 ns)
                             1.000 R²   (1.000 R² .. 1.000 R²)
        mean                 31.89 ns   (31.88 ns .. 31.91 ns)
        std dev              37.09 ps   (24.38 ps .. 54.82 ps)
    
    comparing a PV0 constructor with a PV1 constructor when the first digit
    decides the outcome:
    
        benchmarking compare 2.3.4 1.2.3.4.5
        time                 24.96 ns   (24.78 ns .. 25.25 ns)
                             0.996 R²   (0.989 R² .. 1.000 R²)
        mean                 25.50 ns   (24.95 ns .. 26.95 ns)
        std dev              2.802 ns   (1.275 ns .. 5.163 ns)
        variance introduced by outliers: 93% (severely inflated)
    
        benchmarking cmpOpt  2.3.4 1.2.3.4.5
        time                 11.29 ns   (11.27 ns .. 11.30 ns)
                             1.000 R²   (1.000 R² .. 1.000 R²)
        mean                 11.28 ns   (11.27 ns .. 11.29 ns)
        std dev              24.69 ps   (12.02 ps .. 46.86 ps)
    
        benchmarking compare [2.3.4] [1.2.3.4.5]
        time                 11.05 ns   (11.04 ns .. 11.06 ns)
                             1.000 R²   (1.000 R² .. 1.000 R²)
        mean                 11.05 ns   (11.04 ns .. 11.06 ns)
        std dev              32.82 ps   (15.89 ps .. 61.64 ps)
    
    These timings are now very close to the 'Ord [Int]' timings.
    
    With this optimisation, the total number of 'versionNumbers' calls
    reported by `+RTS -p` for `cabal new-build --dry`'ing haskell-ide-engine
    went down from originally 73501 calls to 6789 calls.
    aabee7c3