Skip to content

Optimization: Shift dropped list heads by coeffecient to prevent thunk generation

Consider the following snippet(s) equivalent to ([a..b] !! n), the source of (!!) and the source of drop:

normal_list :: Int -> Int
normal_list n = head $ drop n [a..b]

shifted_list :: Int -> Int
shifted_list n = head $ drop (n-n) [(a+n)..b]
xs     !! n | n < 0 = undefined
[]     !! _         = undefined
(x:_)  !! 0         = x
(_:xs) !! n         = xs !! (n-1)
drop n xs     | n <= 0 =  xs
drop _ []              =  []
drop n (_:xs)          =  drop (n-1) xs

Notice the (_:xs) matching in these functions as a result of WHNF.

In the first case, normal_list, thunks are generated for x in (x_:xs) of the target list and overhead is seen in the pattern matching/guard of n in drop.

In the second case, shifted_list, this overhead can be completely removed by adding a coefficient such that the list starts at the programmatically defined lower bound, a, plus the known fact that the head is dropped n times.

Hence, given the example above, consider:

[x * x + 3 | x <- [1..]] !! n
-- versus
[x * x + 3 | x <- [(1+n)..]] !! (n-n)
-- which is optimized into
[x * x + 3 | x <- [(n+1)..]] !! 0
-- which is effectively
head [x * x + 3 | x <- [(n+1)..]]

The operation is turned from O(n) into O(1).

Consider benchmark proving GHC 7.4.2 does not make this optimization under -O2:

import Criterion.Main

normal_list :: Int -> Int
normal_list n = head $ drop n [1..]

shifted_list :: Int -> Int
shifted_list n = head $ drop (n-n) [(1+n)..]

main = defaultMain
    [ bench "normal_list 1000" $ whnf normal_list 1000
    , bench "shifted_list 1000" $ whnf shifted_list 1000
    ]
C:\Users\Kyle\Desktop>ghc -O2 listco.hs
[1 of 1] Compiling Main             ( listco.hs, listco.o )
Linking listco.exe ...

C:\Users\Kyle\Desktop>listco.exe
warming up
estimating clock resolution...
mean is 4.644044 us (160001 iterations)
found 319255 outliers among 159999 samples (199.5%)
  159256 (99.5%) low severe
  159999 (100.0%) high severe
estimating cost of a clock call...
mean is 310.3118 ns (34 iterations)

benchmarking normal_list 1000
Warning: Couldn't open /dev/urandom
Warning: using system clock for seed instead (quality will be lower)
mean: 7.352463 us, lb 7.058339 us, ub 7.646574 us, ci 0.950
std dev: 1.478087 us, lb 1.478066 us, ub 1.478200 us, ci 0.950
variance introduced by outliers: 94.651%
variance is severely inflated by outliers

benchmarking shifted_list 1000
mean: 46.42819 ns, lb 45.44244 ns, ub 47.21689 ns, ci 0.950
std dev: 4.495757 ns, lb 4.035428 ns, ub 4.832396 ns, ci 0.950
variance introduced by outliers: 77.960%
variance is severely inflated by outliers
Trac metadata
Trac field Value
Version 7.4.2
Type FeatureRequest
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information