INLINE can inhibit optimizations by disabling worker-wrapper transform
It was surprising to me, but it seems to be documented, that INLINE
can actually inhibit optimizations, apparently because it disables the worker-wrapper transformation.
Here's an example from a reddit user:
Main.hs:
module Main where
import Lib
main :: IO ()
main = do
  x <- read <$> getLine
  print (addOne x)
Lib.hs:
module Lib(addOne) where
-- {-# INLINE addInt #-} -- SLOW
-- {-# INLINABLE addInt #-} -- FAST
-- (NO PRAGMA) -- FAST
addInt :: Int -> Int -> Int
addInt x y = x + y
-- manually inhibit with NOINLINE. Does the test case also work with a large enough RHS?
{-# NOINLINE addOne #-}
addOne :: Int -> Int
addOne = addInt 666
What I see on 8.10.2 with -O1 is:
- when NOINLINE addInt is specified, main calls an optimized addOne
- likewise when no pragma is on addInt
- when I specify -fno-worker-wrapper, we call an unoptimized addInt
- likewise when I specify INLINE addInt
Since inlining is already so fragile, it would be nice if GHC could fall back to worker-wrapper to optimize this kind of code.
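For readers who have not looked at the transformation before, here is a hand-written sketch of roughly what a worker-wrapper split of addInt might look like. It is only an illustration: the names addIntWorker and addIntWrapper are hypothetical, it assumes GHC would unbox both arguments and the result, and the Core that GHC actually generates will differ in naming and detail.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), Int#, (+#))

-- Hypothetical worker: takes and returns unboxed Int# values,
-- so it never has to force or allocate a boxed Int.
addIntWorker :: Int# -> Int# -> Int#
addIntWorker x# y# = x# +# y#
{-# NOINLINE addIntWorker #-}

-- Hypothetical wrapper: unboxes both arguments, calls the worker,
-- and reboxes the result. It is kept small so that inlining it at
-- a call site exposes a direct call to the worker.
addIntWrapper :: Int -> Int -> Int
addIntWrapper (I# x#) (I# y#) = I# (addIntWorker x# y#)
{-# INLINE addIntWrapper #-}

main :: IO ()
main = print (addIntWrapper 666 1)
```

The point of the split is that callers which already have unboxed values can reach the worker without boxing. If the split is not performed, as appears to happen with INLINE addInt or -fno-worker-wrapper, callers are left with the boxed version.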