Inliner fails to inline a function, causing 20x slowdown

Alexey recently filed a bug against the mwc-random package, reporting a 20x to 40x slowdown in a function named uniformRange (you can see its source here).

In the original definition, there was an INLINE pragma, but Alexey noticed that it wasn't firing and so performance was predictably terrible. He added the SPECIALIZE pragmas that now follow the body of the function.

I looked at -ddump-simpl output with the SPECIALIZE pragmas removed, and sure enough there are no inlining annotations on the function.

The whole purpose of uniformRange is to be used in instance declarations such as the following:

instance Variate Int8 where
    uniform  = uniform1 fromIntegral
    uniformR = uniformRange
    {-# INLINE uniform  #-}
    {-# INLINE uniformR #-}

I suspect that GHC's inliner is declining to do anything because one or more call sites aren't fully saturated.

The behaviour of the new inliner can be subtle: it's not at all obvious when I should rewrite an instance like this, just to satisfy it:

instance Variate Int8 where
    uniform  = uniform1 fromIntegral
    uniformR inliner sacrifice = uniformRange inliner sacrifice
    {-# INLINE uniform  #-}
    {-# INLINE uniformR #-}

Saturating as above turned out to be the solution to the performance problem. I've been able to remove the SPECIALIZE pragmas. However, I'm still worried.
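
For anyone who wants to poke at this outside mwc-random, here is a minimal sketch of the two styles side by side. It relies on the documented rule that GHC only inlines an INLINE function when it is applied to at least as many arguments as appear syntactically on the left-hand side of its definition. Everything in it is made up for illustration: clampTo is a hypothetical stand-in for uniformRange, and the Clamp class with its methods clampA and clampB is not part of any library. Whether the generated Core actually differs for a toy this small may depend on the GHC version and flags, so treat it as a starting point for inspecting -ddump-simpl output rather than a guaranteed reproduction.

```haskell
module Main (main) where

import Data.Int (Int8)

-- Hypothetical stand-in for uniformRange. Note that it has TWO
-- arguments on the left-hand side of its definition, so the INLINE
-- pragma only applies at call sites that supply both of them.
clampTo :: Ord a => (a, a) -> a -> a
clampTo (lo, hi) x = max lo (min hi x)
{-# INLINE clampTo #-}

-- Hypothetical class, used only to compare the two definition styles.
class Clamp a where
  clampA :: (a, a) -> a -> a
  clampB :: (a, a) -> a -> a

instance Clamp Int8 where
  -- Unsaturated: clampTo is mentioned with no arguments, so the
  -- inliner is not obliged to unfold it here.
  clampA = clampTo
  -- Saturated: clampTo receives both arguments, matching its
  -- definition, so the INLINE pragma can apply.
  clampB range x = clampTo range x
  {-# INLINE clampA #-}
  {-# INLINE clampB #-}

main :: IO ()
main = do
  -- Both methods compute the same result; only the generated Core
  -- may differ.
  print (clampA (0, 10) (15 :: Int8))
  print (clampB (0, 10) (15 :: Int8))
```

Compiling this with something like ghc -O2 -ddump-simpl -dsuppress-all and comparing the Core for clampA and clampB should show whether clampTo was unfolded in each case.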

It would be helpful if GHC had a mode that dumped out when (and why) inlinings do *not* take place on functions that have been annotated with INLINE, because I'm surely not the only person who gets caught by this.

Also, aesthetically I find that saturating an application as above makes for tricky-to-read "why are those arguments there?" code, sort of the inliner's version of the dreaded monomorphism restriction: a lexical tic that's tremendously important, but for reasons that most readers will not know about.

Trac metadata
Trac field Value
Version 7.2.1
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture