Silly worker/wrapper split

Cnnsider this program:

flags (Options f x)
= reverse (reverse (reverse (reverse (reverse (reverse (reverse (reverse x)))))))
  `seq` f

data X = X String Bool Bool Bool Bool

data Options = Options !X [Int]

All the reverse stuff is just a trick to make the function big, so that we will get a worker/wapper split. (Small functions are not split.) But basically flags is just a record selector.

Now look at what we get:

$wflags_sxd :: String -> Bool -> Bool -> Bool -> Bool -> [Int]
            -> (# String, Bool, Bool, Bool, Bool #)
[Str=<L,U><L,U><L,U><L,U><L,U><S,1*U>]
$wflags_sxd
  = \ (ww_sx2 :: String)
      (ww_sx3 :: Bool)
      (ww_sx4 :: Bool)
      (ww_sx5 :: Bool)
      (ww_sx6 :: Bool)
      (ww_sx8 :: [Int]) ->
      case GHC.List.reverse1
             @Int
             (GHC.List.reverse1
                @Int
                (GHC.List.reverse1
                   @Int
                   (GHC.List.reverse1
                      @Int
                      (GHC.List.reverse1
                         @Int
                         (GHC.List.reverse1
                            @Int
                            (GHC.List.reverse1
                               @Int
                               (GHC.List.reverse1 @Int ww_sx8 (GHC.Types.[] @Int))
                               (GHC.Types.[] @Int))
                            (GHC.Types.[] @Int))
                         (GHC.Types.[] @Int))
                      (GHC.Types.[] @Int))
                   (GHC.Types.[] @Int))
                (GHC.Types.[] @Int))
             (GHC.Types.[] @Int)
      of
      { __DEFAULT ->
      (# ww_sx2, ww_sx3, ww_sx4, ww_sx5, ww_sx6 #)
      }

flags [InlPrag=NOUSERINLINE[2]] :: Options -> X
[Str=<S(SS),U(U(U,U,U,U,U),U)>,]
flags
  = \ (w_swW :: Options) ->
      case w_swW of
      { Options ww_swZ [Dmd=<S,1*U(U,U,U,U,U)>] ww_sx8 [Dmd=<S,1*U>] ->
      case ww_swZ of { X ww_sx2 ww_sx3 ww_sx4 ww_sx5 ww_sx6 ->
      case $wflags_sxd ww_sx2 ww_sx3 ww_sx4 ww_sx5 ww_sx6 ww_sx8 of
      { (# ww_sxe, ww_sxf, ww_sxg, ww_sxh, ww_sxi #) ->
      Bug.X ww_sxe ww_sxf ww_sxg ww_sxh ww_sxi
      }
      }
      }

This is utterly bonkers. Why does the wrapper unpacking the X field of Options, pass them to the worker, which returns them, only for the wrapper to re-assemble the X value.

This came up when investigating the perf regression on record selectors described in !2873 (closed).

The culprit is this code in DmdAnal:

mkRhsDmd :: AnalEnv -> Arity -> CoreExpr -> CleanDemand
mkRhsDmd env rhs_arity rhs =
  case peelTsFuns rhs_arity (findTypeShape (ae_fam_envs env) (exprType rhs)) of
    Just (TsProd tss) -> mkCallDmds rhs_arity (cleanEvalProdDmd (length tss))
    _                 -> mkCallDmds rhs_arity cleanEvalDmd

cleanEvalDmd :: CleanDemand
cleanEvalDmd = JD { sd = HeadStr, ud = Used }

cleanEvalProdDmd :: Arity -> CleanDemand
cleanEvalProdDmd n = JD { sd = HeadStr, ud = UProd (replicate n useTop) }

This function constructs the "vanilla RHS demand" used by the demand analyser for the RHS of a binding. But the case-expression arranges that when the result type is a product we use cleanEvalProdDmd. That is what gives the funny looking strictness for flags

   [Str=<S(SS),U( U(U,U,U,U,U), U)>,]

And it's that inner U(U,U,U,U,U) that causes the extra unpacking.

This bit of code is justified in Note [Product demands for function body], and that in turn was introduced seven years ago:

commit 99d4e5b4a0bd32813ff8c74e91d2dcf6b3555176
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Fri May 3 14:50:58 2013 +0100

    Implement cardinality analysis

Now, it turns out that things have changed in the meantime. Simply removing the special case, to give

mkRhsDmd :: AnalEnv -> Arity -> CoreExpr -> CleanDemand
mkRhsDmd env rhs_arity rhs = mkCallDmds rhs_arity cleanEvalDmd

still correctly optimises the check function in binary_trees, which is the function idenified in Note [Product demands for function body]. In other words, removing the special case still does the optimisation we want -- but also stops the stupid w/w split above.

TL;DR. Let's try this new, simpler mkRhsDmd.

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information