Skip to content

Unnecessary reboxing

This problem came up when I was debugging regressions in some variants of !7029.

Problem

Consider this function:

f x = (g, g)
  where
    g :: Int -> IO Int
    g y = do { if y>2 then print x else return ()
             ; foogle y }

foogle :: Int -> IO Int
{-# NOINLINE foogle #-}
foogle n = return n

We get this code for f, compiling with GHC 9.2 and -O:

Foo.$wf :: forall {a}. Show a => a -> (# Int -> IO Int, Int -> IO Int #)
Foo.$wf
  = \ (@a_s1bJ) ($dShow_s1bK :: Show a_s1bJ) (x_s1bL :: a_s1bJ) ->
      let {
        lvl1_s1bo :: String
        [LclId]
        lvl1_s1bo = show @a_s1bJ $dShow_s1bK x_s1bL } in
      let {
        $wg_s1bH [InlPrag=[2], Dmd=LCL(C1(!L))]
          :: GHC.Prim.Int#
             -> GHC.Prim.State# GHC.Prim.RealWorld
             -> (# GHC.Prim.State# GHC.Prim.RealWorld, Int #)
        $wg_s1bH
          = \ (ww_s1bC :: GHC.Prim.Int#)
              (s_s1bE :: GHC.Prim.State# GHC.Prim.RealWorld) ->
              case GHC.Prim.># ww_s1bC 2# of {
                __DEFAULT -> Foo.f1 (GHC.Types.I# ww_s1bC) s_s1bE;
                1# ->
                  case GHC.IO.Handle.Text.hPutStr2
                         GHC.IO.Handle.FD.stdout lvl1_s1bo GHC.Types.True s_s1bE
                  of
                  { (# ipv_a13o, ipv1_a13p #) ->
                  Foo.f1 (GHC.Types.I# ww_s1bC) ipv_a13o
                  }
              } } in
      let {
        g_s1bj [InlPrag=[2]]
          :: Int
             -> GHC.Prim.State# GHC.Prim.RealWorld
             -> (# GHC.Prim.State# GHC.Prim.RealWorld, Int #)
        g_s1bj
          = \ (y_s1bA :: Int)
              (s_s1bE :: GHC.Prim.State# GHC.Prim.RealWorld) ->
              case y_s1bA of { GHC.Types.I# ww_s1bC ->
              $wg_s1bH ww_s1bC s_s1bE
              } } in
      (# g_s1bj `cast` <blah blah>
         g_s1bj `cast` <blah blah> #)
  • Problem 1. We worker/wrapper g even though it is never applied to anything. That makes no sense; a caller of $wf will get back a pair of functions; when it applies that function, it'll call g which will call $wg. we have allocated two closures instead of one, and slowed down calls a bit. Not great.

  • Problem 2. The wrapper g_s1bj unboxes the argument y, and passes the contents to $wg which reboxes it in every control path! Now things are really worse; the caller passes a boxed Int; g unboxes it; and $wg reboxes it.

Diagnosis for Problem 2

At the demand analysis stage, f looks like this:

f = \ (@a_aUw)
      ($dShow_a12O [Dmd=MP(A,MCM(L),A)] :: Show a_aUw)
      (x_auq :: a_aUw) ->
      let {
        lvl_s1bo :: String
        [LclId,
         Unf=Unf{Src=<vanilla>, TopLvl=False, Value=False, ConLike=False,
                 WorkFree=False, Expandable=False, Guidance=IF_ARGS [] 30 0}]
        lvl_s1bo = show @a_aUw $dShow_a12O x_auq } in
      let {
        g_s1bj
          :: Int
             -> GHC.Prim.State# GHC.Prim.RealWorld
             -> (# GHC.Prim.State# GHC.Prim.RealWorld, Int #)
        [LclId,
         Arity=2,
         Str=<1!L><L> {s1bo->M!L},
         Unf=Unf{Src=<vanilla>, TopLvl=False, Value=True, ConLike=True,
                 WorkFree=True, Expandable=True, Guidance=IF_ARGS [20 0] 121 0}]
        g_s1bj
          = \ (y_aus [Dmd=1!L] :: Int)
              (s_a13m :: GHC.Prim.State# GHC.Prim.RealWorld) ->
              case y_aus of wild_a13t [Dmd=L!L] { GHC.Types.I# x_a13u ->
              case GHC.Prim.># x_a13u 2# of {
                __DEFAULT -> foogle_s13h wild_a13t s_a13m;
                1# ->
                  case GHC.IO.Handle.Text.hPutStr2
                         GHC.IO.Handle.FD.stdout lvl_s1bo GHC.Types.True s_a13m
                  of
                  { (# ipv_a13o, ipv1_a13p [Dmd=A] #) ->
                  foogle_s13h wild_a13t ipv_a13o
                  }
              }
              } } in
      (g_s1bj `cast` .., g_s1bj `cast` ..)

So g is strict in y (since it case-analyses it immediately) but both branches use it boxed, via wild_a13t. Usually a "bothDmd" between a boxed demand (from the branches) and an unboxed demand (from the scrutinee) woul dyield a boxed demand, and w/w would not unbox it.

BUT the case hPutStr2 ... of ... messes it up. It calls deferAfterPreciseException on the demand from branches (a boxed demand)

deferAfterPreciseException :: DmdType -> DmdType
deferAfterPreciseException = lubDmdType exnDmdType

Now exnDmdType behaves like an unboxed demand, and lub betwen boxed and unboxed is (currently) unboxed. Result we give an unboxed demand oy y.

But that's obviously stupid. This deferAfterPreciseException should make things lazy, but no less boxed.

Cure for Problem 2

After some discussion with @sgraf812, we decided to make a speicialised version of lubDmdType for the call in deferAfterPreciseExeption, which makes boxed-ness win.

Cure for Problem 1

Problem 1 is quite separate. One possible cure is this:

  • In the last run of the simplifier, discard the stable unfolding on wrappers for non-recursive, non-exported Ids.

If we discarded g's stable unfolding, there would be only one remaining call to $wg, namely in g's RHS, so the worker would inline back into the wrapper, effectively undoing w/w in the cases where it turned out to have no benefit.

Why non-exported? Because for exported Ids we may well get external calls that beneft from the wrapper.

Why non-recursive? Because if the worker is recursive, we won't be able to inline it into the wrapper anyway.

Where this came up

All this showed up when looking at a substantial regression in nofib/shootout/reverse-complement. That ultimately turned out to be extra allocation in GHC.IO.Handle.Text.hGetBuf. I saw this code

GHC.IO.Handle.Text.$whGetBuf
  = \ (@a_s4L9)
      (h_s4La :: Handle)
      (ww_s4Ld :: Addr#)
      (ww1_s4Lh :: Int#)
      (eta_s4Lj [OS=OneShot] :: State# RealWorld) ->
      case ww1_s4Lh of wild_X1 {
        __DEFAULT ->
          let {
            wild1_a3XQ :: Int
            [LclId, Unf=OtherCon []]
            wild1_a3XQ = GHC.Types.I# wild_X1 } in
          case <# wild_X1 0# of {
            __DEFAULT ->
              let {
                ptr_X0 :: Ptr a_s4L9
                [LclId, Unf=OtherCon []]
                ptr_X0 = GHC.Ptr.Ptr @a_s4L9 ww_s4Ld } in
              let {
                $wact1_s4L7 [InlPrag=[2], Dmd=LCL(C1(C1(C1(C1(C1(C1(C1(!L))))))))]
                  :: forall {dev} {enc_state} {dec_state}.
                     (RawIO.RawIO dev, Buffered.BufferedIO dev) =>
                     dev
                     -> MutVar# RealWorld (Buffer Word8)
                     -> MutVar# RealWorld (dec_state, Buffer Word8)
                     -> MutVar# RealWorld (Buffer CharBufElem)
                     -> Maybe (GHC.IO.Encoding.Types.TextDecoder dec_state)
                     -> State# RealWorld
                     -> (# State# RealWorld, Int #)
                [LclId,
                 Arity=8,
                 Str=<LP(LCL(C1(C1(C1(C1(P(L,1P(1L))))))),A,A,A)><LP(A,LCL(C1(C1(P(L,1P(1P(1L),L))))),A,A,A,A)><L><L><L><L><ML><L>,
                 Unf=OtherCon []]
                $wact1_s4L7
                  = \ (@dev_s4KJ)
                      (@enc_state_s4KK)
                      (@dec_state_s4KL)
                      (ww2_s4KM :: RawIO.RawIO dev_s4KJ)
                      (ww3_s4KO :: Buffered.BufferedIO dev_s4KJ)
                      (ww4_s4KQ
                         :: dev_s4KJ
                         Unf=OtherCon [])
                      (ww5_s4KS :: MutVar# RealWorld (Buffer Word8))
                      (ww6_s4KU :: MutVar# RealWorld (dec_state_s4KL, Buffer Word8))
                      (ww7_s4KV :: MutVar# RealWorld (Buffer CharBufElem))
                      (ww8_s4KY
                         :: Maybe (GHC.IO.Encoding.Types.TextDecoder dec_state_s4KL))
                      (eta1_s4L4 [OS=OneShot] :: State# RealWorld) ->
                      case GHC.IO.Handle.Internals.$wflushCharReadBuffer
                             @dev_s4KJ
                             @enc_state_s4KK
                             @dec_state_s4KL
                             ww5_s4KS
                             ww6_s4KU
                             ww7_s4KV
                             ww8_s4KY
                             eta1_s4L4
                      of ww17_a4cj
                      { __DEFAULT ->
                      case readMutVar# @RealWorld @(Buffer Word8) ww5_s4KS ww17_a4cj of
                      { (# ipv_a3VH, ipv1_a3VI #) ->
                      case ipv1_a3VI of wild2_X5
                      { Buffer bx_d3HU bx1_d3HV ds_d3pV bx2_d3HW bx3_d3HX bx4_d3HY
                               bx5_d3HZ ->
                      case ==# bx4_d3HY bx5_d3HZ of {
                        __DEFAULT ->
                          GHC.IO.Handle.Text.hGetBuf15
                            (GHC.IO.Handle.Types.Handle__
                               @dev_s4KJ
                               @enc_state_s4KK
                               @dec_state_s4KL
                               ww2_s4KM
                               (GHC.IO.Handle.Text.hGetBuf14 @dev_s4KJ)
                               ww3_s4KO
                               (GHC.IO.Handle.Text.hGetBuf13 @dev_s4KJ)
                               ww4_s4KQ
                               GHC.IO.Handle.Text.hGetBuf12
                               ww5_s4KS
                               GHC.IO.Handle.Text.hGetBuf11
                               ww6_s4KU
                               ww7_s4KV
                               (RUBBISH(UnliftedRep)
                                  @(MutVar# RealWorld (BufferList CharBufElem)))
                               (GHC.IO.Handle.Text.hGetBuf10 @enc_state_s4KK)
                               ww8_s4KY
                               GHC.IO.Handle.Text.hGetBuf9
                               GHC.IO.Handle.Text.hGetBuf8
                               GHC.IO.Handle.Text.hGetBuf8
                               GHC.IO.Handle.Text.hGetBuf7)
                            wild2_X5
                            (ptr_X0
                             `cast` ((Ptr Univ(phantom phantom <*>_N :: a_s4L9, Word8))_R
                                     :: Ptr a_s4L9 ~R# Ptr Word8))
                            GHC.IO.Handle.Text.hGetBuf6
                            wild1_a3XQ
                            ipv_a3VH;
                        1# ->
                          GHC.IO.Handle.Text.hGetBuf5
                            (GHC.IO.Handle.Types.Handle__
                               @dev_s4KJ
                               @enc_state_s4KK
                               @dec_state_s4KL
                               ww2_s4KM
                               (GHC.IO.Handle.Text.hGetBuf14 @dev_s4KJ)
                               ww3_s4KO
                               (GHC.IO.Handle.Text.hGetBuf13 @dev_s4KJ)
                               ww4_s4KQ
                               GHC.IO.Handle.Text.hGetBuf12
                               ww5_s4KS
                               GHC.IO.Handle.Text.hGetBuf11
                               ww6_s4KU
                               ww7_s4KV
                               (RUBBISH(UnliftedRep)
                                  @(MutVar# RealWorld (BufferList CharBufElem)))
                               (GHC.IO.Handle.Text.hGetBuf10 @enc_state_s4KK)
                               ww8_s4KY
                               GHC.IO.Handle.Text.hGetBuf9
                               GHC.IO.Handle.Text.hGetBuf8
                               GHC.IO.Handle.Text.hGetBuf8
                               GHC.IO.Handle.Text.hGetBuf7)
                            wild2_X5
                            (ptr_X0
                             `cast` ((Ptr Univ(phantom phantom <*>_N :: a_s4L9, Word8))_R
                                     :: Ptr a_s4L9 ~R# Ptr Word8))
                            GHC.IO.Handle.Text.hGetBuf6
                            wild1_a3XQ
                            ipv_a3VH
                      }
                      }
                      }
                      } } in
              let {
                act1_s4gq [InlPrag=[2], Dmd=LCL(C1(L))]
                  :: Handle__ -> State# RealWorld -> (# State# RealWorld, Int #)
                [LclId,
                 Arity=2,
                 Str=<1!P(LP(LCL(C1(C1(C1(C1(P(L,1P(1L))))))),A,A,A),A,LP(A,LCL(C1(C1(P(L,1P(1P(1L),L))))),A,A,A,A),A,L,A,L,A,L,L,A,A,ML,A,A,A,A)><L>,
                 Unf=Unf{Src=InlineStable, TopLvl=False, Value=True, ConLike=True,
                         WorkFree=True, Expandable=True,
                         Guidance=ALWAYS_IF(arity=2,unsat_ok=True,boring_ok=False)
                         Tmpl= \ (h__s4KH [Occ=Once1!] :: Handle__)
                                 (eta1_s4L4 [Occ=Once1, OS=OneShot] :: State# RealWorld) ->
                                 case h__s4KH of
                                 { Handle__ @dev_s4KJ @enc_state_s4KK @dec_state_s4KL
                                            ww2_s4KM [Occ=Once1] _ [Occ=Dead] ww4_s4KO [Occ=Once1]
                                            _ [Occ=Dead] ww6_s4KQ [Occ=Once1] _ [Occ=Dead]
                                            ww8_s4KS [Occ=Once1] _ [Occ=Dead] ww10_s4KU [Occ=Once1]
                                            ww11_s4KV [Occ=Once1] _ [Occ=Dead] _ [Occ=Dead]
                                            ww14_s4KY [Occ=Once1] _ [Occ=Dead] _ [Occ=Dead]
                                            _ [Occ=Dead] _ [Occ=Dead] ->
                                 $wact1_s4L7
                                   @dev_s4KJ
                                   @enc_state_s4KK
                                   @dec_state_s4KL
                                   ww2_s4KM
                                   ww4_s4KO
                                   ww6_s4KQ
                                   ww8_s4KS
                                   ww10_s4KU
                                   ww11_s4KV
                                   ww14_s4KY
                                   eta1_s4L4
                                 }}]
                act1_s4gq
                  = \ (h__s4KH :: Handle__)
                      (eta1_s4L4 [OS=OneShot] :: State# RealWorld) ->
                      case h__s4KH of
                      { Handle__ @dev_s4KJ @enc_state_s4KK @dec_state_s4KL ww2_s4KM
                                 ww3_s4KN ww4_s4KO ww5_s4KP ww6_s4KQ ww7_s4KR ww8_s4KS ww9_s4KT
                                 ww10_s4KU ww11_s4KV ww12_s4KW ww13_s4KX ww14_s4KY ww15_s4KZ
                                 ww16_s4L0 ww17_s4L1 ww18_s4L2 ->
                      $wact1_s4L7
                        @dev_s4KJ
                        @enc_state_s4KK
                        @dec_state_s4KL
                        ww2_s4KM
                        ww4_s4KO
                        ww6_s4KQ
                        ww8_s4KS
                        ww10_s4KU
                        ww11_s4KV
                        ww14_s4KY
                        eta1_s4L4
                      } } in
              case h_s4La of wild2_a457 {
                FileHandle ds_a459 bx_a45a ->
                  GHC.IO.Handle.Internals.$wwantReadableHandle'
                    @Int
                    GHC.IO.Handle.Text.hGetBuf3
                    wild2_a457
                    bx_a45a
                    (act1_s4gq
                     `cast` (<Handle__>_R
                             %<'Many>_N ->_R Sym (GHC.Types.N:IO[0] <Int>_R)
                             :: (Handle__ -> State# RealWorld -> (# State# RealWorld, Int #))
                                ~R# (Handle__ -> IO Int)))
                    eta_s4Lj;
                DuplexHandle ds_a45c bx_a45d bx1_a45e ->
                  GHC.IO.Handle.Internals.$wwantReadableHandle'
                    @Int
                    GHC.IO.Handle.Text.hGetBuf3
                    wild2_a457
                    bx_a45d
                    (act1_s4gq
                     `cast` (<Handle__>_R
                             %<'Many>_N ->_R Sym (GHC.Types.N:IO[0] <Int>_R)
                             :: (Handle__ -> State# RealWorld -> (# State# RealWorld, Int #))
                                ~R# (Handle__ -> IO Int)))
                    eta_s4Lj
              };
            1# ->
              GHC.IO.Handle.Text.hGetBuf2
                @Int h_s4La GHC.IO.Handle.Text.hGetBuf3 wild1_a3XQ eta_s4Lj
          };
        0# -> (# eta_s4Lj, GHC.IO.Handle.Text.hGetBuf6 #)
      }

Note that

  • act1_s4gq is passed to $wwantReadableHandle'; it is not applied
  • act1_s4gq is a wrapper that unboxes its Handle argument
  • But the worker $wact1_s4L7 reboxes it, in every control path.
Edited by Sebastian Graf
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information