`for_ [0..n]` is not unfolded for some monads
Summary
It is relatively common to express an imperative for
-style loop using for_ [0..n] $ \i ->
. In many cases, GHC is able to inline the call to enumFromTo
and entirely eliminate the list constructor.
However, there are some monads where this does not occur, including some conceptually simple ones, making performance hard to reason about.
I realise this is probably a more general issue, so apologies if this has been reported before. I was not able to find something which looked similar in the issue tracker, but wasn't quite sure what to look for either!
Steps to reproduce
Consider the following example:
{-# OPTIONS_GHC -O2 -ddump-simpl -dsuppress-all #-}
module T (go) where
import Control.Monad.State.Strict
import Data.Foldable
import GHC.Exts
type OurMonad = State ()
go :: Int -> OurMonad ()
go n = do
x <- opaque ()
for_ [0..n] $ \i -> do
opaque x
opaque :: a -> OurMonad ()
opaque = noinline (\_ -> pure ())
-- The NOINLINE pragma isn't sufficient here, we need noinline.
Inspecting the generated core reveals the list construction is not eliminated:
Generated Core
-- RHS size: {terms: 6, types: 5, coercions: 0, joins: 0/0}
go1 = \ @a_a1Cq _ s1_a1MQ -> ((), s1_a1MQ)
-- RHS size: {terms: 3, types: 5, coercions: 17, joins: 0/0}
opaque = \ @a_a1Cq -> noinline (go1 `cast` <Co:17>)
-- RHS size: {terms: 2, types: 1, coercions: 0, joins: 0/0}
go_m1 = opaque ()
-- RHS size: {terms: 58, types: 45, coercions: 26, joins: 1/4}
go
= \ n_a1tg ->
let {
lvl_s22B
= case n_a1tg of { I# y_a21W ->
case ># 0# y_a21W of {
__DEFAULT ->
letrec {
go3_a257
= \ x_a258 ->
: (I# x_a258)
(case ==# x_a258 y_a21W of {
__DEFAULT -> go3_a257 (+# x_a258 1#);
1# -> []
}); } in
go3_a257 0#;
1# -> []
}
} } in
(\ s1_a1N4 ->
case ((go_m1 `cast` <Co:4>) s1_a1N4) `cast` <Co:4> of
{ (a1_a1N7, s'_a1N8) ->
let { lvl1_s22A = opaque a1_a1N7 } in
joinrec {
go2_s25c ds_a235 eta_B0
= case ds_a235 of {
[] -> ((), eta_B0) `cast` <Co:5>;
: y_a238 ys_a239 ->
case ((lvl1_s22A `cast` <Co:4>) eta_B0) `cast` <Co:4> of
{ (a2_a22S, s'1_a22T) ->
jump go2_s25c ys_a239 s'1_a22T
}
}; } in
jump go2_s25c lvl_s22B s'_a1N8
})
`cast` <Co:5>
This was originally encountered in some code which uses the STT
and binary's Get
monad.
Expected behaviour
One would expect the list construction to be eliminated, and the comparison inlined into the join point's body. We can see this occur with other monads, such as IO
or ST
.
Generated Core with type OurMonad = IO
go2 = \ @a_a1z3 _ s_a1IO -> (# s_a1IO, () #)
-- RHS size: {terms: 3, types: 5, coercions: 6, joins: 0/0}
opaque = \ @a_a1z3 -> noinline (go2 `cast` <Co:6>)
-- RHS size: {terms: 46, types: 61, coercions: 4, joins: 1/2}
go1
= \ n_a1sb s_a1J0 ->
case ((opaque ()) `cast` <Co:2>) s_a1J0 of
{ (# ipv_a1J2, ipv1_a1J3 #) ->
case n_a1sb of { I# y_a1XJ ->
case ># 0# y_a1XJ of {
__DEFAULT ->
let { lvl_s1Yk = opaque ipv1_a1J3 } in
joinrec {
go3_s20G x_a1XX s1_a1Yu
= case (lvl_s1Yk `cast` <Co:2>) s1_a1Yu of
{ (# ipv2_a1Yw, ipv3_a1Yx #) ->
case ==# x_a1XX y_a1XJ of {
__DEFAULT -> jump go3_s20G (+# x_a1XX 1#) ipv2_a1Yw;
1# -> (# ipv2_a1Yw, () #)
}
}; } in
jump go3_s20G 0# ipv_a1J2;
1# -> (# ipv_a1J2, () #)
}
}
}
I find this a little surprising, as ST
/IO
and StateT
look similar - I'm assuming there's special handling of the State# s
token?
Environment
- GHC version used: 9.2.3 and 9.4.0.20220501.
Optional:
- Operating System: Linux
- System Architecture: x86_64