Adding a condition blocks fusion leading to 10x worse code
Summary
See the function arrayOfx
here https://github.com/composewell/streamly/blob/7672d3fb3762dd53f20fbd7095d6cde48ea9663d/src/Streamly/Memory/Array/Types.hs#L589
This function transforms a stream which is defined using Yield, Skip and Stop constructors exactly in the same way as the vector package's stream. It takes advantage of stream fusion just like vector. This function is used in the example code here https://github.com/composewell/streamly/blob/ghc-transformation-with-condition/groupsOf.hs . This code fuses well and all constructors are eliminated, a good clean core is generated if line 602 in arrayOfx definition is removed. If that one line, which is a simple condition, is added back then the constructors are not fused.
Here is the entire function in question for quick reference:
arraysOfx
:: forall m a. (Storable a, MonadIO m)
=> Int
-> D.Stream m a
-> D.Stream m (Array a)
arraysOfx n (D.Stream step state) =
D.Stream step' (GroupStart1 state)
where
initial = liftIO $ newArray n
{-# INLINE fstep #-}
-- Commenting the following line generates fused code with 10x better performance
fstep arr@(Array _ end bound) _ | end == bound = return arr
fstep (Array start end bound) x = do
liftIO $ poke end x
return $ Array start (end `plusPtr` sizeOf (undefined :: a)) bound
extract = return
{-# INLINE_LATE step' #-}
step' _ (GroupStart1 st) = do
fs <- initial
return $ D.Skip (GroupBuffer1 st fs 0)
step' gst (GroupBuffer1 st fs i) = do
r <- step (adaptState gst) st
case r of
D.Yield x s -> do
fs' <- fstep fs x
let i' = i + 1
return $
if i' >= n
then D.Skip (GroupYield1 fs' (GroupStart1 s))
else D.Skip (GroupBuffer1 s fs' i')
D.Skip s -> return $ D.Skip (GroupBuffer1 s fs i)
D.Stop -> return $ D.Skip (GroupYield1 fs GroupFinish1)
step' _ (GroupYield1 fs next) = do
r <- extract fs
return $ D.Yield r next
step' _ GroupFinish1 = return D.Stop
The cores for both the cases are committed in the repo:
- https://github.com/composewell/streamly/blob/ghc-transformation-with-condition/groupsOf.dump-simpl.condition-in-func.txt
- https://github.com/composewell/streamly/blob/ghc-transformation-with-condition/groupsOf.dump-simpl.no-condition-in-func.txt
Steps to reproduce
Clone the repo from github https://github.com/composewell/streamly.git, the repro example is on this branch https://github.com/composewell/streamly/tree/ghc-transformation-with-condition . Instructions to run the code and produce core output in both cases are in the example file groupsOf.hs at the root of the repo.
Expected behavior
The code should fuse reliably in both the cases.
Environment
- GHC version used: ghc-8.6.5