Performance regression due to stream fusion issue in GHC-9

Summary

See #19790 (closed) for background. This regression likely has a different root cause, so I am raising a separate issue for it.

I updated the repo (https://github.com/composewell/streamly-ghc9-regression) with reproduction code for another regression. This time it is in the postscan operation.

Steps to reproduce

You can pull the repo (master branch or postscan branch) and use the following command to build it:

$ ghc --make -O2 -fspec-constr-recursive=4 -ddump-to-file -ddump-simpl Main.hs
$ ./Main +RTS -s

Comparing a build with ghc-8.10 against one with ghc-9.3+!5658 (closed), the ghc-8.10 build allocates much less because the code fuses. The core generated by ghc-9.3 still contains the Yield/Skip/Stop constructors, which shows that the stream did not fuse.
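For readers unfamiliar with the constructors mentioned above, here is a minimal sketch of a stream-fusion style postscan. It is not the streamly implementation or the code in the reproduction repo: the names `Stream`, `enumFromToS`, `postscanS`, `foldlS` and `toListS` are hypothetical, and the stream is pure here for brevity. Only the `Yield`/`Skip`/`Stop` constructor names come from the report.

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE ExistentialQuantification #-}
module Main (main) where

-- One step of a stream state machine (constructor names as in the report).
data Step s a = Yield a s | Skip s | Stop

-- A stream is a step function plus an initial state (hypothetical type).
data Stream a = forall s. Stream (s -> Step s a) s

-- Produce the integers from lo to hi inclusive.
enumFromToS :: Int -> Int -> Stream Int
enumFromToS lo hi = Stream step lo
  where
    step i
      | i > hi    = Stop
      | otherwise = Yield i (i + 1)
{-# INLINE enumFromToS #-}

-- postscan: emit each updated accumulator, excluding the initial value.
postscanS :: (b -> a -> b) -> b -> Stream a -> Stream b
postscanS f z (Stream step s0) = Stream step' (s0, z)
  where
    step' (s, !acc) = case step s of
      Yield x s' -> let !acc' = f acc x in Yield acc' (s', acc')
      Skip s'    -> Skip (s', acc)
      Stop       -> Stop
{-# INLINE postscanS #-}

-- Strict left fold that consumes the stream.
foldlS :: (b -> a -> b) -> b -> Stream a -> b
foldlS f z (Stream step s0) = go z s0
  where
    go !acc s = case step s of
      Yield x s' -> go (f acc x) s'
      Skip s'    -> go acc s'
      Stop       -> acc
{-# INLINE foldlS #-}

-- Collect the stream into a list (used only for inspection).
toListS :: Stream a -> [a]
toListS (Stream step s0) = go s0
  where
    go s = case step s of
      Yield x s' -> x : go s'
      Skip s'    -> go s'
      Stop       -> []
{-# INLINE toListS #-}

main :: IO ()
main = print (foldlS (+) 0 (postscanS (+) 0 (enumFromToS 1 1000)))
```

When a pipeline like the one in `main` fuses, the optimizer inlines the step functions into the consuming loop and the `Step` values are never allocated. If `Yield`/`Skip`/`Stop` remain visible in the `-ddump-simpl` output, each element still passes through a heap-allocated `Step`, which is consistent with the higher allocation figures described above.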

Expected behavior

GHC-9 should produce code as efficient as GHC-8.

Environment

Edited by harendra