Skip to content

fix for CSE

The current version of CSE is slightly broken -- contrary to the comments explaining it, it does not add an additional mapping when it substitutes the RHS of some binding. Taking the example from CSE.lhs

module Test2 where

data C a b = C a b
a = undefined
{-# NOINLINE a #-}
b = undefined
{-# NOINLINE b #-}

x1 = C a b
x2 = C x1 b
y1 = C a b
y2 = C y1 b

We want to detect that y1 is the same as x1 and so rewrite it to y1 = x1. But at this point we also want to add a new substitution that changes y1 to x1. So that we can get y2 = C x1 b and then y2 = x2. This is not done by the current code:

> ghc -O2 -fforce-recomp Test2.hs -ddump-simpl -dsuppress-all -rtsopts
[1 of 1] Compiling Test2            ( Test2.hs, Test2.o )

==================== Tidy Core ====================
Result size = 40

b = undefined
a = b
x1 = \\ @ a_aaf @ b_aag -> C (a) (b)
x2 = \\ @ b_aao @ a_aap @ b1_aaq -> C (x1) (b)
y1 = x1
y2 = \\ @ b_aaG @ a_aaH @ b1_aaI -> C (x1) (b)

I wrote a patch to fix that and we get the desired result:

> ~/dev/ghc-work/inplace/bin/ghc-stage2 -O2 -fforce-recomp Test2.hs -ddump-simpl -dsuppress-all -rtsopts
[1 of 1] Compiling Test2            ( Test2.hs, Test2.o )

==================== Tidy Core ====================
  Result size = 30

b = undefined
a = b
x1 = \\ @ a_aal @ b_aam -> C (b) (b)
x2 = \\ @ b_aau @ a_aav @ b1_aaw -> C (x1) (b)
y1 = x1
y2 = x2

The fix seems quite easy but it made nofib unhappy -- see nofib1 attachment. Apparently there is quite a bit additional alloctation happening in a few benchmarks. I managed to narrow it down to the GHC.IO.Encoding.\* modules. Adding a simple GHC_OPTIONS -fno-cse seems to improve the performance quite a bit (and above the current HEAD!) -- see nofib2 attachment. There seems to be a bit more code bloat, but the performance looks worth it. I haven't yet looked into exactly what causes the excessive allocation with CSE in the GHC.IO.Encoding.\* modules (and I'm also a bit surprised by that -- I thought the main issue with CSE would be bigger memory usage). So any suggestions are more than welcome. ;)

All the patches are in https://github.com/michalt/ghc (branch "cse") and https://github.com/michalt/packages-base (branch "cse").

Trac metadata
Trac field Value
Version 7.5
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
Edited by Michal Terepeta
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information