Cmm: Sinking, Hp invariants and aliasing.
In Cmm we often have (roughly) this pattern:
stg_sink_things ( P_ x1 )
{
W_ x2, x3, x4;
x2 = W_[x1+1];
x3 = W_[x1+2];
x4 = W_[x1+3];
W_[Hp-8] = x2;
W_[Hp-16] = x3;
W_[Hp-24] = x4;
W_[Hp-32] = SomeCon_info;
return (1);
}
There is already a comment in CmmSink discussing this:
ToDo: this won't currently fix the following commonly occurring code:
x1 = [R1 + 8] x2 = [R1 + 16] .. [Hp - 8] = x1 [Hp - 16] = x2 ..
because [R1 + 8] and [Hp - 8] are both HeapMem. We know that assignments to [Hp + n] do not conflict with any other heap memory, but this is tricky to nail down. What if we had
x = Hp + n [x] = ...
the store to [x] should be "new heap", not "old heap". Furthermore, you could imagine that if we started inlining functions in Cmm then there might well be reads of heap memory that was written in the same basic block. To take advantage of non-aliasing of heap memory we will have to be more clever.
In practice I have seen the problematic case where we don't end up sinking quite often.
So I figured I would try to address this.
The first thing is that we need to have the invariant that any write to Hp
references "new" heap. It's true for the code we generate, and true for our primops code so that is fine.
Next we have the "hard" case of:
> x = Hp + n
> [x] = ...
While it's true that x
ideally would alias to Hp
it's perfectly fine to not optimize this case.
The code generator doesn't really spit out that kind of code and hand written Cmm can always be hand optimized.
So I set out to treat writes to Hp
as new heap which doesn't alias with "old" heap and that was surprisingly easy.
However I felt like that was fundamentally broken so I came up with this counter example.
bar = Hp + off
foo = [bar]
[Hp+off] = <e>
return foo
Here [bar] and [Hp+off] clearly alias. But if we only treat writes to [Hp] as NewHeap and reads from other gcptrs as "old" heap then we end up sinking the read past the write and disaster strike. It worked even with this horrible flaw since GHC doesn't generate reads from Hp
directly or indirectly ever it seems. But that seemes like a horrible flaw.
The is a (conceptually) simple solution however. For every variable check if it somehow depends on Hp. If it does a memory read utilizing it should conflict with Hp. This is pretty simple to implement using hoopl.
I will push a patch with a proof-of-concept for this soon.