GHC should generate workers with unlifted arguments for lifted types.
Motivation
We already have good machinery that lets construction and deconstruction cancel out between caller and callee via worker/wrapper (WW). I feel we could reasonably extend this machinery so that it also avoids, to some degree, re-evaluating or re-checking the tags of values between caller and callee.
Currently if we have a silly function like this:
{-# NOINLINE foo #-}
foo !x !y !z !n =
    x || y || z || n > (1 :: Int)
It will evaluate x/y/z at least twice: once for the seqs added by the bangs, and once for the actual control flow. Tag inference will at least avoid evaluating these twice inside the body.
But what if the caller had already cased on the arguments? Then we still recheck the tag in the worker.
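A minimal sketch of such a caller may make the situation concrete. The name `caller` is hypothetical and only for illustration; `foo` is the function from above with a type signature added. The bangs in `caller` force `a`, `b` and `c` before the call, yet the worker for `foo` has no way of knowing that and checks the tags again.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

{-# NOINLINE foo #-}
foo :: Bool -> Bool -> Bool -> Int -> Bool
foo !x !y !z !n =
    x || y || z || n > (1 :: Int)

-- Hypothetical caller: the bang patterns evaluate a, b and c to WHNF
-- here, before foo is ever entered.
{-# NOINLINE caller #-}
caller :: Bool -> Bool -> Bool -> Int -> Int
caller !a !b !c n
    | foo a b c n = 1
    | otherwise   = 0

main :: IO ()
main = do
    print (caller True False False 0)
    print (caller False False False 0)
    print (caller False False False 5)
```

Compiling this with `-O -ddump-simpl` should show a worker for `foo` along the lines of the one discussed here, though the exact names and shape depend on the GHC version.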
The worker currently looks like this:
  = \ (w_s1mU :: Bool)
      (w1_s1mV :: Bool)
      (w2_s1mW :: Bool)
      (ww_s1n0 :: Int#) ->
      case w1_s1mV of y_X1 { __DEFAULT ->
      case w2_s1mW of z_X2 { __DEFAULT ->
      case w_s1mU of {
        False ->
          case y_X1 of {
            False ->
              case z_X2 of {
                False -> tagToEnum# @Bool (># ww_s1n0 1#);
                True -> GHC.Types.True
              };
            True -> GHC.Types.True
          };
        True -> GHC.Types.True
      }
      }
      }
But with unlifted data types we shouldn't need to do so.
It would be wonderful if we could instead generate a worker like this:
A.$wfoo
  = \ (w_s1mU :: unlifted Bool)
      (y_X1 :: unlifted Bool)
      (z_X2 :: unlifted Bool)
      (ww_s1n0 :: Int#) ->
      case lift w_s1mU of { -- lift to bring it back into the lifted domain,
                            -- but tag inference could still elide the tag check easily.
        False ->
          case lift y_X1 of {
            False ->
              case lift z_X2 of {
                False -> tagToEnum# @Bool (># ww_s1n0 1#);
                True -> GHC.Types.True
              };
            True -> GHC.Types.True
          };
        True -> GHC.Types.True
      }
The decision of whether or not to pass a lifted argument unlifted could be driven by demand analysis, much in the same way as we drive unboxing currently. Only it would naturally apply to sum types and avoid the risk of reboxing.
The wrapper would be a bigger problem. In the absence of changing Core (which might be reasonable) we would need something like this as a wrapper:
foo
  = \ (w_s1mU :: Bool)
      (w1_s1mV :: Bool)
      (w2_s1mW :: Bool)
      (w3_s1mX :: Int) ->
      -- seqCoerce# would cancel out with seqs in the caller via tag inference during codegen.
      case seqCoerce# w_s1mU of u ->
      case seqCoerce# w1_s1mV of v ->
      case seqCoerce# w2_s1mW of w ->
      case w3_s1mX of { I# ww1_s1n0 ->
      A.$wfoo u v w ww1_s1n0
      }
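For comparison, the shape of this split can be approximated by hand in source Haskell with the `UnliftedDatatypes` extension (GHC 9.2 or later). This is only a rough sketch under stated assumptions, not what GHC generates: `UBool`, `toUBool` and `wfoo` are hypothetical names, and `toUBool` is a plain conversion function standing in for the `seqCoerce#` primitive imagined above. Unlike `seqCoerce#`, it maps to a separate type instead of reusing the `Bool` closure, and the `Int` is not unboxed here.

```haskell
{-# LANGUAGE StandaloneKindSignatures #-}
{-# LANGUAGE UnliftedDatatypes #-}
module Main where

import GHC.Exts (UnliftedType)

-- Hypothetical unlifted counterpart of Bool: values of this type are
-- never thunks, so a function receiving one need not evaluate it.
type UBool :: UnliftedType
data UBool = UFalse | UTrue

-- Stand-in for seqCoerce#: matching on the Bool forces it, and the
-- result lives in the unlifted domain.
toUBool :: Bool -> UBool
toUBool False = UFalse
toUBool True  = UTrue

-- Hand-written "worker": all three flags arrive already evaluated.
{-# NOINLINE wfoo #-}
wfoo :: UBool -> UBool -> UBool -> Int -> Bool
wfoo UTrue  _      _      _ = True
wfoo UFalse UTrue  _      _ = True
wfoo UFalse UFalse UTrue  _ = True
wfoo UFalse UFalse UFalse n = n > 1

-- Hand-written "wrapper": the unlifted arguments are evaluated
-- before wfoo is called.
foo :: Bool -> Bool -> Bool -> Int -> Bool
foo x y z n = wfoo (toUBool x) (toUBool y) (toUBool z) n

main :: IO ()
main = do
    print (foo False False False 2)
    print (foo False True False 0)
    print (foo False False False 1)
```

The point of the sketch is only that the types of `wfoo` already guarantee evaluated arguments; the proposal would have WW derive an equivalent guarantee automatically, without a user-visible second type.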
This is not meant as a concrete implementation proposal. But the idea of having WW passing arguments unlifted crossed my mind a lot during the work on tag inference. None of this needs to be exposed to users so I think it would be reasonable.
It's also a bit open how useful this would be. It obviously adds work for the compiler at the very least, so it would need to pay its dues. And with enough inlining there are very few boundaries across which repeated seqs would cancel out to pay these.
However, it would allow W/W to improve things where unboxing isn't an option. So it is hard to say without spending some time looking into this further, which I don't plan to do in the near future.