Should we CSE simple primop applications?
Sebastian commited this wonderful bit-fiddling code to GHC recently:
-- | Denotes '+' on lower and upper bounds of 'Card'.
plusCard :: Card -> Card -> Card
-- See Note [Algebraic specification for plusCard and multCard]
plusCard (Card a) (Card b)
= Card (bit0 .|. bit1 .|. bitN)
where
bit0 = (a .&. b) .&. 0b001
bit1 = (a .|. b) .&. 0b010
bitN = ((a .|. b) .|. shiftL (a .&. b) 1) .&. 0b100
Curious as I am I couldn't resist but look at what it compiles to:
GHC.Types.Demand.plusCard
:: GHC.Types.Demand.Card
-> GHC.Types.Demand.Card -> GHC.Types.Demand.Card
[GblId, Arity=2, Str=<1P(L)><1P(L)>, Cpr=1, Unf=OtherCon []] =
\r [ds_s7DU ds1_s7DV]
case ds_s7DU of {
GHC.Types.I# x#_s7DX ->
case ds1_s7DV of {
GHC.Types.I# y#_s7DZ ->
case andI# [x#_s7DX y#_s7DZ] of sat_s7E6 [Occ=Once1] {
__DEFAULT ->
case uncheckedIShiftL# [sat_s7E6 1#] of sat_s7E7 [Occ=Once1] {
__DEFAULT ->
case orI# [x#_s7DX y#_s7DZ] of sat_s7E5 [Occ=Once1] {
__DEFAULT ->
case orI# [sat_s7E5 sat_s7E7] of sat_s7E8 [Occ=Once1] {
__DEFAULT ->
case andI# [sat_s7E8 4#] of sat_s7E9 [Occ=Once1] {
__DEFAULT ->
case orI# [x#_s7DX y#_s7DZ] of sat_s7E2 [Occ=Once1] {
__DEFAULT ->
case andI# [sat_s7E2 2#] of sat_s7E3 [Occ=Once1] {
__DEFAULT ->
case andI# [x#_s7DX y#_s7DZ] of sat_s7E0 [Occ=Once1] {
__DEFAULT ->
case andI# [sat_s7E0 1#] of sat_s7E1 [Occ=Once1] {
__DEFAULT ->
case orI# [sat_s7E1 sat_s7E3] of sat_s7E4 [Occ=Once1] {
__DEFAULT ->
case orI# [sat_s7E4 sat_s7E9] of sat_s7Ea [Occ=Once1] {
__DEFAULT -> GHC.Types.I# [sat_s7Ea];
It's good in that it compiles to nice straight line code.
But I noticed that andI# [x#_s7DX y#_s7DZ]
and orI# [x#_s7DX y#_s7DZ]
are both computed twice. Which was surprising to me.
Now sometimes it can be beneficial to avoid doing those sorts of CSE applications, because keeping the result in a register can cause spilling which is worse then repeating a cheap computation. But when we have enough registers around then keeping the computed value around is better as we save on compute.
It's my understanding that solving this in general is a hard problem. But I wonder if we currently avoid commoning up these expressions by design or by accident. And if the later perhaps we should try to common them up! At least then we would know for sure.