SpecConstr should fire rules and run the simple optimizer much in the same way we do for type class specialization.
Consider this (somewhat silly) example:
{-# OPTIONS_GHC -fspec-constr-keen -fspec-constr-count=99 -fspec-constr-threshold=90000 -fspec-constr-recursive=500 #-}
{-# LANGUAGE MagicHash #-}
module M(baz) where
import GHC.Exts
import Data.Coerce
{-# NOINLINE baz #-}
baz = I# (goz 0# 4#)
  where
    goz :: Int# -> Int# -> Int#
    goz 0# 1# = 1#
    goz 0# x = goz 0# (f x)
    goz _ (0#) = 3#
    goz n x = 6#
    f x = case x of
      _
        | isTrue# (x ># 20#) -> flarge (x -# -20#)
        | otherwise -> flarge (x +# 20#)
        where
          flarge 1# = 2#
          flarge 2# = 3#
          flarge 3# = 4#
          flarge 4# = 5#
          flarge 5# = 6#
          flarge 6# = 7#
          flarge 7# = 8#
          flarge 8# = 9#
          flarge 9# = 10#
          flarge 10# = 11#
          flarge 11# = 12#
          flarge 12# = 13#
          flarge 13# = 14#
          flarge 14# = 15#
          flarge 15# = 16#
          flarge x = x
This will eventually reduce to a single number. Naively (with the given spec-constr flags), this function should optimize down to a statically known number at compile time.
One blocker is #22781, for which I already have a fix. With that fix applied, we run into another issue.
The problem is that we specialize goz for the call goz 0# 4#.
This gives us the following specialized RHS:
-- RHS size: {terms: 83, types: 6, coercions: 0, joins: 1/1}
$sgoz_szB :: (# #) -> Int#
[LclId[StrictWorker([])], Arity=1, Str=<L>]
$sgoz_szB
  = \ (void_0E :: (# #)) ->
      join {
        flarge_szr :: Int# -> Int#
        [LclId[JoinId(1)(Nothing)], Arity=1, Str=<SL>]
        flarge_szr (eta_B1 [Dmd=SL, OS=OneShot] :: Int#)
          = case eta_B1 of ds_X2 {
              __DEFAULT -> goz_aiQ 0# ds_X2;
              1# -> goz_aiQ 0# 2#;
              2# -> goz_aiQ 0# 3#;
              3# -> goz_aiQ 0# 4#;
              4# -> goz_aiQ 0# 5#;
              5# -> goz_aiQ 0# 6#;
              6# -> goz_aiQ 0# 7#;
              7# -> goz_aiQ 0# 8#;
              8# -> goz_aiQ 0# 9#;
              9# -> goz_aiQ 0# 10#;
              10# -> goz_aiQ 0# 11#;
              11# -> goz_aiQ 0# 12#;
              12# -> goz_aiQ 0# 13#;
              13# -> goz_aiQ 0# 14#;
              14# -> goz_aiQ 0# 15#;
              15# -> goz_aiQ 0# 16#
            } } in
      case ># 4# 20# of {
        __DEFAULT -> jump flarge_szr (+# 4# 20#);
        1# -> jump flarge_szr (-# 4# -20#)
      }
But that's pretty silly, as it will incur specializations for goz_aiQ 0# 1#, goz_aiQ 0# 2#, ... and so on.
If we were to run the simple optimizer on the specialized RHS instead, I imagine we would do a lot better.
The case should constant fold away:
case ># 4# 20# of {
  __DEFAULT -> jump flarge_szr (+# 4# 20#);
  1# -> jump flarge_szr (-# 4# -20#)
}
=>
jump flarge_szr (+# 4# 20#);
With just a single occurrence of the join point left, it should inline, which allows more case-of-case and should result in just a constant integer.
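The constant-folding step above can be sketched with a toy model. This is not GHC's simple optimizer or its Core data type: the Expr type, simplify, selectAlt and example are all hypothetical names made up for illustration. The sketch only folds literal primop applications and picks the alternative of a case on a known literal; it does not model join-point inlining or case-of-case.

```haskell
-- A tiny expression language, loosely modelled on the Core fragment above.
-- Hypothetical: this is NOT GHC's Core representation.
data Expr
  = Lit Int                        -- unboxed literal such as 4#
  | Var String                     -- free variable
  | Gt Expr Expr                   -- (>#): 1 for true, 0 for false
  | Add Expr Expr                  -- (+#)
  | Sub Expr Expr                  -- (-#)
  | Jump String Expr               -- jump to a join point with one argument
  | Case Expr [(Maybe Int, Expr)]  -- alternatives; Nothing plays __DEFAULT
  deriving (Eq, Show)

-- Bottom-up constant folding plus case-of-known-literal.
simplify :: Expr -> Expr
simplify expr = case expr of
  Gt a b   -> binop Gt  (\x y -> if x > y then 1 else 0) a b
  Add a b  -> binop Add (+) a b
  Sub a b  -> binop Sub (-) a b
  Jump j a -> Jump j (simplify a)
  Case scrut alts ->
    case simplify scrut of
      -- Scrutinee is a known literal: keep only the matching alternative.
      Lit n | Just rhs <- selectAlt n alts -> simplify rhs
      -- Otherwise keep the case and simplify underneath it.
      scrut' -> Case scrut' [(p, simplify rhs) | (p, rhs) <- alts]
  _ -> expr
  where
    -- Fold a binary primop when both operands reduce to literals.
    binop con op a b = case (simplify a, simplify b) of
      (Lit x, Lit y) -> Lit (op x y)
      (a', b')       -> con a' b'

-- Pick the alternative for a literal, falling back to the default one.
selectAlt :: Int -> [(Maybe Int, Expr)] -> Maybe Expr
selectAlt n alts = case lookup (Just n) alts of
  Just rhs -> Just rhs
  Nothing  -> lookup Nothing alts

-- The case expression from the specialized RHS, in the toy language.
example :: Expr
example =
  Case (Gt (Lit 4) (Lit 20))
    [ (Nothing, Jump "flarge_szr" (Add (Lit 4) (Lit 20)))
    , (Just 1,  Jump "flarge_szr" (Sub (Lit 4) (Lit (-20))))
    ]
```

In this model, simplify example yields Jump "flarge_szr" (Lit 24): the comparison folds to 0, the default alternative is selected, and the addition folds as well, leaving a single jump to the join point.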
I think we should just do the same as we do for the type class specializer. As I understand it, when we specialize a RHS there, we run the simple optimizer on it in order to discover additional opportunities for specialization.