SpecConstr should fire rules and run the simple optimizer much in the same way we do for type class specialization.

Consider this (somewhat silly) example:

{-# OPTIONS_GHC -fspec-constr-keen -fspec-constr-count=99 -fspec-constr-threshold=90000 -fspec-constr-recursive=500 #-}
{-# LANGUAGE MagicHash #-}

module M(baz) where

import GHC.Exts
import Data.Coerce

{-# NOINLINE baz #-}
baz = I# (goz 0# 4#)
  where
    goz :: Int# -> Int# -> Int#
    goz 0# 1# = 1#
    goz 0# x = goz 0# (f x)
    goz _ (0#) = 3#
    goz n x = 6#

    f x = case x of
        _
          | isTrue# (x ># 20#) -> flarge (x -# -20#)
          | otherwise          -> flarge (x +#  20#)
      where
        flarge 1# = 2#
        flarge 2# = 3#
        flarge 3# = 4#
        flarge 4# = 5#
        flarge 5# = 6#
        flarge 6# = 7#
        flarge 7# = 8#
        flarge 8# = 9#
        flarge 9# = 10#
        flarge 10# = 11#
        flarge 11# = 12#
        flarge 12# = 13#
        flarge 13# = 14#
        flarge 14# = 15#
        flarge 15# = 16#
        flarge x =  x

This will eventually reduce to a single number. Naively (with the given spec-constr flags) this function should optimize down to a statically known number at compile time.

One blocker for this is #22781, for which I already have a fix. With that fix applied we run into another issue.

The problem is that we specialize goz for the call pattern goz 0# 4#.

This gives us the following specialized RHS:

-- RHS size: {terms: 83, types: 6, coercions: 0, joins: 1/1}
$sgoz_szB :: (# #) -> Int#
[LclId[StrictWorker([])], Arity=1, Str=<L>]
$sgoz_szB
  = \ (void_0E :: (# #)) ->
      join {
        flarge_szr :: Int# -> Int#
        [LclId[JoinId(1)(Nothing)], Arity=1, Str=<SL>]
        flarge_szr (eta_B1 [Dmd=SL, OS=OneShot] :: Int#)
          = case eta_B1 of ds_X2 {
              __DEFAULT -> goz_aiQ 0# ds_X2;
              1# -> goz_aiQ 0# 2#;
              2# -> goz_aiQ 0# 3#;
              3# -> goz_aiQ 0# 4#;
              4# -> goz_aiQ 0# 5#;
              5# -> goz_aiQ 0# 6#;
              6# -> goz_aiQ 0# 7#;
              7# -> goz_aiQ 0# 8#;
              8# -> goz_aiQ 0# 9#;
              9# -> goz_aiQ 0# 10#;
              10# -> goz_aiQ 0# 11#;
              11# -> goz_aiQ 0# 12#;
              12# -> goz_aiQ 0# 13#;
              13# -> goz_aiQ 0# 14#;
              14# -> goz_aiQ 0# 15#;
              15# -> goz_aiQ 0# 16#
            } } in
      case ># 4# 20# of {
        __DEFAULT -> jump flarge_szr (+# 4# 20#);
        1# -> jump flarge_szr (-# 4# -20#)
      }

But that's pretty silly, as it will give rise to further specializations for goz_aiQ 0# 2#, goz_aiQ 0# 3#, ... and so on.

If we were to run the simple optimizer on the specialized RHS instead, I imagine we would do a lot better.

The case should constant fold away:

      case ># 4# 20# of {
        __DEFAULT -> jump flarge_szr (+# 4# 20#);
        1# -> jump flarge_szr (-# 4# -20#)
      }
=>
      jump flarge_szr (+# 4# 20#);
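
As a rough illustration of the folding step above, here is a minimal sketch over a toy expression type. It is not GHC's Core and not how the simple optimizer or the built-in rules are implemented; Expr, fold, arith and example are hypothetical names made up for this sketch. Unlike the rewrite shown above, the toy also folds the literal addition in the jump argument.

```haskell
import Data.Maybe (fromMaybe)

-- Toy expression language (hypothetical, NOT GHC Core).
data Expr
  = Lit Int
  | Add Expr Expr
  | Sub Expr Expr
  | Gt Expr Expr                  -- yields Lit 1 / Lit 0, mirroring ># returning 1# / 0#
  | Case Expr [(Int, Expr)] Expr  -- scrutinee, literal alternatives, default branch
  | Jump String Expr              -- jump to a named join point with one argument
  deriving (Eq, Show)

-- Bottom-up constant folding: evaluate operations on literals and
-- pick the matching branch when a case scrutinee becomes a literal.
fold :: Expr -> Expr
fold (Lit n) = Lit n
fold (Add a b) = arith Add (+) (fold a) (fold b)
fold (Sub a b) = arith Sub (-) (fold a) (fold b)
fold (Gt a b) =
  case (fold a, fold b) of
    (Lit x, Lit y) -> Lit (if x > y then 1 else 0)
    (a', b')       -> Gt a' b'
fold (Case scrut alts def) =
  case fold scrut of
    Lit n  -> fold (fromMaybe def (lookup n alts))
    scrut' -> Case scrut' [(k, fold rhs) | (k, rhs) <- alts] (fold def)
fold (Jump j arg) = Jump j (fold arg)

-- Fold a binary arithmetic node when both operands are literals,
-- otherwise rebuild the node unchanged.
arith :: (Expr -> Expr -> Expr) -> (Int -> Int -> Int) -> Expr -> Expr -> Expr
arith _ op (Lit x) (Lit y) = Lit (x `op` y)
arith con _ a b = con a b

-- The case expression from the specialized RHS, transcribed into the toy type.
example :: Expr
example =
  Case (Gt (Lit 4) (Lit 20))
       [(1, Jump "flarge_szr" (Sub (Lit 4) (Lit (-20))))]
       (Jump "flarge_szr" (Add (Lit 4) (Lit 20)))
```

In GHCi, fold example evaluates to Jump "flarge_szr" (Lit 24): the comparison folds to 0, so only the default branch survives.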

With just a single occurrence of the join point left, it should inline. That allows more case-of-case, eventually resulting in just a constant integer.
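
To make the first round of this concrete, the sketch below evaluates f 4# by hand at the source level, which is the argument the next goz call would receive once the comparison is folded and the join point is inlined. It only covers that single step. The flarge here is a condensed form of the fifteen equations in the example (an assumption made to keep the sketch short), and nextArg is a hypothetical name.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts

-- f from the example; flarge is condensed: the original maps
-- 1# .. 15# to their successor and leaves every other value unchanged.
f :: Int# -> Int#
f x
  | isTrue# (x ># 20#) = flarge (x -# -20#)
  | otherwise          = flarge (x +# 20#)
  where
    flarge y
      | isTrue# (y >=# 1#), isTrue# (y <=# 15#) = y +# 1#
      | otherwise                               = y

-- Boxed so it can live at the top level (unlifted top-level bindings
-- are not allowed).
nextArg :: Int
nextArg = I# (f 4#)
```

Here nextArg is 24: the comparison 4# ># 20# is false, so the otherwise branch computes flarge 24#, which falls through to the default equation.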

I think we should just do the same as we do for the type class specializer. As I understand it, when we specialize a RHS there, we run the simple optimizer on it in order to discover additional opportunities for specialization.
