
Visible rules unrelated to a module can affect how the simplifier optimises it

Summary

Rules loaded into the EPS (External Package State) affect the outcome of rebuilding a call in the simplifier, even when they are not part of the dependency closure of the module being compiled.

Because rules are loaded non-deterministically, the simplifier can produce different results from one build to the next, which breaks interface and object determinism.

In our example, this function application:

(GHC.Classes.<
  @GHC.Num.Integer.Integer
  GHC.Num.Integer.$fOrdInteger
  inp'_a18mlV
  (Data.Convertible.Base.convert
    @b_a18mkc
    @GHC.Num.Integer.Integer
    $dConvertible_a18mk6
    smallest_a18moQ))

by the time it is printed by -ddump-rule-rewrites, can look like either

Rule fired
    Rule: Class op <
    Module: (BUILTIN)
    Before: GHC.Classes.<
              TyArg GHC.Num.Integer.Integer
              ValArg GHC.Num.Integer.$fOrdInteger
    After:  GHC.Num.Integer.integerLt
    Cont:   ApplyToVal nodup hole GHC.Num.Integer.Integer
                                  -> GHC.Num.Integer.Integer -> GHC.Types.Bool
              inp'_a18mlV
            ApplyToVal nodup hole GHC.Num.Integer.Integer -> GHC.Types.Bool
              (Data.Convertible.Base.convert
                 @b_a18mkc
                 @GHC.Num.Integer.Integer
                 $dConvertible_a18mk6
                 smallest_a18moQ)
            Select nodup wild_a18lSg
            Select nodup wild_00
            Stop[BoringCtxt] GHC.Internal.Data.Either.Either
                               Data.Convertible.Base.ConvertError b_a18mkc

or

Rule fired
    Rule: Class op <
    Module: (BUILTIN)
    Before: GHC.Classes.<
              TyArg GHC.Num.Integer.Integer
              ValArg GHC.Num.Integer.$fOrdInteger
              ValArg inp'_a2iJ
              ValArg case ($dConvertible_a2rN
                           `cast` (Data.Convertible.Base.N:Convertible[0]
                                       <b_a2rC>_N <GHC.Num.Integer.Integer>_N
                                   :: Data.Convertible.Base.Convertible
                                        b_a2rC GHC.Num.Integer.Integer
                                      ~R# (b_a2rC
                                           -> Data.Convertible.Base.ConvertResult
                                                GHC.Num.Integer.Integer)))
                            smallest_a2if
                     of {
                       GHC.Internal.Data.Either.Left e_i3gv ->
                         Data.Convertible.Base.convert1 @GHC.Num.Integer.Integer e_i3gv;
                       GHC.Internal.Data.Either.Right r_i3gH -> r_i3gH
                     }
    After:  GHC.Num.Integer.integerLt
              [<as above...>]
    Cont:   Select nodup wild_a3iH
            Select nodup wild_00
            Stop[BoringCtxt] GHC.Internal.Data.Either.Either
                               Data.Convertible.Base.ConvertError b_a2rC

Diagnosis

The < function has arity 4 here, counting its type and dictionary arguments. In the first outcome, only the first two arguments (TyArg GHC.Num.Integer.Integer and ValArg GHC.Num.Integer.$fOrdInteger) are moved from the continuation onto the call being rebuilt. In the second outcome, all four arguments are moved out of the continuation.

The cause of this discrepancy is the nr_wanted == 0 check in:

rebuildCall env info@(ArgInfo { ai_fun = fun, ai_args = rev_args
                              , ai_rewrite = TryRules nr_wanted rules }) cont
  | nr_wanted == 0 || no_more_args

nr_wanted is the maximum arity among the loaded, visible rules for the function. In the first outcome, nr_wanted == 2, so we stop after moving the first two arguments. In the second outcome, nr_wanted == 4, so we end up moving all four arguments.
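The effect of the stopping condition can be illustrated with a toy model. This is only a sketch: absorbArgs and callArgs are hypothetical names, arguments are modelled as plain strings, and none of this is GHC's actual simplifier code. It assumes that one unit of nr_wanted is consumed per argument moved onto the call.

```haskell
-- Toy model only: these names are hypothetical and are not GHC's real types.

-- The four arguments of the call from the example, in application order.
callArgs :: [String]
callArgs = ["@Integer", "$fOrdInteger", "inp'", "convert smallest"]

-- Move arguments from the continuation onto the call until either
-- nr_wanted reaches 0 or the continuation has no more arguments.
-- Returns (arguments on the call when rules are tried,
--          arguments still left in the continuation).
absorbArgs :: Int -> [String] -> ([String], [String])
absorbArgs _ [] = ([], [])                -- no_more_args
absorbArgs 0 pending = ([], pending)      -- nr_wanted == 0
absorbArgs nrWanted (arg : rest) =
  let (absorbed, left) = absorbArgs (nrWanted - 1) rest
  in (arg : absorbed, left)

main :: IO ()
main = do
  print (absorbArgs 2 callArgs)  -- models the first outcome
  print (absorbArgs 4 callArgs)  -- models the second outcome
```

With nr_wanted = 2 the model leaves the last two arguments in the continuation, and with nr_wanted = 4 it moves all of them, matching the two Before/Cont shapes in the dumps.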

Tracing no_more_args and rules shows this for the first outcome:

False [Built in rule for <: "Class op <"]

And this for the second. Note the additional visible rules, which have arity 4 rather than the arity 2 of the built-in rule for <:

no_more_args: True
rules: ["ByteString.Lazy length/<N -> compareLength/==LT" [~1]
    forall ($dOrd_a18lwL :: Ord Int64)
           (t_a18lkw :: ByteString)
           (n_a18liP :: Int64).
      < @Int64 $dOrd_a18lwL (length t_a18lkw) n_a18liP
      = case compareLength t_a18lkw n_a18liP of {
          __DEFAULT -> False;
          LT -> True
        },
"ByteString.Lazy <N/length -> compareLength/==GT" [~1]
    forall ($dOrd_a18liK :: Ord Int64)
           (t_a18liJ :: ByteString)
           (n_a18liI :: Int64).
      < @Int64 $dOrd_a18liK n_a18liI (length t_a18liJ)
      = case compareLength t_a18liJ n_a18liI of {
          __DEFAULT -> False;
          GT -> True
        },
Built in rule for <: "Class op <"]
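As a rough sketch of why the two traces lead to different values of nr_wanted, the rule sets can be modelled as lists of (name, arity) records. The Rule type and nrWantedFor are hypothetical stand-ins for illustration, not GHC's real definitions; the arities are read off the traces (2 for the built-in rule, 4 for each ByteString rule).

```haskell
-- Hypothetical model of a rewrite rule: just a name and an arity.
data Rule = Rule { ruleName :: String, ruleArity :: Int }

-- nr_wanted modelled as the maximum arity over the visible rules
-- (0 when there are none).
nrWantedFor :: [Rule] -> Int
nrWantedFor rules = maximum (0 : map ruleArity rules)

-- Rules visible in the first outcome.
onlyBuiltin :: [Rule]
onlyBuiltin = [Rule "Class op <" 2]

-- Rules visible in the second outcome: the ByteString.Lazy rules
-- happen to be loaded as well.
withByteStringRules :: [Rule]
withByteStringRules =
  [ Rule "ByteString.Lazy length/<N -> compareLength/==LT" 4
  , Rule "ByteString.Lazy <N/length -> compareLength/==GT" 4
  , Rule "Class op <" 2
  ]

main :: IO ()
main = do
  print (nrWantedFor onlyBuiltin)
  print (nrWantedFor withByteStringRules)
```

In this model the result depends only on which rules happen to be in the list, which is the source of the non-determinism described above.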

Steps to reproduce

The example we've been looking at is Data.Convertible.Utils from the convertible package.

cabal get convertible
cd convertible
cabal build -w <ghc-HEAD>
mv dist-newstyle first-dist-newstyle
cabal build -w <ghc-HEAD>
diff --recursive dist-newstyle first-dist-newstyle

If you get no differences in the interface files, remove dist-newstyle and try again until you see something like:

Files /var/folders/tv/35hlch6s3y15hfvndc71l6d40000gn/T/tmp.wOCHDBAqKF/convertible-1.1.1.1/dist-newstyle/build/aarch64-osx/ghc-9.11.20240807/convertible-1.1.1.1/build/Data/Convertible.hi and /var/folders/tv/35hlch6s3y15hfvndc71l6d40000gn/T/tmp.wOCHDBAqKF/convertible-1.1.1.1/first-dist-newstyle/build/aarch64-osx/ghc-9.11.20240807/convertible-1.1.1.1/build/Data/Convertible.hi differ

Then run the usual ghc-HEAD --show-iface <.hi> on each of the differing files to get human-readable output.

Expected behavior

The output of the simplifier should be stable to guarantee interface and object determinism.

Environment

  • GHC version used: HEAD (this is an issue exposed only after 9.10)

@simonpj We would appreciate your input on what you think we should do about this.

Edited by Rodrigo Mesquita