# Refactor CafInfo type for better compiler error checking

GHC issue #17570: https://gitlab.haskell.org/ghc/ghc/-/issues/17570
Ömer Sinan Ağacan, 2020-01-15T18:40:29Z

## Motivation
Currently the CafInfo type is defined like this:
```haskell
data CafInfo
  = MayHaveCafRefs -- i.e. CAFFY
  | NoCafRefs      -- i.e. not CAFFY
```
Every `Id` has a `CafInfo` in its `IdInfo`:
```haskell
data IdInfo = IdInfo
  { ...
  , cafInfo :: CafInfo
  ...
  }
```
The problem is that this type forces us to give every `Id` a `CafInfo` on
initialization, even though in practice we usually only know the `CafInfo` of
an `Id` after the relevant analysis.
(There are cases where we know the `CafInfo` on initialization, but those are
the exception rather than the norm.)
Because it's safe to assume an `Id` is CAFFY, the default `CafInfo` value is
`MayHaveCafRefs`, and most `Id`s are thus initialized as CAFFY.
As a consequence, if we have a bug in the compiler and don't assign a
`CafInfo` to an `Id` (maybe the analysis missed the `Id`, or there are bugs in
the plumbing code), everything silently works, because treating a non-CAFFY
`Id` as CAFFY does not cause any problems in the generated code.
## Proposal
I propose adding one more constructor to the type:
```haskell
data CafInfo
  = MayHaveCafRefs
  | NoCafRefs
  -- New constructor:
  | UnknownCafInfo -- verbose name to avoid name clashes (sigh)
```
and initialize most `Id`s with `UnknownCafInfo`:
```haskell
vanillaCafInfo :: CafInfo
vanillaCafInfo = UnknownCafInfo
```
`CafInfo` queries on these `Id`s would then fail:
```haskell
mayHaveCafRefs :: HasCallStack => CafInfo -> Bool
mayHaveCafRefs MayHaveCafRefs = True
mayHaveCafRefs NoCafRefs      = False
mayHaveCafRefs UnknownCafInfo = pprPanic "mayHaveCafRefs" (text "Unknown CafInfo")
```
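
To try the idea outside the GHC source tree, here is a minimal standalone sketch. It assumes plain `error` as a stand-in for `pprPanic` (which is GHC-internal), and `mayHaveCafRefsMaybe` is a hypothetical helper that is not part of the proposal; it is only there to show what a non-panicking query could look like.

```haskell
import GHC.Stack (HasCallStack)

-- The three-state type from the proposal.
data CafInfo
  = MayHaveCafRefs
  | NoCafRefs
  | UnknownCafInfo
  deriving (Eq, Show)

-- Default for freshly created Ids, as proposed above.
vanillaCafInfo :: CafInfo
vanillaCafInfo = UnknownCafInfo

-- Same shape as the query above. ASSUMPTION: `error` replaces GHC's
-- `pprPanic` so that this file compiles with only `base`.
mayHaveCafRefs :: HasCallStack => CafInfo -> Bool
mayHaveCafRefs MayHaveCafRefs = True
mayHaveCafRefs NoCafRefs      = False
mayHaveCafRefs UnknownCafInfo = error "mayHaveCafRefs: Unknown CafInfo"

-- HYPOTHETICAL total variant (not in the proposal): returns Nothing
-- instead of failing when the analysis has not run yet.
mayHaveCafRefsMaybe :: CafInfo -> Maybe Bool
mayHaveCafRefsMaybe MayHaveCafRefs = Just True
mayHaveCafRefsMaybe NoCafRefs      = Just False
mayHaveCafRefsMaybe UnknownCafInfo = Nothing
```

Loaded in GHCi, `mayHaveCafRefs vanillaCafInfo` fails with an error and a call stack pointing at the caller, which is the point of the proposal: a missed `Id` becomes a loud failure instead of a silently conservative answer.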
## Implementation details
As of today this is a little tricky to implement because `CafInfo` is used
in various parts of the simplifier and code generator.
However, once !2100 and !1304 are merged it will be almost trivial:
after those patches `CafInfo`s will only be used by the SRT analysis pass
(`CmmBuildInfoTables`). `CLabel`s of the current module will have
`UnknownCafInfo` as their `CafInfo`, and `CmmBuildInfoTables` will be updated
to handle `UnknownCafInfo` (such labels will be considered definitions in the
current module).
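
A rough illustration of that last point is below. It is not the actual `CmmBuildInfoTables` code: `LabelTreatment` and `treatLabel` are hypothetical names, and the sketch only encodes the rule stated above, that a label with `UnknownCafInfo` is taken to be a definition in the current module.

```haskell
data CafInfo
  = MayHaveCafRefs
  | NoCafRefs
  | UnknownCafInfo
  deriving (Eq, Show)

-- HYPOTHETICAL: how an SRT-building pass might treat a referenced label.
data LabelTreatment
  = CurrentModuleDefinition -- CafInfo not known yet; the pass works it out
  | KnownCaffy              -- accurate info says the label may have CAF refs
  | KnownNotCaffy           -- accurate info says the label has no CAF refs
  deriving (Eq, Show)

-- HYPOTHETICAL dispatch on the three constructors.
treatLabel :: CafInfo -> LabelTreatment
treatLabel UnknownCafInfo = CurrentModuleDefinition
treatLabel MayHaveCafRefs = KnownCaffy
treatLabel NoCafRefs      = KnownNotCaffy
```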
Imported `Id`s are expected to have accurate (non-`UnknownCafInfo`) `CafInfo`s.
The interface file syntax does not need to change, but the interface file
generator will be changed to make sure no `Id` written to an interface file has
`UnknownCafInfo`: if an `Id` is in an interface file, it needs to have been
analyzed by `CmmBuildInfoTables` and have an accurate `CafInfo`.
(That does not imply that we'll be writing the `CafInfo`s to the interface files
with `-fomit-iface-pragmas`; we'll simply check that `Id`s have accurate
`CafInfo`s.)
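
The interface-file check could look roughly like the sketch below. It is a standalone model, not GHC code: `IfaceId`, `unanalysedIds` and `checkIfaceIds` are hypothetical names, and a real implementation would work on GHC's own `Id` and interface types.

```haskell
data CafInfo
  = MayHaveCafRefs
  | NoCafRefs
  | UnknownCafInfo
  deriving (Eq, Show)

-- HYPOTHETICAL stand-in for an Id about to be written to an interface file.
data IfaceId = IfaceId
  { ifaceIdName :: String
  , ifaceIdCaf  :: CafInfo
  } deriving (Eq, Show)

-- Names of Ids whose CafInfo was never filled in by the analysis.
unanalysedIds :: [IfaceId] -> [String]
unanalysedIds ids =
  [ ifaceIdName i | i <- ids, ifaceIdCaf i == UnknownCafInfo ]

-- Succeeds only if every Id has an accurate CafInfo; otherwise reports
-- the offending names so the compiler bug is easy to locate.
checkIfaceIds :: [IfaceId] -> Either String ()
checkIfaceIds ids =
  case unanalysedIds ids of
    []    -> Right ()
    names -> Left ("Ids with UnknownCafInfo in interface: " ++ unwords names)
```

The check only inspects the `CafInfo`; it does not decide whether the `CafInfo` is serialized, which matches the note about `-fomit-iface-pragmas` above.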
(See also https://gitlab.haskell.org/ghc/ghc/wikis/CafInfo-rework for the
progress of !2100 and !1304.)