Alfredo Di Napoli · fedfefc8
--- a/Errors-as-(structured)-values.md
+++ b/Errors-as-(structured)-values.md
@@ -67,9 +67,9 @@ data WarnReason

 </details>

-# New API, initial attempt (start here)
+# New API

-This is a first sketch of the new API:
+Let's start by making `Messages` and `MsgEnvelope` polymorphic over `e`, the type of the particular message they now carry:

 ```haskell

@@ -82,72 +82,14 @@ data MsgEnvelope e = MsgEnvelope
   , errMsgSeverity    :: Severity
   } deriving Functor

-data MessageClass
-  = MCOutput
-  | MCFatal
-  | MCInteractive
-  | MCDump
-  | MCInfo
-  | MCDiagnostic Severity
-  deriving (Eq, Show)
-
-data Severity
-  = SevWarning !WarnReason -- ^ Born as a warning or demoted from an error.
-  | SevError   !ErrReason  -- ^ Born as an error  or promoted from a warning (e.g. -Werror)
-  deriving (Eq, Show)
-
-class RenderableDiagnostic a where
-  renderDiagnostic :: a -> DecoratedSDoc
-
+-- Text organized into bullets.
 newtype DecoratedSDoc = Decorated { unDecorated :: [SDoc] }

 -- .. operations on messages
 ```

-## New API key points
-
-API design explanation/considerations:
-
-* `Messages` is now a `newtype` so it can be expressed in terms of an opaque interface, and
-  it's parameterised over an abstract payload `e`. It's just a bag of **envelopes**;
-
-* The `ErrMsg` is renamed `MsgEnvelope` and has to be intended as an envelope
-  that carries information about the _content_ it stores inside. We can call such
-  envelope content the **diagnostic**;
-
-* The old `Severity` type is split into two types, `MessageClass` and `Severity`. The former
-  is the _class_ of the message (is this a debug message? A dump one? An information log
-  emitted by GHC somewhere?) whereas the `Severity` is the severity of the **diagnostic**. This
-  split **prevents** the construction of **impossible states**, like creating a `MsgEnvelope`
-  which `Severity` is `MCDump`, for example.
-
-* The **diagnostic** is the _envelope content_ of a `MsgEnvelope`, and it characterises the
-  particular provenance of the envelope (is this a parser error? Is this a TcRn warning?). For
-  example, `MsgEnvelope PsMessage` is an envelope which was created during GHC parsing phase,
-  and represents a parsing diagnostic (either an error or a warning);
-
-* A `MsgEnvelope` has a `Severity`, which type reflects the fluid relationship between
-  warnings and errors. The `Messages` type simply collects facts about the GHC running program:
-  peeking into the individual envelope tells us:
-     `a)` If this is an error or a warning;
-     `b)` What does this `MsgEnvelope` carries inside its `errMsgDiagnostic`;
-
-* There is a **fluid** relationship between warnings and errors. A warning can be turned
-  into a fatal error and an error can be relaxed for example by deferring type errors. We
-  can distinguish between warnings and errors by looking at each `MsgEnvelope`'s `Severity`;
-
-* We should try to give a `MsgEnvelope` the right `Severity` "at birth", without doing dodgy
-  demotions or promotions later in the codebase, as it makes much harder to track down the
-  precise semantic of the diagnostic, or even trying to reconstruct the original "provenance"
-  (was this a warning now turned into an error? If yes, when?);
-
-* We render back the messages into an `SDoc` via the `RenderableDiagnostic` type class. We
-  use `renderDiagnostic` to turn the structured message into something that can be printed
-  on screen by GHC when it needs to report facts about the compiled program (errors, warnings
-  etc);
+This allows us to move away from `SDoc` and simply instantiate `e` with a structured message specific to a particular compilation phase (parsing, TcRn, ...).

-* A `DecoratedSDoc` is a newtype that allows us to collect a list of `SDoc` from various
-  printing functions and place a bullet between each of them, so that they render nicely.

 # The envelope contents (i.e. the diagnostics)

@@ -156,10 +98,10 @@ We can get each subsystem to define their own message (diagnostic) types. At the

 ``` haskell
 -- somewhere in the parser
-data PsMessage = PsMessage DecoratedSDoc
+data PsMessage = PsUnknownMessage DecoratedSDoc

 -- in GHC.Tc.Monad/Types?
-data TcRnMessage = TcRnMessage DecoratedSDoc
+data TcRnMessage = TcRnUnknownMessage DecoratedSDoc

 -- somewhere under GHC.Driver
 data GhcMessage where
@@ -181,7 +123,7 @@ With those types in place, we could begin instantiating `e` to the relevant type

 Finally, we could start turning concrete errors into dedicated constructors of `PsMessage`/`TcRnMessage`. Starting slowly with simple [`not in scope` errors](https://gitlab.haskell.org/ghc/ghc/-/blob/master/compiler/GHC/Rename/Unbound.hs#L64) and the likes, before converting over [the entire typechecking error infrastructure](https://gitlab.haskell.org/ghc/ghc/-/blob/master/compiler/GHC/Tc/Errors.hs) and more.  For example:
 ```
-data TcRnMessage = TcRnMessage DecoratedSDoc
+data TcRnMessage = TcRnUnknownMessage DecoratedSDoc
  | OutOfScopeErr RdrName
  | ...
 ```
@@ -191,18 +133,41 @@ This might involve systematically retaining a bit more information (context line

 At the "top level", in the driver, where we call the different subsystems to process Haskell modules, we would end up accumulating and reporting `GhcMessage` values. The goal is to have the GHC program emit the exact same diagnostics as it does today, but affect the API in such a way that GHC API users would at this point get a chance to work with the algebraic error descriptions, making inspection and custom treatment a whole lot easier to implement. We could perhaps even demonstrate it at this point by implementing a little "demo" of sorts for this new way to consume errors.

+# New API key points

-# Improving the proposal
+API design explanation/considerations:

-The "New API" design above suffers from an infelicity: duplication. Imagine we have
+* `Messages` is now a `newtype` so it can be expressed in terms of an opaque interface, and
+  it's parameterised over an abstract payload `e`. It's just a bag of **envelopes**;

-```hs
-data TcRnMessage = TcRnOutOfScope ... | TcRnBadTelescope ... | ...
-```
+* `ErrMsg` is renamed to `MsgEnvelope` and should be understood as an envelope
+  that carries information about the _content_ it stores inside. We call this
+  envelope content the **diagnostic**;
+
+* The old `Severity` type is split into two types, `MessageClass` and `Severity`. The former
+  is the _class_ of the message (is this a debug message? A dump one? An information log
+  emitted by GHC somewhere?) whereas the `Severity` is the severity of the **diagnostic**. This
+  split **prevents** the construction of **impossible states**, like creating a `MsgEnvelope`
+  whose `Severity` is `MCDump`, for example;
+
+* The **diagnostic** is the _envelope content_ of a `MsgEnvelope`, and it characterises the
+  particular provenance of the envelope (is this a parser error? Is this a TcRn warning?). For
+  example, `MsgEnvelope PsMessage` is an envelope that was created during GHC's parsing phase,
+  and represents a parsing diagnostic (either an error or a warning);
+
+* A `MsgEnvelope` has a `Severity`, whose type reflects the fluid relationship between
+  warnings and errors. The `Messages` type simply collects facts about the GHC running program;
+  peeking into an individual envelope tells us:
+     `a)` Whether this is an error or a warning;
+     `b)` What this `MsgEnvelope` carries inside its `errMsgDiagnostic`;

-and we have a `MsgEnvelope TcRnMessage`. Such a structure effectively contains two "reasons" for the error: one encoded in the choice of constructor of `TcRnMessage`, and one in the payload of the `errMsgSeverity` field. This alternative is meant to address this problem.
+* There is a **fluid** relationship between warnings and errors. A warning can be turned
+  into a fatal error, and an error can be relaxed, for example by deferring type errors. We
+  can distinguish between warnings and errors by looking at each `MsgEnvelope`'s `Severity`;
+
+* A `DecoratedSDoc` is a newtype that allows us to collect a list of `SDoc` from various
+  printing functions and place a bullet between each of them, so that they render nicely.

-Key points to consider:
 * Every warning or error arises for a *reason*.
  * Most warnings are controlled by specific `-Wblah-blah` flags; we call these *reasons* for the warning.
  * Some warnings are not associated with a flag. We currently say these have "no reason" for arising, but really, we mean that there is no flag.