Commit 4e31cae2 authored by chak's avatar chak
Browse files

[project @ 2002-01-16 23:34:31 by chak]

Added an intro to the workings of the desugarer and details about the pattern
matching compiler.
parent dff6ab2d
......@@ -6,7 +6,7 @@
<h1>The Glasgow Haskell Compiler (GHC) Commentary [v0.7]</h1>
<h1>The Glasgow Haskell Compiler (GHC) Commentary [v0.9]</h1>
<!-- Contributors: Whoever makes substantial additions or changes to the
document, please add your name and keep the order alphabetic. Moreover,
......@@ -51,6 +51,7 @@
<li><a href="the-beast/basicTypes.html">The Basics</a>
<li><a href="the-beast/vars.html">The Real Story about Variables, Ids, TyVars, and the like</a>
<li><a href="the-beast/typecheck.html">Checking Types</a>
<li><a href="the-beast/desugar.html">Sugar Free: From Haskell To Core</a>
<li><a href="the-beast/simplifier.html">The Mighty Simplifier</a>
<li><a href="the-beast/mangler.html">The Evil Mangler</a>
<li><a href="the-beast/alien.html">Alien Functions</a>
......@@ -80,7 +81,7 @@
<!-- hhmts start -->
Last modified: Tue Jan 8 18:23:25 EST 2002
Last modified: Wed Jan 16 13:09:56 EST 2002
<!-- hhmts end -->
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
<title>The GHC Commentary - Sugar Free: From Haskell To Core</title>
<h1>The GHC Commentary - Sugar Free: From Haskell To Core</h1>
Up until after type checking, GHC keeps the source program in an
abstract representation of Haskell source without removing any of the
syntactic sugar (such as, list comprehensions) that could easily be
represented by more primitive Haskell. This complicates part of the
front-end considerably as the abstract syntax of Haskell (as exported by
the module <a
is much more complex than a simplified representation close to, say, the
<a href="">Haskell
Kernel</a> would be. However, having a representation that is as close
as possible to the surface syntax simplifies the generation of clear
error messages. As GHC (quite in contrast to "conventional" compilers)
prints code fragments as part of error messages, the choice of
representation is especially important.
Nonetheless, as soon as the input has passed all static checks, it is
transformed into GHC's principal intermediate language that goes by the
name of <em>Core</em> and whose representation is exported by the
module <a
All following compiler phases, except code generation operate on Core.
Due to Andrew Tolmach's effort, there is also an <a
representation for Core.</a>
The conversion of the compiled module from <code>HsSyn</code> into that
of <code>CoreSyn</code> is performed by a phase called the
<em>desugarer</em>, which is located in
It's operation is detailed in the following.
<h2>Auxilliary Functions</h2>
The modules <a
defines the desugarer monad (of type <code>DsM</code>) which maintains
the environment needed for desugaring. In particular, it encapsulates a
unique supply for generating new variables, a map to lookup standard
names (such as functions from the prelude), a source location for error
messages, and a pool to collect warning messages generated during
desugaring. Initialisation of the environment happens in the function <a
which is also the main entry point into the desugarer.
The generation of Core code often involves the use of standard functions
for which proper identifiers (i.e., values of type <code>Id</code> that
actually refer to the definition in the right Prelude) need to be
obtained. This is supported by the function
<code>DsMonad.dsLookupGlobalValue :: Name -> DsM Id</code>.
<h2>Pattern Matching</h2>
Nested pattern matching with guards and everything is translated into
the simple, flat case expressions of Core by the following modules:
<dd>This modules contains the main pattern-matching compiler in the form
of a function called <code>match</code>. There is some documentation
as to how <code>match</code> works contained in the module itself.
Generally, the implemented algorithm is similar to the one described
in Phil Wadler's Chapter ? of Simon Peyton Jones' <em>The
Implementation of Functional Programming Languages</em>.
<code>Match</code> exports a couple of functions with not really
intuitive names. In particular, it exports <code>match</code>,
<code>matchWrapper</code>, <code>matchExport</code>, and
<code>matchSimply</code>. The function <code>match</code>, which is
the main work horse, is only used by the other matching modules. The
function <code>matchExport</code> - despite it's name - is merely used
internally in <code>Match</code> and handles warning messages (see
below for more details). The actual interface to the outside is
<code>matchWrapper</code>, which converts the output of the type
checker into the form needed by the pattern matching compiler (i.e., a
list of <code>EquationInfo</code>). Similar in function to
<code>matchWrapper</code> is <code>matchSimply</code>, which provides
an interface for the case where a single expression is to be matched
against a single pattern (as, for example, is the case in bindings in
a <code>do</code> expression).
<dd>This module generates code for a set of alternative constructor
patterns that belong to a single type by means of the routine
<code>matchConFamily</code>. More precisely, the routine gets a set
of equations where the left-most pattern of each equation is a
constructor pattern with a head symbol from the same type as that of
all the other equations. A Core case expression is generated that
distinguihes between all these constructors. The routine is clever
enough to generate a sparse case expression and to add a catch-all
default case only when needed (i.e., if the case expression isn't
exhaustive already). There is also an explanation at the start of the
<dd>Generates code for a set of alternative literal patterns by means of
the routine <code>matchLiterals</code>. The principle is similar to
that of <code>matchConFamily</code>, but all left-most patterns are
literals of the same type.
<dd>This module provides a set of auxilliary definitions as well as the
data types <code>EquationInfo</code> and <code>MatchResult</code> that
form the input and output, respectively, of the pattern matching
<dd>This module does not really contribute the compiling pattern
matching, but it inspects sets of equations to find whether there are
any overlapping patterns or non-exhaustive pattern sets. This task is
implemented by the function <code>check</code>, which returns a list of
patterns that are part of a non-exhaustive case distinction as well as a
set of equation labels that can be reached during execution of the code;
thus, the remaining equations are shadowed due to overlapping patterns.
The function <code>check</code> is invoked and its result converted into
suitable warning messages by the function <code>Match.matchExport</code>
(which is a wrapper for <code>Match.match</code>).
The central function <code>match</code>, given a set of equations,
proceeds in a number of steps:
<li>It starts by desugaring the left-most pattern of each equation using
the function <code>tidy1</code> (indirectly via
<code>tidyEqnInfo</code>). During this process, non-elementary
pattern (e.g., those using explicit list syntax <code>[x, y, ...,
z]</code>) are converted to a standard constructor pattern and also
irrefutable pattern are removed.
<li>Then, a process called <em>unmixing</em> clusters the equations into
blocks (without re-ordering them), such that the left-most pattern of
all equations in a block are either all variables, all literals, or
all constructors.
<li>Each block is, then, compiled by <code>matchUnmixedEqns</code>,
which forwards the handling of literal pattern blocks to
<code>MatchLit.matchLiterals</code>, of constructor pattern blocks to
<code>MatchCon.matchConFamily</code>, and hands variable pattern
blocks back to <code>match</code>.
<!-- hhmts start -->
Last modified: Wed Jan 16 23:40:16 EST 2002
<!-- hhmts end -->
......@@ -73,8 +73,11 @@ data ParseResult a = POk PState a
pattern. This and similar checks are performed by the routines from <a
href=""><code>ParseUtil</code></a>. In
some cases, these routines do, in addition to checking for
wellformedness, also transform the parse tree, such that it fits into the
syntactic context in which it has been parsed.
wellformedness, also transform the parse tree, such that it fits into
the syntactic context in which it has been parsed; in fact, this happens
for patterns, which are transformed from a representation of type
<code>RdrNameHsExpr</code> into a representation of type
<h2>The Haskell Interface Parser</h2>
......@@ -89,7 +92,7 @@ data ParseResult a = POk PState a
<!-- hhmts start -->
Last modified: Wed Dec 12 17:45:36 EST 2001
Last modified: Wed Jan 16 00:30:14 EST 2002
<!-- hhmts end -->
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment