Commit 0a432d48 authored by chak

[project @ 2001-11-18 12:32:22 by chak]

Added a new section that covers lexing and parsing.
parent 116328b6
......@@ -6,7 +6,7 @@
<h1>The Glasgow Haskell Compiler (GHC) Commentary [v0.4]</h1>
<h1>The Glasgow Haskell Compiler (GHC) Commentary [v0.5]</h1>
<!-- Contributors: Whoever makes substantial additions or changes to the
document, please add your name and keep the order alphabetic. Moreover,
......@@ -46,6 +46,7 @@
<h2>The Beast Dissected</h2>
<li><a href="the-beast/driver.html">The Glorious Driver</a>
<li><a href="the-beast/syntax.html">Just Syntax</a>
<li><a href="the-beast/basicTypes.html">The Basics</a>
<li><a href="the-beast/vars.html">The Real Story about Variables, Ids, TyVars, and the like</a>
<li><a href="the-beast/typecheck.html">Checking Types</a>
......@@ -78,7 +79,7 @@
<!-- hhmts start -->
Last modified: Tue Nov 13 10:32:11 EST 2001
Last modified: Sat Nov 17 14:10:48 EST 2001
<!-- hhmts end -->
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
<title>The GHC Commentary - Just Syntax</title>
<h1>The GHC Commentary - Just Syntax</h1>
The lexical and syntactic analysers for Haskell programs are located in
<h2>The Lexer</h2>
The lexer is a rather tedious piece of Haskell code contained in the
module <a
Its complexity stems partly from covering, in addition to Haskell 98,
the whole range of GHC language extensions, and partly from its ability
to analyse interface files as well as normal Haskell source. The lexer
defines a parser monad <code>P a</code>, where <code>a</code> is the
type of the result expected from a successful parse. More precisely, a
result of type
<pre>
data ParseResult a = POk PState a
                   | PFailed Message</pre>
is produced with <code>Message</code> being from <a
(and currently is simply a synonym for <code>SDoc</code>).
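To make the shape of such a monad concrete, here is a minimal sketch. It assumes that <code>P a</code> is a function from a parser state to a <code>ParseResult a</code>; the real definition in GHC may thread more arguments. The <code>PState</code> and <code>Message</code> types below are simplified stand-ins, and <code>srcLine</code>, <code>unP</code>, <code>failP</code>, and <code>nextLine</code> are hypothetical names used only for illustration.

```haskell
-- Simplified stand-ins: GHC's real PState has many more fields, and
-- Message is a synonym for SDoc rather than String.
type Message = String

data PState = PState { srcLine :: Int }

data ParseResult a = POk PState a
                   | PFailed Message

-- Assumption: the monad is a state transformer that can fail.
newtype P a = P { unP :: PState -> ParseResult a }

instance Functor P where
  fmap f (P m) = P $ \s -> case m s of
    POk s' a    -> POk s' (f a)
    PFailed msg -> PFailed msg

instance Applicative P where
  pure a = P $ \s -> POk s a
  P mf <*> P ma = P $ \s -> case mf s of
    POk s' f    -> case ma s' of
                     POk s'' a   -> POk s'' (f a)
                     PFailed msg -> PFailed msg
    PFailed msg -> PFailed msg

instance Monad P where
  -- Thread the updated state into the continuation; stop at the first failure.
  P m >>= k = P $ \s -> case m s of
    POk s' a    -> unP (k a) s'
    PFailed msg -> PFailed msg

-- Abort the parse with a message (hypothetical helper).
failP :: Message -> P a
failP msg = P $ \_ -> PFailed msg

-- Advance the line counter and return the new line number (hypothetical helper).
nextLine :: P Int
nextLine = P $ \s -> let n = srcLine s + 1
                     in POk (s { srcLine = n }) n
```

In this sketch, running <code>unP (nextLine >> nextLine)</code> on an initial state yields a <code>POk</code> carrying the updated state, while any <code>failP</code> in the chain short-circuits to <code>PFailed</code>.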
The record type <code>PState</code> contains information such as the
current source location, buffer state, contexts for layout processing,
and whether Glasgow extensions are accepted (either due to
<code>-fglasgow-exts</code> or due to reading an interface file). Most
of the fields of <code>PState</code> store unboxed values; in fact, even
the flag indicating whether Glasgow extensions are enabled is
represented by an unboxed integer instead of by a <code>Bool</code>. My
(= chak's) guess is that this is to avoid having to perform a
<code>case</code> on a boxed value in the inner loop of the lexer.
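The following sketch illustrates the idea of keeping such a flag as an unboxed integer. It is not the actual <code>PState</code> definition: the record <code>LexState</code>, its fields, and the function <code>glaExtsOn</code> are hypothetical, and the encoding (0 for off, anything else for on) is an assumption.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int#)

-- Hypothetical state record with unboxed fields. A field of type Int#
-- is stored directly in the constructor and can never be an
-- unevaluated thunk.
data LexState = LexState
  { glaExtsFlag :: Int#   -- assumed encoding: 0# = off, otherwise on
  , lineNo      :: Int#
  }

-- Scrutinising an Int# is a plain comparison on a machine integer;
-- there is no boxed value to evaluate first.
glaExtsOn :: LexState -> Bool
glaExtsOn st = case glaExtsFlag st of
  0# -> False
  _  -> True
```

A strict, unpacked <code>Bool</code> field would be the more readable alternative; whether the unboxed representation makes a measurable difference is not something this sketch demonstrates.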
The same lexer is used by the Haskell source parser, the Haskell
interface parser, and the package configuration parser.
<h2>The Haskell Source Parser</h2>
The parser for Haskell source files is defined in the form of a parser
specification for the parser generator <a
href="">Happy</a> in the file <a
The parser exports three entry points for parsing entire modules
(<code>parseModule</code>), individual statements
(<code>parseStmt</code>), and individual identifiers
(<code>parseIdentifier</code>), respectively. The last two are needed
for GHCi. All three require a parser state (of type
<code>PState</code>) and are invoked from <a
<h2>The Haskell Interface Parser</h2>
The parser for interface files is also generated by Happy from <a href=""><code>ParseIface.y</code></a>.
Its main routine <code>parseIface</code> is invoked from <a href=""><code>RnHiFiles</code></a><code>.readIface</code>.
<h2>The Package Configuration Parser</h2>
The parser for configuration files is by far the smallest of the three
and is defined in <a href=""><code>ParsePkgConf.y</code></a>.
It exports <code>loadPackageConfig</code>, which is used by <a href=""><code>DriverState</code></a><code>.readPackageConf</code>.
<!-- hhmts start -->
Last modified: Sun Nov 18 21:22:38 EST 2001
<!-- hhmts end -->