Overview
GHC is structured into two parts:
- The ghc package (in subdirectory compiler), which implements almost all GHC's functionality. It is an ordinary Haskell library, and can be imported into a Haskell program by saying import GHC.
- The ghc binary (in subdirectory ghc), which imports the ghc package and implements the I/O for the ghci interactive loop.
Here's an overview of the module structure of the top levels of the GHC library. (Note: more precisely, this is the plan. Currently the module Make below is glommed into the giant module GHC.)
            |---------------------------------|
            |               GHC               |
            | The root module for the GHC API |
            |        Very little code;        |
            |      just simple wrappers       |
            |---------------------------------|
                    /                 \
                   /                   \
                  /                     \
|------------------------|       |------------------------|
|    GHC.Driver.Make     |       |    GHC.Runtime.Eval    |
|   Implements --make    |       |  Stuff to support the  |
|  Deals with compiling  |       | GHCi interactive envt  |
|    multiple modules    |       |                        |
|------------------------|       |------------------------|
            |                                |
            |                                |
            |      --------------------      |
- - - - - - | - - -| GHC.Driver.Monad |- - - | - - - - - -
            |      --------------------      |
            |                                |
            |                                |
|-------------------------|                  |
|   GHC.Driver.Pipeline   |                  |
|  Deals with compiling   |                  |
|    *a single module*    |                  |
| through all its stages  |                  |
|  (cpp, unlit, compile,  |                  |
|   assemble, link etc)   |                  |
|-------------------------|                  |
             \                               |
              \                              |
               \                             |
     |----------------------------------------------|
     |               GHC.Driver.Main                |
     | Compiling a single module (or expression or  |
     | stmt) to bytecode, or to a M.hc or M.s file  |
     |----------------------------------------------|
        |       |          |          |          |
      Parse   Rename   Typecheck   Optimise   CodeGen
A few functions are important when tracing how control gets from GHC to GHC.Driver.Main (formerly known as HscMain):
- compileOne is the compilation entry point for --make mode (it's invoked by upsweep_mod in GHC.Driver.Make). It calls hscIncrementalCompile, and then fires up the GHC.Driver.Pipeline to finish up code generation.
- runPhase is the compilation entry point for -c mode. It successively processes files until we have an Hsc input file, at which point it calls hscIncrementalCompile. The rest of the pipeline is handled automatically by the driver.
- hscIncrementalCompile is the primary entry point for GHC.Driver.Main. It calls hscIncrementalFrontend, and if typechecking was necessary, it also runs the desugarer and simplifier, and writes out the interface file.
- hscIncrementalFrontend is the recompilation checker: it checks whether we actually need to compile the file in question; if so, it calls genericHscFrontend to actually parse and typecheck. (Note that this does NOT do any backend work: that is handled by hscIncrementalCompile.)
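The division of labour between the recompilation check and the rest of the compile step can be illustrated with a toy model. This is only a sketch: every type and function below (RecompStatus, FrontendResult, checkRecomp, incrementalFrontend, incrementalCompile) is a hypothetical stand-in rather than the real GHC.Driver.Main API, and the check is reduced to comparing two integers, whereas the real check is considerably more involved.

```haskell
module Main where

-- Toy model only: these names are hypothetical stand-ins,
-- not the real GHC.Driver.Main API.

-- | Outcome of the (simplified) recompilation check.
data RecompStatus = UpToDate | NeedsRecompile
  deriving (Eq, Show)

-- | Result of the frontend stage.
data FrontendResult
  = Skipped            -- nothing to do, the module is up to date
  | Typechecked String -- name of the module that was parsed and typechecked
  deriving (Eq, Show)

-- | Assumption for illustration: a module is up to date when a recorded
-- fingerprint exists and equals the current one.
checkRecomp :: Maybe Int -> Int -> RecompStatus
checkRecomp (Just old) new | old == new = UpToDate
checkRecomp _ _ = NeedsRecompile

-- | Plays the role of hscIncrementalFrontend: decide whether work is
-- needed, and only then "parse and typecheck". No backend work here.
incrementalFrontend :: String -> Maybe Int -> Int -> FrontendResult
incrementalFrontend modName old new =
  case checkRecomp old new of
    UpToDate       -> Skipped
    NeedsRecompile -> Typechecked modName

-- | Plays the role of hscIncrementalCompile: run the frontend, then the
-- later steps only if typechecking actually happened. Returns a log of
-- the steps taken.
incrementalCompile :: String -> Maybe Int -> Int -> [String]
incrementalCompile modName old new =
  case incrementalFrontend modName old new of
    Skipped -> ["skip: " ++ modName ++ " is up to date"]
    Typechecked m ->
      [ "parse and typecheck " ++ m
      , "desugar " ++ m
      , "simplify " ++ m
      , "write interface " ++ m ++ ".hi"
      ]

main :: IO ()
main = do
  mapM_ putStrLn (incrementalCompile "Foo" (Just 42) 42) -- unchanged source
  mapM_ putStrLn (incrementalCompile "Foo" (Just 42) 43) -- changed source
```

The point of the split is that the frontend function alone decides whether any work happens, while the caller owns everything after typechecking.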
The driver pipeline
The driver pipeline consists of several phases that call other programs and generate a series of intermediate files. The code that manages the order of phases is in compiler/GHC/Driver/Phases.hs, while the driver pipeline as a whole is managed in compiler/GHC/Driver/Pipeline.hs. Note that the driver pipeline is not the same thing as the compilation pipeline: the latter is part of the former.
Let's take a look at the overall structure of the driver pipeline. When we compile Foo.hs or Foo.lhs (the "lhs" extension means that Literate Haskell is being used), the following phases are run (some of them only under additional conditions, such as particular file extensions or enabled flags):
- Run the unlit pre-processor, unlit, to remove the literate markup, generating Foo.lpp. The unlit processor is a C program kept in utils/unlit.
- Run the C preprocessor, cpp (if -cpp is specified), generating Foo.hspp.
- Run the compiler itself. This does not start a separate process; it's just a call to a Haskell function. This step always generates an 'interface file' Foo.hi and, depending on the flags you give, also generates a compiled file. GHC currently supports three backend code generators (a native code generator, a C code generator and an LLVM code generator), so the possible range of outputs depends on the backend used. All three support assembly output:
  - Object code: no flags required, file Foo.o (supported by all three backends)
  - Assembly code: flag -S, file Foo.s (supported by all three backends)
  - C code: flag -C, file Foo.hc (only supported by the C backend)
- In the -fvia-C case (this case is outdated):
  - Run the C compiler on Foo.hc, to generate Foo.s.
- If -split-objs is in force, run the splitter on Foo.s. This splits Foo.s into lots of small files. The idea is that the static linker will thereby avoid linking dead code.
- Run the assembler on Foo.s, or if -split-objs is in force, on each individual assembly file.
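The phase list above can be read as a function from the input file and a few flags to a sequence of phases. The sketch below illustrates that reading only: the Phase type, the Flags record and phasesFor are hypothetical names invented for this example and are much simpler than what lives in GHC.Driver.Phases. It also assumes the default case of compiling all the way to object code (no -S or -C).

```haskell
module Main where

import Data.List (isSuffixOf)

-- Toy model only: Phase, Flags and phasesFor are hypothetical and far
-- simpler than the real definitions in GHC.Driver.Phases.

data Phase = Unlit | Cpp | Hsc | Cc | Splitter | As
  deriving (Eq, Show)

data Flags = Flags
  { useCpp    :: Bool -- models -cpp
  , viaC      :: Bool -- models -fvia-C (outdated)
  , splitObjs :: Bool -- models -split-objs
  }

-- | List the phases that would run for one source file, in order.
-- Each optional phase is included only when its condition holds.
phasesFor :: Flags -> FilePath -> [Phase]
phasesFor flags file =
     [Unlit    | ".lhs" `isSuffixOf` file] -- literate source only
  ++ [Cpp      | useCpp flags]
  ++ [Hsc]                                 -- the compiler itself always runs
  ++ [Cc       | viaC flags]
  ++ [Splitter | splitObjs flags]
  ++ [As]                                  -- assemble to object code

main :: IO ()
main = do
  print (phasesFor (Flags False False False) "Foo.hs")
  -- prints [Hsc,As]
  print (phasesFor (Flags True False True) "Foo.lhs")
  -- prints [Unlit,Cpp,Hsc,Splitter,As]
```

Note how a plain Foo.hs with no flags reduces to just the compiler followed by the assembler; every other phase is conditional.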
The compiler pipeline
The compiler itself, independent of the external tools, is also structured as a pipeline. For details (and a diagram), see Commentary/Compiler/HscMain.
Video
Video of compilation pipeline explanation from 2006: Compilation Pipeline and interface files (17'30")