Commit 0dfa7b62 authored by Takenobu Tani

Update links for source code

This fix was applied using `sed -i -e 's#haskell.org/ghc/ghc/tree/master/ghc/#haskell.org/ghc/ghc/blob/master/#g'`
parent e8c16002
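To make the effect of the substitution concrete, here is a minimal sketch of the same rewrite as a pure Haskell function. The names `replaceAll` and `fixLink` are hypothetical and exist only for this illustration; the commit itself was produced with `sed`, not with this code.

```haskell
import Data.List (isPrefixOf)

-- Replace every occurrence of `old` with `new`, scanning left to right.
-- Assumes `old` is non-empty (an empty pattern would never advance).
replaceAll :: String -> String -> String -> String
replaceAll old new = go
  where
    go [] = []
    go s@(c:cs)
      | old `isPrefixOf` s = new ++ go (drop (length old) s)
      | otherwise          = c : go cs

-- The same pattern and replacement as the sed expression in the commit message.
fixLink :: String -> String
fixLink =
  replaceAll
    "haskell.org/ghc/ghc/tree/master/ghc/"
    "haskell.org/ghc/ghc/blob/master/"

main :: IO ()
main =
  putStrLn (fixLink "https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/rts/Linker.c")
  -- prints https://gitlab.haskell.org/ghc/ghc/blob/master/rts/Linker.c
```

Only the first `ghc/` path component after `master/` is consumed, so a link to `tree/master/ghc/ghc/Main.hs` becomes `blob/master/ghc/Main.hs`.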
......@@ -32,7 +32,7 @@ currently being attempted; see [Building/Shake](building/shake) for more details
The following are a few of the most important files in the build system. For a more complete overview of the source-tree layout, see [Commentary/SourceTree](commentary/source-tree).
- **[ghc.mk](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/ghc.mk)**
- **[ghc.mk](https://gitlab.haskell.org/ghc/ghc/blob/master/ghc.mk)**
This is where you should start reading: `ghc.mk` is the main file in
the build system which ties together all the other build-system
......@@ -40,7 +40,7 @@ files. It uses **make**'s `include` directive to include all the
files in `mk/*.mk`, `rules/*.mk`, and all the other `ghc.mk` files
elsewhere in the tree.
- **[Makefile](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/Makefile)**
- **[Makefile](https://gitlab.haskell.org/ghc/ghc/blob/master/Makefile)**
The top-level `Makefile` recursively invokes `make` on `ghc.mk`
according to the [phase ordering idiom](building/architecture/idiom/phase-ordering).
......
......@@ -20,4 +20,4 @@ See also [Idiom: macros](building/architecture/idiom/macros) where many applicat
## Variables affecting compilation
The file [rules/distdir-way-opts.mk](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/rules/distdir-way-opts.mk) contains a list of the variables affecting compilation, such as `$1_$2_HC_OPTS` and `$1_$2_MORE_HC_OPTS`.
The file [rules/distdir-way-opts.mk](https://gitlab.haskell.org/ghc/ghc/blob/master/rules/distdir-way-opts.mk) contains a list of the variables affecting compilation, such as `$1_$2_HC_OPTS` and `$1_$2_MORE_HC_OPTS`.
......@@ -344,7 +344,7 @@ your OS.
The splitter is another evil Perl script
([driver/split/ghc-split.lprl](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/driver/split/ghc-split.lprl)). Object splitting is what happens
([driver/split/ghc-split.lprl](https://gitlab.haskell.org/ghc/ghc/blob/master/driver/split/ghc-split.lprl)). Object splitting is what happens
when the `-split-objs` option is passed to GHC: the object file is
split into many smaller objects. This feature is used when building
libraries, so that a program statically linked against the library
......@@ -366,7 +366,7 @@ generator is described in detail in [Commentary/Compiler/Backends/NCG](commentar
To support GHCi, you need to port the dynamic linker
([rts/Linker.c](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/rts/Linker.c)). The linker currently supports the
([rts/Linker.c](https://gitlab.haskell.org/ghc/ghc/blob/master/rts/Linker.c)). The linker currently supports the
ELF and PEi386 object file formats - if your platform uses one of
these then things will be significantly easier. The majority of Unix
platforms use the ELF format these days. Even so, there are some
......
......@@ -28,7 +28,7 @@ $ make test WAY=optasm TEST=tc053
The testsuite also has a concept called *ways*. These refer to different settings in which a test case can be compiled and/or run. They correspond to things such as checking that a test passes both when the native code generator is used and when the LLVM code generator is used.
The following ways are defined (see the file [testsuite/config/ghc](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/testsuite/config/ghc)
The following ways are defined (see the file [testsuite/config/ghc](https://gitlab.haskell.org/ghc/ghc/blob/master/testsuite/config/ghc)
for the complete list):
```wiki
......@@ -55,7 +55,7 @@ certain ways are enabled automatically if the GHC build in the local
tree supports them. Ways that are enabled this way are `optasm`,
`optllvm`, `profasm`, `threaded1`, `threaded2`, `profthreaded`, `ghci`,
and whichever of `static`/`dyn` is not GHC's default mode.
See also: [testsuite/mk/test.mk](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/testsuite/mk/test.mk).
See also: [testsuite/mk/test.mk](https://gitlab.haskell.org/ghc/ghc/blob/master/testsuite/mk/test.mk).
These values are supported for `VERBOSE=n`; the default is `VERBOSE=3`:
......
......@@ -18,7 +18,7 @@ The existing build system performs the following major steps:
- **configure**: take a bunch of `*.in` files {`config.mk.in`, `ghc.cabal.in`, `ghc-bin.cabal.in`, ...} and generate {`config.mk`, `ghc.cabal`, `ghc-bin.cabal`, ...}.
- **make**: do the rest of the build in three phases by invoking `ghc.mk` with the phase parameter set to one of {`0`, `1`, `final`}. All other `*.mk` files {`config.mk`, `tree.mk`, ...} are included in `ghc.mk`. The approximate build order is described in [ghc.mk](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/ghc.mk).
- **make**: do the rest of the build in three phases by invoking `ghc.mk` with the phase parameter set to one of {`0`, `1`, `final`}. All other `*.mk` files {`config.mk`, `tree.mk`, ...} are included in `ghc.mk`. The approximate build order is described in [ghc.mk](https://gitlab.haskell.org/ghc/ghc/blob/master/ghc.mk).
The goal is to eventually replace all of the above with a single shake script that will be invoking `autoconf`, `configure`, etc., and taking care of all dependencies. Specific parts of the old build system that will be shake-ified are: `boot`, `ghc-cabal` and `*.mk`.
......
......@@ -215,7 +215,7 @@ To understand more about what you can put in `mk/build.mk`, read on.
The following are some common variables that you might want to set in
your `mk/build.mk`. For other variables that you can override,
take a look in [mk/config.mk.in](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/mk/config.mk.in).
take a look in [mk/config.mk.in](https://gitlab.haskell.org/ghc/ghc/blob/master/mk/config.mk.in).
- **`SRC_HC_OPTS`**
......
......@@ -19,10 +19,10 @@ This document provides details of working of GHCi primarily for the normal mode
When source code is loaded into GHCi or the user enters an expression, at the front end of the compiler, we annotate the source code with **ticks**, based on the program coverage tool of Andy Gill and Colin Runciman. Ticks are uniquely numbered with respect to a particular module. Ticks are annotations on expressions, so each tick is associated with a source span, which identifies the start and end locations of the ticked expression.
The instrumentation is implemented in [compiler/deSugar/Coverage.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/deSugar/Coverage.hs). For details on the heuristics of this instrumentation, see the use of `TickForBreakPoints`. (It would be nice to have this documented properly)
The instrumentation is implemented in [compiler/deSugar/Coverage.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/deSugar/Coverage.hs). For details on the heuristics of this instrumentation, see the use of `TickForBreakPoints`. (It would be nice to have this documented properly)
For each module we also allocate an array of breakpoint flags, with one entry for each tick in that module. This array is managed by the GHC storage manager, so it can be garbage collected if the module is re-loaded and re-ticked. We retain this array inside the `ModGuts` data structure, which is defined in [compiler/main/HscTypes.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/HscTypes.hs). This array is stored inside something called `ModBreaks`, which also stores an association list of source spans and ticks.
For each module we also allocate an array of breakpoint flags, with one entry for each tick in that module. This array is managed by the GHC storage manager, so it can be garbage collected if the module is re-loaded and re-ticked. We retain this array inside the `ModGuts` data structure, which is defined in [compiler/main/HscTypes.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/HscTypes.hs). This array is stored inside something called `ModBreaks`, which also stores an association list of source spans and ticks.
### Byte code generation
......@@ -32,7 +32,7 @@ For each `Tick` a special breakpoint instruction `BRK_FUN` is added during byte
In the coverage tool the ticks are turned into real code which performs a side effect when evaluated. In the debugger the ticks are purely annotations. They are used to pass information to the byte code generator, which generates special breakpoint instructions for ticked expressions.
The byte code generator turns `CoreSyn` into a bunch of Byte Code Objects (BCOs). BCOs are heap objects which correspond to top-level bindings, and `let` and `case` expressions. Each BCO contains a sequence of byte code instructions (BCIs), which are executed by the byte code interpreter ([rts/Interpreter.c](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/rts/Interpreter.c)). Each BCO also contains some local data which is needed in the instructions.
The byte code generator turns `CoreSyn` into a bunch of Byte Code Objects (BCOs). BCOs are heap objects which correspond to top-level bindings, and `let` and `case` expressions. Each BCO contains a sequence of byte code instructions (BCIs), which are executed by the byte code interpreter ([rts/Interpreter.c](https://gitlab.haskell.org/ghc/ghc/blob/master/rts/Interpreter.c)). Each BCO also contains some local data which is needed in the instructions.
The BCIs for this BCO are generated as usual, and we prefix a new special breakpoint instruction (`BRK_FUN`) on the front. Thus, when the BCO is evaluated, the first thing it will do is interpret the breakpoint instruction, and hence decide whether to break or not. We annotate the BCO with information about the tick, such as its free variables, and the breakpoint number.
......@@ -47,7 +47,7 @@ To understand what happens when a breakpoint is hit, it is necessary to know how
### Execution of an Expression in GHCi
When the user types in an expression (as a string) it is parsed, type checked, and compiled, and then run. In [compiler/main/InteractiveEval.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/InteractiveEval.hs) we have the function:
When the user types in an expression (as a string) it is parsed, type checked, and compiled, and then run. In [compiler/main/InteractiveEval.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/InteractiveEval.hs) we have the function:
```
-- | Run a statement in the current interactive context.
......@@ -60,14 +60,14 @@ execStmt
The `GhcMonad` carries a `Session` which contains the gobs of environmental information which is important to the compiler. The `String` is what the user typed in, and `ExecResult` is the answer that you get back if the execution terminates. `ExecResult` is defined like so:
(in [compiler/main/InteractiveEvalTypes.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/InteractiveEvalTypes.hs))
(in [compiler/main/InteractiveEvalTypes.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/InteractiveEvalTypes.hs))
```
data ExecResult
  = ExecComplete
      { execResult     :: Either SomeException [Name]
      , execAllocation :: Word64
      }
  | ExecBreak
      { breakNames :: [Name]
      , breakInfo  :: Maybe BreakInfo
      }
```
Normally what happens is that `execStmt` forks a new thread to handle the evaluation of the expression. It calls `evalStmt` ([compiler/ghci/GHCi.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/ghci/GHCi.hs) in both remote and normal mode) to create an `EvalStmt` `Message`. This message is processed by `evalStmt` ([libraries/ghci/GHCi/Run.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/libraries/ghci/GHCi/Run.hs) in normal mode). This in turn calls `sandboxIO` to do the `forkIO`. It then blocks on an `MVar` and waits for the thread to finish.
Normally what happens is that `execStmt` forks a new thread to handle the evaluation of the expression. It calls `evalStmt` ([compiler/ghci/GHCi.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/ghci/GHCi.hs) in both remote and normal mode) to create an `EvalStmt` `Message`. This message is processed by `evalStmt` ([libraries/ghci/GHCi/Run.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/libraries/ghci/GHCi/Run.hs) in normal mode). This in turn calls `sandboxIO` to do the `forkIO`. It then blocks on an `MVar` and waits for the thread to finish.
This `MVar` is (now) called `statusMVar`, because it carries the execution status of the computation which is being evaluated. We will discuss its type shortly. When the thread finishes it fills in `statusMVar`, which wakes up `execStmt`, and it returns an `ExecResult`.
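The fork-then-block pattern described here can be sketched in a few lines. This is a simplified illustration, not the code in `GHCi/Run.hs`: the `Status` type and the `runSandboxed` function are hypothetical stand-ins, and only `forkIO`, `try`, and the `MVar` operations are real library calls.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical, simplified status: the real type also has to describe
-- a computation that stopped at a breakpoint.
newtype Status a = Complete (Either SomeException a)

-- Fork a worker thread for the action, then block until it reports back.
runSandboxed :: IO a -> IO (Status a)
runSandboxed action = do
  statusMVar <- newEmptyMVar
  _ <- forkIO $ do
    r <- try action                    -- capture any exception as a value
    putMVar statusMVar (Complete r)    -- filling the MVar wakes the caller
  takeMVar statusMVar                  -- blocks until the worker finishes

main :: IO ()
main = do
  Complete r <- runSandboxed (evaluate (sum [1 .. 100 :: Int]))
  case r of
    Right n -> putStrLn ("finished: " ++ show n)
    Left e  -> putStrLn ("exception: " ++ show e)
```

Because the caller only ever waits on the `MVar`, the worker thread can report outcomes other than completion through the same channel, which is why the variable carries a status rather than a plain result.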
......
......@@ -235,7 +235,7 @@ To maintain compatibility, use [HsVersions.h](commentary/coding-style#) (see bel
### `HsVersions.h`
`HsVersions.h` is a CPP header file containing a number of macros that help smooth out the differences between compiler versions. It defines, for example, macros for library module names which have moved between versions. Take a look at [compiler/HsVersions.h](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/HsVersions.h).
`HsVersions.h` is a CPP header file containing a number of macros that help smooth out the differences between compiler versions. It defines, for example, macros for library module names which have moved between versions. Take a look at [compiler/HsVersions.h](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/HsVersions.h).
```c
#include "HsVersions.h"
......
# GHC Commentary: The Compiler
The compiler itself is written entirely in Haskell, and lives in the many sub-directories of the [compiler](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler) directory.
The compiler itself is written entirely in Haskell, and lives in the many sub-directories of the [compiler](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler) directory.
- [Compiler Module Dependencies](module-dependencies) (deals with the arcane mutual recursions among GHC's many data types)
- [Coding guidelines](commentary/coding-style)
......@@ -83,14 +83,14 @@ The part called [HscMain](commentary/compiler/hsc-main) deals with compiling a s
- `--make` is almost a trivial client of the GHC API, and is implemented in [compiler/main/GhcMake.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/GhcMake.hs).
- `-M`, the Makefile dependency generator, is also a client of the GHC API and is implemented in [compiler/main/DriverMkDepend.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/DriverMkDepend.hs).
- `-M`, the Makefile dependency generator, is also a client of the GHC API and is implemented in [compiler/main/DriverMkDepend.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/DriverMkDepend.hs).
- The "one-shot" mode, where GHC compiles each file on the command line separately (eg. `ghc -c Foo.hs`). This mode bypasses the GHC API, and is implemented
directly on top of [HscMain](commentary/compiler/hsc-main), since it compiles only one file at a time. In fact, this is all that
GHC consisted of prior to version 5.00 when GHCi and `--make` were introduced.
GHC is packaged as a single binary in which all of these front-ends are present, selected by the command-line flags indicated above. There is a single command-line interface implemented in [ghc/Main.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/ghc/Main.hs).
GHC is packaged as a single binary in which all of these front-ends are present, selected by the command-line flags indicated above. There is a single command-line interface implemented in [ghc/Main.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/ghc/Main.hs).
In addition, GHC is compiled, without its front ends, as a *library* which can be imported by any Haskell program; see [the GHC API](commentary/compiler/api).
......@@ -52,7 +52,7 @@ The `hscTarget` field of `DynFlags` tells the compiler what kind of output to ge
The targets specify the source files or modules at the top of the dependency tree. For a Haskell program there is often just a single target `Main.hs`, but for a library the targets would consist of every visible module in the library.
The `Target` type is defined in [compiler/main/HscTypes.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/HscTypes.hs). Note that a `Target` includes not just the file or module name, but also optionally the complete source text of the module as a `StringBuffer`: this is to support an interactive development environment where the source file is being edited, and the in-memory copy of the source file is to be used in preference to the version on disk.
The `Target` type is defined in [compiler/main/HscTypes.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/HscTypes.hs). Note that a `Target` includes not just the file or module name, but also optionally the complete source text of the module as a `StringBuffer`: this is to support an interactive development environment where the source file is being edited, and the in-memory copy of the source file is to be used in preference to the version on disk.
## Dependency Analysis
......@@ -65,7 +65,7 @@ The `downsweep` function takes the targets and returns a list of `ModSummary` co
## The ModSummary type
A `ModSummary` (defined in [compiler/main/HscTypes.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/HscTypes.hs)) contains various information about a module:
A `ModSummary` (defined in [compiler/main/HscTypes.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/HscTypes.hs)) contains various information about a module:
- Its `Module`, which includes the package that it belongs to
- Its `ModLocation`, which lists the pathnames of all the files associated with the module
......@@ -74,10 +74,10 @@ A `ModSummary` (defined in [compiler/main/HscTypes.hs](https://gitlab.haskell.or
- ... some other things
We collect `ModSummary` information for all the modules we are interested in during the *downsweep*, below. Extracting the information about the module name and the imports from a source file is the job of [compiler/main/HeaderInfo.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/HeaderInfo.hs) which partially parses the source file.
We collect `ModSummary` information for all the modules we are interested in during the *downsweep*, below. Extracting the information about the module name and the imports from a source file is the job of [compiler/main/HeaderInfo.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/HeaderInfo.hs) which partially parses the source file.
Converting a given module name into a `ModSummary` is done by `summariseModule` in [compiler/main/GHC.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/GHC.hs). Similarly, if we have a filename rather than a module name, we generate a `ModSummary` using `summariseFile`.
Converting a given module name into a `ModSummary` is done by `summariseModule` in [compiler/main/GHC.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/GHC.hs). Similarly, if we have a filename rather than a module name, we generate a `ModSummary` using `summariseFile`.
## Loading (compiling) the Modules
......
......@@ -19,37 +19,37 @@ NOTE! The native code generator was largely rewritten as part of the C-- backend
### Files, Parts
After GHC has produced [Cmm](commentary/compiler/cmm-type) (use -ddump-cmm or -ddump-opt-cmm to view), the Native Code Generator (NCG) transforms Cmm into architecture-specific assembly code. The NCG is located in [compiler/nativeGen](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/nativeGen) and is separated into eight modules:
After GHC has produced [Cmm](commentary/compiler/cmm-type) (use -ddump-cmm or -ddump-opt-cmm to view), the Native Code Generator (NCG) transforms Cmm into architecture-specific assembly code. The NCG is located in [compiler/nativeGen](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/nativeGen) and is separated into eight modules:
- [compiler/nativeGen/AsmCodeGen.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/nativeGen/AsmCodeGen.hs)
- [compiler/nativeGen/AsmCodeGen.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/nativeGen/AsmCodeGen.hs)
top-level module for the NCG, imported by [compiler/main/CodeOutput.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/CodeOutput.hs); also defines the Monad for optimising generic Cmm code, `CmmOptM`
- [compiler/nativeGen/MachCodeGen.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/nativeGen/MachCodeGen.hs)
top-level module for the NCG, imported by [compiler/main/CodeOutput.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/CodeOutput.hs); also defines the Monad for optimising generic Cmm code, `CmmOptM`
- [compiler/nativeGen/MachCodeGen.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/nativeGen/MachCodeGen.hs)
generates architecture-specific instructions (a Haskell-representation of assembler) from Cmm code
- [compiler/nativeGen/MachInstrs.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/nativeGen/MachInstrs.hs)
- [compiler/nativeGen/MachInstrs.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/nativeGen/MachInstrs.hs)
contains data definitions and some functions (comparison, size, simple conversions) for machine instructions, mostly carried out through the `Instr` data type, defined here
- [compiler/nativeGen/NCGMonad.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/nativeGen/NCGMonad.hs)
- [compiler/nativeGen/NCGMonad.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/nativeGen/NCGMonad.hs)
defines the main monad in the NCG: the Native code Machine instruction Monad, `NatM`, and related functions. *Note: the NCG switches between two monads at times, especially in `AsmCodeGen`: `NatM` and the `UniqSM` Monad used throughout the compiler.*
- [compiler/nativeGen/PIC.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/nativeGen/PIC.hs)
- [compiler/nativeGen/PIC.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/nativeGen/PIC.hs)
handles generation of position independent code and issues related to dynamic linking in the NCG; related to many other modules outside the NCG that handle symbol import, export and references, including `CLabel`, `Cmm`, `codeGen` and the RTS, and the Mangler
- [compiler/nativeGen/PprMach.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/nativeGen/PprMach.hs)
- [compiler/nativeGen/PprMach.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/nativeGen/PprMach.hs)
Pretty prints machine instructions (`Instr`) to assembler code (currently readable by GNU's `as`), with some small modifications, especially for comparing and adding floating point numbers on x86 architectures
- [compiler/nativeGen/RegAllocInfo.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/nativeGen/RegAllocInfo.hs)
- [compiler/nativeGen/RegAllocInfo.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/nativeGen/RegAllocInfo.hs)
defines the main register information function, `regUsage`, which takes a set of real and virtual registers and returns the actual registers used by a particular `Instr`; register allocation is in AT&T syntax order (source, destination), in an internal function, `usage`; defines the `RegUsage` data type
- [compiler/nativeGen/RegisterAlloc.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/nativeGen/RegisterAlloc.hs)
- [compiler/nativeGen/RegisterAlloc.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/nativeGen/RegisterAlloc.hs)
one of the most complicated modules in the NCG, `RegisterAlloc` manages the allocation of registers for each *basic block* of Haskell-abstracted assembler code: management involves *liveness* analysis, allocation or deletion of temporary registers, *spilling* temporary values to the *spill stack* (memory) and many optimisations. *See [The Cmm language](commentary/compiler/cmm-type) for the definition of a *basic block* (in Haskell, *`type CmmBasicBlock = GenBasicBlock CmmStmt`*).*
and one header file:
- [compiler/nativeGen/NCG.h](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/nativeGen/NCG.h)
- [compiler/nativeGen/NCG.h](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/nativeGen/NCG.h)
defines macros used to separate architecture-specific code in the Haskell NCG files; since GHC currently only generates machine code for the architecture on which it was compiled (GHC is not currently a cross-compiler), the Haskell NCG files become considerably smaller after preprocessing; ideally all architecture-specific code would reside in separate files and GHC would have them available to support cross-compiler capabilities.
......
......@@ -158,4 +158,4 @@ These are some ideas for improving the current allocator, most potentially usefu
For the architectures currently supported, x86, x86_64 and ppc, the native code generator currently emits code using only two register classes `RcInteger` and `RcDouble`. As these classes are disjoint (ie, none of the regs from one class alias with regs from another), checking whether a node of a certain class is trivially colorable reduces to counting up the number of neighbours of that class.
* If the NCG starts to use aliasing register classes eg: both 32bit `RcFloat`s and 64bit `RcDouble`s on sparc; combinations of 8, 16, and 32 bit integers on x86 / x86_64 or usage of sse / altivec regs in different modes, then this can be supported via the method described in \[Smith et al\]. The allocator was designed with this in mind - ie, by passing a function to test if a node is trivially colorable as a parameter to the coloring function - and there is already a description of the register set for x86 in [compiler/nativeGen/RegArchX86.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/nativeGen/RegArchX86.hs), but the native code generator doesn't currently emit code to test it against.
* If the NCG starts to use aliasing register classes eg: both 32bit `RcFloat`s and 64bit `RcDouble`s on sparc; combinations of 8, 16, and 32 bit integers on x86 / x86_64 or usage of sse / altivec regs in different modes, then this can be supported via the method described in \[Smith et al\]. The allocator was designed with this in mind - ie, by passing a function to test if a node is trivially colorable as a parameter to the coloring function - and there is already a description of the register set for x86 in [compiler/nativeGen/RegArchX86.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/nativeGen/RegArchX86.hs), but the native code generator doesn't currently emit code to test it against.
# GHC Commentary: The C code generator
Source: [compiler/cmm/PprC.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/cmm/PprC.hs)
Source: [compiler/cmm/PprC.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/cmm/PprC.hs)
This phase takes [Cmm](commentary/compiler/cmm-type) and generates plain C code. The C code generator is very simple these days, in fact it can almost be considered pretty-printing. It is only used for unregisterised compilers.
......@@ -81,7 +81,7 @@ declarations don't overlap. So we either have to scan the whole code to figure
will do here.
- all RTS symbols already have declarations (mostly with the correct
type) in [includes/StgMiscClosures.h](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/includes/StgMiscClosures.h), so no declarations are generated.
type) in [includes/StgMiscClosures.h](https://gitlab.haskell.org/ghc/ghc/blob/master/includes/StgMiscClosures.h), so no declarations are generated.
- certain labels are known to have been defined earlier in the same file,
so a declaration can be omitted (e.g. SRT labels)
......@@ -101,6 +101,6 @@ we need to emit special code to reference these labels).
For all other labels referenced by RTS .cmm code, we assume they are
RTS labels, and hence already declared in [includes/StgMiscClosures.h](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/includes/StgMiscClosures.h). This is
RTS labels, and hence already declared in [includes/StgMiscClosures.h](https://gitlab.haskell.org/ghc/ghc/blob/master/includes/StgMiscClosures.h). This is
the only choice here: since we don't know the type of the label (info,
entry etc.), we can't generate a correct declaration.
......@@ -6,7 +6,7 @@ This page gives a hopefully comprehensive view of how `Bool` type is wired-in in
## Constants for Bool type and data constructors
All data constructors, type constructors and so on have their unique identifier which is needed during the compilation process. For the wired-in types these unique values are defined in the [compiler/prelude/PrelNames.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/prelude/PrelNames.hs). In case of `Bool` the relevant definitions look like this:
All data constructors, type constructors and so on have their unique identifier which is needed during the compilation process. For the wired-in types these unique values are defined in the [compiler/prelude/PrelNames.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/prelude/PrelNames.hs). In case of `Bool` the relevant definitions look like this:
```wiki
boolTyConKey, falseDataConKey, trueDataConKey :: Unique
......@@ -18,7 +18,7 @@ trueDataConKey = mkPreludeDataConUnique 15 -- line 1451
### A side note on generating Unique values
The `mkPreludeTyConUnique` and `mkPreludeDataConUnique` take care of generating a unique `Unique` value. They are defined in [compiler/basicTypes/Unique.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/basicTypes/Unique.hs):
The `mkPreludeTyConUnique` and `mkPreludeDataConUnique` take care of generating a unique `Unique` value. They are defined in [compiler/basicTypes/Unique.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/basicTypes/Unique.hs):
```wiki
data Unique = MkUnique FastInt
......@@ -31,12 +31,12 @@ mkPreludeDataConUnique i = mkUnique '6' (2*i)
```
You will find the definition of `mkUnique :: Char -> Int -> Unique` at line 135 in [compiler/basicTypes/Unique.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/basicTypes/Unique.hs).
You will find the definition of `mkUnique :: Char -> Int -> Unique` at line 135 in [compiler/basicTypes/Unique.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/basicTypes/Unique.hs).
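As a rough illustration of what a `Char -> Int -> Unique` constructor can do, the sketch below packs the tag character into the high bits of a machine integer and the number into the low bits. The 24-bit payload, the plain `Int` representation (instead of `FastInt`), and the helper `unpkUnique` are assumptions made for this example and may not match the layout in the current source.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- Simplified stand-in: a plain Int rather than GHC's FastInt.
newtype Unique = MkUnique Int
  deriving (Eq, Show)

-- Assumed layout: tag character above bit 24, payload in the low 24 bits.
mkUnique :: Char -> Int -> Unique
mkUnique c i = MkUnique (tag .|. bits)
  where
    tag  = ord c `shiftL` 24
    bits = i .&. 0xFFFFFF

-- Inverse of mkUnique under the same assumed layout.
unpkUnique :: Unique -> (Char, Int)
unpkUnique (MkUnique u) = (chr (u `shiftR` 24), u .&. 0xFFFFFF)

-- As in the wiki excerpt: data constructor uniques use tag '6' and 2*i.
mkPreludeDataConUnique :: Int -> Unique
mkPreludeDataConUnique i = mkUnique '6' (2 * i)

main :: IO ()
main = print (unpkUnique (mkPreludeDataConUnique 15))
  -- prints ('6',30)
```

The tag character keeps uniques from different namespaces (type constructors, data constructors, and so on) from colliding even when their numeric parts coincide.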
## Defining wired-in information about Bool
All the wired-in information that the compiler needs to know about `Bool` is defined in [compiler/prelude/TysWiredIn.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/prelude/TysWiredIn.hs). This file exports the following functions related to `Bool`:
All the wired-in information that the compiler needs to know about `Bool` is defined in [compiler/prelude/TysWiredIn.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/prelude/TysWiredIn.hs). This file exports the following functions related to `Bool`:
```wiki
boolTy, boolTyCon, boolTyCon_RDR, boolTyConName,
......@@ -62,13 +62,13 @@ falseDataConName = mkWiredInDataConName UserSyntax gHC_TYPES (fsLit "False") fa
trueDataConName = mkWiredInDataConName UserSyntax gHC_TYPES (fsLit "True") trueDataConKey trueDataCon
```
`boolTyConKey`, `falseDataConKey` and `trueDataConKey` are `Unique` values defined earlier. `boolTyCon`, `falseDataCon` and `trueDataCon` are yet undefined. Type of syntax is defined in [compiler/basicTypes/Name.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/basicTypes/Name.hs), line 129:
`boolTyConKey`, `falseDataConKey` and `trueDataConKey` are `Unique` values defined earlier. `boolTyCon`, `falseDataCon` and `trueDataCon` are yet undefined. Type of syntax is defined in [compiler/basicTypes/Name.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/basicTypes/Name.hs), line 129:
```wiki
data BuiltInSyntax = BuiltInSyntax | UserSyntax
```
`BuiltInSyntax` is used for things like (:), \[\] and tuples. Everything else is `UserSyntax`. `gHC_TYPES` is the module `GHC.Types`, to which these type and data constructors are assigned. It is defined in [compiler/prelude/PrelNames.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/prelude/PrelNames.hs):
`BuiltInSyntax` is used for things like (:), \[\] and tuples. Everything else is `UserSyntax`. `gHC_TYPES` is the module `GHC.Types`, to which these type and data constructors are assigned. It is defined in [compiler/prelude/PrelNames.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/prelude/PrelNames.hs):
```wiki
gHC_TYPES = mkPrimModule (fsLit "GHC.Types") -- line 359
......@@ -77,7 +77,7 @@ mkPrimModule :: FastString -> Module -- line 435
mkPrimModule m = mkModule primPackageId (mkModuleNameFS m)
```
`FastString` is a string type based on `ByteString`, and the `fsLit` function converts a standard Haskell `String` to a `FastString`. See [compiler/utils/FastString.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/utils/FastString.hs) for more details.
`FastString` is a string type based on `ByteString`, and the `fsLit` function converts a standard Haskell `String` to a `FastString`. See [compiler/utils/FastString.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/utils/FastString.hs) for more details.
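To make the shape of these definitions concrete, here is a minimal toy model of a wired-in name. It is a sketch only: `ToyName`, `ToyModule`, `ToyKey`, `mkToyWiredInName` and the key values are hypothetical simplifications introduced for illustration, standing in for GHC's `Name`, `Module`, `Unique` and `FastString`, and are not GHC's actual definitions.

```haskell
-- Toy model only: simplified stand-ins, not GHC's real types.

-- Same two-way distinction as in the text above.
data BuiltInSyntax = BuiltInSyntax | UserSyntax
  deriving (Eq, Show)

-- Hypothetical simplifications of Module and Unique.
newtype ToyModule = ToyModule String deriving (Eq, Show)
newtype ToyKey = ToyKey Int deriving (Eq, Show)

-- A name records where it lives, what it is called, its unique key,
-- and whether it is built-in syntax.
data ToyName = ToyName
  { toyModule :: ToyModule
  , toyOcc    :: String
  , toyKey    :: ToyKey
  , toySyntax :: BuiltInSyntax
  } deriving (Eq, Show)

-- Just assigns its arguments to the fields of the name.
mkToyWiredInName :: BuiltInSyntax -> ToyModule -> String -> ToyKey -> ToyName
mkToyWiredInName syn m occ key = ToyName m occ key syn

-- Stand-in for gHC_TYPES.
toyGhcTypes :: ToyModule
toyGhcTypes = ToyModule "GHC.Types"

-- The key values 1 and 2 are made up for this example.
toyFalseName, toyTrueName :: ToyName
toyFalseName = mkToyWiredInName UserSyntax toyGhcTypes "False" (ToyKey 1)
toyTrueName  = mkToyWiredInName UserSyntax toyGhcTypes "True"  (ToyKey 2)
```

Both names share a module and a syntax class and differ only in their occurrence string and key, which mirrors how `falseDataConName` and `trueDataConName` are built above.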
### A side note on creating wired-in Names
......@@ -100,7 +100,7 @@ data NameSort
```
`mkWiredInTyConName` and `mkWiredInDataConName` are functions that create `Name`s for wired-in types and data constructors. They are defined in [compiler/prelude/TysWiredIn.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/prelude/TysWiredIn.hs), lines 163-173:
`mkWiredInTyConName` and `mkWiredInDataConName` are functions that create `Name`s for wired-in types and data constructors. They are defined in [compiler/prelude/TysWiredIn.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/prelude/TysWiredIn.hs), lines 163-173:
```wiki
mkWiredInTyConName :: BuiltInSyntax -> Module -> FastString -> Unique -> TyCon -> Name
......@@ -117,7 +117,7 @@ mkWiredInDataConName built_in modu fs unique datacon
```
The `mkWiredInName` is defined in [compiler/basicTypes/Name.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/basicTypes/Name.hs) (lines 279-283), and it just assigns values to fields of `Name`:
The `mkWiredInName` is defined in [compiler/basicTypes/Name.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/basicTypes/Name.hs) (lines 279-283), and it just assigns values to fields of `Name`:
```wiki
mkWiredInName :: Module -> OccName -> Unique -> TyThing -> BuiltInSyntax -> Name
......@@ -130,7 +130,7 @@ mkWiredInName mod occ uniq thing built_in
## RdrNames for Bool
Having defined `Name`s for `Bool`, the [RdrName](commentary/compiler/rdr-name-type)s can be defined ([compiler/prelude/TysWiredIn.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/prelude/TysWiredIn.hs), lines 221-225):
Having defined `Name`s for `Bool`, the [RdrName](commentary/compiler/rdr-name-type)s can be defined ([compiler/prelude/TysWiredIn.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/prelude/TysWiredIn.hs), lines 221-225):
```wiki
boolTyCon_RDR, false_RDR, true_RDR :: RdrName
......@@ -170,14 +170,14 @@ Note that `boolTyCon` is on the list of wired in type constructors created by `w
### A side note on functions generating type and data constructors
[compiler/types/TypeRep.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/types/TypeRep.hs), lines 281-282:
[compiler/types/TypeRep.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/types/TypeRep.hs), lines 281-282:
```wiki
mkTyConTy :: TyCon -> Type
mkTyConTy tycon = TyConApp tycon []
```
[compiler/prelude/TysWiredIn.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/prelude/TysWiredIn.hs), 247-257:
[compiler/prelude/TysWiredIn.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/prelude/TysWiredIn.hs), 247-257:
```wiki
pcTyCon :: Bool -> RecFlag -> Name -> Maybe CType -> [TyVar] -> [DataCon] -> TyCon
......@@ -239,7 +239,7 @@ falseDataConId = dataConWorkId falseDataCon
trueDataConId = dataConWorkId trueDataCon
```
`falseDataConId` and `trueDataConId` just extract `Id` from previously defined data constructors. These definitions are from [compiler/basicTypes/DataCon.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/basicTypes/DataCon.hs):
`falseDataConId` and `trueDataConId` just extract `Id` from previously defined data constructors. These definitions are from [compiler/basicTypes/DataCon.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/basicTypes/DataCon.hs):
```wiki
data DataCon -- line 253
......
......@@ -4,7 +4,7 @@
GHC's many flavours of command line flags make the code interpreting them rather involved. The following provides a brief overview of the processing of these options. Since the addition of the interactive front-end to GHC, there are two kinds of flags: static and dynamic. Static flags can only be set once on the command line. They remain the same throughout the whole GHC session (so, for example, you cannot change them within GHCi using `:set` or with an `OPTIONS_GHC` pragma in the source code). Dynamic flags are the opposite: they can be changed in GHCi sessions using the `:set` command or an `OPTIONS_GHC` pragma in the source code. There are few static flags, and it is likely that in the future there will be even fewer. Thus, you won't see many static flag references in the source code, but you will see a lot of functions that use dynamic flags.
Command line flags are described by the `Flag` data type defined in [compiler/main/CmdLineParser.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/CmdLineParser.hs):
Command line flags are described by the `Flag` data type defined in [compiler/main/CmdLineParser.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/CmdLineParser.hs):
```wiki
data Flag m = Flag
......@@ -20,18 +20,18 @@ This file contains functions that actually parse the command line parameters.
## Static flags
Static flags are managed by functions in [compiler/main/StaticFlags.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/StaticFlags.hs).
Static flags are managed by functions in [compiler/main/StaticFlags.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/StaticFlags.hs).
Function `parseStaticFlags ::` is the entry point for parsing static flags. It is called by the `main :: IO ()` function of GHC in [ghc/Main.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/ghc/Main.hs). Two global IORefs are used to parse static flags: `v_opt_C_ready` and `v_opt_C`. These are defined using the `GLOBAL_VAR` macro from [compiler/HsVersions.h](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/HsVersions.h). The first IORef is a flag that checks whether the static flags are parsed at the right time. Initialized to `False`, it is set to `True` after the parsing is done. `v_opt_C` is a `[String]` used to store parsed flags (see the `addOpt` and `removeOpt` functions).
Function `parseStaticFlags ::` is the entry point for parsing static flags. It is called by the `main :: IO ()` function of GHC in [ghc/Main.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/ghc/Main.hs). Two global IORefs are used to parse static flags: `v_opt_C_ready` and `v_opt_C`. These are defined using the `GLOBAL_VAR` macro from [compiler/HsVersions.h](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/HsVersions.h). The first IORef is a flag that checks whether the static flags are parsed at the right time. Initialized to `False`, it is set to `True` after the parsing is done. `v_opt_C` is a `[String]` used to store parsed flags (see the `addOpt` and `removeOpt` functions).
In [compiler/main/StaticFlags.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/StaticFlags.hs), `flagsStatic :: [Flag IO]` defines a list of static flags and what actions should be taken when these flags are encountered (see `Flag` data type above). It also contains some helper functions to check whether particular flags have been set. Functions `staticFlags :: [String]` and `packed_staticFlags :: [FastString]` return a list of parsed command line static flags, provided that parsing has been done (checking the value of `v_opt_C_ready`).
In [compiler/main/StaticFlags.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/StaticFlags.hs), `flagsStatic :: [Flag IO]` defines a list of static flags and what actions should be taken when these flags are encountered (see `Flag` data type above). It also contains some helper functions to check whether particular flags have been set. Functions `staticFlags :: [String]` and `packed_staticFlags :: [FastString]` return a list of parsed command line static flags, provided that parsing has been done (checking the value of `v_opt_C_ready`).
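The "global IORef plus ready flag" pattern described above can be sketched in a few lines. This is a simplified illustration, not the code in `StaticFlags.hs`: the names `optsReady`, `optsStore`, `addOptSketch`, `parseStaticFlagsSketch` and `staticFlagsSketch` are hypothetical, the globals are created with `unsafePerformIO` rather than the `GLOBAL_VAR` macro, and the accessor lives in `IO` here for simplicity.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical stand-in for v_opt_C_ready: False until parsing is done.
{-# NOINLINE optsReady #-}
optsReady :: IORef Bool
optsReady = unsafePerformIO (newIORef False)

-- Hypothetical stand-in for v_opt_C: the flags seen so far, newest first.
{-# NOINLINE optsStore #-}
optsStore :: IORef [String]
optsStore = unsafePerformIO (newIORef [])

-- Record a single flag in the global store.
addOptSketch :: String -> IO ()
addOptSketch flag = modifyIORef optsStore (flag :)

-- Store every flag, then mark the store as ready to be read.
parseStaticFlagsSketch :: [String] -> IO ()
parseStaticFlagsSketch args = do
  mapM_ addOptSketch args
  writeIORef optsReady True

-- Return the stored flags in command-line order, refusing to answer
-- if parsing has not happened yet.
staticFlagsSketch :: IO [String]
staticFlagsSketch = do
  ready <- readIORef optsReady
  if ready
    then reverse <$> readIORef optsStore
    else ioError (userError "static flags read before parsing")
```

The ready flag turns an ordering bug (reading flags too early) into an immediate error instead of a silently empty flag list.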
## Dynamic flags
Dynamic flags are managed by functions in the [compiler/main/DynFlags.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/DynFlags.hs) file. Reading from the top, you will find the data types used to describe enabled dynamic flags: `DumpFlag`, `GeneralFlag`, `WarningFlag`, `Language`, `SafeHaskellMode`, `ExtensionFlag` and finally `DynFlags`. Function `defaultDynFlags :: Settings -> DynFlags` initializes some of the flags to default values. Available dynamic flags and their respective actions are defined by `dynamic_flags :: [Flag (CmdLineP DynFlags)]`. Also, `fWarningFlags :: [FlagSpec WarningFlag]`, `fFlags :: [FlagSpec GeneralFlag]`, `xFlags :: [FlagSpec ExtensionFlag]` and a few smaller functions define further flags needed, for example, for language extensions and warnings. These flags are described by the data type `FlagSpec f`:
Dynamic flags are managed by functions in the [compiler/main/DynFlags.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/DynFlags.hs) file. Reading from the top, you will find the data types used to describe enabled dynamic flags: `DumpFlag`, `GeneralFlag`, `WarningFlag`, `Language`, `SafeHaskellMode`, `ExtensionFlag` and finally `DynFlags`. Function `defaultDynFlags :: Settings -> DynFlags` initializes some of the flags to default values. Available dynamic flags and their respective actions are defined by `dynamic_flags :: [Flag (CmdLineP DynFlags)]`. Also, `fWarningFlags :: [FlagSpec WarningFlag]`, `fFlags :: [FlagSpec GeneralFlag]`, `xFlags :: [FlagSpec ExtensionFlag]` and a few smaller functions define further flags needed, for example, for language extensions and warnings. These flags are described by the data type `FlagSpec f`:
```wiki
type FlagSpec flag
......
......@@ -8,24 +8,24 @@ Video: [GHC Core language](http://www.youtube.com/watch?v=EQA69dvkQIk&list=PLBkR
The Core language is GHC's central data type. Core is a very small, explicitly-typed variant of System F. The exact variant is called [System FC](commentary/compiler/fc), which embodies equality constraints and coercions.
The `CoreSyn` type, and the functions that operate over it, get an entire directory [compiler/coreSyn](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/coreSyn):
The `CoreSyn` type, and the functions that operate over it, get an entire directory [compiler/coreSyn](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/coreSyn):
- [compiler/coreSyn/CoreSyn.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/coreSyn/CoreSyn.hs): the data type itself.
- [compiler/coreSyn/CoreSyn.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/coreSyn/CoreSyn.hs): the data type itself.
- [compiler/coreSyn/PprCore.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/coreSyn/PprCore.hs): pretty-printing.
- [compiler/coreSyn/CoreFVs.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/coreSyn/CoreFVs.hs): finding free variables.
- [compiler/coreSyn/CoreSubst.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/coreSyn/CoreSubst.hs): substitution.
- [compiler/coreSyn/CoreUtils.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/coreSyn/CoreUtils.hs): a variety of other useful functions over Core.
- [compiler/coreSyn/PprCore.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/coreSyn/PprCore.hs): pretty-printing.
- [compiler/coreSyn/CoreFVs.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/coreSyn/CoreFVs.hs): finding free variables.
- [compiler/coreSyn/CoreSubst.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/coreSyn/CoreSubst.hs): substitution.
- [compiler/coreSyn/CoreUtils.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/coreSyn/CoreUtils.hs): a variety of other useful functions over Core.
- [compiler/coreSyn/CoreUnfold.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/coreSyn/CoreUnfold.hs): dealing with "unfoldings".
- [compiler/coreSyn/CoreUnfold.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/coreSyn/CoreUnfold.hs): dealing with "unfoldings".
- [compiler/coreSyn/CoreLint.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/coreSyn/CoreLint.hs): type-check the Core program. This is an incredibly-valuable consistency check, enabled by the flag `-dcore-lint`.
- [compiler/coreSyn/CoreLint.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/coreSyn/CoreLint.hs): type-check the Core program. This is an incredibly-valuable consistency check, enabled by the flag `-dcore-lint`.
- [compiler/coreSyn/CoreTidy.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/coreSyn/CoreTidy.hs): part of [the CoreTidy pass](commentary/compiler/hsc-main) (the rest is in [compiler/main/TidyPgm.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/TidyPgm.hs)).
- [compiler/coreSyn/CorePrep.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/coreSyn/CorePrep.hs): [the CorePrep pass](commentary/compiler/hsc-main)
- [compiler/coreSyn/CoreTidy.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/coreSyn/CoreTidy.hs): part of [the CoreTidy pass](commentary/compiler/hsc-main) (the rest is in [compiler/main/TidyPgm.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/TidyPgm.hs)).
- [compiler/coreSyn/CorePrep.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/coreSyn/CorePrep.hs): [the CorePrep pass](commentary/compiler/hsc-main)
Here is the entire Core type [compiler/coreSyn/CoreSyn.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/coreSyn/CoreSyn.hs):
Here is the entire Core type [compiler/coreSyn/CoreSyn.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/coreSyn/CoreSyn.hs):
```wiki
type CoreExpr = Expr Var
......@@ -56,7 +56,7 @@ That's it. All of Haskell gets compiled through this tiny core.
If you want to learn more about such AST-parametrization, I encourage you to read a blog post about it: [http://blog.ezyang.com/2013/05/the-ast-typing-problem](http://blog.ezyang.com/2013/05/the-ast-typing-problem) .
A binder is used (as the name suggests) to bind a variable to an expression. The `Expr` data type is parametrized by the binder type. The most common one is the `type CoreBndr = Var` where `Var` comes from [compiler/basicTypes/Var.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/basicTypes/Var.hs), which in fact is a `Name` with some extra information attached (like types).
A binder is used (as the name suggests) to bind a variable to an expression. The `Expr` data type is parametrized by the binder type. The most common one is the `type CoreBndr = Var` where `Var` comes from [compiler/basicTypes/Var.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/basicTypes/Var.hs), which in fact is a `Name` with some extra information attached (like types).
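As a rough illustration of what parametrizing an expression type by its binder buys you, consider the toy type below. It is a deliberately tiny sketch: `ToyExpr`, `TypedBndr`, `mapBinders` and the three constructors are hypothetical and far smaller than GHC's real `Expr`.

```haskell
-- Toy expression type, parametrized by the type b of its binders.
-- Occurrences of variables are plain strings in this sketch.
data ToyExpr b
  = ToyVar String
  | ToyLam b (ToyExpr b)
  | ToyApp (ToyExpr b) (ToyExpr b)
  deriving (Eq, Show)

-- One possible binder: a name with extra information attached.
data TypedBndr = TypedBndr
  { bndrName :: String
  , bndrType :: String
  } deriving (Eq, Show)

-- Change the binder type everywhere, leaving the term structure intact.
mapBinders :: (b -> c) -> ToyExpr b -> ToyExpr c
mapBinders _ (ToyVar x)   = ToyVar x
mapBinders f (ToyLam b e) = ToyLam (f b) (mapBinders f e)
mapBinders f (ToyApp g a) = ToyApp (mapBinders f g) (mapBinders f a)

-- The identity function with an annotated binder.
typedIdentity :: ToyExpr TypedBndr
typedIdentity = ToyLam (TypedBndr "x" "Int") (ToyVar "x")

-- The same term with the annotation dropped from its binder.
plainIdentity :: ToyExpr String
plainIdentity = mapBinders bndrName typedIdentity
```

Because only binding sites carry the parameter, one traversal can add or strip binder annotations without touching the rest of the tree.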
Here are some notes about the individual constructors of `Expr`.
......@@ -133,7 +133,7 @@ case (reverse xs) of y { DEFAULT -> f y }
Case expressions have several invariants
- The `res_ty` type is the same as the type of any of the right-hand sides (up to refining unification -- coreRefineTys in [compiler/types/Unify.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/types/Unify.hs) -- in pre-[FC](commentary/compiler/fc)).
- The `res_ty` type is the same as the type of any of the right-hand sides (up to refining unification -- coreRefineTys in [compiler/types/Unify.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/types/Unify.hs) -- in pre-[FC](commentary/compiler/fc)).
- If there is a `DEFAULT` alternative, it must appear first. This makes finding a `DEFAULT` alternative easy, when it exists.
......@@ -166,7 +166,7 @@ allowed. In other words, it is possible to come across a definition of a
variable that has the same name (`realUnique`) as some other one that is
already in scope. One of the possible ways to deal with that is to
use `Subst` (substitution environment from
[compiler/coreSyn/CoreSubst.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/coreSyn/CoreSubst.hs)), which maintains the list of
[compiler/coreSyn/CoreSubst.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/coreSyn/CoreSubst.hs)), which maintains the list of
variables in scope and makes it possible to clone (i.e. rename) only the
variables that actually capture names of some earlier ones. For some more
explanations about this approach see
......@@ -176,4 +176,4 @@ explanations about this approach see
## Human readable Core generation
If you are interested in the way Core is translated into human readable form, you should check the sources for [compiler/coreSyn/PprCore.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/coreSyn/PprCore.hs). It is especially useful if you want to see how the Core data types are being built, especially when there is no Show instance defined for them.
If you are interested in the way Core is translated into human readable form, you should check the sources for [compiler/coreSyn/PprCore.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/coreSyn/PprCore.hs). It is especially useful if you want to see how the Core data types are being built, especially when there is no Show instance defined for them.
......@@ -21,7 +21,7 @@ The structure of the Core-to-Core pipeline is determined in the `getCoreToDo` fu
- **Simplifier, gentle run**
- **Specialisation**: specialisation attempts to eliminate overloading. More details can be found in the comments in [compiler/specialise/Specialise.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/specialise/Specialise.hs).
- **Specialisation**: specialisation attempts to eliminate overloading. More details can be found in the comments in [compiler/specialise/Specialise.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/specialise/Specialise.hs).
- **Full laziness, 1st pass**: floats let-bindings outside of lambdas. This pass includes annotating bindings with level information and then running the float-out pass. In this first pass of the full laziness we don't float partial applications and bindings that contain free variables - this will be done by the second pass later in the pipeline. See "Further Reading" section below for pointers where to find the description of the full laziness algorithm.
......@@ -29,7 +29,7 @@ The structure of the Core-to-Core pipeline is determined in the `getCoreToDo` fu
- **Float in, 1st pass**: the opposite of full laziness, this pass floats let-bindings as close to their use sites as possible. It will not undo the full laziness by sinking bindings inside a lambda, unless the lambda is one-shot. At this stage we have not yet run the demand analysis, so we only have demand information for things that we imported.
- **Call arity**: attempts to eta-expand local functions based on how they are used. If run, this pass is followed by a 0 phase of the simplifier. See Notes in [compiler/simplCore/CallArity.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/simplCore/CallArity.hs) and the relevant paper.
- **Call arity**: attempts to eta-expand local functions based on how they are used. If run, this pass is followed by a 0 phase of the simplifier. See Notes in [compiler/simplCore/CallArity.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/simplCore/CallArity.hs) and the relevant paper.
- **Demand analysis, 1st pass** (a.k.a. strictness analysis): runs the [demand analyser](commentary/compiler/demand) followed by worker-wrapper transformation ([JFP paper](http://ittc.ku.edu/~andygill/papers/wrapper.pdf)) and 0 phase of the simplifier. This pass tries to determine if some expressions are certain to be used and whether they will be used once or many times (cardinality analysis). We currently don't have means of saying that a binding is certain to be used many times. We can only determine that it is certain to be one-shot (ie. used only once) or probable to be one shot. Demand analysis pass only annotates Core with strictness information. This information is later used by worker/wrapper pass to perform transformations. CPR analysis is also done during demand analysis.
......
......@@ -13,12 +13,12 @@ This discussion is going to omit concerns related to dynamic code loading in GHC
## The overall driver
The meat of this logic is in [compiler/main/GhcMake.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/GhcMake.hs), with primary entry point the function `load` (in the case of `--make`, this function is called with `LoadAllTargets`, instructing all target modules to be compiled, which is stored in `hsc_targets`).
The meat of this logic is in [compiler/main/GhcMake.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/GhcMake.hs), with primary entry point the function `load` (in the case of `--make`, this function is called with `LoadAllTargets`, instructing all target modules to be compiled, which is stored in `hsc_targets`).
### Dependency analysis
Dependency analysis is carried out by the `depanal` function; the resulting `ModuleGraph` is stored into `hsc_mod_graph`. Essentially, this pass looks at all of the imports of the target modules (`hsc_targets`), and recursively pulls in all of their dependencies (stopping at package boundaries.) The resulting module graph consists of a list of `ModSummary` (defined in [compiler/main/HscTypes.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/HscTypes.hs)), which record various information about modules prior to compilation (recompilation checking, even), such as their module identity (the current package name plus the module name), whether or not the file is a boot file, where the source file lives. Dependency analysis inside GHC is often referred to as **downsweep**.
Dependency analysis is carried out by the `depanal` function; the resulting `ModuleGraph` is stored into `hsc_mod_graph`. Essentially, this pass looks at all of the imports of the target modules (`hsc_targets`), and recursively pulls in all of their dependencies (stopping at package boundaries.) The resulting module graph consists of a list of `ModSummary` (defined in [compiler/main/HscTypes.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/HscTypes.hs)), which record various information about modules prior to compilation (recompilation checking, even), such as their module identity (the current package name plus the module name), whether or not the file is a boot file, where the source file lives. Dependency analysis inside GHC is often referred to as **downsweep**.
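The recursive "pull in all imports, stop at package boundaries" step can be pictured with a small sketch. This is not the `depanal` implementation: `ImportTable` and `downsweepSketch` are hypothetical names, modules are plain strings, and a module that is missing from the table is assumed to belong to another package.

```haskell
type ModName = String

-- Hypothetical import table: each home-package module with its direct imports.
type ImportTable = [(ModName, [ModName])]

-- Collect the targets and everything they transitively import from the
-- home package, in the order the modules are first discovered.
downsweepSketch :: ImportTable -> [ModName] -> [ModName]
downsweepSketch table = go []
  where
    go seen [] = reverse seen
    go seen (m : rest)
      | m `elem` seen = go seen rest          -- already visited (also breaks cycles)
      | otherwise =
          case lookup m table of
            Nothing   -> go seen rest         -- not a home module: package boundary
            Just deps -> go (m : seen) (deps ++ rest)

-- A tiny example graph; "Data.List" is deliberately absent from the table.
exampleTable :: ImportTable
exampleTable =
  [ ("Main", ["A", "Data.List", "B"])
  , ("A", ["B"])
  , ("B", [])
  ]
```

With `exampleTable` and the single target `"Main"`, the sketch visits `Main`, then `A`, then `B`, and skips `Data.List` because it is outside the home package.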
ToDo: say something about how hs-boot files are
......@@ -43,7 +43,7 @@ ToDo: say something about stability; it's per SCC
Compilation, also known as **upsweep**, walks the module graph in topological order and compiles everything. Depending on whether or not we are doing parallel compilation, this is implemented by `upsweep` or by `parUpsweep`. In this section, we'll talk about the sequential upsweep.
The key data structure which we are filling in as we perform compilation is the **home package table** or HPT (`hsc_HPT`, defined in [compiler/main/HscTypes.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/main/HscTypes.hs)). As its name suggests, it contains information from the \*home package\*, i.e. the package we are currently compiling. Its entries, `HomeModInfo`, contain the sum total knowledge of a module after compilation: both its pre-linking interface `ModIface` as well as the post-linking details `ModDetails`.
The key data structure which we are filling in as we perform compilation is the **home package table** or HPT (`hsc_HPT`, defined in [compiler/main/HscTypes.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/main/HscTypes.hs)). As its name suggests, it contains information from the \*home package\*, i.e. the package we are currently compiling. Its entries, `HomeModInfo`, contain the sum total knowledge of a module after compilation: both its pre-linking interface `ModIface` as well as the post-linking details `ModDetails`.
We \*clear\* out the home package table in the session (for `--make`, this was empty anyway), but we pass in the old HPT.
......
......@@ -6,16 +6,16 @@ Video: [Types and Classes](http://www.youtube.com/watch?v=pN9rhQHcfCo&list=PLBkR
For each kind of Haskell entity (identifier, type variable, type constructor, data constructor, class) GHC has a data type to represent it. Here they are:
- **Type constructors** are represented by the `TyCon` type ([compiler/types/TyCon.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/types/TyCon.hs)).
- **Classes** are represented by the `Class` type ([compiler/types/Class.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/types/Class.hs)).
- **Data constructors** are represented by the `DataCon` type ([compiler/basicTypes/DataCon.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/basicTypes/DataCon.hs)).
- **Pattern synonyms** are represented by the `PatSyn` type ([compiler/basicTypes/PatSyn.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/basicTypes/PatSyn.hs)).
- **Term variables** (`Id`) and **type variables** (`TyVar`) are both represented by the `Var` type ([compiler/basicTypes/Var.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/basicTypes/Var.hs)).
- **Type constructors** are represented by the `TyCon` type ([compiler/types/TyCon.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/types/TyCon.hs)).
- **Classes** are represented by the `Class` type ([compiler/types/Class.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/types/Class.hs)).
- **Data constructors** are represented by the `DataCon` type ([compiler/basicTypes/DataCon.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/basicTypes/DataCon.hs)).
- **Pattern synonyms** are represented by the `PatSyn` type ([compiler/basicTypes/PatSyn.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/basicTypes/PatSyn.hs)).
- **Term variables** (`Id`) and **type variables** (`TyVar`) are both represented by the `Var` type ([compiler/basicTypes/Var.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/basicTypes/Var.hs)).
All of these entities have a `Name`, but that's about all they have in common. However they are sometimes treated uniformly:
- A **`TyThing`** ([compiler/types/TypeRep.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/types/TypeRep.hs)) is simply the sum of all four:
- A **`TyThing`** ([compiler/types/TypeRep.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/types/TypeRep.hs)) is simply the sum of all four:
```wiki
data TyThing = AnId Id
......@@ -37,7 +37,7 @@ So you can see that the GHC data structures for entities is a *graph* not tree:
## Type variables and term variables
Type variables and term variables are represented by a single data type, `Var`, thus ([compiler/basicTypes/Var.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/basicTypes/Var.hs)):
Type variables and term variables are represented by a single data type, `Var`, thus ([compiler/basicTypes/Var.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/basicTypes/Var.hs)):
```wiki
type Id = Var
......@@ -51,7 +51,7 @@ It's incredibly convenient to use a single data type for both, rather than using
- We only need one lambda constructor in Core: `Lam :: Var -> CoreExpr -> CoreExpr`.
The `Var` type distinguishes the two sorts of variable; indeed, it makes somewhat finer distinctions ([compiler/basicTypes/Var.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/basicTypes/Var.hs)):
The `Var` type distinguishes the two sorts of variable; indeed, it makes somewhat finer distinctions ([compiler/basicTypes/Var.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/basicTypes/Var.hs)):
```wiki
data Var
......@@ -124,7 +124,7 @@ All the value bindings in the module being compiled (whether top level or not) a
## `GlobalIdDetails` and implicit Ids
`GlobalId`s are further classified by their `GlobalIdDetails`. This type is defined in [compiler/basicTypes/IdInfo.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/basicTypes/IdInfo.hs), because it mentions other structured types such as `DataCon`. Unfortunately it is *used* in Var.hs so there's a hi-boot knot to get it there. Anyway, here's the declaration (elided a little):
`GlobalId`s are further classified by their `GlobalIdDetails`. This type is defined in [compiler/basicTypes/IdInfo.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/basicTypes/IdInfo.hs), because it mentions other structured types such as `DataCon`. Unfortunately it is *used* in Var.hs so there's a hi-boot knot to get it there. Anyway, here's the declaration (elided a little):
```wiki
data GlobalIdDetails
......
......@@ -20,7 +20,7 @@ form `T1 :=: T2`. (`c :: T1 :=: T2`) is a proof that a term of type `T1`
can be coerced to type `T2`.
Coercions are classified by a new sort of kind (with the form
`T1 :=: T2`). Most of the coercion construction and manipulation functions
are found in the `Coercion` module, [compiler/types/Coercion.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/types/Coercion.hs).
are found in the `Coercion` module, [compiler/types/Coercion.hs](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/types/Coercion.hs).
Coercions appear in Core in the form of `Cast` expressions:
......
......@@ -7,17 +7,17 @@ Video: [Abstract Syntax Types](http://www.youtube.com/watch?v=lw7kbUvAmK4&list=P
The program is initially parsed into "**`HsSyn`**", a collection of data types that describe the full abstract syntax of Haskell. `HsSyn` is a pretty big collection of types: there are 52 data types at last count. Many are pretty trivial, but a few have a lot of constructors (`HsExpr` has 40). `HsSyn` represents Haskell in its full glory, complete with all syntactic sugar.
The `HsSyn` modules live in the [compiler/hsSyn](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/hsSyn) directory. Each module declares a related group of declarations, *and* gives their pretty-printer.
The `HsSyn` modules live in the [compiler/hsSyn](https://gitlab.haskell.org/ghc/ghc/blob/master/compiler/hsSyn) directory. Each module declares a related group of declarations, *and* gives their pretty-printer.
- [compiler/hsSyn/HsSyn.hs](https://gitlab.haskell.org/ghc/ghc/tree/master/ghc/compiler/hsSyn/HsSyn.hs): the root module. It exports everything you need, and it's generally what you should import.