... | ... | @@ -14,7 +14,7 @@ TODO This page is possibly outdated. Update to the latest information. |
|
|
This page was written with more detail than usual since you may need to know how to work with Cmm as a programming language. Cmm is the basis for the future of GHC, Native Code Generation, and if you are interested in hacking Cmm, this page should at least help reduce your learning curve. As a finer detail, if you read the [Compiler pipeline](commentary/compiler/hsc-main) wiki page or glanced at the diagram there, you may have noticed that whether you are working backward from an `intermediate C` (Haskell-C "HC", `.hc`) file or an Assembler file, you get to Cmm before you get to the STG language, the Simplifier or anything else. In other words, for really low-level debugging you may have an easier time if you know what Cmm is about. Cmm also offers opportunities for small and easy hacks, such as little optimisations and new Cmm Primitive Operations.
|
|
|
|
|
|
|
|
|
A portion of the [RTS](commentary/rts) is written in Cmm: [rts/Apply.cmm](/trac/ghc/browser/ghc/rts/Apply.cmm), [rts/Exception.cmm](/trac/ghc/browser/ghc/rts/Exception.cmm), [rts/HeapStackCheck.cmm](/trac/ghc/browser/ghc/rts/HeapStackCheck.cmm), [rts/PrimOps.cmm](/trac/ghc/browser/ghc/rts/PrimOps.cmm), [rts/StgMiscClosures.cmm](/trac/ghc/browser/ghc/rts/StgMiscClosures.cmm), [rts/StgStartup.cmm](/trac/ghc/browser/ghc/rts/StgStartup.cmm) and [rts/StgStdThunks.cmm](/trac/ghc/browser/ghc/rts/StgStdThunks.cmm). (For notes related to `PrimOps.cmm` see the [PrimOps](commentary/prim-ops) page; for much of the rest, see the [HaskellExecution](commentary/rts/haskell-execution) page.) Cmm is optimised before GHC outputs either HC or Assembler. The C compiler (from HC, pretty printed by [compiler/cmm/PprC.hs](/trac/ghc/browser/ghc/compiler/cmm/PprC.hs)) and the [Native Code Generator](commentary/compiler/backends/ncg) (NCG) [Backends](commentary/compiler/backends) are closely tied to data representations and transformations performed in Cmm. In GHC, Cmm roughly performs a function similar to the intermediate [ Register Transfer Language (RTL)](http://gcc.gnu.org/onlinedocs/gccint/RTL.html) in GCC.
|
|
|
A portion of the [RTS](commentary/rts) is written in Cmm: [rts/Apply.cmm](/trac/ghc/browser/ghc/rts/Apply.cmm), [rts/Exception.cmm](/trac/ghc/browser/ghc/rts/Exception.cmm), [rts/HeapStackCheck.cmm](/trac/ghc/browser/ghc/rts/HeapStackCheck.cmm), [rts/PrimOps.cmm](/trac/ghc/browser/ghc/rts/PrimOps.cmm), [rts/StgMiscClosures.cmm](/trac/ghc/browser/ghc/rts/StgMiscClosures.cmm), [rts/StgStartup.cmm](/trac/ghc/browser/ghc/rts/StgStartup.cmm) and [rts/StgStdThunks.cmm](/trac/ghc/browser/ghc/rts/StgStdThunks.cmm). (For notes related to `PrimOps.cmm` see the [PrimOps](commentary/prim-ops) page; for much of the rest, see the [HaskellExecution](commentary/rts/haskell-execution) page.) Cmm is optimised before GHC outputs either HC or Assembler. The C compiler (from HC, pretty printed by [compiler/cmm/PprC.hs](/trac/ghc/browser/ghc/compiler/cmm/PprC.hs)) and the [Native Code Generator](commentary/compiler/backends/ncg) (NCG) [Backends](commentary/compiler/backends) are closely tied to data representations and transformations performed in Cmm. In GHC, Cmm roughly performs a function similar to the intermediate [Register Transfer Language (RTL)](http://gcc.gnu.org/onlinedocs/gccint/RTL.html) in GCC.
|
|
|
|
|
|
# Table of Contents
|
|
|
|
... | ... | @@ -51,7 +51,7 @@ A portion of the [RTS](commentary/rts) is written in Cmm: [rts/Apply.cmm](/trac/ |
|
|
|
|
|
# The Cmm language
|
|
|
|
|
|
`Cmm` is the GHC implementation of the `C--` language; it is also the extension of Cmm source code files: `.cmm` (see [What the hell is a .cmm file?](commentary/rts/cmm)). The GHC [Code Generator](commentary/compiler/code-gen) (`CodeGen`) compiles the STG program into `C--` code, represented by the `Cmm` data type. This data type follows the [ definition of \`C--\`](http://www.cminusminus.org/) pretty closely but there are some remarkable differences. For a discussion of the Cmm implementation noting most of those differences, see the [Basic Cmm](commentary/compiler/cmm-type#basic-cmm) section, below.
|
|
|
`Cmm` is the GHC implementation of the `C--` language; it is also the extension of Cmm source code files: `.cmm` (see [What the hell is a .cmm file?](commentary/rts/cmm)). The GHC [Code Generator](commentary/compiler/code-gen) (`CodeGen`) compiles the STG program into `C--` code, represented by the `Cmm` data type. This data type follows the [definition of \`C--\`](http://www.cminusminus.org/) pretty closely but there are some remarkable differences. For a discussion of the Cmm implementation noting most of those differences, see the [Basic Cmm](commentary/compiler/cmm-type#basic-cmm) section, below.
|
|
|
|
|
|
- [compiler/cmm/Cmm.hs](/trac/ghc/browser/ghc/compiler/cmm/Cmm.hs): the main data type definition.
|
|
|
- [compiler/cmm/CmmMachOp.hs](/trac/ghc/browser/ghc/compiler/cmm/CmmMachOp.hs): data types defining the machine operations (e.g. floating point divide) provided by `Cmm`.
|
... | ... | @@ -176,18 +176,18 @@ Note that if `p (bits32 i) { ... }` were written as a Cmm-parseable procedure, a |
|
|
|
|
|
## Basic Cmm
|
|
|
|
|
|
FIXME The links in this section are dead. But the files can be found here: [ http://www.cs.tufts.edu/\~nr/c--/index.html](http://www.cs.tufts.edu/~nr/c--/index.html). Relevant discussion about the documentations of C--: [ https://mail.haskell.org/pipermail/ghc-devs/2014-September/006301.html](https://mail.haskell.org/pipermail/ghc-devs/2014-September/006301.html)
|
|
|
FIXME The links in this section are dead, but the files can be found here: [http://www.cs.tufts.edu/\~nr/c--/index.html](http://www.cs.tufts.edu/~nr/c--/index.html). Relevant discussion about the documentation of C--: [https://mail.haskell.org/pipermail/ghc-devs/2014-September/006301.html](https://mail.haskell.org/pipermail/ghc-devs/2014-September/006301.html)
|
|
|
|
|
|
|
|
|
Cmm is a high level assembler with a syntax style similar to C. This section describes Cmm by working up from assembler--the C-- papers and specification work down from C. At the least, you should know what a "high level" assembler is, see [ What is a High Level Assembler?](http://webster.cs.ucr.edu/AsmTools/HLA/HLADoc/HLARef/HLARef3.html#1035157). Cmm is different than other high level assembler languages in that it was designed to be a semi-portable intermediate language for compilers; most other high level assemblers are designed to make the tedium of assembly language more convenient and intelligible to humans. If you are completely new to C--, I highly recommend these papers listed on the [ C-- Papers](http://cminusminus.org/papers.html) page:
|
|
|
Cmm is a high-level assembler with a syntax style similar to C. This section describes Cmm by working up from assembler, whereas the C-- papers and specification work down from C. At the least, you should know what a "high level" assembler is; see [What is a High Level Assembler?](http://webster.cs.ucr.edu/AsmTools/HLA/HLADoc/HLARef/HLARef3.html#1035157). Cmm differs from other high-level assembler languages in that it was designed to be a semi-portable intermediate language for compilers; most other high-level assemblers are designed to make the tedium of assembly language more convenient and intelligible to humans. If you are completely new to C--, I highly recommend these papers listed on the [C-- Papers](http://cminusminus.org/papers.html) page:
|
|
|
|
|
|
- [ C--: A Portable Assembly Language that Supports Garbage Collection (1999)](http://cminusminus.org/abstracts/ppdp.html) (Paper page with Abstract)
|
|
|
- [ C--: A Portable Assembly Language (1997)](http://cminusminus.org/abstracts/pal-ifl.html) (Paper page with Abstract)
|
|
|
- [ A Single Intermediate Language That Supports Multiple Implementations of Exceptions (2000)](http://cminusminus.org/abstracts/c--pldi-00.html) (Paper page with Abstract)
|
|
|
- [ The C-- Language Specification Version 2.0 (CVS Revision 1.128, 23 February 2005)](http://cminusminus.org/extern/man2.pdf) (PDF)
|
|
|
- [C--: A Portable Assembly Language that Supports Garbage Collection (1999)](http://cminusminus.org/abstracts/ppdp.html) (Paper page with Abstract)
|
|
|
- [C--: A Portable Assembly Language (1997)](http://cminusminus.org/abstracts/pal-ifl.html) (Paper page with Abstract)
|
|
|
- [A Single Intermediate Language That Supports Multiple Implementations of Exceptions (2000)](http://cminusminus.org/abstracts/c--pldi-00.html) (Paper page with Abstract)
|
|
|
- [The C-- Language Specification Version 2.0 (CVS Revision 1.128, 23 February 2005)](http://cminusminus.org/extern/man2.pdf) (PDF)
|
|
|
|
|
|
|
|
|
Cmm is not a stand alone C-- compiler; it is an implementation of C-- embedded in the GHC compiler. One difference between Cmm and a C-- compiler like [ Quick C--](http://cminusminus.org/code.html) is this: Cmm uses the C preprocessor (cpp). Cpp lets Cmm *integrate* with C code, especially the C header defines in [includes](/trac/ghc/browser/ghc/includes), and among many other consequences it makes the C-- `import` and `export` statements irrelevant; in fact, according to [compiler/cmm/CmmParse.y](/trac/ghc/browser/ghc/compiler/cmm/CmmParse.y) they are ignored. The most significant action taken by the Cmm modules in the Compiler is to optimise Cmm, through [compiler/cmm/CmmOpt.hs](/trac/ghc/browser/ghc/compiler/cmm/CmmOpt.hs). The Cmm Optimiser generally runs a few simplification passes over primitive Cmm operations, inlines simple Cmm expressions that do not contain global registers (these would be left to one of the [Backends](commentary/compiler/backends), which currently cannot handle inlines with global registers) and performs a simple loop optimisation.
|
|
|
Cmm is not a stand-alone C-- compiler; it is an implementation of C-- embedded in the GHC compiler. One difference between Cmm and a C-- compiler like [Quick C--](http://cminusminus.org/code.html) is this: Cmm uses the C preprocessor (cpp). Cpp lets Cmm *integrate* with C code, especially the C header defines in [includes](/trac/ghc/browser/ghc/includes), and among many other consequences it makes the C-- `import` and `export` statements irrelevant; in fact, according to [compiler/cmm/CmmParse.y](/trac/ghc/browser/ghc/compiler/cmm/CmmParse.y) they are ignored. The most significant action taken by the Cmm modules in the Compiler is to optimise Cmm, through [compiler/cmm/CmmOpt.hs](/trac/ghc/browser/ghc/compiler/cmm/CmmOpt.hs). The Cmm Optimiser generally runs a few simplification passes over primitive Cmm operations, inlines simple Cmm expressions that do not contain global registers (these would be left to one of the [Backends](commentary/compiler/backends), which currently cannot handle inlines with global registers) and performs a simple loop optimisation.
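
To make the optimiser description more concrete, here is a minimal sketch of a simplification pass and of the "no global registers" inlining check. It works on a toy expression type and is not taken from `CmmOpt.hs`; the names `Expr`, `simplify`, `mentionsGlobal` and `canInline` are all hypothetical and exist only for this illustration.

```haskell
-- Toy expression type for illustration only; this is NOT GHC's CmmExpr.
data Expr
  = Lit Integer          -- integer literal
  | LocalReg String      -- local (temporary) register
  | GlobalReg String     -- global register, e.g. "Sp" or "Hp"
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Eq, Show)

-- One bottom-up simplification pass: fold constants, drop identities.
simplify :: Expr -> Expr
simplify (Add a b) =
  case (simplify a, simplify b) of
    (Lit x, Lit y) -> Lit (x + y)   -- constant folding
    (Lit 0, e)     -> e             -- 0 + e = e
    (e, Lit 0)     -> e             -- e + 0 = e
    (a', b')       -> Add a' b'
simplify (Mul a b) =
  case (simplify a, simplify b) of
    (Lit x, Lit y) -> Lit (x * y)   -- constant folding
    (Lit 1, e)     -> e             -- 1 * e = e
    (e, Lit 1)     -> e             -- e * 1 = e
    (a', b')       -> Mul a' b'
simplify e = e

-- True if the expression refers to any global register.
mentionsGlobal :: Expr -> Bool
mentionsGlobal (GlobalReg _) = True
mentionsGlobal (Add a b)     = mentionsGlobal a || mentionsGlobal b
mentionsGlobal (Mul a b)     = mentionsGlobal a || mentionsGlobal b
mentionsGlobal _             = False

-- In this sketch an expression may be inlined only if it mentions
-- no global register, mirroring the restriction described above.
canInline :: Expr -> Bool
canInline = not . mentionsGlobal
```

In GHCi, `simplify (Add (Lit 2) (Mul (Lit 3) (Lit 1)))` evaluates to `Lit 5`, and `canInline (Add (GlobalReg "Sp") (Lit 8))` is `False`. The real optimiser works on GHC's own Cmm data types and is considerably more involved.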
|
|
|
|
|
|
### Code Blocks in Cmm
|
|
|
|
... | ... | @@ -673,7 +673,7 @@ data CmmStatic |
|
|
```
|
|
|
|
|
|
|
|
|
Note the `CmmAlign` constructor: this maps to the assembler directive `.align N` to set alignment for a data item (hopefully one you remembered to label). This is the same as the `align` directive noted in Section 4.5 of the [ C-- specification (PDF)](http://cminusminus.org/extern/man2.pdf). In the current implementation of Cmm the `align` directive seems superfluous because [compiler/nativeGen/PprMach.hs](/trac/ghc/browser/ghc/compiler/nativeGen/PprMach.hs) translates `Section`s to assembler with alignment directives corresponding to the target architecture (see [Sections and Directives](commentary/compiler/cmm-type#sections-and-directives), below).
|
|
|
Note the `CmmAlign` constructor: this maps to the assembler directive `.align N` to set alignment for a data item (hopefully one you remembered to label). This is the same as the `align` directive noted in Section 4.5 of the [C-- specification (PDF)](http://cminusminus.org/extern/man2.pdf). In the current implementation of Cmm the `align` directive seems superfluous because [compiler/nativeGen/PprMach.hs](/trac/ghc/browser/ghc/compiler/nativeGen/PprMach.hs) translates `Section`s to assembler with alignment directives corresponding to the target architecture (see [Sections and Directives](commentary/compiler/cmm-type#sections-and-directives), below).
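
As a rough illustration of how an alignment item can map to an assembler directive, the sketch below pretty-prints a list of static data items. The `Static` type and the `pprStatic` and `pprStatics` functions are hypothetical simplifications invented for this example; they are not GHC's `CmmStatic` or the code in `PprMach.hs`, and the output format is only assumed to resemble GNU assembler syntax.

```haskell
-- Hypothetical, simplified stand-in for static data items.
-- This is NOT GHC's CmmStatic; it only mirrors the idea.
data Static
  = Align Int            -- request alignment N for the items that follow
  | DataLabel String     -- label naming the data item
  | Word32 Integer       -- a 32-bit literal
  deriving (Eq, Show)

-- Render one item as an assembler-style line.
-- Note: how an assembler interprets N in ".align N" (a byte count or
-- a power of two) depends on the target; this sketch passes N through.
pprStatic :: Static -> String
pprStatic (Align n)     = "\t.align " ++ show n
pprStatic (DataLabel l) = l ++ ":"
pprStatic (Word32 w)    = "\t.long " ++ show w

-- Render a whole data section body, one item per line.
pprStatics :: [Static] -> String
pprStatics = unlines . map pprStatic
```

For example, `putStr (pprStatics [Align 4, DataLabel "x", Word32 7])` emits an `.align 4` line, then the label `x:`, then `.long 7`. Placing the alignment item before the label is what ties the alignment to that labelled data item.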
|
|
|
|
|
|
#### Labels
|
|
|
|
... | ... | @@ -1372,7 +1372,7 @@ For an example, the floating point sine function, `sinFloat#` in [compiler/prelu |
|
|
## Cmm Design: Observations and Areas for Potential Improvement
|
|
|
|
|
|
|
|
|
"If the application of a primitive operator causes a system exception, such as division by zero, this is an unchecked run-time error. (A future version of this specification may provide a way for a program to recover from such an exception.)" C-- spec, Section 7.4. Cmm may be able to implement a partial solution to this problem, following the paper: [ A Single Intermediate Language That Supports Multiple Implementations of Exceptions (2000)](http://cminusminus.org/abstracts/c--pldi-00.html). (TODO write notes to wiki and test fix.)
|
|
|
"If the application of a primitive operator causes a system exception, such as division by zero, this is an unchecked run-time error. (A future version of this specification may provide a way for a program to recover from such an exception.)" C-- spec, Section 7.4. Cmm may be able to implement a partial solution to this problem, following the paper: [A Single Intermediate Language That Supports Multiple Implementations of Exceptions (2000)](http://cminusminus.org/abstracts/c--pldi-00.html). (TODO write notes to wiki and test fix.)
|
|
|
|
|
|
|
|
|
The IEEE 754 specification for floating point numbers defines exceptions for certain floating point operations, including:
|
... | ... | |