# GHC Commentary: The Code Generator

[compiler/codeGen](/trac/ghc/browser/ghc/compiler/codeGen)

See [The Storage Manager](commentary/rts/storage) for the [Layout of the stack](commentary/rts/storage/stack).

## Storage manager representations

The code generator needs to know the layout of heap objects, because it generates code that accesses and constructs those heap objects. The runtime also needs to know the layout of heap objects, because it contains the garbage collector. How can we share the definition of storage layout so that both the code generator and the runtime have access to it, and so that we don't have to keep two independent definitions in sync?

Currently we solve the problem this way:

- C types representing heap objects are defined in the C header files; see for example [includes/Closures.h](/trac/ghc/browser/ghc/includes/Closures.h).

- A C program, [includes/mkDerivedConstants.c](/trac/ghc/browser/ghc/includes/mkDerivedConstants.c), `#include`s the runtime headers.
  This program is built and run when you type `make` or `make boot` in `includes/`. It is
  run twice: once to generate `includes/DerivedConstants.h`, and again to generate
  `includes/GHCConstants.h`.

- The file `DerivedConstants.h` contains lots of `#define`s like this:

  ```c
  #define OFFSET_StgTSO_why_blocked 18
  ```

  which says that the offset to the `why_blocked` field of an `StgTSO` is 18 bytes. This file
  is `#include`d into [includes/Cmm.h](/trac/ghc/browser/ghc/includes/Cmm.h), so these offsets are available to the
  [hand-written .cmm files](commentary/rts/cmm).

- The file `GHCConstants.h` contains similar definitions:

  ```haskell
  oFFSET_StgTSO_why_blocked = 18::Int
  ```

  This time the definitions are in Haskell syntax, and this file is `#include`d directly into
  [compiler/main/Constants.lhs](/trac/ghc/browser/ghc/compiler/main/Constants.lhs). This is the way that these offsets are made
  available to GHC's code generator.
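
As a rough illustration of the scheme above, the following C sketch shows how one program can take a field offset from the C compiler via `offsetof` and print it in both C and Haskell syntax, and how a consumer can then read the field using nothing but that offset. The struct `DemoTSO` and the names `emit_c_define`, `emit_hs_def`, `read_u16_at`, `EMIT_C` and `EMIT_HS` are invented for this example; they are not taken from `mkDerivedConstants.c`, and the real `StgTSO` layout differs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a runtime heap object; NOT the real StgTSO. */
struct DemoTSO {
    void          *link;
    unsigned short what_next;
    unsigned short why_blocked;
};

/* Write "#define OFFSET_<type>_<field> <offset>" into buf (C syntax). */
int emit_c_define(char *buf, size_t len, const char *type,
                  const char *field, size_t offset)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "#define OFFSET_%s_%s %zu", type, field, offset);
}

/* Write "oFFSET_<type>_<field> = <offset>::Int" into buf (Haskell syntax). */
int emit_hs_def(char *buf, size_t len, const char *type,
                const char *field, size_t offset)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "oFFSET_%s_%s = %zu::Int", type, field, offset);
}

/* Stringify the names so the offset always comes from the C compiler.
   buf must be a char array (not a pointer), because sizeof is applied to it. */
#define EMIT_C(buf, type, field) \
    emit_c_define((buf), sizeof(buf), #type, #field, offsetof(struct type, field))
#define EMIT_HS(buf, type, field) \
    emit_hs_def((buf), sizeof(buf), #type, #field, offsetof(struct type, field))

/* Consumer side: read a 16-bit field given only a base pointer and a byte
   offset, which is roughly how generated code addresses heap-object fields. */
unsigned short read_u16_at(const void *obj, size_t offset)
{
    unsigned short v;
    memcpy(&v, (const char *)obj + offset, sizeof v);
    return v;
}
```

With the offset quoted in the text, `emit_c_define(buf, sizeof buf, "StgTSO", "why_blocked", 18)` writes `#define OFFSET_StgTSO_why_blocked 18`, and `emit_hs_def` with the same arguments writes `oFFSET_StgTSO_why_blocked = 18::Int`. Because both lines come from the same `offsetof` value, the two generated files cannot drift apart.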
## Generated Cmm Naming Convention

See [compiler/cmm/CLabel.hs](/trac/ghc/browser/ghc/compiler/cmm/CLabel.hs)

Labels generated by the code generator are of the form `<name>_<type>`
where `<name>` is `<Module>_<name>` for external names and `<unique>` for
internal names. `<type>` is one of the following:

<table><tr><th>info</th>
<td>Info table
</td></tr>
<tr><th>srt</th>
<td>Static reference table
</td></tr>
<tr><th>srtd</th>
<td>Static reference table descriptor
</td></tr>
<tr><th>entry</th>
<td>Entry code (function, closure)
</td></tr>
<tr><th>slow</th>
<td>Slow entry code (if any)
</td></tr>
<tr><th>ret</th>
<td>Direct return address
</td></tr>
<tr><th>vtbl</th>
<td>Vector table
</td></tr>
<tr><th>&lt;n&gt;_alt</th>
<td>Case alternative (tag n)
</td></tr>
<tr><th>dflt</th>
<td>Default case alternative
</td></tr>
<tr><th>btm</th>
<td>Large bitmap vector
</td></tr>
<tr><th>closure</th>
<td>Static closure
</td></tr>
<tr><th>con_entry</th>
<td>Dynamic Constructor entry code
</td></tr>
<tr><th>con_info</th>
<td>Dynamic Constructor info table
</td></tr>
<tr><th>static_entry</th>
<td>Static Constructor entry code
</td></tr>
<tr><th>static_info</th>
<td>Static Constructor info table
</td></tr>
<tr><th>sel_info</th>
<td>Selector info table
</td></tr>
<tr><th>sel_entry</th>
<td>Selector entry code
</td></tr>
<tr><th>cc</th>
<td>Cost centre
</td></tr>
<tr><th>ccs</th>
<td>Cost centre stack
</td></tr></table>

Many of these distinctions are only for documentation reasons. For
example, `_ret` is only distinguished from `_entry` to make it easy to
tell whether a code fragment is a return point or a closure/function
entry.
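
To make the convention concrete, here is a small C sketch that assembles label strings of the form `<name>_<type>`. GHC itself builds and prints its labels in Haskell (see `CLabel.hs`); the functions `external_label`, `internal_label` and `alt_label`, and the sample module, name and unique used in the note below, are hypothetical and ignore details such as name encoding.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* External name: <Module>_<name>_<type>, e.g. an info table or entry code. */
int external_label(char *buf, size_t len, const char *module,
                   const char *name, const char *type)
{
    assert(buf != NULL && len > 0);
    assert(strlen(type) > 0);            /* every label carries a type suffix */
    return snprintf(buf, len, "%s_%s_%s", module, name, type);
}

/* Internal name: <unique>_<type>; the unique is treated as an opaque string. */
int internal_label(char *buf, size_t len, const char *unique, const char *type)
{
    assert(buf != NULL && len > 0);
    assert(strlen(type) > 0);
    return snprintf(buf, len, "%s_%s", unique, type);
}

/* Case alternative: the <type> part is "<n>_alt", where n is the tag. */
int alt_label(char *buf, size_t len, const char *unique, int tag)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s_%d_alt", unique, tag);
}
```

For instance, `external_label(buf, sizeof buf, "Foo", "bar", "info")` writes `Foo_bar_info`, `internal_label(buf, sizeof buf, "s1Ab", "ret")` writes `s1Ab_ret`, and `alt_label(buf, sizeof buf, "s1Ab", 2)` writes `s1Ab_2_alt`.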
## Modules

<table><tr><th>`CodeGen`</th>
<td>Top level. Called by the `HscMain` module.
</td></tr></table>

<table><tr><th>`CgMonad`</th>
<td>The monad in which most of codeGen operates

- Reader
- State
- (could be Writer?)
- fork
- flatten

</td></tr></table>

<table><tr><th>`CgExpr`</th>
<td>Seems to be the core function since everything in STG is an expression
</td></tr></table>

### Not yet classified

Please help classify these if you know what they are.

- `Bitmap`, `ClosureInfo` (stores info about the memory layouts of closures), `SMRep` (storage manager representation of closures; part of `ClosureInfo` but kept separate to "keep nhc happy")
- `CgTicky`, `CgUtils`
- `CgBindery`, `CgHeapery`, `CgStackery`
- `CgClosure`, `CgCon`
- `CgCase`, `CgLetNoEscape`
- `CgHpc`, `CgParallel`, `CgProf`
- `CgInfoTbls`, `CgCallConv`
- `CgPrimOp`, `CgTailCall`, `CgForeignCall`