Last edited by Moritz Angermann Jul 08, 2020

High-level Cmm

This is the discussion page about a "high-level" variant of Cmm.

Motivation

Garbage Collection

Asterius is a GHC-based Haskell-to-WebAssembly compiler. Currently, Asterius emits WebAssembly from Cmm, and at the Cmm level there are many implicit conventions about how the runtime works, e.g. closure representation and heap allocation. The Asterius custom runtime needs to work with these conventions and implement its own garbage collector.

It's possible to use the host platform to represent closures and do garbage collection. WebAssembly's reference types proposal allows JavaScript objects to be imported into WebAssembly as opaque references, and by representing closures as JavaScript objects, a language runtime can get garbage collection for free. Schism is one such example. WebAssembly is also working on an MVP of the GC proposal, which adds native support for garbage-collected structs and arrays.

In order to take advantage of the host platform's garbage collector, we need to work with a "high-level" variant of Cmm. It should satisfy the following properties:

  • The garbage-collected pointer type should be completely opaque.
  • Closure allocation is done by a single primitive, with no explicit heap/stack check.
  • It can be lowered quickly to vanilla Cmm for the NCG, without regressing the native runtime performance of generated code.

Such a high-level Cmm will be useful to any GHC backend which targets a managed runtime.

Intermediate Representations

When targeting intermediate representations for assembly, e.g. LLVM's IR, the Cmm we have is already lowered too far. This results in those code generators having to reconstruct information from the Cmm that should ideally be available to them.

This is mostly about computed offsets. Instead of handing the code generator a precomputed offset into a structure, an intermediate layer between STG and Cmm should retain offsets as multiples of words plus bytes as needed, so that the code generator can make intelligent choices about instruction selection for element retrieval. Right now we have to work with packed structs, as that is what GHC internally assumes the layout to be.
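To make the "words + bytes" idea concrete, here is a minimal sketch of a symbolic offset that is only flattened to bytes once the target word size is known. The names `FieldOffset`, `toByteOffset` and `wordIndex` are hypothetical and do not exist in GHC; they only illustrate what a backend could do with the retained structure.

```haskell
-- Hypothetical names throughout; none of these exist in GHC.

-- | A field offset kept symbolic: whole machine words plus leftover bytes.
data FieldOffset = FieldOffset
  { offWords :: Int  -- ^ number of whole machine words
  , offBytes :: Int  -- ^ additional bytes on top of the words
  } deriving (Eq, Show)

-- | Flatten to a byte offset once the target word size (in bytes) is known.
-- A flat byte offset is the only form a fully lowered IR can offer.
toByteOffset :: Int -> FieldOffset -> Int
toByteOffset wordSizeBytes (FieldOffset w b) = w * wordSizeBytes + b

-- | A backend that prefers index-based addressing (for example a struct
-- field index) can use the word count directly when there is no byte
-- remainder, without having to divide a byte offset back down.
wordIndex :: FieldOffset -> Maybe Int
wordIndex (FieldOffset w 0) = Just w
wordIndex _                 = Nothing
```

The same `FieldOffset 2 4` resolves to different byte offsets on 32-bit and 64-bit targets, which is the information a backend loses when it only sees the already computed number.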

Required features in high-level Cmm

  • CmmGcPtr as a CmmType. CmmGcPtr is not tied to the platform's word size, and there is no bitcast between CmmGcPtr and regular Cmm types.
  • CmmGcLoad/CmmGcStore, which load/store a single CmmGcPtr field in a closure. The closure field address consists of two CmmExprs: the closure address and the offset.
  • CmmGcAlloc as a CmmNode, which returns the CmmGcPtr of the allocated closure.
  • Dedicated set of GlobalRegs for passing CmmGcPtrs in Cmm function calls.
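The features listed above could be modeled roughly as follows. This is a simplified toy, not GHC's actual Cmm definitions: the constructor shapes, the `GcReg` register class, the `String` info-table label and the `canBitcast` helper are all assumptions made for illustration.

```haskell
-- Toy model only; every name and constructor shape here is an assumption,
-- not GHC's real Cmm data types.

data Width = W8 | W16 | W32 | W64
  deriving (Eq, Show)

data CmmType
  = CmmBits Width  -- ^ ordinary fixed-width values
  | CmmGcPtr       -- ^ opaque GC pointer: deliberately carries no width
  deriving (Eq, Show)

-- | Dedicated registers for passing GC pointers in calls.
newtype GcReg = GcReg Int
  deriving (Eq, Show)

data Expr
  = Lit Integer Width
  | GcRegE GcReg
  | GcLoad Expr Expr           -- ^ closure address, field offset
  deriving (Eq, Show)

data Node
  = GcStore Expr Expr Expr     -- ^ closure address, field offset, value
  | GcAlloc GcReg String [Expr] -- ^ result register, info label, initial fields
  deriving (Eq, Show)

-- | GcLoad reads a single GC-pointer field, so it always yields CmmGcPtr.
typeOf :: Expr -> CmmType
typeOf (Lit _ w)    = CmmBits w
typeOf (GcRegE _)   = CmmGcPtr
typeOf (GcLoad _ _) = CmmGcPtr

-- | No bitcast may cross the boundary between CmmGcPtr and regular types.
canBitcast :: CmmType -> CmmType -> Bool
canBitcast CmmGcPtr    _           = False
canBitcast _           CmmGcPtr    = False
canBitcast (CmmBits a) (CmmBits b) = a == b
```

Keeping `CmmGcPtr` free of any `Width` is what lets a managed backend choose its own representation, such as a host reference.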

Lowering to vanilla Cmm

  • CmmGcPtr is converted to gcWord.
  • CmmGcLoad/CmmGcStore are converted to CmmLoad/CmmStore.
  • CmmGcAlloc is converted to the regular heap check.
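The lowering rules above can be sketched as a small pass over a toy IR. All types and names below are hypothetical. The `LHeapCheck` node stands in for the regular heap check, and the sketch assumes that `Hp` points at the last allocated word after the check, so the new closure starts at `Hp + wordSize - bytes`.

```haskell
-- Toy high-level and low-level IRs; all names are hypothetical.

data HExpr
  = HLit Int
  | HReg String
  | HGcLoad HExpr HExpr          -- ^ closure address, offset
  deriving (Eq, Show)

data HNode
  = HGcStore HExpr HExpr HExpr   -- ^ closure address, offset, value
  | HGcAlloc String Int          -- ^ result register, closure size in words
  deriving (Eq, Show)

data LExpr
  = LLit Int
  | LReg String
  | LAdd LExpr LExpr
  | LLoad LExpr
  deriving (Eq, Show)

data LNode
  = LStore LExpr LExpr           -- ^ address, value
  | LAssign String LExpr
  | LHeapCheck Int               -- ^ bump Hp by n bytes, enter the GC on overflow
  deriving (Eq, Show)

-- | A GC load becomes a plain load from (closure + offset).
lowerExpr :: HExpr -> LExpr
lowerExpr (HLit n)      = LLit n
lowerExpr (HReg r)      = LReg r
lowerExpr (HGcLoad c o) = LLoad (LAdd (lowerExpr c) (lowerExpr o))

-- | Lower one node, given the target word size in bytes.
lowerNode :: Int -> HNode -> [LNode]
lowerNode _ (HGcStore c o v) =
  [LStore (LAdd (lowerExpr c) (lowerExpr o)) (lowerExpr v)]
lowerNode wordSize (HGcAlloc dst nWords) =
  let bytes = nWords * wordSize
  in [ LHeapCheck bytes
       -- Assumption: Hp points at the last allocated word after the check.
     , LAssign dst (LAdd (LReg "Hp") (LLit (wordSize - bytes)))
     ]
```

Because each high-level construct maps to a fixed, local sequence, a pass like this stays linear in program size, which is in line with the goal that lowering for the NCG be quick.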

Interaction with existing runtime features

  • STG stack: allocating a stack frame should use the same mechanism as allocating a regular closure, possibly with an extra annotation that the allocation is for the stack. For managed runtimes, the stack can be modeled as an array or linked list of stack frame closures, with no need to handle underflow/overflow. For the NCG, it can be lowered to the regular stack check.
  • Dynamic pointer tagging: won't work at all in high-level Cmm. We always need to read a closure's info table to extract the desired info, e.g. if it's evaluated.
  • Selector thunk optimization and IND elimination: works with high-level Cmm.
  • byteArrayContents#: expose the payload of a pinned ByteArray# as an unmanaged pointer. Works with high-level Cmm, since we support non-gcptr payload in closures, and we can put a malloced pointer in a ByteArray#.
  • anyToAddr#: reinterpret-cast from a gcptr to an unmanaged pointer. Won't work in high-level Cmm.
  • Weak pointers and finalizers: the semantics of weak pointers and finalizers will depend on the embedder of high-level Cmm.
  • ghc-heap compatibility: likely won't work.
  • Compact region: unknown.
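As an illustration of the STG stack item above, a managed runtime could model the stack as a linked list of frame closures. This is only a sketch: `Frame`, `Stack`, `pushFrame` and `popFrame` are hypothetical names, and the `String` field merely stands in for a frame's info table.

```haskell
-- Hypothetical model of a stack made of heap-allocated frame closures.

data Frame = Frame
  { frameInfo    :: String  -- ^ stands in for the frame's info table
  , framePayload :: [Int]   -- ^ saved values held by the frame
  } deriving (Eq, Show)

-- | The stack is a linked list of frames. Each frame is allocated like any
-- other closure, so the host garbage collector manages its lifetime.
newtype Stack = Stack [Frame]
  deriving (Eq, Show)

emptyStack :: Stack
emptyStack = Stack []

-- | Pushing needs no stack-limit check: the list grows through ordinary
-- allocation instead of bumping a pointer inside a fixed chunk.
pushFrame :: Frame -> Stack -> Stack
pushFrame f (Stack fs) = Stack (f : fs)

-- | Popping returns the top frame and the rest, or Nothing at the bottom.
popFrame :: Stack -> Maybe (Frame, Stack)
popFrame (Stack [])       = Nothing
popFrame (Stack (f : fs)) = Just (f, Stack fs)
```

When lowering for the NCG, the same push and pop operations would instead become stack-pointer adjustments guarded by the regular stack check.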