Skip to content

GitLab

  • Projects
  • Groups
  • Snippets
  • Help
    • Loading...
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
  • Sign in / Register
GHC
GHC
  • Project overview
    • Project overview
    • Details
    • Activity
    • Releases
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
    • Locked Files
  • Issues 4,251
    • Issues 4,251
    • List
    • Boards
    • Labels
    • Service Desk
    • Milestones
    • Iterations
  • Merge Requests 396
    • Merge Requests 396
  • Requirements
    • Requirements
    • List
  • CI / CD
    • CI / CD
    • Pipelines
    • Jobs
    • Schedules
  • Security & Compliance
    • Security & Compliance
    • Dependency List
    • License Compliance
  • Operations
    • Operations
    • Incidents
    • Environments
  • Analytics
    • Analytics
    • CI / CD
    • Code Review
    • Insights
    • Issue
    • Repository
    • Value Stream
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Members
    • Members
  • Collapse sidebar
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
  • Glasgow Haskell Compiler
  • GHCGHC
  • Wiki
    • Data parallel
  • design

Last edited by Tobias Dammers Mar 29, 2019
Page history New page

design

High-level design of nested data parallelism in GHC

The NDP part of the compiler is made up of two components: a parallel array library and code vectorisation, a transformation which eliminates nested parallelism. The library defines the type of parallel arrays, supporting operations and typeclasses and a loop fusion framework. Crucially, the actual representation of a parallel array is determined by the type of its elements. For instance, [:Int:] is just an array of unboxed Ints whereas [:(a,b):] is, essentially, a pair of arrays ([:a:],[:b:]). Associated data types and type synonyms allow us to implement this entirely in the library, without having to modify the compiler. In contrast to this, code vectorisation is implemented as a Core-to-Core transformation in GHC. In order to be able to deal with higher-order functions in parallel contexts, we also perform closure conversion as part of code vectorisation.

This is a rough sketch of how the various components fit into the compiler pipeline:

                    Program                                                    Library
                    =======                                                    =======

                    Haskell
                 parallel arrays         <--------------+
                nested parallelism                      |
                       |                                |
                       * desugaring                     |
                       |                                |
                       v                                |
                     Core                               |
                 parallel arrays         <--------------+---------------  Parallel arrays ([:.:])
                nested parallelism                      |                         |
                       |                                |                         |
                       * vectorisation                  |                         |
                       |                                |                         |
                       v                                |                         |
                     Core                               |                         |
                 parallel arrays         <--------------+                         |
                 flat parallelism                                                 |
                       |                                                          |
                       * inlining                                                 |
                       |                                                          |
                       v                                                          |
                     Core                                                         v
                 unboxed arrays          <------------------------------  Parallel operations
              (parallel operations)                                        on unboxed arrays
                       |                                                          |
                       * inlining                                                 |
                       |                                                          |
                       v                                                          |
                     Core                                                         v
                distributed types        <----+------------------------    Distributed types
                 unboxed arrays               |                                   |
             (sequential operations)          |                   +---------------+---------------+
                       |                      |                   |                               |
                       * fusion               |                   v                               |
                       |                      +-----------  Unboxed arrays                        |
                       v                      |              (sequential)                         |
                     Core                     |                                                   |
                distributed types        <----+                                                   |
                 unboxed arrays                                                                   |
             (sequential operations)                                                              |
                       |                                                                          |
                       * inlining                                                                 |
                       |                                                                          |
                       v                                                                          |
                     Core                                                                         v
                 gang operations         <----------------------------------------------------  Gangs
                   ByteArrays
                       |
                       * optimisation
                       |
                       v
                     Core
                 gang operations
                       |
                       * code generation
                       |
                       v
                  Object code
                 RTS with gangs
Haskell with nested parallelism This is vanilla Haskell with the parallel array data type [:.:], array comprehensions, and collective array operations.
Core with nested parallelism The result of normal desugaring; in particular, array comprehensions are eliminated.
Core with flat parallelism code vectorisation replaces nested parallelism by operations on flat arrays.
Core with parallel operations on unboxed arrays All operations on [:.:] are inlined, leaving only parallel operations on simple unboxed arrays.
Core with distributed types and sequential unboxed arrays Parallel operations on unboxed arrays, which are implemented in terms of distributed types and sequential array operations, are inlined. Fusion rules are applied to the result.
Core with gang operations Operations on distributed types and unboxed arrays are inlined, producing code which only makes use of gang operations and the standard libraries. It is optimised as usual.

See the paper Data Parallel Haskell: a status report for an in-depth illustration of this architecture with concrete code of a running example at each transformation stage.

Clone repository

GHC Home
GHC User's Guide

Joining In

Newcomers info
Mailing Lists & IRC
The GHC Team

Documentation

GHC Status Info
Working conventions
Building Guide
Debugging
Commentary

Wiki

Title Index
Recent Changes