Glasgow Haskell Compiler / GHC / Issues / #21142 (Closed)
Issue created Feb 26, 2022 by doyougnu (@doyougnu, Developer)

What are the RuntimeRep requirements for novel backends?

What

This issue was requested in the conversation of !7577 and serves as a summary of that thread. It should be the single place to discuss these issues. Please make no further comment on !7577 and feel free to edit/rename if I've missed something important.

Progenitor issue: #21078 (closed)

Status

** ON HOLD **
Please see !7577 status

The problem

The essential problem is: does the current design of RuntimeRep satisfy the needs of novel backends? The discussion becomes tricky because answering that question raises further questions:

  1. How is equality on RuntimeRep defined? The meaning of a given RuntimeRep is determined by the platform, which means that for new backends we are trying to reason about platforms we do not yet support or know about.
  2. Given (1), how are we to make RuntimeRep platform-dependent and extensible for future backends? See the Suggestions section below.
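
One possible reading of question (1) can be sketched with a toy model. Everything here is an assumption made for illustration: `Platform`, `Layout`, `layoutOn` and `sameLayoutOn` are hypothetical names, the `Rep` type is a small stand-in rather than GHC's RuntimeRep, and the JavaScript row only reflects the "JS number" mapping quoted later in this issue.

```haskell
-- Toy model for discussion only; these are not GHC's own types.
data Platform = Native32 | Native64 | JavaScript
  deriving (Eq, Show)

-- A small, hypothetical subset of representations.
data Rep = IntRep | Int32Rep | WordRep | Word32Rep
  deriving (Eq, Show)

-- How a value is physically stored; signedness is deliberately ignored.
data Layout = Bits Int | JSNumber
  deriving (Eq, Show)

-- The layout of a representation depends on the platform.
layoutOn :: Platform -> Rep -> Layout
layoutOn JavaScript _ = JSNumber
layoutOn Native32 _ = Bits 32
layoutOn Native64 r = case r of
  IntRep    -> Bits 64
  WordRep   -> Bits 64
  Int32Rep  -> Bits 32
  Word32Rep -> Bits 32

-- Two representations are "equal" here if they share a layout on a platform.
sameLayoutOn :: Platform -> Rep -> Rep -> Bool
sameLayoutOn p a b = layoutOn p a == layoutOn p b
```

Under this model the answer to "are IntRep and Int32Rep the same?" changes with the platform argument, which is why the question cannot be settled for backends that do not exist yet.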

Background

This problem arose during work on the new JavaScript backend (see #21078 (closed)).

  1. We (the IOG team) believed we needed a new prim type, which I (Jeff) called Opaque#, to serve as a handle to arbitrary platform-specific data (JavaScript values). Crucially, Opaque#s are not necessarily pointer-sized. This feature is on the critical path for GHCJS because it allows us to marshal types to and from JavaScript. In the previous GHCJS, the equivalent of Opaque# is called JSVal. With this prim type we should therefore be able to essentially copy the marshalling code from the old GHCJS:
-- data JSVal = JSVal ByteArray# -- for reference, old implementation in old ghcjs
newtype JSVal = JSVal Opaque#    -- new implementation with prim type

-- Pure marshalling to a javascript value
class PToJSVal a where
  pToJSVal :: a -> JSVal

-- Pure marshalling from a javascript value to a haskell value
class PFromJSVal a where
  pFromJSVal :: JSVal -> a
  2. So I (Jeff) implemented this new prim type. In the process it occurred to me that we needed a new RuntimeRep to support the type, but none of the current RuntimeRep constructors supports our use case:
data RuntimeRep = VecRep VecCount VecElem   -- ^ a SIMD vector type
                | TupleRep [RuntimeRep]     -- ^ An unboxed tuple of the given reps
                | SumRep [RuntimeRep]       -- ^ An unboxed sum of the given reps
                | BoxedRep Levity -- ^ boxed; represented by a pointer
                | IntRep          -- ^ signed, word-sized value
                | Int8Rep         -- ^ signed,  8-bit value
                | Int16Rep        -- ^ signed, 16-bit value
                | Int32Rep        -- ^ signed, 32-bit value
                | Int64Rep        -- ^ signed, 64-bit value
                | WordRep         -- ^ unsigned, word-sized value
                | Word8Rep        -- ^ unsigned,  8-bit value
                | Word16Rep       -- ^ unsigned, 16-bit value
                | Word32Rep       -- ^ unsigned, 32-bit value
                | Word64Rep       -- ^ unsigned, 64-bit value
                | AddrRep         -- ^ A pointer, but /not/ to a Haskell value
                | FloatRep        -- ^ a 32-bit floating point number
                | DoubleRep       -- ^ a 64-bit floating point number

Ideally we would use AddrRep, but we cannot, because Opaque# is not guaranteed to be pointer-sized (a fact of JavaScript as a target platform).

Goals

  • A RuntimeRep that works for both GHCJS and Asterius, i.e., for the JavaScript backend and the WebAssembly backend.

Suggestions

Add a new runtime rep for each backend

data RuntimeRep = VecRep VecCount VecElem   -- ^ a SIMD vector type
                | TupleRep [RuntimeRep]     -- ^ An unboxed tuple of the given reps
                | SumRep [RuntimeRep]       -- ^ An unboxed sum of the given reps
                | BoxedRep Levity -- ^ boxed; represented by a pointer
                | ...
                | JSValRep          -- ^ Javascript values and objects
                | JVMValRep         -- ^ JVM values and objects
                | FooValRep         -- ^ Foo values and objects

Use CPP

data RuntimeRep = VecRep VecCount VecElem   -- ^ a SIMD vector type
                | TupleRep [RuntimeRep]     -- ^ An unboxed tuple of the given reps
                | SumRep [RuntimeRep]       -- ^ An unboxed sum of the given reps
                | BoxedRep Levity -- ^ boxed; represented by a pointer
                | ...
                | JSValRep          -- ^ Javascript values and objects
                | JVMValRep         -- ^ JVM values and objects
                | FooValRep         -- ^ Foo values and objects
#ifdef javascript_HOST_ARCH
                | JSRef
#elif jvm_HOST_ARCH
                | JVMRef
#elif beam_HOST_ARCH
  ...
#elif foo_HOST_ARCH
  ...
#endif

Use Pattern synonyms

@Ericson2314 writes:

The ability to do

pattern Opaque =
#if defined javascript_HOST_ARCH
  JsRef
#elif defined wasm_HOST_ARCH
  WasmRef
#elif defined jvm_HOST_ARCH
  JvmRef
#endif

does make me think @monoidal is right that erring on the side of more separate things is fine. We can always unify them later, but splitting apart is not so easy!
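
A self-contained sketch of the pattern synonym idea follows. It is an illustration under stated assumptions: `ForeignRep`, `NativeRef` and `isOpaque` are hypothetical, the `#else` fallback is added only so the module compiles on a host where none of the macros is defined, and the macro names are copied from the quote above rather than checked against what GHC defines for each backend.

```haskell
{-# LANGUAGE CPP #-}
{-# LANGUAGE PatternSynonyms #-}

-- Hypothetical stand-in for the separate foreign-reference representations.
data ForeignRep = JsRef | WasmRef | JvmRef | NativeRef
  deriving (Eq, Show)

-- One name, Opaque, that resolves to a different constructor per host.
pattern Opaque :: ForeignRep
pattern Opaque =
#if defined(javascript_HOST_ARCH)
  JsRef
#elif defined(wasm_HOST_ARCH)
  WasmRef
#elif defined(jvm_HOST_ARCH)
  JvmRef
#else
  NativeRef
#endif

-- Client code matches on Opaque without mentioning any backend.
isOpaque :: ForeignRep -> Bool
isOpaque Opaque = True
isOpaque _      = False
```

Code written against `Opaque` stays the same across hosts, while the underlying constructors remain distinct and could be unified later if that turns out to be the right call.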

Points in the previous discourse

In this section I'll try to run down the major points in the thread on !7577 to consolidate the conversation.

What is wrong with the old implementation?

Summary

  • It is a hack using ByteArray#
  • It is unclear to me (Jeff) why exactly this is problematic. What issues does it cause exactly? Is there something we want to do but can't due to this implementation? Is it a slow implementation? Or just conceptually wrong?

Discourse

According to @luite:

JSVal in GHCJS is currently represented as data JSVal = JSVal ByteArray#, where an arbitrary JavaScript value is stored at the position of the ByteArray# field.

But this is a bit of a hack. At the moment we have these primitive types with their JS representation:

  • Word#/Word32#: JS number
  • Int#/Int32#: JS number
  • ByteArray#: JS object that wraps a typed array
  • Addr#: A pair of a JS number (offset) and a typed array object

None of these types exactly matches the "any JavaScript value" that we proposed the Opaque# type for.
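
The list above can be written down as a small lookup, purely as a restatement of @luite's point. The type and constructor names (`PrimTy`, `JSRepr`, `jsRepr`, `primTysForAnyValue`) are hypothetical and do not come from GHCJS.

```haskell
-- Hypothetical names for the primitive types listed above.
data PrimTy = WordTy | Word32Ty | IntTy | Int32Ty | ByteArrayTy | AddrTy
  deriving (Eq, Show, Enum, Bounded)

-- The JS-side shapes mentioned in the list, plus the one we are looking for.
data JSRepr
  = JSNumber             -- a JS number
  | TypedArrayWrapper    -- a JS object that wraps a typed array
  | OffsetAndTypedArray  -- a pair of a JS number (offset) and a typed array
  | AnyJSValue           -- an arbitrary JavaScript value
  deriving (Eq, Show)

-- Mapping taken directly from the list above.
jsRepr :: PrimTy -> JSRepr
jsRepr WordTy      = JSNumber
jsRepr Word32Ty    = JSNumber
jsRepr IntTy       = JSNumber
jsRepr Int32Ty     = JSNumber
jsRepr ByteArrayTy = TypedArrayWrapper
jsRepr AddrTy      = OffsetAndTypedArray

-- Existing primitive types whose representation is "any JavaScript value".
primTysForAnyValue :: [PrimTy]
primTysForAnyValue =
  [t | t <- [minBound .. maxBound], jsRepr t == AnyJSValue]
```

In this model `primTysForAnyValue` is empty, which is the gap Opaque# was proposed to fill.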

In response @sgraf812 suggests less invasive changes than a prim type:

Why can't we have

  • type Opaque# :: TYPE JSRep
  • type ByteArray# :: TYPE JSRep or newtype ByteArray# a = ByteArray# Opaque#

Would that work? ... Thinking about it, I naively claim

To the untyped JS backend, every RuntimeRep except the special AddrRep and BoxedRep could be treated the same.

(AddrRep needs pointer arithmetic, BoxedRep needs support from the RTS, I suppose. Hence they are excluded.)

That is, we could define the axioms type Word :: TYPE DoubleRep or type Double :: TYPE IntRep and still manage to compile valid JavaScript. Is that right? Why isn't it?
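
The naive claim can be restated as a tiny classifier. This only encodes what @sgraf812 proposes and does not show that the claim holds; `Rep`, `JSHandling`, `jsHandling` and `interchangeable` are hypothetical names, and `Rep` is a simplified stand-in for RuntimeRep.

```haskell
-- Simplified, hypothetical stand-in for RuntimeRep.
data Rep = IntRep | WordRep | FloatRep | DoubleRep | AddrRep | BoxedRep
  deriving (Eq, Show)

-- How an untyped JS backend would have to treat a representation.
data JSHandling
  = Uniform                 -- any JS value will do
  | NeedsPointerArithmetic  -- the AddrRep exception
  | NeedsRtsSupport         -- the BoxedRep exception
  deriving (Eq, Show)

jsHandling :: Rep -> JSHandling
jsHandling AddrRep  = NeedsPointerArithmetic
jsHandling BoxedRep = NeedsRtsSupport
jsHandling _        = Uniform

-- Under the claim, two representations can be swapped freely on the
-- JS backend exactly when both are handled uniformly.
interchangeable :: Rep -> Rep -> Bool
interchangeable a b = jsHandling a == Uniform && jsHandling b == Uniform
```

If the claim is right, `interchangeable IntRep DoubleRep` being true is what would make an axiom like type Word :: TYPE DoubleRep harmless on this backend.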

Questions

  1. Are JSVal or Opaque# values GC'd by the JavaScript runtime? (I assume yes)
  2. What is wrong with the ByteArray# implementation exactly?
  3. @sgraf812's questions from above: is a new prim type actually required? More specifically:
  4. Could we get away with changing the representation of a type rather than adding a prim type?
  5. Could we use type Word :: TYPE DoubleRep and still compile valid JS, since JS is essentially untyped anyway?

Needs on the wasm side

Summary:

  • wasm does not need Opaque# as I have defined it.
  • JSVal#s in the wasm backend live on the Haskell heap, and thus BoxedRep Unlifted can be used.
  • But this means the wasm backend needs special logic in GHC's GC.

@TerrorJack writes:

To provide a bit more context from the wasm side: when we add JavaScript interop for wasm support, our JSVal# prim type is expected to be UnliftedRep, with a word-sized payload to represent a table index. The JSVal# closures are managed by the C garbage collector. We do need additional hooks in GC, so the live JSVal#s on the Haskell heap can be collected and reported to JS periodically though. So it seems the Opaque type design here cannot be used per-se for wasm.

Specifically, this is because in wasm JSVal#s exist on the Haskell heap, and thus BoxedRep Unlifted works.

No, AddrRep has no special meaning in wasm. It's expected to be C memory address.

UnliftedRep is really BoxedRep Unlifted, which is the representation of an unlifted, boxed pointer to the Haskell heap, managed by GHC's GC. BoxedRep Unlifted points to the Haskell heap and AddrRep points to something in C land (or WebAsm/JS land) that the Haskell GC shouldn't need to follow.

Yes, and that's exactly what I want. All JSVal#s exist on the Haskell heap; they do need special handling to cooperate with JS though, and whatever special handling we add is supposed to be no-op on other native platforms.
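
The "table index plus GC hook" arrangement described above can be modelled in pure Haskell. This is a sketch under assumptions, not the Asterius or GHC design: `JSValIx`, `Table`, `register`, `lookupVal` and `reportLive` are hypothetical names, and an association list stands in for the JS-side table.

```haskell
-- Hypothetical model: the word-sized payload of a JSVal# is a table index.
newtype JSValIx = JSValIx Int
  deriving (Eq, Show)

-- Stand-in for the JS-side table that owns the actual JavaScript values.
data Table v = Table
  { nextIx  :: Int
  , entries :: [(Int, v)]
  }

emptyTable :: Table v
emptyTable = Table 0 []

-- Store a value and hand back the index that Haskell code will hold.
register :: v -> Table v -> (JSValIx, Table v)
register v (Table n es) = (JSValIx n, Table (n + 1) ((n, v) : es))

lookupVal :: JSValIx -> Table v -> Maybe v
lookupVal (JSValIx i) t = lookup i (entries t)

-- Models the GC hook: the Haskell GC reports the indices still live on the
-- Haskell heap, and every other table entry is dropped.
reportLive :: [JSValIx] -> Table v -> Table v
reportLive live t =
  t { entries = [e | e@(i, _) <- entries t, JSValIx i `elem` live] }
```

On a native platform there is no such table, so the corresponding hook would do nothing, matching the remark that the special handling is meant to be a no-op elsewhere.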

Questions

  1. What logic is required in the GC to collect and communicate collection of wasm JSVals?
  2. What exactly is the definition of JSVal in wasm? newtype JSVal# = JSVal# Addr# with rep BoxedRep Unlifted?

Other major points

  • @bgamari notes that AddrRep is always ignored by GHC's GC.
  • That fact about AddrRep leads us back full circle to the platform dependency of RuntimeRep, as noted by @TerrorJack:

I agree it's a very important property. JSRep, JVMRep, CLRRep or whatever foreign runtime rep may all have differences re how they interact with GC.
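
To make the GC angle concrete, here is a minimal sketch that attaches a GC policy to each representation. The `BoxedRep` and `AddrRep` rows follow the statements quoted in this issue; the foreign rows are placeholders, and `Rep`, `GcPolicy` and `gcPolicy` are hypothetical names.

```haskell
-- Simplified, hypothetical stand-in for RuntimeRep with foreign reps added.
data Rep = BoxedRep | AddrRep | JSRep | JVMRep | CLRRep
  deriving (Eq, Show)

data GcPolicy
  = TracedByGhc      -- pointer into the Haskell heap, followed by GHC's GC
  | IgnoredByGhc     -- GHC's GC never follows it
  | BackendSpecific  -- placeholder: each foreign runtime may differ
  deriving (Eq, Show)

gcPolicy :: Rep -> GcPolicy
gcPolicy BoxedRep = TracedByGhc
gcPolicy AddrRep  = IgnoredByGhc
gcPolicy JSRep    = BackendSpecific
gcPolicy JVMRep   = BackendSpecific
gcPolicy CLRRep   = BackendSpecific
```

Every function of this shape needs a new equation for each backend that gets its own constructor, which is the cost side of the "one rep per backend" suggestion.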

Edited Mar 22, 2022 by doyougnu