WIP: A step towards a working fix #25636

recursion-ninja requested to merge wip/fix-25636 into master

Introduction

This MR is a work in progress to fix issue #25636.

Looking at the MR from a high level, it seems that it should have resolved the problem. Unfortunately, it has not resolved issue #25636, so I must have missed some smaller technical detail in the long process of extending the bytecode interpreter. I need some assistance via code review in locating what is missing in the MR to complete this patch.

Reproduction

Build the branch then run the following:

cd testsuite/tests/codeGen/should_run/T23146
../../../../../_build/stage1/bin/ghc --interactive T23146_liftedeq.hs
ghci > main

It used to throw a compiler panic. Then I got it to only throw a linker error:

GHC.ByteCode.Linker.lookupCE
During interactive linking, GHCi couldn't find the following symbol:
  closure:$WUNil
This may be due to you not asking GHCi to load extra object files,
archives or DLLs needed by your current session.  Restart GHCi, specifying
the missing library using the -L/path/to/object/dir and -lmissinglibname
flags, or simply by naming the relevant files on the GHCi command line.
Alternatively, this link failure might indicate a bug in GHCi.
If you suspect the latter, please report this as a GHC bug:
  https://www.haskell.org/ghc/reportabug

Currently, calling main appears to loop indefinitely.

The Patch

We need to ensure that unlifted data types are strictly evaluated in the bytecode interpreter.

Due to "kind erasure" when converting from STG to ByteCode, we cannot simply query the ByteCode Object (BCO) for its kind levity and branch accordingly into strict or lazy evaluation.

Instead, we must tag the BCO in some way during the STG to ByteCode transformation so that it can later be correctly evaluated by the interpreter.
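
The idea of tagging at conversion time can be illustrated with a toy model. This is only a sketch: the types StgBind, Tagged, Levity and Evaluation and the function tagBind are hypothetical stand-ins invented for illustration, not definitions from GHC or from this MR.

```haskell
module Main where

-- Hypothetical stand-in: levity as it is known before "kind erasure".
data Levity = Lifted | Unlifted
  deriving (Eq, Show)

-- Hypothetical stand-in for an STG-level binding that still knows its levity.
data StgBind = StgBind
  { stgName   :: String
  , stgLevity :: Levity
  }

-- How the interpreter should treat the object later on.
data Evaluation = Lazy | Strict
  deriving (Eq, Show)

-- Hypothetical stand-in for the converted object. It no longer carries a
-- kind, only the tag that was recorded while the kind was still available.
data Tagged = Tagged
  { tagName :: String
  , tagEval :: Evaluation
  }
  deriving (Eq, Show)

-- Record the evaluation strategy at conversion time.
tagBind :: StgBind -> Tagged
tagBind (StgBind name levity) = Tagged name strategy
  where
    strategy = case levity of
      Lifted   -> Lazy
      Unlifted -> Strict

main :: IO ()
main = mapM_ (print . tagBind)
  [ StgBind "xs" Lifted
  , StgBind "UNil" Unlifted
  ]
```

The point of the sketch is only that the decision is made once, where the levity is visible, and then travels with the object as plain data.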

Here is the workflow:

  1. We create a new logical type for the interpreter, an Unlifted Data Constructor (UDC). This UDC can be either 'unlinked' or 'resolved' (similar to BCOs).

  2. Every UDC begins as an unlinked UDC, represented by the data-type UnlinkedUDC defined in compiler/GHC/ByteCode/Types.hs. This data-type stores the binding's Name so that it can be correctly referenced, and also stores the ConInfoTable so that the unlifted value can be dynamically constructed at runtime. The ConInfoTable data-type and type-class instances have been moved from libraries/ghci/GHCi/Message.hs to libraries/ghci/GHCi/ResolvedBCO.hs to avoid module import dependencies.

  3. When converting the STG to ByteCode, within the schemeTopBind function defined in compiler/GHC/StgToByteCode.hs, the compiler tests the kind of the object while the kind information is still in scope (before "kind erasure").

  4. The function schemeTopBind is called by byteCodeGen, also defined in compiler/GHC/StgToByteCode.hs. The results of schemeTopBind now contain both UDCs and BCOs. These are partitioned into two FlatBags.

  5. The function byteCodeGen then calls assembleBCOs, defined in compiler/GHC/ByteCode/Asm.hs, with both FlatBags of UDCs and of BCOs (where before there were only BCOs).

  6. The function assembleBCOs dutifully adds the FlatBag of UDCs to the CompiledByteCode data-type defined in compiler/GHC/ByteCode/Types.hs.

  7. The CompiledByteCode data-type is eventually consumed by the linkSomeBCOs function defined in compiler/GHC/Linker/Loader.hs. Here the BCOs are converted from UnlinkedBCOs to ResolvedBCOs by recursively and lazily instantiating all pointers/references contained in the collection of BCOs. This same function now also needs to strictly instantiate the UDCs and have the BCOs correctly point to them.

  8. The reference resolution of the UDCs/BCOs of the CompiledByteCode happens in the linkBCO function, defined in compiler/GHC/ByteCode/Linker.hs. A sub-call to resolvePtr, defined in the same module, now takes both name references to BCOs and to UDCs and conditionally creates ResolvedBCORef or ResolvedBCORefUnlifted values, respectively. The result of this process is a ResolvedBCO.

  9. The result of linkSomeBCOs, after calling linkBCO on all supplied BCOs, is a mapping of Names to HValues.

  10. The mapping is then used by dynLinkBCOs defined in compiler/GHC/Linker/Loader.hs to add the BCOs to the closure_env of the LoaderState. The function dynLinkBCOs is transitively used by the top level binds loadDecls and loadModule of the same module.

  11. The GHCi interpreter interface is extended with a new operation CreateUDCs defined in libraries/ghci/GHCi/Message.hs. This is used for instantiating the UDCs.

  12. Naturally the GHCi interpreter evaluator defined in libraries/ghci/GHCi/Run.hs is also extended to take a CreateUDCs value and call the createUDCs function defined in libraries/ghci/GHCi/CreateBCO.hs.

  13. The createUDCs function transitively calls the new prim-op newUDC# in order to instantiate the UDC value.

  14. The newUDC# prim-op is defined in compiler/GHC/Builtin/primops.txt.pp by the NewUDCOp operator. This prim-op should be associated with the C function stg_newUDHzh defined in rts/PrimOps.cmm.
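
Steps 4 and 8 of the workflow above can be sketched in miniature. This is a toy model under stated assumptions, not the code in this MR: ToyBCO, ToyUDC, ToyRef, splitBinds and resolveName are hypothetical names, plain lists stand in for FlatBags, and Strings stand in for Names. Only ToyBCORef and ToyBCORefUnlifted are meant to mirror the ResolvedBCORef and ResolvedBCORefUnlifted values mentioned in step 8.

```haskell
module Main where

import Data.Either (partitionEithers)

-- Hypothetical stand-ins; the real unlinked types carry far more than a name.
newtype ToyBCO = ToyBCO String
  deriving (Eq, Show)

newtype ToyUDC = ToyUDC String
  deriving (Eq, Show)

-- The first two constructors mirror the two kinds of reference from step 8.
-- ToyUnresolved models a name that is found in neither collection.
data ToyRef
  = ToyBCORef String
  | ToyBCORefUnlifted String
  | ToyUnresolved String
  deriving (Eq, Show)

-- Step 4 in miniature: split the mixed results into two collections.
splitBinds :: [Either ToyUDC ToyBCO] -> ([ToyUDC], [ToyBCO])
splitBinds = partitionEithers

-- Step 8 in miniature: choose the reference constructor by looking the
-- name up in each collection, checking the UDCs first.
resolveName :: [ToyUDC] -> [ToyBCO] -> String -> ToyRef
resolveName udcs bcos name
  | ToyUDC name `elem` udcs = ToyBCORefUnlifted name
  | ToyBCO name `elem` bcos = ToyBCORef name
  | otherwise               = ToyUnresolved name

main :: IO ()
main = do
  let (udcs, bcos) = splitBinds [Right (ToyBCO "main"), Left (ToyUDC "UNil")]
  mapM_ (print . resolveName udcs bcos) ["main", "UNil", "missing"]
```

In this model, a name that was placed in neither collection falls through to ToyUnresolved, which is the toy analogue of a failed symbol lookup at link time.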

Somewhere, something isn't quite lining up and the bytecode interpreter is still not correctly evaluating unlifted data constructors.
