Last edited by Ömer Sinan Ağacan Dec 09, 2019

CafInfo rework

This page summarizes the CafInfo rework done in !1304 (merged).

At the time of this writing (before !1304 (merged)) the CafInfos of top-level bindings are computed in the tidying pass (tidyProgram, [Tidy the top-level bindings]) based on the current state of the Core and some predictions ([Disgusting computation of CafRefs]).

This is not ideal. It would be better to assign absolutely final CafInfos to top-level binders after all the transformations have taken place, so that no pass has to preserve existing analysis results (e.g. corePrepPgm won't need hacks to preserve CafInfos generated in tidyProgram, [CafInfo and floating]) or predict what a future pass will do (e.g. tidyProgram won't have to predict how corePrepPgm will change CafInfos, [CAFfyness inconsistencies due to eta expansion in TidyPgm] and [Disgusting computation of CafRefs]).

The idea is being implemented in !1304 (merged). We generate CafInfos during CmmBuildInfoTables, the pass that generates info tables and SRTs. This pass runs quite late in compilation: after it we have raw Cmm, which is never changed later in a way that invalidates SRTs (and CafInfos), so the CafInfos are final at that point.

There were a few complications on the way:

  • (pre- !1304 (merged)) CmmBuildInfoTables did not do a full analysis from scratch; it relied on earlier analysis results generated in tidyProgram. Specifically, it assumed the existence of CafInfos for static data (e.g. top-level constructors). Because with !1304 (merged) we no longer do any CafInfo analysis before CmmBuildInfoTables, CmmBuildInfoTables is updated to handle static data as well. The full SRT algorithm is described in the Algorithm section of the module documentation of CmmBuildInfoTables.

  • Because we no longer know the CafInfos of statics before CmmBuildInfoTables, the Cmm type generated by the STG-to-Cmm pass is refactored. We now have two different CmmDecl types: one for pre-CmmBuildInfoTables Cmm decls, which use CmmStatics for static data (CmmDecl), and another for post-CmmBuildInfoTables Cmm decls, which use RawCmmStatics for static data (CmmDeclSRTs). CmmBuildInfoTables.updInfoSRTs takes the analysis results and turns CmmDecls into CmmDeclSRTs. cmmToRawCmm deals with CmmDeclSRTs instead of CmmDecls.

  • The analysis done in CmmBuildInfoTables generates a map from labels in the module to their SRTs (SRTMap). To be able to map Ids to CafInfos from this map we consider labels with Names (i.e. exported labels), and turn SRTMap into a NameSet in HscMain.doCodeGen. The NameSet holds non-CAFFY Names in the module -- all other Names are CAFFY.

    To find out whether an exported Id is CAFFY, we check whether the Id's Name is in the set: if it is, the Id is non-CAFFY; otherwise it is CAFFY.

    (Note that non-exported Ids don't have Names so they won't be in the map)

  • The worst complication is that we previously generated the interface of a module in tidyProgram, and CafInfos of exported Ids are part of that interface. So if we want to generate CafInfos later in the compilation, we have to generate interfaces later too.

    The interface of a module is represented by two different types:

    • ModIface: This is the type serialized as .hi files. Used in one-shot mode (i.e. without --make) by the importing modules.
    • ModDetails: This is the type that goes into the module graph (HomePackageTable) when building in batch mode (--make).

    These two types need to be in sync as they represent the same thing (interface of a module).

    Previously we generated ModDetails in tidyProgram and generated ModIface right after tidyProgram. !1633 (closed) refactored ModIface generation so that it happens after code generation (this was refactored again later in !1969 (closed); !2230 (closed) is also relevant). However, it completely missed ModDetails generation (a mistake). As a result, with !1304 (merged) we generated the ModIface of a module with correct CafInfos in one-shot mode, but in batch mode we used the interface without CafInfos.

    The ongoing work for generating ModDetails with ModIface is in !2100 (closed). See the MR for the current problems.
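
The split between CmmDecl and CmmDeclSRTs described in the list above can be pictured with a small model. This is only a sketch under stated assumptions: every type and function below (Statics, RawStatics, GenDecl, lowerDecl) is a hypothetical, heavily simplified stand-in and not GHC's actual definition. It shows one declaration type parameterised over the representation of static data, and a conversion that can only run once the analysis results are known.

```haskell
-- Hypothetical, heavily simplified model of the two declaration types.
-- None of these definitions are GHC's real ones.
type Label = String

-- Static data before the SRT analysis: a structured closure description.
data Statics = Statics Label [Label]   -- closure label, labels it points to
  deriving (Eq, Show)

-- Static data after the analysis: a flat list of words.
data RawStatics = RawStatics Label [String]
  deriving (Eq, Show)

-- One declaration type, parameterised over how static data is represented.
data GenDecl statics
  = ProcDecl Label        -- code; left untouched by this sketch
  | DataDecl statics
  deriving (Eq, Show)

type Decl     = GenDecl Statics      -- stands in for CmmDecl
type DeclSRTs = GenDecl RawStatics   -- stands in for CmmDeclSRTs

-- Stand-in for updInfoSRTs: flattening static data needs to know whether
-- the closure is CAFFY, which is only available after the analysis.
-- The "caffy"/"non-caffy" word is a placeholder, not a real closure field.
lowerDecl :: (Label -> Bool) -> Decl -> DeclSRTs
lowerDecl _       (ProcDecl l) = ProcDecl l
lowerDecl isCaffy (DataDecl (Statics l payload)) =
  DataDecl (RawStatics l (cafWord : payload))
  where
    cafWord = if isCaffy l then "caffy" else "non-caffy"
```

Keeping the two stages as distinct types means a later pass (cmmToRawCmm in the text above) cannot accidentally be handed static data whose CAF information has not been filled in yet.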

(Because the ModIface/ModDetails refactoring was large enough, it is done in separate MRs.)
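
The step from SRTMap to a NameSet of non-CAFFY Names, described in the list above, might look roughly like the following. This is a minimal sketch, not the code in HscMain.doCodeGen: Name, Label, the shape of SRTMap (a map from labels to an optional SRT label), nonCaffyNames and cafInfoOf are all assumptions made for illustration, and it uses Data.Map and Data.Set from the containers package in place of GHC's own collections.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical stand-ins for GHC's Name and label types.
newtype Name = Name String
  deriving (Eq, Ord, Show)

data Label
  = ExternalLabel Name   -- label of an exported binding; carries a Name
  | InternalLabel Int    -- module-local label without a Name
  deriving (Eq, Ord, Show)

-- Assumed shape of the analysis result: Nothing means the label needs no
-- SRT (non-CAFFY), Just points at the SRT it uses.
type SRTMap = Map.Map Label (Maybe Label)

-- Keep only labels that carry a Name and need no SRT.
-- Entries that do not match the pattern are silently skipped.
nonCaffyNames :: SRTMap -> Set.Set Name
nonCaffyNames srtMap =
  Set.fromList [ n | (ExternalLabel n, Nothing) <- Map.toList srtMap ]

data CafInfo = MayHaveCafRefs | NoCafRefs
  deriving (Eq, Show)

-- Any Name not in the set is treated as CAFFY, the safe default.
cafInfoOf :: Set.Set Name -> Name -> CafInfo
cafInfoOf nonCaffy n
  | n `Set.member` nonCaffy = NoCafRefs
  | otherwise               = MayHaveCafRefs
```

Storing the non-CAFFY Names rather than the CAFFY ones means a Name that is missing from the set for any reason is conservatively treated as CAFFY.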

Current status

The main MR (!1304 (merged)) is ready, but it depends on !2100 (closed), which is not ready. See the MR for the current status.

References

Note [Tidy the top-level bindings]

Next we traverse the bindings top to bottom. For each top-level binder

  1. Make it into a GlobalId; its IdDetails becomes VanillaGlobal, reflecting the fact that from now on we regard it as a global, not local, Id

  2. Give it a system-wide Unique. [Even non-exported things need system-wide Uniques because the byte-code generator builds a single Name->BCO symbol table.]

    We use the NameCache kept in the HscEnv as the source of such system-wide uniques.

    For external Ids, use the original-name cache in the NameCache to ensure that the unique assigned is the same as the Id had in any previous compilation run.

  3. Rename top-level Ids according to the names we chose in step 1. If it's an external Id, make it have an External Name, otherwise make it have an Internal Name. This is used by the code generator to decide whether to make the label externally visible.

  4. Give it its UTTERLY FINAL IdInfo; in particular,

    • its unfolding, if it should have one
    • its arity, computed from the number of visible lambdas

Finally, substitute these new top-level binders consistently throughout, including in unfoldings. We also tidy binders in RHSs, so that they print nicely in interfaces.

Note [Disgusting computation of CafRefs]

We compute hasCafRefs here, because IdInfo is supposed to be finalised after TidyPgm. But CorePrep does some transformations that affect CAF-hood. So we have to predict the result here, which is revolting.

In particular CorePrep expands Integer and Natural literals. So in the prediction code here we resort to applying the same expansion (cvt_literal). There are also numerous other ways in which we can introduce inconsistencies between CorePrep and TidyPgm. See Note [CAFfyness inconsistencies due to eta expansion in TidyPgm] for one such example.

Ugh! What ugliness we hath wrought.

Note [CafInfo and floating]

What happens when we try to float bindings to the top level? At this point all the CafInfo is supposed to be correct, and we must make certain that is true of the new top-level bindings. There are two cases to consider

a) The top-level binding is marked HasCafRefs. In that case we are basically fine. The floated bindings had better all be lazy lets, so they can float to top level, but they'll all have HasCafRefs (the default) which is safe.

b) The top-level binding is marked NoCafRefs. This really happens. Example: CoreTidy produces

  $fApplicativeSTM [NoCafRefs] = D:Alternative retry# ...blah...

Now CorePrep has to eta-expand to

  $fApplicativeSTM = let sat = \xy. retry x y
                     in D:Alternative sat ...blah...

So what we want is

  sat [NoCafRefs] = \xy. retry x y
  $fApplicativeSTM [NoCafRefs] = D:Alternative sat ...blah...

So, gruesomely, we must set the NoCafRefs flag on the sat bindings, and substitute the modified 'sat' into the old RHS.

It should be the case that 'sat' is itself [NoCafRefs] (a value, no cafs) else the original top-level binding would not itself have been marked [NoCafRefs]. The DEBUG check in CoreToStg for consistentCafInfo will find this.

This is all very gruesome and horrible. It would be better to figure out CafInfo later, after CorePrep. We'll do that in due course. Meanwhile this horrible hack works.

Note [CAFfyness inconsistencies due to eta expansion in TidyPgm]

Eta expansion during CorePrep can have non-obvious negative consequences on the CAFfyness computation done by TidyPgm (see Note [Disgusting computation of CafRefs] in TidyPgm). This late expansion happens/happened for a few reasons:

  • CorePrep previously eta expanded unsaturated primop applications, as described in Note [Primop wrappers].

  • CorePrep still does eta expand unsaturated data constructor applications.

In particular, consider the program:

data Ty = Ty (RealWorld# -> (# RealWorld#, Int #))

-- Is this CAFfy?
x :: STM Int
x = Ty (retry# @Int)

Consider whether x is CAFfy. One might be tempted to answer "no". After all, retry# obviously has no CAF references and the application (retry# @Int) is essentially just a variable reference at runtime.

However, when CorePrep expanded the unsaturated application of 'retry#' it would rewrite this to

x = \u []
   let sat = retry# @Int
   in Ty sat

This is now a CAF. Failing to handle this properly was the cause of #16846 (closed). We fixed this by eliminating the need to eta expand primops, as described in Note [Primop wrappers]. However, we have not yet done the same for data constructor applications.
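
To make the difference between the two forms of x concrete, here is a toy model of top-level right-hand sides. It is a sketch only: UpdateFlag, Rhs and isCaf are hypothetical names loosely inspired by STG, the body of a closure is kept as plain text, and the check says only whether a binding is itself a CAF (an updatable thunk with no arguments), not whether it refers to other CAFs.

```haskell
-- Toy model of top-level right-hand sides; every name here is hypothetical.
data UpdateFlag = Updatable | ReEntrant
  deriving (Eq, Show)

data Rhs
  = RhsCon String [String]                 -- constructor applied to atoms
  | RhsClosure UpdateFlag [String] String  -- update flag, arguments, body text
  deriving (Eq, Show)

-- A top-level binding is itself a CAF when it is an updatable thunk:
-- a closure with no arguments whose value is computed on first use.
isCaf :: Rhs -> Bool
isCaf (RhsClosure Updatable [] _) = True
isCaf _                           = False

-- x as one might naively picture it: a constructor applied to an atom.
xNaive :: Rhs
xNaive = RhsCon "Ty" ["retry#"]

-- x after the expansion shown above: \u [] let sat = ... in Ty sat
xExpanded :: Rhs
xExpanded = RhsClosure Updatable [] "let sat = retry# @Int in Ty sat"
```

In this model isCaf xNaive is False and isCaf xExpanded is True, which is the mismatch a prediction made in TidyPgm has to anticipate.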
