1. 02 Dec, 2020 3 commits
    • Richard Eisenberg's avatar
      Rename the flattener to become the rewriter. · d66660ba
      Richard Eisenberg authored
      Now that flattening doesn't produce flattening variables,
      it's not really flattening anything: it's rewriting. This
      change also means that the rewriter can no longer be confused
      with the core flattener (in GHC.Core.Unify), which is sometimes used
      during type-checking.
      d66660ba
    • Richard Eisenberg's avatar
      Remove flattening variables · 8bb52d91
      Richard Eisenberg authored
      This patch redesigns the flattener to simplify type family applications
      directly instead of using flattening meta-variables and skolems. The key new
      innovation is the CanEqLHS type and the new CEqCan constraint (Ct). A CanEqLHS
      is either a type variable or exactly-saturated type family application; either
      can now be rewritten using a CEqCan constraint in the inert set.
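
      As a rough illustration of the shape described above, here is a toy
      model in which a left-hand side is either a variable or a family
      application, and an equality `lhs ~ rhs` rewrites matching occurrences.
      Everything here is an assumption made for the sketch: the `Ty` type,
      the `String` names, the constructor names, and `rewriteWith` are not
      GHC's definitions, and saturation is not modelled.

      ```haskell
      -- Toy types: variables, type family applications, ordinary tycon applications.
      data Ty
        = TyVar String
        | FamApp String [Ty]
        | TyConApp String [Ty]
        deriving (Eq, Show)

      -- The two kinds of left-hand side (hypothetical constructor names).
      data CanEqLHS
        = TyVarLHS String
        | TyFamLHS String [Ty]
        deriving (Eq, Show)

      -- View a type as a left-hand side, if it has the right shape.
      canEqLHS_maybe :: Ty -> Maybe CanEqLHS
      canEqLHS_maybe (TyVar v)       = Just (TyVarLHS v)
      canEqLHS_maybe (FamApp f args) = Just (TyFamLHS f args)
      canEqLHS_maybe _               = Nothing

      -- Rewrite occurrences of the left-hand side with the right-hand side,
      -- working bottom-up in a single pass.
      rewriteWith :: (CanEqLHS, Ty) -> Ty -> Ty
      rewriteWith eq@(lhs, rhs) ty =
        let ty' = case ty of
              TyVar v         -> TyVar v
              FamApp f args   -> FamApp f (map (rewriteWith eq) args)
              TyConApp c args -> TyConApp c (map (rewriteWith eq) args)
        in if canEqLHS_maybe ty' == Just lhs then rhs else ty'
      ```

      With the equality `F Bool ~ Int`, this model rewrites `Maybe (F Bool)`
      to `Maybe Int` directly, without introducing a fresh variable for
      `F Bool`.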
      
      Because the flattener no longer reduces all type family applications to
      variables, there was some performance degradation if a lengthy type family
      application is now flattened over and over (not making progress). To
      compensate, this patch contains some extra optimizations in the flattener,
      leading to a number of performance improvements.
      
      Close #18875.
      Close #18910.
      
      There are many extra parts of the compiler that had to be affected in writing
      this patch:
      
      * The family-application cache (formerly the flat-cache) sometimes stores
        coercions built from Given inerts. When these inerts get kicked out, we must
        kick out from the cache as well. (This was, I believe, true previously, but
        somehow never caused trouble.) Kicking out from the cache requires adding a
        filterTM function to TrieMap.
      
      * This patch obviates the need to distinguish "blocking" coercion holes from
        non-blocking ones (which, previously, arose from CFunEqCans). There is thus
        some simplification around coercion holes.
      
      * Extra commentary throughout parts of the code I read through, to preserve
        the knowledge I gained while working.
      
      * A change in the pure unifier around unifying skolems with other types.
        Unifying a skolem now leads to SurelyApart, not MaybeApart, as documented
        in Note [Binding when looking up instances] in GHC.Core.InstEnv.
      
      * Some more use of MCoercion where appropriate.
      
      * Previously, class-instance lookup automatically noticed that e.g. C Int was
        a "unifier" to a target [W] C (F Bool), because the F Bool was flattened to
        a variable. Now, a little more care must be taken around checking for
        unifying instances.
      
      * Previously, tcSplitTyConApp_maybe would split (Eq a => a). This is silly,
        because (=>) is not a tycon in Haskell. Fixed now, but there are some
        knock-on changes in e.g. TrieMap code and in the canonicaliser.
      
      * New function anyFreeVarsOf{Type,Co} to check whether a free variable
        satisfies a certain predicate.
      
      * Type synonyms now remember whether or not they are "forgetful"; a forgetful
        synonym drops at least one argument. This is useful when flattening; see
        flattenView.
      
      * The pattern-match completeness checker invokes the solver. This invocation
        might need to look through newtypes when checking representational equality.
        Thus, the desugarer needs to keep track of the in-scope variables to know
        what newtype constructors are in scope. I bet this bug was around before but
        never noticed.
      
      * Extra-constraints wildcards are no longer simplified before printing.
        See Note [Do not simplify ConstraintHoles] in GHC.Tc.Solver.
      
      * Whether or not there are Given equalities has become slightly subtler.
        See the new HasGivenEqs datatype.
      
      * Note [Type variable cycles in Givens] in GHC.Tc.Solver.Canonical
        explains a significant new wrinkle in the new approach.
      
      * See Note [What might match later?] in GHC.Tc.Solver.Interact, which
        explains the fix to #18910.
      
      * The inert_count field of InertCans wasn't actually used, so I removed
        it.
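
      To make the "forgetful" synonym item in the list above concrete, the
      following sketch shows one forgetful and one non-forgetful synonym,
      plus a toy check over binder names. `isForgetfulSynonym` and its
      list-of-names representation are hypothetical and are not how GHC
      records this property.

      ```haskell
      -- Forgetful: the second parameter never appears on the right-hand side.
      type Const a b = a

      -- Not forgetful: every parameter is used.
      type Pair a b = (a, b)

      -- Hypothetical check: a synonym is forgetful if at least one of its
      -- parameters is absent from the variables used on its right-hand side.
      isForgetfulSynonym :: [String] -> [String] -> Bool
      isForgetfulSynonym params rhsVars = any (`notElem` rhsVars) params
      ```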
      
      Though I (Richard) did the implementation, Simon PJ was very involved
      in design and review.
      
      This updates the Haddock submodule to avoid #18932 by adding
      a type signature.
      
      -------------------------
      Metric Decrease:
          T12227
          T5030
          T9872a
          T9872b
          T9872c
      Metric Increase:
          T9872d
      -------------------------
      8bb52d91
    • Richard Eisenberg's avatar
      Move core flattening algorithm to Core.Unify · 72a87fbc
      Richard Eisenberg authored
      This sets the stage for a later change, where this
      algorithm will be needed from GHC.Core.InstEnv.
      
      This commit also splits GHC.Core.Map into
      GHC.Core.Map.Type and GHC.Core.Map.Expr,
      in order to avoid module import cycles
      with GHC.Core.
      72a87fbc
  2. 01 Dec, 2020 1 commit
  3. 29 Nov, 2020 1 commit
    • Ben Gamari's avatar
      withTimings: Emit allocations counter · 1bc104b0
      Ben Gamari authored
      This will allow us to back out the allocations per compiler pass from
      the eventlog. Note that we dump the allocation counter rather than the
      difference since this will allow us to determine how much work is done
      *between* `withTiming` blocks.
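
      A small sketch of why the absolute counter is the more useful thing to
      record: from absolute samples one can recover both the allocations
      inside each block and the allocations between blocks. The `Event` type
      and both functions are assumptions for illustration; they do not
      reflect the actual eventlog format.

      ```haskell
      -- One sample: a pass begins or ends, with the absolute allocation counter.
      data Event = Begin String Integer | End String Integer
        deriving (Eq, Show)

      -- Allocations inside each pass: End counter minus the matching Begin counter.
      -- Assumes well-formed, non-nested Begin/End pairs.
      withinPasses :: [Event] -> [(String, Integer)]
      withinPasses (Begin p a : End q b : rest)
        | p == q = (p, b - a) : withinPasses rest
      withinPasses (_ : rest) = withinPasses rest
      withinPasses [] = []

      -- Allocations between passes: next Begin counter minus the previous End counter.
      betweenPasses :: [Event] -> [Integer]
      betweenPasses (End _ b : rest@(Begin _ a : _)) = (a - b) : betweenPasses rest
      betweenPasses (_ : rest) = betweenPasses rest
      betweenPasses [] = []
      ```

      Had only per-block differences been emitted, `betweenPasses` could not
      be computed at all.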
      1bc104b0
  4. 28 Nov, 2020 4 commits
  5. 26 Nov, 2020 3 commits
    • Andreas Klebinger's avatar
      RegAlloc: Add missing raPlatform field to RegAllocStatsSpill · 3e3555cc
      Andreas Klebinger authored
      Fixes #18994
      
      Co-Author: Benjamin Maurer <maurer.benjamin@gmail.com>
      3e3555cc
    • Sylvain Henry's avatar
      Fix toArgRep to support 64-bit reps on all systems · cdbd16f5
      Sylvain Henry authored
      [This is @Ericson2314 writing a commit message for @hsyl20's patch.]
      
      (Progress towards #11953, #17377, #17375)
      
      `Int64Rep` and `Word64Rep` are currently broken on 64-bit systems.  This
      is because they should use "native arg rep" but instead use "large arg
      rep" as they do on 32-bit systems, which is either a non-concept or a
      128-bit rep depending on one's vantage point.
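
      The intended mapping can be sketched with a toy model. The types below
      are simplified stand-ins defined only for this example (GHC's real
      `PrimRep` and `ArgRep` have more constructors), so treat the whole
      block as an assumption about the shape of the fix rather than the
      patch itself.

      ```haskell
      -- Simplified stand-ins for the platform word size and the reps involved.
      data WordSize = PW4 | PW8 deriving (Eq, Show)
      data PrimRep = IntRep | WordRep | Int64Rep | Word64Rep deriving (Eq, Show)

      -- N: native-word-sized argument; L: "large" (64-bit on a 32-bit platform).
      data ArgRep = N | L deriving (Eq, Show)

      -- 64-bit reps are native on 64-bit platforms and large only on 32-bit ones.
      toArgRep :: WordSize -> PrimRep -> ArgRep
      toArgRep _   IntRep    = N
      toArgRep _   WordRep   = N
      toArgRep PW8 Int64Rep  = N
      toArgRep PW8 Word64Rep = N
      toArgRep PW4 Int64Rep  = L
      toArgRep PW4 Word64Rep = L
      ```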
      
      Now, these reps currently aren't used during 64-bit compilation, so the
      brokenness isn't observed, but I don't think that constitutes a reason
      not to fix it. Firstly, in the linked issues there is a clearly expressed
      desire to use explicit-bitwidth constructs in more places. Secondly, per
      [1], there are other bugs that *do* manifest from not threading
      explicit-bitwidth information all the way through the compilation
      pipeline. One can therefore view this as one piece of the larger effort
      to do that, improve ergonomics, and squash remaining bugs.
      
      Also, this is needed for !3658. I could just merge this as part of that,
      but I'm keen on merging fixes "as they are ready" so the fixes that
      aren't ready are isolated and easier to debug.
      
      [1]: https://mail.haskell.org/pipermail/ghc-devs/2020-October/019332.html
      cdbd16f5
    • Moritz Angermann's avatar
      [Sized Cmm] properly retain sizes. · be5d74ca
      Moritz Angermann authored
      This replaces all Word<N> = W<N># Word# and Int<N> = I<N># Int#  with
      Word<N> = W<N># Word<N># and Int<N> = I<N># Int<N>#, thus providing us
      with properly sized primitives in the code generator instead of pretending
      they are all full machine words.
      
      This came up when implementing darwinpcs for arm64.  The darwinpcs requires
      us to pack function arguments in excess of registers on the stack.  While
      most procedure call standards (pcs) assume arguments are just passed in
      8 byte slots; and thus the caller does not know the exact signature to make
      the call, darwinpcs requires us to adhere to the prototype, and thus have
      the correct sizes.  If we specify CInt in the FFI call, it should correspond
      to the C int, and not just be Word sized, when it's only half the size.
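
      The size mismatch mentioned above can be observed from Haskell with
      `sizeOf`. This is only an illustrative sketch; the concrete numbers are
      platform-dependent, so none are asserted here.

      ```haskell
      import Foreign.C.Types (CInt)
      import Foreign.Storable (sizeOf)

      -- Size in bytes of a C int as seen through the FFI.
      cIntBytes :: Int
      cIntBytes = sizeOf (0 :: CInt)

      -- Size in bytes of a full machine word.
      wordBytes :: Int
      wordBytes = sizeOf (0 :: Word)
      ```

      On a typical 64-bit target one would expect `cIntBytes` to be smaller
      than `wordBytes`, which is the case where a calling convention that
      packs stack arguments needs the real size.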
      
      This does change the expected output of T16402 but the new result is no
      less correct as it eliminates the narrowing (instead of the `and` as was
      previously done).
      
      Bumps the array, bytestring, text, and binary submodules.
      Co-Authored-By: Ben Gamari <ben@well-typed.com>
      
      Metric Increase:
          T13701
          T14697
      be5d74ca
  6. 24 Nov, 2020 1 commit
  7. 22 Nov, 2020 2 commits
  8. 21 Nov, 2020 4 commits
    • Ben Gamari's avatar
      dwarf: Apply info table offset consistently · a4a6dc2a
      Ben Gamari authored
      Previously we failed to apply the info table offset to the aranges and
      DIEs, meaning that we often failed to unwind in gdb. For some reason
      this only seemed to manifest in the RTS's Cmm closures. Nevertheless,
      now we can unwind completely up to `main`.
      a4a6dc2a
    • Sylvain Henry's avatar
      Don't initialize plugins in the Core2Core pipeline · 72f2257c
      Sylvain Henry authored
      Some plugins can be added via TH (cf addCorePlugin). Initialize them in
      the driver instead of in the Core2Core pipeline.
      72f2257c
    • Sylvain Henry's avatar
      Move Plugins into HscEnv (#17957) · ecfd0278
      Sylvain Henry authored
      Loaded plugins have nothing to do in DynFlags so this patch moves them
      into HscEnv (session state).
      
      "DynFlags plugins" become "Driver plugins" to still be able to register
      static plugins.
      
      Bump haddock submodule
      ecfd0278
    • Ben Gamari's avatar
      Introduce -fprof-callers flag · 53ad67ea
      Ben Gamari authored
      This introduces a new compiler flag to provide a convenient way to
      introduce profiler cost-centers on all occurrences of the named
      identifier.
      
      Closes #18566.
      53ad67ea
  9. 20 Nov, 2020 2 commits
    • Sebastian Graf's avatar
      Demand: Interleave usage and strictness demands (#18903) · 0aec78b6
      Sebastian Graf authored
      As outlined in #18903, interleaving usage and strictness demands not
      only means a more compact demand representation, but also allows us to
      express demands that we weren't easily able to express before.
      
      Call demands are *relative* in the sense that a call demand `Cn(cd)`
      on `g` says "`g` is called `n` times. *Whenever `g` is called*, the
      result is used according to `cd`". Example from #18903:
      
      ```hs
      h :: Int -> Int
      h m =
        let g :: Int -> (Int,Int)
            g 1 = (m, 0)
            g n = (2 * n, 2 `div` n)
            {-# NOINLINE g #-}
        in case m of
          1 -> 0
          2 -> snd (g m)
          _ -> uncurry (+) (g m)
      ```
      
      Without the interleaved representation, we would just get `L` for the
      strictness demand on `g`. Now we can express that whenever `g` is
      called, its second component is used strictly, by giving `g` the
      demand `1C1(P(1P(U),SP(U)))`. This would allow Nested CPR to unbox the
      division, for example.
      
      Fixes #18903.
      While fixing regressions, I also discovered and fixed #18957.
      
      Metric Decrease:
          T13253-spj
      0aec78b6
    • Sebastian Graf's avatar
      Fix strictness signatures of `prefetchValue*#` primops · 321d1bd8
      Sebastian Graf authored
      Their strictness signatures said the primops are strict in their first
      argument, which is wrong: Handing it a thunk will prefetch the pointer
      to the thunk, but not evaluate it. Hence not strict.
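
      A minimal sketch of the laziness being described, assuming
      `prefetchValue0#` is available from `GHC.Exts`. The wrapper name
      `prefetchLazily` is made up for this example.

      ```haskell
      {-# LANGUAGE MagicHash, UnboxedTuples #-}
      import GHC.Exts (prefetchValue0#)
      import GHC.IO (IO (..))

      -- Issue a prefetch for the closure that x points to, without forcing x.
      prefetchLazily :: a -> IO ()
      prefetchLazily x = IO (\s -> case prefetchValue0# x s of s' -> (# s', () #))
      ```

      With a lazy strictness signature, `prefetchLazily (undefined :: Int)`
      is expected to return normally, because the thunk is only prefetched
      and never evaluated.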
      
      The regression test `T8256` actually tests for laziness in the first
      argument, so GHC apparently never exploited the strictness signature.
      
      See also #8256 (comment 310867),
      where this came up.
      321d1bd8
  10. 19 Nov, 2020 1 commit
    • Sebastian Graf's avatar
      PmCheck: Print types of uncovered patterns (#18932) · 8150f654
      Sebastian Graf authored
      In order to avoid confusion as in #18932, we display the type of the
      match variables in the non-exhaustiveness warning, e.g.
      
      ```
      T18932.hs:14:1: warning: [-Wincomplete-patterns]
          Pattern match(es) are non-exhaustive
          In an equation for ‘g’:
              Patterns of type  ‘T a’, ‘T a’, ‘T a’ not matched:
                  (MkT2 _) (MkT1 _) (MkT1 _)
                  (MkT2 _) (MkT1 _) (MkT2 _)
                  (MkT2 _) (MkT2 _) (MkT1 _)
                  (MkT2 _) (MkT2 _) (MkT2 _)
                  ...
         |
      14 | g (MkT1 x) (MkT1 _) (MkT1 _) = x
         | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      ```
      
      It also allows us to omit the type signature on wildcard matches which
      we previously showed in only some situations, particularly
      `-XEmptyCase`.
      
      Fixes #18932.
      8150f654
  11. 16 Nov, 2020 1 commit
  12. 15 Nov, 2020 6 commits
    • Moritz Angermann's avatar
      AArch64/arm64 adjustments · 8887102f
      Moritz Angermann authored
      This adds the necessary logic to support aarch64 on elf, as well
      as aarch64 on mach-o, which Apple calls arm64.
      
      We change architecture name to AArch64, which is the official arm
      naming scheme.
      8887102f
    • Ryan Scott's avatar
      Use tcSplitForAllInvisTyVars (not tcSplitForAllTyVars) in more places · 645444af
      Ryan Scott authored
      The use of `tcSplitForAllTyVars` in `tcDataFamInstHeader` was the immediate
      cause of #18939, and replacing it with a new `tcSplitForAllInvisTyVars`
      function (which behaves like `tcSplitForAllTyVars` but only splits invisible
      type variables) fixes the issue. However, this led me to realize that _most_
      uses of `tcSplitForAllTyVars` in GHC really ought to be
      `tcSplitForAllInvisTyVars` instead. While I was in town, I opted to replace
      most uses of `tcSplitForAllTys` with `tcSplitForAllTysInvis` to reduce the
      likelihood of such bugs in the future.
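
      The difference between the two splitting behaviours can be sketched on
      a toy type representation. The `Ty` and `Vis` types and both functions
      below are simplified assumptions for illustration and are not the GHC
      definitions.

      ```haskell
      -- Whether a forall binder is invisible (forall a.) or required (forall a ->).
      data Vis = Invisible | Required deriving (Eq, Show)

      data Ty
        = ForAll Vis String Ty
        | Base String
        deriving (Eq, Show)

      -- Split off every leading forall, regardless of visibility.
      splitForAllTyVars :: Ty -> ([String], Ty)
      splitForAllTyVars (ForAll _ v body) =
        let (vs, rest) = splitForAllTyVars body in (v : vs, rest)
      splitForAllTyVars ty = ([], ty)

      -- Split off only the leading invisible foralls; stop at the first required one.
      splitForAllInvisTyVars :: Ty -> ([String], Ty)
      splitForAllInvisTyVars (ForAll Invisible v body) =
        let (vs, rest) = splitForAllInvisTyVars body in (v : vs, rest)
      splitForAllInvisTyVars ty = ([], ty)
      ```

      For a type shaped like `forall a. forall b -> T`, the first function
      returns both binders while the second returns only `a` and leaves the
      required `forall b -> T` intact.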
      
      I say "most uses" above since there is one notable place where we _do_ want
      to use `tcSplitForAllTyVars`: in `GHC.Tc.Validity.forAllTyErr`, which produces
      the "`Illegal polymorphic type`" error message if you try to use a higher-rank
      `forall` without having `RankNTypes` enabled. Here, we really do want to split
      all `forall`s, not just invisible ones, or we run the risk of giving an
      inaccurate error message in the newly added `T18939_Fail` test case.
      
      I debated at some length whether I wanted to name the new function
      `tcSplitForAllInvisTyVars` or `tcSplitForAllTyVarsInvisible`, but in the end,
      I decided that I liked the former better. For consistency's sake, I opted to
      rename the existing `splitPiTysInvisible` and `splitPiTysInvisibleN` functions
      to `splitInvisPiTys` and `splitInvisPiTysN`, respectively, so that they use the
      same naming convention. As a consequence, this ended up requiring a `haddock`
      submodule bump.
      
      Fixes #18939.
      645444af
    • Ryan Scott's avatar
      Name (tc)SplitForAll- functions more consistently · d61adb3d
      Ryan Scott authored
      There is a zoo of `splitForAll-` functions in `GHC.Core.Type` (as well as
      `tcSplitForAll-` functions in `GHC.Tc.Utils.TcType`) that all do very similar
      things, but vary in the particular form of type variable that they return. To
      make things worse, the names of these functions are often quite misleading.
      Some particularly egregious examples:
      
      * `splitForAllTys` returns `TyCoVar`s, but `splitSomeForAllTys` returns
        `VarBndr`s.
      * `splitSomeForAllTys` returns `VarBndr`s, but `tcSplitSomeForAllTys` returns
        `TyVar`s.
      * `splitForAllTys` returns `TyCoVar`s, but `splitForAllTysInvis` returns
        `InvisTVBinder`s. (This in particular arose in the context of #18939, and
        this finally motivated me to bite the bullet and improve the status quo
        vis-à-vis how we name these functions.)
      
      In an attempt to bring some sanity to how these functions are named, I have
      opted to rename most of these functions en masse to use consistent suffixes
      that describe the particular form of type variable that each function returns.
      In concrete terms, this amounts to:
      
      * Functions that return a `TyVar` now use the suffix `-TyVar`.
        This caused the following functions to be renamed:
        * `splitTyVarForAllTys` -> `splitForAllTyVars`
        * `splitForAllTy_ty_maybe` -> `splitForAllTyVar_maybe`
        * `tcSplitForAllTys` -> `tcSplitForAllTyVars`
        * `tcSplitSomeForAllTys` -> `tcSplitSomeForAllTyVars`
      * Functions that return a `CoVar` now use the suffix `-CoVar`.
        This caused the following functions to be renamed:
        * `splitForAllTy_co_maybe` -> `splitForAllCoVar_maybe`
      * Functions that return a `TyCoVar` now use the suffix `-TyCoVar`.
        This caused the following functions to be renamed:
        * `splitForAllTy` -> `splitForAllTyCoVar`
        * `splitForAllTys` -> `splitForAllTyCoVars`
        * `splitForAllTys'` -> `splitForAllTyCoVars'`
        * `splitForAllTy_maybe` -> `splitForAllTyCoVar_maybe`
      * Functions that return a `VarBndr` now use the suffix corresponding to the
        most relevant type synonym. This caused the following functions to be renamed:
        * `splitForAllVarBndrs` -> `splitForAllTyCoVarBinders`
        * `splitForAllTysInvis` -> `splitForAllInvisTVBinders`
        * `splitForAllTysReq` -> `splitForAllReqTVBinders`
        * `splitSomeForAllTys` -> `splitSomeForAllTyCoVarBndrs`
        * `tcSplitForAllVarBndrs` -> `tcSplitForAllTyVarBinders`
        * `tcSplitForAllTysInvis` -> `tcSplitForAllInvisTVBinders`
        * `tcSplitForAllTysReq` -> `tcSplitForAllReqTVBinders`
        * `tcSplitForAllTy_maybe` -> `tcSplitForAllTyVarBinder_maybe`
      
      Note that I left the following functions alone:
      
      * Functions that split apart things besides `ForAllTy`s, such as `splitFunTys`
        or `splitPiTys`. Thankfully, there are far fewer of these functions than
        there are functions that split apart `ForAllTy`s, so there isn't much of a
        pressing need to apply the new naming convention elsewhere.
      * Functions that split apart `ForAllCo`s in `Coercion`s, such as
        `GHC.Core.Coercion.splitForAllCo_maybe`. We could theoretically apply the new
        naming convention here, but then we'd have to figure out how to disambiguate
        `Type`-splitting functions from `Coercion`-splitting functions. Ultimately,
        the `Coercion`-splitting functions aren't used nearly as much as the
        `Type`-splitting functions, so I decided to leave the former alone.
      
      This is purely refactoring and should cause no change in behavior.
      d61adb3d
    • Ben Gamari's avatar
    • Ben Gamari's avatar
      nativeGen/dwarf: Only produce DW_AT_source_note DIEs in -g3 · 1e19183d
      Ben Gamari authored
      Standard debugging tools don't know how to understand these so let's not
      produce them unless asked.
      1e19183d
    • Ben Gamari's avatar
      nativeGen/dwarf: Fix procedure end addresses · 0a7e592c
      Ben Gamari authored
      Previously the `.debug_aranges` and `.debug_info` (DIE) DWARF
      information would claim that procedures (represented with a
      `DW_TAG_subprogram` DIE) would only span the range covered by their entry
      block. This omitted all of the continuation blocks (represented by
      `DW_TAG_lexical_block` DIEs), confusing `perf`. Fix this by introducing
      an end-of-procedure label and using this as the `DW_AT_high_pc` of
      procedure `DW_TAG_subprogram` DIEs.
      
      Fixes #17605.
      0a7e592c
  13. 13 Nov, 2020 2 commits
    • Sebastian Graf's avatar
      Arity: Emit "Exciting arity" warning only after second iteration (#18937) · 197d59fa
      Sebastian Graf authored
      See Note [Exciting arity] for why we emit the warning at all and why we
      now only do so after the second iteration.
      
      Fixes #18937.
      197d59fa
    • Sebastian Graf's avatar
      Arity: Rework `ArityType` to fix monotonicity (#18870) · 63fa3997
      Sebastian Graf authored
      As we found out in #18870, `andArityType` is not monotone, with
      potentially severe consequences for termination of fixed-point
      iteration. That showed in an abundance of "Exciting arity" DEBUG
      messages that are emitted whenever we do more than one step in
      fixed-point iteration.
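
      For readers unfamiliar with why monotonicity matters here, the
      following is a generic sketch of fixed-point iteration that also counts
      how many steps changed the value. It is not GHC's arity analysis;
      `fixpointSteps` is a hypothetical helper. If the step function is
      monotone over a domain of finite height, the loop must terminate,
      whereas a non-monotone step function can oscillate forever.

      ```haskell
      -- Iterate f from a starting value until the result stops changing.
      -- Returns the fixed point and the number of steps that changed the value.
      fixpointSteps :: Eq a => (a -> a) -> a -> (a, Int)
      fixpointSteps f = go 0
        where
          go n x
            | x' == x   = (x, n)
            | otherwise = go (n + 1) x'
            where
              x' = f x
      ```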
      
      The solution necessitates also recording `OneShotInfo` info for
      `ABot` arity type. Thus we get the following definition for `ArityType`:
      
      ```
      data ArityType = AT [OneShotInfo] Divergence
      ```
      
      The majority of changes in this patch are the result of refactoring use
      sites of `ArityType` to match the new definition.
      
      The regression test `T18870` asserts that we indeed don't emit any DEBUG
      output anymore for a function where we previously would have.
      Similarly, there's a regression test `T18937` for #18937, which we
      expect to be broken for now.
      
      Fixes #18870.
      63fa3997
  14. 12 Nov, 2020 1 commit
    • Ben Gamari's avatar
      compiler: Fix recompilation checking · 5353fd50
      Ben Gamari authored
      In ticket #18733 we noticed a rather serious deficiency in the current
      fingerprinting logic for recursive groups. I have described the old
      fingerprinting story and its problems in Note [Fingerprinting recursive
      groups] and have reworked the story accordingly to avoid these issues.
      
      Fixes #18733.
      5353fd50
  15. 11 Nov, 2020 6 commits
    • Krzysztof Gogolewski's avatar
      5506f134
    • Ömer Sinan Ağacan's avatar
      Fix and enable object unloading in GHCi · c34a4b98
      Ömer Sinan Ağacan authored
      Fixes #16525 by tracking dependencies between object file symbols and
      marking symbol liveness during garbage collection.
      
      See Note [Object unloading] in CheckUnload.c for details.
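
      The general idea of marking liveness through dependencies can be
      sketched as a reachability computation. This toy model works at the
      granularity of whole objects with `String` names, which is an
      assumption made for brevity; the actual implementation lives in the
      RTS (in C) and is described in the Note referenced above.

      ```haskell
      type Object = String

      -- All objects reachable from the roots by following dependency edges.
      liveObjects :: [(Object, [Object])] -> [Object] -> [Object]
      liveObjects deps = go []
        where
          go seen [] = reverse seen
          go seen (o : todo)
            | o `elem` seen = go seen todo
            | otherwise     = go (o : seen) (maybe [] id (lookup o deps) ++ todo)

      -- Loaded objects that are not reachable from any root may be unloaded.
      unloadable :: [(Object, [Object])] -> [Object] -> [Object]
      unloadable deps roots = [o | (o, _) <- deps, o `notElem` live]
        where
          live = liveObjects deps roots
      ```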
      c34a4b98
    • Ben Gamari's avatar
      Enable -fexpose-internal-symbols when debug level >=2 · 584058dd
      Ben Gamari authored
      This seems like a reasonable default as the object file size increases
      by around 5%.
      584058dd
    • Ben Gamari's avatar
      codeGen: Produce local symbols for module-internal functions · c6264a2d
      Ben Gamari authored
      It turns out that some important native debugging/profiling tools (e.g.
      perf) rely only on symbol tables for function name resolution (as
      opposed to using DWARF DIEs). However, previously GHC would emit
      temporary symbols (e.g. `.La42b`) to identify module-internal
      entities. Such symbols are dropped during linking and therefore not
      visible to runtime tools (in addition to having rather un-helpful unique
      names). For instance, `perf report` would often end up attributing all
      cost to the libc `frame_dummy` symbol since Haskell code was not covered
      by any proper symbol (see #17605).
      
      We now rather follow the model of C compilers and emit
      descriptively-named local symbols for module internal things. Since this
      will increase object file size this behavior can be disabled with the
      `-fno-expose-internal-symbols` flag.
      
      With this `perf record` can finally be used against Haskell executables.
      Even more, with `-g3` `perf annotate` provides inline source code.
      c6264a2d
    • Ben Gamari's avatar
      Move this_module into NCGConfig · 6e23695e
      Ben Gamari authored
      In various places in the NCG we need the Module currently being
      compiled. Let's move this into the environment instead of chewing through
      another register.
      6e23695e
    • Ben Gamari's avatar
      nativeGen: Make makeImportsDoc take an NCGConfig rather than DynFlags · fcfda909
      Ben Gamari authored
      It appears this was an oversight as there is no reason the full DynFlags
      is necessary.
      fcfda909
  16. 06 Nov, 2020 2 commits
    • Moritz Angermann's avatar
      [AArch64] Aarch64 Always PIC · 2cb87909
      Moritz Angermann authored
      2cb87909
    • Sylvain Henry's avatar
      Refactor -dynamic-too handling · c85f4928
      Sylvain Henry authored
      1) Don't modify DynFlags (too much) for -dynamic-too: now when we
         generate dynamic outputs for "-dynamic-too", we only set "dynamicNow"
         boolean field in DynFlags instead of modifying several other fields.
         These fields now have accessors that take dynamicNow into account.
      
      2) Use DynamicTooState ADT to represent -dynamic-too state. It's much
         clearer than the undocumented "DynamicTooConditional" that was used
         before.
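
      The accessor pattern from point 1 can be sketched as follows. The
      record is a toy stand-in with only three fields, and the field and
      accessor names are assumptions chosen for the example rather than a
      copy of the real `DynFlags`.

      ```haskell
      -- Toy flags record: raw suffix fields plus the "dynamicNow" switch.
      data Flags = Flags
        { objectSuf_    :: String
        , dynObjectSuf_ :: String
        , dynamicNow    :: Bool
        }

      -- Accessor that takes dynamicNow into account, so callers never need to
      -- overwrite objectSuf_ when producing the dynamic outputs.
      objectSuf :: Flags -> String
      objectSuf fl
        | dynamicNow fl = dynObjectSuf_ fl
        | otherwise     = objectSuf_ fl
      ```

      Flipping the single boolean switches every such accessor at once,
      which is what makes it unnecessary to rewrite several fields.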
      
      As a result, we can finally remove the hscs_iface_dflags field in
      HscRecomp. There was a comment on this field saying:
      
         "FIXME (osa): I don't understand why this is necessary, but I spent
         almost two days trying to figure this out and I couldn't .. perhaps
         someone who understands this code better will remove this later."
      
      I don't fully understand the details, but it was needed because of the
      changes made to the DynFlags for -dynamic-too.
      
      There is still something very dubious in GHC.Iface.Recomp: we have to
      disable the "dynamicNow" flag at some point for some Backpack's "heinous
      hack" to continue to work. It may be because interfaces for indefinite
      units are always non-dynamic, or because we mix and match dynamic and
      non-dynamic interfaces (#9176), or something else, who knows?
      c85f4928