  1. Jul 16, 2019
  2. Jul 14, 2019
    • Expunge #ifdef and #ifndef from the codebase · d7c6c471
      John Ericson authored and Marge Bot committed
      These are unexploded mines as far as the linter is concerned. I don't
      want to hit one in my MRs by mistake!
      
      I did this with `sed`, and then rolled back some changes in the docs,
      config.guess, and the linter itself.
  3. Jul 10, 2019
    • Remove most uses of TARGET platform macros · 0472f0f6
      John Ericson authored and Marge Bot committed
      These prevent multi-target builds. They were removed in four ways:
      
      1. In the compiler itself, replacing `#if` with runtime `if`. In these
      cases, we care about the target platform still, but the target platform
      is dynamic so we must delay the elimination to run time.
      
      2. In the compiler itself, replacing `TARGET` with `HOST`. There was
      just one bit of this, in some code splitting strings representing lists
      of paths. These paths are used by GHC itself, and not by the compiled
      binary. (They are compiler lookup paths, rather than RPATHS or something
      that does matter to the compiled binary, and thus would legitimately
      be target-sensitive.) As such, the path-splitting method only depends on
      where GHC runs and not where code it produces runs. This should have
      been `HOST` all along.
      
      3. Changing the RTS. The RTS doesn't care about the target platform,
      full stop.
      
      4. `includes/stg/HaskellMachRegs.h`: this file is also included in the
      genapply executable. This is tricky because the RTS's host platform
      really is that utility's target platform, so that utility really isn't
      multi-target either. But at least it isn't an installed part of GHC,
      just a one-off tool used when building the RTS. Lying with `HOST` to a
      one-off program (genapply) that isn't installed doesn't seem so bad.
      It's certainly better than the other way around: lying to the RTS but
      not to genapply. The RTS is more important, and it is installed,
      *and* this header is installed as part of the RTS.
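
      As a rough illustration of points 1 and 2 (this is not code from GHC;
      the `Os` type and both function names are hypothetical, and C is used
      only for illustration), a platform test that would otherwise be a
      preprocessor conditional becomes an ordinary function of a run-time
      value, and the search-path separator is chosen from the host rather
      than the target:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical platform description; a real compiler would carry far more. */
typedef enum { OS_LINUX, OS_WINDOWS } Os;

/* Point 1: instead of a preprocessor conditional on the target, which bakes
 * one target into the binary, the target is passed around as a value and
 * inspected at run time. */
bool target_is_windows(Os target)
{
    return target == OS_WINDOWS;
}

/* Point 2: lists of lookup paths are read by the compiler itself, so the
 * separator depends on where the compiler runs (the host), not on where
 * the code it produces will run (the target). */
char search_path_separator(Os host)
{
    assert(host == OS_LINUX || host == OS_WINDOWS);
    return host == OS_WINDOWS ? ';' : ':';
}
```

      With this shape, one compiler binary can serve several targets,
      because nothing about the target is fixed when the compiler is built.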
  4. Jul 05, 2019
  5. Jul 02, 2019
  6. Jun 28, 2019
    • rts: Assert that LDV profiling isn't used with parallel GC · bd660ede
      Ben Gamari authored
      I'm not entirely sure we are careful about ensuring this; this is a
      last-ditch check.
    • Correct closure observation, construction, and mutation on weak memory machines. · 11bac115
      Travis Whitaker authored and Ben Gamari committed
      
      Here the following changes are introduced:
          - A read barrier machine op is added to Cmm.
          - The order in which a closure's fields are read and written is changed.
          - Memory barriers are added to RTS code to ensure correctness on
            out-of-order machines with weak memory ordering.
      
      Cmm has a new CallishMachOp called MO_ReadBarrier. On weak memory machines, this
      is lowered to an instruction that ensures memory reads that occur after said
      instruction in program order are not performed before reads coming before said
      instruction in program order. On machines with strong memory ordering properties
      (e.g. X86, SPARC in TSO mode) no such instruction is necessary, so
      MO_ReadBarrier is simply erased. However, such an instruction is necessary on
      weakly ordered machines, e.g. ARM and PowerPC.
      
      Weak memory ordering has consequences for how closures are observed and mutated.
      For example, consider a closure that needs to be updated to an indirection. In
      order for the indirection to be safe for concurrent observers to enter, said
      observers must read the indirection's info table before they read the
      indirectee. Furthermore, the entering observer makes assumptions about the
      closure based on its info table contents, e.g. an INFO_TYPE of IND implies the
      closure has an indirectee pointer that is safe to follow.
      
      When a closure is updated with an indirection, both its info table and its
      indirectee must be written. With weak memory ordering, these two writes can be
      arbitrarily reordered, and perhaps even interleaved with other threads' reads
      and writes (in the absence of memory barrier instructions). Consider this
      example of a bad reordering:
      
      - An updater writes to a closure's info table (INFO_TYPE is now IND).
      - A concurrent observer branches upon reading the closure's INFO_TYPE as IND.
      - A concurrent observer reads the closure's indirectee and enters it. (!!!)
      - An updater writes the closure's indirectee.
      
      Here the update to the indirectee comes too late and the concurrent observer has
      jumped off into the abyss. Speculative execution can also cause issues.
      Consider:
      
      - An observer is about to case on a value in a closure's info table.
      - The observer speculatively reads one or more of the closure's fields.
      - An updater writes to the closure's info table.
      - The observer takes a branch based on the new info table value, but with the
        old closure fields!
      - The updater writes to the closure's other fields, but it's too late.
      
      Because of these effects, reads and writes to a closure's info table must be
      ordered carefully with respect to reads and writes to the closure's other
      fields, and memory barriers must be placed to ensure that reads and writes occur
      in program order. Specifically, updates to a closure must follow this
      pattern:
      
      - Update the closure's (non-info table) fields.
      - Write barrier.
      - Update the closure's info table.
      
      Observing a closure's fields must follow this pattern:
      
      - Read the closure's info pointer.
      - Read barrier.
      - Read the closure's (non-info table) fields.
      
      This patch updates RTS code to obey this pattern. This should fix long-standing
      SMP bugs on ARM (specifically newer aarch64 microarchitectures supporting
      out-of-order execution) and PowerPC. This fixes issue #15449.
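
      The update and observation patterns above can be sketched with C11
      fences. This is a minimal illustration, not the RTS code: the `Closure`
      layout, the dummy info tables, and the function names are hypothetical,
      and the C11 fences stand in for whatever barrier primitives the RTS
      actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified closure: an info pointer plus one payload field. */
typedef struct {
    _Atomic(const void *) info;  /* stands in for the info pointer */
    void *indirectee;            /* stands in for a non-info-table field */
} Closure;

/* Dummy info tables; only their addresses matter in this sketch. */
static const int THUNK_info = 0;
static const int IND_info = 1;

void init_thunk(Closure *c)
{
    assert(c != NULL);
    atomic_init(&c->info, (const void *)&THUNK_info);
    c->indirectee = NULL;
}

/* Updater: write the fields, issue a write barrier, then write the info pointer. */
void update_with_indirection(Closure *c, void *target)
{
    assert(c != NULL);
    c->indirectee = target;
    atomic_thread_fence(memory_order_release);  /* write barrier */
    atomic_store_explicit(&c->info, (const void *)&IND_info,
                          memory_order_relaxed);
}

/* Observer: read the info pointer, issue a read barrier, then read the fields.
 * Returns the indirectee if the closure is an indirection, otherwise NULL. */
void *observe(Closure *c)
{
    assert(c != NULL);
    const void *info = atomic_load_explicit(&c->info, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);  /* read barrier */
    if (info == (const void *)&IND_info)
        return c->indirectee;
    return NULL;
}
```

      An observer that sees the indirection's info pointer is then guaranteed
      to see the indirectee written before the updater's barrier.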
      
      Co-Authored-By: Ben Gamari <ben@well-typed.com>
    • Fix GCC warnings with __clear_cache builtin (#16867) · 4ec233ec
      Sylvain Henry authored and Marge Bot committed
  7. Jun 27, 2019
    • rts: Do not traverse nursery for dead closures in LDV profile · 07cffc49
      Matthew Pickering authored and Marge Bot committed
      It is important that `heapCensus` and `LdvCensusForDead` traverse the
      same areas.
      
      `heapCensus` increases the `not_used` counter which tracks how many
      closures are live but haven't been used yet.
      
      `LdvCensusForDead` increases the `void_total` counter which tracks how
      many dead closures there are.
      
      The `LAG` is then calculated by subtracting the `void_total` from
      `not_used` and so it is essential that `not_used >= void_total`. This
      fact is checked by quite a few assertions.
      
      However, if a program has low maximum residency but allocates a lot in
      the nursery then these assertions were failing (see #16753 and #15903)
      because `LdvCensusForDead` was observing dead closures from the nursery
      which totalled more than the `not_used`. The same closures were not
      counted by `heapCensus`.
      
      Therefore, it seems that the correct fix is to make `LdvCensusForDead`
      agree with `heapCensus` and not traverse the nursery for dead closures.
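
      The invariant can be sketched as follows. This is an illustration only:
      the `Census` struct and `census_lag` are hypothetical names, with fields
      named after the counters described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-census counters, named after those in the text. */
typedef struct {
    size_t not_used;    /* live closures that have not been used yet */
    size_t void_total;  /* dead closures */
} Census;

/* LAG = not_used - void_total. With unsigned counters the subtraction
 * wraps around if void_total exceeds not_used, which is why both counters
 * must be gathered over the same areas of the heap. */
size_t census_lag(const Census *c)
{
    assert(c != NULL);
    assert(c->void_total <= c->not_used);  /* equality is allowed */
    return c->not_used - c->void_total;
}
```

      If one traversal covers the nursery and the other does not,
      `void_total` can overtake `not_used` and the assertion fires.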
      
      Fixes #16100 #16753 #15903 #8982
    • rts: Correct assertion in LDV_recordDead · ed4cbd93
      Matthew Pickering authored and Marge Bot committed
      It is possible for void_total to be exactly equal to not_used, and the
      other assertions for this invariant check for <= rather than <.
    • rts: Correct handling of LARGE ARR_WORDS in LDV profiler · a586b33f
      Matthew Pickering authored and Marge Bot committed
      This implements the correct fix for #11627 by skipping over the slop
      (which is zeroed) rather than adding special case logic for LARGE
      ARR_WORDS which runs the risk of not performing a correct census by
      ignoring any subsequent blocks.
      
      This approach implements similar logic to that in Sanity.c.
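
      A toy version of the slop-skipping idea, assuming a made-up heap layout
      in which every object begins with a non-zero header word holding its
      size in words. The layout and both function names are hypothetical and
      are not the RTS's:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Advance past zeroed slop; returns the index of the first non-zero word
 * at or after i, or n if only slop remains. */
size_t skip_slop(const uintptr_t *words, size_t i, size_t n)
{
    assert(words != NULL);
    while (i < n && words[i] == 0)
        i++;
    return i;
}

/* Count the objects in a block of n words, stepping over each object by
 * its size and skipping any slop between objects instead of stopping. */
size_t count_objects(const uintptr_t *words, size_t n)
{
    size_t count = 0;
    size_t i = skip_slop(words, 0, n);
    while (i < n) {
        size_t size = (size_t)words[i];  /* non-zero, so the loop advances */
        if (size > n - i)
            break;                       /* malformed toy input: do not overrun */
        count++;
        i = skip_slop(words, i + size, n);
    }
    return count;
}
```

      Because the walk resumes after the slop, objects that follow it are
      still counted.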
  8. Jun 26, 2019
  9. Jun 22, 2019
    • rts: Reset STATIC_LINK field of reverted CAFs · b0d6bf2a
      Ben Gamari authored and Marge Bot committed
      When we revert a CAF we must reset the STATIC_LINK field; otherwise the
      GC might ignore the CAF (e.g. as it carries the STATIC_FLAG_LIST flag)
      and consequently overlook references to object code that we are trying
      to unload. This would result in reachable object code being
      unloaded. See Note [CAF lists] and Note [STATIC_LINK fields].
      
      This fixes #16842.
      
      Idea-due-to: Phuong Trinh <lolotp@fb.com>
  10. Jun 13, 2019
  11. Jun 12, 2019
  12. Jun 11, 2019
  13. Jun 09, 2019
  14. Jun 08, 2019
  15. Jun 07, 2019
  16. Jun 01, 2019
  17. May 31, 2019
  18. May 30, 2019
  19. May 29, 2019
  20. May 27, 2019
    • Fix padding of entries in .prof files · 95b79173
      Jasper Van der Jeugt authored and Marge Bot committed
      When the number of entries of a cost centre reaches 11 digits, it takes
      up the whole space reserved for it and the prof file ends up looking
      like:
      
          ... no.        entries  %time %alloc   %time %alloc
      
              ...
          ... 120918     978250    0.0    0.0     0.0    0.0
          ... 118891          0    0.0    0.0    73.3   80.8
          ... 11890229702412351    8.9   13.5    73.3   80.8
          ... 118903  153799689    0.0    0.1     0.0    0.1
              ...
      
      This results in tooling not being able to parse the .prof file.  I
      realise we have the JSON output as well now, but still it'd be good to
      fix this little weirdness.
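
      The symptom can be reproduced with two fixed-width fields that have no
      separator between them. The format strings and function names below are
      assumptions chosen to match the sample output above, not the RTS's
      actual ones, and the literal space is just one possible remedy:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Two adjacent fixed-width fields: once `entries` needs all 11 columns,
 * nothing separates it from the preceding number. */
int format_row_fixed_width(char *buf, size_t len, unsigned no, uint64_t entries)
{
    assert(buf != NULL);
    return snprintf(buf, len, "%6u%11" PRIu64, no, entries);
}

/* Same fields with a literal space between them, so the columns stay
 * separable however wide `entries` becomes. */
int format_row_separated(char *buf, size_t len, unsigned no, uint64_t entries)
{
    assert(buf != NULL);
    return snprintf(buf, len, "%6u %11" PRIu64, no, entries);
}
```

      With `no = 118902` and `entries = 29702412351`, the first variant
      yields `11890229702412351`, matching the unparseable row above.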
      
      Original bug report and full prof file can be seen here:
      <https://github.com/jaspervdj/profiteur/issues/28>.
  21. May 25, 2019
  22. May 22, 2019
    • RTS: Fix restrictive cast · ecc9366a
      Alec Theriault authored and Marge Bot committed
      Commit e75a9afd added an `unsigned` cast
      to account for OSes that have a signed `rlim_t`. Unfortunately,
      the `unsigned` cast has the unintended effect of narrowing `rlim_t` to
      only 4 bytes. This leads to some spurious out of memory crashes
      (in particular: Haddock crashes with OOM when building docs of
      `ghc`-the-library).
      
      In this case, `W_` is a better type to cast to: we know it will be
      unsigned too and it has the same type as `*len` (so we don't suffer from
      accidental narrowing).
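
      The narrowing can be shown in isolation. In this sketch `toy_rlim_t`
      and `toy_W_` are hypothetical stand-ins for `rlim_t` and `W_`, fixed at
      64 bits here, and the platform is assumed to have a 32-bit `unsigned`:

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t  toy_rlim_t;  /* assumption: a signed 64-bit limit type */
typedef uint64_t toy_W_;      /* assumption: an unsigned 64-bit machine word */

static_assert(sizeof(unsigned) == 4, "sketch assumes a 32-bit unsigned");

/* Casting to `unsigned` removes the sign but also keeps only the low
 * 32 bits, so limits of 4 GiB or more are mangled. */
uint64_t limit_via_unsigned(toy_rlim_t lim)
{
    return (unsigned)lim;
}

/* Casting to a word-sized unsigned type removes the sign without
 * narrowing. */
uint64_t limit_via_word(toy_rlim_t lim)
{
    return (toy_W_)lim;
}
```

      An 8 GiB limit becomes 0 through the `unsigned` cast, which is
      consistent with the spurious out-of-memory failures described above.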