1. 17 Dec, 2016 1 commit
    • Sergei Trofimovich's avatar
      UNREG: include CCS_OVERHEAD to STG · 2fa00f5b
      Sergei Trofimovich authored
      Commit 394231b3 aded
      CCS_OVERHEAD annotation to 'rts/Apply.cmm'.
      
      Before the change CCS_OVERHEAD was used only in C code.
      
      The change exports CCS_OVERHEAD to STG.
      
      Fixes UNREG build failure:
        rts_dist_HC rts/dist/build/Apply.p_o
          /tmp/ghc29563_0/ghc_4.hc: In function 'cm_entry':
      
          /tmp/ghc29563_0/ghc_4.hc:73:13: error:
           error: 'CCS_OVERHEAD' undeclared (first use in this function)
           *((P_)((W_)&CCS_OVERHEAD+72)) = ...
                       ^~~~~~~~~~~~
      Signed-off-by: default avatarSergei Trofimovich <siarheit@google.com>
      2fa00f5b
  2. 07 Dec, 2016 1 commit
    • Simon Marlow's avatar
      Overhaul of Compact Regions (#12455) · 7036fde9
      Simon Marlow authored
      Summary:
      This commit makes various improvements and addresses some issues with
      Compact Regions (aka Compact Normal Forms).
      
      This was the most important thing I wanted to fix.  Compaction
      previously prevented GC from running until it was complete, which
      would be a problem in a multicore setting.  Now, we compact using a
      hand-written Cmm routine that can be interrupted at any point.  When a
      GC is triggered during a sharing-enabled compaction, the GC has to
      traverse and update the hash table, so this hash table is now stored
      in the StgCompactNFData object.
      
      Previously, compaction consisted of a deepseq using the NFData class,
      followed by a traversal in C code to copy the data.  This is now done
      in a single pass with hand-written Cmm (see rts/Compact.cmm). We no
      longer use the NFData instances, instead the Cmm routine evaluates
      components directly as it compacts.
      
      The new compaction is about 50% faster than the old one with no
      sharing, and a little faster on average with sharing (the cost of the
      hash table dominates when we're doing sharing).
      
      Static objects that don't (transitively) refer to any CAFs don't need
      to be copied into the compact region.  In particular this means we
      often avoid copying Char values and small Int values, because these
      are static closures in the runtime.
      
      Each Compact# object can support a single compactAdd# operation at any
      given time, so the Data.Compact library now enforces mutual exclusion
      using an MVar stored in the Compact object.
      
      We now get exceptions rather than killing everything with a barf()
      when we encounter an object that cannot be compacted (a function, or a
      mutable object).  We now also detect pinned objects, which can't be
      compacted either.
      
      The Data.Compact API has been refactored and cleaned up.  A new
      compactSize operation returns the size (in bytes) of the compact
      object.
      
      Most of the documentation is in the Haddock docs for the compact
      library, which I've expanded and improved here.
      
      Various comments in the code have been improved, especially the main
      Note [Compact Normal Forms] in rts/sm/CNF.c.
      
      I've added a few tests, and expanded a few of the tests that were
      there.  We now also run the tests with GHCi, and in a new test way
      that enables sanity checking (+RTS -DS).
      
      There's a benchmark in libraries/compact/tests/compact_bench.hs for
      measuring compaction speed and comparing sharing vs. no sharing.
      
      The field totalDataW in StgCompactNFData was unnecessary.
      
      Test Plan:
      * new unit tests
      * validate
      * tested manually that we can compact Data.Aeson data
      
      Reviewers: gcampax, bgamari, ezyang, austin, niteria, hvr, erikd
      
      Subscribers: thomie, simonpj
      
      Differential Revision: https://phabricator.haskell.org/D2751
      
      GHC Trac Issues: #12455
      7036fde9
  3. 30 Aug, 2016 1 commit
    • mniip's avatar
      Tag pointers in interpreted constructors · a25bf267
      mniip authored
      Instead of stg_interp_constr_entry there are now 7 functions (one for
      each value of the tag bits) that tag the constructor pointer before
      returning. This is consistent with compiled constructors' entry code,
      and expectations that compiled code places on compiled constructors. The
      iserv protocol is extended with an extra field that explains what
      pointer tag the constructor should use.
      
      Test Plan: Added tests for #12523
      
      Reviewers: erikd, bgamari, hvr, austin, simonmar
      
      Reviewed By: simonmar
      
      Subscribers: osa1, thomie, rwbarton
      
      Differential Revision: https://phabricator.haskell.org/D2473
      
      GHC Trac Issues: #12523
      a25bf267
  4. 21 Jul, 2016 1 commit
    • Ömer Sinan Ağacan's avatar
      Implement unboxed sum primitive type · 714bebff
      Ömer Sinan Ağacan authored
      Summary:
      This patch implements primitive unboxed sum types, as described in
      https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes.
      
      Main changes are:
      
      - Add new syntax for unboxed sums types, terms and patterns. Hidden
        behind `-XUnboxedSums`.
      
      - Add unlifted unboxed sum type constructors and data constructors,
        extend type and pattern checkers and desugarer.
      
      - Add new RuntimeRep for unboxed sums.
      
      - Extend unarise pass to translate unboxed sums to unboxed tuples right
        before code generation.
      
      - Add `StgRubbishArg` to `StgArg`, and a new type `CmmArg` for better
        code generation when sum values are involved.
      
      - Add user manual section for unboxed sums.
      
      Some other changes:
      
      - Generalize `UbxTupleRep` to `MultiRep` and `UbxTupAlt` to
        `MultiValAlt` to be able to use those with both sums and tuples.
      
      - Don't use `tyConPrimRep` in `isVoidTy`: `tyConPrimRep` is really
        wrong, given an `Any` `TyCon`, there's no way to tell what its kind
        is, but `kindPrimRep` and in turn `tyConPrimRep` returns `PtrRep`.
      
      - Fix some bugs on the way: #12375.
      
      Not included in this patch:
      
      - Update Haddock for new the new unboxed sum syntax.
      
      - `TemplateHaskell` support is left as future work.
      
      For reviewers:
      
      - Front-end code is mostly trivial and adapted from unboxed tuple code
        for type checking, pattern checking, renaming, desugaring etc.
      
      - Main translation routines are in `RepType` and `UnariseStg`.
        Documentation in `UnariseStg` should be enough for understanding
        what's going on.
      
      Credits:
      
      - Johan Tibell wrote the initial front-end and interface file
        extensions.
      
      - Simon Peyton Jones reviewed this patch many times, wrote some code,
        and helped with debugging.
      
      Reviewers: bgamari, alanz, goldfire, RyanGlScott, simonpj, austin,
                 simonmar, hvr, erikd
      
      Reviewed By: simonpj
      
      Subscribers: Iceland_jack, ggreif, ezyang, RyanGlScott, goldfire,
                   thomie, mpickering
      
      Differential Revision: https://phabricator.haskell.org/D2259
      714bebff
  5. 20 Jul, 2016 1 commit
    • gcampax's avatar
      Compact Regions · cf989ffe
      gcampax authored
      This brings in initial support for compact regions, as described in the
      ICFP 2015 paper "Efficient Communication and Collection with Compact
      Normal Forms" (Edward Z. Yang et.al.) and implemented by Giovanni
      Campagna.
      
      Some things may change before the 8.2 release, but I (Simon M.) wanted
      to get the main patch committed so that we can iterate.
      
      What documentation there is is in the Data.Compact module in the new
      compact package.  We'll need to extend and polish the documentation
      before the release.
      
      Test Plan:
      validate
      (new test cases included)
      
      Reviewers: ezyang, simonmar, hvr, bgamari, austin
      
      Subscribers: vikraman, Yuras, RyanGlScott, qnikst, mboes, facundominguez, rrnewton, thomie, erikd
      
      Differential Revision: https://phabricator.haskell.org/D1264
      
      GHC Trac Issues: #11493
      cf989ffe
  6. 04 Jun, 2016 1 commit
  7. 18 May, 2016 1 commit
  8. 08 Mar, 2016 1 commit
    • Sergei Trofimovich's avatar
      Split external symbol prototypes (EF_) (Trac #11395) · 90e1e160
      Sergei Trofimovich authored
      Before the patch both Cmm and C symbols were declared
      with 'EF_' macro:
      
          #define EF_(f)    extern StgFunPtr f()
      
      but for Cmm symbols we know exact prototypes.
      
      The patch splits there prototypes in to:
      
          #define EFF_(f)   void f() /* See Note [External function prototypes] */
          #define EF_(f)    StgFunPtr f(void)
      
      Cmm functions are 'EF_' (External Functions),
      C functions are 'EFF_' (External Foreign Functions).
      
      While at it changed external C function prototype
      to return 'void' to workaround ghc bug on m68k.
      Described in detail in Trac #11395.
      
      This makes simple tests work on m68k-linux target!
      
      Thanks to Michael Karcher for awesome analysis
      happening in Trac #11395.
      Signed-off-by: default avatarSergei Trofimovich <siarheit@google.com>
      
      Test Plan: ran "hello world" on m68k successfully
      
      Reviewers: simonmar, austin, bgamari
      
      Reviewed By: bgamari
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D1975
      
      GHC Trac Issues: #11395
      90e1e160
  9. 27 Feb, 2016 1 commit
  10. 23 Jan, 2016 1 commit
    • Joachim Breitner's avatar
      Remove unused IND_PERM · f42db157
      Joachim Breitner authored
      it seems that this closure type has not been in use since 5d52d9, so all
      this is dead and untested code. This removes it. Some of the code might
      be useful for a counting indirection as described in #10613, so when
      implementing that, have a look at what this commit removes.
      
      Test Plan: validate on harbormaster
      
      Reviewers: austin, bgamari, simonmar
      
      Reviewed By: simonmar
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D1821
      f42db157
  11. 21 Dec, 2015 1 commit
    • Simon Marlow's avatar
      Maintain cost-centre stacks in the interpreter · c8c44fd9
      Simon Marlow authored
      Summary:
      Breakpoints become SCCs, so we have detailed call-stack info for
      interpreted code.  Currently this only works when GHC is compiled with
      -prof, but D1562 (Remote GHCi) removes this constraint so that in the
      future call stacks will be available without building your own GHCi.
      
      How can you get a stack trace?
      
      * programmatically: GHC.Stack.currentCallStack
      * I've added an experimental :where command that shows the stack when
        stopped at a breakpoint
      * `error` attaches a call stack automatically, although since calls to
        `error` are often lifted out to the top level, this is less useful
        than it might be (ImplicitParams still works though).
      * Later we might attach call stacks to all exceptions
      
      Other related changes in this diff:
      
      * I reduced the number of places that get ticks attached for
        breakpoints.  In particular there was a breakpoint around the whole
        declaration, which was often redundant because it bound no variables.
        This reduces clutter in the stack traces and speeds up compilation.
      
      * I tidied up some RealSrcSpan stuff in InteractiveUI, and made a few
        other small cleanups
      
      Test Plan: validate
      
      Reviewers: ezyang, bgamari, austin, hvr
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D1595
      
      GHC Trac Issues: #11047
      c8c44fd9
  12. 12 Nov, 2015 1 commit
  13. 20 Oct, 2014 1 commit
  14. 02 Oct, 2014 1 commit
    • Edward Z. Yang's avatar
      Rename _closure to _static_closure, apply naming consistently. · 35672072
      Edward Z. Yang authored
      Summary:
      In preparation for indirecting all references to closures,
      we rename _closure to _static_closure to ensure any old code
      will get an undefined symbol error.  In order to reference
      a closure foobar_closure (which is now undefined), you should instead
      use STATIC_CLOSURE(foobar).  For convenience, a number of these
      old identifiers are macro'd.
      
      Across C-- and C (Windows and otherwise), there were differing
      conventions on whether or not foobar_closure or &foobar_closure
      was the address of the closure.  Now, all foobar_closure references
      are addresses, and no & is necessary.
      
      CHARLIKE/INTLIKE were not changed, simply alpha-renamed.
      
      Part of remove HEAP_ALLOCED patch set (#8199)
      
      Depends on D265
      Signed-off-by: Edward Z. Yang's avatarEdward Z. Yang <ezyang@mit.edu>
      
      Test Plan: validate
      
      Reviewers: simonmar, austin
      
      Subscribers: simonmar, ezyang, carter, thomie
      
      Differential Revision: https://phabricator.haskell.org/D267
      
      GHC Trac Issues: #8199
      35672072
  15. 24 Sep, 2014 1 commit
  16. 17 Sep, 2014 1 commit
    • Herbert Valerio Riedel's avatar
      Implement `decodeDouble_Int64#` primop · b62bd5ec
      Herbert Valerio Riedel authored
      The existing `decodeDouble_2Int#` primop is rather inconvenient to use
      (and in fact is not even used by `integer-gmp`) as the mantissa is split
      into 3 components which would actually fit in an `Int64#` value.
      
      However, `decodeDouble_Int64#` is to be used by the new `integer-gmp2`
      re-implementation (see #9281).
      
      Moreover, `decodeDouble_2Int#` performs direct bit-wise operations on the
      IEEE representation which can be replaced by a combination of the
      portable standard C99 `scalbn(3)` and `frexp(3)` functions.
      
      Differential Revision: https://phabricator.haskell.org/D160
      b62bd5ec
  17. 16 Aug, 2014 3 commits
    • Gabor Greif's avatar
      Revert "Fix typos 'resizze'" · 53cc943a
      Gabor Greif authored
      this is z-encoding (as hvr tells me)
      
      This reverts commit 425d5178.
      53cc943a
    • Gabor Greif's avatar
      Fix typos 'resizze' · 425d5178
      Gabor Greif authored
      425d5178
    • Herbert Valerio Riedel's avatar
      Implement {resize,shrink}MutableByteArray# primops · 246436f1
      Herbert Valerio Riedel authored
      The two new primops with the type-signatures
      
        resizeMutableByteArray# :: MutableByteArray# s -> Int#
                                -> State# s -> (# State# s, MutableByteArray# s #)
      
        shrinkMutableByteArray# :: MutableByteArray# s -> Int#
                                -> State# s -> State# s
      
      allow to resize MutableByteArray#s in-place (when possible), and are useful
      for algorithms where memory is temporarily over-allocated. The motivating
      use-case is for implementing integer backends, where the final target size of
      the result is either N or N+1, and only known after the operation has been
      performed.
      
      A future commit will implement a stateful variant of the
      `sizeofMutableByteArray#` operation (see #9447 for details), since now the
      size of a `MutableByteArray#` may change over its lifetime (i.e before
      it gets frozen or GCed).
      
      Test Plan: ./validate --slow
      
      Reviewers: ezyang, austin, simonmar
      
      Reviewed By: austin, simonmar
      
      Differential Revision: https://phabricator.haskell.org/D133
      246436f1
  18. 30 Jun, 2014 1 commit
    • tibbe's avatar
      Re-add more primops for atomic ops on byte arrays · 4ee4ab01
      tibbe authored
      This is the second attempt to add this functionality. The first
      attempt was reverted in 950fcae4, due
      to register allocator failure on x86. Given how the register
      allocator currently works, we don't have enough registers on x86 to
      support cmpxchg using complicated addressing modes. Instead we fall
      back to a simpler addressing mode on x86.
      
      Adds the following primops:
      
       * atomicReadIntArray#
       * atomicWriteIntArray#
       * fetchSubIntArray#
       * fetchOrIntArray#
       * fetchXorIntArray#
       * fetchAndIntArray#
      
      Makes these pre-existing out-of-line primops inline:
      
       * fetchAddIntArray#
       * casIntArray#
      4ee4ab01
  19. 26 Jun, 2014 1 commit
  20. 24 Jun, 2014 1 commit
    • tibbe's avatar
      Add more primops for atomic ops on byte arrays · d8abf85f
      tibbe authored
      Summary:
      Add more primops for atomic ops on byte arrays
      
      Adds the following primops:
      
       * atomicReadIntArray#
       * atomicWriteIntArray#
       * fetchSubIntArray#
       * fetchOrIntArray#
       * fetchXorIntArray#
       * fetchAndIntArray#
      
      Makes these pre-existing out-of-line primops inline:
      
       * fetchAddIntArray#
       * casIntArray#
      d8abf85f
  21. 29 Mar, 2014 2 commits
    • tibbe's avatar
      Add missing symbols to linker · 838bfb22
      tibbe authored
      The copy array family of primops were moved out-of-line.
      838bfb22
    • tibbe's avatar
      Add SmallArray# and SmallMutableArray# types · 90329b6c
      tibbe authored
      These array types are smaller than Array# and MutableArray# and are
      faster when the array size is small, as they don't have the overhead
      of a card table. Having no card table reduces the closure size with 2
      words in the typical small array case and leads to less work when
      updating or GC:ing the array.
      
      Reduces both the runtime and memory allocation by 8.8% on my insert
      benchmark for the HashMap type in the unordered-containers package,
      which makes use of lots of small arrays. With tuned GC settings
      (i.e. `+RTS -A6M`) the runtime reduction is 15%.
      
      Fixes #8923.
      90329b6c
  22. 22 Mar, 2014 1 commit
    • tibbe's avatar
      codeGen: inline allocation optimization for clone array primops · 1eece456
      tibbe authored
      The inline allocation version is 69% faster than the out-of-line
      version, when cloning an array of 16 unit elements on a 64-bit
      machine.
      
      Comparing the new and the old primop implementations isn't
      straightforward. The old version had a missing heap check that I
      discovered during the development of the new version. Comparing the
      old and the new version would requiring fixing the old version, which
      in turn means reimplementing the equivalent of MAYBE_CG in StgCmmPrim.
      
      The inline allocation threshold is configurable via
      -fmax-inline-alloc-size which gives the maximum array size, in bytes,
      to allocate inline. The size does not include the closure header size.
      
      Allowing the same primop to be either inline or out-of-line has some
      implication for how we lay out heap checks. We always place a heap
      check around out-of-line primops, as they may allocate outside of our
      knowledge. However, for the inline primops we only allow allocation
      via the standard means (i.e. virtHp). Since the clone primops might be
      either inline or out-of-line the heap check layout code now consults
      shouldInlinePrimOp to know whether a primop will be inlined.
      1eece456
  23. 17 Feb, 2014 1 commit
  24. 13 Jan, 2014 1 commit
  25. 21 Nov, 2013 1 commit
    • Simon Marlow's avatar
      In the DEBUG rts, track when CAFs are GC'd · e82fa829
      Simon Marlow authored
      This resurrects some old code and makes it work again.  The idea is
      that we want to get an error message if we ever enter a CAF that has
      been GC'd, rather than following its indirection which will likely
      cause a segfault.  Without this patch, these bugs are hard to track
      down in gdb, because the IND_STATIC code overwrites R1 (the pointer to
      the CAF) with its indirectee before jumping into bad memory, so we've
      lost the address of the CAF that got GC'd.
      
      Some associated refactoring while I was here.
      e82fa829
  26. 01 Oct, 2013 2 commits
    • Simon Marlow's avatar
    • Simon Marlow's avatar
      Remove use of R9, and fix associated bugs · 11b5ce55
      Simon Marlow authored
      We were passing the function address to stg_gc_prim_p in R9, which was
      wrong because the call was a high-level call and didn't declare R9 as
      a parameter.  Passing R9 as an argument is the right way, but
      unfortunately that exposed another bug: we were using the same macro
      in some low-level Cmm, where it is illegal to call functions with
      arguments (see Note [syntax of cmm files]).  So we now have low-level
      variants of STK_CHK() and STK_CHK_P() for use in low-level Cmm code.
      11b5ce55
  27. 23 Sep, 2013 2 commits
  28. 21 Aug, 2013 3 commits
  29. 13 Jul, 2013 1 commit
  30. 10 Jul, 2013 1 commit
  31. 09 Jul, 2013 1 commit
  32. 22 Jun, 2013 1 commit
  33. 15 Jun, 2013 1 commit