  1. 30 Oct, 2017 1 commit
  2. 19 Sep, 2017 1 commit
    • compiler: introduce custom "GhcPrelude" Prelude · f63bc730
      Herbert Valerio Riedel authored
      This switches the compiler/ component to get compiled with
      -XNoImplicitPrelude, and an `import GhcPrelude` is inserted in all
      modules.
      This is motivated by the upcoming "Prelude" re-export of
      `Semigroup((<>))`, which would cause lots of name clashes in every
      module which also imports `Outputable`.
      Reviewers: austin, goldfire, bgamari, alanz, simonmar
      Reviewed By: bgamari
      Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari
      Differential Revision:
  3. 02 Jun, 2017 1 commit
    • Use lengthIs and friends in more places · a786b136
      Ryan Scott authored
      While investigating #12545, I discovered several places in the code
      that performed length-checks like so:
      length ts == 4
      This is not ideal, since the length of `ts` could be much longer than 4,
      and we'd be doing way more work than necessary! There are already a slew
      of helper functions in `Util` such as `lengthIs` that are designed to do
      this efficiently, so I found every place where they ought to be used and
      did just that. I also defined a couple more utility functions for list
      length that were common patterns (e.g., `ltLength`).
      Test Plan: ./validate
      Reviewers: austin, hvr, goldfire, bgamari, simonmar
      Reviewed By: bgamari, simonmar
      Subscribers: goldfire, rwbarton, thomie
      Differential Revision:
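The idea behind the length helpers described above can be sketched as follows. This is a minimal illustration, not the code from `Util`: the names `lengthIsSketch` and `lengthLessThanSketch` are hypothetical, chosen so as not to suggest they match the real signatures. The point is that the traversal stops as soon as the answer is known, so it also terminates on very long or infinite lists.

```haskell
-- Hypothetical helpers illustrating a lazy length check; not GHC's Util code.

-- | True iff the list has exactly n elements. Walks at most n + 1 cells.
lengthIsSketch :: [a] -> Int -> Bool
lengthIsSketch xs n
  | n < 0     = False
  | otherwise = go xs n
  where
    go []     k = k == 0
    go (_:ys) k = k > 0 && go ys (k - 1)

-- | True iff the list has fewer than n elements. Walks at most n cells.
lengthLessThanSketch :: [a] -> Int -> Bool
lengthLessThanSketch xs n = n > 0 && go xs n
  where
    go []     _ = True
    go (_:ys) k = k > 1 && go ys (k - 1)
```

By contrast, `length ts == 4` forces the whole spine of `ts` before comparing.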
  4. 01 May, 2017 1 commit
  5. 25 Apr, 2017 1 commit
    • PPC NCG: Implement callish prim ops · 89a3241f
      Peter Trommler authored
      Provide PowerPC optimised implementations of callish prim ops.
      The generic implementation of quotient remainder prim ops uses
      a division and a remainder operation. There is no remainder on
      PowerPC and so we need to implement remainder "by hand" which
      results in a duplication of the divide operation when using the
      generic code.
      Avoid this duplication by implementing the prim op in the native
      code generator.
      Use PowerPC's instructions for long multiplication.
      Addition and subtraction
      Use PowerPC add/subtract with carry/overflow instructions
      MO_Clz and MO_Ctz
      Use PowerPC's CNTLZ instruction and implement count trailing
      zeros using count leading zeros
      Implement an algorithm given by Henry Warren in "Hacker's Delight"
      using PowerPC divide instruction. TODO: Use long division instructions
      when available (POWER7 and later).
      Test Plan: validate on AIX and 32-bit Linux
      Reviewers: simonmar, erikd, hvr, austin, bgamari
      Reviewed By: erikd, hvr, bgamari
      Subscribers: trofi, kgardas, thomie
      Differential Revision:
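Two of the ideas in this commit message can be modelled in plain Haskell. The sketch below is only an illustration of the arithmetic, not the native code generator's output: `quotRemByHand` and `ctzViaClz` are hypothetical names, and the count-trailing-zeros identity used here is one common formulation, assumed rather than taken from the patch.

```haskell
import Data.Bits (complement, countLeadingZeros, countTrailingZeros,
                  finiteBitSize, (.&.))
import Data.Word (Word64)

-- | Quotient and remainder with a single division: the remainder is
-- recovered "by hand" as x - q * y instead of dividing a second time.
quotRemByHand :: Word64 -> Word64 -> (Word64, Word64)
quotRemByHand x y = (q, x - q * y)
  where
    q = x `quot` y

-- | Count trailing zeros expressed through count leading zeros.
-- `complement x .&. (x - 1)` keeps exactly the trailing zero bits of x
-- as ones, so the answer is the word size minus its leading zeros.
ctzViaClz :: Word64 -> Int
ctzViaClz x = finiteBitSize x - countLeadingZeros (complement x .&. (x - 1))

-- | Cross-check against the library implementation on a few inputs.
agreesWithLibrary :: Bool
agreesWithLibrary =
  all (\x -> ctzViaClz x == countTrailingZeros x) [0, 1, 2, 12, 2 ^ (63 :: Int)]
```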
  6. 07 Mar, 2017 1 commit
  7. 29 Nov, 2016 1 commit
  8. 19 Oct, 2016 1 commit
    • StgCmmPrim: Add missing write barrier. · 2cb8cc26
      Peter Trommler authored
      On architectures with weak memory consistency a write barrier
      is needed before the write to the pointer array.
      Fixes #12469
      Test Plan: rebuilt Stackage nightly twice on powerpc64le
      Reviewers: hvr, rrnewton, erikd, austin, simonmar, bgamari
      Reviewed By: erikd, bgamari
      Subscribers: thomie
      Differential Revision:
      GHC Trac Issues: #12469
  9. 31 Aug, 2016 1 commit
  10. 10 Aug, 2016 1 commit
    • Remove StgRubbishArg and CmmArg · 9684dbb1
      Ömer Sinan Ağacan authored
      The idea behind adding special "rubbish" arguments was that in unboxed sum
      types, depending on the tag, some arguments are not used, and we don't want to
      move special values (like 0 for literals and some special pointer for boxed
      slots) for those arguments (to stack locations or registers). "StgRubbishArg"
      was an indicator to the code generator that the value won't be used. During
      Stg-to-Cmm we then did not generate any move or store instructions at all.
      This caused problems in the register allocator because some variables were only
      initialized in some code paths. As an example, suppose we have this STG: (after
          Lib.$WT =
              \r [dt_sit]
                      case dt_sit of {
                        Lib.F dt_siv [Occ=Once] ->
                            (#,,#) [1# dt_siv StgRubbishArg::GHC.Prim.Int#];
                        Lib.I dt_siw [Occ=Once] ->
                            (#,,#) [2# StgRubbishArg::GHC.Types.Any dt_siw];
                  { (#,,#) us_giC us_giD us_giE -> Lib.T [us_giC us_giD us_giE];
      This basically unpacks a sum type to an unboxed sum with 3 fields, and then
      moves the unboxed sum to a constructor (`Lib.T`).
      This is the Cmm for the inner case expression (case expression in the scrutinee
      position of the outer case):
              -- look at dt_sit's tag
              if (_ciT::P64 != 1) goto ciS; else goto ciR;
          ciS: -- Tag is 2, i.e. Lib.I
              _siw::I64 = I64[_siu::P64 + 6];
              _giE::I64 = _siw::I64;
              _giD::P64 = stg_RUBBISH_ENTRY_info;
              _giC::I64 = 2;
              goto ciU;
          ciR: -- Tag is 1, i.e. Lib.F
              _siv::P64 = P64[_siu::P64 + 7];
              _giD::P64 = _siv::P64;
              _giC::I64 = 1;
              goto ciU;
      Here one of the blocks `ciS` and `ciR` is executed and then execution
      continues to `ciU`, but only `ciS` initializes `_giE`. In the other branch
      `_giE` is not initialized, because it's "rubbish" in the STG and so we don't
      generate an assignment during code generation. The code generator then panics
      during register allocation:
          ghc-stage1: panic! (the 'impossible' happened)
            (GHC version 8.1.20160722 for x86_64-unknown-linux):
                  LocalReg's live-in to graph ciY {_giE::I64}
      (`_giD` is also "rubbish" in `ciS`, but it's still initialized because it's a
      pointer slot: we have to initialize it, otherwise the garbage collector would
      follow the pointer to some random place. So we only remove the assignment if
      the "rubbish" arg has an unboxed type.)
      This patch removes `StgRubbishArg` and `CmmArg`. We now always initialize
      rubbish slots. If the slot is for boxed types we use the existing `absentError`,
      otherwise we initialize the slot with literal 0.
      Reviewers: simonpj, erikd, austin, simonmar, bgamari
      Reviewed By: erikd
      Subscribers: thomie
      Differential Revision:
  11. 21 Jul, 2016 1 commit
    • Implement unboxed sum primitive type · 714bebff
      Ömer Sinan Ağacan authored
      This patch implements primitive unboxed sum types, as described in
      Main changes are:
      - Add new syntax for unboxed sums types, terms and patterns. Hidden
        behind `-XUnboxedSums`.
      - Add unlifted unboxed sum type constructors and data constructors,
        extend type and pattern checkers and desugarer.
      - Add new RuntimeRep for unboxed sums.
      - Extend unarise pass to translate unboxed sums to unboxed tuples right
        before code generation.
      - Add `StgRubbishArg` to `StgArg`, and a new type `CmmArg` for better
        code generation when sum values are involved.
      - Add user manual section for unboxed sums.
      Some other changes:
      - Generalize `UbxTupleRep` to `MultiRep` and `UbxTupAlt` to
        `MultiValAlt` to be able to use those with both sums and tuples.
      - Don't use `tyConPrimRep` in `isVoidTy`: `tyConPrimRep` is really
        wrong, given an `Any` `TyCon`, there's no way to tell what its kind
        is, but `kindPrimRep` and in turn `tyConPrimRep` returns `PtrRep`.
      - Fix some bugs on the way: #12375.
      Not included in this patch:
      - Update Haddock for the new unboxed sum syntax.
      - `TemplateHaskell` support is left as future work.
      For reviewers:
      - Front-end code is mostly trivial and adapted from unboxed tuple code
        for type checking, pattern checking, renaming, desugaring etc.
      - Main translation routines are in `RepType` and `UnariseStg`.
        Documentation in `UnariseStg` should be enough for understanding
        what's going on.
      - Johan Tibell wrote the initial front-end and interface file
      - Simon Peyton Jones reviewed this patch many times, wrote some code,
        and helped with debugging.
      Reviewers: bgamari, alanz, goldfire, RyanGlScott, simonpj, austin,
                 simonmar, hvr, erikd
      Reviewed By: simonpj
      Subscribers: Iceland_jack, ggreif, ezyang, RyanGlScott, goldfire,
                   thomie, mpickering
      Differential Revision:
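For readers unfamiliar with the syntax hidden behind `-XUnboxedSums`, here is a small sketch of how unboxed sum types, terms and patterns look in user code. The functions `toSum`, `fromSum` and `roundTrip` are made up for this illustration and do not come from the patch.

```haskell
{-# LANGUAGE UnboxedSums #-}

-- | Convert an ordinary sum to an unboxed sum with two alternatives.
-- The position of the value between the bars selects the alternative.
toSum :: Either Int Bool -> (# Int | Bool #)
toSum (Left i)  = (# i | #)
toSum (Right b) = (# | b #)

-- | Pattern match on the unboxed sum to get the boxed value back.
fromSum :: (# Int | Bool #) -> Either Int Bool
fromSum (# i | #) = Left i
fromSum (# | b #) = Right b

-- | Going through the unboxed representation should be the identity.
roundTrip :: Either Int Bool -> Either Int Bool
roundTrip e = fromSum (toSum e)
```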
  12. 20 Jul, 2016 1 commit
    • Compact Regions · cf989ffe
      gcampax authored
      This brings in initial support for compact regions, as described in the
      ICFP 2015 paper "Efficient Communication and Collection with Compact
      Normal Forms" (Edward Z. Yang and implemented by Giovanni
      Some things may change before the 8.2 release, but I (Simon M.) wanted
      to get the main patch committed so that we can iterate.
      What documentation there is is in the Data.Compact module in the new
      compact package.  We'll need to extend and polish the documentation
      before the release.
      Test Plan:
      (new test cases included)
      Reviewers: ezyang, simonmar, hvr, bgamari, austin
      Subscribers: vikraman, Yuras, RyanGlScott, qnikst, mboes, facundominguez, rrnewton, thomie, erikd
      Differential Revision:
      GHC Trac Issues: #11493
  13. 31 Dec, 2015 1 commit
  14. 31 Oct, 2015 1 commit
  15. 11 Sep, 2015 1 commit
  16. 02 Sep, 2015 1 commit
  17. 21 Aug, 2015 1 commit
  18. 03 Aug, 2015 1 commit
    • Support MO_U_QuotRem2 in LLVM backend · 92f5385d
      Michal Terepeta authored
      This adds support for MO_U_QuotRem2 in the LLVM backend.  Similarly to
      MO_U_Mul2 we use the standard LLVM instructions (in this case 'udiv'
      and 'urem') but do the computation on double the word width (e.g., for
      64-bit we will do them on 128-bit registers).
      Test Plan: validate
      Reviewers: rwbarton, austin, bgamari
      Reviewed By: bgamari
      Subscribers: thomie
      Differential Revision:
      GHC Trac Issues: #9430
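The double-width division described above can be modelled in Haskell with `Integer` standing in for the 128-bit value. This is a sketch of the arithmetic only, not the LLVM code that the backend emits; `quotRem2Model` is a hypothetical name, and the precondition that the high word is smaller than the divisor (so the quotient fits in one word) is an assumption stated here, not something quoted from the commit.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word64)

-- | Divide the two-word value (hi, lo) by d, returning (quotient, remainder).
-- Assumes d /= 0 and hi < d, so that the quotient fits in a single word.
quotRem2Model :: Word64 -> Word64 -> Word64 -> (Word64, Word64)
quotRem2Model hi lo d = (fromInteger q, fromInteger r)
  where
    -- Widen to double the word width before dividing.
    n      = (toInteger hi `shiftL` 64) .|. toInteger lo
    (q, r) = n `quotRem` toInteger d
```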
  19. 20 Jul, 2015 1 commit
    • LlvmCodeGen: add support for MO_U_Mul2 CallishMachOp · 82ffc80d
      Michal Terepeta authored
      This adds support for MO_U_Mul2 to the LLVM backend by simply using the
      'mul' instruction but operating at twice the bit width (e.g., for 64-bit
      words we will generate a mul that operates on 128 bits and then extract
      the two 64-bit values for the result of the CallishMachOp).
      Test Plan: validate
      Reviewers: rwbarton, austin, bgamari
      Reviewed By: bgamari
      Subscribers: thomie
      Differential Revision:
      GHC Trac Issues: #9430
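As a rough model of the widening multiplication described above, the following sketch multiplies at double width (using `Integer` in place of a 128-bit value) and then extracts the two 64-bit halves. The name `mul2Model` and the (high, low) result order are choices made for this illustration, not details taken from the patch.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word64)

-- | Full 64x64 -> 128 bit unsigned multiplication, returned as (high, low).
mul2Model :: Word64 -> Word64 -> (Word64, Word64)
mul2Model a b = (fromInteger (wide `shiftR` 64), fromInteger wide)
  where
    -- The product computed at twice the word width; fromInteger then
    -- truncates each half back to 64 bits.
    wide = toInteger a * toInteger b
```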
  20. 04 Jul, 2015 1 commit
    • Support MO_{Add,Sub}IntC and MO_Add2 in the LLVM backend · b1d1c652
      Michal Terepeta authored
      This includes:
      - Adding a new LlvmType called LMStructP that represents an unpacked
        struct (this is necessary since LLVM instructions such as
        llvm.sadd.with.overflow.* return an unpacked struct).
      - Modifications to LlvmCodeGen.CodeGen to generate the LLVM
        instructions for the primops.
      - Modifications to StgCmmPrim to actually use those three instructions
        if we use the LLVM backend (so far they were only used for NCG).
      Test Plan: validate
      Reviewers: austin, rwbarton, bgamari
      Reviewed By: bgamari
      Subscribers: thomie, bgamari
      Differential Revision:
      GHC Trac Issues: #9430
  21. 16 Jun, 2015 1 commit
  22. 15 Dec, 2014 1 commit
    • Changing prefetch primops to have a `seq`-like interface · f44333ea
      Carter Schonwald authored
      The current primops for prefetching do not properly work in pure code;
      namely, the primops are not 'hoisted' into the correct call sites based
      on when arguments are evaluated. Instead, they should use a `seq`-like
      interface, which will cause it to be evaluated when the needed term is.
      See #9353 for the full discussion.
      Test Plan: updated tests for pure prefetch in T8256 to reflect the design changes in #9353
      Reviewers: simonmar, hvr, ekmett, austin
      Reviewed By: ekmett, austin
      Subscribers: merijn, thomie, carter, simonmar
      Differential Revision:
      GHC Trac Issues: #9353
  23. 09 Sep, 2014 1 commit
    • Make Applicative a superclass of Monad · d94de872
      Austin Seipp authored
      This includes pretty much all the changes needed to make `Applicative`
      a superclass of `Monad` finally. There's mostly reshuffling in the
      interest of avoiding orphans and boot files, but luckily we can resolve
      all of them, pretty much. The only catch was that
      Alternative/MonadPlus also had to go into Prelude to avoid this.
      As a result, we must update the hsc2hs and haddock submodules.
      Signed-off-by: Austin Seipp <>
      Test Plan: Build things, they might not explode horribly.
      Reviewers: hvr, simonmar
      Subscribers: simonmar
      Differential Revision:
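For user code, the practical consequence of this change is that every `Monad` instance needs `Functor` and `Applicative` instances as well. The sketch below shows the shape this takes for a small state-passing type; `Counter`, `tick` and `twoTicks` are invented for the example and are not part of the patch.

```haskell
-- | A tiny state-passing computation over an Int counter.
newtype Counter a = Counter { runCounter :: Int -> (a, Int) }

instance Functor Counter where
  fmap f (Counter g) = Counter $ \n ->
    let (a, n') = g n in (f a, n')

-- Required now that Applicative is a superclass of Monad.
instance Applicative Counter where
  pure a = Counter $ \n -> (a, n)
  Counter f <*> Counter g = Counter $ \n ->
    let (h, n1) = f n
        (a, n2) = g n1
    in (h a, n2)

instance Monad Counter where
  Counter g >>= k = Counter $ \n ->
    let (a, n1) = g n in runCounter (k a) n1

-- | Return the current counter value and increment it.
tick :: Counter Int
tick = Counter $ \n -> (n, n + 1)

-- | The Applicative interface is available on every Monad.
twoTicks :: Counter (Int, Int)
twoTicks = (,) <$> tick <*> tick
```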
  24. 23 Aug, 2014 1 commit
    • Add MO_AddIntC, MO_SubIntC MachOps and implement in X86 backend · cfd08a99
      rwbarton authored
      These MachOps are used by addIntC# and subIntC#, which in turn are
      used in integer-gmp when adding or subtracting small Integers. The
      following benchmark shows a ~6% speedup after this commit on x86_64
      (building GHC with BuildFlavour=perf).
          {-# LANGUAGE MagicHash #-}
          import GHC.Exts
          import Criterion.Main
          count :: Int -> Integer
          count (I# n#) = go n# 0
            where go :: Int# -> Integer -> Integer
                  go 0# acc = acc
                  go n# acc = go (n# -# 1#) $! acc + 1
          main = defaultMain [bgroup "count"
                                [bench "100" $ whnf count 100]]
      Differential Revision:
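To make the role of `addIntC#` more concrete, here is a small sketch of calling it directly from user code. The wrapper `addWithOverflow` is a hypothetical helper written for this illustration; it relies on the documented behaviour that the second component of the result is zero when no overflow occurred and nonzero otherwise.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts (Int (I#), addIntC#)

-- | Add two Ints, also reporting whether the addition overflowed.
addWithOverflow :: Int -> Int -> (Int, Bool)
addWithOverflow (I# a) (I# b) =
  case addIntC# a b of
    -- r is the wrapped result, c is the overflow indicator.
    (# r, c #) -> (I# r, I# c /= 0)
```

A caller can take the fast path when the flag is `False` and fall back to arbitrary-precision arithmetic otherwise.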
  25. 14 Aug, 2014 1 commit
    • Implement new CLZ and CTZ primops (re #9340) · e0c1767d
      Herbert Valerio Riedel authored
      This implements the new primops
        clz#, clz32#, clz64#,
        ctz#, ctz32#, ctz64#
      which provide efficient implementations of the popular
      count-leading-zero and count-trailing-zero respectively
      (see testcase for a pure Haskell reference implementation).
      On x86, NCG as well as LLVM generates code based on the BSF/BSR
      instructions (which need extra logic to make the 0-case well-defined).
      Test Plan: validate and successful tests on i686 and amd64
      Reviewers: rwbarton, simonmar, ezyang, austin
      Subscribers: simonmar, relrod, ezyang, carter
      Differential Revision:
      GHC Trac Issues: #9340
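The commit refers to a pure Haskell reference implementation in the testcase. That code is not reproduced here; the following is an independent sketch of what such a reference could look like, with hypothetical names `clzRef` and `ctzRef`. It makes the zero case well-defined by returning the bit width.

```haskell
import Data.Bits (FiniteBits, finiteBitSize, testBit)
import Data.Word (Word32, Word8)

-- | Count leading zeros by scanning from the most significant bit.
-- Returns the bit width for an all-zero input.
clzRef :: FiniteBits a => a -> Int
clzRef x = length (takeWhile not [testBit x i | i <- [n - 1, n - 2 .. 0]])
  where
    n = finiteBitSize x

-- | Count trailing zeros by scanning from the least significant bit.
-- Returns the bit width for an all-zero input.
ctzRef :: FiniteBits a => a -> Int
ctzRef x = length (takeWhile not [testBit x i | i <- [0 .. n - 1]])
  where
    n = finiteBitSize x

-- | Example inputs at two different widths.
exampleByte :: Word8
exampleByte = 8

exampleZero :: Word32
exampleZero = 0
```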
  26. 12 Aug, 2014 2 commits
    • StgCmmPrim: add note to stop using fixed size signed types for sizes · 43420497
      tibbe authored
      We use fixed size signed types to e.g. represent array sizes. This
      means that the size can overflow.
    • shouldInlinePrimOp: Fix Int overflow · 6f862dfa
      tibbe authored
      There were two overflow issues in shouldInlinePrimOp. The first one is
      due to a negative CmmInt literal being created if the array size was
      given as larger than 2^63-1 (on a 64-bit platform). This meant that
      large array sizes could compare as being smaller than
      maxInlineAllocSize.
      The second issue is that we cast the Integer to an Int in the
      comparison, which again meant that large array sizes could compare as
      being smaller than maxInlineAllocSize.
      The attempt to allocate a large array inline then caused a segfault.
      Fixes #9416.
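The second issue, narrowing an `Integer` to an `Int` before comparing, is easy to reproduce in isolation. The sketch below is not the code from shouldInlinePrimOp: `demoLimit`, `fitsNarrowed` and `fitsWidened` are hypothetical names, and the limit of 128 is an arbitrary value picked for the demonstration.

```haskell
-- | Stand-in for a size limit; the value is arbitrary for this demo.
demoLimit :: Int
demoLimit = 128

-- | Buggy shape: narrowing the Integer to Int wraps around, so a huge
-- size can turn negative (or small) and pass the check.
fitsNarrowed :: Integer -> Bool
fitsNarrowed n = fromInteger n <= demoLimit

-- | Safe shape: widen the limit instead and compare as Integers.
fitsWidened :: Integer -> Bool
fitsWidened n = n <= toInteger demoLimit

-- | A size just past the largest 64-bit signed value.
hugeSize :: Integer
hugeSize = 2 ^ (63 :: Int)
```

With these definitions `fitsNarrowed hugeSize` wrongly accepts the size, while `fitsWidened hugeSize` rejects it.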
  27. 10 Aug, 2014 1 commit
  28. 30 Jun, 2014 1 commit
    • Re-add more primops for atomic ops on byte arrays · 4ee4ab01
      tibbe authored
      This is the second attempt to add this functionality. The first
      attempt was reverted in 950fcae4, due
      to register allocator failure on x86. Given how the register
      allocator currently works, we don't have enough registers on x86 to
      support cmpxchg using complicated addressing modes. Instead we fall
      back to a simpler addressing mode on x86.
      Adds the following primops:
       * atomicReadIntArray#
       * atomicWriteIntArray#
       * fetchSubIntArray#
       * fetchOrIntArray#
       * fetchXorIntArray#
       * fetchAndIntArray#
      Makes these pre-existing out-of-line primops inline:
       * fetchAddIntArray#
       * casIntArray#
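As an illustration of how these primops are used, the sketch below allocates a small mutable byte array, stores a value, and then performs an atomic fetch-and-add followed by an atomic read. `counterDemo` is a made-up name; the example assumes that `fetchAddIntArray#` returns the element's value from before the addition and that offsets are counted in machine words.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts (Int (I#), atomicReadIntArray#, fetchAddIntArray#,
                 newByteArray#, writeIntArray#)
import GHC.IO (IO (..))

-- | Returns (value before the atomic add, value after it).
counterDemo :: IO (Int, Int)
counterDemo = IO $ \s0 ->
  -- 8 bytes is enough room for one machine word on common platforms.
  case newByteArray# 8# s0 of
    (# s1, mba #) ->
      case writeIntArray# mba 0# 40# s1 of
        s2 ->
          case fetchAddIntArray# mba 0# 2# s2 of
            (# s3, old #) ->
              case atomicReadIntArray# mba 0# s3 of
                (# s4, new #) -> (# s4, (I# old, I# new) #)
```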
  29. 26 Jun, 2014 1 commit
  30. 24 Jun, 2014 1 commit
    • Add more primops for atomic ops on byte arrays · d8abf85f
      tibbe authored
      Adds the following primops:
       * atomicReadIntArray#
       * atomicWriteIntArray#
       * fetchSubIntArray#
       * fetchOrIntArray#
       * fetchXorIntArray#
       * fetchAndIntArray#
      Makes these pre-existing out-of-line primops inline:
       * fetchAddIntArray#
       * casIntArray#
  31. 15 May, 2014 1 commit
    • Add LANGUAGE pragmas to compiler/ source files · 23892440
      Herbert Valerio Riedel authored
      In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
      reorganized, while following the convention, to
      - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
        any `{-# OPTIONS_GHC #-}`-lines.
      - Moreover, if the list of language extensions fits into a single
        `{-# LANGUAGE ... #-}`-line (shorter than 80 characters), keep it on one
        line. Otherwise split into `{-# LANGUAGE ... #-}`-lines for each
        individual language extension. In both cases, try to keep the
        enumeration alphabetically ordered.
        (The latter layout is preferable as it's more diff-friendly)
      While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
      occurrences by `{-# OPTIONS_GHC ... #-}` pragmas.
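A file header following this convention might look like the sketch below, using the one-extension-per-line layout with alphabetical ordering and the `OPTIONS_GHC` pragma placed after the `LANGUAGE` pragmas. The extensions and functions are arbitrary choices for illustration and are not taken from any compiler/ source file.

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE TupleSections #-}
{-# OPTIONS_GHC -Wall #-}

-- | Strict left-to-right sum (uses BangPatterns).
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go !acc []       = acc
    go !acc (x : xs) = go (acc + x) xs

-- | Render an optional number (uses LambdaCase).
describe :: Maybe Int -> String
describe = \case
  Nothing -> "none"
  Just n  -> show n

-- | Pair every element with a zero tag (uses TupleSections).
tagAll :: [a] -> [(Int, a)]
tagAll = map (0,)
```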
  32. 30 Mar, 2014 1 commit
  33. 29 Mar, 2014 1 commit
    • Add SmallArray# and SmallMutableArray# types · 90329b6c
      tibbe authored
      These array types are smaller than Array# and MutableArray# and are
      faster when the array size is small, as they don't have the overhead
      of a card table. Having no card table reduces the closure size by 2
      words in the typical small array case and leads to less work when
      updating or GCing the array.
      Reduces both the runtime and memory allocation by 8.8% on my insert
      benchmark for the HashMap type in the unordered-containers package,
      which makes use of lots of small arrays. With tuned GC settings
      (i.e. `+RTS -A6M`) the runtime reduction is 15%.
      Fixes #8923.
  34. 28 Mar, 2014 1 commit
    • Make copy array ops out-of-line by default · e54828bf
      tibbe authored
      This should reduce code size when there's little to gain from inlining
      these primops, while still retaining the inlining benefit when the
      size of the copy is known statically.
  35. 22 Mar, 2014 1 commit
    • codeGen: inline allocation optimization for clone array primops · 1eece456
      tibbe authored
      The inline allocation version is 69% faster than the out-of-line
      version, when cloning an array of 16 unit elements on a 64-bit
      machine.
      Comparing the new and the old primop implementations isn't
      straightforward. The old version had a missing heap check that I
      discovered during the development of the new version. Comparing the
      old and the new version would require fixing the old version, which
      in turn means reimplementing the equivalent of MAYBE_GC in StgCmmPrim.
      The inline allocation threshold is configurable via
      -fmax-inline-alloc-size which gives the maximum array size, in bytes,
      to allocate inline. The size does not include the closure header size.
      Allowing the same primop to be either inline or out-of-line has some
      implication for how we lay out heap checks. We always place a heap
      check around out-of-line primops, as they may allocate outside of our
      knowledge. However, for the inline primops we only allow allocation
      via the standard means (i.e. virtHp). Since the clone primops might be
      either inline or out-of-line the heap check layout code now consults
      shouldInlinePrimOp to know whether a primop will be inlined.
  36. 13 Mar, 2014 1 commit
  37. 11 Mar, 2014 3 commits
    • Fix incorrect loop condition in inline array allocation · c1d74ab9
      tibbe authored
      Also make sure allocHeapClosure updates profiling counters with the
      memory allocated.
    • Refactor inline array allocation · b684f27e
      Simon Marlow authored
      - Move array representation knowledge into SMRep
      - Separate out low-level heap-object allocation so that we can reuse
        it from doNewArrayOp
      - remove card-table initialisation, we can safely ignore the card
        table for newly allocated arrays.
    • codeGen: allocate small arrays of statically known size inline · 22f010e0
      tibbe authored
      This results in a 46% runtime decrease when allocating an array of 16
      unit elements on a 64-bit machine.
      In order to allow newArray# to have both an inline and an out-of-line
      implementation, cgOpApp is refactored slightly. The new implementation
      of cgOpApp should make it easier to add other primops with both inline
      and out-of-line implementations in the future.