1. 15 May, 2014 1 commit
    • Add LANGUAGE pragmas to compiler/ source files · 23892440
      Herbert Valerio Riedel authored
      In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
      reorganized to follow these conventions:
      - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
        any `{-# OPTIONS_GHC #-}`-lines.
      - Moreover, if the list of language extensions fits into a single
        `{-# LANGUAGE ... #-}`-line (shorter than 80 characters), keep it on one
        line. Otherwise split into `{-# LANGUAGE ... #-}`-lines for each
        individual language extension. In both cases, try to keep the
        enumeration alphabetically ordered.
        (The latter layout is preferable as it's more diff-friendly)
      While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
      occurrences by `{-# OPTIONS_GHC ... #-}` pragmas.
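      As an illustration of the layout convention described above, a file
      header might look like the following minimal sketch. The module, the
      chosen extensions, the `-fno-warn-orphans` option, and the `sumStrict`
      helper are hypothetical and are not taken from the commit.

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# OPTIONS_GHC -fno-warn-orphans #-}

-- LANGUAGE pragmas come first, one per line and alphabetically ordered;
-- the OPTIONS_GHC line follows them.
module Main where

-- | Strict left-to-right sum (hypothetical helper, for illustration only).
-- Uses BangPatterns for the accumulator and ScopedTypeVariables so that
-- the local signature can refer to the outer type variable @a@.
sumStrict :: forall a. Num a => [a] -> a
sumStrict = go 0
  where
    go :: a -> [a] -> a
    go !acc []       = acc
    go !acc (x : xs) = go (acc + x) xs

main :: IO ()
main = print (sumStrict [1 .. 10 :: Int])
```

      With only two short extensions, the single-line form
      `{-# LANGUAGE BangPatterns, ScopedTypeVariables #-}` would also satisfy
      the convention; the per-line form is shown because the commit message
      calls it more diff-friendly.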
  2. 14 May, 2014 1 commit
  3. 04 May, 2014 1 commit
  4. 02 May, 2014 1 commit
    • Per-thread allocation counters and limits · b0534f78
      Simon Marlow authored
      This tracks the amount of memory allocation by each thread in a
      counter stored in the TSO.  Optionally, when the counter drops below
      zero (it counts down), the thread can be sent an asynchronous
      exception: AllocationLimitExceeded.  When this happens, the thread is
      given a small additional limit so that it can handle the exception.  See
      documentation in GHC.Conc for more details.
      Allocation limits are similar to timeouts, but
        - timeouts use real time, not CPU time.  Allocation limits do not
          count anything while the thread is blocked or in foreign code.
        - timeouts don't re-trigger if the thread catches the exception,
          allocation limits do.
        - timeouts can catch non-allocating loops, if you use
          -fno-omit-yields.  This doesn't work for allocation limits.
      I couldn't measure any impact on benchmarks with these changes, even
      for nofib/smp.
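      The behaviour described above can be sketched as follows. This is a
      minimal example, assuming the `setAllocationCounter`,
      `enableAllocationLimit`, and `disableAllocationLimit` functions exported
      from GHC.Conc in later `base` releases; the commit message does not name
      them, and the `withAllocLimit` helper is hypothetical.

```haskell
module Main where

import Control.Exception (AllocationLimitExceeded (..), evaluate, finally, try)
import Data.Int (Int64)
import GHC.Conc (disableAllocationLimit, enableAllocationLimit, setAllocationCounter)

-- | Run an action on the calling thread with an allocation budget in bytes.
-- Hypothetical helper: it sets the per-thread counter (which counts down),
-- enables the limit, and turns the limit off again afterwards.
withAllocLimit :: Int64 -> IO a -> IO (Either AllocationLimitExceeded a)
withAllocLimit bytes act = do
  setAllocationCounter bytes
  enableAllocationLimit
  -- 'try' catches the asynchronous exception if the budget runs out;
  -- 'finally' disables the limit so it does not re-trigger later.
  try act `finally` disableAllocationLimit

main :: IO ()
main = do
  -- Reversing a million-element list has to allocate far more than 1 MB.
  r <- withAllocLimit (1024 * 1024)
         (evaluate (length (reverse [1 .. 1000000 :: Int])))
  putStrLn (either (const "allocation limit exceeded")
                   (\n -> "finished, length " ++ show n) r)
```

      Because the limit re-triggers if the thread keeps allocating after
      catching the exception, the sketch disables it as soon as the action
      finishes rather than relying on the handler alone.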
  5. 29 Apr, 2014 1 commit
  6. 13 Apr, 2014 1 commit
  7. 04 Apr, 2014 1 commit
  8. 29 Mar, 2014 1 commit
    • Add SmallArray# and SmallMutableArray# types · 90329b6c
      tibbe authored
      These array types are smaller than Array# and MutableArray# and are
      faster when the array size is small, as they don't have the overhead
      of a card table. Having no card table reduces the closure size by 2
      words in the typical small array case and leads to less work when
      updating or garbage-collecting the array.
      Reduces both the runtime and memory allocation by 8.8% on my insert
      benchmark for the HashMap type in the unordered-containers package,
      which makes use of lots of small arrays. With tuned GC settings
      (i.e. `+RTS -A6M`) the runtime reduction is 15%.
      Fixes #8923.
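      For orientation, the new types are used through primops in the same
      style as Array#. The following is a minimal sketch, assuming the
      `newSmallArray#`, `writeSmallArray#`, `unsafeFreezeSmallArray#`,
      `indexSmallArray#`, and `sizeofSmallArray#` primops as exported from
      GHC.Exts; the boxed `SmallArray` wrapper and the `pair`, `index`, and
      `size` helpers are hypothetical.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}

module Main where

import GHC.Exts
  ( Int (I#), SmallArray#, indexSmallArray#, newSmallArray#
  , sizeofSmallArray#, unsafeFreezeSmallArray#, writeSmallArray# )
import GHC.ST (ST (ST), runST)

-- | Boxed wrapper so the unlifted array can be passed around like any value.
data SmallArray a = SmallArray (SmallArray# a)

-- | Build a two-element array: every slot starts as @x@, then slot 1 is
-- overwritten with @y@ before the array is frozen.
pair :: a -> a -> SmallArray a
pair x y = runST (ST (\s0 ->
  case newSmallArray# 2# x s0 of
    (# s1, marr #) -> case writeSmallArray# marr 1# y s1 of
      s2 -> case unsafeFreezeSmallArray# marr s2 of
        (# s3, arr #) -> (# s3, SmallArray arr #)))

-- | Read one element (no bounds check, as with the underlying primop).
index :: SmallArray a -> Int -> a
index (SmallArray arr) (I# i) = case indexSmallArray# arr i of
  (# x #) -> x

-- | Number of elements in the array.
size :: SmallArray a -> Int
size (SmallArray arr) = I# (sizeofSmallArray# arr)

main :: IO ()
main = print (size p, index p 0, index p 1)
  where
    p = pair 'a' 'b'
```

      Small fixed-size nodes like this two-element array are the case the
      commit message targets, since they gain nothing from a card table.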
  9. 22 Mar, 2014 1 commit
    • codeGen: inline allocation optimization for clone array primops · 1eece456
      tibbe authored
      The inline allocation version is 69% faster than the out-of-line
      version, when cloning an array of 16 unit elements on a 64-bit machine.
      Comparing the new and the old primop implementations isn't
      straightforward. The old version had a missing heap check that I
      discovered during the development of the new version. Comparing the
      old and the new version would require fixing the old version, which
      in turn means reimplementing the equivalent of MAYBE_CG in StgCmmPrim.
      The inline allocation threshold is configurable via
      -fmax-inline-alloc-size which gives the maximum array size, in bytes,
      to allocate inline. The size does not include the closure header size.
      Allowing the same primop to be either inline or out-of-line has some
      implications for how we lay out heap checks. We always place a heap
      check around out-of-line primops, as they may allocate outside of our
      knowledge. However, for the inline primops we only allow allocation
      via the standard means (i.e. virtHp). Since the clone primops might be
      either inline or out-of-line the heap check layout code now consults
      shouldInlinePrimOp to know whether a primop will be inlined.
  10. 17 Mar, 2014 1 commit
  11. 13 Mar, 2014 1 commit
  12. 11 Mar, 2014 2 commits
  13. 03 Feb, 2014 2 commits
    • Eliminate duplicate code in Cmm pipeline · dba9bf67
      Jan Stolarek authored
      The end of the Cmm pipeline used to be split into two alternative flows,
      depending on whether we did proc-point splitting or not. There
      was a lot of code duplication between these two branches. But it
      wasn't really necessary as the differences can be easily enclosed
      within an if-then-else. I observed no impact of this change on
      compilation performance.
    • Document deprecations in Hoopl · 526cbc7a
      Jan Stolarek authored
  14. 02 Feb, 2014 2 commits
  15. 01 Feb, 2014 3 commits
  16. 26 Jan, 2014 1 commit
  17. 16 Jan, 2014 4 commits
    • Allow the argument to 'reserve' to be a compile-time expression · 58e5843a
      Simon Marlow authored
      By using the constant-folder to reduce it to an integer.
    • Add a way to reserve temporary stack space in high-level Cmm · eaa37a0f
      Simon Marlow authored
      We occasionally need to reserve some temporary memory in a primop for
      passing to a foreign function.  We've been using the stack for this,
      but when we moved to high-level Cmm it became quite fragile because
      primops are in high-level Cmm and the stack is supposed to be under
      the control of the Cmm pipeline.
      So this change puts things on a firmer footing by adding a new Cmm
      construct 'reserve'.  e.g. in decodeFloat_Int#:
          reserve 2 = tmp {
            mp_tmp1  = tmp + WDS(1);
            mp_tmp_w = tmp;
            /* Perform the operation */
            ccall __decodeFloat_Int(mp_tmp1 "ptr", mp_tmp_w "ptr", arg);
            r1 = W_[mp_tmp1];
            r2 = W_[mp_tmp_w];
          }
      reserve is described in CmmParse.y.
      Unfortunately the argument to reserve must be a compile-time constant.
      We might have to extend the parser to allow expressions with
      arithmetic operators if this is too restrictive.
      Note also that the return instruction for the procedure must be
      outside the scope of the reserved stack area, so we have to extract
      the values from the reserved area before we close the scope.  This
      means some more local variables (r1, r2 in the example above).  The
      generated code is more or less identical to what we had before though.
    • Typo in comment · 11f5cd94
      Gabor Greif authored
    • Documentation on the stack layout algorithm · 78a506a9
      Simon Marlow authored
  18. 10 Jan, 2014 1 commit
  19. 28 Nov, 2013 3 commits
  20. 22 Nov, 2013 3 commits
  21. 03 Nov, 2013 1 commit
  22. 26 Oct, 2013 1 commit
  23. 25 Oct, 2013 2 commits
  24. 24 Oct, 2013 1 commit
  25. 18 Oct, 2013 3 commits