  1. Sep 23, 2016
  2. Sep 12, 2016
    • Add hs_try_putmvar() · 454033b5
      Simon Marlow authored
      Summary:
      This is a fast, non-blocking, asynchronous, interface to tryPutMVar that
      can be called from C/C++.
      
      It's useful for callback-based C/C++ APIs: the idea is that the callback
      invokes hs_try_putmvar(), and the Haskell code waits for the callback to
      run by blocking in takeMVar.
      
      The callback doesn't block - this is often a requirement of
      callback-based APIs.  The callback wakes up the Haskell thread with
      minimal overhead and no unnecessary context-switches.
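The non-blocking hand-off described above can be illustrated without the GHC runtime. What follows is a minimal sketch in plain C11, not the RTS implementation: `slot_t`, `slot_try_put` and `slot_try_take` are hypothetical names that model a one-place box with tryPutMVar-like semantics, where the producer (the callback) never waits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-place box; a NULL value means "empty". */
typedef struct {
    _Atomic(void *) value;
} slot_t;

static inline void slot_init(slot_t *s)
{
    atomic_init(&s->value, NULL);
}

/* Non-blocking put: stores v only if the box is empty.
 * Returns false immediately (without waiting) if the box is full. */
static inline bool slot_try_put(slot_t *s, void *v)
{
    void *expected = NULL;
    assert(v != NULL);  /* NULL is reserved to mean "empty" in this sketch */
    return atomic_compare_exchange_strong(&s->value, &expected, v);
}

/* Non-blocking take: empties the box and returns its previous content,
 * or NULL if the box was already empty. */
static inline void *slot_try_take(slot_t *s)
{
    return atomic_exchange(&s->value, NULL);
}
```

In the real API the consumer side is a Haskell thread blocked in takeMVar, and the runtime wakes it up. This sketch only shows why the producer side can return immediately whether or not the box is full.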
      
      There are a couple of benchmarks in
      testsuite/tests/concurrent/should_run.  Some example results comparing
      hs_try_putmvar() with using a standard foreign export:
      
          ./hs_try_putmvar003 1 64 16 100 +RTS -s -N4     0.49s
          ./hs_try_putmvar003 2 64 16 100 +RTS -s -N4     2.30s
      
      hs_try_putmvar() is 4x faster for this workload (see the source for
      hs_try_putmvar003.hs for details of the workload).
      
      An alternative solution is to use the IO Manager for this.  We've tried
      it, but there are problems with that approach:
      * Need to create a new file descriptor for each callback
      * The IO Manager thread(s) become a bottleneck
      * More potential for things to go wrong, e.g. throwing an exception in
        an IO Manager callback kills the IO Manager thread.
      
      Test Plan: validate; new unit tests
      
      Reviewers: niteria, erikd, ezyang, bgamari, austin, hvr
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2501
      454033b5
  3. Sep 09, 2016
    • Make start address of `osReserveHeapMemory` tunable via command line -xb · 1b5f9207
      bitonic authored and Tamar Christina committed
      Summary:
      We stumbled upon a case where an external library (OpenCL) does not work
      if a specific address (0x200000000) is taken.
      
      It so happens that `osReserveHeapMemory` starts trying to mmap at 0x200000000:
      
      ```
              void *hint = (void*)((W_)8 * (1 << 30) + attempt * BLOCK_SIZE);
              at = osTryReserveHeapMemory(*len, hint);
      ```
      
      This makes it impossible to use Haskell programs compiled with GHC 8
      with C functions that use OpenCL.
      
      See this example https://github.com/chpatrick/oclwtf for a repro.
      
      This patch allows the user to work around this kind of behavior outside
      our control by letting the user override the starting address through an
      RTS command line flag.
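The address arithmetic behind the hint can be checked in isolation. The following is a sketch under stated assumptions: `heap_hint` and the `SKETCH_*` names are hypothetical, the block size of 4096 bytes is assumed for illustration, and the `base` parameter stands in for whatever start address the user supplies.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed block size for this sketch only. */
#define SKETCH_BLOCK_SIZE   ((uint64_t)4096)
/* Default start address from the snippet above: 8 * 2^30 bytes = 8 GiB. */
#define SKETCH_DEFAULT_BASE ((uint64_t)8 * (1 << 30))

/* The cast happens before the multiplication, so the product is computed
 * in 64 bits and does not overflow a 32-bit int. */
static_assert(SKETCH_DEFAULT_BASE == 0x200000000ULL,
              "8 GiB is exactly the address that clashed with OpenCL");

/* Address to try on the given attempt, starting from a configurable base. */
static inline uint64_t heap_hint(uint64_t base, uint32_t attempt)
{
    return base + (uint64_t)attempt * SKETCH_BLOCK_SIZE;
}
```

Passing a different `base` moves every attempt away from 0x200000000, which is the effect the new flag is meant to achieve.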
      
      Reviewers: bgamari, Phyx, simonmar, erikd, austin
      
      Reviewed By: Phyx, simonmar
      
      Subscribers: rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2513
      1b5f9207
  4. Aug 30, 2016
    • Tag pointers in interpreted constructors · a25bf267
      mniip authored
      Instead of stg_interp_constr_entry there are now 7 functions (one for
      each value of the tag bits) that tag the constructor pointer before
      returning. This is consistent with compiled constructors' entry code,
      and expectations that compiled code places on compiled constructors. The
      iserv protocol is extended with an extra field that explains what
      pointer tag the constructor should use.
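For readers unfamiliar with pointer tagging, here is a minimal sketch of the idea. It assumes a 64-bit platform with 3 tag bits (hence tag values 1 to 7) and 8-byte-aligned objects; `tag_ptr`, `get_tag` and `untag_ptr` are hypothetical names, not the RTS macros.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TAG_BITS 3u
#define SKETCH_TAG_MASK ((uintptr_t)((1u << SKETCH_TAG_BITS) - 1u)) /* 0x7 */

/* Store a small tag in the low bits of an aligned pointer. */
static inline void *tag_ptr(void *p, unsigned tag)
{
    assert(((uintptr_t)p & SKETCH_TAG_MASK) == 0);  /* low bits must be free */
    assert(tag >= 1 && tag <= SKETCH_TAG_MASK);     /* seven usable tags */
    return (void *)((uintptr_t)p | tag);
}

/* Read the tag back without touching memory. */
static inline unsigned get_tag(const void *p)
{
    return (unsigned)((uintptr_t)p & SKETCH_TAG_MASK);
}

/* Recover the real address before dereferencing. */
static inline void *untag_ptr(void *p)
{
    return (void *)((uintptr_t)p & ~SKETCH_TAG_MASK);
}
```

Code that receives a tagged pointer can inspect `get_tag` to learn which constructor it has without dereferencing, which is why consumers expect the tag to be present.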
      
      Test Plan: Added tests for #12523
      
      Reviewers: erikd, bgamari, hvr, austin, simonmar
      
      Reviewed By: simonmar
      
      Subscribers: osa1, thomie, rwbarton
      
      Differential Revision: https://phabricator.haskell.org/D2473
      
      GHC Trac Issues: #12523
      a25bf267
  5. Aug 19, 2016
  6. Aug 15, 2016
  7. Aug 05, 2016
    • codeGen: Remove binutils<2.17 hack, fixes T11758 · e3e2e49a
      avd authored and Ben Gamari committed
      There was a complication on the x86_64 platform, where pointers were 64
      bits, but the tools didn't support 64-bit relative relocations.  This
      was true before binutils 2.17, which nowadays is quite standard (even
      CentOS 5 ships with 2.17).
      
      Hacks were removed from x86 genSwitch and asm pretty printer. Also
      [x86-64-relative] note was dropped from
      includes/rts/storage/InfoTables.h as it's not referenced anywhere now.
      
      Reviewers: austin, simonmar, rwbarton, erikd, bgamari
      
      Reviewed By: simonmar, erikd, bgamari
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2426
      e3e2e49a
  8. Aug 01, 2016
  9. Jul 27, 2016
  10. Jul 21, 2016
    • Implement unboxed sum primitive type · 714bebff
      Ömer Sinan Ağacan authored
      Summary:
      This patch implements primitive unboxed sum types, as described in
      https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes.
      
      Main changes are:
      
      - Add new syntax for unboxed sum types, terms and patterns. Hidden
        behind `-XUnboxedSums`.
      
      - Add unlifted unboxed sum type constructors and data constructors,
        extend type and pattern checkers and desugarer.
      
      - Add new RuntimeRep for unboxed sums.
      
      - Extend unarise pass to translate unboxed sums to unboxed tuples right
        before code generation.
      
      - Add `StgRubbishArg` to `StgArg`, and a new type `CmmArg` for better
        code generation when sum values are involved.
      
      - Add user manual section for unboxed sums.
      
      Some other changes:
      
      - Generalize `UbxTupleRep` to `MultiRep` and `UbxTupAlt` to
        `MultiValAlt` to be able to use those with both sums and tuples.
      
      - Don't use `tyConPrimRep` in `isVoidTy`: `tyConPrimRep` is really
        wrong, given an `Any` `TyCon`, there's no way to tell what its kind
        is, but `kindPrimRep` and in turn `tyConPrimRep` returns `PtrRep`.
      
      - Fix some bugs on the way: #12375.
      
      Not included in this patch:
      
      - Update Haddock for the new unboxed sum syntax.
      
      - `TemplateHaskell` support is left as future work.
      
      For reviewers:
      
      - Front-end code is mostly trivial and adapted from unboxed tuple code
        for type checking, pattern checking, renaming, desugaring etc.
      
      - Main translation routines are in `RepType` and `UnariseStg`.
        Documentation in `UnariseStg` should be enough for understanding
        what's going on.
      
      Credits:
      
      - Johan Tibell wrote the initial front-end and interface file
        extensions.
      
      - Simon Peyton Jones reviewed this patch many times, wrote some code,
        and helped with debugging.
      
      Reviewers: bgamari, alanz, goldfire, RyanGlScott, simonpj, austin,
                 simonmar, hvr, erikd
      
      Reviewed By: simonpj
      
      Subscribers: Iceland_jack, ggreif, ezyang, RyanGlScott, goldfire,
                   thomie, mpickering
      
      Differential Revision: https://phabricator.haskell.org/D2259
      714bebff
  11. Jul 20, 2016
    • Compact Regions · cf989ffe
      gcampax authored and Simon Marlow committed
      This brings in initial support for compact regions, as described in the
      ICFP 2015 paper "Efficient Communication and Collection with Compact
      Normal Forms" (Edward Z. Yang et al.) and implemented by Giovanni
      Campagna.
      
      Some things may change before the 8.2 release, but I (Simon M.) wanted
      to get the main patch committed so that we can iterate.
      
      What documentation exists is in the Data.Compact module in the new
      compact package.  We'll need to extend and polish the documentation
      before the release.
      
      Test Plan:
      validate
      (new test cases included)
      
      Reviewers: ezyang, simonmar, hvr, bgamari, austin
      
      Subscribers: vikraman, Yuras, RyanGlScott, qnikst, mboes, facundominguez, rrnewton, thomie, erikd
      
      Differential Revision: https://phabricator.haskell.org/D1264
      
      GHC Trac Issues: #11493
      cf989ffe
  12. Jul 16, 2016
  13. Jun 17, 2016
    • NUMA cleanups · 498ed266
      Simon Marlow authored
      - Move the numaMap and nNumaNodes out of RtsFlags to Capability.c
      - Add a test to tests/rts
      498ed266
  14. Jun 10, 2016
    • Rts flags cleanup · c88f31a0
      Simon Marlow authored
      * Remove unused/old flags from the structs
      * Update old comments
      * Add missing flags to GHC.RTS
      * Simplify GHC.RTS, remove C code and use hsc2hs instead
      * Make ParFlags unconditional, and add support to GHC.RTS
      c88f31a0
    • NUMA support · 9e5ea67e
      Simon Marlow authored
      Summary:
      The aim here is to reduce the number of remote memory accesses on
      systems with a NUMA memory architecture, typically multi-socket servers.
      
      Linux provides a NUMA API for doing two things:
      * Allocating memory local to a particular node
      * Binding a thread to a particular node
      
      When given the +RTS --numa flag, the runtime will
      * Determine the number of NUMA nodes (N) by querying the OS
      * Assign capabilities to nodes, so cap C is on node C%N
      * Bind worker threads on a capability to the correct node
      * Keep separate free lists in the block layer for each node
      * Allocate the nursery for a capability from node-local memory
      * Allocate blocks in the GC from node-local memory
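The capability-to-node assignment in the list above (cap C on node C%N) is a plain round-robin mapping. A minimal sketch, with the hypothetical name `numa_node_for_cap`:

```c
#include <assert.h>
#include <stdint.h>

/* Round-robin placement: capability `cap` lives on node `cap % n_nodes`. */
static inline uint32_t numa_node_for_cap(uint32_t cap, uint32_t n_nodes)
{
    assert(n_nodes > 0);  /* at least one node must exist */
    return cap % n_nodes;
}
```

With 24 capabilities and 2 nodes, even-numbered capabilities land on node 0 and odd-numbered ones on node 1, so each node hosts 12 capabilities.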
      
      For example, using nofib/parallel/queens on a 24-core 2-socket machine:
      
      ```
      $ ./Main 15 +RTS -N24 -s -A64m
        Total   time  173.960s  (  7.467s elapsed)
      
      $ ./Main 15 +RTS -N24 -s -A64m --numa
        Total   time  150.836s  (  6.423s elapsed)
      ```
      
      The biggest win here is expected to be allocating from node-local
      memory, so that means programs using a large -A value (as here).
      
      According to perf, on this program the number of remote memory accesses
      was reduced by more than 50% by using `--numa`.
      
      Test Plan:
      * validate
      * There's a new flag --debug-numa=<n> that pretends to do NUMA without
        actually making the OS calls, which is useful for testing the code
        on non-NUMA systems.
      * TODO: I need to add some unit tests
      
      Reviewers: erikd, austin, rwbarton, ezyang, bgamari, hvr, niteria
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2199
      9e5ea67e
  15. Jun 04, 2016
  16. May 18, 2016
    • rts: Add isPinnedByteArray# primop · 310371ff
      Ben Gamari authored and Ben Gamari committed
      Adds a primitive operation to determine whether a particular
      `MutableByteArray#` is backed by a pinned buffer.
      
      Test Plan: Validate with included testcase
      
      Reviewers: austin, simonmar
      
      Reviewed By: austin, simonmar
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2217
      
      GHC Trac Issues: #12059
      310371ff
    • Fix histograms for ticky code · f0f0ac85
      mlen authored and Ben Gamari committed
      This patch fixes Cmm generation required to produce histograms when
      compiling with -ticky flag, strips dead code from rts/Ticky.c and
      reworks it to use a shared constant in both C and Haskell code.
      
      Fixes #8308.
      
      Test Plan: T8308
      
      Reviewers: jstolarek, simonpj, austin
      
      Reviewed By: simonpj
      
      Subscribers: mpickering, simonpj, bgamari, mlen, thomie, jstolarek
      
      Differential Revision: https://phabricator.haskell.org/D931
      
      GHC Trac Issues: #8308
      f0f0ac85
  17. May 17, 2016
    • rts: More const correct-ness fixes · 33c029dd
      Erik de Castro Lopo authored
      In addition to more const-correctness fixes this patch fixes an
      infelicity of the previous const-correctness patch (995cf0f3) which
      left `UNTAG_CLOSURE` taking a `const StgClosure` pointer parameter
      but returning a non-const pointer. Here we restore the original type
      signature of `UNTAG_CLOSURE` and add a new function
      `UNTAG_CONST_CLOSURE` which takes and returns a const `StgClosure`
      pointer and uses that wherever possible.
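The const-correctness problem and its fix can be shown with a pair of helpers. This is a sketch only: `closure_t` is a hypothetical stand-in for `StgClosure`, the mask of 7 assumes 3 tag bits, and the two functions mirror the roles of `UNTAG_CLOSURE` and `UNTAG_CONST_CLOSURE` rather than reproduce them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for StgClosure. */
typedef struct closure { int payload; } closure_t;

#define SKETCH_CLOSURE_TAG_MASK ((uintptr_t)7)  /* assumes 3 tag bits */

/* Mutable in, mutable out: the original signature. */
static inline closure_t *untag_closure(closure_t *p)
{
    return (closure_t *)((uintptr_t)p & ~SKETCH_CLOSURE_TAG_MASK);
}

/* Const in, const out: constness is never silently dropped. */
static inline const closure_t *untag_const_closure(const closure_t *p)
{
    return (const closure_t *)((uintptr_t)p & ~SKETCH_CLOSURE_TAG_MASK);
}
```

A single function taking `const closure_t *` and returning `closure_t *` would let callers write through a pointer they were only given read access to, which is the infelicity the patch removes.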
      
      Test Plan: Validate on Linux, OS X and Windows
      
      Reviewers: Phyx, hsyl20, bgamari, austin, simonmar, trofi
      
      Reviewed By: simonmar, trofi
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2231
      33c029dd
  18. May 16, 2016
    • PPC: Implement SMP primitives using gcc built-ins · 563a4857
      Peter Trommler authored and Ben Gamari committed
      The SMP primitives were missing appropriate memory barriers
      (sync, isync instructions) on all PowerPCs.
      
      Use the built-ins __sync_* provided by gcc and clang. This
      reduces code size significantly.
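As an illustration of the approach, the sketch below wraps three of the legacy GCC/Clang `__sync_*` builtins. The wrapper names (`sketch_cas`, `sketch_atomic_inc`, `sketch_full_barrier`) are hypothetical and are not the RTS primitives themselves. The builtins are documented as full barriers, so the compiler emits whatever fence instructions the target needs.

```c
#include <assert.h>
#include <stdint.h>

/* Compare-and-swap: writes `desired` if *p == expected.
 * Always returns the value *p held before the operation. */
static inline uintptr_t sketch_cas(volatile uintptr_t *p,
                                   uintptr_t expected, uintptr_t desired)
{
    return __sync_val_compare_and_swap(p, expected, desired);
}

/* Atomic increment; returns the new value. */
static inline uintptr_t sketch_atomic_inc(volatile uintptr_t *p)
{
    return __sync_add_and_fetch(p, 1);
}

/* Full memory barrier. */
static inline void sketch_full_barrier(void)
{
    __sync_synchronize();
}
```

Delegating to the compiler replaces hand-written inline assembly per architecture, which is consistent with the code-size reduction mentioned above.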
      
      Remove broken mark for concprog001 on powerpc64. The referenced
      ticket number (11259) was wrong.
      
      Test Plan: validate on powerpc and ARM
      
      Reviewers: erikd, austin, simonmar, bgamari, hvr
      
      Reviewed By: bgamari, hvr
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2225
      
      GHC Trac Issues: #12070
      563a4857
  19. May 12, 2016
  20. May 11, 2016
  21. May 10, 2016
    • stg/Types.h: Fix comment and #include · 3ca78062
      Ben Gamari authored
      3ca78062
    • Use stdint types for Stg{Word,Int}{8,16,32,64} · 260a5648
      wereHamster authored and Ben Gamari committed
      We can't define Stg{Int,Word} in terms of {,u}intptr_t because STG
      depends on them being the exact same size as void*, and {,u}intptr_t
      does not make that guarantee. Furthermore, we also need to define
      StgHalf{Int,Word}, so the preprocessor `#if` needs to stay. But we can at
      least keep it in a single place instead of repeating it in various
      files.
      
      Also define STG_{INT,WORD}{8,16,32,64}_{MIN,MAX} and use it in HsFFI.h,
      further reducing the need for CPP in other files.
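A sketch of the layout this describes, using hypothetical `sk_` names rather than the real `Stg*` types. It assumes the GCC/Clang predefined macro `__SIZEOF_POINTER__` as the pointer-size source. The actual header may obtain that size differently, for example from a configure-generated constant.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-width types come straight from <stdint.h>. */
typedef int8_t   sk_int8;
typedef uint8_t  sk_word8;
typedef int32_t  sk_int32;
typedef uint32_t sk_word32;

/* Limits can reuse the standard macros instead of hand-written constants. */
#define SK_INT8_MIN   INT8_MIN
#define SK_WORD32_MAX UINT32_MAX

/* The word-sized and half-word-sized types still need one preprocessor
 * conditional, kept in a single place. */
#if __SIZEOF_POINTER__ == 8
typedef int64_t  sk_int;
typedef uint64_t sk_word;
typedef int32_t  sk_half_int;
typedef uint32_t sk_half_word;
#elif __SIZEOF_POINTER__ == 4
typedef int32_t  sk_int;
typedef uint32_t sk_word;
typedef int16_t  sk_half_int;
typedef uint16_t sk_half_word;
#else
#error "unsupported pointer size in this sketch"
#endif

/* The property the commit message insists on: a word is exactly a pointer. */
static_assert(sizeof(sk_word) == sizeof(void *),
              "word type must be the same size as void*");
```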
      
      Reviewers: austin, bgamari, simonmar, hvr, erikd
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2182
      260a5648
  22. May 04, 2016
    • rts: Replace `nat` with `uint32_t` · db9de7eb
      Erik de Castro Lopo authored
      The `nat` type was an alias for `unsigned int` with a comment saying
      it was at least 32 bits. We keep the typedef in case client code is
      using it but mark it as deprecated.
      
      Test Plan: Validated on Linux, OS X and Windows
      
      Reviewers: simonmar, austin, thomie, hvr, bgamari, hsyl20
      
      Differential Revision: https://phabricator.haskell.org/D2166
      db9de7eb
    • Add +RTS -AL<size> · f703fd6b
      Simon Marlow authored
      +RTS -AL<size> controls the total size of large objects that can be
      allocated before a GC is triggered.  Previously this was always just the
      value of -A, and the limit mainly existed to prevent runaway allocation
      in pathological programs that allocate a lot of large objects.  However,
      since the limit is shared between all cores, on a large multicore the
      default becomes more restrictive, and can end up triggering GC well
      before it would normally have been.
      
      Arguably a better default would be A*N, but this is probably excessive.
      Adding a flag lets you choose, and I've left the default as it was.
      
      See docs for usage.
      f703fd6b
    • Allow limiting the number of GC threads (+RTS -qn<n>) · 76ee2607
      Simon Marlow authored
      This allows the GC to use fewer threads than the number of capabilities.
      At each GC, we choose some of the capabilities to be "idle", which means
      that the thread running on that capability (if any) will sleep for the
      duration of the GC, and the other threads will do its work.  We choose
      capabilities that are already idle (if any) to be the idle capabilities.
      
      The idea is that this helps in the following situation:
      
      * We want to use a large -N value so as to make use of hyperthreaded
        cores
      * We use a large heap size, so GC is infrequent
      * But we don't want to use all -N threads in the GC, because that
        thrashes the memory too much.
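The selection policy described above (idle enough capabilities to stay within the GC thread limit, preferring ones that are already idle) can be sketched as follows. This is an illustration under assumptions, not the RTS code: `choose_idle_caps` is a hypothetical name, and picking the lowest-numbered busy capabilities in the second pass is an arbitrary choice made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide which capabilities sit out the next GC.
 * already_idle[i]: capability i has no running Haskell thread.
 * idle_for_gc[i]:  output, true if capability i should not take part.
 * Returns the number of capabilities that do take part. */
static inline uint32_t choose_idle_caps(const bool *already_idle,
                                        bool *idle_for_gc,
                                        uint32_t n_caps,
                                        uint32_t max_gc_threads)
{
    assert(max_gc_threads >= 1);
    uint32_t need_idle =
        n_caps > max_gc_threads ? n_caps - max_gc_threads : 0;
    uint32_t idled = 0;

    for (uint32_t i = 0; i < n_caps; i++)
        idle_for_gc[i] = false;

    /* Pass 1: prefer capabilities that are already idle. */
    for (uint32_t i = 0; i < n_caps && idled < need_idle; i++)
        if (already_idle[i]) { idle_for_gc[i] = true; idled++; }

    /* Pass 2: if that was not enough, idle busy capabilities too. */
    for (uint32_t i = 0; i < n_caps && idled < need_idle; i++)
        if (!idle_for_gc[i]) { idle_for_gc[i] = true; idled++; }

    return n_caps - idled;
}
```

With 4 capabilities, a limit of 2, and only capability 2 idle, the sketch idles capability 2 first and then capability 0, leaving capabilities 1 and 3 to do the collection.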
      
      See docs for usage.
      76ee2607
  23. May 01, 2016
  24. Apr 29, 2016
  25. Apr 26, 2016
    • RTS: Add setInCallCapability() · e68195a9
      Simon Marlow authored
      This allows an OS thread to specify which capability it should run on
      when it makes a call into Haskell.  It is intended for a fairly
      specialised use case, when the client wants to have tighter control over
      the mapping between OS threads and Capabilities - perhaps 1:1
      correspondence, for example.
      e68195a9
  26. Apr 18, 2016
  27. Apr 16, 2016
    • Rework CC/CC_STAGE0 handling in `configure.ac` · 865602e0
      Herbert Valerio Riedel authored and Ben Gamari committed
      Rather than using the non-standard, non-idiomatic `--with-{gcc,clang}=...`
      scheme use the `CC=...` style scheme.
      
      The basic idea is to have Autoconf's CC/CFLAG/CPPFLAG apply to
      stage{1,2,3}, while having a separate _STAGE0 set of env-vars
      denote the bootstrap-toolchain flags/programs.
      
      This should be simpler, less confusing, and somewhat more in line with
      Autoconf's idioms (allowing us to reuse more of Autoconf rather than
      (re)inventing our own confusing non-standard m4 macros to do stuff that
      Autoconf could almost do already for us)
      
      Moreover, expose CC_STAGE0 as a so-called "precious" variable.
      
      So now we can better control which bootstrapping gcc is used
      (by default the one used by the stage0 ghc, unless CC_STAGE0 is
      overridden)
      
      ```
      Some influential environment variables:
        CC_STAGE0   C compiler command (bootstrap)
        CC          C compiler command
        CFLAGS      C compiler flags
        ...
      
      Use these variables to override the choices made by `configure' or to
      help it to find libraries and programs with nonstandard names/locations.
      ```
      
      Test Plan: I've tested that cross-compiling with
      `--target=powerpc-linux-gnu` still works, and tried a few variants of
      setting `CC=` and `CC_STAGE0=`; `./validate` passed as well
      
      Reviewers: erikd, austin, bgamari, simonmar
      
      Reviewed By: simonmar
      
      Subscribers: Phyx, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2078
      865602e0
  28. Apr 12, 2016
    • Allocate blocks in the GC in batches · f4446c5b
      Simon Marlow authored
      Avoids contention for the block allocator lock in the GC; this can be
      seen in the gc_alloc_block_sync counter emitted by +RTS -s.
      
      I experimented with this a while ago, and there was already
      commented-out code for it in GCUtils.c, but I've now improved it so that
      it doesn't result in significantly worse memory usage.
      
      * The old method of putting spare blocks on ws->part_list was wasteful,
        the spare blocks are now shared between all generations and retained
        between GCs.
      
      * repeated allocGroup() results in fragmentation, so I switched to using
        allocLargeChunk() instead which is fragmentation-friendly; we already
        use it for the same reason in nursery allocation.
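The contention argument can be made concrete with a toy model. In the sketch below every name is hypothetical, the batch size of 16 is an assumption, and `lock_acquisitions` is a counter standing in for taking the real allocator lock. It shows only the amortisation, not the sharing of spare blocks between generations or the use of allocLargeChunk().

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BATCH 16u  /* assumed batch size */

/* Stand-in for the shared block allocator. */
typedef struct {
    uint32_t next_block;         /* next unused block number */
    uint32_t lock_acquisitions;  /* how often the "lock" was taken */
} global_alloc_t;

/* Per-GC-thread cache of spare blocks. */
typedef struct {
    uint32_t spare[SKETCH_BATCH];
    uint32_t n_spare;
} gc_thread_t;

/* Take the lock once and grab a whole batch. */
static inline void refill(global_alloc_t *g, gc_thread_t *t)
{
    g->lock_acquisitions++;  /* the real code would lock here */
    for (uint32_t i = 0; i < SKETCH_BATCH; i++)
        t->spare[i] = g->next_block++;
    t->n_spare = SKETCH_BATCH;
}

/* Hand out one block, touching shared state only when the cache is empty. */
static inline uint32_t alloc_block(global_alloc_t *g, gc_thread_t *t)
{
    if (t->n_spare == 0)
        refill(g, t);
    return t->spare[--t->n_spare];
}
```

In this model 32 allocations cost 2 lock acquisitions instead of 32, which is the effect one would expect to see reflected in the gc_alloc_block_sync counter.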
      f4446c5b
  29. Apr 10, 2016
  30. Mar 28, 2016
    • Autoconf: detect and set CFLAGS/CPPFLAGS needed for C99 mode · afc48f89
      Herbert Valerio Riedel authored
      This is the first phase of addressing #11757 which aims to make C99
      support a base-line requirement for GHC and clean up the code-base to
      use C99 facilities when sensible.
      
      This patch exploits the logic/heuristic used by `AC_PROG_CC_C99` to
      determine the flags needed in case the C compiler isn't able to compile
      C99 code in its current mode. We can't use `AC_PROG_CC_C99` directly
      though because GHC's build-system expects CC to contain a filename
      without any flags, while `AC_PROG_CC_C99` would e.g. result in
      `CC="gcc -std=gnu99"`. Moreover, we support different `CC`s for
      stage0/1/2, so we need a version of `AC_PROG_CC_C99` for which we can
      specify the `CC`/`CFLAGS` variables to operate on. This is what
      `FP_SET_CFLAGS_C99` does.
      
      Note that Clang has been defaulting to C99+ for a long time, while GCC 5
      defaults to C99+ as well. So this mostly has an effect on older GCC
      versions prior to 5.0 and possibly compilers other than GCC/Clang (which
      are not officially supported for building GHC anyway).
      
      Reviewers: kgardas, erikd, bgamari, austin
      
      Reviewed By: erikd
      
      Differential Revision: https://phabricator.haskell.org/D2045
      afc48f89
    • Scrap IRIX support · 0bca3f3a
      Herbert Valerio Riedel authored
      A long time ago, in the last century, IRIX was way ahead of its time with
      its SMP capabilities of scaling up to 1024 processors and other features
      such as XFS or OpenGL that originated in IRIX and live on to this day in
      other operating systems.
      
      However, IRIX's last software update was in 2006 and support ended
      around 2013 according to [1], so it's considered an extinct platform by
      now. So this commit message is effectively an obituary for GHC's IRIX
      support.
      
      R.I.P. IRIX
      
       [1]: https://en.wikipedia.org/wiki/IRIX
      0bca3f3a
  31. Mar 27, 2016