This project is mirrored from https://gitlab.haskell.org/ghc/ghc.git.
  1. 23 Apr, 2017 1 commit
  2. 17 Apr, 2017 1 commit
    • hs_add_root() RTS API removal · a92ff5d6
      Sergei Trofimovich authored
      Before ghc-7.2, hs_add_root() had to be used to initialize Haskell
      modules when Haskell was called via the FFI.
      
      commit a52ff761
      ("Change the way module initialisation is done (#3252, #4417)")
      removed the need for hs_add_root() and made the function a no-op.
      For backward compatibility, the '__stginit_<module>' symbol was
      not removed.
      
      This change removes the no-op hs_add_root() function and the unused
      '__stginit_<module>' symbol from each Haskell module.
      Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
      
      Test Plan: ./validate
      
      Reviewers: simonmar, austin, bgamari, erikd
      
      Reviewed By: simonmar
      
      Subscribers: rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3460
  3. 16 Apr, 2017 1 commit
  4. 05 Apr, 2017 1 commit
  5. 04 Apr, 2017 2 commits
  6. 02 Apr, 2017 2 commits
    • Report heap overflow in the same way as stack overflow · 61ba4518
      Simon Marlow authored
      Now that we throw an exception for heap overflow, we should only print
      the heap overflow message in the main thread when the HeapOverflow
      exception is caught, rather than as a side effect in the GC.
      
      Stack overflows were already handled this way; I just made heap overflow
      consistent with stack overflow and did some related cleanup.
      
      Fixes broken T2592(profasm) which was reporting the heap overflow
      message twice (you would only notice when building with profiling
      libs enabled).
      
      Test Plan: validate
      
      Reviewers: bgamari, niteria, austin, DemiMarie, hvr, erikd
      
      Reviewed By: bgamari
      
      Subscribers: rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3394
    • FastMutInt: fix Int and Ptr sizes when crosscompiling · d89b0471
      Sergei Trofimovich authored
      Similar to
        https://ghc.haskell.org/trac/ghc/ticket/13491
        https://phabricator.haskell.org/D3122
      
      SIZEOF_HSINT and SIZEOF_VOID_P are sizes for the
      target platform. These values are usually
      not correct for the host when stage1 is built.
      
      It means the code
      
      ```haskell
        newFastMutInt = IO $ \s ->
         case newByteArray# size s of { (# s, arr #) ->
         (# s, FastMutInt arr #) }
         where !(I# size) = SIZEOF_HSINT
      ```
      would try to allocate only 4 bytes on a 64-bit host
      targeting a 32-bit system.
      
      It does not matter in practice as newByteArray#
      implementation rounds up passed value to host's
      word size. But one day it might not.
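
      The rounding described above can be sketched in C. This is only an
      illustration under stated assumptions, not the RTS implementation:
      `round_up_to_word` and `HOST_WORD_SIZE` are hypothetical names, and an
      8-byte host word is assumed.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Assumption: a 64-bit host, so one machine word is 8 bytes. */
      #define HOST_WORD_SIZE ((size_t)8)

      /* Round a byte count up to the next multiple of the host word size. */
      size_t round_up_to_word(size_t bytes)
      {
          /* Guard against wrap-around in the addition below. */
          assert(bytes <= SIZE_MAX - (HOST_WORD_SIZE - 1));
          return (bytes + HOST_WORD_SIZE - 1) / HOST_WORD_SIZE * HOST_WORD_SIZE;
      }
      ```

      Under these assumptions a 4-byte request (the target's Int size) still
      yields 8 bytes, which is why the size mismatch is harmless for now.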
      
      To prevent this class of problems in the compiler/
      directory, the contents of 'MachDeps.h' are hidden when
      ghc-stage1 (-DSTAGE=1) is built.
      Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
      
      Reviewers: austin, rwbarton, simonmar, bgamari
      
      Reviewed By: bgamari
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D3405
  7. 01 Mar, 2017 1 commit
    • rts: Fix build · b86d226f
      Ben Gamari authored
      I evidently neglected to consider that validate doesn't build profiled
      ways. Arg.
  8. 28 Feb, 2017 1 commit
    • rts: Allow profile output path to be specified on RTS command line · db2a6676
      Ben Gamari authored
      This introduces a RTS option, -po, which allows the user to override the stem
      used to form the output file names of the heap profile and cost center summary.
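
      As a rough illustration of forming file names from a stem, here is a
      small C sketch. It is not the RTS code: `profile_file_name` is a
      hypothetical helper, and only the `.hp` and `.prof` suffixes come from
      the description.

      ```c
      #include <assert.h>
      #include <stdio.h>
      #include <string.h>

      /*
       * Write "<stem>.<suffix>" into out. Returns 0 on success, or -1 if the
       * result (including the terminating NUL) does not fit in out_len bytes.
       */
      int profile_file_name(char *out, size_t out_len,
                            const char *stem, const char *suffix)
      {
          assert(out != NULL && stem != NULL && suffix != NULL);
          /* +2 accounts for the '.' separator and the terminating NUL. */
          if (strlen(stem) + strlen(suffix) + 2 > out_len) {
              return -1;
          }
          snprintf(out, out_len, "%s.%s", stem, suffix);
          return 0;
      }
      ```

      With a stem of `run1`, the helper would produce `run1.hp` and
      `run1.prof` for the two suffixes.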
      
      It's a bit unclear to me whether this is really the interface we want.
      Alternatively we could just allow the user to specify the `.hp` and `.prof` file
      names separately. This would arguably be a bit more straightforward and would
      allow the user to name JSON output with an appropriate `.json` suffix if they so
      desired. However, this would come at the cost of taking more of the option
      space, which is a somewhat precious commodity.
      
      Test Plan: Validate, try using `-po` RTS option
      
      Reviewers: simonmar, austin, erikd
      
      Reviewed By: simonmar
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D3182
  9. 23 Feb, 2017 1 commit
    • JSON profiler reports · a2043332
      Ben Gamari authored
      This introduces a JSON output format for cost-centre profiler reports.
      It's not clear whether this is really something we want to introduce
      given that we may also move to a more Haskell-driven output pipeline in
      the future, but I nevertheless found this helpful, so I thought I would
      put it up.
      
      Test Plan: Compile a program with `-prof -fprof-auto`; run with `+RTS
      -pj`
      
      Reviewers: austin, erikd, simonmar
      
      Reviewed By: simonmar
      
      Subscribers: duncan, maoe, thomie, simonmar
      
      Differential Revision: https://phabricator.haskell.org/D3132
  10. 08 Feb, 2017 1 commit
  11. 04 Feb, 2017 1 commit
    • Fix comment (old file names) in includes/ · 9984024a
      Takenobu Tani authored
      [skip ci]
      
      There were some old file names (.lhs, ...) in comments.
      
      * includes/rts/Bytecodes.h
        - ghc/compiler/ghci/ByteCodeGen.lhs -> ByteCodeAsm.hs
      
      * includes/rts/Constants.h
        - libraries/base/GHC/Conc.lhs -> libraries/base/GHC/Conc/Sync.hs
      
      * includes/rts/storage/FunTypes.h
        - utils/genapply/GenApply.hs -> utils/genapply/Main.hs
        - compiler/codeGen/CgCallConv.lhs -> compiler/codeGen/StgCmmLayout.hs
      
      * includes/stg/MiscClosures.h
        - compiler/codeGen/CgStackery.lhs -> compiler/codeGen/StgCmmArgRep.hs
        - HeapStackCheck.hc  -> HeapStackCheck.cmm
      
      Reviewers: bgamari, austin, simonmar, erikd
      
      Reviewed By: erikd
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D3074
  12. 02 Feb, 2017 1 commit
    • Add support for StaticPointers in GHCi · eedb3df0
      Ben Gamari authored
      Here we add support to GHCi for StaticPointers. This process begins by
      adding remote GHCi messages for adding entries to the static pointer
      table. We then collect binders needing SPT entries after linking and
      send the interpreter a message adding entries with the appropriate
      fingerprints.
      
      Test Plan: `make test TEST=StaticPtr`
      
      Reviewers: facundominguez, mboes, simonpj, simonmar, goldfire, austin,
      hvr, erikd
      
      Reviewed By: simonpj, simonmar
      
      Subscribers: RyanGlScott, simonpj, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2504
      
      GHC Trac Issues: #12356
  13. 31 Jan, 2017 1 commit
    • Abstract over the way eventlogs are flushed · 4dfc6d1c
      alexbiehl authored
      Currently eventlog data is always written to a file `progname.eventlog`.
      This patch introduces the `flushEventLog` field in `RtsConfig`, which
      allows the writing of eventlog data to be customized.
      
      One possible scenario is the ongoing live-profile-monitor effort by
      @NCrashed, which slurps all eventlog data through `flushEventLog`.
      
      `flushEventLog` takes a buffer with eventlog data and its size and
      returns `false` (0) in case eventlog data could not be processed.
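
      A callback of the shape described above might look like the following
      C sketch. The exact signature expected by `RtsConfig` is an assumption
      here, and `flush_to_sink` and `eventlog_sink` are hypothetical names
      used only for illustration.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>
      #include <stdio.h>

      /* Hypothetical destination for eventlog bytes; set by the embedder. */
      static FILE *eventlog_sink = NULL;

      /*
       * Receives a buffer of eventlog data and its size. Returns false when
       * the data could not be processed (no sink, short write, failed flush).
       */
      bool flush_to_sink(void *eventlog, size_t eventlog_size)
      {
          assert(eventlog != NULL || eventlog_size == 0);
          if (eventlog_sink == NULL) {
              return false;
          }
          size_t written = fwrite(eventlog, 1, eventlog_size, eventlog_sink);
          return written == eventlog_size && fflush(eventlog_sink) == 0;
      }
      ```

      A consumer such as a live monitor could point the sink at a pipe or
      socket stream instead of a file.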
      
      Reviewers: simonmar, austin, erikd, bgamari
      
      Reviewed By: simonmar, bgamari
      
      Subscribers: qnikst, thomie, NCrashed
      
      Differential Revision: https://phabricator.haskell.org/D2934
  14. 29 Jan, 2017 2 commits
    • UNREG: add a forward declaration for local literals · 4441f907
      Sergei Trofimovich authored
      When toplevel literals don't have a way to be exported
      from a module, GHC infers their labels as static.
      
      Example from GHC.Arr:
          static char rdVA_bytes[] = " out of range ";
      
      When this label is used internally in the module,
      we also need to provide its forward declaration.
      Signed-off-by: Sergei Trofimovich <siarheit@google.com>
    • UNREG: fix "_bytes" string literal forward declaration · 34a02055
      Sergei Trofimovich authored
      A typical UNREG build failure looks like this:
      
        ghc-unreg/includes/Stg.h:226:46: error:
           note: in definition of macro 'EI_'
           #define EI_(X)          extern StgWordArray (X) GNU_ATTRIBUTE(aligned (8))
                                                        ^
            |
        226 | #define EI_(X)          extern StgWordArray (X) GNU_ATTRIBUTE(aligned (8))
            |                                              ^
      
        /tmp/ghc10489_0/ghc_3.hc:1754:6: error:
           note: previous definition of 'ghczmprim_GHCziTypes_zdtcTyCon2_bytes' was here
           char ghczmprim_GHCziTypes_zdtcTyCon2_bytes[] = "TyCon";
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             |
        1754 | char ghczmprim_GHCziTypes_zdtcTyCon2_bytes[] = "TyCon";
             |      ^
      
      As we see here, "_bytes" string literals are defined as 'char []'
      arrays, not 'StgWord []'.
      
      The change special-cases "_bytes" string literals to have
      the correct declaration type.
      Signed-off-by: Sergei Trofimovich <siarheit@google.com>
  15. 18 Jan, 2017 1 commit
  16. 10 Jan, 2017 1 commit
  17. 06 Jan, 2017 1 commit
    • More fixes for #5654 · 3a18baff
      Simon Marlow authored
      * In stg_ap_0_fast, if we're evaluating a thunk, the thunk might
        evaluate to a function in which case we may have to adjust its CCS.
      
      * The interpreter has its own implementation of stg_ap_0_fast, so we
        have to do the same shenanigans with creating empty PAPs and copying
        PAPs there.
      
      * GHCi creates Cost Centres as children of CCS_MAIN, which enterFunCCS()
        wrongly assumed to imply that they were CAFs.  Now we use the is_caf
        flag for this, which we have to correctly initialise when we create a
        Cost Centre in GHCi.
  18. 17 Dec, 2016 1 commit
    • UNREG: include CCS_OVERHEAD to STG · 2fa00f5b
      Sergei Trofimovich authored
      Commit 394231b3 added the
      CCS_OVERHEAD annotation to 'rts/Apply.cmm'.
      
      Before the change CCS_OVERHEAD was used only in C code.
      
      The change exports CCS_OVERHEAD to STG.
      
      Fixes UNREG build failure:
        rts_dist_HC rts/dist/build/Apply.p_o
          /tmp/ghc29563_0/ghc_4.hc: In function 'cm_entry':
      
          /tmp/ghc29563_0/ghc_4.hc:73:13: error:
           error: 'CCS_OVERHEAD' undeclared (first use in this function)
           *((P_)((W_)&CCS_OVERHEAD+72)) = ...
                       ^~~~~~~~~~~~
      Signed-off-by: Sergei Trofimovich <siarheit@google.com>
  19. 15 Dec, 2016 1 commit
    • Fix cost-centre-stacks bug (#5654) · 394231b3
      Simon Marlow authored
      This fixes some cases of wrong stacks being generated by the profiler.
      For background and details on the fix see
      `Note [Evaluating functions with profiling]` in `rts/Apply.cmm`.
      
      This does have an impact on allocations for some programs when
      profiling.  nofib results:
      
      ```
         k-nucleotide          +0.0%     +8.8%    +11.0%    +11.0%      0.0%
               puzzle          +0.0%    +12.5%     0.244     0.246      0.0%
            typecheck           0.0%     +8.7%    +16.1%    +16.2%      0.0%
      --------------------------------------------------------------------------------
                  Min          -0.0%     -0.0%    -34.4%    -35.5%    -25.0%
                  Max          +0.0%    +12.5%    +48.9%    +49.4%    +10.6%
       Geometric Mean          +0.0%     +0.6%     +2.0%     +1.8%     -0.3%
      
      ```
      
      But runtimes don't seem to be affected much, and the examples I looked
      at were completely legitimate.  For example, in puzzle we have this:
      
      ```
      position :: ItemType -> StateType ->  BankType
      position Bono = bonoPos
      position Edge = edgePos
      position Larry = larryPos
      position Adam = adamPos
      ```
      
      where the identifiers on the rhs are all record selectors.  Previously
      the profiler gave a stack that looked like
      
      ```
        position
        bonoPos
        ...
      ```
      
      i.e. `bonoPos` was at the same level of the call stack as `position`,
      but now it looks like
      
      ```
        position
         bonoPos
         ...
      ```
      
      I used the normaliser from the testsuite to diff the profiling output
      from other nofib programs and they all looked better.
      
      Test Plan:
      * the broken test passes
      * validate
      * compiled and ran all of nofib, measured perf, diff'd several .prof
      files
      
      Reviewers: niteria, erikd, austin, scpmw, bgamari
      
      Reviewed By: bgamari
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2804
      
      GHC Trac Issues: #5654, #10007
  20. 13 Dec, 2016 1 commit
  21. 11 Dec, 2016 1 commit
    • Make globals use sharedCAF · c3c70244
      Moritz Angermann authored
      Summary:
      The use of globals is quite painful when multiple rts are loaded, e.g.
      when plugins are loaded, which bring in a second rts. The sharedCAF
      approach was employed for the FastStringTable; I've taken the liberty
      to extend this to the other globals I could find.
      
      This is a reboot of D2575, that should hopefully not exhibit the same
      windows build issues.
      
      Reviewers: Phyx, simonmar, goldfire, bgamari, austin, hvr, erikd
      
      Reviewed By: Phyx, simonmar, bgamari
      
      Subscribers: mpickering, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2773
  22. 07 Dec, 2016 2 commits
    • Don't barf() on failures in loadArchive() · 83d69dca
      Ben Gamari authored
      This patch replaces calls to barf() in loadArchive() with proper
      error handling.
      
      Test Plan: GHC CI
      
      Reviewers: rwbarton, erikd, hvr, austin, simonmar, bgamari
      
      Reviewed By: bgamari
      
      Subscribers: thomie
      
      Tags: #ghc
      
      Differential Revision: https://phabricator.haskell.org/D2652
      
      GHC Trac Issues: #12388
    • Overhaul of Compact Regions (#12455) · 7036fde9
      Simon Marlow authored
      Summary:
      This commit makes various improvements and addresses some issues with
      Compact Regions (aka Compact Normal Forms).
      
      This was the most important thing I wanted to fix.  Compaction
      previously prevented GC from running until it was complete, which
      would be a problem in a multicore setting.  Now, we compact using a
      hand-written Cmm routine that can be interrupted at any point.  When a
      GC is triggered during a sharing-enabled compaction, the GC has to
      traverse and update the hash table, so this hash table is now stored
      in the StgCompactNFData object.
      
      Previously, compaction consisted of a deepseq using the NFData class,
      followed by a traversal in C code to copy the data.  This is now done
      in a single pass with hand-written Cmm (see rts/Compact.cmm). We no
      longer use the NFData instances, instead the Cmm routine evaluates
      components directly as it compacts.
      
      The new compaction is about 50% faster than the old one with no
      sharing, and a little faster on average with sharing (the cost of the
      hash table dominates when we're doing sharing).
      
      Static objects that don't (transitively) refer to any CAFs don't need
      to be copied into the compact region.  In particular this means we
      often avoid copying Char values and small Int values, because these
      are static closures in the runtime.
      
      Each Compact# object can support a single compactAdd# operation at any
      given time, so the Data.Compact library now enforces mutual exclusion
      using an MVar stored in the Compact object.
      
      We now get exceptions rather than killing everything with a barf()
      when we encounter an object that cannot be compacted (a function, or a
      mutable object).  We now also detect pinned objects, which can't be
      compacted either.
      
      The Data.Compact API has been refactored and cleaned up.  A new
      compactSize operation returns the size (in bytes) of the compact
      object.
      
      Most of the documentation is in the Haddock docs for the compact
      library, which I've expanded and improved here.
      
      Various comments in the code have been improved, especially the main
      Note [Compact Normal Forms] in rts/sm/CNF.c.
      
      I've added a few tests, and expanded a few of the tests that were
      there.  We now also run the tests with GHCi, and in a new test way
      that enables sanity checking (+RTS -DS).
      
      There's a benchmark in libraries/compact/tests/compact_bench.hs for
      measuring compaction speed and comparing sharing vs. no sharing.
      
      The field totalDataW in StgCompactNFData was unnecessary.
      
      Test Plan:
      * new unit tests
      * validate
      * tested manually that we can compact Data.Aeson data
      
      Reviewers: gcampax, bgamari, ezyang, austin, niteria, hvr, erikd
      
      Subscribers: thomie, simonpj
      
      Differential Revision: https://phabricator.haskell.org/D2751
      
      GHC Trac Issues: #12455
  23. 06 Dec, 2016 1 commit
    • Overhaul GC stats · 24e6594c
      Simon Marlow authored
      Summary:
      Visible API changes:
      
      * The C struct `GCDetails` gives the stats about a single GC.  This is
        passed to the `gcDone()` callback if one is set via the
        RtsConfig. (previously we just passed a collection of values, so this
        is more extensible, at the expense of breaking the existing API)
      
      * `RTSStats` gives cumulative stats since the start of the program,
        and includes the `GCDetails` for the most recent GC.  This struct
        can be obtained via `getRTSStats()` (the old `getGCStats()` has been
        removed, and `getGCStatsEnabled()` has been renamed to
        `getRTSStatsEnabled()`)
      
      Improvements:
      
      * The per-GC stats and cumulative stats are now cleanly separated.
      
      * Inside the RTS we have a top-level `RTSStats` struct to keep all our
        stats in, previously this was just a collection of strangely-named
        variables.  This struct is mostly just copied in `getRTSStats()`, so
        the implementation of that function is a lot shorter.
      
      * Types are more consistent.  We use a uint64_t byte count for all
        memory values, and Time for all time values.
      
      * Names are more consistent.  We use a suffix `_bytes` for all byte
        counts and `_ns` for all time values.
      
      * We now collect information about the amount of memory in large
        objects and compact objects in `GCDetails`. (the latter was the reason
        I started doing this patch but it seems to have ballooned a bit!)
      
      * I fixed a bug in the calculation of the elapsed MUT time, and added
        an ASSERT to stop the calculations going wrong in the future.
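
      The typing and naming conventions listed above, together with a
      guarded elapsed-time calculation, can be sketched in C as follows.
      This is a hypothetical cut-down record, not the real `RTSStats` or
      `GCDetails`: the struct, its fields, and the simplification that
      mutator time is total elapsed time minus GC elapsed time are all
      assumptions.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Assumption: time values are signed nanosecond counts. */
      typedef int64_t Time;

      /* Hypothetical record: `_bytes` for byte counts, `_ns` for times. */
      typedef struct {
          uint64_t allocated_bytes; /* all memory values are uint64_t bytes */
          Time     elapsed_ns;      /* wall-clock time since program start */
          Time     gc_elapsed_ns;   /* wall-clock time spent in GC */
      } StatsSketch;

      /* Mutator elapsed time, with an assertion guarding the subtraction. */
      Time mutator_elapsed_ns(const StatsSketch *s)
      {
          Time mut = s->elapsed_ns - s->gc_elapsed_ns;
          assert(mut >= 0); /* a negative result means the inputs are wrong */
          return mut;
      }
      ```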
      
      For now I kept the Haskell API in `GHC.Stats` the same, by
      impedance-matching with the new API.  We could either break that API
      and make it match the C API more closely, or we could add a new API
      and deprecate the old one.  Opinions welcome.
      
      This stuff is very easy to get wrong, and it's hard to test.  Reviews
      welcome!
      
      Test Plan:
      manual testing
      validate
      
      Reviewers: bgamari, niteria, austin, ezyang, hvr, erikd, rwbarton, Phyx
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2756
  24. 02 Dec, 2016 1 commit
    • Install toplevel handler inside fork. · 895a131f
      Alexander Vershilov authored
      When the rts is forked it doesn't update the toplevel handler, so the
      UserInterrupt exception is sent to Thread1, which doesn't exist in the
      forked process.
      
      We now install the toplevel handler on fork so that the signal is
      delivered to the new main thread.
      
      Fixes #12903
      
      Reviewers: simonmar, austin, erikd, bgamari
      
      Reviewed By: bgamari
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2770
      
      GHC Trac Issues: #12903
  25. 30 Nov, 2016 1 commit
  26. 29 Nov, 2016 3 commits
    • Use C99's bool · 428e152b
      Ben Gamari authored
      Test Plan: Validate on lots of platforms
      
      Reviewers: erikd, simonmar, austin
      
      Reviewed By: erikd, simonmar
      
      Subscribers: michalt, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2699
    • Make globals use sharedCAF · 6f7ed1e5
      Moritz Angermann authored
      The use of globals is quite painful when multiple rts are loaded, e.g.
      when plugins are loaded, which bring in a second rts. The sharedCAF
      approach was employed for the FastStringTable; I've taken the liberty
      to extend this to the other globals I could find.
      
      Reviewers: rwbarton, simonmar, austin, hvr, erikd, bgamari
      
      Reviewed By: simonmar, bgamari
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2575
    • Define thread primitives if they're supported. · 1732d7ac
      shlevy authored
      On iOS, we use the pthread-based implementation of Itimer.c even for a
      non-threaded RTS. Since 999c464d, this relies on synchronization
      primitives like Mutex, so ensure those primitives are defined whenever
      they are supported, even if !THREADED_RTS.
      
      Fixes #12799.
      
      Reviewers: erikd, austin, simonmar, bgamari
      
      Reviewed By: simonmar, bgamari
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2712
      
      GHC Trac Issues: #12799
  27. 14 Nov, 2016 1 commit
    • Remove CONSTR_STATIC · 55d535da
      Simon Marlow authored
      Summary:
      We currently have two info tables for a constructor
      
      * XXX_con_info: the info table for a heap-resident instance of the
        constructor. It has type CONSTR, or one of the specialised types like
        CONSTR_1_0
      
      * XXX_static_info: the info table for a static instance of this
        constructor, which has type CONSTR_STATIC or CONSTR_STATIC_NOCAF.
      
      I'm getting rid of the latter, and using the `con_info` info table for
      both static and dynamic constructors.  For rationale and more details
      see Note [static constructors] in SMRep.hs.
      
      I also removed these macros: `isSTATIC()`, `ip_STATIC()`,
      `closure_STATIC()`, since they relied on the CONSTR/CONSTR_STATIC
      distinction, and anyway HEAP_ALLOCED() does the same job.
      
      Test Plan: validate
      
      Reviewers: bgamari, simonpj, austin, gcampax, hvr, niteria, erikd
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2690
      
      GHC Trac Issues: #12455
  28. 10 Nov, 2016 1 commit
    • rts: Add api to pin a thread to a numa node but without fixing a capability · 122d826d
      darshan authored
      `rts_setInCallCapability` sets the thread affinity as well as pins the
      numa node. We should also have the ability to set the numa node without
      setting the capability affinity. `rts_pinNumaNodeForCapability` function
      is added and exported via `RtsAPI.h`.
      
      Previous callers of `rts_setInCallCapability` should now also call
      `rts_pinNumaNodeForCapability` to get the same effect as before.
      
      Test Plan:
        ./validate
      
      Reviewers: austin, simonmar, bgamari
      
      Reviewed By: simonmar, bgamari
      
      Subscribers: thomie, niteria
      
      Differential Revision: https://phabricator.haskell.org/D2637
      
      GHC Trac Issues: #12764
  29. 02 Nov, 2016 1 commit
  30. 01 Oct, 2016 1 commit
    • Support more than 64 logical processors on Windows · 3c179054
      Tamar Christina authored
      Windows support for more than 64 logical processors is implemented
      using processor groups.
      
      Essentially, this keeps the existing maximum of 64 processors per
      group and keeps the affinity mask a 64-bit value, but adds a
      hierarchy above that.
      
      This support was added in Windows 7, so we need to detect at runtime
      whether the APIs are available, since our minimum supported version is
      Windows Vista.
      
      The maximum number of groups supported at this time is 4, so 256 logical
      cores.  The group indices are 0 based. One thread can have affinity with
      multiple groups.
      
      See
      https://msdn.microsoft.com/en-us/library/windows/desktop/ms684251.aspx
      and particularly helpful is the whitepaper: 'Supporting Systems that
      have more than 64 processors' at
      https://msdn.microsoft.com/en-us/library/windows/hardware/dn653313.aspx
      
      Processor groups are not guaranteed to be uniformly distributed nor
      guaranteed to be filled before a next group is needed. The OS will
      assign processors to groups based on physical proximity and will never
      partially assign cores from one physical cpu to more than one group. If
      one has two 48 core CPUs then you'd end up with two groups of 48 logical
      cpus. Now add a 3rd CPU with 10 cores and the group it is assigned to
      depends where the socket is on the board.
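
      Because groups need not be filled uniformly, mapping a global logical
      processor index to a group and an affinity bit has to walk the actual
      group sizes. The C sketch below illustrates this; it does not use the
      Windows API, and `cpu_slot` and `locate_cpu` are hypothetical names.
      The per-group sizes are assumed to be supplied by the caller.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Hypothetical result: group index and a one-bit 64-bit affinity mask. */
      typedef struct {
          int      group; /* -1 when the index is beyond all groups */
          uint64_t mask;
      } cpu_slot;

      /* Map global logical processor n onto (group, bit within the group). */
      cpu_slot locate_cpu(const uint8_t *group_sizes, size_t n_groups, unsigned n)
      {
          cpu_slot slot = { -1, 0 };
          for (size_t g = 0; g < n_groups; g++) {
              assert(group_sizes[g] <= 64); /* a group holds at most 64 CPUs */
              if (n < group_sizes[g]) {
                  slot.group = (int)g;
                  slot.mask = (uint64_t)1 << n;
                  return slot;
              }
              n -= group_sizes[g]; /* skip past this group's processors */
          }
          return slot;
      }
      ```

      For the two 48-core CPUs mentioned above, global index 50 would land
      in group 1 at bit 2 rather than in a fully packed group 0.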
      
      Test Plan:
      ./validate or make test -c . in the rts test folder.
      
      This tests for regressions, to test this particular functionality
      itself:
      
         <program> +RTS -N -qa -RTS
      
      Test is detailed in description.
      
      Reviewers: bgamari, simonmar, austin, erikd
      
      Reviewed By: simonmar
      
      Subscribers: thomie, #ghc_windows_task_force
      
      Differential Revision: https://phabricator.haskell.org/D2533
      
      GHC Trac Issues: #11054
  31. 23 Sep, 2016 1 commit
  32. 12 Sep, 2016 1 commit
    • Add hs_try_putmvar() · 454033b5
      Simon Marlow authored
      Summary:
      This is a fast, non-blocking, asynchronous interface to tryPutMVar that
      can be called from C/C++.
      
      It's useful for callback-based C/C++ APIs: the idea is that the callback
      invokes hs_try_putmvar(), and the Haskell code waits for the callback to
      run by blocking in takeMVar.
      
      The callback doesn't block - this is often a requirement of
      callback-based APIs.  The callback wakes up the Haskell thread with
      minimal overhead and no unnecessary context-switches.
      
      There are a couple of benchmarks in
      testsuite/tests/concurrent/should_run.  Some example results comparing
      hs_try_putmvar() with using a standard foreign export:
      
          ./hs_try_putmvar003 1 64 16 100 +RTS -s -N4     0.49s
          ./hs_try_putmvar003 2 64 16 100 +RTS -s -N4     2.30s
      
      hs_try_putmvar() is 4x faster for this workload (see the source for
      hs_try_putmvar003.hs for details of the workload).
      
      An alternative solution is to use the IO Manager for this.  We've tried
      it, but there are problems with that approach:
      * Need to create a new file descriptor for each callback
      * The IO Manager thread(s) become a bottleneck
      * More potential for things to go wrong, e.g. throwing an exception in
        an IO Manager callback kills the IO Manager thread.
      
      Test Plan: validate; new unit tests
      
      Reviewers: niteria, erikd, ezyang, bgamari, austin, hvr
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2501
  33. 09 Sep, 2016 1 commit
    • Make start address of `osReserveHeapMemory` tunable via command line -xb · 1b5f9207
      bitonic authored
      Summary:
      We stumbled upon a case where an external library (OpenCL) does not work
      if a specific address (0x200000000) is taken.
      
      It so happens that `osReserveHeapMemory` starts trying to mmap at 0x200000000:
      
      ```
              void *hint = (void*)((W_)8 * (1 << 30) + attempt * BLOCK_SIZE);
              at = osTryReserveHeapMemory(*len, hint);
      ```
      
      This makes it impossible to use Haskell programs compiled with GHC 8
      with C functions that use OpenCL.
      
      See this example https://github.com/chpatrick/oclwtf for a repro.
      
      This patch allows the user to work around this kind of behavior outside
      our control by letting the user override the starting address through an
      RTS command line flag.
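
      The address arithmetic can be checked with a small standalone C
      sketch. It is not the RTS code: `heap_hint` is a hypothetical helper,
      the 4096-byte block size is an assumption, and treating a base of 0
      as "use the default" is an assumption about how an override could be
      wired in.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* 8 * 2^30 bytes, i.e. 0x200000000, as in the snippet above. */
      #define DEFAULT_HEAP_BASE ((uint64_t)8 * ((uint64_t)1 << 30))
      /* Assumption: blocks are 4096 bytes. */
      #define BLOCK_SIZE_BYTES  ((uint64_t)4096)

      /* Address hint for a given attempt; user_base == 0 selects the default. */
      uint64_t heap_hint(uint64_t user_base, unsigned attempt)
      {
          uint64_t base = user_base != 0 ? user_base : DEFAULT_HEAP_BASE;
          uint64_t offset = (uint64_t)attempt * BLOCK_SIZE_BYTES;
          assert(base + offset >= base); /* no wrap-around */
          return base + offset;
      }
      ```

      With no override the first attempt is exactly 0x200000000, the
      address the external library needs; any other base moves every
      attempt away from it.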
      
      Reviewers: bgamari, Phyx, simonmar, erikd, austin
      
      Reviewed By: Phyx, simonmar
      
      Subscribers: rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D2513
  34. 30 Aug, 2016 1 commit
    • Tag pointers in interpreted constructors · a25bf267
      mniip authored
      Instead of stg_interp_constr_entry there are now 7 functions (one for
      each value of the tag bits) that tag the constructor pointer before
      returning. This is consistent with compiled constructors' entry code,
      and expectations that compiled code places on compiled constructors. The
      iserv protocol is extended with an extra field that explains what
      pointer tag the constructor should use.
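
      Pointer tagging itself can be illustrated with a short C sketch. This
      is a generic illustration rather than the RTS entry code: it assumes
      8-byte-aligned closures and 3 tag bits (hence the 7 non-zero tag
      values), and all function names are hypothetical.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Assumption: 3 low bits are free because closures are 8-byte aligned. */
      #define TAG_BITS 3
      #define TAG_MASK ((uintptr_t)((1u << TAG_BITS) - 1))

      /* Store a tag in 1..7 in the low bits of an aligned pointer. */
      void *tag_pointer(void *p, unsigned tag)
      {
          assert(((uintptr_t)p & TAG_MASK) == 0);
          assert(tag >= 1 && tag <= TAG_MASK);
          return (void *)((uintptr_t)p | (uintptr_t)tag);
      }

      /* Read the tag back out of a (possibly tagged) pointer. */
      unsigned pointer_tag(const void *p)
      {
          return (unsigned)((uintptr_t)p & TAG_MASK);
      }

      /* Clear the tag bits to recover the original address. */
      void *untag_pointer(void *p)
      {
          return (void *)((uintptr_t)p & ~TAG_MASK);
      }

      /* Returns 1 if tagging and untagging a stand-in closure round-trips. */
      int tag_round_trips(unsigned tag)
      {
          static _Alignas(8) unsigned char fake_closure[8];
          void *tagged = tag_pointer(fake_closure, tag);
          return pointer_tag(tagged) == tag
              && untag_pointer(tagged) == (void *)fake_closure;
      }
      ```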
      
      Test Plan: Added tests for #12523
      
      Reviewers: erikd, bgamari, hvr, austin, simonmar
      
      Reviewed By: simonmar
      
      Subscribers: osa1, thomie, rwbarton
      
      Differential Revision: https://phabricator.haskell.org/D2473
      
      GHC Trac Issues: #12523