This project is mirrored from https://gitlab.haskell.org/ghc/ghc.git. Pull mirroring failed .
Repository mirroring has been paused due to too many failed attempts. It can be resumed by a project maintainer.
Last successful update .
  1. 23 Apr, 2017 1 commit
  2. 02 Apr, 2017 1 commit
    • Simon Marlow's avatar
      Report heap overflow in the same way as stack overflow · 61ba4518
      Simon Marlow authored
      Now that we throw an exception for heap overflow, we should only print
      the heap overflow message in the main thread when the HeapOverflow
      exception is caught, rather than as a side effect in the GC.
      
      Stack overflows were already done this way, I just made heap overflow
      consistent with stack overflow, and did some related cleanup.
      
      Fixes broken T2592(profasm) which was reporting the heap overflow
      message twice (you would only notice when building with profiling
      libs enabled).
      
      Test Plan: validate
      
      Reviewers: bgamari, niteria, austin, DemiMarie, hvr, erikd
      
      Reviewed By: bgamari
      
      Subscribers: rwbarton, thomie
      
      Differential Revision: https://phabricator.haskell.org/D3394
      61ba4518
  3. 07 Dec, 2016 1 commit
  4. 06 Dec, 2016 1 commit
    • Simon Marlow's avatar
      Overhaul GC stats · 24e6594c
      Simon Marlow authored
      Summary:
      Visible API changes:
      
      * The C struct `GCDetails` gives the stats about a single GC.  This is
        passed to the `gcDone()` callback if one is set via the
        RtsConfig. (previously we just passed a collection of values, so this
        is more extensible, at the expense of breaking the existing API)
      
      * `RTSStats` gives cumulative stats since the start of the program,
        and includes the `GCDetails` for the most recent GC.  This struct
        can be obtained via `getRTSStats()` (the old `getGCStats()` has been
        removed, and `getGCStatsEnabled()` has been renamed to
        `getRTSStatsEnabled()`)
      
      Improvements:
      
      * The per-GC stats and cumulative stats are now cleanly separated.
      
      * Inside the RTS we have a top-level `RTSStats` struct to keep all our
        stats in, previously this was just a collection of strangely-named
        variables.  This struct is mostly just copied in `getRTSStats()`, so
        the implementation of that function is a lot shorter.
      
      * Types are more consistent.  We use a uint64_t byte count for all
        memory values, and Time for all time values.
      
      * Names are more consistent.  We use a suffix `_bytes` for all byte
        counts and `_ns` for all time values.
      
      * We now collect information about the amount of memory in large
        objects and compact objects in `GCDetails`. (the latter was the reason
        I started doing this patch but it seems to have ballooned a bit!)
      
      * I fixed a bug in the calculation of the elapsed MUT time, and added
        an ASSERT to stop the calculations going wrong in the future.
      
      For now I kept the Haskell API in `GHC.Stats` the same, by
      impedence-matching with the new API.  We could either break that API
      and make it match the C API more closely, or we could add a new API
      and deprecate the old one.  Opinions welcome.
      
      This stuff is very easy to get wrong, and it's hard to test.  Reviews
      welcome!
      
      Test Plan:
      manual testing
      validate
      
      Reviewers: bgamari, niteria, austin, ezyang, hvr, erikd, rwbarton, Phyx
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2756
      24e6594c
  5. 15 Aug, 2016 1 commit
  6. 10 Jun, 2016 1 commit
    • Simon Marlow's avatar
      NUMA support · 9e5ea67e
      Simon Marlow authored
      Summary:
      The aim here is to reduce the number of remote memory accesses on
      systems with a NUMA memory architecture, typically multi-socket servers.
      
      Linux provides a NUMA API for doing two things:
      * Allocating memory local to a particular node
      * Binding a thread to a particular node
      
      When given the +RTS --numa flag, the runtime will
      * Determine the number of NUMA nodes (N) by querying the OS
      * Assign capabilities to nodes, so cap C is on node C%N
      * Bind worker threads on a capability to the correct node
      * Keep a separate free lists in the block layer for each node
      * Allocate the nursery for a capability from node-local memory
      * Allocate blocks in the GC from node-local memory
      
      For example, using nofib/parallel/queens on a 24-core 2-socket machine:
      
      ```
      $ ./Main 15 +RTS -N24 -s -A64m
        Total   time  173.960s  (  7.467s elapsed)
      
      $ ./Main 15 +RTS -N24 -s -A64m --numa
        Total   time  150.836s  (  6.423s elapsed)
      ```
      
      The biggest win here is expected to be allocating from node-local
      memory, so that means programs using a large -A value (as here).
      
      According to perf, on this program the number of remote memory accesses
      were reduced by more than 50% by using `--numa`.
      
      Test Plan:
      * validate
      * There's a new flag --debug-numa=<n> that pretends to do NUMA without
        actually making the OS calls, which is useful for testing the code
        on non-NUMA systems.
      * TODO: I need to add some unit tests
      
      Reviewers: erikd, austin, rwbarton, ezyang, bgamari, hvr, niteria
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2199
      9e5ea67e
  7. 18 Apr, 2016 1 commit
  8. 23 Nov, 2015 1 commit
  9. 17 Oct, 2015 1 commit
    • Ben Gamari's avatar
      Libdw: Add libdw-based stack unwinding · a6a3dabc
      Ben Gamari authored
      This adds basic support to the RTS for DWARF-assisted unwinding of the
      Haskell and C stack via libdw. This only adds the infrastructure;
      consumers of this functionality will be introduced in future diffs.
      
      Currently we are carrying the initial register collection code in
      Libdw.c but this will eventually make its way upstream to libdw.
      
      Test Plan: See future patches
      
      Reviewers: Tarrasch, scpmw, austin, simonmar
      
      Reviewed By: austin, simonmar
      
      Subscribers: simonmar, thomie, erikd
      
      Differential Revision: https://phabricator.haskell.org/D1196
      
      GHC Trac Issues: #10656
      a6a3dabc
  10. 10 Apr, 2015 1 commit
  11. 07 Apr, 2015 1 commit
  12. 10 Dec, 2014 1 commit
  13. 20 Oct, 2014 1 commit
  14. 02 Oct, 2014 1 commit
    • Edward Z. Yang's avatar
      Rename _closure to _static_closure, apply naming consistently. · 35672072
      Edward Z. Yang authored
      Summary:
      In preparation for indirecting all references to closures,
      we rename _closure to _static_closure to ensure any old code
      will get an undefined symbol error.  In order to reference
      a closure foobar_closure (which is now undefined), you should instead
      use STATIC_CLOSURE(foobar).  For convenience, a number of these
      old identifiers are macro'd.
      
      Across C-- and C (Windows and otherwise), there were differing
      conventions on whether or not foobar_closure or &foobar_closure
      was the address of the closure.  Now, all foobar_closure references
      are addresses, and no & is necessary.
      
      CHARLIKE/INTLIKE were not changed, simply alpha-renamed.
      
      Part of remove HEAP_ALLOCED patch set (#8199)
      
      Depends on D265
      Signed-off-by: Edward Z. Yang's avatarEdward Z. Yang <ezyang@mit.edu>
      
      Test Plan: validate
      
      Reviewers: simonmar, austin
      
      Subscribers: simonmar, ezyang, carter, thomie
      
      Differential Revision: https://phabricator.haskell.org/D267
      
      GHC Trac Issues: #8199
      35672072
  15. 20 Aug, 2014 1 commit
  16. 25 Oct, 2013 1 commit
  17. 11 Oct, 2013 1 commit
  18. 01 Oct, 2013 1 commit
  19. 08 Sep, 2013 2 commits
  20. 15 Jun, 2013 1 commit
  21. 05 Feb, 2013 1 commit
  22. 17 Jan, 2013 1 commit
  23. 01 Nov, 2012 1 commit
  24. 08 Oct, 2012 1 commit
    • Simon Marlow's avatar
      Produce new-style Cmm from the Cmm parser · a7c0387d
      Simon Marlow authored
      The main change here is that the Cmm parser now allows high-level cmm
      code with argument-passing and function calls.  For example:
      
      foo ( gcptr a, bits32 b )
      {
        if (b > 0) {
           // we can make tail calls passing arguments:
           jump stg_ap_0_fast(a);
        }
      
        return (x,y);
      }
      
      More details on the new cmm syntax are in Note [Syntax of .cmm files]
      in CmmParse.y.
      
      The old syntax is still more-or-less supported for those occasional
      code fragments that really need to explicitly manipulate the stack.
      However there are a couple of differences: it is now obligatory to
      give a list of live GlobalRegs on every jump, e.g.
      
        jump %ENTRY_CODE(Sp(0)) [R1];
      
      Again, more details in Note [Syntax of .cmm files].
      
      I have rewritten most of the .cmm files in the RTS into the new
      syntax, except for AutoApply.cmm which is generated by the genapply
      program: this file could be generated in the new syntax instead and
      would probably be better off for it, but I ran out of enthusiasm.
      
      Some other changes in this batch:
      
       - The PrimOp calling convention is gone, primops now use the ordinary
         NativeNodeCall convention.  This means that primops and "foreign
         import prim" code must be written in high-level cmm, but they can
         now take more than 10 arguments.
      
       - CmmSink now does constant-folding (should fix #7219)
      
       - .cmm files now go through the cmmPipeline, and as a result we
         generate better code in many cases.  All the object files generated
         for the RTS .cmm files are now smaller.  Performance should be
         better too, but I haven't measured it yet.
      
       - RET_DYN frames are removed from the RTS, lots of code goes away
      
       - we now have some more canned GC points to cover unboxed-tuples with
         2-4 pointers, which will reduce code size a little.
      a7c0387d
  25. 31 Jul, 2012 1 commit
  26. 27 Apr, 2012 1 commit
  27. 26 Apr, 2012 3 commits
  28. 24 Apr, 2012 1 commit
  29. 19 Mar, 2012 1 commit
  30. 18 Mar, 2012 1 commit
  31. 15 Jan, 2012 1 commit
    • Ian Lynagh's avatar
      Fix a #define · 54121fff
      Ian Lynagh authored
      I don't think it was causing any problems, but
          TimeToUS(x+y)
      would have evaluated to
          x + (y / 1000)
      54121fff
  32. 25 Nov, 2011 1 commit
    • Simon Marlow's avatar
      Time handling overhaul · 6b109851
      Simon Marlow authored
      Terminology cleanup: the type "Ticks" has been renamed "Time", which
      is an StgWord64 in units of TIME_RESOLUTION (currently nanoseconds).
      The terminology "tick" is now used consistently to mean the interval
      between timer signals.
      
      The ticker now always ticks in realtime (actually CLOCK_MONOTONIC if
      we have it).  Before it used CPU time in the non-threaded RTS and
      realtime in the threaded RTS, but I've discovered that the CPU timer
      has terrible resolution (at least on Linux) and isn't much use for
      profiling.  So now we always use realtime.  This should also fix
      
      The default tick interval is now 10ms, except when profiling where we
      drop it to 1ms.  This gives more accurate profiles without affecting
      runtime too much (<1%).
      
      Lots of cleanups - the resolution of Time is now in one place
      only (Rts.h) rather than having calculations that depend on the
      resolution scattered all over the RTS.  I hope I found them all.
      6b109851
  33. 16 Nov, 2011 1 commit
    • Simon Marlow's avatar
      Generate the C main() function when linking a binary (fixes #5373) · 1df28a80
      Simon Marlow authored
      Rather than have main() be statically compiled as part of the RTS, we
      now generate it into the tiny C file that we compile when linking a
      binary.
      
      The main motivation is that we want to pass the settings for the
      -rtsotps and -with-rtsopts flags into the RTS, rather than relying on
      fragile linking semantics to override the defaults, which don't work
      with DLLs on Windows (#5373).  In order to do this, we need to extend
      the API for initialising the RTS, so now we have:
      
      void hs_init_ghc (int *argc, char **argv[],   // program arguments
                        RtsConfig rts_config);      // RTS configuration
      
      hs_init_ghc() can optionally be used instead of hs_init(), and allows
      passing in configuration options for the RTS.  RtsConfig is a struct,
      which currently has two fields:
      
      typedef struct {
          RtsOptsEnabledEnum rts_opts_enabled;
          const char *rts_opts;
      } RtsConfig;
      
      but might have more in the future.  There is a default value for the
      struct, defaultRtsConfig, the idea being that you start with this and
      override individual fields as necessary.
      
      In fact, main() was in a separate static library, libHSrtsmain.a.
      That's now gone.
      1df28a80
  34. 25 May, 2011 1 commit
  35. 14 May, 2011 1 commit
  36. 23 Nov, 2010 1 commit
  37. 17 Jun, 2010 1 commit