This project is mirrored from https://gitlab.haskell.org/ghc/ghc.git.
  1. 04 Apr, 2012 1 commit
    • Add eventlog/trace stuff for capabilities: create/delete/enable/disable · f9c2e854
      Duncan Coutts authored
      Now that we can adjust the number of capabilities on the fly, we need
      this reflected in the eventlog. Previously the eventlog had a single
      startup event that declared a static number of capabilities. Obviously
      that's no good anymore.
      
      For compatibility we're keeping the EVENT_STARTUP but adding new
      EVENT_CAP_CREATE/DELETE. The EVENT_CAP_DELETE is actually just the old
      EVENT_SHUTDOWN but renamed and extended (using the existing mechanism
      to extend eventlog events in a compatible way). So we now emit both
      EVENT_STARTUP and EVENT_CAP_CREATE. One day we will drop EVENT_STARTUP.
      
      Since reducing the number of capabilities at runtime does not really
      delete them but just disables them, we also have new events for
      disable/enable.
      
      The old EVENT_SHUTDOWN was in the scheduler class of events. The new
      EVENT_CAP_* events are in the unconditional class, along with the
      EVENT_CAPSET_* ones. Knowing when capabilities are created and deleted
      is crucial to making sense of eventlogs, so you always want those events.
      In any case, they're extremely low volume.
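A minimal sketch of how a tool reading the eventlog might track capability state from these events. The enum names mirror the events named in the message, but their values, the `CapEvent` record and both functions are hypothetical and do not reflect the real eventlog encoding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds named after the commit message.
   The values are placeholders, not the real eventlog codes. */
typedef enum { CAP_CREATE, CAP_DELETE, CAP_DISABLE, CAP_ENABLE } CapEventKind;

typedef struct {
    CapEventKind kind;
    unsigned     cap_no;
} CapEvent;

typedef enum { CAP_ABSENT, CAP_ENABLED, CAP_DISABLED } CapState;

#define MAX_CAPS 64

/* Replay a stream of events and record the final state of each capability. */
void replay_cap_events(const CapEvent *events, size_t n, CapState state[MAX_CAPS])
{
    for (size_t i = 0; i < MAX_CAPS; i++)
        state[i] = CAP_ABSENT;

    for (size_t i = 0; i < n; i++) {
        assert(events[i].cap_no < MAX_CAPS);
        switch (events[i].kind) {
        case CAP_CREATE:  state[events[i].cap_no] = CAP_ENABLED;  break;
        case CAP_ENABLE:  state[events[i].cap_no] = CAP_ENABLED;  break;
        case CAP_DISABLE: state[events[i].cap_no] = CAP_DISABLED; break;
        case CAP_DELETE:  state[events[i].cap_no] = CAP_ABSENT;   break;
        }
    }
}

/* Count the capabilities that are currently created and enabled. */
unsigned count_enabled(const CapState state[MAX_CAPS])
{
    unsigned n = 0;
    for (size_t i = 0; i < MAX_CAPS; i++)
        if (state[i] == CAP_ENABLED)
            n++;
    return n;
}
```

With a single static startup event, a reader could not express the difference between a disabled and a deleted capability; the per-capability events make that state explicit.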
  2. 19 Mar, 2012 1 commit
  3. 18 Mar, 2012 1 commit
  4. 16 Mar, 2012 1 commit
  5. 27 Feb, 2012 1 commit
  6. 06 Jan, 2012 1 commit
  7. 15 Dec, 2011 1 commit
    • Support for reducing the number of Capabilities with setNumCapabilities · 9bae7915
      Simon Marlow authored
      This patch allows setNumCapabilities to /reduce/ the number of active
      capabilities as well as increase it.  This is particularly tricky to
      do, because a Capability is a large data structure and ties into the
      rest of the system in many ways.  Trying to clean it all up would be
      extremely error prone.
      
      So instead, the solution is to mark the extra capabilities as
      "disabled".  This has the following consequences:
      
        - threads on a disabled capability are migrated away by the
          scheduler loop
      
        - disabled capabilities do not participate in GC
          (see scheduleDoGC())
      
        - No spark threads are created on this capability
          (see scheduleActivateSpark())
      
        - We do not attempt to migrate threads *to* a disabled
          capability (see schedulePushWork()).
      
      So a disabled capability should do no work, and does not participate
      in GC, although it remains alive in other respects.  For example, a
      blocked thread might wake up on a disabled capability, and it will get
      quickly migrated to a live capability.  A disabled capability can
      still initiate GC if necessary.  Indeed, it turns out to be hard to
      migrate bound threads, so we wait until the next GC to do this (see
      comments for details).
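The "disable rather than delete" idea can be illustrated with a toy model. Everything here (`ToyCap`, `ToyRts`, `toy_set_num_caps`, `toy_schedule_step`) is a hypothetical simplification, not the RTS code: threads are reduced to a count, and bound threads and GC are ignored.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CAPS 8

typedef struct {
    bool disabled;   /* true once the capability has been "removed" */
    int  n_threads;  /* runnable threads queued on this capability */
} ToyCap;

typedef struct {
    ToyCap caps[MAX_CAPS];
    int    n_caps;        /* capabilities ever created */
    int    enabled_caps;  /* capabilities [0, enabled_caps) are enabled */
} ToyRts;

/* Shrinking only marks the extra capabilities as disabled; growing
   (up to n_caps) re-enables them. Nothing is freed. */
void toy_set_num_caps(ToyRts *rts, int n)
{
    assert(n >= 1 && n <= rts->n_caps);
    for (int i = 0; i < rts->n_caps; i++)
        rts->caps[i].disabled = (i >= n);
    rts->enabled_caps = n;
}

/* One pass of the scheduler loop on capability i. A disabled capability
   does no work: it hands its threads to the enabled capabilities in
   round-robin order. Returns the number of threads migrated. */
int toy_schedule_step(ToyRts *rts, int i)
{
    ToyCap *cap = &rts->caps[i];
    int moved = 0;

    if (!cap->disabled)
        return 0;

    while (cap->n_threads > 0) {
        rts->caps[moved % rts->enabled_caps].n_threads++;
        cap->n_threads--;
        moved++;
    }
    return moved;
}
```

The point of the design is visible in `toy_set_num_caps`: no data structure is torn down, so nothing that still refers to a disabled capability is left dangling.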
  8. 13 Dec, 2011 1 commit
    • New flag +RTS -qi<n>, avoid waking up idle Capabilities to do parallel GC · a02eb298
      Simon Marlow authored
      This is an experimental tweak to the parallel GC that avoids waking up
      a Capability to do parallel GC if we know that the capability has been
      idle for a (tunable) number of GC cycles.  The idea is that if you're
      only using a few Capabilities, there's no point waking up the ones
      that aren't busy.
      
      e.g. +RTS -qi3
      
      says "A Capability will participate in parallel GC if it was running
      at all since the last 3 GC cycles."
      
      Results are a bit hit and miss, and I don't completely understand why
      yet.  Hence, for now it is turned off by default, and also not
      documented except in the +RTS -? output.
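One way to picture the rule is a per-capability counter of consecutive idle GC cycles. This is a sketch under assumptions: `IdleCap`, `note_gc` and `participates_in_par_gc` are hypothetical names, the exact boundary (`<` versus `<=`) is a guess, and treating 0 as "flag off" is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    unsigned idle_gcs;  /* consecutive GC cycles in which this capability ran nothing */
} IdleCap;

/* Called once per capability at each GC to update its idle counter. */
void note_gc(IdleCap *cap, bool ran_since_last_gc)
{
    cap->idle_gcs = ran_since_last_gc ? 0 : cap->idle_gcs + 1;
}

/* Decide whether to wake this capability for a parallel GC.
   qi == 0 models the flag being off: every capability takes part. */
bool participates_in_par_gc(const IdleCap *cap, unsigned qi)
{
    return qi == 0 || cap->idle_gcs < qi;
}
```

Under this model, with `qi` set to 3 a capability drops out of parallel GC after three consecutive idle cycles and rejoins as soon as it runs again.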
  9. 06 Dec, 2011 2 commits
    • Allow the number of capabilities to be increased at runtime (#3729) · 92e7d6c9
      Simon Marlow authored
      At present the number of capabilities can only be *increased*, not
      decreased.  The latter presents a few more challenges!
    • Make forkProcess work with +RTS -N · 8b75acd3
      Simon Marlow authored
      Consider this experimental for the time being.  There are a lot of
      things that could go wrong, but I've verified that at least it works
      on the test cases we have.
      
      I also did some API cleanups while I was here.  Previously we had:
      
      Capability * rts_eval (Capability *cap, HaskellObj p, /*out*/HaskellObj *ret);
      
      but this API is particularly error-prone: if you forget to discard the
      Capability * you passed in and use the return value instead, then
      you're in for subtle bugs with +RTS -N later on.  So I changed all
      these functions to this form:
      
      void rts_eval (/* inout */ Capability **cap,
                     /* in    */ HaskellObj p,
                     /* out   */ HaskellObj *ret);
      
      It's much harder to use this version incorrectly, because you have to
      pass the Capability in by reference.
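The difference between the two API shapes can be shown with stand-in types. `ToyCapability`, `toy_eval_old` and `toy_eval_new` are hypothetical; the "evaluation" just doubles an integer and always pretends the call finished on the other capability.

```c
#include <assert.h>

typedef struct { int no; } ToyCapability;

static ToyCapability toy_caps[2] = { {0}, {1} };

/* Old shape: the capability held after the call is the return value.
   Nothing stops the caller from ignoring it. */
ToyCapability *toy_eval_old(ToyCapability *cap, int input, int *ret)
{
    *ret = input * 2;                     /* stand-in for evaluating p */
    return &toy_caps[(cap->no + 1) % 2];  /* the call ended on another capability */
}

/* New shape: the caller's own variable is updated in place, so there is
   no stale pointer left to use by mistake. */
void toy_eval_new(ToyCapability **cap, int input, int *ret)
{
    *ret = input * 2;
    *cap = &toy_caps[((*cap)->no + 1) % 2];
}
```

A caller that writes `toy_eval_old(cap, 21, &r);` compiles cleanly yet keeps using the old `cap`, whereas `toy_eval_new(&cap, 21, &r);` cannot leave `cap` out of date.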
  10. 01 Dec, 2011 1 commit
    • Fix a scheduling bug in the threaded RTS · 6d18141d
      Simon Marlow authored
      The parallel GC was using setContextSwitches() to stop all the other
      threads, which sets the context_switch flag on every Capability.  That
      had the side effect of causing every Capability to also switch
      threads, and since GCs can be much more frequent than context
      switches, this increased the context switch frequency.  When context
      switches are expensive (because the switch is between two bound
      threads or a bound and unbound thread), the difference is quite
      noticeable.
      
      The fix is to have a separate flag to indicate that a Capability
      should stop and return to the scheduler, but not switch threads.  I've
      called this the "interrupt" flag.
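A small model of the two flags, assuming hypothetical names (`FlagCap`, `request_stop_for_gc`, `return_to_scheduler`) and a scheduler reduced to a single function:

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool context_switch;  /* return to the scheduler and run a different thread */
    bool interrupt;       /* return to the scheduler, keep the current thread */
    int  current_thread;
    int  switches;        /* how many thread switches have happened */
} FlagCap;

/* What the GC does to stop a capability after the fix. */
void request_stop_for_gc(FlagCap *cap)    { cap->interrupt = true; }

/* What the timer does when a time slice expires. */
void request_context_switch(FlagCap *cap) { cap->context_switch = true; }

/* The running thread has noticed a flag and returned to the scheduler.
   Only context_switch causes a different thread to be picked. */
void return_to_scheduler(FlagCap *cap, int next_thread)
{
    if (cap->context_switch) {
        cap->current_thread = next_thread;
        cap->switches++;
    }
    cap->context_switch = false;
    cap->interrupt = false;
}
```

In this model a GC request costs a trip through the scheduler but no thread switch, which is the behaviour the fix is after.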
  11. 29 Nov, 2011 1 commit
    • Make profiling work with multiple capabilities (+RTS -N) · 50de6034
      Simon Marlow authored
      This means that both time and heap profiling work for parallel
      programs.  Main internal changes:
      
        - CCCS is no longer a global variable; it is now another
          pseudo-register in the StgRegTable struct.  Thus every
          Capability has its own CCCS.
      
        - There is a new built-in CCS called "IDLE", which records ticks for
          Capabilities in the idle state.  If you profile a single-threaded
          program with +RTS -N2, you'll see about 50% of time in "IDLE".
      
        - There is appropriate locking in rts/Profiling.c to protect the
          shared cost-centre-stack data structures.
      
      This patch does enough to get it working, but I have cut one big corner:
      the cost-centre-stack data structure is still shared amongst all
      Capabilities, which means that multiple Capabilities will race when
      updating the "allocations" and "entries" fields of a CCS.  Not only
      does this give unpredictable results, but it runs very slowly due to
      cache line bouncing.
      
      It is strongly recommended that you use -fno-prof-count-entries to
      disable the "entries" count when profiling parallel programs. (I shall
      add a note to this effect to the docs).
  12. 25 Nov, 2011 1 commit
    • Time handling overhaul · 6b109851
      Simon Marlow authored
      Terminology cleanup: the type "Ticks" has been renamed "Time", which
      is an StgWord64 in units of TIME_RESOLUTION (currently nanoseconds).
      The terminology "tick" is now used consistently to mean the interval
      between timer signals.
      
      The ticker now always ticks in realtime (actually CLOCK_MONOTONIC if
      we have it).  Before it used CPU time in the non-threaded RTS and
      realtime in the threaded RTS, but I've discovered that the CPU timer
      has terrible resolution (at least on Linux) and isn't much use for
      profiling.  So now we always use realtime.  This should also fix
      
      The default tick interval is now 10ms, except when profiling where we
      drop it to 1ms.  This gives more accurate profiles without affecting
      runtime too much (<1%).
      
      Lots of cleanups - the resolution of Time is now in one place
      only (Rts.h) rather than having calculations that depend on the
      resolution scattered all over the RTS.  I hope I found them all.
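The idea of keeping the resolution in one place can be sketched as follows. `Time` and `TIME_RESOLUTION` follow the names in the message, with `uint64_t` standing in for StgWord64; the helper functions are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Time;                 /* stands in for StgWord64 */
#define TIME_RESOLUTION 1000000000ULL  /* Time units per second (nanoseconds) */

/* All conversions go through TIME_RESOLUTION, so changing the resolution
   means changing one constant. */
Time     seconds_to_time(uint64_t s) { return s * TIME_RESOLUTION; }
Time     ms_to_time(uint64_t ms)     { return ms * (TIME_RESOLUTION / 1000); }
uint64_t time_to_ms(Time t)          { return t / (TIME_RESOLUTION / 1000); }

/* Default tick interval from the commit message: 10ms, or 1ms when profiling. */
Time default_tick_interval(int profiling)
{
    return ms_to_time(profiling ? 1 : 10);
}

/* Number of whole ticks that fit in an elapsed time. */
uint64_t ticks_elapsed(Time elapsed, Time tick)
{
    assert(tick > 0);
    return elapsed / tick;
}
```

For example, one second of elapsed time covers 100 ticks at the default interval and 1000 ticks when profiling.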
  13. 14 Aug, 2011 1 commit
    • fix occasional failure of numsparks001 test. During shutdown we · f6f04307
      Simon Marlow authored
      discard all the sparks from each Capability, but we were forgetting to
      account for the discarded sparks in the stats, leading to a failure of
      the assertion that tests the spark invariant.
      
      I've moved the discarding of sparks to just before the GC, to avoid
      race conditions, and counted the discarded sparks as GC'd.
  14. 20 Jul, 2011 1 commit
  15. 18 Jul, 2011 5 commits
    • Add spark counter tracing · d77df1ca
      Duncan Coutts authored
      A new eventlog event containing 7 spark counters/statistics: sparks
      created, dud, overflowed, converted, GC'd, fizzled and remaining.
      These are maintained and logged separately for each capability.
      We log them at startup, on each GC (minor and major) and on shutdown.
    • Move allocation of spark pools into initCapability · 5d091088
      Duncan Coutts authored
      Rather than a separate phase of initSparkPools. It means all the spark
      stuff for a capability is initialised at the same time, which then
      becomes a good place to stick an initial spark trace event.
    • Add assertion of the invariant for the spark counters · ddb47a91
      Duncan Coutts authored
      The invariant is: created = converted + remaining + gcd + fizzled
      Since sparks move between capabilities, we have to aggregate the
      counters over all capabilities. This in turn means we can only check
      the invariant at stable points where all but one of the capabilities are
      stopped. We can do this at shutdown time and before and after a global
      synchronised GC.
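The invariant and the need to aggregate can be written down directly. The struct and function names are hypothetical; only the formula created = converted + remaining + gcd + fizzled comes from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per-capability spark counters (only the fields the invariant uses). */
typedef struct {
    uint64_t created;
    uint64_t converted;
    uint64_t remaining;  /* sparks still sitting in this capability's pool */
    uint64_t gcd;
    uint64_t fizzled;
} SparkCounters;

/* Check created = converted + remaining + gcd + fizzled over the sum of all
   capabilities. Only meaningful when the other capabilities are stopped. */
bool spark_invariant_holds(const SparkCounters *caps, size_t n)
{
    SparkCounters total = { 0, 0, 0, 0, 0 };

    for (size_t i = 0; i < n; i++) {
        total.created   += caps[i].created;
        total.converted += caps[i].converted;
        total.remaining += caps[i].remaining;
        total.gcd       += caps[i].gcd;
        total.fizzled   += caps[i].fizzled;
    }
    return total.created ==
           total.converted + total.remaining + total.gcd + total.fizzled;
}
```

If one capability creates five sparks and another steals and converts two of them, neither capability satisfies the equation on its own, but the totals do.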
    • Change tryStealSpark so it does not consume fizzled sparks · ededf355
      Duncan Coutts authored
      We want to count fizzled sparks accurately. Now tryStealSpark returns
      fizzled sparks, and the callers now update the fizzled spark count.
    • Fix Windows breakage (#5322). When I modified StgRun to use the pure · 81eddb4c
      Simon Marlow authored
      assembly version as part of the fix for #5250, we inadvertently lost
      the Windows magic for extending the stack.  Win32 requires that the
      stack is extended a page at a time, otherwise you get a segfault.  The
      C compiler knows how to do this, so we now call a C stub to ensure
      there's enough stack space at each invocation of the scheduler.
  16. 25 Jun, 2011 1 commit
    • Fix gcc 4.6 warnings; fixes #5176 · 0a6f26f6
      Ian Lynagh authored
      Based on a patch from David Terei.
      
      Some parts are a little ugly (e.g. defining things that only ASSERTs
      use only when DEBUG is defined), so we might want to tweak things a
      little.
      
      I've also turned off -Werror for didn't-inline warnings, as we now
      get a few such warnings.
  17. 26 May, 2011 1 commit
    • Rearrange shutdownCapability code slightly · 68b76e0e
      Duncan Coutts authored
      This is mostly for the benefit of having sensible places to put tracing
      code later. We want a code path that has somewhere to trace (in order):
       (1) starting up all capabilities;
       (2) N * starting up an individual capability;
       (3) N * shutting down an individual capability;
       (4) shutting down all capabilities.
      This has to work in both threaded and non-threaded modes.
      
      Locations (1) and (2) are provided by initCapabilities and
      initCapability respectively. Previously, there was no location for (4)
      and while shutdownCapability should be usable for (3) it was only called
      in the !THREADED_RTS case.
      
      Now, shutdownCapability is called unconditionally (and the body is
      conditional on THREADED_RTS) and there is a new shutdownCapabilities that
      calls shutdownCapability in a loop.
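The resulting shape of the shutdown path might look like the sketch below. `TOY_THREADED` stands in for THREADED_RTS, and the trace counters are hypothetical placeholders for the tracing code the message says will be added later.

```c
#include <assert.h>

#define TOY_THREADED 1  /* stands in for THREADED_RTS */

typedef struct {
    int no;
    int running_tasks;
} ShutCap;

/* Hypothetical trace hooks, reduced to counters. */
static int traced_cap_shutdowns = 0;     /* location (3) */
static int traced_global_shutdowns = 0;  /* location (4) */

/* Called unconditionally; only the body depends on the threaded build. */
void toy_shutdown_capability(ShutCap *cap)
{
#if TOY_THREADED
    cap->running_tasks = 0;  /* stand-in for waiting until worker tasks exit */
#else
    (void)cap;               /* nothing to do in the non-threaded build */
#endif
    traced_cap_shutdowns++;
}

/* Shut down every capability in a loop, then mark the global shutdown. */
void toy_shutdown_capabilities(ShutCap *caps, int n)
{
    for (int i = 0; i < n; i++)
        toy_shutdown_capability(&caps[i]);
    traced_global_shutdowns++;
}
```

Because the per-capability function is always called, both trace locations exist in threaded and non-threaded builds alike.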
  18. 22 May, 2011 1 commit
  19. 18 May, 2011 1 commit
  20. 11 May, 2011 1 commit
  21. 11 Apr, 2011 1 commit
    • Refactoring and tidy up · 1fb38442
      Simon Marlow authored
      This is a port of some of the changes from my private local-GC branch
      (which is still in darcs, I haven't converted it to git yet).  There
      are a couple of small functional differences in the GC stats: first,
      per-thread GC timings should now be more accurate, and secondly we now
      report average and maximum pause times. e.g. from minimax +RTS -N8 -s:
      
                                          Tot time (elapsed)  Avg pause  Max pause
        Gen  0      2755 colls,  2754 par   13.16s    0.93s     0.0003s    0.0150s
        Gen  1       769 colls,   769 par    3.71s    0.26s     0.0003s    0.0059s
  22. 30 Mar, 2011 1 commit
  23. 02 Feb, 2011 1 commit
  24. 27 Jan, 2011 1 commit
  25. 21 Dec, 2010 1 commit
  26. 15 Dec, 2010 1 commit
    • Implement stack chunks and separate TSO/STACK objects · f30d5273
      Simon Marlow authored
      This patch makes two changes to the way stacks are managed:
      
      1. The stack is now stored in a separate object from the TSO.
      
      This means that it is easier to replace the stack object for a thread
      when the stack overflows or underflows; we don't have to leave behind
      the old TSO as an indirection any more.  Consequently, we can remove
      ThreadRelocated and deRefTSO(), which were a pain.
      
      This is obviously the right thing, but the last time I tried to do it
      it made performance worse.  This time I seem to have cracked it.
      
      2. Stacks are now represented as a chain of chunks, rather than
         a single monolithic object.
      
      The big advantage here is that individual chunks are marked clean or
      dirty according to whether they contain pointers to the young
      generation, and the GC can avoid traversing clean stack chunks during
      a young-generation collection.  This means that programs with deep
      stacks will see a big saving in GC overhead when using the default GC
      settings.
      
      A secondary advantage is that there is much less copying involved as
      the stack grows.  Programs that quickly grow a deep stack will see big
      improvements.
      
      In some ways the implementation is simpler, as nothing special needs
      to be done to reclaim stack as the stack shrinks (the GC just recovers
      the dead stack chunks).  On the other hand, we have to manage stack
      underflow between chunks, so there's a new stack frame
      (UNDERFLOW_FRAME), and we now have separate TSO and STACK objects.
      The total amount of code is probably about the same as before.
      
      There are new RTS flags:
      
         -ki<size> Sets the initial thread stack size (default 1k)  Egs: -ki4k -ki2m
         -kc<size> Sets the stack chunk size (default 32k)
         -kb<size> Sets the stack chunk buffer size (default 1k)
      
      -ki was previously called just -k, and the old name is still accepted
      for backwards compatibility.  These new options are documented.
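The saving from clean and dirty chunks can be illustrated with a toy chain. `StackChunk` and `words_scanned_by_minor_gc` are hypothetical; real stack chunks hold frames and the GC does far more than count words.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct StackChunk {
    struct StackChunk *older;  /* next chunk down the stack, NULL at the bottom */
    bool               dirty;  /* may contain pointers to the young generation */
    size_t             words;  /* size of the live part of this chunk */
} StackChunk;

/* Words a young-generation collection has to traverse: clean chunks
   are skipped entirely. */
size_t words_scanned_by_minor_gc(const StackChunk *top)
{
    size_t scanned = 0;

    for (const StackChunk *c = top; c != NULL; c = c->older)
        if (c->dirty)
            scanned += c->words;
    return scanned;
}
```

With a deep stack where only the top chunk has been touched since the last collection, the work is proportional to that one chunk rather than to the whole stack.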
  27. 10 Dec, 2010 1 commit
  28. 09 Dec, 2010 1 commit
  29. 03 Dec, 2010 1 commit
  30. 26 Nov, 2010 1 commit
  31. 25 Nov, 2010 1 commit
  32. 25 Sep, 2010 2 commits
  33. 19 Sep, 2010 1 commit
    • Interruptible FFI calls with pthread_kill and CancelSynchronousIO. v4 · 83d563cb
      Edward Z. Yang authored
      This is a patch that adds support for interruptible FFI calls in the form
      of a new foreign import keyword 'interruptible', which can be used
      instead of 'safe' or 'unsafe'.  Interruptible FFI calls act like safe
      FFI calls, except that the worker thread they run on may be interrupted.
      
      Internally, it replaces BlockedOnCCall_NoUnblockEx with
      BlockedOnCCall_Interruptible, and changes the behavior of the RTS
      to not modify the TSO_ flags in the event of an FFI call from
      a thread that was interruptible.  It also modifies the bytecode
      format for foreign calls, adding an extra Word16 to indicate
      interruptibility.
      
      The semantics of interruption vary from platform to platform, but the
      intent is that any blocking system calls are aborted with an error code.
      This is most useful for making function calls to system library
      functions that support interrupting.  There is no support for pre-Vista
      Windows.
      
      There is a partner testsuite patch which adds several tests for this
      functionality.
  34. 18 May, 2010 1 commit
    • Fix #4074 (I hope). · f9b4bc22
      Simon Marlow authored
      1. allow multiple threads to call startTimer()/stopTimer() pairs
      2. disable the timer around fork() in forkProcess()
      
      A corresponding change to the process package is required.
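Allowing several callers to use startTimer()/stopTimer() pairs amounts to counting outstanding stops. The sketch below assumes a counter that starts at 1 (timer initially off) and is single-threaded; `ToyTimer` and the `toy_*` functions are hypothetical, and a thread-safe version would need atomic updates.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    unsigned disabled;  /* number of outstanding stop requests */
    bool     ticking;   /* whether the underlying ticker is running */
} ToyTimer;

/* The timer starts out stopped, with one outstanding stop. */
void toy_init_timer(ToyTimer *t)
{
    t->disabled = 1;
    t->ticking = false;
}

/* Undo one stop; the ticker restarts only when no stops remain. */
void toy_start_timer(ToyTimer *t)
{
    assert(t->disabled > 0);
    if (--t->disabled == 0)
        t->ticking = true;
}

/* Record one stop; only the first stop halts the ticker. */
void toy_stop_timer(ToyTimer *t)
{
    if (t->disabled++ == 0)
        t->ticking = false;
}
```

In this model, stopping the timer around fork() nests safely inside another caller's stop/start pair: the ticker stays off until the last matching start.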