1. 02 Feb, 2011 1 commit
  2. 17 Jun, 2010 1 commit
  3. 03 Dec, 2009 1 commit
    • Simon Marlow's avatar
      GC refactoring, remove "steps" · 214b3663
      Simon Marlow authored
      The GC had a two-level structure, G generations each of T steps.
      Steps are for aging within a generation, mostly to avoid premature
      promotion.  
      
      Measurements show that more than 2 steps is almost never worthwhile,
      and 1 step is usually worse than 2.  In theory fractional steps are
      possible, so the ideal number of steps is somewhere between 1 and 3.
      GHC's default has always been 2.
      
      We can implement 2 steps quite straightforwardly by having each block
      point to the generation to which objects in that block should be
      promoted, so blocks in the nursery point to generation 0, and blocks
      in gen 0 point to gen 1, and so on.
      
      This commit removes the explicit step structures, merging generations
      with steps, thus simplifying a lot of code.  Performance is
      unaffected.  The tunable number of steps is now gone, although it may
      be replaced in the future by a way to tune the aging in generation 0.
      214b3663
  4. 09 Sep, 2009 1 commit
  5. 05 Aug, 2009 1 commit
  6. 02 Aug, 2009 1 commit
    • Simon Marlow's avatar
      RTS tidyup sweep, first phase · a2a67cd5
      Simon Marlow authored
      The first phase of this tidyup is focussed on the header files, and in
      particular making sure we are exposinng publicly exactly what we need
      to, and no more.
      
       - Rts.h now includes everything that the RTS exposes publicly,
         rather than a random subset of it.
      
       - Most of the public header files have moved into subdirectories, and
         many of them have been renamed.  But clients should not need to
         include any of the other headers directly, just #include the main
         public headers: Rts.h, HsFFI.h, RtsAPI.h.
      
       - All the headers needed for via-C compilation have moved into the
         stg subdirectory, which is self-contained.  Most of the headers for
         the rest of the RTS APIs have moved into the rts subdirectory.
      
       - I left MachDeps.h where it is, because it is so widely used in
         Haskell code.
       
       - I left a deprecated stub for RtsFlags.h in place.  The flag
         structures are now exposed by Rts.h.
      
       - Various internal APIs are no longer exposed by public header files.
      
       - Various bits of dead code and declarations have been removed
      
       - More gcc warnings are turned on, and the RTS code is more
         warning-clean.
      
       - More source files #include "PosixSource.h", and hence only use
         standard POSIX (1003.1c-1995) interfaces.
      
      There is a lot more tidying up still to do, this is just the first
      pass.  I also intend to standardise the names for external RTS APIs
      (e.g use the rts_ prefix consistently), and declare the internal APIs
      as hidden for shared libraries.
      a2a67cd5
  7. 13 Mar, 2009 1 commit
    • Simon Marlow's avatar
      Use work-stealing for load-balancing in the GC · 4e354226
      Simon Marlow authored
        
      New flag: "+RTS -qb" disables load-balancing in the parallel GC
      (though this is subject to change, I think we will probably want to do
      something more automatic before releasing this).
      
      To get the "PARGC3" configuration described in the "Runtime support
      for Multicore Haskell" paper, use "+RTS -qg0 -qb -RTS".
      
      The main advantage of this is that it allows us to easily disable
      load-balancing altogether, which turns out to be important in parallel
      programs.  Maintaining locality is sometimes more important that
      spreading the work out in parallel GC.  There is a side benefit in
      that the parallel GC should have improved locality even when
      load-balancing, because each processor prefers to take work from its
      own queue before stealing from others.
      4e354226
  8. 12 Jan, 2009 1 commit
    • Simon Marlow's avatar
      Keep the remembered sets local to each thread during parallel GC · 6a405b1e
      Simon Marlow authored
      This turns out to be quite vital for parallel programs:
      
        - The way we discover which threads to traverse is by finding
          dirty threads via the remembered sets (aka mutable lists).
      
        - A dirty thread will be on the remembered set of the capability
          that was running it, and we really want to traverse that thread's
          stack using the GC thread for the capability, because it is in
          that CPU's cache.  If we get this wrong, we get penalised badly by
          the memory system.
      
      Previously we had per-capability mutable lists but they were
      aggregated before GC and traversed by just one of the GC threads.
      This resulted in very poor performance particularly for parallel
      programs with deep stacks.
      
      Now we keep per-capability remembered sets throughout GC, which also
      removes a lock (recordMutableGen_sync).
      6a405b1e
  9. 16 Apr, 2008 3 commits
  10. 13 Dec, 2007 1 commit
  11. 21 Nov, 2007 1 commit
  12. 31 Oct, 2007 1 commit
    • Simon Marlow's avatar
      Refactoring of the GC in preparation for parallel GC · d5bd3e82
      Simon Marlow authored
        
      This patch localises the state of the GC into a gc_thread structure,
      and reorganises the inner loop of the GC to scavenge one block at a
      time from global work lists in each "step".  The gc_thread structure
      has a "workspace" for each step, in which it collects evacuated
      objects until it has a full block to push out to the step's global
      list.  Details of the algorithm will be on the wiki in due course.
      
      At the moment, THREADED_RTS does not compile, but the single-threaded
      GC works (and is 10-20% slower than before).
      d5bd3e82
  13. 26 Oct, 2006 1 commit
  14. 24 Oct, 2006 1 commit
    • Simon Marlow's avatar
      Split GC.c, and move storage manager into sm/ directory · ab0e778c
      Simon Marlow authored
      In preparation for parallel GC, split up the monolithic GC.c file into
      smaller parts.  Also in this patch (and difficult to separate,
      unfortunatley):
        
        - Don't include Stable.h in Rts.h, instead just include it where
          necessary.
        
        - consistently use STATIC_INLINE in source files, and INLINE_HEADER
          in header files.  STATIC_INLINE is now turned off when DEBUG is on,
          to make debugging easier.
        
        - The GC no longer takes the get_roots function as an argument.
          We weren't making use of this generalisation.
      ab0e778c