  1. 26 Mar, 2002 1 commit
    • [project @ 2002-03-26 10:43:15 by simonmar] · 4b5f32d7
      simonmar authored
      A couple of cleanups to the previous change: we should test
      TABLES_NEXT_TO_CODE rather than USE_MINIINTERPRETER to enable the
      MacOSX "plan C", and use structure field selection rather than array
      indexing to get the entry code ptr from the info table.
  2. 21 Mar, 2002 1 commit
    • [project @ 2002-03-21 11:23:59 by sebc] · d182db3a
      sebc authored
      Implement Plan C, with correct code to detect the data and text
      sections for MacOS X.
      Also add a sanity check in initStorage, to make sure we are able to
      make the distinction between closures and infotables.
  3. 14 Feb, 2002 1 commit
  4. 04 Feb, 2002 1 commit
  5. 01 Feb, 2002 1 commit
    • [project @ 2002-02-01 10:50:35 by simonmar] · b8684d58
      simonmar authored
      When distinguishing between code & data pointers, rather than testing
      for membership of the text section, test for non-membership of the
      data sections.
      
      The reason for this change is that testing for membership of the text
      section was fragile:  we could only test whether a value was smaller
      than the end address, because there doesn't appear to be a portable
      way to find the beginning of the text section.  Indeed, the test
      breaks on very recent Linux kernels which mmap() memory below the
      program text.
      
      In fact, the reversed test may be faster because the expected common
      case is when the pointer is into the dynamic heap, and we eliminate
      these cases immediately in the new test.  A quick test shows no
      measurable performance difference with the change.
      
      MERGE TO STABLE
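
      The reversed test described above can be sketched roughly as follows.
      This is a minimal illustration, not the RTS code: the Section type, the
      explicit heap bounds and the function names are all hypothetical, and
      real section boundaries would come from linker-provided symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one data section as a half-open range. */
typedef struct {
    uintptr_t start;
    uintptr_t end;
} Section;

/* True if addr lies in [start, end) of any of the listed data sections. */
static bool in_data_sections(uintptr_t addr, const Section *secs, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        if (addr >= secs[i].start && addr < secs[i].end) {
            return true;
        }
    }
    return false;
}

/* Reversed test: a pointer is taken to be a code pointer when it is
 * neither in the dynamic heap nor in a data section.  The heap check
 * comes first because that is the expected common case. */
static bool looks_like_code(uintptr_t addr,
                            uintptr_t heap_start, uintptr_t heap_end,
                            const Section *secs, int n)
{
    if (addr >= heap_start && addr < heap_end) {
        return false;
    }
    return !in_data_sections(addr, secs, n);
}
```

      Note that nothing here needs the start address of the text section,
      which is the value the commit message says cannot be found portably.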
  6. 25 Jan, 2002 1 commit
  7. 22 Nov, 2001 1 commit
    • [project @ 2001-11-22 14:25:11 by simonmar] · db61851c
      simonmar authored
      Retainer Profiling / Lag-drag-void profiling.
      
      This is mostly work by Sungwoo Park, who spent a summer internship at
      MSR Cambridge this year implementing these two types of heap profiling
      in GHC.
      
      Relative to Sungwoo's original work, I've made some improvements to
      the code:
      
         - it's now possible to apply constraints to retainer and LDV profiles
           in the same way as we do for other types of heap profile (e.g.
           +RTS -hc{foo,bar} -hR -RTS gives you a retainer profile considering
           only closures with cost centres 'foo' and 'bar').
      
         - the heap-profile timer implementation is cleaned up.
      
         - heap profiling no longer has to be run in a two-space heap.
      
         - general cleanup of the code and application of the SDM C coding
           style guidelines.
      
      Profiling will be a little slower and require more space than before,
      mainly because closures have an extra header word to support either
      retainer profiling or LDV profiling (you can't do both at the same
      time).
      
      We've used the new profiling tools on GHC itself, with moderate
      success.  Fixes for some space leaks in GHC to follow...
  8. 08 Aug, 2001 1 commit
    • [project @ 2001-08-08 10:50:36 by simonmar] · 52c07834
      simonmar authored
      Had a brainwave on the way to work this morning, and realised that the
      garbage collector can handle "pinned objects" as long as they don't
      contain any pointers.
      
      This is absolutely ideal for doing temporary allocation in the FFI,
      because what we really want to do is allocate a pinned ByteArray and
      let the GC clean it up later.  So this set of changes adds the
      required framework.
      
      There are two new primops:
      
       newPinnedByteArray# :: Int# -> State# s -> (# State# s, MutByteArr# s #)
       byteArrayContents#  :: ByteArr# -> Addr#
      
      Obviously, byteArrayContents# is highly unsafe.
      
      Allocating a pinned ByteArr# isn't the default, because a pinned
      ByteArr# will hold an entire block (currently 4k) live until it is
      garbage collected (that doesn't mean each pinned ByteArr# requires
      4k of storage, just that if a block contains a single live pinned
      ByteArray, the whole block must be retained).
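
      The block-retention cost can be illustrated with a small model.  This is
      only a sketch under simplifying assumptions: the PinnedObj type and
      retained_blocks() are hypothetical, objects are identified by a byte
      offset into a pinned area, and no object spans a block boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BLOCK_SIZE 4096u  /* block size quoted in the commit message */
#define MAX_BLOCKS 64u    /* arbitrary limit for this sketch */

/* Hypothetical record of one pinned object. */
typedef struct {
    size_t offset;  /* byte offset from the start of the pinned area */
    bool   live;    /* still reachable at this GC? */
} PinnedObj;

/* Count the blocks that must be kept: a block is retained as soon as
 * one live pinned object starts inside it, however small that object is. */
static size_t retained_blocks(const PinnedObj *objs, size_t n)
{
    bool keep[MAX_BLOCKS] = { false };
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        size_t blk = objs[i].offset / BLOCK_SIZE;
        assert(blk < MAX_BLOCKS);
        if (objs[i].live && !keep[blk]) {
            keep[blk] = true;
            count++;
        }
    }
    return count;
}
```

      Two live objects sharing a block keep only that one block alive, while
      a single small live object in an otherwise dead block still pins the
      full 4k.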
  9. 24 Jul, 2001 1 commit
  10. 23 Jul, 2001 2 commits
    • [project @ 2001-07-23 17:23:19 by simonmar] · dfd7d6d0
      simonmar authored
      Add a compacting garbage collector.
      
      It isn't enabled by default, as there are still a couple of problems:
      there's a fallback case I haven't implemented yet which means it will
      occasionally bomb out, and speed-wise it's quite a bit slower than the
      copying collector (about 1.8x slower).
      
      Until I can make it go faster, it'll only be useful when you're
      actually running low on real memory.
      
      '+RTS -c' to enable it.
      
      Oh, and I cleaned up a few things in the RTS while I was there, and
      fixed one or two possibly real bugs in the existing GC.
    • [project @ 2001-07-23 10:47:16 by simonmar] · 6f83fbc0
      simonmar authored
      Small changes to improve GC performance slightly:
      
        - store the generation *number* in the block descriptor rather
          than a pointer to the generation structure, since the most
          common operation is to pull out the generation number, and
          it's one less indirection this way.
      
        - cache the generation number in the step structure too, which
          avoids an extra indirection in several places.
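
      The saved indirection can be seen in a stripped-down comparison of the
      two layouts.  The struct and function names below are hypothetical
      stand-ins, not the RTS definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generation structure; only the number matters here. */
typedef struct {
    unsigned no;
} Generation;

/* Old layout: the descriptor points at the generation structure. */
typedef struct {
    const Generation *gen;
} BlockDescrOld;

/* New layout: the descriptor stores the generation number directly. */
typedef struct {
    unsigned gen_no;
} BlockDescrNew;

/* Two dependent loads: the descriptor, then the generation it points to. */
static unsigned gen_no_old(const BlockDescrOld *bd)
{
    assert(bd->gen != NULL);
    return bd->gen->no;
}

/* One load: the number is read straight out of the descriptor. */
static unsigned gen_no_new(const BlockDescrNew *bd)
{
    return bd->gen_no;
}
```

      The trade-off in this sketch is that code needing the full generation
      structure must now look it up from the number instead of following a
      pointer.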
  11. 03 May, 2001 1 commit
  12. 02 Mar, 2001 2 commits
    • [project @ 2001-03-02 16:15:53 by simonmar] · 435b1086
      simonmar authored
      ASSERT in updateWithIndirection() that we haven't already updated this
      object with an indirection, and fix two places in the RTS where this
      could happen.
      
      The problem only occurs when we're in a black-hole-style loop, and
      there are multiple update frames on the stack pointing to the same
      object (this is possible because of lazy black-holing).  Both stack
      squeezing and asynchronous exception raising walk down the stack and
      remove update frames, updating their contents with indirections.  If
      we don't protect against multiple updates, the mutable list in the old
      generation may get into a bogus state.
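
      A toy model of the guard is sketched below.  The Closure and UpdateFrame
      types and both functions are hypothetical, and skipping frames whose
      updatee is already an indirection is an assumption about the shape of
      the fix, not a copy of it.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { CONSTR, THUNK, BLACKHOLE, IND } ClosureType;

/* Hypothetical, heavily simplified closure. */
typedef struct Closure {
    ClosureType     type;
    struct Closure *indirectee;  /* only meaningful when type == IND */
} Closure;

/* Hypothetical update frame: just remembers which closure to update. */
typedef struct {
    Closure *updatee;
} UpdateFrame;

/* Overwrite updatee with an indirection to target, asserting that it
 * has not been updated already and does not point at itself. */
static void update_with_indirection(Closure *updatee, Closure *target)
{
    assert(updatee != target);
    assert(updatee->type != IND);
    updatee->type = IND;
    updatee->indirectee = target;
}

/* Walk the frames as stack squeezing would.  Several frames may share
 * one updatee (lazy black-holing), so already-updated ones are skipped.
 * Returns the number of updates actually performed. */
static int squeeze_frames(const UpdateFrame *frames, int n, Closure *target)
{
    int updated = 0;
    for (int i = 0; i < n; i++) {
        if (frames[i].updatee->type == IND) {
            continue;  /* a previous frame already updated this object */
        }
        update_with_indirection(frames[i].updatee, target);
        updated++;
    }
    return updated;
}
```

      With three frames pointing at the same black hole, only the first one
      performs an update, so the object is recorded as updated exactly once.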
    • [project @ 2001-03-02 14:36:16 by simonmar] · ffaa2614
      simonmar authored
      Add some ASSERT()s so we can catch updates where updatee==target.
  13. 11 Feb, 2001 1 commit
    • [project @ 2001-02-11 17:51:07 by simonmar] · 6d35596c
      simonmar authored
      Bite the bullet and make GHCi support non-optional in the RTS.  GHC
      4.11 should be able to build GHCi without any additional tweaks now.
      
      - the Linker is split into two parts: LinkerBasic.c, containing the
        routines required by the rest of the RTS, and Linker.c, containing
        the linker proper, which is not referred to from the rest of the RTS.
        Only Linker.c requires -ldl, so programs which don't make use of the
        linker (everything except GHC, in other words) won't need -ldl.
  14. 09 Feb, 2001 2 commits
  15. 08 Feb, 2001 1 commit
  16. 29 Jan, 2001 1 commit
  17. 26 Jan, 2001 2 commits
  18. 24 Jan, 2001 1 commit
    • [project @ 2001-01-24 15:46:19 by simonmar] · 43b212f5
      simonmar authored
      Add a CAF list for GHCI.
      
      Retaining all looked-up symbols in a list in the interpreter was the
      Wrong Thing To Do, since we can't guarantee that the transitive
      closure of this list points to all the CAFs so far evaluated (the
      transitive closure gets smaller as reachable CAFs are evaluated).
      
      A Better Thing To Do is just to retain all the CAFs.  A refinement is
      to only retain all CAFs in dynamically linked code, which is what this
      patch implements.
  19. 09 Jan, 2001 1 commit
  20. 19 Dec, 2000 1 commit
  21. 11 Dec, 2000 1 commit
  22. 04 Dec, 2000 1 commit
  23. 13 Nov, 2000 1 commit
  24. 14 Apr, 2000 1 commit
    • [project @ 2000-04-14 15:18:05 by sewardj] · 9ff75d08
      sewardj authored
      Clean up the runtime heap before deleting modules (and, currently, after
      every evaluation) so that the combined system can safely throw away
      modules and info tables without creating dangling refs from the heap.
  25. 11 Apr, 2000 1 commit
    • [project @ 2000-04-11 16:36:53 by sewardj] · d5087432
      sewardj authored
      Ensure that when Hugs decides to unload a module (nukeModule()), there are
      no closures anywhere in the system which refer to infotables defined
      in that module.  That means reverting all CAFs and doing a major GC
      prior to deleting the module.  A flag is used to avoid redundant GCs.
  26. 13 Jan, 2000 1 commit
    • [project @ 2000-01-13 14:33:57 by hwloidl] · 1b28d4e1
      hwloidl authored
      Merged GUM-4-04 branch into the main trunk. In particular merged GUM and
      SMP code. Most of the GranSim code in GUM-4-04 still has to be carried over.
  27. 11 Nov, 1999 2 commits
  28. 09 Nov, 1999 1 commit
    • [project @ 1999-11-09 15:46:49 by simonmar] · 30681e79
      simonmar authored
      A slew of SMP-related changes.
      
       - New locking scheme for thunks: we now check whether the thunk
         being entered is in our private allocation area, and if so
         we don't lock it.  Well, that's the upshot.  In practice it's
         a lot more fiddly than that.
      
       - I/O blocking is handled a bit more sanely now (but still not
         properly, methinks)
      
       - deadlock detection is back
      
       - remove old pre-SMP scheduler code
      
       - revamp the timing code.  We actually get reasonable-looking
         timing info for SMP programs now.
      
       - fix a bug in the garbage collector to do with IND_OLDGENs appearing
         on the mutable list of the old generation.
      
       - move BDescr() function from rts/BlockAlloc.h to includes/Block.h.
      
       - move struct generation and struct step into includes/StgStorage.h (sigh)
      
       - add UPD_IND_NOLOCK for updating with an indirection where locking
         the black hole is not required.
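
      The idea behind the new thunk locking scheme in the first item can be
      sketched as a plain address-range test.  The AllocArea type and
      thunk_needs_lock() are hypothetical, and, as the message says, the real
      scheme is a lot more fiddly than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-thread allocation area, as a half-open address range. */
typedef struct {
    uintptr_t start;
    uintptr_t end;
} AllocArea;

/* Assumption of this sketch: a thunk that still lives in our own private
 * allocation area cannot be entered by another thread, so entering it
 * needs no lock.  Any thunk outside that range must be locked. */
static bool thunk_needs_lock(uintptr_t thunk, const AllocArea *mine)
{
    assert(mine->start <= mine->end);
    return !(thunk >= mine->start && thunk < mine->end);
}
```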
  29. 02 Nov, 1999 1 commit
    • [project @ 1999-11-02 15:05:38 by simonmar] · f6692611
      simonmar authored
      This commit adds in the current state of our SMP support.  Notably,
      this allows the new way 's' to be built, providing support for running
      multiple Haskell threads simultaneously on top of any pthreads
      implementation, the idea being to take advantage of commodity SMP
      boxes.
      
      Don't expect to get much of a speedup yet; due to the excessive
      locking required to synchronise access to mutable heap objects, you'll
      see a slowdown in most cases, even on a UP machine.  The best I've
      seen is a 1.6-1.7 speedup on an example that did no locking (two
      optimised nfibs in parallel).
      
      	- new RTS -N flag specifies how many pthreads to start.
      
      	- new driver -smp flag, tells the driver to use way 's'.
      
      	- new compiler -fsmp option (not for user consumption)
      	  tells the compiler not to generate direct jumps to
      	  thunk entry code.
      
      	- largely rewritten scheduler
      
      	- _ccall_GC is now done by handing back a "token" to the
      	  RTS before executing the ccall; it should now be possible
      	  to execute blocking ccalls in the current thread while
      	  allowing the RTS to continue running Haskell threads as
      	  normal.
      
      	- you can only call thread-safe C libraries from a way 's'
      	  build, of course.
      
      Pthread support is still incomplete, and weird things (including
      deadlocks) are likely to happen.
  30. 11 May, 1999 1 commit
    • [project @ 1999-05-11 16:47:39 by keithw] · eb407ca1
      keithw authored
      (this is number 9 of 9 commits to be applied together)
      
        Usage verification changes / ticky-ticky changes:
      
        We want to verify that SingleEntry thunks are indeed entered at most
        once.  In order to do this, -ticky / -DTICKY_TICKY turns on eager
        blackholing.  We blackhole with new blackholes: SE_BLACKHOLE and
        SE_CAF_BLACKHOLE.  We will enter one of these if we attempt to enter
        a SingleEntry thunk twice.  Note that CAFs are dealt with by
        codeGen, and ordinary thunks by the RTS.
      
        We also want to see how many times we enter each Updatable thunk.
        To this end, we have modified -ticky.  When -ticky is on, we update
        with a permanent indirection, and arrange that when we enter a
        permanent indirection we count the entry and then convert the
        indirection to a normal indirection.  This gives us a means of
        counting the number of thunks entered again after the first entry.
        Obviously this screws up profiling, and so you can't build a ticky
        and profiling compiler any more.
      
        Also a few other changes that didn't make it into the previous 8
        commits, but form a part of this set.
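
        The counting trick for updatable thunks can be modelled in a few
        lines.  This is a sketch only: the Ind and Ticky types and enter_ind()
        are hypothetical, and a plain int stands in for the value the
        indirection points to.

```c
#include <assert.h>

typedef enum { IND_NORMAL, IND_PERM } IndKind;

/* Hypothetical indirection left behind after a thunk has been updated. */
typedef struct {
    IndKind kind;
    int     value;  /* stand-in for the updated value */
} Ind;

/* Hypothetical ticky counter block. */
typedef struct {
    unsigned long reentered;  /* thunks entered again after being updated */
} Ticky;

/* Enter an indirection.  A permanent indirection is counted once and
 * then demoted to a normal one, so later entries are not counted again. */
static int enter_ind(Ind *ind, Ticky *ticky)
{
    if (ind->kind == IND_PERM) {
        ticky->reentered++;
        ind->kind = IND_NORMAL;
    }
    return ind->value;
}
```

        Because the indirection is demoted on its first entry, the counter
        records how many thunks were re-entered, not the total number of
        re-entries.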
  31. 18 Mar, 1999 1 commit
  32. 05 Feb, 1999 1 commit
  33. 02 Feb, 1999 1 commit
    • [project @ 1999-02-02 14:21:28 by simonm] · bf739c10
      simonm authored
      - Add ticky counter for total bytes copied during GC.
      - Separate mutable list into two lists, a "mut once" list for
        old generation indirections and MUT_CONS cells, and a "mut many"
        list for mutable arrays, TSOs etc.  Objects on the "mut once" list
        will be eagerly promoted.
  34. 21 Jan, 1999 1 commit
  35. 18 Jan, 1999 1 commit
    • [project @ 1999-01-18 15:21:37 by simonm] · c5a9b776
      simonm authored
      - BLACKHOLE_BQ is a mutable object, because new threads get added to
        its blocking_queue field.  Hence add a mut_link field and treat it
        as mutable in the garbage collector.
      
      - Change StgBlackHole to StgBlockingQueue while I'm at it.
      
      - Optimise evacuation of black holes: don't copy the padding
        words, just skip over them.
      
      - Several garbage collection fixes.
      
      - Improve sanity checking: now the older generations are fully checked
        at each GC.