1. 05 Sep, 2002 1 commit
    • [project @ 2002-09-05 16:26:33 by simonmar] · 8435b2e4
      simonmar authored
      Fix for infinite loop when there is a THUNK_SELECTOR which eventually
      refers to itself, such as might be generated by code like
      
      	let x = (fst x, snd x) in ...
      
      At the same time, I re-enabled the code to traverse multiple selector
      thunks with bounded depth, because I believe it now works.
      
      MERGE TO STABLE (but test thoroughly in the HEAD first, this is
      fragile stuff)
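A minimal sketch of the idea, not the code in GC.c: one way to follow a chain of selector thunks safely is to bound the depth and to mark each thunk while it is being followed, so that a thunk that eventually refers to itself is detected instead of looping forever. The `Closure` type, the `VISITING` mark, and `MAX_SELECTOR_DEPTH` are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified closure; not the RTS's real layout. */
typedef enum { VALUE, SELECTOR, VISITING } Kind;

typedef struct Closure {
    Kind kind;
    struct Closure *selectee;   /* for SELECTOR: the closure selected from */
} Closure;

#define MAX_SELECTOR_DEPTH 8    /* assumed bound, chosen for illustration */

/* Follow a chain of selector thunks.  Returns the VALUE at the end of the
 * chain, or NULL if the depth bound is hit or a selector turns out to
 * refer (eventually) to itself. */
Closure *eval_selector_chain(Closure *c, int depth)
{
    assert(c != NULL);
    if (c->kind == VALUE)            return c;
    if (c->kind == VISITING)         return NULL;  /* cycle: give up */
    if (depth >= MAX_SELECTOR_DEPTH) return NULL;  /* too deep: give up */

    c->kind = VISITING;                  /* mark while in progress */
    Closure *result = eval_selector_chain(c->selectee, depth + 1);
    c->kind = SELECTOR;                  /* restore the original kind */
    return result;
}
```

With this shape, a selector whose `selectee` is itself returns NULL on the second visit rather than recursing without end, and the caller can simply leave the thunk unevaluated.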
  2. 16 Aug, 2002 1 commit
  3. 17 Jul, 2002 1 commit
    • [project @ 2002-07-17 09:21:48 by simonmar] · 7457757f
      simonmar authored
      Remove most #includes of system headers from Stg.h, and instead
      #include any required headers directly in each RTS source file.
      
      The idea is to (a) reduce namespace pollution from system headers that
      we don't need, (b) be clearer about dependencies on system things in
      the RTS, and (c) improve via-C compilation times (maybe).
      
      In practice though, HsBase.h #includes everything anyway, so the
      difference from the point of view of .hc source is minimal.  However,
      this makes it easier to move to zero-includes if we wanted to (see
      discussion on the FFI list; I'm still not sure that's possible but
      at least this is a step in the right direction).
  4. 10 Jul, 2002 1 commit
    • [project @ 2002-07-10 09:28:54 by simonmar] · f477a85c
      simonmar authored
      Fix a GC bug.  In a "large block", only the block descriptor for the
      head of the block has the fields step, gen_no and flags set.  So
      whenever we want one of these fields in the descriptor for a random
      object anywhere in the large block, we have to check whether it is in
      the head block, and if not follow the link to the head block
      descriptor.
      
      evacuate() was doing this correctly, but isAlive() wasn't (and
      goodness knows what other places are broken in this way - I identified
      several other possible cases of the same bug).
      
      So to try to make things more robust, when we allocate a large block
      we now initialise the step, gen_no, and flags fields in the descriptor
      for *every* sub-block, not just the first one.  Now, as long as you
      only want one of these fields from the descriptor, there's no need to
      try to find the block head.  evacuate() gets minutely faster, and
      hopefully multiple obscure bugs are fixed by this.
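The difference between the two schemes can be sketched as follows. This is an illustration under simplifying assumptions, not the RTS's actual block descriptor: `BlockDesc` is a hypothetical type, the `step` field is omitted, a non-NULL `link` is assumed to mean "not the head block", and `-1` stands for "field not set".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down block descriptor. */
typedef struct BlockDesc {
    struct BlockDesc *link;  /* non-head block: points at the head descriptor */
    int      gen_no;
    unsigned flags;
} BlockDesc;

/* Old scheme: only the head descriptor carries gen_no and flags. */
void init_head_only(BlockDesc *bd, size_t n, int gen_no, unsigned flags)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        bd[i].link   = (i == 0) ? NULL   : &bd[0];
        bd[i].gen_no = (i == 0) ? gen_no : -1;   /* -1: not set */
        bd[i].flags  = (i == 0) ? flags  : 0;
    }
}

/* New scheme: every sub-block descriptor carries the fields. */
void init_every_block(BlockDesc *bd, size_t n, int gen_no, unsigned flags)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        bd[i].link   = (i == 0) ? NULL : &bd[0];
        bd[i].gen_no = gen_no;
        bd[i].flags  = flags;
    }
}

/* Correct under either scheme, at the cost of a test and an indirection. */
int gen_no_via_head(const BlockDesc *bd)
{
    if (bd->link != NULL) bd = bd->link;
    return bd->gen_no;
}

/* Only correct under the new scheme; this is the lookup a caller that
 * forgets the head-block check effectively performs. */
int gen_no_direct(const BlockDesc *bd)
{
    return bd->gen_no;
}
```

Under `init_head_only`, `gen_no_direct` on a descriptor in the middle of the block returns the unset value, which is the class of bug described above; under `init_every_block` both lookups agree.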
  5. 23 Apr, 2002 1 commit
    • [project @ 2002-04-23 06:34:26 by sof] · 30a97b4c
      sof authored
      More sched_mutex related cleanup & fixes; little bit
      too delicate for my liking, this.
      
      * Revert lock assumptions for raiseAsync(); now assumed
        to run with sched_mutex held (as it mucks about with a
        TSO).
      * stodgily release / acquire sched_mutex around calls
        to startSignalHandlers() (as is done in Signals.c)
      * in the presence of user-installed signal handlers, the
        MT-enabled RTS failed to shut down -- all queues empty
        causes a lone RTS worker thread to pause() waiting for
        signals. This is not the right thing to do; I (temporarily?)
        disabled this signal-wait fallback in MT mode and shut
        down instead. We need to be clearer as to what is a shutdown
        condition in MT mode.
      
      * The use of sched_mutex to protect next_thread_id increments
        is causing headaches; disabled in non-SMP mode right now until
        I've figured out the pthreads-correct way of doing atomic
        increments.
      
      * There's still a ^C-related problem which causes the Haskell
        handler to sometimes induce a SEGV when run. Feel free to debug :)
  6. 19 Apr, 2002 1 commit
  7. 13 Apr, 2002 1 commit
  8. 12 Mar, 2002 1 commit
    • [project @ 2002-03-12 11:50:02 by simonmar] · f762be1b
      simonmar authored
      Main threads are now not kept alive artificially, so it is possible
      for a main thread to be sent the BlockedOnDeadMVar exception.  Main
      threads are no longer GC roots.
      
      This involved cleaning up the weak pointer processing somewhat, and
      separating the processing of real weak pointers from the processing of
      the all_threads list (which can be thought of as "weaker pointers": a
      finalizer can keep a blocked thread alive, but not vice-versa).  The
      new story is described in a detailed comment in GC.c.
      
      One interesting consequence is that it's much harder to get a Deadlock
      exception now - many deadlock situations involving main threads will
      turn into BlockedOnDeadMVar situations instead.  For example, if there
      are a group of threads in a circular deadlock, then they will all be
      sent BlockedOnDeadMVar simultaneously, whereas before if one of them
      was the main thread it would be sent Deadlock.  It's really hard to
      get Deadlock now - you have to somehow keep an MVar independently
      reachable, eg. by using a StablePtr.
  9. 07 Mar, 2002 1 commit
  10. 18 Feb, 2002 1 commit
    • [project @ 2002-02-18 13:26:12 by sof] · 6e2ea06c
      sof authored
      Be clear about the lock assumptions of GarbageCollect(); it
      is now required to hold sched_mutex.
      
      The real reason for adding this requirement is so that,
      prior to scheduling finalizers and doing thread resurrection,
      GarbageCollect() may set the lock status of sched_mutex to
      the state expected by scheduleFinalizers() and resurrectThreads()
      (i.e., unlocked).
      
      Note: this is only an issue with pthreads. In the Win32 threading
      model, it's a NOP for a thread to grab a mutex it already holds.
  11. 28 Nov, 2001 1 commit
  12. 26 Nov, 2001 1 commit
    • [project @ 2001-11-26 16:54:21 by simonmar] · dbef766c
      simonmar authored
      Profiling cleanup.
      
      This commit eliminates some duplication in the various heap profiling
      subsystems, and generally centralises much of the machinery.  The key
      concept is the separation of a heap *census* (which is now done in one
      place only instead of three) from the calculation of retainer sets.
      Previously the retainer profiling code also did a heap census on the
      fly, and lag-drag-void profiling had its own census machinery.
      
      Value-adds:
      
         - you can now restrict a heap profile to certain retainer sets,
           but still display by cost centre (or type, or closure or
           whatever).
      
         - I've added an option to restrict the maximum retainer set size
           (+RTS -R<size>, defaulting to 8).
      
         - I've cleaned up the heap profiling options at the request of
           Simon PJ.  See the help text for details.  The new scheme
           is backwards compatible with the old.
      
         - I've removed some odd bits of LDV or retainer profiling-specific
           code from various parts of the system.
      
         - the time taken doing heap censuses (and retainer set calculation)
           is now accurately reported by the RTS when you say +RTS -Sstderr.
      
      Still to come:
      
         - restricting a profile to a particular biography
           (lag/drag/void/use).  This requires keeping old heap censuses
           around, but the infrastructure is now in place to do this.
  13. 22 Nov, 2001 1 commit
    • [project @ 2001-11-22 14:25:11 by simonmar] · db61851c
      simonmar authored
      Retainer Profiling / Lag-drag-void profiling.
      
      This is mostly work by Sungwoo Park, who spent a summer internship at
      MSR Cambridge this year implementing these two types of heap profiling
      in GHC.
      
      Relative to Sungwoo's original work, I've made some improvements to
      the code:
      
         - it's now possible to apply constraints to retainer and LDV profiles
           in the same way as we do for other types of heap profile (eg.
            +RTS -hc{foo,bar} -hR -RTS gives you a retainer profile considering
           only closures with cost centres 'foo' and 'bar').
      
         - the heap-profile timer implementation is cleaned up.
      
         - heap profiling no longer has to be run in a two-space heap.
      
         - general cleanup of the code and application of the SDM C coding
           style guidelines.
      
      Profiling will be a little slower and require more space than before,
      mainly because closures have an extra header word to support either
      retainer profiling or LDV profiling (you can't do both at the same
      time).
      
      We've used the new profiling tools on GHC itself, with moderate
      success.  Fixes for some space leaks in GHC to follow...
  14. 08 Nov, 2001 1 commit
    • [project @ 2001-11-08 12:46:31 by simonmar] · 0671ef05
      simonmar authored
      Fix the large block allocation bug (Yay!)
      -----------------------------------------
      
      In order to do this, I had to
      
       1. in each heap-check failure branch, return the amount of heap
          actually requested, in a known location (I added another slot
          in StgRegTable called HpAlloc for this purpose).  This is
          useful for other reasons - in particular it makes it possible
          to get accurate allocation statistics.
      
       2. In the scheduler, if a heap check fails and we wanted more than
          BLOCK_SIZE_W words, then allocate a special large block and place
          it in the nursery.  The nursery now has to be double-linked so
          we can insert the new block in the middle.
      
       3. The garbage collector has to be able to deal with multiple objects
          in a large block.  It turns out that this isn't a problem as long as
          the large blocks only occur in the nursery, because we always copy
          objects from the nursery during GC.  One small change had to be
          made: in evacuate(), we may need to follow the link field from the
          block descriptor to get to the block descriptor for the head of a
          large block.
      
       4. Various other parts of the storage manager had to be modified
          to cope with a nursery containing a mixture of block sizes.
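Step 2 can be pictured with a small sketch, assuming a hypothetical `Block` node type and an illustrative `BLOCK_SIZE_W` of 1024 words (the real value depends on the platform and is not stated here). It shows why a doubly-linked nursery makes inserting a large block after the current one a constant-time operation.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE_W 1024   /* assumed words per block, for illustration */

/* Hypothetical nursery node; the real descriptor has more fields. */
typedef struct Block {
    struct Block *prev, *next;
    size_t n_blocks;        /* 1 for an ordinary block, >1 for a large one */
} Block;

/* Number of contiguous blocks needed to satisfy a request of `words`. */
size_t blocks_needed(size_t words)
{
    return (words + BLOCK_SIZE_W - 1) / BLOCK_SIZE_W;
}

/* Insert `nb` immediately after `cur` in a doubly-linked nursery. */
void nursery_insert_after(Block *cur, Block *nb)
{
    assert(cur != NULL && nb != NULL);
    nb->prev = cur;
    nb->next = cur->next;
    if (cur->next != NULL) cur->next->prev = nb;
    cur->next = nb;
}
```

A scheduler following this scheme would compare the requested size (the `HpAlloc` value from step 1) against `BLOCK_SIZE_W`, allocate `blocks_needed(words)` contiguous blocks when it is larger, and splice the result in next to the current nursery block.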
      
      Point (3) causes a slight pessimization in the garbage collector.  I
      don't see a way to avoid this.  Point (1) causes some code bloat (a
      rough measurement is around 5%), so to offset this I made the
      following change which I'd been meaning to do for some time:
      
        - Store the values of some commonly-used absolute addresses
          (eg. stg_update_PAP) in the register table.  This lets us use
          shorter instruction forms for some absolute jumps and saves some
          code space.
      
        - The type of Capability is no longer the same as an StgRegTable.
          MainRegTable renamed to MainCapability.  See Regs.h for details.
      
      Other minor changes:
      
        - remove individual declarations for the heap-check-failure jump
          points, and declare them all in StgMiscClosures.h instead.  Remove
          HeapStackCheck.h.
      
      Updates to the native code generator to follow.
  15. 19 Oct, 2001 1 commit
    • [project @ 2001-10-19 09:41:11 by sewardj] · dd33e044
      sewardj authored
      merge from stable revs:
        1.121.4.1 +7 -6      fptools/ghc/rts/GC.c
        1.9.4.1   +4 -1      fptools/ghc/rts/GCCompact.c
        1.17.4.1  +4 -3      fptools/ghc/rts/StoragePriv.h
      
        SimonM's fixes to deal with GHCi and CAFs properly in the compacting
        collector.
  16. 17 Oct, 2001 1 commit
  17. 01 Oct, 2001 1 commit
    • [project @ 2001-10-01 10:52:36 by simonmar] · 3a7e4d98
      simonmar authored
      Fix a bug in the heap size calculation, where a negative result wasn't
      noticed because we're working with unsigned types.  We now explicitly
      test that the heap has enough room for the minimum allocation area
      size, otherwise a heap overflow is reported.
  18. 30 Aug, 2001 1 commit
  19. 17 Aug, 2001 1 commit
  20. 14 Aug, 2001 1 commit
    • [project @ 2001-08-14 13:40:07 by sewardj] · bc5c8021
      sewardj authored
      Change the story about POSIX headers in C compilation.
      
      Until now, all C code in the RTS and library cbits has by default been
      compiled with settings for POSIXness enabled, that is:
         #define _POSIX_SOURCE   1
         #define _POSIX_C_SOURCE 199309L
         #define _ISOC9X_SOURCE
      If you wanted to negate this, you'd have to define NON_POSIX_SOURCE
      before including headers.
      
      This scheme has some bad effects:
      
      * It means that ccall-unfoldings exported via interfaces from a
        module compiled with -DNON_POSIX_SOURCE may not compile when
        imported into a module which does not -DNON_POSIX_SOURCE.
      
      * It overlaps with the feature tests we do with autoconf.
      
      * It seems to have caused borkage in the Solaris builds for some
        considerable period of time.
      
      The New Way is:
      
      * The default changes to not-being-in-Posix mode.
      
      * If you want to force a C file into Posix mode, #include as
        the **first** include the new file ghc/includes/PosixSource.h.
        Most of the RTS C sources have this include now.
      
      * NON_POSIX_SOURCE is almost totally expunged.  Unfortunately
        we have to retain some vestiges of it in ghc/compiler so that
        modules compiled via C on Solaris using older compilers don't
        break.
  21. 10 Aug, 2001 1 commit
  22. 08 Aug, 2001 3 commits
    • [project @ 2001-08-08 14:14:08 by simonmar] · f3d40a6e
      simonmar authored
      Flag tweaks: +RTS -c now means "enable compaction all the time"
      (previously there was no way to do this *and* run without a maximum
      heap size).
      
      The heuristics for determining the generation sizes are also slightly
      better now.
    • [project @ 2001-08-08 13:45:02 by simonmar] · dfebb20f
      simonmar authored
      wibble
    • [project @ 2001-08-08 10:50:36 by simonmar] · 52c07834
      simonmar authored
      Had a brainwave on the way to work this morning, and realised that the
      garbage collector can handle "pinned objects" as long as they don't
      contain any pointers.
      
      This is absolutely ideal for doing temporary allocation in the FFI,
      because what we really want to do is allocate a pinned ByteArray and
      let the GC clean it up later.  So this set of changes adds the
      required framework.
      
      There are two new primops:
      
       newPinnedByteArray# :: Int# -> State# s -> (# State# s, MutByteArr# s #)
       byteArrayContents#  :: ByteArr# -> Addr#
      
      obviously byteArrayContents# is highly unsafe.
      
      Allocating a pinned ByteArr# isn't the default, because a pinned
      ByteArr# will hold an entire block (currently 4k) live until it is
      garbage collected (that doesn't mean each pinned ByteArr# requires
      4k of storage, just that if a block contains a single live pinned
      ByteArray, the whole block must be retained).
  23. 07 Aug, 2001 2 commits
    • [project @ 2001-08-07 10:49:49 by simonmar] · 8553e558
      simonmar authored
      (forced commit)
      
      Note that the previous commit also fixed the bug reported by Ken Shan
      yesterday, namely that the conc004.hs test was failing.
    • [project @ 2001-08-07 09:20:52 by simonmar] · 433cdcad
      simonmar authored
      - Allow RTS options to be given using the GHCRTS environment variable.
      
      - Fix the heap size calculation to take into account all generations.
        It's more conservative than it used to be, but now it is less likely
        that the maximum heap size will be exceeded.
      
      - Compacting collection is turned on automatically when residency
        reaches 30% of the maximum heap size, tunable with +RTS -c<n>.
        +RTS -c turns off compaction altogether.
      
      - The maximum heap size is off by default.  NOTE: this also means no
        compaction by default.  It is recommended that people enable a maximum
        heap size for their system using the GHCRTS environment var; eg:
        GHCRTS=-M128m.
  24. 04 Aug, 2001 1 commit
  25. 02 Aug, 2001 1 commit
  26. 30 Jul, 2001 1 commit
    • [project @ 2001-07-30 12:54:12 by simonmar] · a5381464
      simonmar authored
      - Bugfix: mark the weak pointer list before GC, instead of the
        (strange) old mechanism which involved "cleaning it up" after GC.
      
      - size the old generation properly when doing compacting GC.
  27. 26 Jul, 2001 1 commit
    • [project @ 2001-07-26 14:29:26 by simonmar] · bc51f1af
      simonmar authored
      Fall back to doing a linear scan of the old generation when the mark
      stack fills up.
      
      The compacting collector should work for all programs now, but there's
      still some work to do on the speed of the collector - don't expect
      programs to go any faster :)
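The fallback can be sketched on a toy heap. This is an illustration of the general technique (a bounded mark stack with an overflow flag and a linear rescan), not the compacting collector's code; `Obj`, `MARK_STACK_SIZE`, and the function names are hypothetical, and the stack is made deliberately tiny so that it overflows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MARK_STACK_SIZE 2   /* deliberately tiny, to force overflow */
#define MAX_KIDS 3

/* Hypothetical toy object: a mark bit, a "children visited" bit, pointers. */
typedef struct Obj {
    bool marked;
    bool scanned;
    struct Obj *kids[MAX_KIDS];
} Obj;

Obj   *mark_stack[MARK_STACK_SIZE];
size_t mark_sp = 0;
bool   mark_stack_overflowed = false;

/* Mark an object; push it for scanning if there is room, otherwise
 * remember that some marked object still has unvisited children. */
void mark(Obj *o)
{
    if (o == NULL || o->marked) return;
    o->marked = true;
    if (mark_sp < MARK_STACK_SIZE) mark_stack[mark_sp++] = o;
    else mark_stack_overflowed = true;
}

void scan(Obj *o)
{
    o->scanned = true;
    for (size_t i = 0; i < MAX_KIDS; i++) mark(o->kids[i]);
}

void drain(void)
{
    while (mark_sp > 0) scan(mark_stack[--mark_sp]);
}

/* Mark everything reachable from root.  Whenever the stack overflowed,
 * fall back to a linear scan of the heap for marked-but-unscanned objects. */
void mark_from_root(Obj *root, Obj *heap, size_t n)
{
    mark(root);
    drain();
    while (mark_stack_overflowed) {
        mark_stack_overflowed = false;
        for (size_t i = 0; i < n; i++) {
            if (heap[i].marked && !heap[i].scanned) {
                scan(&heap[i]);
                drain();
            }
        }
    }
}
```

The linear scan costs a pass over the whole heap each time it runs, which is why keeping the mark stack from filling up in the first place is still worthwhile.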
  28. 25 Jul, 2001 2 commits
    • [project @ 2001-07-25 12:18:26 by simonmar] · f7341243
      simonmar authored
      - move the call to scavenge_mark_stack() inside the inner scavenge loop,
        so it gets called more often and there's less chance of the mark
        stack filling up.  None of the nofib tests cause the mark stack to
        fill up now.
      
      - remove some experimental code from scavenge_mark_stack().
    • [project @ 2001-07-25 09:14:21 by simonmar] · e16e9973
      simonmar authored
      - bugfix (was erroneously ignoring the return value from scavenge_one() and
        checking failed_to_evac instead, which was always false)
      
      - printf format fix (shut gcc up).
  29. 24 Jul, 2001 3 commits
    • [project @ 2001-07-24 16:36:43 by simonmar] · 76a51a41
      simonmar authored
      Bugfixes; take large objects into account in stats output.
    • [project @ 2001-07-24 06:31:35 by ken] · d888cbcb
      ken authored
      Innocent changes to resurrect/add 64-bit support.
    • [project @ 2001-07-24 05:04:58 by ken] · 030787e5
      ken authored
      Removed 32-bit dependencies in the generation and handling of
      liveness mask bitmaps.  We now support both 32-bit and 64-bit
      machines with identical .hc files.  Support for >64-bit machines
      would be easy to add.  Note that old .hc files are incompatible
      with the changes made to ghc/include/InfoMacros.h!
  30. 23 Jul, 2001 2 commits
    • [project @ 2001-07-23 17:23:19 by simonmar] · dfd7d6d0
      simonmar authored
      Add a compacting garbage collector.
      
      It isn't enabled by default, as there are still a couple of problems:
      there's a fallback case I haven't implemented yet which means it will
      occasionally bomb out, and speed-wise it's quite a bit slower than the
      copying collector (about 1.8x slower).
      
      Until I can make it go faster, it'll only be useful when you're
      actually running low on real memory.
      
      '+RTS -c' to enable it.
      
      Oh, and I cleaned up a few things in the RTS while I was there, and
      fixed one or two possibly real bugs in the existing GC.
    • [project @ 2001-07-23 10:47:16 by simonmar] · 6f83fbc0
      simonmar authored
      Small changes to improve GC performance slightly:
      
        - store the generation *number* in the block descriptor rather
          than a pointer to the generation structure, since the most
          common operation is to pull out the generation number, and
          it's one less indirection this way.
      
        - cache the generation number in the step structure too, which
          avoids an extra indirection in several places.
  31. 03 Apr, 2001 1 commit
  32. 02 Apr, 2001 1 commit
  33. 22 Mar, 2001 1 commit
    • [project @ 2001-03-22 03:51:08 by hwloidl] · 20fc2f0c
      hwloidl authored
      -*- outline -*-
      Time-stamp: <Thu Mar 22 2001 03:50:16 Stardate: [-30]6365.79 hwloidl>
      
      This commit covers changes in GHC to get GUM (way=mp) and GUM/GdH (way=md)
      working. It is a merge of my working version of GUM, based on GHC 4.06,
      with GHC 4.11. Almost all changes are in the RTS (see below).
      
      GUM is reasonably stable, we used the 4.06 version in large-ish programs for
      recent papers. Couple of things I want to change, but nothing urgent.
      GUM/GdH has just been merged and needs more testing. Hope to do that in the
      next weeks. It works in our working build but needs tweaking to run.
      GranSim doesn't work yet (*sigh*). Most of the code should be in, but needs
      more debugging.
      
      ToDo: I still want to make the following minor modifications before the release
      - Better wrapper script for parallel execution [ghc/compiler/main]
      - Update parallel documentation: started on it but it's minimal [ghc/docs/users_guide]
      - Clean up [nofib/parallel]: it's a real mess right now (*sigh*)
      - Update visualisation tools (minor things only IIRC) [ghc/utils/parallel]
      - Add a Klingon-English glossary
      
      * RTS:
      
      Almost all changes are restricted to ghc/rts/parallel and should not
      interfere with the rest. I only comment on changes outside the parallel
      dir:
      
      - Several changes in Schedule.c (scheduling loop; createThreads etc);
        should only affect parallel code
      - Added ghc/rts/hooks/ShutdownEachPEHook.c
      - ghc/rts/Linker.[ch]: GUM doesn't know about Stable Names (ifdefs)!!
      - StgMiscClosures.h: END_TSO_QUEUE etc now defined here (from StgMiscClosures.hc)
                           END_ECAF_LIST was missing a leading stg_
      - SchedAPI.h: taskStart now defined in here; it's only a wrapper around
                    scheduleThread now, but might use some init, shutdown later
      - RtsAPI.h: I have nuked the def of rts_evalNothing
      
      * Compiler:
      
      - ghc/compiler/main/DriverState.hs
        added PVM-ish flags to the parallel way
        added new ways for parallel ticky profiling and distributed exec
      
      - ghc/compiler/main/DriverPipeline.hs
        added a function run_phase_MoveBinary which is called with way=mp after linking;
        it moves the bin file into a PVM dir and produces a wrapper script for
        parallel execution
        maybe cleaner to add a MoveBinary phase in DriverPhases.hs but this way
        it's less intrusive and MoveBinary probably only makes sense for mp anyway
      
      * Nofib:
      
      - nofib/spectral/Makefile, nofib/real/Makefile, ghc/tests/programs/Makefile:
        modified to skip some tests if HWL_NOFIB_HACK is set; only tmp to record
        which test prgs cause problems in my working build right now