1. 13 Feb, 2002 1 commit
    • [project @ 2002-02-13 08:48:06 by sof] · e289780e
      sof authored
      Revised implementation of multi-threaded callouts (and callins):
      
      - unified synchronisation story for threaded and SMP builds,
        following up on SimonM's suggestion. The following synchro
        variables are now used inside the Scheduler:
      
          + thread_ready_cond - condition variable that is signalled
            when a Haskell thread has become runnable (via the THREAD_RUNNABLE()
            macro) and there are available capabilities. Waited on:
               + upon schedule() entry (iff no caps. available).
               + when a thread inside of the Scheduler spots that there
                 are no runnable threads to service, but one or more
                 external calls are in progress.
               + in resumeThread(), waiting for a capability to become
                 available.
      
            Prior to waiting on thread_ready_cond, a counter rts_n_waiting_tasks
            is incremented, so that we can keep track of the number of
            readily available worker threads (need this in order to make
            an informed decision on whether or not to create a new thread
            when an external call is made).
      
      
          + returning_worker_cond - condition variable that is waited
            on by an OS thread that has finished executing an external
            call & now wants to feed its result back to the Haskell thread
            that made the call. Before doing so, the counter
            rts_n_returning_workers is incremented.
      
            Upon entry to the Scheduler, this counter is checked, and
            if it is non-zero, the thread gives up its capability and
            signals returning_worker_cond before trying to re-grab a
            capability. (releaseCapability() takes care of this.)
      
          + sched_mutex - protect Scheduler data structures.
          + gc_pending_cond - SMP-only condition variable for signalling
            completion of GCs.
      
      - initial implementation of call-ins, i.e., multiple OS threads
        may concurrently call into the RTS without interfering with
        each other. Implementation uses a cheesy locking protocol to
        ensure that only one OS thread at a time can construct a
        function application -- stop-gap measure until the RtsAPI
        is revised (as discussed last month) *and* a designated
        block is used for allocating these applications.
      
      - In the implementation of call-ins, the OS thread blocks
        waiting for an RTS worker thread to complete the evaluation
        of the function application. Since main() also uses the
        RtsAPI, provide a separate entry point for it (rts_mainEvalIO()),
        which avoids creating a separate thread to evaluate Main.main;
        that can be done by the thread exec'ing main() directly.
        [Maybe there's a tidier way of doing this, a bit ugly the
        way it is now..]
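
The waiting protocol described for thread_ready_cond and rts_n_waiting_tasks can be sketched with pthreads roughly as follows. This is a minimal illustration under assumptions, not the RTS code: n_runnable, wait_for_work(), thread_runnable() and need_new_worker() are hypothetical names invented here, and capabilities are left out entirely.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the scheduler state named in the commit message. */
static pthread_mutex_t sched_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t thread_ready_cond = PTHREAD_COND_INITIALIZER;
static int rts_n_waiting_tasks = 0;  /* workers currently parked on thread_ready_cond */
static int n_runnable = 0;           /* assumed: number of runnable Haskell threads */

/* Block until a Haskell thread is runnable, then claim it. */
static void wait_for_work(void)
{
    pthread_mutex_lock(&sched_mutex);
    while (n_runnable == 0) {
        /* Count ourselves as idle for the duration of the wait. */
        rts_n_waiting_tasks++;
        pthread_cond_wait(&thread_ready_cond, &sched_mutex);
        rts_n_waiting_tasks--;
    }
    n_runnable--;
    pthread_mutex_unlock(&sched_mutex);
}

/* Make one Haskell thread runnable and wake a parked worker, if any. */
static void thread_runnable(void)
{
    pthread_mutex_lock(&sched_mutex);
    n_runnable++;
    pthread_cond_signal(&thread_ready_cond);
    pthread_mutex_unlock(&sched_mutex);
}

/* Before an external call: is a fresh worker needed to keep things moving? */
static int need_new_worker(void)
{
    pthread_mutex_lock(&sched_mutex);
    int need = (rts_n_waiting_tasks == 0);
    pthread_mutex_unlock(&sched_mutex);
    return need;
}
```

The counter is only touched while sched_mutex is held, so the decision in need_new_worker() sees a consistent count of idle workers.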
      
      
      There are a couple of dark corners that need to be looked at,
      such as conditions for shutting down (and how) + consider what
      ought to happen when async I/O is thrown into the mix (I know
      what will happen, but that's maybe not what we want).
      
      Other than that, things are in a generally happy state & I hope
      to declare myself done before the week is up.
  2. 12 Feb, 2002 1 commit
    • [project @ 2002-02-12 15:38:08 by sof] · a62d5cd2
      sof authored
      Snapshot (before heading into work):
      - thread_ready_aux_mutex is no more; use sched_mutex instead.
      - gc_pending_cond only used in SMP mode.
      - document the condition that thread_ready_cond captures.
  3. 08 Feb, 2002 1 commit
  4. 07 Feb, 2002 1 commit
  5. 06 Feb, 2002 1 commit
    • [project @ 2002-02-06 01:29:27 by sof] · 91fd2101
      sof authored
      - use task manager API to keep track of the number
        of tasks that are blocked waiting on the RTS lock.
      - comment updates/additions.
  6. 05 Feb, 2002 1 commit
    • [project @ 2002-02-05 10:06:24 by simonmar] · 84a29fdd
      simonmar authored
      Fix bad bugs in deleteAllThreads: we were looping through the thread
      queues calling deleteThread() on each thread as we go, but calling
      deleteThread() has the side effect of removing the thread from the
      relevant queue, so we would end up breaking out of the loop after
      processing only a single thread.
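
A minimal sketch of this class of bug and one common remedy, assuming a singly linked queue: read the successor before the delete call unlinks the node. The Thread type and the delete_thread()/delete_all_threads() helpers below are simplified, hypothetical stand-ins, not the RTS definitions, and the actual fix in the commit may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified thread object; not the RTS's TSO. */
typedef struct Thread {
    struct Thread *link;  /* next thread on the queue */
    int deleted;
} Thread;

/* Unlink t from the queue and mark it deleted. Note the side effect:
 * t->link is cleared, so a loop that reads t->link afterwards stops early. */
static void delete_thread(Thread **queue, Thread *t)
{
    Thread **p = queue;
    while (*p != NULL && *p != t)
        p = &(*p)->link;
    if (*p == t)
        *p = t->link;
    t->link = NULL;
    t->deleted = 1;
}

/* Delete every thread on the queue; returns how many were processed. */
static int delete_all_threads(Thread **queue)
{
    int n = 0;
    Thread *t = *queue;
    while (t != NULL) {
        Thread *next = t->link;  /* saved before the unlink happens */
        delete_thread(queue, t);
        n++;
        t = next;
    }
    return n;
}
```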
      
      This may fix problems like "resurrectThreads: thread blocked in a
      strange way" seen after pressing ^C.
      
      Aside: we really shouldn't be using deleteThread() at all, since it
      doesn't give the thread a chance to clean up & release locks.  To be
      well-behaved a program has to catch ^C itself at the moment.
  7. 04 Feb, 2002 2 commits
    • [project @ 2002-02-04 20:56:53 by sof] · 632827e0
      sof authored
      resumeThread: ifdef threads-specific code
    • [project @ 2002-02-04 20:40:36 by sof] · be72dc05
      sof authored
      Snapshot of 'native thread'-friendly extension:
      - call-outs now work, i.e., a Concurrent Haskell thread which
        makes an external (C) call no longer stops other CH threads
        dead in their tracks. [More testing and tightening up of
        invariants reqd, this is just a snapshot].
      - separated task handling into sep. module.
  8. 31 Jan, 2002 1 commit
    • [project @ 2002-01-31 11:18:06 by sof] · 3b9c5eb2
      sof authored
      First steps towards implementing better interop between
      Concurrent Haskell and native threads.
      
      - factored out Capability handling into a separate source file
        (only the SMP build uses multiple capabilities tho).
      - factored out OS/native threads handling into a separate
        source file, OSThreads.{c,h}. Currently, just a pthreads-based
        implementation; Win32 version to follow.
      - scheduler code now distinguishes between multi-task threaded
        code (SMP) and single-task threaded code ('threaded RTS'),
        but sharing code between these two modes whenever poss.
      
      i.e., just a first snapshot; the bulk of the transitioning code
      remains to be implemented.
  9. 24 Jan, 2002 2 commits
  10. 22 Jan, 2002 1 commit
    • [project @ 2002-01-22 13:54:22 by simonmar] · 33a7aa8b
      simonmar authored
      Deadlock is now an exception instead of a return status from
      rts_evalIO().
      
      The current behaviour is as follows, and can be changed if necessary:
      in the event of a deadlock, the top main thread is taken from the main
      thread queue, and if it is blocked on an MVar or an Exception (for
      throwTo), then it receives a Deadlock exception.  If it is blocked on
      a BLACKHOLE, we instead send it the NonTermination exception.  Note
      that only the main thread gets the exception: it is the responsibility
      of the main thread to unblock other threads if necessary.
      
      There's a slight difference in the SMP build: *all* the main threads
      get an exception, because clearly none of them may make progress
      (compared to the non-SMP situation, where all but the top main thread
      are usually blocked).
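
The choice of exception described above amounts to a small mapping from the reason a main thread is blocked to the exception it receives. A sketch, with hypothetical enum names and a hypothetical exception_for_deadlock() helper chosen purely for illustration:

```c
#include <assert.h>

/* Hypothetical, simplified block reasons; not the RTS's own constants. */
typedef enum {
    BLOCKED_ON_MVAR,
    BLOCKED_ON_EXCEPTION,  /* blocked delivering an exception via throwTo */
    BLOCKED_ON_BLACKHOLE,
    BLOCKED_ON_OTHER
} BlockReason;

typedef enum { EXC_NONE, EXC_DEADLOCK, EXC_NON_TERMINATION } DeadlockExc;

/* Which exception a main thread receives when a deadlock is detected. */
static DeadlockExc exception_for_deadlock(BlockReason why)
{
    switch (why) {
    case BLOCKED_ON_MVAR:
    case BLOCKED_ON_EXCEPTION:
        return EXC_DEADLOCK;
    case BLOCKED_ON_BLACKHOLE:
        return EXC_NON_TERMINATION;
    default:
        return EXC_NONE;  /* assumed: other block reasons get no exception */
    }
}
```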
  11. 18 Dec, 2001 1 commit
  12. 07 Dec, 2001 1 commit
    • [project @ 2001-12-07 20:57:53 by sof] · f9a21ddc
      sof authored
      - tidy up TICK_ALLOC_TSO() uses.
      - scheduleThread: remove special-case for END_TSO_QUEUE. If you want
        to call schedule(), do so directly. (Only one of the scheduleThread()
        call sites depended on this feature).
  13. 26 Nov, 2001 1 commit
    • [project @ 2001-11-26 16:54:21 by simonmar] · dbef766c
      simonmar authored
      Profiling cleanup.
      
      This commit eliminates some duplication in the various heap profiling
      subsystems, and generally centralises much of the machinery.  The key
      concept is the separation of a heap *census* (which is now done in one
      place only instead of three) from the calculation of retainer sets.
      Previously the retainer profiling code also did a heap census on the
      fly, and lag-drag-void profiling had its own census machinery.
      
      Value-adds:
      
         - you can now restrict a heap profile to certain retainer sets,
           but still display by cost centre (or type, or closure or
           whatever).
      
         - I've added an option to restrict the maximum retainer set size
           (+RTS -R<size>, defaulting to 8).
      
         - I've cleaned up the heap profiling options at the request of
           Simon PJ.  See the help text for details.  The new scheme
           is backwards compatible with the old.
      
         - I've removed some odd bits of LDV or retainer profiling-specific
           code from various parts of the system.
      
         - the time taken doing heap censuses (and retainer set calculation)
           is now accurately reported by the RTS when you say +RTS -Sstderr.
      
      Still to come:
      
         - restricting a profile to a particular biography
           (lag/drag/void/use).  This requires keeping old heap censuses
           around, but the infrastructure is now in place to do this.
  14. 22 Nov, 2001 1 commit
    • [project @ 2001-11-22 14:25:11 by simonmar] · db61851c
      simonmar authored
      Retainer Profiling / Lag-drag-void profiling.
      
      This is mostly work by Sungwoo Park, who spent a summer internship at
      MSR Cambridge this year implementing these two types of heap profiling
      in GHC.
      
      Relative to Sungwoo's original work, I've made some improvements to
      the code:
      
         - it's now possible to apply constraints to retainer and LDV profiles
           in the same way as we do for other types of heap profile (e.g.
           +RTS -hc{foo,bar} -hR -RTS gives you a retainer profile considering
           only closures with cost centres 'foo' and 'bar').
      
         - the heap-profile timer implementation is cleaned up.
      
         - heap profiling no longer has to be run in a two-space heap.
      
         - general cleanup of the code and application of the SDM C coding
           style guidelines.
      
      Profiling will be a little slower and require more space than before,
      mainly because closures have an extra header word to support either
      retainer profiling or LDV profiling (you can't do both at the same
      time).
      
      We've used the new profiling tools on GHC itself, with moderate
      success.  Fixes for some space leaks in GHC to follow...
  15. 08 Nov, 2001 2 commits
    • [project @ 2001-11-08 16:17:35 by simonmar] · c094c3ad
      simonmar authored
      Revert resumeThread and suspendThread to working with StgRegTable
      rather than Capability, and do the conversion in the functions
      themselves rather than in the inline code.  This means I don't have to
      fiddle with the NCG to fix the SUSPEND_THREAD/RESUME_THREAD macros.
    • [project @ 2001-11-08 12:46:31 by simonmar] · 0671ef05
      simonmar authored
      Fix the large block allocation bug (Yay!)
      -----------------------------------------
      
      In order to do this, I had to
      
       1. in each heap-check failure branch, return the amount of heap
          actually requested, in a known location (I added another slot
          in StgRegTable called HpAlloc for this purpose).  This is
          useful for other reasons - in particular it makes it possible
          to get accurate allocation statistics.
      
       2. In the scheduler, if a heap check fails and we wanted more than
          BLOCK_SIZE_W words, then allocate a special large block and place
          it in the nursery.  The nursery now has to be double-linked so
          we can insert the new block in the middle.
      
       3. The garbage collector has to be able to deal with multiple objects
          in a large block.  It turns out that this isn't a problem as long as
          the large blocks only occur in the nursery, because we always copy
          objects from the nursery during GC.  One small change had to be
          made: in evacuate(), we may need to follow the link field from the
          block descriptor to get to the block descriptor for the head of a
          large block.
      
       4. Various other parts of the storage manager had to be modified
          to cope with a nursery containing a mixture of block sizes.
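
Steps 1 and 2 can be sketched as follows: round the requested words up to whole blocks, then splice the new block into a doubly linked nursery. The Block type, its field names, the value chosen for BLOCK_SIZE_W, and both helper functions are assumptions made for illustration, not the storage manager's definitions.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE_W 1024u  /* assumed block size in words, for illustration only */

/* Hypothetical, simplified block descriptor. */
typedef struct Block {
    struct Block *link;  /* next block in the nursery */
    struct Block *back;  /* previous block; needed to insert in the middle */
    unsigned blocks;     /* number of contiguous blocks this descriptor covers */
} Block;

/* Round a request (as reported in HpAlloc) up to whole blocks. */
static unsigned blocks_needed(unsigned hp_alloc_words)
{
    return (hp_alloc_words + BLOCK_SIZE_W - 1) / BLOCK_SIZE_W;
}

/* Splice fresh into the nursery directly after cur. */
static void insert_after(Block *cur, Block *fresh)
{
    fresh->back = cur;
    fresh->link = cur->link;
    if (cur->link != NULL)
        cur->link->back = fresh;
    cur->link = fresh;
}
```

With only forward links, splicing is still possible after a known block, but removing or inserting before a block requires a walk from the head, which is presumably why the back pointer was added.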
      
      Point (3) causes a slight pessimization in the garbage collector.  I
      don't see a way to avoid this.  Point (1) causes some code bloat (a
      rough measurement is around 5%), so to offset this I made the
      following change which I'd been meaning to do for some time:
      
        - Store the values of some commonly-used absolute addresses
          (eg. stg_update_PAP) in the register table.  This lets us use
          shorter instruction forms for some absolute jumps and saves some
          code space.
      
        - The type of Capability is no longer the same as an StgRegTable.
          MainRegTable renamed to MainCapability.  See Regs.h for details.
      
      Other minor changes:
      
        - remove individual declarations for the heap-check-failure jump
          points, and declare them all in StgMiscClosures.h instead.  Remove
          HeapStackCheck.h.
      
      Updates to the native code generator to follow.
  16. 31 Oct, 2001 1 commit
    • [project @ 2001-10-31 10:34:29 by simonmar] · 760f104f
      simonmar authored
      Fix a problem when a Haskell process is suspended/resumed using shell
      job control in Unix.  The shell tends to put stdin back into blocking
      mode before resuming the process, so we have to catch SIGCONT and put
      it back into O_NONBLOCK.
      
      Also:
      
        - fix a bug in the scheduler: reverse the order of the check
          for pending signals and the call to awaitEvent to block on I/O.
      
        - do a style sweep in Signals.c
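
The SIGCONT fix can be sketched with plain POSIX calls: install a handler that puts O_NONBLOCK back on the descriptor each time the process is resumed. This is an illustrative sketch only; watched_fd, set_nonblocking() and install_sigcont_handler() are hypothetical names, and the RTS's own handler in Signals.c may be structured differently.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static int watched_fd = STDIN_FILENO;  /* hypothetical: descriptor to keep non-blocking */

/* Set O_NONBLOCK on fd, preserving its other status flags. Returns 0 on success. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* SIGCONT handler: the shell may have reset the descriptor to blocking mode. */
static void on_sigcont(int sig)
{
    int saved_errno = errno;  /* do not disturb the interrupted code's errno */
    (void)sig;
    (void)set_nonblocking(watched_fd);  /* fcntl() is async-signal-safe */
    errno = saved_errno;
}

/* Make fd non-blocking now and again after every resume. Returns 0 on success. */
static int install_sigcont_handler(int fd)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigcont;
    sigemptyset(&sa.sa_mask);
    watched_fd = fd;
    if (set_nonblocking(fd) == -1)
        return -1;
    return sigaction(SIGCONT, &sa, NULL);
}
```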
  17. 27 Oct, 2001 1 commit
  18. 23 Oct, 2001 2 commits
  19. 14 Aug, 2001 1 commit
    • [project @ 2001-08-14 13:40:07 by sewardj] · bc5c8021
      sewardj authored
      Change the story about POSIX headers in C compilation.
      
      Until now, all C code in the RTS and library cbits has by default been
      compiled with settings for POSIXness enabled, that is:
         #define _POSIX_SOURCE   1
         #define _POSIX_C_SOURCE 199309L
         #define _ISOC9X_SOURCE
      If you wanted to negate this, you'd have to define NON_POSIX_SOURCE
      before including headers.
      
      This scheme has some bad effects:
      
      * It means that ccall-unfoldings exported via interfaces from a
        module compiled with -DNON_POSIX_SOURCE may not compile when
        imported into a module which does not -DNON_POSIX_SOURCE.
      
      * It overlaps with the feature tests we do with autoconf.
      
      * It seems to have caused borkage in the Solaris builds for some
        considerable period of time.
      
      The New Way is:
      
      * The default changes to not-being-in-Posix mode.
      
      * If you want to force a C file into Posix mode, #include as
        the **first** include the new file ghc/includes/PosixSource.h.
        Most of the RTS C sources have this include now.
      
      * NON_POSIX_SOURCE is almost totally expunged.  Unfortunately
        we have to retain some vestiges of it in ghc/compiler so that
        modules compiled via C on Solaris using older compilers don't
        break.
  20. 30 Jul, 2001 1 commit
  21. 24 Jul, 2001 1 commit
  22. 23 Jul, 2001 1 commit
    • [project @ 2001-07-23 17:23:19 by simonmar] · dfd7d6d0
      simonmar authored
      Add a compacting garbage collector.
      
      It isn't enabled by default, as there are still a couple of problems:
      there's a fallback case I haven't implemented yet which means it will
      occasionally bomb out, and speed-wise it's quite a bit slower than the
      copying collector (about 1.8x slower).
      
      Until I can make it go faster, it'll only be useful when you're
      actually running low on real memory.
      
      '+RTS -c' to enable it.
      
      Oh, and I cleaned up a few things in the RTS while I was there, and
      fixed one or two possibly real bugs in the existing GC.
  23. 04 Jun, 2001 1 commit
  24. 23 Mar, 2001 1 commit
    • [project @ 2001-03-23 16:36:20 by simonmar] · 50027272
      simonmar authored
      Changes to support bootstrapping the compiler from .hc files.  It's
      not quite working yet, but it's not far off.
      
        - the biggest change is that any injected #includes are now placed in
          the .hc file at generation time, rather than compilation time.  I
          can't see any reason not to do this - it makes it clear by looking at
          the .hc file which files are being #included, it means one less
          temporary file at compilation time, and it means the .hc file is more
          standalone.
      
        - all the gruesomeness is in mk/bootstrap.mk, which handles building
          .hc files without a ghc driver.
  25. 22 Mar, 2001 1 commit
    • [project @ 2001-03-22 03:51:08 by hwloidl] · 20fc2f0c
      hwloidl authored
      -*- outline -*-
      Time-stamp: <Thu Mar 22 2001 03:50:16 Stardate: [-30]6365.79 hwloidl>
      
      This commit covers changes in GHC to get GUM (way=mp) and GUM/GdH (way=md)
      working. It is a merge of my working version of GUM, based on GHC 4.06,
      with GHC 4.11. Almost all changes are in the RTS (see below).
      
      GUM is reasonably stable, we used the 4.06 version in large-ish programs for
      recent papers. Couple of things I want to change, but nothing urgent.
      GUM/GdH has just been merged and needs more testing. Hope to do that in the
      next weeks. It works in our working build but needs tweaking to run.
      GranSim doesn't work yet (*sigh*). Most of the code should be in, but needs
      more debugging.
      
      ToDo: I still want to make the following minor modifications before the release
      - Better wrapper script for parallel execution [ghc/compiler/main]
      - Update parallel docu: started on it but it's minimal [ghc/docs/users_guide]
      - Clean up [nofib/parallel]: it's a real mess right now (*sigh*)
      - Update visualisation tools (minor things only IIRC) [ghc/utils/parallel]
      - Add a Klingon-English glossary
      
      * RTS:
      
      Almost all changes are restricted to ghc/rts/parallel and should not
      interfere with the rest. I only comment on changes outside the parallel
      dir:
      
      - Several changes in Schedule.c (scheduling loop; createThreads etc);
        should only affect parallel code
      - Added ghc/rts/hooks/ShutdownEachPEHook.c
      - ghc/rts/Linker.[ch]: GUM doesn't know about Stable Names (ifdefs)!!
      - StgMiscClosures.h: END_TSO_QUEUE etc now defined here (from StgMiscClosures.hc)
                           END_ECAF_LIST was missing a leading stg_
      - SchedAPI.h: taskStart now defined in here; it's only a wrapper around
                    scheduleThread now, but might use some init, shutdown later
      - RtsAPI.h: I have nuked the def of rts_evalNothing
      
      * Compiler:
      
      - ghc/compiler/main/DriverState.hs
        added PVM-ish flags to the parallel way
        added new ways for parallel ticky profiling and distributed exec
      
      - ghc/compiler/main/DriverPipeline.hs
        added a fct run_phase_MoveBinary which is called with way=mp after linking;
        it moves the bin file into a PVM dir and produces a wrapper script for
        parallel execution
        maybe cleaner to add a MoveBinary phase in DriverPhases.hs but this way
        it's less intrusive and MoveBinary makes probably only sense for mp anyway
      
      * Nofib:
      
      - nofib/spectral/Makefile, nofib/real/Makefile, ghc/tests/programs/Makefile:
        modified to skip some tests if HWL_NOFIB_HACK is set; only tmp to record
        which test prgs cause problems in my working build right now
  26. 02 Mar, 2001 2 commits
    • [project @ 2001-03-02 16:15:53 by simonmar] · 435b1086
      simonmar authored
      ASSERT in updateWithIndirection() that we haven't already updated this
      object with an indirection, and fix two places in the RTS where this
      could happen.
      
      The problem only occurs when we're in a black-hole-style loop, and
      there are multiple update frames on the stack pointing to the same
      object (this is possible because of lazy black-holing).  Both stack
      squeezing and asynchronous exception raising walk down the stack and
      remove update frames, updating their contents with indirections.  If
      we don't protect against multiple updates, the mutable list in the old
      generation may get into a bogus state.
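
A sketch of the guard, assuming a much simplified closure representation: the update function asserts that its target is not already an indirection, and the stack walker skips objects that an earlier update frame has already dealt with. The types, the mutable_list_len counter and both functions are hypothetical; the sketch also assumes that every update records the object on the mutable list, which is how a double update would corrupt it.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { THUNK, IND } ClosureKind;  /* hypothetical, simplified */

typedef struct Closure {
    ClosureKind kind;
    struct Closure *indirectee;
} Closure;

static int mutable_list_len = 0;  /* stand-in for the old generation's mutable list */

/* Overwrite c with an indirection to value; must happen at most once per object. */
static void update_with_indirection(Closure *c, Closure *value)
{
    assert(c->kind != IND);  /* the invariant the commit adds as an ASSERT */
    c->kind = IND;
    c->indirectee = value;
    mutable_list_len++;      /* assumed: each update records the object once */
}

/* Walk the updatees of a run of update frames, several of which may share
 * one object because of lazy black-holing, and update each object once. */
static void update_frames(Closure **updatees, size_t n, Closure *value)
{
    for (size_t i = 0; i < n; i++) {
        if (updatees[i]->kind != IND)
            update_with_indirection(updatees[i], value);
    }
}
```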
    • [project @ 2001-03-02 14:25:04 by simonmar] · 2efbfc25
      simonmar authored
      A good bug: detectBlackHoles wasn't checking for ThreadRelocated,
      which is why we sometimes get "no threads to run: infinite loop or
      deadlock?" when we should get a NonTermination exception.
      
      To be merged into the 4.08 branch.
  27. 12 Feb, 2001 1 commit
  28. 11 Feb, 2001 1 commit
    • [project @ 2001-02-11 17:51:07 by simonmar] · 6d35596c
      simonmar authored
      Bite the bullet and make GHCi support non-optional in the RTS.  GHC
      4.11 should be able to build GHCi without any additional tweaks now.
      
      - the Linker is split into two parts: LinkerBasic.c, containing the
        routines required by the rest of the RTS, and Linker.c, containing
        the linker proper, which is not referred to from the rest of the RTS.
        Only Linker.c requires -ldl, so programs which don't make use of the
        linker (everything except GHC, in other words) won't need -ldl.
  29. 09 Feb, 2001 1 commit
  30. 31 Jan, 2001 1 commit
  31. 24 Jan, 2001 1 commit
    • [project @ 2001-01-24 15:46:19 by simonmar] · 43b212f5
      simonmar authored
      Add a CAF list for GHCi.
      
      Retaining all looked-up symbols in a list in the interpreter was the
      Wrong Thing To Do, since we can't guarantee that the transitive
      closure of this list points to all the CAFs so far evaluated (the
      transitive closure gets smaller as reachable CAFs are evaluated).
      
      A Better Thing To Do is just to retain all the CAFs.  A refinement is
      to only retain all CAFs in dynamically linked code, which is what this
      patch implements.
  32. 16 Jan, 2001 1 commit
  33. 19 Dec, 2000 1 commit
  34. 14 Dec, 2000 1 commit
  35. 04 Dec, 2000 1 commit