1. 27 Jan, 2003 1 commit
  2. 25 Jan, 2003 1 commit
    • [project @ 2003-01-25 15:54:48 by wolfgang] · af136096
      This commit fixes many bugs and limitations in the threaded RTS.
      There are still some issues remaining, though.
      
      The following bugs should have been fixed:
      
      - [+] "safe" calls could cause crashes
      - [+] yieldToReturningWorker/grabReturnCapability
          - It used to deadlock.
      - [+] couldn't wake blocked workers
          - Calls into the RTS could go unanswered for a long time, and
            that includes ordinary callbacks in some circumstances.
      - [+] couldn't block on an MVar and expect to be woken up by a signal
            handler
          - Depending on the exact situation, the RTS shut down or
            blocked forever and ignored the signal.
      - [+] The locking scheme in RtsAPI.c didn't work
      - [+] run_thread label in the wrong place (schedule())
      - [+] Deadlock in GHC.Handle
          - If a signal arrived at the wrong time, an MVar was never
            filled again.
      - [+] Signals delivered to the "wrong" thread were ignored or handled
            too late.
      
      Issues:
      *) If GC can move TSO objects (I don't know - can it?), then GHCi
      will occasionally crash when calling foreign functions, because the
      parameters are stored on the TSO stack.
      
      *) There is still a race condition lurking in the code
      (both threaded and non-threaded RTS are affected):
      If a signal arrives after the check for pending signals in
      schedule(), but before the call to select() in awaitEvent(),
      select() will be called anyway. The signal handler will be
      executed much later than expected.
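      The window can be demonstrated in miniature with plain POSIX calls.
      This toy is not RTS code, only an illustration of why the handler
      runs late: the sleep() stands in for the gap between the
      pending-signals check and the select() in awaitEvent(), and SIGALRM
      stands in for an arbitrary signal.

      ```c
      #include <assert.h>
      #include <signal.h>
      #include <stdio.h>
      #include <sys/select.h>
      #include <unistd.h>

      static volatile sig_atomic_t pending = 0;

      static void handler(int signo) { (void)signo; pending = 1; }

      int main(void)
      {
          struct sigaction sa;
          sa.sa_handler = handler;
          sigemptyset(&sa.sa_mask);
          sa.sa_flags = 0;
          sigaction(SIGALRM, &sa, NULL);

          alarm(1);        /* stands in for "a signal arrives" */

          if (pending) {   /* the pending-signals check in schedule() */
              /* would run the signal handler's work here */
          }

          sleep(2);        /* widened race window: SIGALRM lands in here */

          /* select() blocks for its whole timeout even though pending == 1,
             because the signal was delivered before select() began and so
             cannot interrupt it. */
          struct timeval tv = { 2, 0 };
          select(0, NULL, NULL, NULL, &tv);

          printf("woke with pending=%d\n", (int)pending);
          assert(pending == 1);
          return 0;
      }
      ```

      The standard remedies are to keep the signal blocked except inside a
      pselect() call, or to have the handler write to a self-pipe that
      select() watches; the commit leaves the issue open.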
      
      *) For Win32, GHC doesn't yet support non-blocking IO, so while a
      thread is waiting for IO, no call-ins can happen. If the RTS is
      blocked in awaitEvent, it uses a polling loop on Win32, so call-ins
      should work (although the polling loop looks ugly).
      
      *) Deadlock detection is disabled for the threaded rts, because I
      don't know how to do it properly in the presence of foreign call-ins
      from foreign threads.
      This causes the tests conc031, conc033 and conc034 to fail.
      
      *) "safe" is currently treated as "threadsafe". Implementing "safe" in
      a way that blocks other Haskell threads is more difficult than was
      thought at first. I think it could be done with a few additional lines
      of code, but personally, I'm strongly in favour of abolishing the
      distinction.
      
      *) Running finalizers at program termination is inefficient - there
      are two OS threads passing messages back and forth for every finalizer
      that is run. Also (just as in the non-threaded case) the finalizers
      are run in parallel to any remaining Haskell threads and to any
      foreign call-ins that might still happen.
  3. 11 Dec, 2002 1 commit
    • [project @ 2002-12-11 15:36:20 by simonmar] · 0bffc410
      Merge the eval-apply-branch on to the HEAD
      ------------------------------------------
      
      This is a change to GHC's evaluation model in order to ultimately make
      GHC more portable and to reduce complexity in some areas.
      
      At some point we'll update the commentary to describe the new state of
      the RTS.  Pending that, the highlights of this change are:
      
        - No more Su.  The Su register is gone, update frames are one
          word smaller.
      
        - Slow-entry points and arg checks are gone.  Unknown function calls
          are handled by automatically-generated RTS entry points (AutoApply.hc,
          generated by the program in utils/genapply).
      
        - The stack layout is stricter: there are no "pending arguments" on
          the stack any more, the stack is always strictly a sequence of
          stack frames.
      
          This means that there's no need for LOOKS_LIKE_GHC_INFO() or
          LOOKS_LIKE_STATIC_CLOSURE() any more, and GHC doesn't need to know
          how to find the boundary between the text and data segments (BIG WIN!).
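    The "strictly a sequence of stack frames" point is what removes the
    need for heuristics: a stack walker can trust that the word at the
    stack pointer is always an info pointer. A toy illustration (the
    types below are invented stand-ins, not the real InfoTables.h
    layout):

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Invented stand-in -- NOT the real InfoTables.h layout.  The only
       point being made: each frame begins with an info pointer that says
       how big the frame is, so walking the stack needs no guessing. */
    typedef struct { size_t frame_words; const char *name; } InfoTable;

    static const InfoTable ret_info    = { 2, "RET_SMALL" };
    static const InfoTable update_info = { 2, "UPDATE_FRAME" };
    static const InfoTable stop_info   = { 1, "STOP_FRAME" };
    static const int payloadA = 0xAA, payloadB = 0xBB;

    int main(void)
    {
        /* A stack that is strictly a sequence of frames: info pointer
           first, payload after, no pending arguments interleaved. */
        const void *stack[] = {
            &ret_info,    &payloadA,   /* frame 0: info + one payload word */
            &update_info, &payloadB,   /* frame 1 */
            &stop_info                 /* frame 2: bottom of stack */
        };
        size_t sp = 0, nframes = 0;
        for (;;) {
            const InfoTable *info = stack[sp];   /* always an info pointer */
            printf("frame %zu: %s\n", nframes++, info->name);
            if (info == &stop_info)
                break;
            sp += info->frame_words;
        }
        assert(nframes == 3);
        return 0;
    }
    ```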
      
        - A couple of nasty hacks in the mangler caused by the need to
          identify closure ptrs vs. info tables have gone away.
      
        - Info tables are a bit more complicated.  See InfoTables.h for the
          details.
      
        - As a side effect, GHCi can now deal with polymorphic seq.  Some bugs
          in GHCi which affected primitives and unboxed tuples are now
          fixed.
      
        - Binary sizes are reduced by about 7% on x86.  Performance is roughly
          similar: some programs get faster while others get slower.  I've seen
          GHCi perform worse on some examples, but haven't investigated
          further yet (GHCi performance *should* be about the same or better
          in theory).
      
        - Internally the code generator is rather better organised.  I've moved
          info-table generation from the NCG into the main codeGen where it is
          shared with the C back-end; info tables are now emitted as arrays
          of words in both back-ends.  The NCG is one step closer to being able
          to support profiling.
      
      This has all been fairly thoroughly tested, but no doubt I've messed
      up the commit in some way.
  4. 19 Aug, 2002 1 commit
  5. 16 Aug, 2002 1 commit
  6. 23 Apr, 2002 1 commit
  7. 13 Apr, 2002 1 commit
    • [project @ 2002-04-13 05:25:38 by sof] · b53a7735
      yieldToReturningWorker(): once yielded to a returning worker,
      thread now directly waits for capability to become available
      again (via waitForWorkCapability()) -- simplifies Cap. grabbing
      'logic' in the Scheduler.
      
      grabReturnCapability(): assume pMutex is held upon entry.
  8. 15 Feb, 2002 2 commits
    • [project @ 2002-02-15 21:07:19 by sof] · d031626e
      comments only
    • [project @ 2002-02-15 07:50:36 by sof] · 6d7576ef
      Tighten up the Scheduler synchronisation story some more:
      
      - moved thread_ready_cond + the counter rts_n_waiting_tasks
        to Capability.c, leaving only sched_mutex as a synchro
        variable in the Scheduler (the less stuff that inhabits
        Schedule.c, the better, methinks.)
      - upon entry to the Scheduler, a worker thread will now call
        Capability.yieldToReturningWorker() to check whether it
        needs to give up its capability.
      - Worker threads that are either idle or lack a capability
        will now call Capability.waitForWorkCapability() and block.
  9. 14 Feb, 2002 3 commits
    • [project @ 2002-02-14 18:20:37 by sof] · f7e5d55c
      more comments
    • [project @ 2002-02-14 08:59:29 by sof] · 5e856f00
      debugged
    • [project @ 2002-02-14 07:52:05 by sof] · efa41d9d
      Restructured / tidied a bit:
      
      * Capability.grabReturnCapability() is now called by resumeThread().
        It takes care of waiting on the (Capability.c-local) condition
        variable, 'returning_worker_cond' (moved here from Schedule.c)
      
      * If a worker notices upon entry to the Scheduler that there are
        worker threads waiting to deposit results of external calls,
        it gives up its capability by calling Capability.yieldCapability().
      
      * Added Scheduler.waitForWork(), which takes care of blocking
        on 'thread_ready_cond' (+ 'rts_n_waiting_tasks' book-keeping).
      
      Note: changes haven't been fully tested, due to HEAD instability.
  10. 13 Feb, 2002 1 commit
    • [project @ 2002-02-13 08:48:06 by sof] · e289780e
      Revised implementation of multi-threaded callouts (and callins):
      
      - unified synchronisation story for threaded and SMP builds,
        following up on SimonM's suggestion. The following synchro
        variables are now used inside the Scheduler:
      
          + thread_ready_cond - condition variable that is signalled
            when a H. thread has become runnable (via the THREAD_RUNNABLE()
            macro) and there are available capabilities. Waited on:
               + upon schedule() entry (iff no caps. available).
               + when a thread inside the Scheduler spots that there
                 are no runnable threads to service, but one or more
                 external calls are in progress.
               + in resumeThread(), waiting for a capability to become
                 available.
      
            Prior to waiting on thread_ready_cond, a counter rts_n_waiting_tasks
            is incremented, so that we can keep track of the number of
            readily available worker threads (need this in order to make
            an informed decision on whether or not to create a new thread
            when an external call is made).
      
      
          + returning_worker_cond - condition variable that is waited
            on by an OS thread that has finished executing an external
            call and now wants to feed its result back to the Haskell
            thread that made the call. Before doing so, the counter
            rts_n_returning_workers is incremented.
      
            Upon entry to the Scheduler, this counter is checked and, if
            it is non-zero, the thread gives up its capability and
            signals returning_worker_cond before trying to re-grab a
            capability. (releaseCapability() takes care of this).
      
          + sched_mutex - protect Scheduler data structures.
          + gc_pending_cond - SMP-only condition variable for signalling
            completion of GCs.
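    A rough picture of how these variables might fit together, sketched
    with pthreads. The names are taken from the commit; every body below
    is a guessed simplification, not the actual Capability.c/Schedule.c
    code:

    ```c
    #include <assert.h>
    #include <pthread.h>
    #include <stdio.h>

    /* Names follow the commit; bodies are simplified assumptions. */
    static pthread_mutex_t sched_mutex = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  thread_ready_cond     = PTHREAD_COND_INITIALIZER;
    static pthread_cond_t  returning_worker_cond = PTHREAD_COND_INITIALIZER;
    static int rts_n_waiting_tasks     = 0;
    static int rts_n_returning_workers = 0;
    static int n_free_capabilities     = 0;  /* 0 or 1 in the threaded RTS */

    /* An idle worker parks here until a capability is released. */
    static void waitForWorkCapability(void)
    {
        pthread_mutex_lock(&sched_mutex);
        rts_n_waiting_tasks++;            /* bookkeeping for call-outs */
        while (n_free_capabilities == 0)
            pthread_cond_wait(&thread_ready_cond, &sched_mutex);
        rts_n_waiting_tasks--;
        n_free_capabilities--;            /* grab the capability */
        pthread_mutex_unlock(&sched_mutex);
    }

    /* Give up the capability; prefer a worker returning from an
       external call over an ordinary runnable-thread wakeup. */
    static void releaseCapability(void)
    {
        pthread_mutex_lock(&sched_mutex);
        n_free_capabilities++;
        if (rts_n_returning_workers > 0)
            pthread_cond_signal(&returning_worker_cond);
        else
            pthread_cond_signal(&thread_ready_cond);
        pthread_mutex_unlock(&sched_mutex);
    }

    static void *worker(void *arg)
    {
        (void)arg;
        waitForWorkCapability();
        printf("worker grabbed the capability\n");
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, worker, NULL);
        releaseCapability();   /* make the single capability available */
        pthread_join(t, NULL);
        assert(n_free_capabilities == 0 && rts_n_waiting_tasks == 0);
        return 0;
    }
    ```

    The predicate re-check in the while loop is what makes a lost wakeup
    benign here: if releaseCapability() runs before the worker blocks,
    the worker sees n_free_capabilities == 1 and never waits.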
      
      - initial implementation of call-ins, i.e., multiple OS threads
        may concurrently call into the RTS without interfering with
        each other. The implementation uses a cheesy locking protocol to
        ensure that only one OS thread at a time can construct a
        function application -- stop-gap measure until the RtsAPI
        is revised (as discussed last month) *and* a designated
        block is used for allocating these applications.
      
      - In the implementation of call-ins, the OS thread blocks
        waiting for an RTS worker thread to complete the evaluation
        of the function application. Since main() also uses the
        RtsAPI, provide a separate entry point for it (rts_mainEvalIO()),
        which avoids creating a separate thread to evaluate Main.main;
        that can be done by the thread exec'ing main() directly.
        [Maybe there's a tidier way of doing this, a bit ugly the
        way it is now..]
      
      
      There are a couple of dark corners that need to be looked at,
      such as the conditions for shutting down (and how), plus what
      ought to happen when async I/O is thrown into the mix (I know
      what will happen, but that's maybe not what we want).
      
      Other than that, things are in a generally happy state & I hope
      to declare myself done before the week is up.
  11. 12 Feb, 2002 1 commit
    • [project @ 2002-02-12 15:34:25 by sof] · 3fa80568
      - give rts_n_free_capabilities an interpretation
        in threaded mode (possible values: 0,1)
      - noFreeCapabilities() -> noCapabilities()
  12. 08 Feb, 2002 1 commit
  13. 04 Feb, 2002 1 commit
  14. 31 Jan, 2002 1 commit
    • [project @ 2002-01-31 11:18:06 by sof] · 3b9c5eb2
      First steps towards implementing better interop between
      Concurrent Haskell and native threads.
      
      - factored out Capability handling into a separate source file
        (only the SMP build uses multiple capabilities, though).
      - factored out OS/native threads handling into a separate
        source file, OSThreads.{c,h}. Currently, just a pthreads-based
        implementation; Win32 version to follow.
      - scheduler code now distinguishes between multi-task threaded
        code (SMP) and single-task threaded code ('threaded RTS'),
        but sharing code between the two modes whenever possible.
      
      i.e., just a first snapshot; the bulk of the transitioning code
      remains to be implemented.