1. 13 Mar, 2004 1 commit
  2. 01 Mar, 2004 1 commit
    • [project @ 2004-03-01 14:18:35 by simonmar] · a20ec0ce
      simonmar authored
      Threaded RTS improvements:
      
        - Make the main_threads list doubly linked.  Have threads
          remove themselves from this list when they complete, rather
          than searching for completed main threads each time around
          the scheduler loop.  This removes an O(n) loop from the
          scheduler, but adds some new constraints (basically completed
          threads must remain on the run queue until dealt with, including
          threads which have been killed by an async exception).
      
        - Add a pointer from the TSO to the StgMainThread struct, for
          main threads.  This avoids a number of places where we had
          to traverse the list of main threads to find the right one,
          including one place in the scheduler loop.  Adding a field to
          a TSO is cheap.
      
        - taskStart: we should be resetting the startingWorkerThread flag
          in here.  Not sure why we aren't; maybe this got lost at some point.
      
        - Use the BlockedOnCCall flags in the non-threaded RTS too.  Q: what
          should happen if a thread does a foreign call which re-enters the
          RTS, and then sends an async exception to the original thread?
          Answer: it should deadlock, which it does in the threaded RTS, and
          this commit makes it do so in the non-threaded RTS too (see
          testsuite/tests/concurrent/should_run/conc040.hs).
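The first two points above can be sketched in a few lines of C. This is only an illustration under assumed names: `MainThread`, `Thread`, `main_thread_add`, `main_thread_remove` and `thread_complete` are hypothetical stand-ins, not the RTS's actual definitions. It shows why a doubly linked list plus a back-pointer from the thread lets a completed main thread unlink itself in O(1), with no traversal.

```c
#include <stddef.h>

/* Hypothetical stand-ins for StgMainThread and the TSO; fields are assumed. */
typedef struct MainThread MainThread;

typedef struct Thread {
    int id;
    MainThread *main;          /* back-pointer; NULL for non-main threads */
} Thread;

struct MainThread {
    Thread *tso;
    MainThread *prev;
    MainThread *next;
};

/* Head of the doubly linked list of main threads (illustrative global). */
static MainThread *main_threads = NULL;

/* Push a main thread on the front of the list: O(1). */
static void main_thread_add(MainThread *m)
{
    m->prev = NULL;
    m->next = main_threads;
    if (main_threads != NULL) {
        main_threads->prev = m;
    }
    main_threads = m;
}

/* Unlink a main thread: O(1), no search of the list. */
static void main_thread_remove(MainThread *m)
{
    if (m->prev != NULL) {
        m->prev->next = m->next;
    } else {
        main_threads = m->next;    /* m was the head */
    }
    if (m->next != NULL) {
        m->next->prev = m->prev;
    }
    m->prev = NULL;
    m->next = NULL;
}

/* On completion a thread follows its back-pointer instead of searching. */
static void thread_complete(Thread *t)
{
    if (t->main != NULL) {
        main_thread_remove(t->main);
    }
}
```

The trade-off is one extra pointer per thread and per list node in exchange for removing the per-iteration scan.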
  3. 27 Feb, 2004 4 commits
  4. 26 Feb, 2004 3 commits
  5. 25 Feb, 2004 1 commit
    • [project @ 2004-02-25 17:35:44 by simonmar] · dc167cca
      simonmar authored
      Feeble performance hack for the threaded RTS: instead of
      allocating/releasing a new condition variable for each new call-in, we
      just cache one in the RTS and re-use it for the next call.
      
      On a little test I have here which does lots of call-ins on Windows,
      this reduces the slowdown for using the threaded RTS from a factor of
      7-8 down to a factor of 4-5.  I'm aiming for a factor of 2 or better...
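A minimal sketch of the caching idea, assuming POSIX threads rather than the Win32 primitives the measurement above refers to. The names `cond_acquire`, `cond_release`, `cached_cond` and `cache_lock` are invented for illustration and are not taken from the RTS. A one-slot cache hands back the previously released condition variable instead of creating a fresh one per call-in.

```c
#include <pthread.h>
#include <stdlib.h>

/* One-slot cache (illustrative names, not the RTS's own). */
static pthread_cond_t *cached_cond = NULL;
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;

/* Get a condition variable for a call-in, reusing the cached one if any. */
static pthread_cond_t *cond_acquire(void)
{
    pthread_cond_t *c;

    pthread_mutex_lock(&cache_lock);
    c = cached_cond;
    cached_cond = NULL;
    pthread_mutex_unlock(&cache_lock);

    if (c == NULL) {
        /* Cache was empty: fall back to allocating a new one. */
        c = malloc(sizeof *c);
        if (c != NULL && pthread_cond_init(c, NULL) != 0) {
            free(c);
            c = NULL;
        }
    }
    return c;
}

/* Return a condition variable: keep it if the slot is free, else destroy it. */
static void cond_release(pthread_cond_t *c)
{
    pthread_mutex_lock(&cache_lock);
    if (cached_cond == NULL) {
        cached_cond = c;
        c = NULL;
    }
    pthread_mutex_unlock(&cache_lock);

    if (c != NULL) {
        pthread_cond_destroy(c);
        free(c);
    }
}
```

With strictly sequential call-ins every call after the first hits the cache; concurrent call-ins simply fall back to allocation.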
  6. 18 Dec, 2003 1 commit
  7. 16 Dec, 2003 1 commit
    • [project @ 2003-12-16 13:27:31 by simonmar] · de02e02a
      simonmar authored
      Clean up Capability API
      ~~~~~~~~~~~~~~~~~~~~~~~
      
      - yieldToReturningWorker() is now yieldCapability(), and performs all
        kinds of yielding (both to returning workers and passing to other
        OS threads).  yieldCapability() does *not* re-acquire a capability.
      
      - waitForWorkCapability() is now waitForCapability().
      
      - releaseCapability() also releases the capability when passing to
        another OS thread.  It is the only way to release a capability (apart
        from yieldCapability(), which calls releaseCapability() internally).
      
      - passCapability() and passCapabilityToWorker() now do not release the
        capability.  They just set a flag to indicate where the capability
        should go when it is next released.
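The division of labour described above can be modelled in a few lines. This is a toy, single-threaded sketch with no locking, and the `Capability` struct and its `owner` and `pass_to` fields are assumptions made for illustration, not the RTS's real layout. It only shows the shape of the API: passing records a destination, and releasing is the one place where ownership actually changes.

```c
/* Hypothetical, heavily simplified capability; not the RTS's real struct. */
typedef struct Capability {
    int owner;      /* id of the OS thread holding it, -1 if free */
    int pass_to;    /* id of the OS thread it should go to next, -1 if none */
} Capability;

/* Record the intended recipient; the capability is NOT released here. */
static void pass_capability(Capability *cap, int os_thread)
{
    cap->pass_to = os_thread;
}

/* The single release point: honours a pending pass, otherwise frees. */
static void release_capability(Capability *cap)
{
    cap->owner = cap->pass_to;   /* -1 means "free for anyone" */
    cap->pass_to = -1;
}

/* Yielding releases internally and does not re-acquire. */
static void yield_capability(Capability *cap)
{
    release_capability(cap);
}
```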
      
      
      Other cleanups:
      
        - Removed all the SMP stuff from Schedule.c.  It had extensive bitrot,
          and was just obfuscating the code.  If it is ever needed again,
          it can be resurrected from CVS.
      
        - Removed some other dead code in Schedule.c, in an attempt to make
          this file more manageable.
  8. 12 Dec, 2003 1 commit
  9. 05 Dec, 2003 1 commit
  10. 12 Nov, 2003 1 commit
    • [project @ 2003-11-12 17:49:05 by sof] · 20593d1d
      sof authored
      Tweaks to have RTS (C) sources compile with MSVC. Apart from wibbles
      related to the handling of 'inline', changed Schedule.h:POP_RUN_QUEUE()
      not to use expression-level statement blocks.
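For readers unfamiliar with the issue: GNU C allows a macro to expand to a `({ ... })` statement expression that yields a value, which MSVC does not accept. One portable alternative, sketched below, is a `do { ... } while (0)` macro that stores its result through an argument. The `Thread` type and the queue variables here are simplified assumptions, and the macro body is an illustration rather than the text of Schedule.h.

```c
#include <stddef.h>

typedef struct Thread {            /* hypothetical stand-in for a TSO */
    int id;
    struct Thread *link;
} Thread;

static Thread *run_queue_hd = NULL;
static Thread *run_queue_tl = NULL;

/* GNU C only, shown for contrast (not compiled here):
 *   #define POP_RUN_QUEUE() ({ Thread *t = run_queue_hd; ...; t; })
 */

/* Portable form: the popped thread (or NULL) is stored into pt. */
#define POP_RUN_QUEUE(pt)                         \
    do {                                          \
        Thread *tmp_ = run_queue_hd;              \
        if (tmp_ != NULL) {                       \
            run_queue_hd = tmp_->link;            \
            tmp_->link = NULL;                    \
            if (run_queue_hd == NULL) {           \
                run_queue_tl = NULL;              \
            }                                     \
        }                                         \
        (pt) = tmp_;                              \
    } while (0)
```

Call sites change from `t = POP_RUN_QUEUE();` to `POP_RUN_QUEUE(t);`.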
  11. 05 Oct, 2003 1 commit
    • [project @ 2003-10-05 20:18:36 by panne] · 45ee5a69
      panne authored
      Unbreak the 2nd stage of non-threaded-RTS builds by #ifdefing out a
      call to wakeBlockedWorkerThread. This should probably be fixed more
      cleanly by taking an OO view, i.e. always defining this function, but
      doing nothing in the non-threaded case. The final decision on this
      issue is left to the Masters of the Threads (tm)...
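The cleaner alternative suggested above might look roughly like this. It is a sketch, not what was committed: the function bodies and the `wakeups` counter are placeholders invented so the effect is observable, and `scheduler_step` is a hypothetical caller. The point is that the `#ifdef` lives in one place and call sites stay unconditional.

```c
/* THREADED_RTS would normally be defined by the build system. */
static int wakeups = 0;   /* illustrative counter, not part of the RTS */

#if defined(THREADED_RTS)
static void wakeBlockedWorkerThread(void)
{
    wakeups++;   /* placeholder for signalling a blocked worker */
}
#else
static void wakeBlockedWorkerThread(void)
{
    /* Non-threaded build: there are no workers, so do nothing. */
}
#endif

/* Hypothetical call site: no #ifdef needed here. */
static void scheduler_step(void)
{
    wakeBlockedWorkerThread();
}
```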
  12. 01 Oct, 2003 3 commits
    • [project @ 2003-10-01 21:16:12 by wolfgang] · a9190910
      wolfgang authored
      Un-break non-threaded RTS
      (hopefully; I have no time to test it right now)
    • [project @ 2003-10-01 10:57:39 by wolfgang] · d3581a6a
      wolfgang authored
      New implementation & changed type signature of forkProcess
      
      forkProcess now has the following type:
      forkProcess :: IO () -> IO ProcessID
      
      forkProcessAll has been removed as it is unimplementable in the threaded RTS.
      
      forkProcess using the old type (IO (Maybe ProcessID)) was impossible to
      implement correctly in the non-threaded RTS and very hard to implement
      in the threaded RTS.
      The new type signature allows a clean and simple implementation.
    • [project @ 2003-10-01 10:49:07 by wolfgang] · 324e96d2
      wolfgang authored
      Threaded RTS:
      Don't start new worker threads earlier than necessary.
      After this commit, a Haskell program that uses neither forkOS nor forkIO is
      really single-threaded (rather than using two OS threads internally).
      
      Some details:
      Worker threads are now only created when a capability is released, and
      only when
      (there are no worker threads)
      	&& (there are runnable Haskell threads ||
      	    there are Haskell threads blocked on IO or threadDelay)
      awaitEvent can now be called from bound thread scheduling loops
      (so that we don't have to create a worker thread just to run awaitEvent)
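The worker-creation condition above translates directly into a predicate. In this sketch the `SchedState` struct and its counters are assumptions standing in for whatever bookkeeping the scheduler really keeps, and `should_start_worker` is a made-up name for the check performed when a capability is released.

```c
#include <stdbool.h>

/* Hypothetical snapshot of scheduler state; field names are assumed. */
typedef struct SchedState {
    int n_workers;          /* existing worker OS threads */
    int n_runnable;         /* runnable Haskell threads */
    int n_blocked_on_io;    /* Haskell threads blocked on IO */
    int n_sleeping;         /* Haskell threads in threadDelay */
} SchedState;

/* Checked when a capability is released: start a worker only if there is
 * none yet and there is (or will be) Haskell work for it to do. */
static bool should_start_worker(const SchedState *s)
{
    return s->n_workers == 0
        && (s->n_runnable > 0
            || s->n_blocked_on_io > 0
            || s->n_sleeping > 0);
}
```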
  13. 26 Sep, 2003 1 commit
  14. 21 Sep, 2003 1 commit
    • [project @ 2003-09-21 22:20:51 by wolfgang] · 85aa72b9
      wolfgang authored
      Bound Threads
      =============
      
      Introduce a way to use foreign libraries that rely on thread local state
      from multiple threads (mainly affects the threaded RTS).
      
      See the file threads.tex in CVS at haskell-report/ffi/threads.tex
      (not entirely finished yet) for a definition of this extension. A less formal
      description is also found in the documentation of Control.Concurrent.
      
      The changes mostly affect the THREADED_RTS (./configure --enable-threaded-rts),
      except for saving & restoring errno on a per-TSO basis, which is also necessary
      for the non-threaded RTS (a bugfix).
      
      Detailed list of changes
      ------------------------
      
      - errno is saved in the TSO object and restored when necessary:
      ghc/includes/TSO.h, ghc/rts/Interpreter.c, ghc/rts/Schedule.c
      
      - rts_mainLazyIO is no longer needed, main is no special case anymore
      ghc/includes/RtsAPI.h, ghc/rts/RtsAPI.c, ghc/rts/Main.c, ghc/rts/Weak.c
      
      - passCapability: a new function that releases the capability and "passes"
        it to a specific OS thread:
      ghc/rts/Capability.h ghc/rts/Capability.c
      
      - waitThread(), scheduleWaitThread() and schedule() get an optional
        Capability *initialCapability passed as an argument:
      ghc/includes/SchedAPI.h, ghc/rts/Schedule.c, ghc/rts/RtsAPI.c
      
      - Bound Thread scheduling (that's what this is all about):
      ghc/rts/Schedule.h, ghc/rts/Schedule.c
      
      - new Primop isCurrentThreadBound#:
      ghc/compiler/prelude/primops.txt.pp, ghc/includes/PrimOps.h, ghc/rts/PrimOps.hc,
      ghc/rts/Schedule.h, ghc/rts/Schedule.c
      
      - a simple function, rtsSupportsBoundThreads, that returns true if THREADED_RTS
        is defined:
      ghc/rts/Schedule.h, ghc/rts/Schedule.c
      
      - a new implementation of forkProcess (the old implementation stays in place
        for the non-threaded case). Partially broken; works for the standard
        fork-and-exec case, but not for much else. A proper forkProcess is
        really next to impossible to implement:
      ghc/rts/Schedule.c
      
      - Library support for bound threads:
          Control.Concurrent.
            rtsSupportsBoundThreads, isCurrentThreadBound, forkOS,
            runInBoundThread, runInUnboundThread
      libraries/base/Control/Concurrent.hs, libraries/base/Makefile,
      libraries/base/include/HsBase.h, libraries/base/cbits/forkOS.c (new file)
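To illustrate the errno change listed first above: if several lightweight threads share one OS thread, each needs its own copy of `errno` saved when it is descheduled and restored when it runs again, otherwise one thread's failed system call can be misattributed to another. The sketch below uses invented names (`Thread`, `saved_errno`, `thread_suspend`, `thread_resume`) and is not the code in Schedule.c.

```c
#include <errno.h>

/* Hypothetical stand-in for a TSO with a per-thread errno slot. */
typedef struct Thread {
    int id;
    int saved_errno;
} Thread;

/* Called when the scheduler switches away from t. */
static void thread_suspend(Thread *t)
{
    t->saved_errno = errno;
}

/* Called just before t starts running again. */
static void thread_resume(Thread *t)
{
    errno = t->saved_errno;
}
```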
  15. 15 Aug, 2003 1 commit
  16. 12 Jul, 2003 1 commit
  17. 03 Jul, 2003 1 commit
    • [project @ 2003-07-03 15:14:56 by sof] · 18340925
      sof authored
      New primop (mingw only),
      
        asyncDoProc# :: Addr# -> Addr# -> State# RealWorld -> (# State# RealWorld, Int#, Int# #)
      
      which lets a Haskell thread hand off a pointer to external code (1st arg) for
      asynchronous execution by the RTS worker thread pool. Second arg is data passed
      in to the asynchronous routine. The routine is _not_ permitted to re-enter
      the RTS as part of its execution.
  18. 19 Jun, 2003 1 commit
  19. 14 May, 2003 1 commit
  20. 08 Apr, 2003 1 commit
  21. 02 Apr, 2003 1 commit
  22. 01 Apr, 2003 1 commit
    • [project @ 2003-04-01 15:05:13 by sof] · c49a6ca9
      sof authored
      Tidy up code that supports user/Haskell signal handlers.
      
      Signals.h now defines RTS_USER_SIGNALS when this is supported,
      which is then used elsewhere.
  23. 25 Mar, 2003 2 commits
  24. 19 Mar, 2003 1 commit
  25. 22 Feb, 2003 1 commit
    • [project @ 2003-02-22 04:51:50 by sof] · 557947d3
      sof authored
      Clean up code & interfaces that deal with timers and asynchrony:
      
      - Timer.{c,h} now defines the platform-independent interface
        to the timing services needed by the RTS. Itimer.{c,h} +
        win32/Ticker.{c,h} define the OS-specific services that
        create/destroy a timer.
      - For win32 plats, drop the long-standing use of the 'multimedia'
        API timers and implement the ticking service ourselves. Simpler
        and more flexible.
      - Select.c is now solely for platforms that use select() to handle
        non-blocking I/O & thread delays. win32/AwaitEvent.c provides
        the same API on the Win32 side.
      - support threadDelay on win32 platforms via worker threads.
      
      Not yet compiled up on non-win32 platforms; will do once checked in.
  26. 25 Jan, 2003 1 commit
    • [project @ 2003-01-25 15:54:48 by wolfgang] · af136096
      wolfgang authored
      This commit fixes many bugs and limitations in the threaded RTS.
      There are still some issues remaining, though.
      
      The following bugs should have been fixed:
      
      - [+] "safe" calls could cause crashes
      - [+] yieldToReturningWorker/grabReturnCapability
          -     It used to deadlock.
      - [+] couldn't wake blocked workers
          -     Calls into the RTS could go unanswered for a long time, and
                that includes ordinary callbacks in some circumstances.
      - [+] couldn't block on an MVar and expect to be woken up by a signal
            handler
          -     Depending on the exact situation, the RTS shut down or
                blocked forever and ignored the signal.
      - [+] The locking scheme in RtsAPI.c didn't work
      - [+] run_thread label in wrong place (schedule())
      - [+] Deadlock in GHC.Handle
          -     if a signal arrived at the wrong time, an mvar was never
                filled again
      - [+] Signals delivered to the "wrong" thread were ignored or handled
            too late.
      
      Issues:
      *) If GC can move TSO objects (I don't know - can it?), then ghci
      will occasionally crash when calling foreign functions, because the
      parameters are stored on the TSO stack.
      
      *) There is still a race condition lurking in the code
      (both threaded and non-threaded RTS are affected):
      If a signal arrives after the check for pending signals in
      schedule(), but before the call to select() in awaitEvent(),
      select() will be called anyway. The signal handler will be
      executed much later than expected.
      
      *) For Win32, GHC doesn't yet support non-blocking IO, so while a
      thread is waiting for IO, no call-ins can happen. If the RTS is
      blocked in awaitEvent, it uses a polling loop on Win32, so call-ins
      should work (although the polling loop looks ugly).
      
      *) Deadlock detection is disabled for the threaded rts, because I
      don't know how to do it properly in the presence of foreign call-ins
      from foreign threads.
      This causes the tests conc031, conc033 and conc034 to fail.
      
      *) "safe" is currently treated as "threadsafe". Implementing "safe" in
      a way that blocks other Haskell threads is more difficult than was
      thought at first. I think it could be done with a few additional lines
      of code, but personally, I'm strongly in favour of abolishing the
      distinction.
      
      *) Running finalizers at program termination is inefficient - there
      are two OS threads passing messages back and forth for every finalizer
      that is run. Also (just as in the non-threaded case) the finalizers
      are run in parallel to any remaining Haskell threads and to any
      foreign call-ins that might still happen.
  27. 13 Dec, 2002 1 commit
    • [project @ 2002-12-13 15:16:29 by simonmar] · c986ed0b
      simonmar authored
      Shortcut when switching evaluators: instead of going round the normal
      scheduler loop, just cut to the chase and run the thread using the
      other evaluator.
      
      This avoids doing stack squeezing each time we switch evaluators,
      which is an O(n) operation these days, whereas it used to be O(n) the
      first time, and O(1) thereafter if the stack hadn't changed too much.
      This is a problem that we should perhaps address separately, but for
      now the workaround should provide a speed boost to GHCi on the HEAD.
  28. 11 Dec, 2002 1 commit
    • [project @ 2002-12-11 15:36:20 by simonmar] · 0bffc410
      simonmar authored
      Merge the eval-apply-branch on to the HEAD
      ------------------------------------------
      
      This is a change to GHC's evaluation model in order to ultimately make
      GHC more portable and to reduce complexity in some areas.
      
      At some point we'll update the commentary to describe the new state of
      the RTS.  Pending that, the highlights of this change are:
      
        - No more Su.  The Su register is gone, update frames are one
          word smaller.
      
        - Slow-entry points and arg checks are gone.  Unknown function calls
          are handled by automatically-generated RTS entry points (AutoApply.hc,
          generated by the program in utils/genapply).
      
        - The stack layout is stricter: there are no "pending arguments" on
          the stack any more, the stack is always strictly a sequence of
          stack frames.
      
          This means that there's no need for LOOKS_LIKE_GHC_INFO() or
          LOOKS_LIKE_STATIC_CLOSURE() any more, and GHC doesn't need to know
          how to find the boundary between the text and data segments (BIG WIN!).
      
        - A couple of nasty hacks in the mangler caused by the need to
          identify closure ptrs vs. info tables have gone away.
      
        - Info tables are a bit more complicated.  See InfoTables.h for the
          details.
      
        - As a side effect, GHCi can now deal with polymorphic seq.  Some bugs
          in GHCi which affected primitives and unboxed tuples are now
          fixed.
      
        - Binary sizes are reduced by about 7% on x86.  Performance is roughly
          similar, some programs get faster while some get slower.  I've seen
          GHCi perform worse on some examples, but haven't investigated
          further yet (GHCi performance *should* be about the same or better
          in theory).
      
        - Internally the code generator is rather better organised.  I've moved
          info-table generation from the NCG into the main codeGen where it is
          shared with the C back-end; info tables are now emitted as arrays
          of words in both back-ends.  The NCG is one step closer to being able
          to support profiling.
      
      This has all been fairly thoroughly tested, but no doubt I've messed
      up the commit in some way.
  29. 10 Dec, 2002 1 commit
    • [project @ 2002-12-10 13:38:40 by wolfgang] · 8a8eee36
      wolfgang authored
      Fix a race condition/possible deadlock in the threaded rts:
      
      If a callback into Haskell finished before waitThread_() was called,
      the signal was lost and waitThread_() waited indefinitely.
      
      Solution: Don't release the sched_mutex between calls to scheduleThread_
      and waitThread_.
      
      Please note that the scheduler API function waitThread is still possibly
      affected by this race condition. It's used in rts_mainEvalIO (I think that's
      safe) and in finishAllThreads (this looks dangerous, but finishAllThreads is
      never used).
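The lost-wakeup problem described above is a general hazard with condition variables: a signal sent before anyone is waiting is simply dropped. The commit closes the window by holding sched_mutex across both calls. The sketch below shows a different, standard idiom with the same guarantee, namely recording completion in a flag under the mutex and waiting in a loop on that flag. It assumes POSIX threads, and `Completion`, `completion_signal` and `completion_wait` are illustrative names, not RTS functions.

```c
#include <pthread.h>
#include <stdbool.h>

/* Illustrative completion object; not an RTS type. */
typedef struct Completion {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            done;
} Completion;

static void completion_init(Completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = false;
}

/* Called by whoever finishes the callback. The flag survives even if
 * nobody is waiting yet, so the event cannot be lost. */
static void completion_signal(Completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Called by the waiter. Returns immediately if completion already
 * happened; otherwise sleeps until signalled. */
static void completion_wait(Completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done) {
        pthread_cond_wait(&c->cond, &c->lock);
    }
    pthread_mutex_unlock(&c->lock);
}
```

If `completion_signal` runs before `completion_wait`, the waiter sees `done` already set and never blocks, which is exactly the ordering that used to hang.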
  30. 22 Oct, 2002 1 commit
    • [project @ 2002-10-22 11:01:18 by simonmar] · b7129526
      simonmar authored
      change the types of cmp_thread, rts_getThreadId, and labelThread to
      take StgPtr rather than StgTSO *, since the compiler now has no
      distinction between these two types in the back end.
      
      I also noticed that labelThread need not be a primitive: it could just
      as well be a normal C function called by the FFI, but I haven't made
      that change.
  31. 25 Sep, 2002 1 commit
    • [project @ 2002-09-25 14:46:31 by simonmar] · 28e69a6b
      simonmar authored
      Fix a scheduling/GC bug, spotted by Wolfgang Thaller.  If a main
      thread completes, and a GC runs before the return (from rts_evalIO())
      happens, then the thread might be GC'd before we get a chance to
      extract its return value, leading to barf("main thread has been GC'd")
      from the garbage collector.
      
      The fix is to treat all main threads which have completed as roots:
      this is logically the right thing to do, because these threads must be
      retained by virtue of holding the return value, and this is a property of
      main threads only.
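As a rough picture of the fix, the collector's root-marking phase can walk the list of main threads and mark every completed one as live. Everything in this sketch is hypothetical: the `Thread` and `MainThread` types, the `completed` and `marked` flags, and `mark_main_thread_roots` are simplified stand-ins that only show the shape of the idea, not the garbage collector's real interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified thread object. */
typedef struct Thread {
    bool completed;    /* thread has finished; return value not yet read */
    bool marked;       /* set by the collector when the object is live */
} Thread;

/* Hypothetical main-thread list node. */
typedef struct MainThread {
    Thread *tso;
    struct MainThread *link;
} MainThread;

/* Root-marking step: completed main threads are kept alive because their
 * return value has not been extracted yet. */
static void mark_main_thread_roots(MainThread *list)
{
    for (MainThread *m = list; m != NULL; m = m->link) {
        if (m->tso->completed) {
            m->tso->marked = true;
        }
    }
}
```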
  32. 18 Sep, 2002 1 commit