    [project @ 2005-10-21 14:02:17 by simonmar] · 03a9ff01
    simonmar authored
    Big re-hash of the threaded/SMP runtime
    This is a significant reworking of the threaded and SMP parts of
    the runtime.  There are two overall goals here:
      - To push down the scheduler lock, reducing contention and allowing
        more parts of the system to run without locks.  In particular,
        the scheduler does not require a lock any more in the common case.
      - To improve affinity, so that running Haskell threads stick to the
        same OS threads as much as possible.
    At this point we have the basic structure working, but there are some
    pieces missing.  I believe it's reasonably stable - the important
    parts of the testsuite pass in all the (normal,threaded,SMP) ways.
    In more detail:
      - Each capability now has a run queue, instead of one global run
        queue.  The Capability and Task APIs have been completely
        rewritten; see Capability.h and Task.h for the details.
      - Each capability has its own pool of worker Tasks.  Hence, Haskell
        threads on a Capability's run queue will run on the same worker
        Task(s).  As long as the OS is doing something reasonable, this
        should mean they usually stick to the same CPU.  Another way to
        look at this is that we're assuming each Capability is associated
        with a fixed CPU.
      - What used to be StgMainThread is now part of the Task structure.
        Every OS thread in the runtime has an associated Task, and it
        can ask for its current Task at any time with myTask().
      - removed RTS_SUPPORTS_THREADS symbol, use THREADED_RTS instead
        (it is now defined for SMP too).
      - The RtsAPI has had to change; we must explicitly pass a Capability
        around now.  The previous interface assumed some global state.
        SchedAPI has also changed a lot.
      - The OSThreads API now supports thread-local storage, used to
        implement myTask(), although it could be done more efficiently
        using gcc's __thread extension when available.
      - I've moved some POSIX-specific stuff into the posix subdirectory,
        moving in the direction of separating out platform-specific code.
      - lots of lock-debugging and assertions in the runtime.  In particular,
        when DEBUG is on, we catch multiple ACQUIRE_LOCK()s, and there is
        also an ASSERT_LOCK_HELD() call.
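
    As a rough illustration of the thread-local storage point above, here
    is a minimal sketch of how a myTask()-style lookup could sit on top of
    POSIX thread-specific data.  It is not the code in OSThreads or
    Task.c: the Task layout and the names task_key, set_my_task() and
    my_task() are assumptions made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime's Task structure. */
typedef struct Task_ {
    int id;
} Task;

/* One process-wide key; each OS thread stores its own Task pointer under it. */
static pthread_key_t  task_key;
static pthread_once_t task_key_once = PTHREAD_ONCE_INIT;

/* Create the key exactly once, no matter which thread gets here first. */
static void make_task_key(void)
{
    int r = pthread_key_create(&task_key, NULL);
    assert(r == 0);
    (void)r;
}

/* Associate a Task with the calling OS thread. */
void set_my_task(Task *task)
{
    pthread_once(&task_key_once, make_task_key);
    int r = pthread_setspecific(task_key, task);
    assert(r == 0);
    (void)r;
}

/* Return the calling OS thread's Task, or NULL if none has been set. */
Task *my_task(void)
{
    pthread_once(&task_key_once, make_task_key);
    return (Task *)pthread_getspecific(task_key);
}
```

    Every lookup here goes through a pthread_getspecific() call, which is
    the overhead that a compiler-level thread-local variable (such as
    gcc's __thread) would avoid.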
    What's missing so far:
      - I have almost certainly broken the Win32 build, will fix soon.
      - any kind of thread migration or load balancing.  This is high up
        the agenda, though.
      - various performance tweaks to do
      - throwTo and forkProcess still do not work in SMP mode