Authors: email@example.com, firstname.lastname@example.org
Date: April 2002
This document presents the implementation of an extension to Concurrent Haskell that provides two enhancements:
When a Concurrent Haskell (CH) thread calls a 'foreign import'ed function, the runtime system (RTS) has to handle this in a manner transparent to other CH threads. That is, they shouldn't be blocked from making progress while the CH thread executes the external call. Presently, all threads will block.
Clearly, we have to rely on OS-level threads in order to support this kind of concurrency. The implementation described here defines the (abstract) OS threads interface that the RTS assumes. The implementation currently provides two instances of this interface, one for POSIX threads (pthreads) and one for Win32 threads.
From an RTS perspective, a simple and efficient way to implement this is to retain the property that only one OS thread is allowed to execute code inside of the GHC runtime system. [There are alternate designs, but I won't go into details on their pros and cons here.]
When this OS thread comes to execute a potentially blocking 'foreign import', it leaves the RTS, but before doing so it makes certain that another OS worker thread is available to take over its RTS executing privileges. Consequently, the external call will be handled concurrently with the execution of the other Concurrent Haskell threads. When the external call eventually completes, the Concurrent Haskell thread that made the call is passed the result and made runnable again.
The rest of this section describes the mechanics of implementing this. There are two parts to it: one describes how a native thread leaves the RTS to service the external call, the other how the same thread handles returning the result of the external call back to the Haskell thread.
Presently, GHC handles 'safe' C calls by effectively emitting the following code sequence:
    ...save thread state...
    t = suspendThread();
    r = foo(arg1,...,argn);
    resumeThread(t);
    ...restore thread state...
    return r;
After having squirreled away the state of a Haskell thread, Schedule.c:suspendThread() is called, which puts the current thread on a list [Schedule.c:suspended_ccalling_threads] of threads that are currently blocked waiting for external calls to complete (this is done so that the garbage collector can find them as roots).
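The bookkeeping described above can be sketched as a small intrusive linked list. This is only an illustration: the Thread struct and the functions suspend_thread, resume_thread and count_suspended are hypothetical names invented here, not the RTS definitions, and the sketch assumes the caller already serializes access to the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a Haskell thread's state. */
typedef struct Thread {
    int            id;
    struct Thread *link;   /* next entry on the suspended list */
} Thread;

/* Models the role of suspended_ccalling_threads in this sketch. */
static Thread *suspended_list = NULL;

/* Record the thread as blocked in an external call and return a token
   that identifies it later. Assumes the caller holds any needed lock. */
int suspend_thread(Thread *t) {
    assert(t != NULL);
    t->link = suspended_list;
    suspended_list = t;
    return t->id;
}

/* Unlink and return the thread matching the token, or NULL if absent. */
Thread *resume_thread(int token) {
    for (Thread **prev = &suspended_list; *prev != NULL; prev = &(*prev)->link) {
        if ((*prev)->id == token) {
            Thread *t = *prev;
            *prev = t->link;
            t->link = NULL;
            return t;
        }
    }
    return NULL;
}

/* What a root scan would do in outline: visit every suspended thread. */
int count_suspended(void) {
    int n = 0;
    for (Thread *t = suspended_list; t != NULL; t = t->link)
        n++;
    return n;
}
```

Keeping the suspended threads on a list of their own means a collector only has to walk that list to reach them, even though no OS thread inside the RTS is currently running them.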
In addition to putting the Haskell thread on suspended_ccalling_threads, suspendThread() now also does the following:
Upon return from suspendThread(), the OS thread is free of its RTS executing responsibility, and can now invoke the external call. Meanwhile, the other worker thread that has now gained access to the RTS will continue executing Concurrent Haskell code. Concurrent 'stuff' is happening!
When the native thread eventually returns from the external call, the result needs to be communicated back to the Haskell thread that issued the external call. The following steps take care of this:
If a worker thread inside the RTS runs out of runnable Haskell threads, it goes to sleep waiting for the external calls to complete. It does this by calling waitForWorkCapability.
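The hand-over around an external call can be illustrated with a single token protected by a mutex and a condition variable. This is a minimal sketch using POSIX threads, not the RTS code: Token, token_acquire, token_release, other_worker, slow_foreign_call and safe_call_demo are all hypothetical names, and the counter haskell_steps merely stands in for running Haskell code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the right to execute inside the RTS:
   a token that at most one OS thread holds at a time. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  freed;
    bool            held;
} Token;

static Token rts_token = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false };
static int haskell_steps = 0;   /* only touched while holding the token */

static void token_acquire(Token *t) {
    pthread_mutex_lock(&t->lock);
    while (t->held)                          /* loop guards against spurious wakeups */
        pthread_cond_wait(&t->freed, &t->lock);
    t->held = true;
    pthread_mutex_unlock(&t->lock);
}

static void token_release(Token *t) {
    pthread_mutex_lock(&t->lock);
    assert(t->held);
    t->held = false;
    pthread_cond_signal(&t->freed);          /* wake one waiting worker */
    pthread_mutex_unlock(&t->lock);
}

/* The worker that takes over while the first thread is outside the RTS. */
static void *other_worker(void *arg) {
    (void)arg;
    token_acquire(&rts_token);
    haskell_steps += 1;                      /* "runs Haskell code" */
    token_release(&rts_token);
    return NULL;
}

/* Stand-in for a potentially blocking external function. */
static int slow_foreign_call(int x) { return x * 2; }

/* Mirrors the shape of: t = suspendThread(); r = foo(...); resumeThread(t); */
int safe_call_demo(int arg) {
    pthread_t worker;
    token_acquire(&rts_token);               /* executing inside the "RTS" */
    haskell_steps += 1;

    int rc = pthread_create(&worker, NULL, other_worker, NULL);
    assert(rc == 0);                         /* a successor must exist */
    (void)rc;

    token_release(&rts_token);               /* leave the RTS */
    int r = slow_foreign_call(arg);          /* external call, token not held */
    token_acquire(&rts_token);               /* re-enter to deliver the result */
    haskell_steps += 1;
    token_release(&rts_token);

    pthread_join(worker, NULL);
    return r;
}
```

The point of the sketch is the ordering: a successor is arranged first, the token is given up before the external call starts, and it is re-acquired only once the call has returned.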
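Sleeping until work appears is the classic condition-variable wait. The sketch below shows that pattern with POSIX threads. It is an assumption about the general shape only, not the body of waitForWorkCapability, and RunQueue, idle_worker, make_runnable and wait_for_work_demo are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical run queue reduced to two counters. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  work_available;
    int             runnable;   /* runnable "Haskell threads" waiting */
    int             executed;   /* how many the worker has run */
} RunQueue;

static RunQueue rq = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 };

/* Worker: sleep until something is runnable, then run exactly one item. */
static void *idle_worker(void *arg) {
    (void)arg;
    pthread_mutex_lock(&rq.lock);
    while (rq.runnable == 0)                 /* re-check after every wakeup */
        pthread_cond_wait(&rq.work_available, &rq.lock);
    rq.runnable--;
    rq.executed++;
    pthread_mutex_unlock(&rq.lock);
    return NULL;
}

/* Called, for example, when an external call completes:
   make a thread runnable and wake one sleeping worker. */
static void make_runnable(void) {
    pthread_mutex_lock(&rq.lock);
    rq.runnable++;
    pthread_cond_signal(&rq.work_available);
    pthread_mutex_unlock(&rq.lock);
}

int wait_for_work_demo(void) {
    pthread_t worker;
    int rc = pthread_create(&worker, NULL, idle_worker, NULL);
    assert(rc == 0);
    (void)rc;
    make_runnable();
    pthread_join(worker, NULL);
    return rq.executed;
}
```

Because the predicate is checked under the mutex, it does not matter whether the worker starts waiting before or after make_runnable runs; the wakeup cannot be lost.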
The availability of new runnable Haskell threads is signalled when:
The reason why a separate worker thread is made to evaluate the Haskell function, rather than the OS thread that made the call-in via the RTS API, is that we want that OS thread to return as soon as possible. We wouldn't be able to guarantee that if the OS thread entered the RTS to (initially) just execute its function application, as the Scheduler may side-track it and also ask it to evaluate other Haskell threads.
Note: As of 20020413, the implementation of the RTS API only serializes access to the allocator between multiple OS threads wanting to call into Haskell (via the RTS API). It does not coordinate this access to the allocator with that of the OS worker thread that's currently executing within the RTS. This weakness/bug is scheduled to be tackled as part of an overhaul/reworking of the RTS API itself.
These thread extensions affect the Scheduler portions of the runtime system. To make it more manageable to work with, the changes introduced a couple of new RTS 'sub-systems'. This section presents the functionality and API of these sub-systems.
A Capability represents the token required to execute STG code, and all the state an OS thread/task needs to run Haskell code: its STG registers, a pointer to its TSO, a nursery etc. During STG execution, a pointer to the capability is kept in a register (BaseReg).
Only in an SMP build will there be multiple capabilities; for the threaded RTS and for non-threaded builds, there is only one global capability, namely MainCapability.
The Capability API is as follows:
    /* Capability.h */
    extern void initCapabilities(void);
    extern void grabReturnCapability(Mutex* pMutex, Capability** pCap);
    extern void waitForWorkCapability(Mutex* pMutex, Capability** pCap, rtsBool runnable);
    extern void releaseCapability(Capability* cap);
    extern void yieldToReturningWorker(Mutex* pMutex, Capability* cap);
    extern void grabCapability(Capability** cap);
The condition variables used to implement the synchronisation between worker consumers and providers are local to the Capability implementation. See source for details and comments.
The Task Manager API is responsible for managing the creation of OS worker RTS threads. When a Haskell thread wants to make an external call, the Task Manager is asked to possibly create a new worker thread to take over the RTS-executing capability of the worker thread that's exiting the RTS to execute the external call.
The Capability subsystem keeps track of idle worker threads, so making an informed decision about whether or not to create a new OS worker thread is easy work for the Task Manager. The Task Manager provides the following API:
    /* Task.h */
    extern void startTaskManager ( nat maxTasks, void (*taskStart)(void) );
    extern void stopTaskManager ( void );
    extern void startTask ( void (*taskStart)(void) );
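The decision the Task Manager has to make (create a worker only when no idle one can take over and the task limit has not been reached) can be sketched as follows. This is an illustrative guess at the policy, not the contents of Task.c: TaskPool, POOL_LIMIT, pool_init, pool_start_task and pool_stop are hypothetical names, and the sketch assumes the caller serializes access to the pool.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_LIMIT 8   /* hypothetical hard upper bound for this sketch */

typedef struct {
    unsigned  max_tasks;            /* cap supplied at start-up */
    unsigned  task_count;           /* workers created so far */
    unsigned  idle_workers;         /* workers waiting to take over */
    pthread_t tasks[POOL_LIMIT];
} TaskPool;

/* Placeholder worker body; a real worker would enter the scheduler. */
static void *task_body(void *arg) { (void)arg; return NULL; }

void pool_init(TaskPool *p, unsigned max_tasks) {
    assert(p != NULL);
    p->max_tasks    = max_tasks < POOL_LIMIT ? max_tasks : POOL_LIMIT;
    p->task_count   = 0;
    p->idle_workers = 0;
}

/* Returns true only if a new OS worker thread was actually created.
   Assumes the caller serializes access to the pool. */
bool pool_start_task(TaskPool *p) {
    assert(p != NULL);
    if (p->idle_workers > 0)
        return false;                        /* an idle worker can take over */
    if (p->task_count >= p->max_tasks)
        return false;                        /* already at the limit */
    if (pthread_create(&p->tasks[p->task_count], NULL, task_body, NULL) != 0)
        return false;                        /* creation failed */
    p->task_count++;
    return true;
}

/* Wait for every worker created by the pool to finish. */
void pool_stop(TaskPool *p) {
    assert(p != NULL);
    for (unsigned i = 0; i < p->task_count; i++)
        pthread_join(p->tasks[i], NULL);
    p->task_count = 0;
}
```

With a limit of two, the first request creates a worker, a request made while idle_workers is non-zero creates none, and a third creation attempt is refused once both slots are used.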
    /* OSThreads.h */
    typedef ..OS specific.. Mutex;
    extern void initMutex ( Mutex* pMut );
    extern void grabMutex ( Mutex* pMut );
    extern void releaseMutex ( Mutex* pMut );

    typedef ..OS specific.. Condition;
    extern void initCondition ( Condition* pCond );
    extern void closeCondition ( Condition* pCond );
    extern rtsBool broadcastCondition ( Condition* pCond );
    extern rtsBool signalCondition ( Condition* pCond );
    extern rtsBool waitCondition ( Condition* pCond, Mutex* pMut );

    extern OSThreadId osThreadId ( void );
    extern void shutdownThread ( void );
    extern void yieldThread ( void );
    extern int createOSThread ( OSThreadId* tid, void (*startProc)(void) );
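For orientation, here is how a pthreads instance of this interface might look. The function names and signatures are taken from the listing above, but every body is an assumption made for illustration rather than the code that ships with the RTS. The rtsBool typedef, the StartInfo struct, the trampoline helper and os_threads_demo are stand-ins introduced here.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>
#include <stdlib.h>

typedef int             rtsBool;      /* stand-in for the RTS boolean type */
typedef pthread_mutex_t Mutex;        /* assumed mapping onto pthreads */
typedef pthread_cond_t  Condition;
typedef pthread_t       OSThreadId;

void initMutex(Mutex *pMut)    { pthread_mutex_init(pMut, NULL); }
void grabMutex(Mutex *pMut)    { pthread_mutex_lock(pMut); }
void releaseMutex(Mutex *pMut) { pthread_mutex_unlock(pMut); }

void initCondition(Condition *pCond)  { pthread_cond_init(pCond, NULL); }
void closeCondition(Condition *pCond) { pthread_cond_destroy(pCond); }

/* Each returns non-zero on success in this sketch. */
rtsBool broadcastCondition(Condition *pCond) { return pthread_cond_broadcast(pCond) == 0; }
rtsBool signalCondition(Condition *pCond)    { return pthread_cond_signal(pCond) == 0; }
rtsBool waitCondition(Condition *pCond, Mutex *pMut) {
    return pthread_cond_wait(pCond, pMut) == 0;
}

OSThreadId osThreadId(void) { return pthread_self(); }
void shutdownThread(void)   { pthread_exit(NULL); }
void yieldThread(void)      { sched_yield(); }

/* pthread_create expects void *(*)(void *), while the interface passes
   void (*)(void); a small heap-allocated record bridges the two. */
typedef struct { void (*proc)(void); } StartInfo;

static void *trampoline(void *arg) {
    StartInfo info = *(StartInfo *)arg;
    free(arg);
    info.proc();
    return NULL;
}

/* Returns 0 on success. The thread is left joinable here so that the
   demo below can wait for it; a real instance might detach it instead. */
int createOSThread(OSThreadId *tid, void (*startProc)(void)) {
    StartInfo *info = malloc(sizeof *info);
    if (info == NULL)
        return -1;
    info->proc = startProc;
    int rc = pthread_create(tid, NULL, trampoline, info);
    if (rc != 0)
        free(info);
    return rc;
}

/* Tiny usage example: one thread bumps a counter under the mutex. */
static Mutex demo_lock;
static int   demo_counter = 0;

static void demo_proc(void) {
    grabMutex(&demo_lock);
    demo_counter++;
    releaseMutex(&demo_lock);
}

int os_threads_demo(void) {
    OSThreadId tid;
    initMutex(&demo_lock);
    if (createOSThread(&tid, demo_proc) != 0)
        return -1;
    int rc = pthread_join(tid, NULL);
    assert(rc == 0);
    (void)rc;
    return demo_counter;
}
```

Keeping the interface this narrow is what lets the rest of the RTS stay unaware of whether pthreads or Win32 threads sit underneath.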
foreign import "bigComp" threadsafe largeComputation :: Int -> IO ()
The distinction between 'safe' and 'threadsafe' C calls is made so that we may call external functions that aren't re-entrant but may cause a GC to occur.
The threadsafe attribute subsumes safe.