- 11 Oct, 2007 1 commit
-
-
Simon Marlow authored
Previously MVars were always on the mutable list of the old generation, which meant every MVar was visited during every minor GC. With lots of MVars hanging around, this gets expensive. We addressed this problem for MUT_VARs (aka IORefs) a while ago; the solution is to use a traditional GC write barrier when the object is modified. This patch does the same thing for MVars. TVars are still done the old way; they could probably benefit from the same treatment too.
-
- 05 Sep, 2007 2 commits
-
-
Simon Marlow authored
-
chak@cse.unsw.edu.au. authored
-
- 04 Sep, 2007 1 commit
-
-
Simon Marlow authored
This applies to EnterCriticalSection and LeaveCriticalSection in the RTS
-
- 29 Aug, 2007 1 commit
-
-
Simon Marlow authored
The C-- parser was missing the "stdcall" calling convention for foreign calls. Now that it has been added, we can call {Enter,Leave}CriticalSection directly.
-
- 20 Aug, 2007 1 commit
-
-
nr@eecs.harvard.edu authored
* The correct definition of C-- requires that a procedure not 'fall off the end'. The 'never returns' annotation tells us if a (foreign) call is not going to return. Validated!
-
- 10 Aug, 2007 1 commit
-
-
Clemens Fruhwirth authored
Properly guard imports: they have to be precise on Windows, and Darwin sets __PIC__ automatically.
-
- 06 Aug, 2007 1 commit
-
-
Clemens Fruhwirth authored
-
- 27 Jul, 2007 1 commit
-
-
Simon Marlow authored
This patch implements pointer tagging as per our ICFP'07 paper "Faster laziness using dynamic pointer tagging". It improves performance by 10-15% for most workloads, including GHC itself. The original patches were by Alexey Rodriguez Yakushev <mrchebas@gmail.com>, with additions and improvements by me. I've re-recorded the development as a single patch. The basic idea is this: we use the low 2 bits of a pointer to a heap object (3 bits on a 64-bit architecture) to encode some information about the object pointed to. For a constructor, we encode the "tag" of the constructor (e.g. True vs. False), for a function closure its arity. This enables some decisions to be made without dereferencing the pointer, which speeds up some common operations. In particular it enables us to avoid costly indirect jumps in many cases. More information in the commentary: http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/HaskellExecution/PointerTagging
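The low-bit encoding described above can be sketched in plain C. This is only an illustration of the general technique, assuming word-aligned heap objects; the names `TAG_BITS`, `TAG_MASK`, `tag_pointer`, `get_tag` and `untag_pointer` are hypothetical and are not the macros used in the GHC RTS.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names only; these are not the GHC RTS macros. */
/* 2 spare bits on a 32-bit architecture, 3 on a 64-bit one. */
#define TAG_BITS ((unsigned)(sizeof(void *) == 8 ? 3 : 2))
#define TAG_MASK (((uintptr_t)1 << TAG_BITS) - 1)

/* Store a small tag in the low bits of a word-aligned pointer. */
static inline void *tag_pointer(void *p, uintptr_t tag)
{
    assert(((uintptr_t)p & TAG_MASK) == 0); /* pointer must be aligned */
    assert(tag <= TAG_MASK);                /* tag must fit in the spare bits */
    return (void *)((uintptr_t)p | tag);
}

/* Read the tag without dereferencing the pointer. */
static inline uintptr_t get_tag(const void *p)
{
    return (uintptr_t)p & TAG_MASK;
}

/* Recover the real address before dereferencing. */
static inline void *untag_pointer(void *p)
{
    return (void *)((uintptr_t)p & ~TAG_MASK);
}
```

A caller would inspect `get_tag(p)` first and only fall back to `untag_pointer(p)` plus a memory read when the tag alone does not answer the question.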
-
- 03 Jul, 2007 1 commit
-
-
Simon Marlow authored
-
- 27 Jun, 2007 1 commit
-
-
Michael D. Adams authored
-
- 26 Jun, 2007 1 commit
-
-
Simon Marlow authored
We needed to turn some inline C functions and C macros into either real C functions or C-- macros.
-
- 25 May, 2007 1 commit
-
-
Simon Marlow authored
-
- 03 May, 2007 1 commit
-
-
Simon Marlow authored
This means we can avoid some StablePtrs, and also catch cases where the AP_STACK has been evaluated (this can happen with :history, see the hist001 test).
-
- 17 Apr, 2007 1 commit
-
-
Simon Marlow authored
This is the result of Bernie Pope's internship work at MSR Cambridge, with some subsequent improvements by me. The main plan was to

(a) Reduce the overhead for breakpoints, so we could enable the feature by default without incurring a significant penalty
(b) Scatter more breakpoint sites throughout the code

Currently we can set a breakpoint on almost any subexpression, and the overhead is around 1.5x slower than normal GHCi. I hope to be able to get this down further and/or allow breakpoints to be turned off.

This patch also fixes up :print following the recent changes to constructor info tables. (most of the :print tests now pass)

We now support single-stepping, which just enables all breakpoints.

  :step <expr>  executes <expr> with single-stepping turned on
  :step         single-steps from the current breakpoint

The mechanism is quite different to the previous implementation. We share code with the HPC (haskell program coverage) implementation now. The coverage pass annotates source code with "tick" locations which are tracked by the coverage tool. In GHCi, each "tick" becomes a potential breakpoint location.

Previously breakpoints were compiled into code that magically invoked a nested instance of GHCi. Now, a breakpoint causes the current thread to block and control is returned to GHCi.

See the wiki page for more details and the current ToDo list: http://hackage.haskell.org/trac/ghc/wiki/NewGhciDebugger
-
- 16 Apr, 2007 1 commit
-
-
Simon Marlow authored
We changed the convention a while ago so that BaseReg is returned to the scheduler in R1, because BaseReg may change during the run of a thread, e.g. during a foreign call. A few places got missed, mostly for very rare events. Should fix concprog001, although I'm not able to reliably reproduce the failure.
-
- 06 Mar, 2007 2 commits
-
-
Simon Marlow authored
This primop ensures that the current computation is not being duplicated, by calling threadPaused(). The idea is to use it inside unsafePerformIO/unsafeInterleaveIO (see #986).
-
Simon Marlow authored
-
- 28 Feb, 2007 1 commit
-
-
Simon Marlow authored
We recently discovered that they aren't a win any more, and just cost code size.
-
- 27 Feb, 2007 1 commit
-
-
Simon Marlow authored
This is a simplification & minor optimisation for GHCi
-
- 09 Dec, 2006 1 commit
-
-
mnislaih authored
  - infoPtr# :: a -> Addr#
  - closurePayload# :: a -> (# Array b, ByteArr# #)

These prim ops provide the magic behind the ':print' command
-
- 28 Nov, 2006 1 commit
-
-
Ian Lynagh authored
-
- 24 Nov, 2006 1 commit
-
-
wolfgang.thaller@gmx.net authored
We can avoid using any other long long operations in PrimOps.cmm. One more step towards compiling the RTS using the NCG.
-
- 07 Oct, 2006 1 commit
-
-
tharris@microsoft.com authored
-
- 05 Sep, 2006 1 commit
-
-
Ian Lynagh authored
Fixed version of an old patch by Simon Marlow. His description read: Also, an arbitrarily short context switch interval may now be specified, as we increase the RTS ticker's resolution to match the requested context switch interval. This also applies to +RTS -i (heap profiling) and +RTS -I (the idle GC timer). +RTS -V is actually only required for increasing the resolution of the profile timer.
-
- 26 Jul, 2006 1 commit
-
-
Simon Marlow authored
-
- 29 Jun, 2006 2 commits
-
-
Simon Marlow authored
So that we can build the RTS with the NCG.
-
Simon Marlow authored
gmp.h #defines mpz_foo to __gmpz_foo, so the real ABI is __gmpz_foo, so that is what we must invoke in order to be portable here. Similarly for mpn --> __gmpn.
-
- 20 Jun, 2006 1 commit
-
-
Simon Marlow authored
-
- 07 Apr, 2006 1 commit
-
-
Simon Marlow authored
Most of the other users of the fptools build system have migrated to Cabal, and with the move to darcs we can now flatten the source tree without losing history, so here goes. The main change is that the ghc/ subdir is gone, and most of what it contained is now at the top level. The build system now makes no pretense at being multi-project, it is just the GHC build system. No doubt this will break many things, and there will be a period of instability while we fix the dependencies. A straightforward build should work, but I haven't yet fixed binary/source distributions. Changes to the Building Guide will follow, too.
-
- 27 Mar, 2006 1 commit
-
-
Simon Marlow authored
This gives some control over affinity, while we figure out the best way to automatically schedule threads to make best use of the available parallelism. In addition to the primitive, there is also:

  GHC.Conc.forkOnIO :: Int -> IO () -> IO ThreadId

where 'forkOnIO i m' creates a thread on Capability (i `rem` N), where N is the number of available Capabilities set by +RTS -N. Threads forked by forkOnIO do not automatically migrate when there are free Capabilities, as normal threads do. Still, if you're using forkOnIO exclusively, it's a good idea to use +RTS -qm to disable work pushing anyway (work pushing takes too much time when the run queues are large; this is something we need to fix).
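The Capability selection rule (i `rem` N) can be worked through with a small C helper. This is a sketch of the arithmetic only: `capability_for` is a hypothetical name, not an RTS function, and it assumes the number of Capabilities is positive.

```c
#include <assert.h>

/* Hypothetical helper, not part of the RTS: which Capability would
 * 'forkOnIO i m' pick when +RTS -N sets n_caps Capabilities? */
static inline int capability_for(int i, int n_caps)
{
    assert(n_caps > 0);  /* assumed: there is at least one Capability */
    return i % n_caps;   /* C's % truncates toward zero, like Haskell's rem */
}
```

With four Capabilities, for example, thread numbers 0 to 7 map to Capabilities 0, 1, 2, 3, 0, 1, 2, 3, so consecutive indices spread evenly.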
-
- 28 Feb, 2006 3 commits
-
-
Simon Marlow authored
This relates to the recent introduction of clean/dirty TSOs, and the consequent write barriers required. We were missing some write barriers in the takeMVar/putMVar family of primops, when performing the take/put directly on another TSO. Fixes #705, and probably some test failures.
-
Simon Marlow authored
We now have more stg_ap entry points: stg_ap_*_fast, which take arguments in registers according to the platform calling convention. This is faster if the function being called is evaluated and has the right arity, which is the common case (see the eval/apply paper for measurements). We still need the stg_ap_*_info entry points for stack-based application, such as the overflow case when a function is applied to too many arguments. The stg_ap_*_fast functions actually just check for an evaluated function, and if they don't find one, push the args on the stack and invoke stg_ap_*_info. (this might be slightly slower in some cases, but not the common case).
-
Simon Marlow authored
Fixed one incorrect case, and made several more accurate
-
- 21 Feb, 2006 1 commit
-
-
Simon Marlow authored
atomicModifyMutVar# was re-using the storage manager mutex (sm_mutex) to get its atomicity guarantee in SMP mode. But the recent addition of a call to dirty_MUT_VAR() to implement the write barrier led to a rare deadlock case, because dirty_MUT_VAR() very occasionally needs to allocate a new block to chain on the mutable list, which requires sm_mutex.
-
- 10 Feb, 2006 1 commit
-
-
Simon Marlow authored
-
- 09 Feb, 2006 1 commit
-
-
Simon Marlow authored
rather than recordMutableGen(), the former works better in SMP
-
- 07 Feb, 2006 1 commit
-
-
Simon Marlow authored
-
- 19 Jan, 2006 1 commit
-
-
sof authored
tryPutMVarzh_fast: make it work in the non-full case. Merge to STABLE.
-
- 17 Jan, 2006 1 commit
-
-
simonmar authored
Improve the GC behaviour of IORefs (see Ticket #650). This is a small change to the way IORefs interact with the GC, which should improve GC performance for programs with plenty of IORefs. Previously we had a single closure type for mutable variables, MUT_VAR. Mutable variables were *always* on the mutable list in older generations, and always traversed on every GC. Now, we have two closure types: MUT_VAR_CLEAN and MUT_VAR_DIRTY. The latter is on the mutable list, but the former is not. (NB. this differs from MUT_ARR_PTRS_CLEAN and MUT_ARR_PTRS_DIRTY, both of which are on the mutable list). writeMutVar# now implements a write barrier by calling dirty_MUT_VAR() in the runtime, which changes MUT_VAR_CLEAN into MUT_VAR_DIRTY and adds the object to the mutable list if necessary. This results in some pretty dramatic speedups for GHC itself. I've just measured a 30% overall speedup compiling a 31-module program (anna) with the default heap settings :-D
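The clean/dirty scheme can be modelled in a few lines of C. This is a simplified sketch, not the RTS code: the types `mut_var` and `mut_list`, the `old_generation` flag and the function `write_var` are all hypothetical, and treating "if necessary" as "if the variable lives in an old generation" is an assumption made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model; type and field names are illustrative, not the RTS's. */
enum var_state { VAR_CLEAN, VAR_DIRTY };

struct mut_var {
    enum var_state state;
    void *value;
    int old_generation;  /* assumed flag: 1 if the variable is in an old generation */
};

#define MUT_LIST_MAX 64

struct mut_list {
    struct mut_var *items[MUT_LIST_MAX];
    size_t count;
};

/* Write barrier: on the first write to a clean variable, mark it dirty and,
 * if it lives in an old generation, record it on the mutable list. */
static inline void write_var(struct mut_list *ml, struct mut_var *v, void *new_value)
{
    if (v->state == VAR_CLEAN) {
        v->state = VAR_DIRTY;
        if (v->old_generation) {
            assert(ml->count < MUT_LIST_MAX);  /* fixed-size list in this sketch */
            ml->items[ml->count++] = v;
        }
    }
    v->value = new_value;
}
```

Repeated writes to the same variable add it to the list only once, and a variable that is never written stays off the list, so a minor GC in this model only has to visit variables that were actually modified.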
-