- 10 Aug, 2007 1 commit
-
-
Clemens Fruhwirth authored
Properly guard imports, because they have to be precise on Windows, and Darwin sets __PIC__ automatically
-
- 06 Aug, 2007 1 commit
-
-
Clemens Fruhwirth authored
-
- 27 Jul, 2007 1 commit
-
-
Simon Marlow authored
This patch implements pointer tagging as per our ICFP'07 paper "Faster laziness using dynamic pointer tagging". It improves performance by 10-15% for most workloads, including GHC itself. The original patches were by Alexey Rodriguez Yakushev <mrchebas@gmail.com>, with additions and improvements by me. I've re-recorded the development as a single patch. The basic idea is this: we use the low 2 bits of a pointer to a heap object (3 bits on a 64-bit architecture) to encode some information about the object pointed to. For a constructor, we encode the "tag" of the constructor (e.g. True vs. False), for a function closure its arity. This enables some decisions to be made without dereferencing the pointer, which speeds up some common operations. In particular it enables us to avoid costly indirect jumps in many cases. More information in the commentary: http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/HaskellExecution/PointerTagging
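The low-bit encoding described above can be illustrated with a minimal C sketch. This is not the RTS code: the names TAG_BITS, TAG_MASK, tag_pointer, get_tag and untag_pointer are hypothetical, and the sketch assumes heap objects are aligned to the word size so that the low 2 bits (3 on a 64-bit architecture) are always zero in an untagged pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not the GHC RTS macros. */
#define TAG_BITS ((sizeof(void *) == 8) ? 3u : 2u)
#define TAG_MASK (((uintptr_t)1 << TAG_BITS) - 1)

/* Store a small tag (e.g. a constructor tag or a function arity)
   in the low bits of a word-aligned pointer. */
static void *tag_pointer(void *p, unsigned tag)
{
    assert(((uintptr_t)p & TAG_MASK) == 0); /* assumes word alignment */
    assert(tag <= TAG_MASK);                /* tag must fit in the low bits */
    return (void *)((uintptr_t)p | tag);
}

/* Read the tag without dereferencing the pointer. */
static unsigned get_tag(const void *p)
{
    return (unsigned)((uintptr_t)p & TAG_MASK);
}

/* Clear the tag bits to recover the real address before dereferencing. */
static void *untag_pointer(void *p)
{
    return (void *)((uintptr_t)p & ~TAG_MASK);
}
```

The point of the technique is that get_tag needs only the pointer value itself, so a case analysis on the tag can proceed without touching the heap object.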
-
- 03 Jul, 2007 1 commit
-
-
Simon Marlow authored
-
- 27 Jun, 2007 1 commit
-
-
Michael D. Adams authored
-
- 26 Jun, 2007 1 commit
-
-
Simon Marlow authored
We needed to turn some inline C functions and C macros into either real C functions or C-- macros.
-
- 25 May, 2007 1 commit
-
-
Simon Marlow authored
-
- 03 May, 2007 1 commit
-
-
Simon Marlow authored
This means we can avoid some StablePtrs, and also catch cases where the AP_STACK has been evaluated (this can happen with :history, see the hist001 test).
-
- 17 Apr, 2007 1 commit
-
-
Simon Marlow authored
This is the result of Bernie Pope's internship work at MSR Cambridge, with some subsequent improvements by me. The main plan was to (a) reduce the overhead for breakpoints, so we could enable the feature by default without incurring a significant penalty, and (b) scatter more breakpoint sites throughout the code. Currently we can set a breakpoint on almost any subexpression, and the overhead is around 1.5x slower than normal GHCi. I hope to be able to get this down further and/or allow breakpoints to be turned off. This patch also fixes up :print following the recent changes to constructor info tables (most of the :print tests now pass). We now support single-stepping, which just enables all breakpoints: ":step <expr>" executes <expr> with single-stepping turned on, and ":step" single-steps from the current breakpoint. The mechanism is quite different to the previous implementation. We share code with the HPC (Haskell Program Coverage) implementation now. The coverage pass annotates source code with "tick" locations which are tracked by the coverage tool. In GHCi, each "tick" becomes a potential breakpoint location. Previously breakpoints were compiled into code that magically invoked a nested instance of GHCi. Now, a breakpoint causes the current thread to block and control is returned to GHCi. See the wiki page for more details and the current ToDo list: http://hackage.haskell.org/trac/ghc/wiki/NewGhciDebugger
-
- 16 Apr, 2007 1 commit
-
-
Simon Marlow authored
We changed the convention a while ago so that BaseReg is returned to the scheduler in R1, because BaseReg may change during the run of a thread, e.g. during a foreign call. A few places got missed, mostly for very rare events. Should fix concprog001, although I'm not able to reliably reproduce the failure.
-
- 06 Mar, 2007 2 commits
-
-
Simon Marlow authored
This primop ensures that the current computation is not being duplicated, by calling threadPaused(). The idea is to use it inside unsafePerformIO/unsafeInterleaveIO (see #986).
-
Simon Marlow authored
-
- 28 Feb, 2007 1 commit
-
-
Simon Marlow authored
We recently discovered that they aren't a win any more, and just cost code size.
-
- 27 Feb, 2007 1 commit
-
-
Simon Marlow authored
This is a simplification & minor optimisation for GHCi
-
- 09 Dec, 2006 1 commit
-
-
mnislaih authored
infoPtr# :: a -> Addr#
closurePayload# :: a -> (# Array b, ByteArr# #)
These prim ops provide the magic behind the ':print' command
-
- 28 Nov, 2006 1 commit
-
-
Ian Lynagh authored
-
- 24 Nov, 2006 1 commit
-
-
wolfgang.thaller@gmx.net authored
We can avoid using any other long long operations in PrimOps.cmm. One more step towards compiling the RTS using the NCG.
-
- 07 Oct, 2006 1 commit
-
-
tharris@microsoft.com authored
-
- 05 Sep, 2006 1 commit
-
-
Ian Lynagh authored
Fixed version of an old patch by Simon Marlow. His description read: Also, an arbitrarily short context switch interval may now be specified, as we increase the RTS ticker's resolution to match the requested context switch interval. This also applies to +RTS -i (heap profiling) and +RTS -I (the idle GC timer). +RTS -V is actually only required for increasing the resolution of the profile timer.
-
- 26 Jul, 2006 1 commit
-
-
Simon Marlow authored
-
- 29 Jun, 2006 2 commits
-
-
Simon Marlow authored
So that we can build the RTS with the NCG.
-
Simon Marlow authored
gmp.h #defines mpz_foo to __gmpz_foo, so the real ABI is __gmpz_foo, so that is what we must invoke in order to be portable here. Similarly for mpn --> __gmpn.
-
- 20 Jun, 2006 1 commit
-
-
Simon Marlow authored
-
- 07 Apr, 2006 1 commit
-
-
Simon Marlow authored
Most of the other users of the fptools build system have migrated to Cabal, and with the move to darcs we can now flatten the source tree without losing history, so here goes. The main change is that the ghc/ subdir is gone, and most of what it contained is now at the top level. The build system now makes no pretense at being multi-project, it is just the GHC build system. No doubt this will break many things, and there will be a period of instability while we fix the dependencies. A straightforward build should work, but I haven't yet fixed binary/source distributions. Changes to the Building Guide will follow, too.
-
- 27 Mar, 2006 1 commit
-
-
Simon Marlow authored
This gives some control over affinity, while we figure out the best way to automatically schedule threads to make best use of the available parallelism. In addition to the primitive, there is also: GHC.Conc.forkOnIO :: Int -> IO () -> IO ThreadId where 'forkOnIO i m' creates a thread on Capability (i `rem` N), where N is the number of available Capabilities set by +RTS -N. Threads forked by forkOnIO do not automatically migrate when there are free Capabilities, like normal threads do. Still, if you're using forkOnIO exclusively, it's a good idea to do +RTS -qm to disable work pushing anyway (work pushing takes too much time when the run queues are large, this is something we need to fix).
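The placement rule quoted above is simple enough to write down. A minimal sketch in C, assuming a non-negative thread hint; the function name capability_for is hypothetical and is not part of the RTS. C's % operator truncates toward zero, as Haskell's `rem` does.

```c
#include <assert.h>

/* Hypothetical helper: map the hint i passed to forkOnIO onto one of
   n_caps Capabilities, following the (i `rem` N) rule in the message. */
static int capability_for(int i, int n_caps)
{
    assert(n_caps > 0); /* N comes from +RTS -N and is at least 1 */
    assert(i >= 0);     /* assumption: this sketch ignores negative hints */
    return i % n_caps;
}
```

With N = 4, hints 0 to 3 land on distinct Capabilities, and hint 5 wraps around to Capability 1.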
-
- 28 Feb, 2006 3 commits
-
-
Simon Marlow authored
This relates to the recent introduction of clean/dirty TSOs, and the consequent write barriers required. We were missing some write barriers in the takeMVar/putMVar family of primops, when performing the take/put directly on another TSO. Fixes #705, and probably some test failures.
-
Simon Marlow authored
We now have more stg_ap entry points: stg_ap_*_fast, which take arguments in registers according to the platform calling convention. This is faster if the function being called is evaluated and has the right arity, which is the common case (see the eval/apply paper for measurements). We still need the stg_ap_*_info entry points for stack-based application, such as an overflow when a function is applied to too many arguments. The stg_ap_*_fast functions actually just check for an evaluated function, and if they don't find one, push the args on the stack and invoke stg_ap_*_info. (This might be slightly slower in some cases, but not the common case.)
-
Simon Marlow authored
Fix one incorrect case, and make several more accurate
-
- 21 Feb, 2006 1 commit
-
-
Simon Marlow authored
atomicModifyMutVar# was re-using the storage manager mutex (sm_mutex) to get its atomicity guarantee in SMP mode. But recently the addition of a call to dirty_MUT_VAR() to implement the write barrier led to a rare deadlock case, because dirty_MUT_VAR() very occasionally needs to allocate a new block to chain on the mutable list, which requires sm_mutex.
-
- 10 Feb, 2006 1 commit
-
-
Simon Marlow authored
-
- 09 Feb, 2006 1 commit
-
-
Simon Marlow authored
rather than recordMutableGen(), the former works better in SMP
-
- 07 Feb, 2006 1 commit
-
-
Simon Marlow authored
-
- 19 Jan, 2006 1 commit
-
-
sof authored
tryPutMVarzh_fast: make it work in the non-full case. Merge to STABLE.
-
- 17 Jan, 2006 2 commits
-
-
simonmar authored
Improve the GC behaviour of IORefs (see Ticket #650). This is a small change to the way IORefs interact with the GC, which should improve GC performance for programs with plenty of IORefs. Previously we had a single closure type for mutable variables, MUT_VAR. Mutable variables were *always* on the mutable list in older generations, and always traversed on every GC. Now, we have two closure types: MUT_VAR_CLEAN and MUT_VAR_DIRTY. The latter is on the mutable list, but the former is not. (NB. this differs from MUT_ARR_PTRS_CLEAN and MUT_ARR_PTRS_DIRTY, both of which are on the mutable list). writeMutVar# now implements a write barrier, by calling dirty_MUT_VAR() in the runtime, which does the necessary modification of MUT_VAR_CLEAN into MUT_VAR_DIRTY, and adds it to the mutable list if necessary. This results in some pretty dramatic speedups for GHC itself. I've just measured a 30% overall speedup compiling a 31-module program (anna) with the default heap settings :-D
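The clean/dirty scheme can be sketched in C as follows. This is a simplified model under stated assumptions, not the RTS implementation: the struct layouts, the old_gen flag, the linked mutable list, and the lower-case function names dirty_mut_var and write_mut_var are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* The two closure types named in the commit message. */
enum closure_type { MUT_VAR_CLEAN, MUT_VAR_DIRTY };

/* Hypothetical, simplified mutable variable. */
struct mut_var {
    enum closure_type type;
    void *value;
    int old_gen;              /* assumption: nonzero if in an older generation */
    struct mut_var *mut_link; /* next entry on the mutable list */
};

/* Hypothetical mutable list: the variables the GC must traverse. */
struct mut_list {
    struct mut_var *head;
    size_t length;
};

/* Write barrier: flip CLEAN to DIRTY, and record the variable on the
   mutable list only on that first transition. */
static void dirty_mut_var(struct mut_list *ml, struct mut_var *mv)
{
    if (mv->type == MUT_VAR_CLEAN) {
        mv->type = MUT_VAR_DIRTY;
        if (mv->old_gen) {
            mv->mut_link = ml->head;
            ml->head = mv;
            ml->length++;
        }
    }
}

/* A write runs the barrier first, then stores the new value. */
static void write_mut_var(struct mut_list *ml, struct mut_var *mv, void *v)
{
    dirty_mut_var(ml, mv);
    mv->value = v;
}
```

Because only the first write after a collection adds the variable to the list, variables that are never written stay off the list and are not traversed on every GC.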
-
simonmar authored
Improve the GC behaviour of IOArrays/STArrays (see Ticket #650). This is a small change to the way mutable arrays interact with the GC, which can have a dramatic effect on performance, and make tricks with unsafeThaw/unsafeFreeze redundant. Data.HashTable should be faster now (I haven't measured it yet). We now have two mutable array closure types, MUT_ARR_PTRS_CLEAN and MUT_ARR_PTRS_DIRTY. Both are on the mutable list if the array is in an old generation. writeArray# sets the type to MUT_ARR_PTRS_DIRTY. The garbage collector can set the type to MUT_ARR_PTRS_CLEAN if it finds that no element of the array points into a younger generation (discovering this required a small addition to evacuate(), but rough tests indicate that it doesn't measurably affect performance). NOTE: none of this affects unboxed arrays (IOUArray/STUArray), only boxed arrays (IOArray/STArray). We could go further and extend the DIRTY bit to be per-block rather than for the whole array, but for now this is an easy improvement.
-
- 13 Dec, 2005 1 commit
-
-
simonmar authored
Raise the (new) exception NestedAtomically when atomically is nested (using unsafePerformIO). This is a small improvement over crashing.
-
- 28 Nov, 2005 1 commit
-
-
simonmar authored
Small performance improvement to STM: reduce the size of an atomically frame from 3 words to 2 words by combining the "waiting" boolean field with the info pointer, i.e. having two separate info tables/return addresses for an atomically frame, one for the normal case and one for the waiting case.
-
- 21 Nov, 2005 1 commit
-
-
tharris authored
Re-use temporary storage in the STM implementation
-
- 10 Nov, 2005 1 commit
-
-
simonmar authored
Fix a crash in STM; we were releasing ownership of the transaction too early in stmWait(), so a TSO could be woken up before we had finished putting it to sleep properly.
-
- 07 Nov, 2005 1 commit
-
-
simonmar authored
Fix some problems with array thawing/freezing and the GC.
-