This project is mirrored from https://gitlab.haskell.org/ghc/ghc.git.
Pull mirroring failed .
Repository mirroring has been paused due to too many failed attempts. It can be resumed by a project maintainer.
Last successful update .
Repository mirroring has been paused due to too many failed attempts. It can be resumed by a project maintainer.
Last successful update .
- 02 Jun, 2009 2 commits
-
-
Ian Lynagh authored
-
Simon Marlow authored
-
- 06 Mar, 2009 1 commit
-
-
Simon Marlow authored
- add newAlignedPinnedByteArray# for allocating pinned BAs with arbitrary alignment - the old newPinnedByteArray# now aligns to 16 bytes Foreign.alloca will use newAlignedPinnedByteArray#, and so might end up wasting less space than before (we used to align to 8 by default). Foreign.allocaBytes and Foreign.mallocForeignPtrBytes will get 16-byte aligned memory, which is enough to avoid problems with SSE instructions on x86, for example. There was a bug in the old newPinnedByteArray#: it aligned to 8 bytes, but would have failed if the header was not a multiple of 8 (fortunately it always was, even with profiling). Also we occasionally wasted some space unnecessarily due to alignment in allocatePinned(). I haven't done anything about Foreign.malloc/mallocBytes, which will give you the same alignment guarantees as malloc() (8 bytes on Linux/x86 here).
-
- 27 Jan, 2009 1 commit
-
-
Samuel Bronson authored
In this version, I untag R1 before using it, and even enter R2 at the end rather than simply returning it (which didn't work right when R2 was a thunk).
-
- 10 Dec, 2008 1 commit
-
-
Simon Marlow authored
Patch originally by Ivan Tomac <tomac@pacific.net.au>, amended by Simon Marlow: - mkWeakFinalizer# commoned up with mkWeakFinalizerEnv# - GC parameters to ALLOC_PRIM fixed
-
- 18 Nov, 2008 1 commit
-
-
Simon Marlow authored
Eager blackholing can improve parallel performance by reducing the chances that two threads perform the same computation. However, it has a cost: one extra memory write per thunk entry. To get the best results, any code which may be executed in parallel should be compiled with eager blackholing turned on. But since there's a cost for sequential code, we make it optional and turn it on for the parallel package only. It might be a good idea to compile applications (or modules) with parallel code in with -feager-blackholing. ToDo: document -feager-blackholing.
-
- 06 Nov, 2008 1 commit
-
-
Simon Marlow authored
Signficantly reduces the overhead for par, which means that we can make use of paralellism at a much finer granularity.
-
- 10 Oct, 2008 1 commit
-
-
Simon Marlow authored
-
- 19 Sep, 2008 1 commit
-
-
Simon Marlow authored
Fixes a long-standing bug that could in some cases cause sub-optimal scheduling behaviour.
-
- 10 Jul, 2008 1 commit
-
-
Simon Marlow authored
-
- 09 Jul, 2008 1 commit
-
-
Simon Marlow authored
-
- 04 Jun, 2008 1 commit
-
-
Simon Marlow authored
-
- 16 Apr, 2008 1 commit
-
-
Ian Lynagh authored
-
- 17 Apr, 2008 1 commit
-
-
Ian Lynagh authored
-
- 09 Apr, 2008 2 commits
-
-
Simon Marlow authored
-
Simon Marlow authored
-
- 07 Apr, 2008 1 commit
-
-
Simon Marlow authored
-
- 02 Apr, 2008 1 commit
-
-
Simon Marlow authored
This has several advantages: - -fvia-C is consistent with -fasm with respect to FFI declarations: both bind to the ABI, not the API. - foreign calls can now be inlined freely across module boundaries, since a header file is not required when compiling the call. - bootstrapping via C will be more reliable, because this difference in behavour between the two backends has been removed. There is one disadvantage: - we get no checking by the C compiler that the FFI declaration is correct. So now, the c-includes field in a .cabal file is always ignored by GHC, as are header files specified in an FFI declaration. This was previously the case only for -fasm compilations, now it is also the case for -fvia-C too.
-
- 11 Oct, 2007 1 commit
-
-
Simon Marlow authored
Previously MVars were always on the mutable list of the old generation, which meant every MVar was visited during every minor GC. With lots of MVars hanging around, this gets expensive. We addressed this problem for MUT_VARs (aka IORefs) a while ago, the solution is to use a traditional GC write-barrier when the object is modified. This patch does the same thing for MVars. TVars are still done the old way, they could probably benefit from the same treatment too.
-
- 10 Oct, 2007 1 commit
-
-
Simon Marlow authored
The extra safe points introduced for breakpoints were previously compiled as normal updatable thunks, but they are guaranteed single-entry, so we can use non-updatable thunks here. This restores the tail-call property where it was lost in some cases (although stack squeezing probably often recovered it), and should improve performance.
-
- 16 May, 2007 1 commit
-
-
Simon Marlow authored
-
- 15 May, 2007 1 commit
-
-
Simon Marlow authored
When -fbreak-on-exception is set, an exception will cause GHCi to suspend the current computation and return to the prompt, where the history of the current evaluation can be inspected (if we are in :trace). This isn't on by default, because the behaviour could be confusing: for example, ^C will cause a breakpoint. It can be very useful for finding the cause of a "head []" or a "fromJust Nothing", though.
-
- 17 Apr, 2007 1 commit
-
-
Simon Marlow authored
This is the result of Bernie Pope's internship work at MSR Cambridge, with some subsequent improvements by me. The main plan was to (a) Reduce the overhead for breakpoints, so we could enable the feature by default without incurrent a significant penalty (b) Scatter more breakpoint sites throughout the code Currently we can set a breakpoint on almost any subexpression, and the overhead is around 1.5x slower than normal GHCi. I hope to be able to get this down further and/or allow breakpoints to be turned off. This patch also fixes up :print following the recent changes to constructor info tables. (most of the :print tests now pass) We now support single-stepping, which just enables all breakpoints. :step <expr> executes <expr> with single-stepping turned on :step single-steps from the current breakpoint The mechanism is quite different to the previous implementation. We share code with the HPC (haskell program coverage) implementation now. The coverage pass annotates source code with "tick" locations which are tracked by the coverage tool. In GHCi, each "tick" becomes a potential breakpoint location. Previously breakpoints were compiled into code that magically invoked a nested instance of GHCi. Now, a breakpoint causes the current thread to block and control is returned to GHCi. See the wiki page for more details and the current ToDo list: http://hackage.haskell.org/trac/ghc/wiki/NewGhciDebugger
-
- 16 Apr, 2007 1 commit
-
-
Simon Marlow authored
We changed the convention a while ago so that BaseReg is returned to the scheduler in R1, because BaseReg may change during the run of a thread, e.g. during a foreign call. A few places got missed, mostly for very rare events. Should fix concprog001, although I'm not able to reliably reproduce the failure.
-
- 07 Mar, 2007 1 commit
-
-
Simon Marlow authored
-
- 28 Feb, 2007 1 commit
-
-
Simon Marlow authored
We recently discovered that they aren't a win any more, and just cost code size.
-
- 09 Dec, 2006 1 commit
-
-
mnislaih authored
- infoPtr# :: a -> Addr# - closurePayload# :: a -> (# Array b, ByteArr# #) These prim ops provide the magic behind the ':print' command
-
- 07 Oct, 2006 1 commit
-
-
tharris@microsoft.com authored
-
- 20 Jun, 2006 1 commit
-
-
Simon Marlow authored
-
- 16 Jun, 2006 1 commit
-
-
Simon Marlow authored
This patch makes throwTo work with -threaded, and also refactors large parts of the concurrency support in the RTS to clean things up. We have some new files: RaiseAsync.{c,h} asynchronous exception support Threads.{c,h} general threading-related utils Some of the contents of these new files used to be in Schedule.c, which is smaller and cleaner as a result of the split. Asynchronous exception support in the presence of multiple running Haskell threads is rather tricky. In fact, to my annoyance there are still one or two bugs to track down, but the majority of the tests run now.
-
- 07 Apr, 2006 1 commit
-
-
Simon Marlow authored
Most of the other users of the fptools build system have migrated to Cabal, and with the move to darcs we can now flatten the source tree without losing history, so here goes. The main change is that the ghc/ subdir is gone, and most of what it contained is now at the top level. The build system now makes no pretense at being multi-project, it is just the GHC build system. No doubt this will break many things, and there will be a period of instability while we fix the dependencies. A straightforward build should work, but I haven't yet fixed binary/source distributions. Changes to the Building Guide will follow, too.
-
- 27 Mar, 2006 1 commit
-
-
Simon Marlow authored
This gives some control over affinity, while we figure out the best way to automatically schedule threads to make best use of the available parallelism. In addition to the primitive, there is also: GHC.Conc.forkOnIO :: Int -> IO () -> IO ThreadId where 'forkOnIO i m' creates a thread on Capability (i `rem` N), where N is the number of available Capabilities set by +RTS -N. Threads forked by forkOnIO do not automatically migrate when there are free Capabilities, like normal threads do. Still, if you're using forkOnIO exclusively, it's a good idea to do +RTS -qm to disable work pushing anyway (work pushing takes too much time when the run queues are large, this is something we need to fix).
-
- 14 Mar, 2006 1 commit
-
-
Simon Marlow authored
This is just an assertion, in effect: we should never enter a PAP, but for convenience we previously attached the PAP apply code to the PAP info table. The problem with this was that it makes it harder to track down bugs that result in entering a PAP...
-
- 28 Feb, 2006 1 commit
-
-
Simon Marlow authored
We now have more stg_ap entry points: stg_ap_*_fast, which take arguments in registers according to the platform calling convention. This is faster if the function being called is evaluated and has the right arity, which is the common case (see the eval/apply paper for measurements). We still need the stg_ap_*_info entry points for stack-based application, such as an overflows when a function is applied to too many argumnets. The stg_ap_*_fast functions actually just check for an evaluated function, and if they don't find one, push the args on the stack and invoke stg_ap_*_info. (this might be slightly slower in some cases, but not the common case).
-
- 17 Jan, 2006 2 commits
-
-
simonmar authored
Improve the GC behaviour of IORefs (see Ticket #650). This is a small change to the way IORefs interact with the GC, which should improve GC performance for programs with plenty of IORefs. Previously we had a single closure type for mutable variables, MUT_VAR. Mutable variables were *always* on the mutable list in older generations, and always traversed on every GC. Now, we have two closure types: MUT_VAR_CLEAN and MUT_VAR_DIRTY. The latter is on the mutable list, but the former is not. (NB. this differs from MUT_ARR_PTRS_CLEAN and MUT_ARR_PTRS_DIRTY, both of which are on the mutable list). writeMutVar# now implements a write barrier, by calling dirty_MUT_VAR() in the runtime, that does the necessary modification of MUT_VAR_CLEAN into MUT_VAR_DIRY, and adding to the mutable list if necessary. This results in some pretty dramatic speedups for GHC itself. I've just measureed a 30% overall speedup compiling a 31-module program (anna) with the default heap settings :-D
-
simonmar authored
Improve the GC behaviour of IOArrays/STArrays See Ticket #650 This is a small change to the way mutable arrays interact with the GC, that can have a dramatic effect on performance, and make tricks with unsafeThaw/unsafeFreeze redundant. Data.HashTable should be faster now (I haven't measured it yet). We now have two mutable array closure types, MUT_ARR_PTRS_CLEAN and MUT_ARR_PTRS_DIRTY. Both are on the mutable list if the array is in an old generation. writeArray# sets the type to MUT_ARR_PTRS_DIRTY. The garbage collector can set the type to MUT_ARR_PTRS_CLEAN if it finds that no element of the array points into a younger generation (discovering this required a small addition to evacuate(), but rough tests indicate that it doesn't measurably affect performance). NOTE: none of this affects unboxed arrays (IOUArray/STUArray), only boxed arrays (IOArray/STArray). We could go further and extend the DIRTY bit to be per-block rather than for the whole array, but for now this is an easy improvement.
-
- 28 Nov, 2005 1 commit
-
-
simonmar authored
Small performance improvement to STM: reduce the size of an atomically frame from 3 words to 2 words by combining the "waiting" boolean field with the info pointer, i.e. having two separate info tables/return addresses for an atomically frame, one for the normal case and one for the waiitng case.
-
- 18 Nov, 2005 1 commit
-
-
simonmar authored
Two improvements to the SMP runtime: - support for 'par', aka sparks. Load balancing is very primitive right now, but I have seen programs that go faster using par. - support for backing off when a thread is found to be duplicating a computation currently underway in another thread. This also fixes some instability in SMP, because it turned out that when an update frame points to an indirection, which can happen if a thunk is under evaluation in multiple threads, then after GC has shorted out the indirection the update will trash the value. Now we suspend the duplicate computation to the heap before this can happen. Additionally: - stack squeezing is separate from lazy blackholing, and now only happens if there's a reasonable amount of squeezing to be done in relation to the number of words of stack that have to be moved. This means we won't try to shift 10Mb of stack just to save 2 words at the bottom (it probably never happened, but still). - update frames are now marked when they have been visited by lazy blackholing, as per the SMP paper. - cleaned up raiseAsync() a bit.
-
- 10 Nov, 2005 1 commit
-
-
simonmar authored
Fix a crash in STM; we were releasing ownership of the transaction too early in stmWait(), so a TSO could be woken up before we had finished putting it to sleep properly.
-
- 13 Oct, 2005 1 commit
-
-
sof authored
add protos for HeapStackCheck.cmm:stg_block_blackhole_* entry points
-