- 18 Mar, 2009 1 commit
-
-
Simon Marlow authored
-
- 17 Mar, 2009 3 commits
-
-
Simon Marlow authored
Generate binary log files from the RTS containing a log of runtime events with timestamps. The log file can be visualised in various ways, for investigating runtime behaviour and debugging performance problems. See for example the forthcoming ThreadScope viewer.

New GHC option:

    -eventlog (link-time option)
        Enables event logging.

    +RTS -l (runtime option)
        Generates <prog>.eventlog with the binary event information.

This replaces some of the tracing machinery we already had in the RTS: e.g. +RTS -vg for GC tracing (we should do this using the new event logging instead).

Event logging has almost no runtime cost when it isn't enabled, though in the future we might add more fine-grained events and this might change; hence having a link-time option and compiling a separate version of the RTS for event logging. There's a small runtime cost for enabling event logging, but for most programs it shouldn't make much difference.

(Todo: docs)
-
Simon Marlow authored
Since we introduced pointer tagging, we no longer always enter a closure to evaluate it. However, the biographical profiler relies on closures being entered in order to mark them as "used", so we were getting spurious amounts of data attributed to VOID. It turns out there are various places that need to be fixed, and I think at least one of them was also wrong before pointer tagging (CgCon.cgReturnDataCon).
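The problem described above can be illustrated with a minimal sketch. It is not the RTS code: the closure layout, the 3-bit tag mask, and the names `closure_t`, `get_tag`, `untag` and `evaluate` are all assumptions made for illustration. The point is only that a tagged pointer lets evaluation skip the entry step, so any "used" marking that lives in the entry code must also happen on the tagged path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the low 3 bits of an aligned closure pointer are free for a tag. */
#define TAG_MASK ((uintptr_t)7)

/* Hypothetical closure holding only the fields this sketch needs. */
typedef struct {
    _Alignas(8) int entered; /* how many times the closure was entered */
    bool used;               /* hypothetical "used" mark for the profiler */
} closure_t;

static uintptr_t get_tag(uintptr_t p) { return p & TAG_MASK; }
static closure_t *untag(uintptr_t p) { return (closure_t *)(p & ~TAG_MASK); }

/* Evaluate a possibly tagged pointer and return the untagged closure. */
static closure_t *evaluate(uintptr_t p)
{
    closure_t *c = untag(p);
    if (get_tag(p) != 0) {
        /* Tagged: already evaluated, so the closure is not entered.
           The "used" mark therefore has to be set on this path too. */
        c->used = true;
        return c;
    }
    /* Untagged: enter the closure; entry marks it as used. */
    c->entered++;
    c->used = true;
    return c;
}
```

If the `c->used = true` line on the tagged path were missing, a closure reached only through tagged pointers would never be marked, which is the kind of misattribution the commit message describes.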
-
Simon Marlow authored
Somebody needs to implement getNumberOfProcessors() for MacOS X; currently it will return 1.
-
- 16 Mar, 2009 3 commits
-
-
Simon Marlow authored
Also remove some unused cruft
-
Simon Marlow authored
Fixes heapprof001(prof_hp) after fix for #2917
-
Simon Marlow authored
Fixes heapprof001(prof_hp) following the recent HpLim patch, which depended on the lack of slop in the heap.
-
- 13 Mar, 2009 2 commits
-
-
Simon Marlow authored
-
Simon Marlow authored
-
- 17 Feb, 2009 1 commit
-
-
Simon Marlow authored
-
- 16 Mar, 2009 4 commits
-
-
simonpj@microsoft.com authored
-
simonpj@microsoft.com authored
-
simonpj@microsoft.com authored
-
simonpj@microsoft.com authored
We weren't checking that a 'data/type instance' was extending a family type constructor. Merge to 6.10 if we ever release 6.10.3 (or do it for 6.10.2).
-
- 15 Mar, 2009 1 commit
-
-
chak@cse.unsw.edu.au authored
  - During finalisation we used to occasionally swivel variable-variable equalities
  - Now, normalisation ensures that they are always oriented as appropriate for instantiation.
  - Also fixed #1899 properly; the previous fix fixed a symptom, not the cause.
-
- 13 Mar, 2009 6 commits
-
-
Simon Marlow authored
New flag: "+RTS -qb" disables load-balancing in the parallel GC (though this is subject to change; I think we will probably want to do something more automatic before releasing this). To get the "PARGC3" configuration described in the "Runtime support for Multicore Haskell" paper, use "+RTS -qg0 -qb -RTS". The main advantage of this is that it allows us to easily disable load-balancing altogether, which turns out to be important in parallel programs. Maintaining locality is sometimes more important than spreading the work out in parallel GC. There is a side benefit in that the parallel GC should have improved locality even when load-balancing, because each processor prefers to take work from its own queue before stealing from others.
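The "own queue first, steal only if load-balancing is on" policy can be sketched as follows. This is a single-threaded illustration under stated assumptions, not the parallel GC itself: `work_queue`, `pop` and `find_work` are hypothetical names, work items are plain integers, and a real implementation would need a concurrent work-stealing queue.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_WORK 16

/* Hypothetical per-thread work queue; items stand in for blocks to scan. */
typedef struct {
    int items[MAX_WORK];
    size_t count;
} work_queue;

/* Take the most recently added item from a queue, if there is one. */
static bool pop(work_queue *q, int *out)
{
    if (q->count == 0) return false;
    *out = q->items[--q->count];
    return true;
}

/* Prefer the caller's own queue; look at other queues only when
   load balancing is enabled (the sketch's analogue of omitting -qb). */
static bool find_work(work_queue *queues, size_t n, size_t self,
                      bool load_balance, int *out)
{
    if (pop(&queues[self], out)) return true;
    if (!load_balance) return false;
    for (size_t i = 0; i < n; i++) {
        if (i != self && pop(&queues[i], out)) return true;
    }
    return false;
}
```

With `load_balance` set to false, a thread with an empty queue simply finds no work, so data stays with the thread that produced it; with it set to true, stealing happens only after the local queue is exhausted.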
-
simonpj@microsoft.com authored
This patch generates code in deriving(Data) for dataCast1 or 2 as appropriate. While I was there I did some refactoring (of course), pulling out the TcDeriv.inferConstraints as a separate function. I don't think it's worth merging this to 6.10.2, even though it's a bugfix, because it modifies code that I added in the HEAD only (for deriving Functor) so the merge will be slightly awkward. And there's an easy workaround.
-
simonpj@microsoft.com authored
-
Simon Marlow authored
-
Simon Marlow authored
-
Simon Marlow authored
-
- 12 Mar, 2009 1 commit
-
-
Simon Marlow authored
-
- 13 Mar, 2009 1 commit
-
-
Simon Marlow authored
This reduces the latency between a context-switch being triggered and the thread returning to the scheduler, which in turn should reduce the cost of the GC barrier when there are many cores. We still retain the old context_switch flag which is checked at the end of each block of allocation. The idea is that setting HpLim may fail if the target thread is modifying HpLim at the same time; the context_switch flag is a fallback. It also allows us to "context switch soon" without forcing an immediate switch, which can be costly.
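A rough sketch of the mechanism, assuming a simplified capability with a heap pointer, a heap limit and a flag. The names `capability_t`, `request_context_switch`, `try_allocate` and `switch_requested_at_block_end` are hypothetical, and the sketch does not model the race on HpLim mentioned above; it only shows why zeroing the limit makes the very next heap check fail, while the flag is seen later at a block boundary.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified per-thread allocation state. */
typedef struct {
    uintptr_t hp;               /* next free address in the current block */
    atomic_uintptr_t hp_lim;    /* end of the current block */
    atomic_bool context_switch; /* fallback flag, checked per block */
} capability_t;

/* Called from another thread (or the timer) to ask for a switch. */
static void request_context_switch(capability_t *cap)
{
    atomic_store(&cap->context_switch, true);
    /* Zeroing the limit makes the next heap check fail immediately. */
    atomic_store(&cap->hp_lim, (uintptr_t)0);
}

/* Heap check: returns false when the thread must return to the scheduler. */
static bool try_allocate(capability_t *cap, size_t bytes)
{
    if (cap->hp + bytes > atomic_load(&cap->hp_lim)) return false;
    cap->hp += bytes;
    return true;
}

/* Fallback check, made when a block of allocation is exhausted. */
static bool switch_requested_at_block_end(capability_t *cap)
{
    return atomic_load(&cap->context_switch);
}
```

Setting only the flag gives the "context switch soon" behaviour: the thread keeps allocating until its current block runs out. Zeroing the limit as well is what shortens the latency.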
-
- 12 Mar, 2009 1 commit
-
-
Simon Marlow authored
I ended up rewriting this horrible bit of code, using (yikes) lazy I/O to slurp in the source file a chunk at a time. The old code tried to read the file a chunk at a time, but failed with LANGUAGE pragmas because the parser for LANGUAGE has state and the state wasn't being saved between chunks. We're still closing the Handle eagerly, so there shouldn't be any problems here.
-
- 11 Mar, 2009 6 commits
-
-
Simon Marlow authored
-
Ian Lynagh authored
-
Simon Marlow authored
This is just a hack: since we don't have correct Unicode output for Handles in general, I just fixed a couple of places where we were not converting to UTF-8 for output.
-
Simon Marlow authored
Evidently I misread the docs for CreateEvent: if you pass a name to CreateEvent, then it creates a single shared system-wide Event with that name. So all Haskell processes on the machine were sharing the same Event object. duh.
-
Simon Marlow authored
Now ghc --info reports whether -split-objs is supported, rather than whether the libraries were built using -split-objs.
-
Simon Marlow authored
-
- 10 Mar, 2009 1 commit
-
-
Ian Lynagh authored
-
- 09 Mar, 2009 2 commits
-
-
Simon Marlow authored
A long-running GC would cause the timer signal to declare the system to be idle, which would cause a major GC immediately following the current GC. This only happened with +RTS -N2 or greater.
-
Simon Marlow authored
After much experimentation, I've found a formulation for HEAP_ALLOCED that (a) improves performance, and (b) doesn't have any race conditions when used concurrently. GC performance on x86_64 should be improved slightly. See extensive comments in MBlock.h for the details.
-
- 06 Mar, 2009 1 commit
-
-
Simon Marlow authored
  - add newAlignedPinnedByteArray# for allocating pinned BAs with arbitrary alignment
  - the old newPinnedByteArray# now aligns to 16 bytes

Foreign.alloca will use newAlignedPinnedByteArray#, and so might end up wasting less space than before (we used to align to 8 by default). Foreign.allocaBytes and Foreign.mallocForeignPtrBytes will get 16-byte aligned memory, which is enough to avoid problems with SSE instructions on x86, for example.

There was a bug in the old newPinnedByteArray#: it aligned to 8 bytes, but would have failed if the header was not a multiple of 8 (fortunately it always was, even with profiling). Also we occasionally wasted some space unnecessarily due to alignment in allocatePinned().

I haven't done anything about Foreign.malloc/mallocBytes, which will give you the same alignment guarantees as malloc() (8 bytes on Linux/x86 here).
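The header-size issue mentioned above comes down to which address gets aligned. A minimal sketch, assuming a power-of-two alignment and a hypothetical helper `payload_padding` (not an RTS function): the padding is computed from the payload address, base plus header, so the result stays correct even when the header size is not a multiple of the alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes of padding to insert before the header so that the payload,
   which starts `header` bytes after the object base, lands on an
   `align`-byte boundary. Assumption: `align` is a power of two. */
static size_t payload_padding(uintptr_t base, size_t header, size_t align)
{
    uintptr_t payload = base + header;
    return (size_t)((align - (payload & (align - 1))) & (align - 1));
}
```

For example, with a base of 0x1000 and an 8-byte header, a 16-byte alignment needs 8 bytes of padding; aligning the base alone would leave the payload at 0x1008, which is only 8-byte aligned.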
-
- 08 Mar, 2009 1 commit
-
-
Ian Lynagh authored
This removes a burden from developers, and I can't remember an occasion where it would have caught a regression.
-
- 06 Mar, 2009 1 commit
-
-
Ian Lynagh authored
It's used by ESC/Haskell.
-
- 07 Mar, 2009 2 commits
-
-
rl@cse.unsw.edu.au authored
-
rl@cse.unsw.edu.au authored
-
- 06 Mar, 2009 2 commits
-
-
Ian Lynagh authored
-
rl@cse.unsw.edu.au authored
-