This project is mirrored from https://gitlab.haskell.org/ghc/ghc.git.
- 13 May, 2005 1 commit
-
-
simonmar authored
gcc 4.0.0 fix: don't declare static static_objects as extern
-
- 12 May, 2005 1 commit
-
-
simonmar authored
Declare checkNurserySanity()
-
- 10 May, 2005 1 commit
-
-
simonmar authored
Two SMP-related changes:
  - New storage manager interface: bdescr *allocateLocal(StgRegTable *reg, nat words), which allocates from the current thread's nursery (being careful not to clash with the heap pointer). It can do this without taking any locks; the lock only has to be taken if a block needs to be allocated. allocateLocal() is now used instead of allocate() in a few PrimOps. This removes locks from most Integer operations, cutting down the overhead for SMP a bit more. To make this work, we have to be able to grab the current thread's Capability out of thin air (i.e. when called from GMP), so the Capability subsystem needs to keep a hash from thread IDs to Capabilities.
  - Small MVar optimisation: instead of taking the global storage-manager lock, do our own locking of MVars with a bit of inline assembly (x86 only for now).
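The lock-free fast path described above can be illustrated with a minimal sketch. It is not the RTS code: the types Block and Nursery, the functions allocate_local(), grab_block() and lock_count(), and the sizes BLOCK_WORDS and POOL_BLOCKS are all hypothetical names chosen for illustration, and the thread-ID-to-Capability hash is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

/* Hypothetical, simplified model; none of these names come from the RTS. */
#define BLOCK_WORDS 512   /* assumed block size in words */
#define POOL_BLOCKS 8     /* assumed size of the shared block pool */

typedef struct Block {
    size_t words[BLOCK_WORDS];
    size_t used;                 /* words handed out so far */
} Block;

typedef struct Nursery {
    Block *current;              /* block this thread bump-allocates from */
} Nursery;

static Block pool[POOL_BLOCKS];
static size_t pool_next = 0;
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long lock_acquisitions = 0;

/* Slow path: the shared pool is only touched while holding the lock. */
static Block *grab_block(void) {
    Block *b = NULL;
    pthread_mutex_lock(&pool_lock);
    lock_acquisitions++;
    if (pool_next < POOL_BLOCKS) {
        b = &pool[pool_next++];
        b->used = 0;
    }
    pthread_mutex_unlock(&pool_lock);
    return b;
}

/* Fast path: bump-allocate from the caller's own nursery without locking.
 * The lock is taken only when the current block cannot satisfy the request. */
size_t *allocate_local(Nursery *n, size_t nwords) {
    if (nwords > BLOCK_WORDS) return NULL;      /* large objects not modelled */
    if (n->current == NULL || n->current->used + nwords > BLOCK_WORDS) {
        Block *b = grab_block();
        if (b == NULL) return NULL;             /* pool exhausted */
        n->current = b;
    }
    size_t *p = &n->current->words[n->current->used];
    n->current->used += nwords;
    return p;
}

/* Number of times the pool lock has been taken so far. */
unsigned long lock_count(void) { return lock_acquisitions; }
```

In this sketch, two small allocations from the same nursery share one block and therefore cost a single lock acquisition; only a request that overflows the block goes back to the pool.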
-
- 27 Apr, 2005 2 commits
-
-
simonmar authored
When using -H<size> in SMP mode, divide the total nursery size amongst the various nurseries. -H<size> now does something reasonable with SMP.
-
simonmar authored
Hold the sm_mutex around access to the mutable list. The SMP RTS now seems quite stable, I've run my simple test program with 64 threads without crashes.
-
- 22 Apr, 2005 1 commit
-
-
simonmar authored
SMP: the rest of the changes to support safe thunk entry & updates. I thought the compiler changes were independent, but I ended up breaking the HEAD, so I'll have to commit the rest. non-SMP compilation should not be affected.
-
- 12 Apr, 2005 1 commit
-
-
simonmar authored
Per-task nurseries for SMP. This was kind-of implemented before, but it's much cleaner now. There is now one *step* per capability, so we have somewhere to hang the block count. So for SMP, there are simply multiple instances of generation 0 step 0. The rNursery entry in the register table now points to the step rather than the head block of the nursery.
-
- 05 Apr, 2005 1 commit
-
-
simonmar authored
Some multi-processor hackery, including:
  - Don't hang blocked threads off BLACKHOLEs any more, instead keep them all on a separate queue which is checked periodically for threads to wake up. This is good because (a) we don't have to worry about locking the closure in SMP mode when we want to block on it, and (b) it means the standard update code doesn't need to wake up any threads or check for a BLACKHOLE_BQ, simplifying the update code. The downside is that if there are lots of threads blocked on BLACKHOLEs, we might have to do a lot of repeated list traversal. We don't expect this to be common, though. conc023 goes slower with this change, but we expect most programs to benefit from the shorter update code.
  - Fixing up the Capability code to handle multiple capabilities (SMP mode), and related changes to get the SMP mode at least building.
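The separate queue of blocked threads can be sketched roughly as follows. This is a toy model under assumptions, not the scheduler's code: Closure, Thread, block_on_blackhole(), update_closure() and check_blackhole_queue() are hypothetical names, and all locking is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model of closures and threads. */
typedef enum { BLACKHOLE, EVALUATED } ClosureState;

typedef struct Closure {
    ClosureState state;
    int value;
} Closure;

typedef struct Thread {
    int id;
    Closure *blocked_on;       /* closure the thread is waiting for, or NULL */
    struct Thread *next;
} Thread;

static Thread *blackhole_queue = NULL;   /* all threads blocked on BLACKHOLEs */
static Thread *run_queue = NULL;

/* Blocking never touches the closure, so the closure needs no lock. */
void block_on_blackhole(Thread *t, Closure *c) {
    t->blocked_on = c;
    t->next = blackhole_queue;
    blackhole_queue = t;
}

/* The update is a plain store: it wakes nobody and checks no queue. */
void update_closure(Closure *c, int value) {
    c->value = value;
    c->state = EVALUATED;
}

/* Periodic check: traverse the whole queue and move every thread whose
 * closure is no longer a BLACKHOLE onto the run queue. Returns the number
 * of threads woken. This traversal is the cost the commit message mentions. */
int check_blackhole_queue(void) {
    int woken = 0;
    Thread **prev = &blackhole_queue;
    while (*prev != NULL) {
        Thread *t = *prev;
        if (t->blocked_on->state != BLACKHOLE) {
            *prev = t->next;          /* unlink from the blocked queue */
            t->blocked_on = NULL;
            t->next = run_queue;      /* push onto the run queue */
            run_queue = t;
            woken++;
        } else {
            prev = &t->next;
        }
    }
    return woken;
}
```

The trade-off is visible in the shape of the code: update_closure() is two stores, while every call to check_blackhole_queue() walks all blocked threads whether or not anything changed.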
-
- 04 Apr, 2005 1 commit
-
-
simonmar authored
Give prototypes for getAllocations and revertCAFs.
-
- 27 Mar, 2005 1 commit
-
-
panne authored
* Some preprocessors don't like the C99/C++ '//' comments after a directive, so use '/* */' instead. For consistency, a lot of '//' in the include files were converted, too. * UnDOSified libraries/base/cbits/runProcess.c. * My favourite sport: Killed $Id$s.
-
- 09 Mar, 2005 1 commit
-
-
wolfgang authored
Retain all CAFs when dynamic Haskell libraries are used from GHCi. The Linker usually replaces references to newCAF with references to newDynCAF, but the system dynamic linker won't do that for us. Also, the situation is slightly different - we never want CAFs from dylibs to be reverted, because the dylibs might be used both by the interpreted program and by GHCi itself. So instead of just caf_list, there's now both caf_list and revertible_caf_list. newDynCAF adds a CAF to revertible_caf_list, and newCAF either adds the CAF to caf_list or to the mutable list, depending on whether we are in GHCi. This hack is only active when Linker.c has loaded libHSbase_dyn.[so|dylib], but for now, it applies to all CAFs, not just dynamically-linked ones. If that is worth fixing, we could do that by checking whether the CAF closure or its info pointer is in the main executable's address range. MERGE TO STABLE
-
- 10 Feb, 2005 1 commit
-
-
simonmar authored
GC changes: instead of threading old-generation mutable lists through objects in the heap, keep it in a separate flat array. This has some advantages:
  - the IND_OLDGEN object is now only 2 words, so the minimum size of a THUNK is now 2 words instead of 3. This saves some amount of allocation (about 2% on average according to my measurements), and is more friendly to the cache by squashing objects together more.
  - keeping the mutable list separate from the IND object will be necessary for our multiprocessor implementation.
  - removing the mut_link field makes the layout of some objects more uniform, leading to less complexity and special cases.
  - I also unified the two mutable lists (mut_once_list and mut_list) into a single mutable list, which led to more simplifications in the GC.
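The difference between a list threaded through objects and a separate flat array can be sketched as below. This is an illustration only: Object, MutList, mut_list_record() and mut_list_scan() are hypothetical names, and a malloc-grown array is an assumption made here for brevity; the commit message does not say how the array is stored.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object layout: note there is no mut_link field, so being
 * on the mutable list costs the object no header space. */
typedef struct Object {
    int payload;
    int scanned;
} Object;

/* The mutable list lives outside the heap objects, as an array of pointers. */
typedef struct MutList {
    Object **entries;
    size_t count;
    size_t capacity;
} MutList;

/* Record an old-generation object that may now point into a younger
 * generation. Returns 0 on success, -1 if the array cannot grow. */
int mut_list_record(MutList *ml, Object *o) {
    if (ml->count == ml->capacity) {
        size_t cap = ml->capacity ? ml->capacity * 2 : 16;
        Object **grown = realloc(ml->entries, cap * sizeof *grown);
        if (grown == NULL) return -1;
        ml->entries = grown;
        ml->capacity = cap;
    }
    ml->entries[ml->count++] = o;
    return 0;
}

/* At collection time the array is walked linearly as a set of roots and
 * then emptied. Returns the number of entries scanned. */
size_t mut_list_scan(MutList *ml) {
    size_t n = ml->count;
    for (size_t i = 0; i < n; i++) {
        ml->entries[i]->scanned = 1;   /* stand-in for scavenging the object */
    }
    ml->count = 0;
    return n;
}
```

A single array also mirrors the unification of mut_once_list and mut_list mentioned in the last bullet: there is one place to record an object and one loop to scan.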
-
- 07 Oct, 2004 1 commit
-
-
wolfgang authored
Position Independent Code and Dynamic Linking Support, Part 1
This commit allows generation of position independent code (PIC) that fully supports dynamic linking on Mac OS X and PowerPC Linux. Other platforms are not yet supported, and there is no support for actually linking or using dynamic libraries - so if you use the -fPIC or -dynamic code generation flags, you have to type your (platform-specific) linker command lines yourself.
nativeGen/PositionIndependentCode.hs: New file. Look here for some more comments on how this works.
cmm/CLabel.hs: Add support for DynamicLinkerLabels and PIC base labels - for use inside the NCG. needsCDecl: Case alternative labels now need C decls, see the codeGen/CgInfoTbls.hs below for details.
cmm/Cmm.hs: Add CmmPicBaseReg (used in NCG), and CmmLabelDiffOff (used in NCG and for offsets in info tables).
cmm/CmmParse.y: support offsets in info tables.
cmm/PprC.hs: support CmmLabelDiffOff. Case alternative labels now need C decls (see the codeGen/CgInfoTbls.hs for details), so we need to pprDataExterns for info tables.
cmm/PprCmm.hs: support CmmLabelDiffOff.
codeGen/CgInfoTbls.hs: no longer store absolute addresses in info tables, instead, we store offsets. Also, for vectored return points, emit the alternatives _after_ the vector table. This is to work around a limitation in Apple's as, which refuses to handle label differences where one label is at the end of a section. Emitting alternatives after vector info tables makes sure this never happens in GHC generated code. Case alternatives now require prototypes in hc code, though (see changes in PprC.hs, CLabel.hs).
main/CmdLineOpts.lhs: Add a new option, -fPIC.
main/DriverFlags.hs: Pass the correct options for PIC to gcc, depending on the platform. Only for powerpc for now.
nativeGen/AsmCodeGen.hs: Many changes... Mac OS X-specific management of import stubs is gone; it's now part of a general mechanism to handle such things for all platforms that need it (Darwin [both ppc and x86], Linux on ppc, and some platforms we don't support). Move cmmToCmm into its own monad which can accumulate a list of imported symbols. Make it call cmmMakeDynamicReference at the right places.
nativeGen/MachCodeGen.hs: nativeGen/MachInstrs.hs: nativeGen/MachRegs.lhs: nativeGen/PprMach.hs: nativeGen/RegAllocInfo.hs: Too many changes to enumerate here, PowerPC specific.
nativeGen/NCGMonad.hs: NatM still tracks imported symbols, as more labels can be created during code generation (float literals, jump tables; on some platforms all data access has to go through the dynamic linking mechanism).
driver/mangler/ghc-asm.lprl: Mangle absolute addresses in info tables to offsets. Correctly pass through GCC-generated PIC for Mac OS X and powerpc linux.
includes/Cmm.h: includes/InfoTables.h: includes/Storage.h: includes/mkDerivedConstants.c: rts/GC.c: rts/GCCompact.c: rts/HeapStackCheck.cmm: rts/Printer.c: rts/RetainerProfile.c: rts/Sanity.c: Adapt to the fact that info tables now contain offsets.
rts/Linker.c: Mac-specific: change machoInitSymbolsWithoutUnderscore to support PIC.
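Why storing offsets rather than absolute addresses makes an info table position independent can be shown with a small sketch. It does not reproduce GHC's info-table layout: Image, image_init() and entry_of() are hypothetical, and the "code" is just a byte array standing in for an entry point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical image: an "info table" field followed by its "code". */
typedef struct Image {
    ptrdiff_t entry_offset;      /* offset from the table to the entry code */
    unsigned char code[16];
} Image;

/* Fill in the table. The offset depends only on the layout of the image,
 * not on the address the image ends up at. */
void image_init(Image *img) {
    memset(img->code, 0, sizeof img->code);
    img->code[0] = 0xC3;         /* arbitrary marker byte for the entry */
    img->entry_offset = (ptrdiff_t)offsetof(Image, code);
}

/* Resolve the entry pointer relative to wherever the table lives now. */
const unsigned char *entry_of(const Image *img) {
    return (const unsigned char *)img + img->entry_offset;
}
```

Copying an initialised Image to a different address (a stand-in for loading at a different base) leaves entry_of() correct with no relocation step, whereas a stored absolute pointer would still refer to the original copy.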
-
- 13 Aug, 2004 2 commits
- 12 Nov, 2003 1 commit
-
-
sof authored
Tweaks to have RTS (C) sources compile with MSVC. Apart from wibbles related to the handling of 'inline', changed Schedule.h:POP_RUN_QUEUE() not to use expression-level statement blocks.
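The kind of rewrite this implies can be sketched as follows. A GCC statement expression, ({ ... }), lets a macro yield a value, but it is a GNU extension that MSVC does not accept; a portable alternative is a do/while(0) macro that assigns to an output argument. The macro body and the names TSO, run_queue_hd, run_queue_tl and push_run_queue() below are illustrative assumptions, not the contents of Schedule.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread object and run queue. */
typedef struct TSO {
    int id;
    struct TSO *link;
} TSO;

static TSO *run_queue_hd = NULL;
static TSO *run_queue_tl = NULL;

/* GNU-only form, shown for contrast and not compiled:
 *   #define POP_RUN_QUEUE() ({ TSO *t = run_queue_hd; ...; t; })
 *
 * Portable form: the result is delivered through the macro argument. */
#define POP_RUN_QUEUE(out)                                   \
    do {                                                     \
        TSO *popped_ = run_queue_hd;                         \
        if (popped_ != NULL) {                               \
            run_queue_hd = popped_->link;                    \
            popped_->link = NULL;                            \
            if (run_queue_hd == NULL) run_queue_tl = NULL;   \
        }                                                    \
        (out) = popped_;                                     \
    } while (0)

/* Append a thread to the tail of the run queue. */
void push_run_queue(TSO *t) {
    t->link = NULL;
    if (run_queue_hd == NULL) {
        run_queue_hd = t;
    } else {
        run_queue_tl->link = t;
    }
    run_queue_tl = t;
}
```

The cost of the portable form is at the call site: POP_RUN_QUEUE(t); is a statement and can no longer be used inside an expression.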
-
- 22 Apr, 2003 1 commit
-
-
simonmar authored
Fix an obscure bug: the most general kind of heap check, HEAP_CHECK_GEN(), is supposed to save the contents of *every* register known to the STG machine (used in cases where we either can't figure out which ones are live, or doing so would be too much hassle). The problem is that it wasn't saving the L1 register. A slight complication arose in that saving the L1 register pushed the size of the frame over the 16 words allowed for the size of the bitmap stored in the frame, so I changed the layout of the frame a bit. Describing all the registers using a single bitmap is overkill when only 8 of them can actually be pointers, so now the bitmap is only 8 bits long and we always skip over a fixed number of non-ptr words to account for all the non-ptr regs. This is all described in StgMacros.h.
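The revised frame layout can be pictured with a rough sketch: eight slots that may hold pointers, described by an 8-bit bitmap, followed by a fixed run of non-pointer words that a scan skips wholesale. GenFrame, frame_live_pointers() and the count N_NONPTR_WORDS are hypothetical, and treating a set bit as "pointer" is an assumption of this sketch that may be the opposite of the RTS convention; StgMacros.h is the authoritative description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_PTR_REGS     8   /* registers that can hold pointers (per the commit) */
#define N_NONPTR_WORDS 6   /* assumed number of saved non-pointer words */

/* Hypothetical frame: bitmap, pointer-capable slots, then non-pointer words. */
typedef struct GenFrame {
    uint8_t   ptr_bitmap;                   /* bit i set: ptr_regs[i] is a pointer */
    uintptr_t ptr_regs[N_PTR_REGS];
    uintptr_t nonptr_words[N_NONPTR_WORDS]; /* never inspected by the scan */
} GenFrame;

/* Collect the addresses of the slots that hold live pointers. Only the
 * eight pointer-capable slots are tested against the bitmap. Returns the
 * number of slots written to out. */
size_t frame_live_pointers(GenFrame *f, uintptr_t *out[N_PTR_REGS]) {
    size_t n = 0;
    for (int i = 0; i < N_PTR_REGS; i++) {
        if (f->ptr_bitmap & (1u << i)) {
            out[n++] = &f->ptr_regs[i];
        }
    }
    return n;
}
```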
-
- 27 Mar, 2003 1 commit
-
-
simonmar authored
Two performance tweaks:
  - Use specialised indirections, which perform the right kind of return without needing to enter the object they point to. This saves a small percentage of memory reads.
  - Tweak the update code to generate better code with gcc. This saves a few instructions per update.
-
- 26 Mar, 2003 1 commit
-
-
sof authored
wibbles - drop references to PleaseStopAllocating(), use CloseNursery() to express ExtendNursery()
-
- 24 Mar, 2003 1 commit
-
-
simonmar authored
Fix some bugs in compacting GC.
Bug 1: When threading the fields of an AP or PAP, we were grabbing the info table of the function without unthreading it first.
Bug 2: eval_thunk_selector() might accidentally find itself in to-space when going through indirections in a compacted generation. We must check for this case and bale out if necessary.
Bug 3: This is somewhat more nasty. When we have an AP or PAP that points to a BCO, the layout info for the AP/PAP is in the BCO's instruction array, which is two objects deep from the AP/PAP itself. The trouble is, during compacting GC, we can only safely look one object deep from the current object, because pointers from objects any deeper might have been already updated to point to their final destinations. The solution is to put the arity and bitmap info for a BCO into the BCO object itself. This means BCOs become variable-length, which is a slight annoyance, but it also means that looking up the arity/bitmap is quicker. There is a slight reduction in complexity in the byte code generator due to not having to stuff the bitmap at the front of the instruction stream.
-
- 21 Mar, 2003 1 commit
-
-
sof authored
Friday morning code-wibbling: - made RetainerProfile.c:firstStack a 'static' - added RetainerProfile.c:retainerStackBlocks()
-
- 13 Dec, 2002 1 commit
-
-
simonmar authored
Fix bug in stack_frame_sizeW
-
- 11 Dec, 2002 1 commit
-
-
simonmar authored
Merge the eval-apply-branch on to the HEAD
------------------------------------------
This is a change to GHC's evaluation model in order to ultimately make GHC more portable and to reduce complexity in some areas. At some point we'll update the commentary to describe the new state of the RTS. Pending that, the highlights of this change are:
  - No more Su. The Su register is gone, update frames are one word smaller.
  - Slow-entry points and arg checks are gone. Unknown function calls are handled by automatically-generated RTS entry points (AutoApply.hc, generated by the program in utils/genapply).
  - The stack layout is stricter: there are no "pending arguments" on the stack any more, the stack is always strictly a sequence of stack frames. This means that there's no need for LOOKS_LIKE_GHC_INFO() or LOOKS_LIKE_STATIC_CLOSURE() any more, and GHC doesn't need to know how to find the boundary between the text and data segments (BIG WIN!).
  - A couple of nasty hacks in the mangler caused by the need to identify closure ptrs vs. info tables have gone away.
  - Info tables are a bit more complicated. See InfoTables.h for the details.
  - As a side effect, GHCi can now deal with polymorphic seq. Some bugs in GHCi which affected primitives and unboxed tuples are now fixed.
  - Binary sizes are reduced by about 7% on x86. Performance is roughly similar, some programs get faster while some get slower. I've seen GHCi perform worse on some examples, but haven't investigated further yet (GHCi performance *should* be about the same or better in theory).
  - Internally the code generator is rather better organised. I've moved info-table generation from the NCG into the main codeGen where it is shared with the C back-end; info tables are now emitted as arrays of words in both back-ends. The NCG is one step closer to being able to support profiling.
This has all been fairly thoroughly tested, but no doubt I've messed up the commit in some way.
-
- 21 Oct, 2002 1 commit
-
-
simonmar authored
Bite the bullet and generalise the central memory allocation scheme. Previously we tried to allocate memory starting from a fixed address, which was set for each architecture (0x5000000 was a common one), and to decide whether a particular address was in the heap or not we would do a simple comparison against this address. This doesn't work too well, because:
  - if we dynamically-load some objects above the boundary, the heap-allocated test becomes invalid
  - on windows we have less control, and the heap might be split into multiple sections
  - it turns out that on some Linux kernels we don't get memory where we asked for it. This might be a bug in those kernels, but it exposes the fragility of our allocation scheme.
The solution is to bite the bullet and maintain a table mapping addresses to a value indicating whether that address is in the heap or not. Since we normally allocate heap in chunks of 1Mb, the table is quite small: 4k on a 32-bit machine, using one byte for each 1Mb block. Testing an address for heap residency now involves a memory access, but the table is normally cache-resident. I didn't manage to measure any slowdown after making the change. On a 64-bit machine, we'll need to use a 2-level table; I haven't implemented that yet.
Now we can generalise the procedure used to grab memory from the OS. In the general case, we allocate one megablock more than we need to, and trim off the slop around the allocation to leave an aligned chunk. The next time around, however, we try to allocate memory right after the last chunk allocated, on the grounds that it is aligned and probably free: if this doesn't work, we have to back off to the general mechanism (it seems to work most of the time).
This cleans up the Windows story too: is_heap_alloced() has gone, and we should be able to handle more than 256M of memory (or whatever the arbitrary limit was before).
MERGE TO STABLE (after lots of testing)
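The table arithmetic works out as follows: a 32-bit address space is 2^32 bytes and a megablock is 2^20 bytes, so there are 2^12 = 4096 megablocks, and one byte per megablock gives the 4k table mentioned above. A minimal sketch of such a table, plus the "allocate one extra and trim" alignment step, might look like this. The names heap_map, heap_map_mark(), heap_map_contains() and mblock_align_up() are hypothetical and are not the RTS's identifiers.

```c
#include <assert.h>
#include <stdint.h>

#define MBLOCK_SHIFT 20                          /* 1Mb megablocks */
#define MBLOCK_SIZE  ((uint32_t)1 << MBLOCK_SHIFT)
#define MBLOCK_MASK  (MBLOCK_SIZE - 1)
#define HEAP_MAP_ENTRIES 4096                    /* 2^32 / 2^20 */

/* One byte per megablock of a 32-bit address space: non-zero means
 * "this megablock belongs to the heap". */
static uint8_t heap_map[HEAP_MAP_ENTRIES];

/* Record that the megablock containing addr is part of the heap. */
void heap_map_mark(uint32_t addr) {
    heap_map[addr >> MBLOCK_SHIFT] = 1;
}

/* Heap-residency test: one shift and one memory access. */
int heap_map_contains(uint32_t addr) {
    return heap_map[addr >> MBLOCK_SHIFT] != 0;
}

/* General allocation path: having asked the OS for one megablock more than
 * needed, round the returned base up to a megablock boundary; everything
 * before the result is slop to be trimmed. */
uint32_t mblock_align_up(uint32_t base) {
    return (base + MBLOCK_MASK) & ~MBLOCK_MASK;
}
```

Because the upper 12 bits of a 32-bit address can never exceed 4095, the index is always in range without a bounds check; a 64-bit address space is what forces the 2-level table mentioned in the message.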
-
- 26 Mar, 2002 2 commits
-
-
sof authored
TEXT_BEFORE_HEAP & cygwin: same as for mingw
-
simonmar authored
A couple of cleanups to the previous change: we should test TABLES_NEXT_TO_CODE rather than USE_MINIINTERPRETER to enable the MacOSX "plan C", and use structure field selection rather than array indexing to get the entry code ptr from the info table.
-
- 21 Mar, 2002 1 commit
-
-
sebc authored
Implement Plan C, with correct code to detect the data and text sections for MacOS X. Also add a sanity check in initStorage, to make sure we are able to make the distinction between closures and infotables.
-
- 14 Feb, 2002 1 commit
-
-
sof authored
widen the scope of is_heap_alloced() proto; for all mingw builds
-
- 04 Feb, 2002 1 commit
-
-
sof authored
- sm_mutex is now a Mutex (not a pthread_mutex_t). - sm_mutex lock/unlocks are only done for SMP builds.
-
- 01 Feb, 2002 1 commit
-
-
simonmar authored
When distinguishing between code & data pointers, rather than testing for membership of the text section, test for non-membership of one of the data sections. The reason for this change is that testing for membership of the text section was fragile: we could only test whether a value was smaller than the end address, because there doesn't appear to be a portable way to find the beginning of the text section. Indeed, the test breaks on very recent Linux kernels which mmap() memory below the program text. In fact, the reversed test may be faster because the expected common case is when the pointer is into the dynamic heap, and we eliminate this case immediately in the new test. A quick test shows no measurable performance difference with the change. MERGE TO STABLE
-
- 25 Jan, 2002 1 commit
-
-
simonmar authored
Fix bit-rot in TICKY_TICKY
-
- 22 Nov, 2001 1 commit
-
-
simonmar authored
Retainer Profiling / Lag-drag-void profiling. This is mostly work by Sungwoo Park, who spent a summer internship at MSR Cambridge this year implementing these two types of heap profiling in GHC. Relative to Sungwoo's original work, I've made some improvements to the code:
  - it's now possible to apply constraints to retainer and LDV profiles in the same way as we do for other types of heap profile (eg. +RTS -hc{foo,bar} -hR -RTS gives you a retainer profile considering only closures with cost centres 'foo' and 'bar').
  - the heap-profile timer implementation is cleaned up.
  - heap profiling no longer has to be run in a two-space heap.
  - general cleanup of the code and application of the SDM C coding style guidelines.
Profiling will be a little slower and require more space than before, mainly because closures have an extra header word to support either retainer profiling or LDV profiling (you can't do both at the same time). We've used the new profiling tools on GHC itself, with moderate success. Fixes for some space leaks in GHC to follow...
-
- 08 Aug, 2001 1 commit
-
-
simonmar authored
Had a brainwave on the way to work this morning, and realised that the garbage collector can handle "pinned objects" as long as they don't contain any pointers. This is absolutely ideal for doing temporary allocation in the FFI, because what we really want to do is allocate a pinned ByteArray and let the GC clean it up later. So this set of changes adds the required framework. There are two new primops:
    newPinnedByteArray# :: Int# -> State# s -> (# State# s, MutByteArr# s #)
    byteArrayContents# :: ByteArr# -> Addr#
Obviously byteArrayContents# is highly unsafe. Allocating a pinned ByteArr# isn't the default, because a pinned ByteArr# will hold an entire block (currently 4k) live until it is garbage collected (that doesn't mean each pinned ByteArr# requires 4k of storage, just that if a block contains a single live pinned ByteArray, the whole block must be retained).
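The retention rule in the last sentence (one live pinned object keeps its whole block alive, but several pinned objects can share a block) can be modelled with a small sketch. It is a simplification under assumptions: PinnedBlock, alloc_pinned(), gc_begin(), gc_reach() and gc_end() are hypothetical names, the 8-byte rounding is arbitrary, and the real collector's bookkeeping is not reproduced.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 4096            /* block size quoted in the commit message */

/* Hypothetical block holding only pinned, pointer-free objects. */
typedef struct PinnedBlock {
    unsigned char bytes[BLOCK_SIZE];
    size_t used;                   /* bytes handed out so far */
    int reached;                   /* set when the GC reaches any object here */
} PinnedBlock;

/* Bump-allocate nbytes (rounded up to 8) from the block. The returned
 * address never changes afterwards, which is what "pinned" means here. */
void *alloc_pinned(PinnedBlock *b, size_t nbytes) {
    if (nbytes > BLOCK_SIZE) return NULL;
    size_t rounded = (nbytes + 7u) & ~(size_t)7u;
    if (rounded > BLOCK_SIZE - b->used) return NULL;
    void *p = &b->bytes[b->used];
    b->used += rounded;
    return p;
}

/* Start of a collection: assume nothing in the block is live. */
void gc_begin(PinnedBlock *b) { b->reached = 0; }

/* The collector found a reference to obj. The object is not copied;
 * its block is merely marked for retention. */
void gc_reach(PinnedBlock *b, const void *obj) {
    const unsigned char *p = obj;
    if (p >= b->bytes && p < b->bytes + b->used) b->reached = 1;
}

/* End of a collection: the block survives whole or is reclaimed whole.
 * Returns 1 if the block was retained. */
int gc_end(PinnedBlock *b) {
    if (!b->reached) b->used = 0;
    return b->reached;
}
```

Since the objects contain no pointers, gc_reach() never has to look inside them, which is the condition the message gives for pinning to be safe.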
-
- 24 Jul, 2001 1 commit
-
-
ken authored
Innocent changes to resurrect/add 64-bit support.
-
- 23 Jul, 2001 2 commits
-
-
simonmar authored
Add a compacting garbage collector. It isn't enabled by default, as there are still a couple of problems: there's a fallback case I haven't implemented yet which means it will occasionally bomb out, and speed-wise it's quite a bit slower than the copying collector (about 1.8x slower). Until I can make it go faster, it'll only be useful when you're actually running low on real memory. '+RTS -c' to enable it. Oh, and I cleaned up a few things in the RTS while I was there, and fixed one or two possibly real bugs in the existing GC.
-
simonmar authored
Small changes to improve GC performance slightly:
  - store the generation *number* in the block descriptor rather than a pointer to the generation structure, since the most common operation is to pull out the generation number, and it's one less indirection this way.
  - cache the generation number in the step structure too, which avoids an extra indirection in several places.
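The saved indirection can be seen in a minimal sketch. The struct and function names below (Generation, OldBlockDescr, NewBlockDescr, old_gen_no(), new_gen_no()) are hypothetical and far smaller than the real descriptors.

```c
#include <assert.h>

/* Hypothetical, stripped-down structures. */
typedef struct Generation {
    unsigned no;                 /* generation number */
} Generation;

/* Before: the descriptor points at the generation structure. */
typedef struct OldBlockDescr {
    Generation *gen;
} OldBlockDescr;

/* After: the descriptor carries the number itself. */
typedef struct NewBlockDescr {
    unsigned gen_no;
} NewBlockDescr;

/* Two dependent loads: the descriptor, then the generation it points to. */
unsigned old_gen_no(const OldBlockDescr *bd) { return bd->gen->no; }

/* One load: the number sits in the descriptor. */
unsigned new_gen_no(const NewBlockDescr *bd) { return bd->gen_no; }
```

The price is that the number is duplicated, so it has to be written whenever a block is assigned to a generation.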
-
- 03 May, 2001 1 commit
-
-
simonmar authored
silence gcc 2.96 warning
-
- 02 Mar, 2001 2 commits
-
-
simonmar authored
ASSERT in updateWithIndirection() that we haven't already updated this object with an indirection, and fix two places in the RTS where this could happen. The problem only occurs when we're in a black-hole-style loop, and there are multiple update frames on the stack pointing to the same object (this is possible because of lazy black-holing). Both stack squeezing and asynchronous exception raising walk down the stack and remove update frames, updating their contents with indirections. If we don't protect against multiple updates, the mutable list in the old generation may get into a bogus state.
-
simonmar authored
Add some ASSERT()s so we can catch updates where updatee==target.
-
- 11 Feb, 2001 1 commit
-
-
simonmar authored
Bite the bullet and make GHCi support non-optional in the RTS. GHC 4.11 should be able to build GHCi without any additional tweaks now. - the Linker is split into two parts: LinkerBasic.c, containing the routines required by the rest of the RTS, and Linker.c, containing the linker proper, which is not referred to from the rest of the RTS. Only Linker.c requires -ldl, so programs which don't make use of the linker (everything except GHC, in other words) won't need -ldl.
-