  1. 11 Feb, 2005 2 commits
    • [project @ 2005-02-11 14:01:30 by simonmar] · 4d3ce736
      simonmar authored
      Careful with mutable list entries that point to THUNKs: the thunk
      might be updated, and the resulting IND_OLDGEN will be on the mutable
      list twice.
      
      We previously avoided this problem by having an extra MUT_CONS object
      on the mutable list pointing to the THUNK, so that we could tell the
      difference between the entry on the mutable list that used to be the
      THUNK, and the new entry for the IND_OLDGEN.
      
      We don't have MUT_CONS any more (this was part of the cleanup from
      separating the mutable list from the heap).  So, now, when scavenging
      an IND_OLDGEN on the mutable list, we check whether it is pointing to
      an already-evacuated object.  This is a bit crude, but at least it is
      a localised hack.
    • [project @ 2005-02-11 12:20:12 by simonmar] · a186d6f7
      simonmar authored
      Fix a bug: thunk_selector_depth was being incremented before checking
      that we had reached the depth limit, and not decremented if we had
      reached the limit.
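
      The corrected ordering can be sketched as follows.  This is an
      illustration under assumptions, not the code in the RTS: MAX_DEPTH,
      the node type and eval_bounded() are hypothetical, and `depth`
      merely stands in for thunk_selector_depth.  The limit is checked
      before the counter is touched, and every increment is paired with
      a decrement.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 8            /* hypothetical depth limit */

static int depth = 0;          /* stands in for thunk_selector_depth */

/* Hypothetical chain node: a node with next == NULL is a value,
 * any other node is a selector that refers to the next node. */
struct node { struct node *next; int value; };

/* Follows the chain recursively and returns the final value node,
 * or NULL when the depth limit is reached.  The counter is restored
 * to its starting value on every return path. */
static const struct node *eval_bounded(const struct node *n)
{
    assert(n != NULL);
    if (n->next == NULL) return n;          /* reached a value */
    if (depth >= MAX_DEPTH) return NULL;    /* check first: nothing to undo */

    depth++;
    const struct node *r = eval_bounded(n->next);
    depth--;                                /* always paired with the increment */
    return r;
}
```

      Checking before incrementing means the bail-out path has nothing
      to undo, which is the property the buggy version lacked.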
  2. 10 Feb, 2005 1 commit
    • [project @ 2005-02-10 13:01:52 by simonmar] · e7c3f957
      simonmar authored
      GC changes: instead of threading old-generation mutable lists
      through objects in the heap, keep it in a separate flat array.
      
      This has some advantages:
      
        - the IND_OLDGEN object is now only 2 words, so the minimum
          size of a THUNK is now 2 words instead of 3.  This saves
          some amount of allocation (about 2% on average according to
          my measurements), and is more friendly to the cache by
          squashing objects together more.
      
        - keeping the mutable list separate from the IND object
          will be necessary for our multiprocessor implementation.
      
        - removing the mut_link field makes the layout of some objects
          more uniform, leading to less complexity and special cases.
      
        - I also unified the two mutable lists (mut_once_list and mut_list)
          into a single mutable list, which led to more simplifications
          in the GC.
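
      A rough sketch of the idea of a mutable list held in a separate
      flat array.  All names here (closure, mut_list, mut_list_push,
      mut_list_scavenge, mark_visited) are hypothetical, and the real
      RTS storage for the list may well differ; the point is only that
      the objects themselves no longer carry a link field.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical closure type; note that it has no mut_link field. */
typedef struct closure { int payload; } closure;

/* One generation's mutable list, kept as a growable flat array. */
typedef struct {
    closure **entries;
    size_t    len, cap;
} mut_list;

/* Records an old-generation object that may point into a younger
 * generation.  Returns 0 on success, -1 if the array cannot grow. */
static int mut_list_push(mut_list *ml, closure *c)
{
    if (ml->len == ml->cap) {
        size_t ncap = ml->cap ? ml->cap * 2 : 16;
        closure **p = realloc(ml->entries, ncap * sizeof *p);
        if (p == NULL) return -1;
        ml->entries = p;
        ml->cap = ncap;
    }
    ml->entries[ml->len++] = c;
    return 0;
}

/* Visits every entry once and then empties the list; a collector
 * would re-add the objects that are still mutable afterwards. */
static void mut_list_scavenge(mut_list *ml, void (*visit)(closure *))
{
    assert(visit != NULL);
    for (size_t i = 0; i < ml->len; i++)
        visit(ml->entries[i]);
    ml->len = 0;
}

/* Hypothetical visitor used for illustration only. */
static void mark_visited(closure *c) { c->payload++; }
```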
  3. 20 Jan, 2005 1 commit
  4. 18 Nov, 2004 1 commit
  5. 07 Oct, 2004 1 commit
    • [project @ 2004-10-07 15:54:03 by wolfgang] · b4d045ae
      wolfgang authored
      Position Independent Code and Dynamic Linking Support, Part 1
      
      This commit allows generation of position independent code (PIC) that fully supports dynamic linking on Mac OS X and PowerPC Linux.
      Other platforms are not yet supported, and there is no support for actually linking or using dynamic libraries - so if you use the -fPIC or -dynamic code generation flags, you have to type your (platform-specific) linker command lines yourself.
      
      
      nativeGen/PositionIndependentCode.hs:
      New file. Look here for some more comments on how this works.
      
      cmm/CLabel.hs:
      Add support for DynamicLinkerLabels and PIC base labels - for use inside the NCG.
      needsCDecl: Case alternative labels now need C decls, see the codeGen/CgInfoTbls.hs below for details
      
      cmm/Cmm.hs:
      Add CmmPicBaseReg (used in NCG),
      and CmmLabelDiffOff (used in NCG and for offsets in info tables)
      
      cmm/CmmParse.y:
      support offsets in info tables
      
      cmm/PprC.hs:
      support CmmLabelDiffOff
      Case alternative labels now need C decls (see the codeGen/CgInfoTbls.hs for details), so we need to pprDataExterns for info tables.
      
      cmm/PprCmm.hs:
      support CmmLabelDiffOff
      
      codeGen/CgInfoTbls.hs:
      No longer store absolute addresses in info tables; instead, we store offsets.
      Also, for vectored return points, emit the alternatives _after_ the vector table. This is to work around a limitation in Apple's as, which refuses to handle label differences where one label is at the end of a section. Emitting alternatives after vector info tables makes sure this never happens in GHC generated code. Case alternatives now require prototypes in hc code, though (see changes in PprC.hs, CLabel.hs).
      
      main/CmdLineOpts.lhs:
      Add a new option, -fPIC.
      
      main/DriverFlags.hs:
      Pass the correct options for PIC to gcc, depending on the platform. Only for powerpc for now.
      
      nativeGen/AsmCodeGen.hs:
      Many changes...
      Mac OS X-specific management of import stubs is gone; it is now part of a general mechanism to handle such things for all platforms that need it (Darwin [both ppc and x86], Linux on ppc, and some platforms we don't support).
      Move cmmToCmm into its own monad which can accumulate a list of imported symbols. Make it call cmmMakeDynamicReference at the right places.
      
      nativeGen/MachCodeGen.hs:
      nativeGen/MachInstrs.hs:
      nativeGen/MachRegs.lhs:
      nativeGen/PprMach.hs:
      nativeGen/RegAllocInfo.hs:
      Too many changes to enumerate here, PowerPC specific.
      
      nativeGen/NCGMonad.hs:
      NatM still tracks imported symbols, as more labels can be created during code generation (float literals, jump tables; on some platforms all data access has to go through the dynamic linking mechanism).
      
      driver/mangler/ghc-asm.lprl:
      Mangle absolute addresses in info tables to offsets.
      Correctly pass through GCC-generated PIC for Mac OS X and powerpc linux.
      
      includes/Cmm.h:
      includes/InfoTables.h:
      includes/Storage.h:
      includes/mkDerivedConstants.c:
      rts/GC.c:
      rts/GCCompact.c:
      rts/HeapStackCheck.cmm:
      rts/Printer.c:
      rts/RetainerProfile.c:
      rts/Sanity.c:
      Adapt to the fact that info tables now contain offsets.
      
      rts/Linker.c:
      Mac-specific: change machoInitSymbolsWithoutUnderscore to support PIC.
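
      The switch from absolute addresses to offsets in info tables can
      be illustrated with a self-relative field.  This is a sketch under
      assumptions: the info_table layout, name_off and both helpers are
      hypothetical, and in generated code the offset would be a label
      difference fixed at assembly time rather than computed at run
      time.  What the offset is measured from in GHC is not stated
      above; here it is the table's own address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table: instead of a `const char *` it stores the
 * distance from the table itself to the target, so the stored value
 * does not depend on where the code is loaded. */
typedef struct {
    intptr_t name_off;
} info_table;

/* Stores `name` as an offset relative to the table's own address. */
static void info_set_name(info_table *t, const char *name)
{
    assert(t != NULL && name != NULL);
    t->name_off = (intptr_t)((uintptr_t)name - (uintptr_t)t);
}

/* Recovers the absolute address by adding the offset back. */
static const char *info_get_name(const info_table *t)
{
    assert(t != NULL);
    return (const char *)((uintptr_t)t + (uintptr_t)t->name_off);
}
```

      Every consumer of the table has to add the base address back in,
      which is why so many RTS files are listed as adapted.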
  6. 13 Sep, 2004 1 commit
    • [project @ 2004-09-13 17:18:27 by sof] · db42a91b
      sof authored
      threadSqueezeStack(): with DEBUG, zero out entire payload when
      blackholing. Required now that LOOKS_LIKE_INFO_PTR() isn't 100%
      precise.
      
      Merge to STABLE
  7. 12 Sep, 2004 1 commit
  8. 03 Sep, 2004 1 commit
    • [project @ 2004-09-03 15:28:18 by simonmar] · 95ca6bff
      simonmar authored
      Cleanup: all (well, most) messages from the RTS now go through the
      functions in RtsUtils: barf(), debugBelch() and errorBelch().  The
      latter two were previously called belch() and prog_belch()
      respectively.  See the comments for the right usage of these message
      functions.
      
      One reason for doing this is so that we can avoid spurious uses of
      stdout/stderr by Haskell apps on platforms where we shouldn't be using
      them (eg. non-console apps on Windows).
  9. 13 Aug, 2004 1 commit
  10. 21 May, 2004 1 commit
    • [project @ 2004-05-21 13:28:59 by simonmar] · 89f9f089
      simonmar authored
      Fix yet another bug in the THUNK_SELECTOR code.  Interestingly, I
      spotted this one earlier but left a ToDo in the code rather than
      fixing it (I think I wasn't sure whether it could happen or not).
      
      The fix is to close off another way that eval_thunk_selector()
      could return a pointer into to-space.  See comments for details.
  11. 10 May, 2004 1 commit
    • [project @ 2004-05-10 11:53:41 by simonmar] · 318f8bc4
      simonmar authored
      Fix mishandling of the BF_COMPACTED flag, which could lead to problems
      when using the compacting collector (+RTS -c, or +RTS -M<size>).  In
      fact, I'm not sure how it worked at all.
      
      MERGE TO STABLE
  12. 07 May, 2004 1 commit
    • [project @ 2004-05-07 21:19:21 by panne] · 70b3bd34
      panne authored
      GCC's __attribute__ handling seems to be a little bit stricter with GCC 3.3.3:
      
         * When a function declaration uses it, the corresponding definition has to
           use it, too.
      
         * Syntactically it is allowed only at the beginning of the function
           definition.
      
      Let's hope that the current syntax is backwards compatible...
  13. 26 Nov, 2003 1 commit
  14. 12 Nov, 2003 1 commit
    • [project @ 2003-11-12 17:49:05 by sof] · 20593d1d
      sof authored
      Tweaks to have RTS (C) sources compile with MSVC. Apart from wibbles
      related to the handling of 'inline', changed Schedule.h:POP_RUN_QUEUE()
      not to use expression-level statement blocks.
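
      Expression-level statement blocks are a GCC extension that MSVC
      does not accept.  One portable alternative is an ordinary
      function, sketched below.  The tso type, the queue variables and
      pop_run_queue() are hypothetical stand-ins, and the commit may
      have restructured the macro differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread object with an intrusive queue link. */
typedef struct tso { struct tso *link; int id; } tso;

static tso *run_queue_hd = NULL, *run_queue_tl = NULL;

/* GCC-only form, shown for comparison (a statement expression):
 *   #define POP(q) ({ tso *t = (q); if (t) (q) = t->link; t; })
 * Portable form: a plain function that yields the same value. */
static tso *pop_run_queue(void)
{
    /* Head and tail are either both NULL or both non-NULL. */
    assert((run_queue_hd == NULL) == (run_queue_tl == NULL));

    tso *t = run_queue_hd;
    if (t != NULL) {
        run_queue_hd = t->link;
        t->link = NULL;
        if (run_queue_hd == NULL) run_queue_tl = NULL;
    }
    return t;                   /* NULL when the queue is empty */
}
```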
  15. 24 Oct, 2003 1 commit
  16. 22 Oct, 2003 1 commit
    • [project @ 2003-10-22 15:00:59 by simonmar] · d283bfc1
      simonmar authored
      Fix a nasty bug in the GC mutable list handling, which shows up when
      an array is frozen and then unsafeThaw#'d.  The array could end up on
      the mutable list twice.
      
      Fixes SourceForge bug #819116.
  17. 23 Sep, 2003 1 commit
  18. 26 Aug, 2003 1 commit
  19. 14 Aug, 2003 1 commit
  20. 26 Jun, 2003 1 commit
  21. 19 Jun, 2003 1 commit
    • [project @ 2003-06-19 12:47:08 by simonmar] · 2ba64673
      simonmar authored
      small optimisation: when evacuating a TSO, only copy the part of the
      stack that is above the stack pointer and hence in use (doesn't apply
      to most stacks, which are large objects and don't get copied anyhow).
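
      The optimisation amounts to copying only the live slice of the
      stack.  A minimal sketch, assuming a fixed-size stack that grows
      downwards so that the words in use are those from sp to the end;
      the thread type, STACK_WORDS and copy_thread() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STACK_WORDS 16          /* hypothetical fixed stack size */

typedef struct {
    size_t sp;                  /* index of the top of the stack */
    long   stack[STACK_WORDS];  /* live part is [sp, STACK_WORDS) */
} thread;

/* Copies the stack pointer and only the words that are in use;
 * the slots below sp in dst are left untouched. */
static void copy_thread(thread *dst, const thread *src)
{
    assert(src->sp <= STACK_WORDS);
    dst->sp = src->sp;
    memcpy(&dst->stack[src->sp], &src->stack[src->sp],
           (STACK_WORDS - src->sp) * sizeof src->stack[0]);
}
```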
  22. 14 May, 2003 1 commit
    • [project @ 2003-05-14 09:13:52 by simonmar] · 7a236a56
      simonmar authored
      Change the way SRTs are represented:
      
      Previously, the SRT associated with a function or thunk would be a
      sub-list of the enclosing top-level function's SRT.  But this approach
      can lead to lots of duplication: if a CAF is referenced in several
      different thunks, then it may appear several times in the SRT.
      Let-no-escapes compound the problem, because the occurrence of a
      let-no-escape-bound variable would expand to all the CAFs referred to
      by the let-no-escape.
      
      The new way is to describe the SRT associated with a function or thunk
      as a (pointer+offset,bitmap) pair, where the pointer+offset points
      into some SRT table (the enclosing function's SRT), and the bitmap
      indicates which entries in this table are "live" for this closure.
      The bitmap is stored in the 16 bits previously used for the length
      field, but this rarely overflows.  When it does overflow, we store the
      bitmap externally in a new "SRT descriptor".
      
      Now the enclosing SRT can be a set, hence eliminating the duplicates.
      
      Also, we now have one SRT per top-level function in a recursive group,
      where previously we used to have one SRT for the whole group.  This
      helps keep the size of SRTs down.
      
      Bottom line: very little difference most of the time.  GHC itself got
      slightly smaller.  One bad case of a module in GHC which had a huge
      SRT has gone away.
      
      While I was in the area:
      
        - Several parts of the back-end require bitmaps.  Functions for
          creating bitmaps are now centralised in the Bitmap module.
      
        - We were trying to be independent of word-size in a couple of
          places in the back end, but we've now abandoned that strategy so I
          simplified things a bit.
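
      The (pointer+offset,bitmap) representation described above can be
      sketched like this.  The types and srt_mark_live() are
      hypothetical, and the overflow case that uses an external SRT
      descriptor is left out; the sketch only shows how a 16-bit bitmap
      selects live entries from a shared table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a CAF or other static closure. */
typedef struct { int marked; } static_closure;

/* SRT reference for one closure: a window into the enclosing table
 * plus a liveness bitmap (bit i set => table[offset + i] is live). */
typedef struct {
    static_closure **table;
    size_t           offset;
    uint16_t         bitmap;
} srt_ref;

/* Marks every live entry and returns how many were visited. */
static int srt_mark_live(const srt_ref *srt)
{
    assert(srt != NULL && srt->table != NULL);
    int n = 0;
    uint16_t bits = srt->bitmap;
    for (size_t i = 0; bits != 0; i++, bits >>= 1) {
        if (bits & 1u) {
            srt->table[srt->offset + i]->marked = 1;
            n++;
        }
    }
    return n;
}
```

      Because several closures can share one table and differ only in
      their bitmaps, each CAF needs to appear in the table just once.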
  23. 22 Apr, 2003 1 commit
    • [project @ 2003-04-22 16:25:08 by simonmar] · 1da232fc
      simonmar authored
      Fix an obscure bug: the most general kind of heap check,
      HEAP_CHECK_GEN(), is supposed to save the contents of *every* register
      known to the STG machine (used in cases where we either can't figure
      out which ones are live, or doing so would be too much hassle).  The
      problem is that it wasn't saving the L1 register.
      
      A slight complication arose in that saving the L1 register pushed the
      size of the frame over the 16 words allowed for the size of the bitmap
      stored in the frame, so I changed the layout of the frame a bit.
      Describing all the registers using a single bitmap is overkill when
      only 8 of them can actually be pointers, so now the bitmap is only 8
      bits long and we always skip over a fixed number of non-ptr words to
      account for all the non-ptr regs.  This is all described in StgMacros.h.
  24. 01 Apr, 2003 1 commit
    • [project @ 2003-04-01 15:05:13 by sof] · c49a6ca9
      sof authored
      Tidy up code that supports user/Haskell signal handlers.
      
      Signals.h now defines RTS_USER_SIGNALS when this is supported,
      which is then used elsewhere.
  25. 26 Mar, 2003 2 commits
  26. 24 Mar, 2003 2 commits
    • [project @ 2003-03-24 15:33:25 by simonmar] · 25efe5a4
      simonmar authored
      A couple of changes related to bug #2 in the previous commit:
      
       - follow IND_STATICs, now that we can check whether
         we end up in to-space or not
      
       - improve the commentary
    • [project @ 2003-03-24 14:46:53 by simonmar] · b3f53081
      simonmar authored
      Fix some bugs in compacting GC.
      
      Bug 1: When threading the fields of an AP or PAP, we were grabbing the
      info table of the function without unthreading it first.
      
      Bug 2: eval_thunk_selector() might accidentally find itself in
      to-space when going through indirections in a compacted generation.
      We must check for this case and bale out if necessary.
      
      Bug 3: This is somewhat more nasty.  When we have an AP or PAP that
      points to a BCO, the layout info for the AP/PAP is in the BCO's
      instruction array, which is two objects deep from the AP/PAP itself.
      The trouble is, during compacting GC, we can only safely look one
      object deep from the current object, because pointers from objects any
      deeper might have been already updated to point to their final
      destinations.
      
      The solution is to put the arity and bitmap info for a BCO into the
      BCO object itself.  This means BCOs become variable-length, which is a
      slight annoyance, but it also means that looking up the arity/bitmap
      is quicker.  There is a slight reduction in complexity in the byte
      code generator due to not having to stuff the bitmap at the front of
      the instruction stream.
  27. 19 Mar, 2003 1 commit
  28. 12 Feb, 2003 1 commit
  29. 11 Dec, 2002 1 commit
    • [project @ 2002-12-11 15:36:20 by simonmar] · 0bffc410
      simonmar authored
      Merge the eval-apply-branch on to the HEAD
      ------------------------------------------
      
      This is a change to GHC's evaluation model in order to ultimately make
      GHC more portable and to reduce complexity in some areas.
      
      At some point we'll update the commentary to describe the new state of
      the RTS.  Pending that, the highlights of this change are:
      
        - No more Su.  The Su register is gone, update frames are one
          word smaller.
      
        - Slow-entry points and arg checks are gone.  Unknown function calls
          are handled by automatically-generated RTS entry points (AutoApply.hc,
          generated by the program in utils/genapply).
      
        - The stack layout is stricter: there are no "pending arguments" on
          the stack any more, the stack is always strictly a sequence of
          stack frames.
      
          This means that there's no need for LOOKS_LIKE_GHC_INFO() or
          LOOKS_LIKE_STATIC_CLOSURE() any more, and GHC doesn't need to know
          how to find the boundary between the text and data segments (BIG WIN!).
      
        - A couple of nasty hacks in the mangler caused by the need to
          identify closure ptrs vs. info tables have gone away.
      
        - Info tables are a bit more complicated.  See InfoTables.h for the
          details.
      
        - As a side effect, GHCi can now deal with polymorphic seq.  Some bugs
          in GHCi which affected primitives and unboxed tuples are now
          fixed.
      
        - Binary sizes are reduced by about 7% on x86.  Performance is roughly
          similar, some programs get faster while some get slower.  I've seen
          GHCi perform worse on some examples, but haven't investigated
          further yet (GHCi performance *should* be about the same or better
          in theory).
      
        - Internally the code generator is rather better organised.  I've moved
          info-table generation from the NCG into the main codeGen where it is
          shared with the C back-end; info tables are now emitted as arrays
          of words in both back-ends.  The NCG is one step closer to being able
          to support profiling.
      
      This has all been fairly thoroughly tested, but no doubt I've messed
      up the commit in some way.
  30. 25 Oct, 2002 1 commit
    • [project @ 2002-10-25 09:40:47 by simonmar] · 67944e15
      simonmar authored
      In eval_thunk_selector(), don't follow IND_STATICs because they might
      lead us into to-space.  Fixes a case of "EVACUATED object entered!".
      
      Also, add an assertion to catch this bug earlier.
      
      MERGE TO STABLE
  31. 25 Sep, 2002 1 commit
    • [project @ 2002-09-25 14:46:31 by simonmar] · 28e69a6b
      simonmar authored
      Fix a scheduling/GC bug, spotted by Wolfgang Thaller.  If a main
      thread completes, and a GC runs before the return (from rts_evalIO())
      happens, then the thread might be GC'd before we get a chance to
      extract its return value, leading to barf("main thread has been GC'd")
      from the garbage collector.
      
      The fix is to treat all main threads which have completed as roots:
      this is logically the right thing to do, because these threads must be
      retained by virtue of holding the return value, and this is a property of
      main threads only.
  32. 18 Sep, 2002 1 commit
  33. 17 Sep, 2002 1 commit
    • [project @ 2002-09-17 12:11:44 by simonmar] · 48c557b5
      simonmar authored
      The GC wasn't properly marking pending signal handlers, which could
      lead to "EVACUATED object entered!" errors.  Also, a race occurs if a
      signal arrives during GC.  Two fixes:
      
        (a) mark all pending signal handlers during GC, and
        (b) block signals during GC
      
      MERGE TO STABLE
  34. 10 Sep, 2002 1 commit
  35. 06 Sep, 2002 1 commit
    • [project @ 2002-09-06 09:56:12 by simonmar] · f8e722a4
      simonmar authored
      Selector Thunk Fix, take II.
      
      The previous version didn't deal well with selector thunks which point
      to more selector thunks, and on closer inspection the method was
      flawed.  Now I've introduced a function
      
      	StgClosure *eval_selector_thunk( int field, StgClosure * )
      
      which evaluates a selector thunk returning its value, in from-space,
      if possible.  It blackholes the thunk during evaluation.  It might
      recursively evaluate more selector thunks, but it does this in a
      bounded way and updates the thunks with indirections (NOT forwarding
      pointers) after evaluation.
      
      This cleans things up somewhat, and I believe it deals properly with
      both types of selector-thunk loops that arise.
      
      MERGE TO STABLE
  36. 05 Sep, 2002 1 commit
    • [project @ 2002-09-05 16:26:33 by simonmar] · 8435b2e4
      simonmar authored
      Fix for infinite loop when there is a THUNK_SELECTOR which eventually
      refers to itself, such as might be generated by code like
      
      	let x = (fst x, snd x) in ...
      
      At the same time, I re-enabled the code to traverse multiple selector
      thunks with bounded depth, because I believe it now works.
      
      MERGE TO STABLE (but test thoroughly in the HEAD first, this is
      fragile stuff)
  37. 16 Aug, 2002 1 commit