1. 22 Apr, 2005 2 commits
    • sof's avatar
      [project @ 2005-04-22 16:49:38 by sof] · 23e16cda
      sof authored
      resetStaticObjectForRetainerProfiling(): warning wibble
      23e16cda
    • simonmar's avatar
      [project @ 2005-04-22 09:32:39 by simonmar] · 0f3205e6
      simonmar authored
      SMP: the rest of the changes to support safe thunk entry & updates.  I
      thought the compiler changes were independent, but I ended up breaking
      the HEAD, so I'll have to commit the rest.  non-SMP compilation should
      not be affected.
      0f3205e6
  2. 05 Apr, 2005 2 commits
    • simonmar's avatar
      [project @ 2005-04-05 12:19:54 by simonmar] · 16214216
      simonmar authored
      Some multi-processor hackery, including
      
        - Don't hang blocked threads off BLACKHOLEs any more, instead keep
          them all on a separate queue which is checked periodically for
          threads to wake up.
      
          This is good because (a) we don't have to worry about locking the
          closure in SMP mode when we want to block on it, and (b) it means
          the standard update code doesn't need to wake up any threads or
          check for a BLACKHOLE_BQ, simplifying the update code.
      
          The downside is that if there are lots of threads blocked on
          BLACKHOLEs, we might have to do a lot of repeated list traversal.
          We don't expect this to be common, though.  conc023 goes slower
          with this change, but we expect most programs to benefit from the
          shorter update code.
      
        - Fixing up the Capability code to handle multiple capabilities (SMP
          mode), and related changes to get the SMP mode at least building.
      16214216
    • simonmar's avatar
      [project @ 2005-04-05 09:38:00 by simonmar] · 3f4fd743
      simonmar authored
      Main x86_64 hacking: we have a problem on this arch where binutils
      can't generate 64-bit relative relocations (R_X86_64_PC64), which many
      of our info-table fields are.  So far we've been hacking around it by
      putting everything in the text section, but I've decided to adopt
      another approach: we'll use explicit 32-bit offset fields on this
      platform instead.  This is safe in the default "small" memory model
      where all symbols are guaranteed to be in the lower 2Gb of the address
      space.
      
      NCG changes coming; mangler changes are probably required too.
      3f4fd743
  3. 27 Mar, 2005 1 commit
    • panne's avatar
      [project @ 2005-03-27 13:41:13 by panne] · 03dc2dd3
      panne authored
      * Some preprocessors don't like the C99/C++ '//' comments after a
        directive, so use '/* */' instead. For consistency, a lot of '//' in
        the include files were converted, too.
      
      * UnDOSified libraries/base/cbits/runProcess.c.
      
      * My favourite sport: Killed $Id$s.
      03dc2dd3
  4. 24 Feb, 2005 1 commit
  5. 10 Feb, 2005 1 commit
    • simonmar's avatar
      [project @ 2005-02-10 13:01:52 by simonmar] · e7c3f957
      simonmar authored
      GC changes: instead of threading old-generation mutable lists
      through objects in the heap, keep it in a separate flat array.
      
      This has some advantages:
      
        - the IND_OLDGEN object is now only 2 words, so the minimum
          size of a THUNK is now 2 words instead of 3.  This saves
          some amount of allocation (about 2% on average according to
          my measurements), and is more friendly to the cache by
          squashing objects together more.
      
        - keeping the mutable list separate from the IND object
          will be necessary for our multiprocessor implementation.
      
        - removing the mut_link field makes the layout of some objects
          more uniform, leading to less complexity and special cases.
      
        - I also unified the two mutable lists (mut_once_list and mut_list)
          into a single mutable list, which lead to more simplifications
          in the GC.
      e7c3f957
  6. 07 Oct, 2004 1 commit
    • wolfgang's avatar
      [project @ 2004-10-07 15:54:03 by wolfgang] · b4d045ae
      wolfgang authored
      Position Independent Code and Dynamic Linking Support, Part 1
      
      This commit allows generation of position independent code (PIC) that fully supports dynamic linking on Mac OS X and PowerPC Linux.
      Other platforms are not yet supported, and there is no support for actually linking or using dynamic libraries - so if you use the -fPIC or -dynamic code generation flags, you have to type your (platform-specific) linker command lines yourself.
      
      
      nativeGen/PositionIndependentCode.hs:
      New file. Look here for some more comments on how this works.
      
      cmm/CLabel.hs:
      Add support for DynamicLinkerLabels and PIC base labels - for use inside the NCG.
      needsCDecl: Case alternative labels now need C decls, see the codeGen/CgInfoTbls.hs below for details
      
      cmm/Cmm.hs:
      Add CmmPicBaseReg (used in NCG),
      and CmmLabelDiffOff (used in NCG and for offsets in info tables)
      
      cmm/CmmParse.y:
      support offsets in info tables
      
      cmm/PprC.hs:
      support CmmLabelDiffOff
      Case alternative labels now need C decls (see the codeGen/CgInfoTbls.hs for details), so we need to pprDataExterns for info tables.
      
      cmm/PprCmm.hs:
      support CmmLabelDiffOff
      
      codeGen/CgInfoTbls.hs:
      no longer store absolute addresses in info tables, instead, we store offsets.
      Also, for vectored return points, emit the alternatives _after_ the vector table. This is to work around a limitation in Apple's as, which refuses to handle label differences where one label is at the end of a section. Emitting alternatives after vector info tables makes sure this never happens in GHC generated code. Case alternatives now require prototypes in hc code, though (see changes in PprC.hs, CLabel.hs).
      
      main/CmdLineOpts.lhs:
      Add a new option, -fPIC.
      
      main/DriverFlags.hs:
      Pass the correct options for PIC to gcc, depending on the platform. Only for powerpc for now.
      
      nativeGen/AsmCodeGen.hs:
      Many changes...
      Mac OS X-specific management of import stubs is no longer, it's now part of a general mechanism to handle such things for all platforms that need it (Darwin [both ppc and x86], Linux on ppc, and some platforms we don't support).
      Move cmmToCmm into its own monad which can accumulate a list of imported symbols. Make it call cmmMakeDynamicReference at the right places.
      
      nativeGen/MachCodeGen.hs:
      nativeGen/MachInstrs.hs:
      nativeGen/MachRegs.lhs:
      nativeGen/PprMach.hs:
      nativeGen/RegAllocInfo.hs:
      Too many changes to enumerate here, PowerPC specific.
      
      nativeGen/NCGMonad.hs:
      NatM still tracks imported symbols, as more labels can be created during code generation (float literals, jump tables; on some platforms all data access has to go through the dynamic linking mechanism).
      
      driver/mangler/ghc-asm.lprl:
      Mangle absolute addresses in info tables to offsets.
      Correctly pass through GCC-generated PIC for Mac OS X and powerpc linux.
      
      includes/Cmm.h:
      includes/InfoTables.h:
      includes/Storage.h:
      includes/mkDerivedConstants.c:
      rts/GC.c:
      rts/GCCompact.c:
      rts/HeapStackCheck.cmm:
      rts/Printer.c:
      rts/RetainerProfile.c:
      rts/Sanity.c:
      Adapt to the fact that info tables now contain offsets.
      
      rts/Linker.c:
      Mac-specific: change machoInitSymbolsWithoutUnderscore to support PIC.
      b4d045ae
  7. 12 Sep, 2004 1 commit
  8. 03 Sep, 2004 1 commit
    • simonmar's avatar
      [project @ 2004-09-03 15:28:18 by simonmar] · 95ca6bff
      simonmar authored
      Cleanup: all (well, most) messages from the RTS now go through the
      functions in RtsUtils: barf(), debugBelch() and errorBelch().  The
      latter two were previously called belch() and prog_belch()
      respectively.  See the comments for the right usage of these message
      functions.
      
      One reason for doing this is so that we can avoid spurious uses of
      stdout/stderr by Haskell apps on platforms where we shouldn't be using
      them (eg. non-console apps on Windows).
      95ca6bff
  9. 13 Aug, 2004 1 commit
  10. 16 May, 2003 1 commit
  11. 23 Apr, 2003 1 commit
  12. 21 Mar, 2003 1 commit
    • sof's avatar
      [project @ 2003-03-21 16:18:37 by sof] · 557bca73
      sof authored
      Friday morning code-wibbling:
      - made RetainerProfile.c:firstStack a 'static'
      - added RetainerProfile.c:retainerStackBlocks()
      557bca73
  13. 22 Feb, 2003 1 commit
    • sof's avatar
      [project @ 2003-02-22 04:51:50 by sof] · 557947d3
      sof authored
      Clean up code&interfaces that deals with timers and asynchrony:
      
      - Timer.{c,h} now defines the platform-independent interface
        to the timing services needed by the RTS. Itimer.{c,h} +
        win32/Ticker.{c,h} defines the OS-specific services that
        creates/destroys a timer.
      - For win32 plats, drop the long-standing use of the 'multimedia'
        API timers and implement the ticking service ourselves. Simpler
        and more flexible.
      - Select.c is now solely for platforms that use select() to handle
        non-blocking I/O & thread delays. win32/AwaitEvent.c provides
        the same API on the Win32 side.
      - support threadDelay on win32 platforms via worker threads.
      
      Not yet compiled up on non-win32 platforms; will do once checked in.
      557947d3
  14. 11 Dec, 2002 1 commit
    • simonmar's avatar
      [project @ 2002-12-11 15:36:20 by simonmar] · 0bffc410
      simonmar authored
      Merge the eval-apply-branch on to the HEAD
      ------------------------------------------
      
      This is a change to GHC's evaluation model in order to ultimately make
      GHC more portable and to reduce complexity in some areas.
      
      At some point we'll update the commentary to describe the new state of
      the RTS.  Pending that, the highlights of this change are:
      
        - No more Su.  The Su register is gone, update frames are one
          word smaller.
      
        - Slow-entry points and arg checks are gone.  Unknown function calls
          are handled by automatically-generated RTS entry points (AutoApply.hc,
          generated by the program in utils/genapply).
      
        - The stack layout is stricter: there are no "pending arguments" on
          the stack any more, the stack is always strictly a sequence of
          stack frames.
      
          This means that there's no need for LOOKS_LIKE_GHC_INFO() or
          LOOKS_LIKE_STATIC_CLOSURE() any more, and GHC doesn't need to know
          how to find the boundary between the text and data segments (BIG WIN!).
      
        - A couple of nasty hacks in the mangler caused by the neet to
          identify closure ptrs vs. info tables have gone away.
      
        - Info tables are a bit more complicated.  See InfoTables.h for the
          details.
      
        - As a side effect, GHCi can now deal with polymorphic seq.  Some bugs
          in GHCi which affected primitives and unboxed tuples are now
          fixed.
      
        - Binary sizes are reduced by about 7% on x86.  Performance is roughly
          similar, some programs get faster while some get slower.  I've seen
          GHCi perform worse on some examples, but haven't investigated
          further yet (GHCi performance *should* be about the same or better
          in theory).
      
        - Internally the code generator is rather better organised.  I've moved
          info-table generation from the NCG into the main codeGen where it is
          shared with the C back-end; info tables are now emitted as arrays
          of words in both back-ends.  The NCG is one step closer to being able
          to support profiling.
      
      This has all been fairly thoroughly tested, but no doubt I've messed
      up the commit in some way.
      0bffc410
  15. 18 Jul, 2002 1 commit
  16. 19 Dec, 2001 1 commit
  17. 12 Dec, 2001 1 commit
    • simonmar's avatar
      [project @ 2001-12-12 14:25:03 by simonmar] · 63706895
      simonmar authored
      - Relax the restriction that roots must also be retainers, by changing
        the type of the 'r' argument to retainClosure from (StgClosure *) to
        retainer.  Now retainRoot can pass CCS_SYSTEM as the retainer for a
        root if the closure is itself not a retainer.
      
      - Traverse roots from the stable ptr table, which might not also be
        retainers (hence the generalisation above).
      63706895
  18. 26 Nov, 2001 1 commit
    • simonmar's avatar
      [project @ 2001-11-26 16:54:21 by simonmar] · dbef766c
      simonmar authored
      Profiling cleanup.
      
      This commit eliminates some duplication in the various heap profiling
      subsystems, and generally centralises much of the machinery.  The key
      concept is the separation of a heap *census* (which is now done in one
      place only instead of three) from the calculation of retainer sets.
      Previously the retainer profiling code also did a heap census on the
      fly, and lag-drag-void profiling had its own census machinery.
      
      Value-adds:
      
         - you can now restrict a heap profile to certain retainer sets,
           but still display by cost centre (or type, or closure or
           whatever).
      
         - I've added an option to restrict the maximum retainer set size
           (+RTS -R<size>, defaulting to 8).
      
         - I've cleaned up the heap profiling options at the request of
           Simon PJ.  See the help text for details.  The new scheme
           is backwards compatible with the old.
      
         - I've removed some odd bits of LDV or retainer profiling-specific
           code from various parts of the system.
      
         - the time taken doing heap censuses (and retainer set calculation)
           is now accurately reported by the RTS when you say +RTS -Sstderr.
      
      Still to come:
      
         - restricting a profile to a particular biography
           (lag/drag/void/use).  This requires keeping old heap censuses
           around, but the infrastructure is now in place to do this.
      dbef766c
  19. 22 Nov, 2001 1 commit
    • simonmar's avatar
      [project @ 2001-11-22 14:25:11 by simonmar] · db61851c
      simonmar authored
      Retainer Profiling / Lag-drag-void profiling.
      
      This is mostly work by Sungwoo Park, who spent a summer internship at
      MSR Cambridge this year implementing these two types of heap profiling
      in GHC.
      
      Relative to Sungwoo's original work, I've made some improvements to
      the code:
      
         - it's now possible to apply constraints to retainer and LDV profiles
           in the same way as we do for other types of heap profile (eg.
           +RTS -hc{foo,bar} -hR -RTS gives you a retainer profiling considering
           only closures with cost centres 'foo' and 'bar').
      
         - the heap-profile timer implementation is cleaned up.
      
         - heap profiling no longer has to be run in a two-space heap.
      
         - general cleanup of the code and application of the SDM C coding
           style guidelines.
      
      Profiling will be a little slower and require more space than before,
      mainly because closures have an extra header word to support either
      retainer profiling or LDV profiling (you can't do both at the same
      time).
      
      We've used the new profiling tools on GHC itself, with moderate
      success.  Fixes for some space leaks in GHC to follow...
      db61851c