1. 18 Aug, 2009 1 commit
    • Simon Marlow's avatar
      Fix #3429: a tricky race condition · c5cafbcc
      Simon Marlow authored
      There were two bugs, and had it not been for the first one we would
      not have noticed the second one, so this is quite fortunate.
      
      The first bug is in stg_unblockAsyncExceptionszh_ret, when we found a
      pending exception to raise, but don't end up raising it, there was a
      missing adjustment to the stack pointer.  
      
      The second bug was that this case was actually happening at all: it
      ought to be incredibly rare, because the pending exception thread
      would have to be killed between us finding it and attempting to raise
      the exception.  This made me suspicious.  It turned out that there was
      a race condition on the tso->flags field; multiple threads were
      updating this bitmask field non-atomically (one of the bits is the
      dirty-bit for the generational GC).  The fix is to move the dirty bit
      into its own field of the TSO, making the TSO one word larger (sadly).
      c5cafbcc
  2. 02 Aug, 2009 1 commit
    • Simon Marlow's avatar
      RTS tidyup sweep, first phase · a2a67cd5
      Simon Marlow authored
      The first phase of this tidyup is focussed on the header files, and in
      particular making sure we are exposinng publicly exactly what we need
      to, and no more.
      
       - Rts.h now includes everything that the RTS exposes publicly,
         rather than a random subset of it.
      
       - Most of the public header files have moved into subdirectories, and
         many of them have been renamed.  But clients should not need to
         include any of the other headers directly, just #include the main
         public headers: Rts.h, HsFFI.h, RtsAPI.h.
      
       - All the headers needed for via-C compilation have moved into the
         stg subdirectory, which is self-contained.  Most of the headers for
         the rest of the RTS APIs have moved into the rts subdirectory.
      
       - I left MachDeps.h where it is, because it is so widely used in
         Haskell code.
       
       - I left a deprecated stub for RtsFlags.h in place.  The flag
         structures are now exposed by Rts.h.
      
       - Various internal APIs are no longer exposed by public header files.
      
       - Various bits of dead code and declarations have been removed
      
       - More gcc warnings are turned on, and the RTS code is more
         warning-clean.
      
       - More source files #include "PosixSource.h", and hence only use
         standard POSIX (1003.1c-1995) interfaces.
      
      There is a lot more tidying up still to do, this is just the first
      pass.  I also intend to standardise the names for external RTS APIs
      (e.g use the rts_ prefix consistently), and declare the internal APIs
      as hidden for shared libraries.
      a2a67cd5
  3. 09 Sep, 2008 1 commit
  4. 10 Jul, 2008 1 commit
  5. 16 Apr, 2008 1 commit
  6. 11 Sep, 2007 1 commit
    • Simon Marlow's avatar
      FIX #1466 (partly), which was causing concprog001(ghci) to fail · 066f1028
      Simon Marlow authored
      An AP_STACK now ensures that there is at least AP_STACK_SPLIM words of
      stack headroom available after unpacking the payload.  Continuations
      that require more than AP_STACK_SPLIM words of stack must do their own
      stack checks instead of aggregating their stack usage into the parent
      frame.  I have made this change for the interpreter, but not for
      compiled code yet - we should do this in the glorious rewrite of the
      code generator.
      066f1028
  7. 18 Jul, 2007 1 commit
  8. 17 Apr, 2007 1 commit
    • Simon Marlow's avatar
      Re-working of the breakpoint support · cdce6477
      Simon Marlow authored
      This is the result of Bernie Pope's internship work at MSR Cambridge,
      with some subsequent improvements by me.  The main plan was to
      
       (a) Reduce the overhead for breakpoints, so we could enable 
           the feature by default without incurrent a significant penalty
       (b) Scatter more breakpoint sites throughout the code
      
      Currently we can set a breakpoint on almost any subexpression, and the
      overhead is around 1.5x slower than normal GHCi.  I hope to be able to
      get this down further and/or allow breakpoints to be turned off.
      
      This patch also fixes up :print following the recent changes to
      constructor info tables.  (most of the :print tests now pass)
      
      We now support single-stepping, which just enables all breakpoints.
      
        :step <expr>     executes <expr> with single-stepping turned on
        :step            single-steps from the current breakpoint
      
      The mechanism is quite different to the previous implementation.  We
      share code with the HPC (haskell program coverage) implementation now.
      The coverage pass annotates source code with "tick" locations which
      are tracked by the coverage tool.  In GHCi, each "tick" becomes a
      potential breakpoint location.
      
      Previously breakpoints were compiled into code that magically invoked
      a nested instance of GHCi.  Now, a breakpoint causes the current
      thread to block and control is returned to GHCi.
      
      See the wiki page for more details and the current ToDo list:
      
        http://hackage.haskell.org/trac/ghc/wiki/NewGhciDebugger
      cdce6477
  9. 28 Feb, 2007 1 commit
  10. 16 Jun, 2006 1 commit
    • Simon Marlow's avatar
      Asynchronous exception support for SMP · b1953bbb
      Simon Marlow authored
      This patch makes throwTo work with -threaded, and also refactors large
      parts of the concurrency support in the RTS to clean things up.  We
      have some new files:
      
        RaiseAsync.{c,h}	asynchronous exception support
        Threads.{c,h}         general threading-related utils
      
      Some of the contents of these new files used to be in Schedule.c,
      which is smaller and cleaner as a result of the split.
      
      Asynchronous exception support in the presence of multiple running
      Haskell threads is rather tricky.  In fact, to my annoyance there are
      still one or two bugs to track down, but the majority of the tests run
      now.
      b1953bbb
  11. 07 Apr, 2006 1 commit
    • Simon Marlow's avatar
      Reorganisation of the source tree · 0065d5ab
      Simon Marlow authored
      Most of the other users of the fptools build system have migrated to
      Cabal, and with the move to darcs we can now flatten the source tree
      without losing history, so here goes.
      
      The main change is that the ghc/ subdir is gone, and most of what it
      contained is now at the top level.  The build system now makes no
      pretense at being multi-project, it is just the GHC build system.
      
      No doubt this will break many things, and there will be a period of
      instability while we fix the dependencies.  A straightforward build
      should work, but I haven't yet fixed binary/source distributions.
      Changes to the Building Guide will follow, too.
      0065d5ab
  12. 08 Feb, 2006 1 commit
    • Simon Marlow's avatar
      make the smp way RTS-only, normal libraries now work with -smp · beb5737b
      Simon Marlow authored
      We had to bite the bullet here and add an extra word to every thunk,
      to enable running ordinary libraries on SMP.  Otherwise, we would have
      needed to ship an extra set of libraries with GHC 6.6 in addition to
      the two sets we already ship (normal + profiled), and all Cabal
      packages would have to be compiled for SMP too.  We decided it best
      just to take the hit now, making SMP easily accessible to everyone in
      GHC 6.6.
      
      Incedentally, although this increases allocation by around 12% on
      average, the performance hit is around 5%, and much less if your inner
      loop doesn't use any laziness.
      beb5737b
  13. 21 Oct, 2005 1 commit
    • simonmar's avatar
      [project @ 2005-10-21 14:02:17 by simonmar] · 03a9ff01
      simonmar authored
      Big re-hash of the threaded/SMP runtime
      
      This is a significant reworking of the threaded and SMP parts of
      the runtime.  There are two overall goals here:
      
        - To push down the scheduler lock, reducing contention and allowing
          more parts of the system to run without locks.  In particular,
          the scheduler does not require a lock any more in the common case.
      
        - To improve affinity, so that running Haskell threads stick to the
          same OS threads as much as possible.
      
      At this point we have the basic structure working, but there are some
      pieces missing.  I believe it's reasonably stable - the important
      parts of the testsuite pass in all the (normal,threaded,SMP) ways.
      
      In more detail:
      
        - Each capability now has a run queue, instead of one global run
          queue.  The Capability and Task APIs have been completely
          rewritten; see Capability.h and Task.h for the details.
      
        - Each capability has its own pool of worker Tasks.  Hence, Haskell
          threads on a Capability's run queue will run on the same worker
          Task(s).  As long as the OS is doing something reasonable, this
          should mean they usually stick to the same CPU.  Another way to
          look at this is that we're assuming each Capability is associated
          with a fixed CPU.
      
        - What used to be StgMainThread is now part of the Task structure.
          Every OS thread in the runtime has an associated Task, and it
          can ask for its current Task at any time with myTask().
      
        - removed RTS_SUPPORTS_THREADS symbol, use THREADED_RTS instead
          (it is now defined for SMP too).
      
        - The RtsAPI has had to change; we must explicitly pass a Capability
          around now.  The previous interface assumed some global state.
          SchedAPI has also changed a lot.
      
        - The OSThreads API now supports thread-local storage, used to
          implement myTask(), although it could be done more efficiently
          using gcc's __thread extension when available.
      
        - I've moved some POSIX-specific stuff into the posix subdirectory,
          moving in the direction of separating out platform-specific
          implementations.
      
        - lots of lock-debugging and assertions in the runtime.  In particular,
          when DEBUG is on, we catch multiple ACQUIRE_LOCK()s, and there is
          also an ASSERT_LOCK_HELD() call.
      
      What's missing so far:
      
        - I have almost certainly broken the Win32 build, will fix soon.
      
        - any kind of thread migration or load balancing.  This is high up
          the agenda, though.
      
        - various performance tweaks to do
      
        - throwTo and forkProcess still do not work in SMP mode
      03a9ff01
  14. 22 Apr, 2005 1 commit
    • simonmar's avatar
      [project @ 2005-04-22 12:28:00 by simonmar] · ec0984a9
      simonmar authored
      - Now that labels are always prefixed with '&' in .hc code, we have to
        fix some sloppiness in the RTS .cmm code.  Fortunately it's not too
        painful.
      
      - SMP: acquire/release the storage manager lock around
        atomicModifyMutVar#.  This is a hack: atomicModifyMutVar# isn't
        atomic under SMP otherwise, but the SM lock is a large sledgehammer.
        I think I'll apply the sledgehammer to the MVar primitives too, for
        the time being.
      ec0984a9
  15. 27 Mar, 2005 1 commit
    • panne's avatar
      [project @ 2005-03-27 13:41:13 by panne] · 03dc2dd3
      panne authored
      * Some preprocessors don't like the C99/C++ '//' comments after a
        directive, so use '/* */' instead. For consistency, a lot of '//' in
        the include files were converted, too.
      
      * UnDOSified libraries/base/cbits/runProcess.c.
      
      * My favourite sport: Killed $Id$s.
      03dc2dd3
  16. 10 Feb, 2005 1 commit
    • simonmar's avatar
      [project @ 2005-02-10 13:01:52 by simonmar] · e7c3f957
      simonmar authored
      GC changes: instead of threading old-generation mutable lists
      through objects in the heap, keep it in a separate flat array.
      
      This has some advantages:
      
        - the IND_OLDGEN object is now only 2 words, so the minimum
          size of a THUNK is now 2 words instead of 3.  This saves
          some amount of allocation (about 2% on average according to
          my measurements), and is more friendly to the cache by
          squashing objects together more.
      
        - keeping the mutable list separate from the IND object
          will be necessary for our multiprocessor implementation.
      
        - removing the mut_link field makes the layout of some objects
          more uniform, leading to less complexity and special cases.
      
        - I also unified the two mutable lists (mut_once_list and mut_list)
          into a single mutable list, which lead to more simplifications
          in the GC.
      e7c3f957
  17. 18 Nov, 2004 1 commit
  18. 13 Aug, 2004 1 commit
  19. 28 Apr, 2003 1 commit
    • simonmar's avatar
      [project @ 2003-04-28 09:55:20 by simonmar] · 1fd3ead7
      simonmar authored
      Following the recent change to the layout of the StgRetDyn frame, we
      now need to bump RESERVED_STACK_WORDS because this governs the amount
      of room which is guaranteed to be available on the stack in the event
      of a stack check failure.
      
      This accounts for at least one cause of recent crashes in the HEAD.
      1fd3ead7
  20. 25 Mar, 2003 2 commits
  21. 11 Dec, 2002 1 commit
    • simonmar's avatar
      [project @ 2002-12-11 15:36:20 by simonmar] · 0bffc410
      simonmar authored
      Merge the eval-apply-branch on to the HEAD
      ------------------------------------------
      
      This is a change to GHC's evaluation model in order to ultimately make
      GHC more portable and to reduce complexity in some areas.
      
      At some point we'll update the commentary to describe the new state of
      the RTS.  Pending that, the highlights of this change are:
      
        - No more Su.  The Su register is gone, update frames are one
          word smaller.
      
        - Slow-entry points and arg checks are gone.  Unknown function calls
          are handled by automatically-generated RTS entry points (AutoApply.hc,
          generated by the program in utils/genapply).
      
        - The stack layout is stricter: there are no "pending arguments" on
          the stack any more, the stack is always strictly a sequence of
          stack frames.
      
          This means that there's no need for LOOKS_LIKE_GHC_INFO() or
          LOOKS_LIKE_STATIC_CLOSURE() any more, and GHC doesn't need to know
          how to find the boundary between the text and data segments (BIG WIN!).
      
        - A couple of nasty hacks in the mangler caused by the neet to
          identify closure ptrs vs. info tables have gone away.
      
        - Info tables are a bit more complicated.  See InfoTables.h for the
          details.
      
        - As a side effect, GHCi can now deal with polymorphic seq.  Some bugs
          in GHCi which affected primitives and unboxed tuples are now
          fixed.
      
        - Binary sizes are reduced by about 7% on x86.  Performance is roughly
          similar, some programs get faster while some get slower.  I've seen
          GHCi perform worse on some examples, but haven't investigated
          further yet (GHCi performance *should* be about the same or better
          in theory).
      
        - Internally the code generator is rather better organised.  I've moved
          info-table generation from the NCG into the main codeGen where it is
          shared with the C back-end; info tables are now emitted as arrays
          of words in both back-ends.  The NCG is one step closer to being able
          to support profiling.
      
      This has all been fairly thoroughly tested, but no doubt I've messed
      up the commit in some way.
      0bffc410
  22. 23 Sep, 2002 1 commit
  23. 28 Nov, 2001 1 commit
  24. 26 Nov, 2001 1 commit
    • simonmar's avatar
      [project @ 2001-11-26 16:54:21 by simonmar] · dbef766c
      simonmar authored
      Profiling cleanup.
      
      This commit eliminates some duplication in the various heap profiling
      subsystems, and generally centralises much of the machinery.  The key
      concept is the separation of a heap *census* (which is now done in one
      place only instead of three) from the calculation of retainer sets.
      Previously the retainer profiling code also did a heap census on the
      fly, and lag-drag-void profiling had its own census machinery.
      
      Value-adds:
      
         - you can now restrict a heap profile to certain retainer sets,
           but still display by cost centre (or type, or closure or
           whatever).
      
         - I've added an option to restrict the maximum retainer set size
           (+RTS -R<size>, defaulting to 8).
      
         - I've cleaned up the heap profiling options at the request of
           Simon PJ.  See the help text for details.  The new scheme
           is backwards compatible with the old.
      
         - I've removed some odd bits of LDV or retainer profiling-specific
           code from various parts of the system.
      
         - the time taken doing heap censuses (and retainer set calculation)
           is now accurately reported by the RTS when you say +RTS -Sstderr.
      
      Still to come:
      
         - restricting a profile to a particular biography
           (lag/drag/void/use).  This requires keeping old heap censuses
           around, but the infrastructure is now in place to do this.
      dbef766c
  25. 03 Oct, 2001 1 commit
    • simonmar's avatar
      [project @ 2001-10-03 13:57:42 by simonmar] · b4623557
      simonmar authored
      Tidy up ghc/includes/Constants and related things.
      
      Now all the constants that the compiler needs to know, such as header
      size, update frame size, info table size and so on are generated
      automatically into a header file, DeriviedConstants.h, by a small C
      program in the same way as NativeDefs.h.  The C code in the RTS is
      expected to use sizeof() directly (it already does).
      
      Also tidied up the constants in MachDeps.h - all the constants
      representing the sizes of various types are named SIZEOF_<foo>, to
      match the constants defined in config.h.  PrelStorable.lhs now doesn't
      contain any special knowledge about GHC's conventions as regards the
      size of certain types, this is all in MachDeps.h.
      b4623557
  26. 01 Aug, 2001 1 commit
  27. 31 Jul, 2001 2 commits
  28. 07 Aug, 2000 1 commit
    • qrczak's avatar
      [project @ 2000-08-07 23:37:19 by qrczak] · 4b172698
      qrczak authored
      Now Char, Char#, StgChar have 31 bits (physically 32).
      "foo"# is still an array of bytes.
      
      CharRep represents 32 bits (on a 64-bit arch too). There is also
      Int8Rep, used in those places where bytes were originally meant.
      readCharArray, indexCharOffAddr etc. still use bytes. Storable and
      {I,M}Array use wide Chars.
      
      In future perhaps all sized integers should be primitive types. Then
      some usages of indexing primops scattered through the code could
      be changed to then-available Int8 ones, and then Char variants of
      primops could be made wide (other usages that handle text should use
      conversion that will be provided later).
      
      I/O and _ccall_ arguments assume ISO-8859-1. UTF-8 is internally used
      for string literals (only).
      
      Z-encoding is ready for Unicode identifiers.
      
      Ranges of intlike and charlike closures are more easily configurable.
      
      I've probably broken nativeGen/MachCode.lhs:chrCode for Alpha but I
      don't know the Alpha assembler to fix it (what is zapnot?). Generally
      I'm not sure if I've done the NCG changes right.
      
      This commit breaks the binary compatibility (of course).
      
      TODO:
      * is* and to{Lower,Upper} in Char (in progress).
      * Libraries for text conversion (in design / experiments),
        to be plugged to I/O and a higher level foreign library.
      * PackedString.
      * StringBuffer and accepting source in encodings other than ISO-8859-1.
      4b172698
  29. 03 Aug, 2000 1 commit
    • simonmar's avatar
      [project @ 2000-08-03 11:28:35 by simonmar] · 66f7a41d
      simonmar authored
      Implement +RTS -C<n>, the context switch interval flag.  This was
      previously advertised in the usage message, but there was a note in
      the Users' Guide stating that it didn't work.  Anwyay, I'm going to
      consider it a bug and backport to 4.08.1.
      66f7a41d
  30. 26 Jul, 2000 1 commit
  31. 28 Feb, 2000 1 commit
    • sewardj's avatar
      [project @ 2000-02-28 12:02:31 by sewardj] · 4070b105
      sewardj authored
      Many changes to improve the quality and correctness of generated code,
      both for x86 and all-platforms.  The intent is that the x86 NCG will
      now be good enough for general use.
      
      -- Add an almost-trivial Stix (generic) peephole optimiser, whose sole
         purpose is elide assignments to temporaries used only once, in the
         very next tree.  This generates substantially better code for
         conditionals on all platforms.  Enhance Stix constant folding to
         take advantage of the inlining.
      
         The inlining presents subsequent insn selection phases with more
         complex trees than would have previously been used to.  This has
         shown up several bugs in the x86 insn selectors, now fixed.
         (assumptions that data size is Word, when could be Byte,
          assumptions that an operand will always be in a temp reg, etc)
      
      -- x86: Use the FLDZ and FLD1 insns.
      
      -- x86: spill FP registers with 80-bit loads/stores so that
         Intel's extra 16 bits of accuracy are not lost.  If this isn't
         done, FP spills are not suitably transparent.  Increase the
         number of spill words available to 2048.
      
      -- x86: give the register allocator more flexibility in choosing
         spill temporaries.
      
      -- x86, RegAllocInfo.regUsage: fix error for GST, and rewrite to
         make it clearer.
      
      -- Correctly track movements in the C stack pointer, and generate
         correct spill code for archs which spill against the stack pointer
         even when the stack pointer moves.  Redo the x86 ccall mechanism
         to push args on the C stack in the normal way.  Rather than have
         the spiller have to analyse code sequences to determine the current
         stack offset, the insn selectors communicate the current offset
         whenever it changes by inserting a DELTA pseudo-insn.  Then the
         spiller only has to spot DELTAs.
      
         This means having a new native-code-generator monad (Stix.NatM)
         which carries both a UniqSupply and the current stack offset.
      
      -- Remove the asmPar/asmSeq ways of grouping insns together.
         In the presence of fixed registers, it is hard to demonstrate
         that insn selectors using asmPar always give correct code, and
         the extra complication doesn't help any.
      
         Also, directly construct code sequences using tree-based ordered
         lists (utils/OrdList.lhs) for linear-time appends, rather than
         the bizarrely complex method using fns and fn composition.
      
      -- Inline some hcats in printing of x86 address modes.
      
      -- Document more of the hidden assumptions which insn selection relies
         on, particular wrt addressing modes.
      4070b105
  32. 01 Feb, 2000 1 commit
  33. 24 Jan, 2000 1 commit
    • sewardj's avatar
      [project @ 2000-01-24 18:22:07 by sewardj] · 9ac31f7c
      sewardj authored
      ARR_HDR_SIZE --> ARR_WORDS_HDR_SIZE, and derived quantities in
      Constants.h, Constants.lhs et al are similarly renamed.
      
      new constant ARR_PTRS_HDR_SIZE, with corresponding derivatives.
      9ac31f7c
  34. 13 Jan, 2000 1 commit
    • hwloidl's avatar
      [project @ 2000-01-13 14:33:57 by hwloidl] · 1b28d4e1
      hwloidl authored
      Merged GUM-4-04 branch into the main trunk. In particular merged GUM and
      SMP code. Most of the GranSim code in GUM-4-04 still has to be carried over.
      1b28d4e1
  35. 27 Oct, 1999 1 commit
  36. 26 Mar, 1999 1 commit
  37. 05 Feb, 1999 1 commit
  38. 26 Jan, 1999 1 commit