This project is mirrored from https://gitlab.haskell.org/ghc/ghc.git. Pull mirroring failed .
Repository mirroring has been paused due to too many failed attempts. It can be resumed by a project maintainer.
Last successful update .
  1. 02 Jun, 2009 2 commits
  2. 06 Mar, 2009 1 commit
    • Simon Marlow's avatar
      Partial fix for #2917 · 1b62aece
      Simon Marlow authored
       - add newAlignedPinnedByteArray# for allocating pinned BAs with
         arbitrary alignment
      
       - the old newPinnedByteArray# now aligns to 16 bytes
      
      Foreign.alloca will use newAlignedPinnedByteArray#, and so might end
      up wasting less space than before (we used to align to 8 by default).
      Foreign.allocaBytes and Foreign.mallocForeignPtrBytes will get 16-byte
      aligned memory, which is enough to avoid problems with SSE
      instructions on x86, for example.
      
      There was a bug in the old newPinnedByteArray#: it aligned to 8 bytes,
      but would have failed if the header was not a multiple of 8
      (fortunately it always was, even with profiling).  Also we
      occasionally wasted some space unnecessarily due to alignment in
      allocatePinned().
      
      I haven't done anything about Foreign.malloc/mallocBytes, which will
      give you the same alignment guarantees as malloc() (8 bytes on
      Linux/x86 here).
      1b62aece
  3. 27 Jan, 2009 1 commit
  4. 10 Dec, 2008 1 commit
  5. 18 Nov, 2008 1 commit
    • Simon Marlow's avatar
      Add optional eager black-holing, with new flag -feager-blackholing · d600bf7a
      Simon Marlow authored
      Eager blackholing can improve parallel performance by reducing the
      chances that two threads perform the same computation.  However, it
      has a cost: one extra memory write per thunk entry.  
      
      To get the best results, any code which may be executed in parallel
      should be compiled with eager blackholing turned on.  But since
      there's a cost for sequential code, we make it optional and turn it on
      for the parallel package only.  It might be a good idea to compile
      applications (or modules) with parallel code in with
      -feager-blackholing.
      
      ToDo: document -feager-blackholing.
      d600bf7a
  6. 06 Nov, 2008 1 commit
  7. 10 Oct, 2008 1 commit
  8. 19 Sep, 2008 1 commit
  9. 10 Jul, 2008 1 commit
  10. 09 Jul, 2008 1 commit
  11. 04 Jun, 2008 1 commit
  12. 16 Apr, 2008 1 commit
  13. 17 Apr, 2008 1 commit
  14. 09 Apr, 2008 2 commits
  15. 07 Apr, 2008 1 commit
  16. 02 Apr, 2008 1 commit
    • Simon Marlow's avatar
      Do not #include external header files when compiling via C · c245355e
      Simon Marlow authored
      This has several advantages:
      
       - -fvia-C is consistent with -fasm with respect to FFI declarations:
         both bind to the ABI, not the API.
      
       - foreign calls can now be inlined freely across module boundaries, since
         a header file is not required when compiling the call.
      
       - bootstrapping via C will be more reliable, because this difference
         in behavour between the two backends has been removed.
      
      There is one disadvantage:
      
       - we get no checking by the C compiler that the FFI declaration
         is correct.
      
      So now, the c-includes field in a .cabal file is always ignored by
      GHC, as are header files specified in an FFI declaration.  This was
      previously the case only for -fasm compilations, now it is also the
      case for -fvia-C too.
      c245355e
  17. 11 Oct, 2007 1 commit
    • Simon Marlow's avatar
      Add a proper write barrier for MVars · 1ed01a87
      Simon Marlow authored
      Previously MVars were always on the mutable list of the old
      generation, which meant every MVar was visited during every minor GC.
      With lots of MVars hanging around, this gets expensive.  We addressed
      this problem for MUT_VARs (aka IORefs) a while ago, the solution is to
      use a traditional GC write-barrier when the object is modified.  This
      patch does the same thing for MVars.
      
      TVars are still done the old way, they could probably benefit from the
      same treatment too.
      1ed01a87
  18. 10 Oct, 2007 1 commit
    • Simon Marlow's avatar
      GHCi: use non-updatable thunks for breakpoints · 27779403
      Simon Marlow authored
      The extra safe points introduced for breakpoints were previously
      compiled as normal updatable thunks, but they are guaranteed
      single-entry, so we can use non-updatable thunks here.  This restores
      the tail-call property where it was lost in some cases (although stack
      squeezing probably often recovered it), and should improve
      performance.
      27779403
  19. 16 May, 2007 1 commit
  20. 15 May, 2007 1 commit
    • Simon Marlow's avatar
      GHCi debugger: new flag -fbreak-on-exception · 17f848e1
      Simon Marlow authored
      When -fbreak-on-exception is set, an exception will cause GHCi to
      suspend the current computation and return to the prompt, where the
      history of the current evaluation can be inspected (if we are in
      :trace).  This isn't on by default, because the behaviour could be
      confusing: for example, ^C will cause a breakpoint.  It can be very
      useful for finding the cause of a "head []" or a "fromJust Nothing",
      though.
      17f848e1
  21. 17 Apr, 2007 1 commit
    • Simon Marlow's avatar
      Re-working of the breakpoint support · cdce6477
      Simon Marlow authored
      This is the result of Bernie Pope's internship work at MSR Cambridge,
      with some subsequent improvements by me.  The main plan was to
      
       (a) Reduce the overhead for breakpoints, so we could enable 
           the feature by default without incurrent a significant penalty
       (b) Scatter more breakpoint sites throughout the code
      
      Currently we can set a breakpoint on almost any subexpression, and the
      overhead is around 1.5x slower than normal GHCi.  I hope to be able to
      get this down further and/or allow breakpoints to be turned off.
      
      This patch also fixes up :print following the recent changes to
      constructor info tables.  (most of the :print tests now pass)
      
      We now support single-stepping, which just enables all breakpoints.
      
        :step <expr>     executes <expr> with single-stepping turned on
        :step            single-steps from the current breakpoint
      
      The mechanism is quite different to the previous implementation.  We
      share code with the HPC (haskell program coverage) implementation now.
      The coverage pass annotates source code with "tick" locations which
      are tracked by the coverage tool.  In GHCi, each "tick" becomes a
      potential breakpoint location.
      
      Previously breakpoints were compiled into code that magically invoked
      a nested instance of GHCi.  Now, a breakpoint causes the current
      thread to block and control is returned to GHCi.
      
      See the wiki page for more details and the current ToDo list:
      
        http://hackage.haskell.org/trac/ghc/wiki/NewGhciDebugger
      cdce6477
  22. 16 Apr, 2007 1 commit
    • Simon Marlow's avatar
      MERGE: Fix a few uses of the wrong return convention for the scheduler · 94363dd5
      Simon Marlow authored
      We changed the convention a while ago so that BaseReg is returned to
      the scheduler in R1, because BaseReg may change during the run of a
      thread, e.g. during a foreign call.  A few places got missed,
      mostly for very rare events.
      
      Should fix concprog001, although I'm not able to reliably reproduce
      the failure.
      94363dd5
  23. 07 Mar, 2007 1 commit
  24. 28 Feb, 2007 1 commit
  25. 09 Dec, 2006 1 commit
  26. 07 Oct, 2006 1 commit
  27. 20 Jun, 2006 1 commit
  28. 16 Jun, 2006 1 commit
    • Simon Marlow's avatar
      Asynchronous exception support for SMP · b1953bbb
      Simon Marlow authored
      This patch makes throwTo work with -threaded, and also refactors large
      parts of the concurrency support in the RTS to clean things up.  We
      have some new files:
      
        RaiseAsync.{c,h}	asynchronous exception support
        Threads.{c,h}         general threading-related utils
      
      Some of the contents of these new files used to be in Schedule.c,
      which is smaller and cleaner as a result of the split.
      
      Asynchronous exception support in the presence of multiple running
      Haskell threads is rather tricky.  In fact, to my annoyance there are
      still one or two bugs to track down, but the majority of the tests run
      now.
      b1953bbb
  29. 07 Apr, 2006 1 commit
    • Simon Marlow's avatar
      Reorganisation of the source tree · 0065d5ab
      Simon Marlow authored
      Most of the other users of the fptools build system have migrated to
      Cabal, and with the move to darcs we can now flatten the source tree
      without losing history, so here goes.
      
      The main change is that the ghc/ subdir is gone, and most of what it
      contained is now at the top level.  The build system now makes no
      pretense at being multi-project, it is just the GHC build system.
      
      No doubt this will break many things, and there will be a period of
      instability while we fix the dependencies.  A straightforward build
      should work, but I haven't yet fixed binary/source distributions.
      Changes to the Building Guide will follow, too.
      0065d5ab
  30. 27 Mar, 2006 1 commit
    • Simon Marlow's avatar
      Add a new primitive forkOn#, for forking a thread on a specific Capability · c520a3a2
      Simon Marlow authored
      This gives some control over affinity, while we figure out the best
      way to automatically schedule threads to make best use of the
      available parallelism.
      
      In addition to the primitive, there is also:
       
        GHC.Conc.forkOnIO :: Int -> IO () -> IO ThreadId
      
      where 'forkOnIO i m' creates a thread on Capability (i `rem` N), where
      N is the number of available Capabilities set by +RTS -N.
      
      Threads forked by forkOnIO do not automatically migrate when there are
      free Capabilities, like normal threads do.  Still, if you're using
      forkOnIO exclusively, it's a good idea to do +RTS -qm to disable work
      pushing anyway (work pushing takes too much time when the run queues
      are large, this is something we need to fix).
      c520a3a2
  31. 14 Mar, 2006 1 commit
    • Simon Marlow's avatar
      Make it a fatal error to try to enter a PAP · 960a5e6a
      Simon Marlow authored
      This is just an assertion, in effect: we should never enter a PAP, but
      for convenience we previously attached the PAP apply code to the PAP
      info table.  The problem with this was that it makes it harder to track
      down bugs that result in entering a PAP...
      960a5e6a
  32. 28 Feb, 2006 1 commit
    • Simon Marlow's avatar
      pass arguments to unknown function calls in registers · 04db0e9f
      Simon Marlow authored
      We now have more stg_ap entry points: stg_ap_*_fast, which take
      arguments in registers according to the platform calling convention.
      This is faster if the function being called is evaluated and has the
      right arity, which is the common case (see the eval/apply paper for
      measurements).  
      
      We still need the stg_ap_*_info entry points for stack-based
      application, such as an overflows when a function is applied to too
      many argumnets.  The stg_ap_*_fast functions actually just check for
      an evaluated function, and if they don't find one, push the args on
      the stack and invoke stg_ap_*_info.  (this might be slightly slower in
      some cases, but not the common case).
      04db0e9f
  33. 17 Jan, 2006 2 commits
    • simonmar's avatar
      [project @ 2006-01-17 16:13:18 by simonmar] · 91b07216
      simonmar authored
      Improve the GC behaviour of IORefs (see Ticket #650).
      
      This is a small change to the way IORefs interact with the GC, which
      should improve GC performance for programs with plenty of IORefs.
      
      Previously we had a single closure type for mutable variables,
      MUT_VAR.  Mutable variables were *always* on the mutable list in older
      generations, and always traversed on every GC.
      
      Now, we have two closure types: MUT_VAR_CLEAN and MUT_VAR_DIRTY.  The
      latter is on the mutable list, but the former is not.  (NB. this
      differs from MUT_ARR_PTRS_CLEAN and MUT_ARR_PTRS_DIRTY, both of which
      are on the mutable list).  writeMutVar# now implements a write
      barrier, by calling dirty_MUT_VAR() in the runtime, that does the
      necessary modification of MUT_VAR_CLEAN into MUT_VAR_DIRY, and adding
      to the mutable list if necessary.
      
      This results in some pretty dramatic speedups for GHC itself.  I've
      just measureed a 30% overall speedup compiling a 31-module program
      (anna) with the default heap settings :-D
      91b07216
    • simonmar's avatar
      [project @ 2006-01-17 16:03:47 by simonmar] · da69fa9c
      simonmar authored
      Improve the GC behaviour of IOArrays/STArrays
      
      See Ticket #650
      
      This is a small change to the way mutable arrays interact with the GC,
      that can have a dramatic effect on performance, and make tricks with
      unsafeThaw/unsafeFreeze redundant.  Data.HashTable should be faster
      now (I haven't measured it yet).
      
      We now have two mutable array closure types, MUT_ARR_PTRS_CLEAN and
      MUT_ARR_PTRS_DIRTY.  Both are on the mutable list if the array is in
      an old generation.  writeArray# sets the type to MUT_ARR_PTRS_DIRTY.
      The garbage collector can set the type to MUT_ARR_PTRS_CLEAN if it
      finds that no element of the array points into a younger generation
      (discovering this required a small addition to evacuate(), but rough
      tests indicate that it doesn't measurably affect performance).
      
      NOTE: none of this affects unboxed arrays (IOUArray/STUArray), only
      boxed arrays (IOArray/STArray).
      
      We could go further and extend the DIRTY bit to be per-block rather
      than for the whole array, but for now this is an easy improvement.
      da69fa9c
  34. 28 Nov, 2005 1 commit
    • simonmar's avatar
      [project @ 2005-11-28 14:39:47 by simonmar] · d3d69395
      simonmar authored
      Small performance improvement to STM: reduce the size of an atomically
      frame from 3 words to 2 words by combining the "waiting" boolean field
      with the info pointer, i.e. having two separate info tables/return
      addresses for an atomically frame, one for the normal case and one for
      the waiitng case.
      d3d69395
  35. 18 Nov, 2005 1 commit
    • simonmar's avatar
      [project @ 2005-11-18 15:24:12 by simonmar] · c5cd2343
      simonmar authored
      Two improvements to the SMP runtime:
      
        - support for 'par', aka sparks.  Load balancing is very primitive
          right now, but I have seen programs that go faster using par.
      
        - support for backing off when a thread is found to be duplicating
          a computation currently underway in another thread.  This also
          fixes some instability in SMP, because it turned out that when
          an update frame points to an indirection, which can happen if
          a thunk is under evaluation in multiple threads, then after GC
          has shorted out the indirection the update will trash the value.
          Now we suspend the duplicate computation to the heap before this
          can happen.
      
      Additionally:
      
        - stack squeezing is separate from lazy blackholing, and now only
          happens if there's a reasonable amount of squeezing to be done
          in relation to the number of words of stack that have to be moved.
          This means we won't try to shift 10Mb of stack just to save 2
          words at the bottom (it probably never happened, but still).
      
        - update frames are now marked when they have been visited by lazy
          blackholing, as per the SMP paper.
      
        - cleaned up raiseAsync() a bit.
      c5cd2343
  36. 10 Nov, 2005 1 commit
    • simonmar's avatar
      [project @ 2005-11-10 16:14:01 by simonmar] · 84b434c5
      simonmar authored
      Fix a crash in STM; we were releasing ownership of the transaction too
      early in stmWait(), so a TSO could be woken up before we had finished
      putting it to sleep properly.
      84b434c5
  37. 13 Oct, 2005 1 commit