1. 06 Dec, 2016 1 commit
    • Simon Marlow's avatar
      Overhaul GC stats · 24e6594c
      Simon Marlow authored
      Summary:
      Visible API changes:
      
      * The C struct `GCDetails` gives the stats about a single GC.  This is
        passed to the `gcDone()` callback if one is set via the
        RtsConfig. (previously we just passed a collection of values, so this
        is more extensible, at the expense of breaking the existing API)
      
      * `RTSStats` gives cumulative stats since the start of the program,
        and includes the `GCDetails` for the most recent GC.  This struct
        can be obtained via `getRTSStats()` (the old `getGCStats()` has been
        removed, and `getGCStatsEnabled()` has been renamed to
        `getRTSStatsEnabled()`)
      
      Improvements:
      
      * The per-GC stats and cumulative stats are now cleanly separated.
      
      * Inside the RTS we have a top-level `RTSStats` struct to keep all our
        stats in, previously this was just a collection of strangely-named
        variables.  This struct is mostly just copied in `getRTSStats()`, so
        the implementation of that function is a lot shorter.
      
      * Types are more consistent.  We use a uint64_t byte count for all
        memory values, and Time for all time values.
      
      * Names are more consistent.  We use a suffix `_bytes` for all byte
        counts and `_ns` for all time values.
      
      * We now collect information about the amount of memory in large
        objects and compact objects in `GCDetails`. (the latter was the reason
        I started doing this patch but it seems to have ballooned a bit!)
      
      * I fixed a bug in the calculation of the elapsed MUT time, and added
        an ASSERT to stop the calculations going wrong in the future.
      
      For now I kept the Haskell API in `GHC.Stats` the same, by
      impedence-matching with the new API.  We could either break that API
      and make it match the C API more closely, or we could add a new API
      and deprecate the old one.  Opinions welcome.
      
      This stuff is very easy to get wrong, and it's hard to test.  Reviews
      welcome!
      
      Test Plan:
      manual testing
      validate
      
      Reviewers: bgamari, niteria, austin, ezyang, hvr, erikd, rwbarton, Phyx
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D2756
      24e6594c
  2. 04 May, 2016 1 commit
  3. 07 Feb, 2016 1 commit
  4. 18 Nov, 2015 1 commit
  5. 07 Apr, 2015 1 commit
  6. 29 Sep, 2014 1 commit
  7. 28 Jul, 2014 1 commit
  8. 10 Jul, 2014 1 commit
  9. 14 Feb, 2013 1 commit
    • Simon Marlow's avatar
      Simplify the allocation stats accounting · 65a0e1eb
      Simon Marlow authored
      We were doing it in two different ways and asserting that the results
      were the same.  In most cases they were, but I found one case where
      they weren't: the GC itself allocates some memory for running
      finalizers, and this memory was accounted for one way but not the
      other.
      
      It was simpler to remove the old way of counting allocation that to
      try to fix it up, so I did that.
      65a0e1eb
  10. 17 Jan, 2013 1 commit
  11. 07 Sep, 2012 1 commit
    • Simon Marlow's avatar
      Deprecate lnat, and use StgWord instead · 41737f12
      Simon Marlow authored
      lnat was originally "long unsigned int" but we were using it when we
      wanted a 64-bit type on a 64-bit machine.  This broke on Windows x64,
      where long == int == 32 bits.  Using types of unspecified size is bad,
      but what we really wanted was a type with N bits on an N-bit machine.
      StgWord is exactly that.
      
      lnat was mentioned in some APIs that clients might be using
      (e.g. StackOverflowHook()), so we leave it defined but with a comment
      to say that it's deprecated.
      41737f12
  12. 26 Apr, 2012 2 commits
    • Ian Lynagh's avatar
      OS X build fixes · 42760bd9
      Ian Lynagh authored
      OS X doesn't understand 'gnu_printf', so we need to onyl use it
      conditionally.
      42760bd9
    • Ian Lynagh's avatar
      Fix warnings on Win64 · 1dbe6d59
      Ian Lynagh authored
      Mostly this meant getting pointer<->int conversions to use the right
      sizes. lnat is now size_t, rather than unsigned long, as that seems a
      better match for how it's used.
      1dbe6d59
  13. 04 Apr, 2012 2 commits
    • Mikolaj Konarski's avatar
      Fix the timestamps in GC_START and GC_END events on the GC-initiating cap · 598109eb
      Mikolaj Konarski authored
      There was a discrepancy between GC times reported in +RTS -s
      and the timestamps of GC_START and GC_END events on the cap,
      on which +RTS -s stats for the given GC are based.
      This is fixed by posting the events with exactly the same timestamp
      as generated for the stat calculation. The calls posting the events
      are moved too, so that the events are emitted close to the time instant
      they claim to be emitted at. The GC_STATS_GHC was moved, too, ensuring
      it's emitted before the moved GC_END on all caps, which simplifies tools code.
      598109eb
    • Duncan Coutts's avatar
      Add new eventlog events for various heap and GC statistics · 65aaa9b2
      Duncan Coutts authored
      They cover much the same info as is available via the GHC.Stats module
      or via the '+RTS -s' textual output, but via the eventlog and with a
      better sampling frequency.
      
      We have three new generic heap info events and two very GHC-specific
      ones. (The hope is the general ones are usable by other implementations
      that use the same eventlog system, or indeed not so sensitive to changes
      in GHC itself.)
      
      The general ones are:
      
       * total heap mem allocated since prog start, on a per-HEC basis
       * current size of the heap (MBlocks reserved from OS for the heap)
       * current size of live data in the heap
      
      Currently these are all emitted by GHC at GC time (live data only at
      major GC).
      
      The GHC specific ones are:
      
       * an event giving various static heap paramaters:
         * number of generations (usually 2)
         * max size if any
         * nursary size
         * MBlock and block sizes
       * a event emitted on each GC containing:
         * GC generation (usually just 0,1)
         * total bytes copied
         * bytes lost to heap slop and fragmentation
         * the number of threads in the parallel GC (1 for serial)
         * the maximum number of bytes copied by any par GC thread
         * the total number of bytes copied by all par GC threads
           (these last three can be used to calculate an estimate of the
            work balance in parallel GCs)
      65aaa9b2
  14. 25 Nov, 2011 1 commit
    • Simon Marlow's avatar
      Time handling overhaul · 6b109851
      Simon Marlow authored
      Terminology cleanup: the type "Ticks" has been renamed "Time", which
      is an StgWord64 in units of TIME_RESOLUTION (currently nanoseconds).
      The terminology "tick" is now used consistently to mean the interval
      between timer signals.
      
      The ticker now always ticks in realtime (actually CLOCK_MONOTONIC if
      we have it).  Before it used CPU time in the non-threaded RTS and
      realtime in the threaded RTS, but I've discovered that the CPU timer
      has terrible resolution (at least on Linux) and isn't much use for
      profiling.  So now we always use realtime.  This should also fix
      
      The default tick interval is now 10ms, except when profiling where we
      drop it to 1ms.  This gives more accurate profiles without affecting
      runtime too much (<1%).
      
      Lots of cleanups - the resolution of Time is now in one place
      only (Rts.h) rather than having calculations that depend on the
      resolution scattered all over the RTS.  I hope I found them all.
      6b109851
  15. 23 Jul, 2011 2 commits
    • Ian Lynagh's avatar
      Remove prototype to mut_user_time_during_GC · 5c6e293d
      Ian Lynagh authored
      The function no longer exists.
      5c6e293d
    • Ian Lynagh's avatar
      Fix heap profiling times · 4f8cfaf9
      Ian Lynagh authored
      Now that the heap census runs in the middle of garbage collections,
      the "CPU time" it was calculating included any CPU time used so far
      in the current GC. This could cause CPU time to appear to go down,
      which means hp2ps complained about "samples out of sequence".
      
      I'm not sure if this is the nicest way to solve this (maybe resurrecting
      mut_user_time_during_GC would be better?) but it gets things working again.
      4f8cfaf9
  16. 11 Apr, 2011 1 commit
    • Simon Marlow's avatar
      Refactoring and tidy up · 1fb38442
      Simon Marlow authored
      This is a port of some of the changes from my private local-GC branch
      (which is still in darcs, I haven't converted it to git yet).  There
      are a couple of small functional differences in the GC stats: first,
      per-thread GC timings should now be more accurate, and secondly we now
      report average and maximum pause times. e.g. from minimax +RTS -N8 -s:
      
                                          Tot time (elapsed)  Avg pause  Max pause
        Gen  0      2755 colls,  2754 par   13.16s    0.93s     0.0003s    0.0150s
        Gen  1       769 colls,   769 par    3.71s    0.26s     0.0003s    0.0059s
      1fb38442
  17. 17 Jun, 2010 2 commits
  18. 02 Dec, 2009 1 commit
  19. 09 Sep, 2009 1 commit
  20. 05 Aug, 2009 1 commit
  21. 02 Aug, 2009 1 commit
    • Simon Marlow's avatar
      RTS tidyup sweep, first phase · a2a67cd5
      Simon Marlow authored
      The first phase of this tidyup is focussed on the header files, and in
      particular making sure we are exposinng publicly exactly what we need
      to, and no more.
      
       - Rts.h now includes everything that the RTS exposes publicly,
         rather than a random subset of it.
      
       - Most of the public header files have moved into subdirectories, and
         many of them have been renamed.  But clients should not need to
         include any of the other headers directly, just #include the main
         public headers: Rts.h, HsFFI.h, RtsAPI.h.
      
       - All the headers needed for via-C compilation have moved into the
         stg subdirectory, which is self-contained.  Most of the headers for
         the rest of the RTS APIs have moved into the rts subdirectory.
      
       - I left MachDeps.h where it is, because it is so widely used in
         Haskell code.
       
       - I left a deprecated stub for RtsFlags.h in place.  The flag
         structures are now exposed by Rts.h.
      
       - Various internal APIs are no longer exposed by public header files.
      
       - Various bits of dead code and declarations have been removed
      
       - More gcc warnings are turned on, and the RTS code is more
         warning-clean.
      
       - More source files #include "PosixSource.h", and hence only use
         standard POSIX (1003.1c-1995) interfaces.
      
      There is a lot more tidying up still to do, this is just the first
      pass.  I also intend to standardise the names for external RTS APIs
      (e.g use the rts_ prefix consistently), and declare the internal APIs
      as hidden for shared libraries.
      a2a67cd5
  22. 16 Apr, 2008 2 commits
  23. 31 Oct, 2007 1 commit
  24. 08 Nov, 2006 1 commit
    • mrchebas@gmail.com's avatar
      Addition of PAPI to RTS · fe07f054
      mrchebas@gmail.com authored
      This patch still requires the addition of the USE_PAPI
      define to compile with PAPI. Also, programs must be
      compiled and linked with the appropriate library flags
      for papi.
      fe07f054
  25. 08 Jun, 2006 1 commit
    • Simon Marlow's avatar
      New tracing interface · 5a2769f0
      Simon Marlow authored
      A simple interface for generating trace messages with timestamps and
      thread IDs attached to them.  Most debugging output goes through this
      interface now, so it is straightforward to get timestamped debugging
      traces with +RTS -vt.  Also, we plan to use this to generate
      parallelism profiles from the trace output.
      5a2769f0
  26. 07 Apr, 2006 1 commit
    • Simon Marlow's avatar
      Reorganisation of the source tree · 0065d5ab
      Simon Marlow authored
      Most of the other users of the fptools build system have migrated to
      Cabal, and with the move to darcs we can now flatten the source tree
      without losing history, so here goes.
      
      The main change is that the ghc/ subdir is gone, and most of what it
      contained is now at the top level.  The build system now makes no
      pretense at being multi-project, it is just the GHC build system.
      
      No doubt this will break many things, and there will be a period of
      instability while we fix the dependencies.  A straightforward build
      should work, but I haven't yet fixed binary/source distributions.
      Changes to the Building Guide will follow, too.
      0065d5ab
  27. 12 Jan, 2006 1 commit
    • simonmar's avatar
      [project @ 2006-01-12 14:42:25 by simonmar] · de910f06
      simonmar authored
      +RTS -S: replace "collected" with "copied", which is more useful.
      +RTS -Dg: print size of mutable list, and breakdown by type of closure
      (MUT_VAR, MUT_ARR, others).
      de910f06
  28. 03 Nov, 2005 1 commit
    • simonmar's avatar
      [project @ 2005-11-03 11:05:38 by simonmar] · cc2926f3
      simonmar authored
      Improvments to time-measurement and stats:
      
        - move all the platform-dependent timing related stuff into
          posix/GetTime.c and win32/GetTime.c, with the machine-indepent
          interface specified in GetTime.h.  This is now used by
          Stats.c.
      
        - On Unix, use gettimeofday() and getrusage() by default, falling
          back to time() if one of these isn't available.
      
        - try to implement thread-specfic CPU-time measurement using
          clock_gettime() on Unix.  Doesn't work reliably on Linux, because
          the implemenation tries to use the processor TSC, which on an
          SMP machine goes wrong when the thread moves between CPUs.  However,
          it's slightly less bogus that before, and hopefully will improve
          in the future.
      cc2926f3
  29. 21 Oct, 2005 1 commit
    • simonmar's avatar
      [project @ 2005-10-21 14:02:17 by simonmar] · 03a9ff01
      simonmar authored
      Big re-hash of the threaded/SMP runtime
      
      This is a significant reworking of the threaded and SMP parts of
      the runtime.  There are two overall goals here:
      
        - To push down the scheduler lock, reducing contention and allowing
          more parts of the system to run without locks.  In particular,
          the scheduler does not require a lock any more in the common case.
      
        - To improve affinity, so that running Haskell threads stick to the
          same OS threads as much as possible.
      
      At this point we have the basic structure working, but there are some
      pieces missing.  I believe it's reasonably stable - the important
      parts of the testsuite pass in all the (normal,threaded,SMP) ways.
      
      In more detail:
      
        - Each capability now has a run queue, instead of one global run
          queue.  The Capability and Task APIs have been completely
          rewritten; see Capability.h and Task.h for the details.
      
        - Each capability has its own pool of worker Tasks.  Hence, Haskell
          threads on a Capability's run queue will run on the same worker
          Task(s).  As long as the OS is doing something reasonable, this
          should mean they usually stick to the same CPU.  Another way to
          look at this is that we're assuming each Capability is associated
          with a fixed CPU.
      
        - What used to be StgMainThread is now part of the Task structure.
          Every OS thread in the runtime has an associated Task, and it
          can ask for its current Task at any time with myTask().
      
        - removed RTS_SUPPORTS_THREADS symbol, use THREADED_RTS instead
          (it is now defined for SMP too).
      
        - The RtsAPI has had to change; we must explicitly pass a Capability
          around now.  The previous interface assumed some global state.
          SchedAPI has also changed a lot.
      
        - The OSThreads API now supports thread-local storage, used to
          implement myTask(), although it could be done more efficiently
          using gcc's __thread extension when available.
      
        - I've moved some POSIX-specific stuff into the posix subdirectory,
          moving in the direction of separating out platform-specific
          implementations.
      
        - lots of lock-debugging and assertions in the runtime.  In particular,
          when DEBUG is on, we catch multiple ACQUIRE_LOCK()s, and there is
          also an ASSERT_LOCK_HELD() call.
      
      What's missing so far:
      
        - I have almost certainly broken the Win32 build, will fix soon.
      
        - any kind of thread migration or load balancing.  This is high up
          the agenda, though.
      
        - various performance tweaks to do
      
        - throwTo and forkProcess still do not work in SMP mode
      03a9ff01
  30. 25 Jul, 2005 1 commit
  31. 06 Apr, 2005 1 commit
    • simonmar's avatar
      [project @ 2005-04-06 15:27:06 by simonmar] · 9a92cb1c
      simonmar authored
      Revamp the Task API: now we use the same implementation for threaded
      and SMP.  We also keep per-task timing stats in the threaded RTS now,
      which makes the output of +RTS -sstderr more useful.
      9a92cb1c
  32. 27 Mar, 2005 1 commit
    • panne's avatar
      [project @ 2005-03-27 13:41:13 by panne] · 03dc2dd3
      panne authored
      * Some preprocessors don't like the C99/C++ '//' comments after a
        directive, so use '/* */' instead. For consistency, a lot of '//' in
        the include files were converted, too.
      
      * UnDOSified libraries/base/cbits/runProcess.c.
      
      * My favourite sport: Killed $Id$s.
      03dc2dd3
  33. 12 Sep, 2004 1 commit
  34. 27 May, 2004 1 commit
  35. 06 Feb, 2002 1 commit