This project is mirrored from https://gitlab.haskell.org/ghc/ghc.git. Pull mirroring failed .
Repository mirroring has been paused due to too many failed attempts. It can be resumed by a project maintainer.
Last successful update .
  1. 15 Jan, 2012 1 commit
    • Ian Lynagh's avatar
      Fix a #define · 54121fff
      Ian Lynagh authored
      I don't think it was causing any problems, but
          TimeToUS(x+y)
      would have evaluated to
          x + (y / 1000)
      54121fff
  2. 09 Jan, 2012 2 commits
  3. 08 Jan, 2012 1 commit
    • Ian Lynagh's avatar
      Refactoring · 9e452874
      Ian Lynagh authored
      This is working towards being able to put ghcautoconf.h and
      ghcplatform.h in includes/dist
      9e452874
  4. 06 Jan, 2012 2 commits
  5. 05 Jan, 2012 3 commits
  6. 03 Jan, 2012 1 commit
  7. 19 Dec, 2011 1 commit
  8. 13 Dec, 2011 1 commit
    • Simon Marlow's avatar
      New flag +RTS -qi<n>, avoid waking up idle Capabilities to do parallel GC · a02eb298
      Simon Marlow authored
      This is an experimental tweak to the parallel GC that avoids waking up
      a Capability to do parallel GC if we know that the capability has been
      idle for a (tunable) number of GC cycles.  The idea is that if you're
      only using a few Capabilities, there's no point waking up the ones
      that aren't busy.
      
      e.g. +RTS -qi3
      
      says "A Capability will participate in parallel GC if it was running
      at all since the last 3 GC cycles."
      
      Results are a bit hit and miss, and I don't completely understand why
      yet.  Hence, for now it is turned off by default, and also not
      documented except in the +RTS -? output.
      a02eb298
  9. 07 Dec, 2011 2 commits
    • Simon Marlow's avatar
      8b48562e
    • chak@cse.unsw.edu.au.'s avatar
      Add new primtypes 'ArrayArray#' and 'MutableArrayArray#' · 021a0dd2
      chak@cse.unsw.edu.au. authored
      The primitive array types, such as 'ByteArray#', have kind #, but are represented by pointers. They are boxed, but unpointed types (i.e., they cannot be 'undefined').
      
      The two categories of array types —[Mutable]Array# and [Mutable]ByteArray#— are containers for unboxed (and unpointed) as well as for boxed and pointed types.  So far, we lacked support for containers for boxed, unpointed types (i.e., containers for the primitive arrays themselves).  This is what the new primtypes provide.
      
      Containers for boxed, unpointed types are crucial for the efficient implementation of scattered nested arrays, which are central to the new DPH backend library dph-lifted-vseg.  Without such containers, we cannot eliminate all unboxing from the inner loops of traversals processing scattered nested arrays.
      021a0dd2
  10. 06 Dec, 2011 2 commits
    • Simon Marlow's avatar
      Allow the number of capabilities to be increased at runtime (#3729) · 92e7d6c9
      Simon Marlow authored
      At present the number of capabilities can only be *increased*, not
      decreased.  The latter presents a few more challenges!
      92e7d6c9
    • Simon Marlow's avatar
      Make forkProcess work with +RTS -N · 8b75acd3
      Simon Marlow authored
      Consider this experimental for the time being.  There are a lot of
      things that could go wrong, but I've verified that at least it works
      on the test cases we have.
      
      I also did some API cleanups while I was here.  Previously we had:
      
      Capability * rts_eval (Capability *cap, HaskellObj p, /*out*/HaskellObj *ret);
      
      but this API is particularly error-prone: if you forget to discard the
      Capability * you passed in and use the return value instead, then
      you're in for subtle bugs with +RTS -N later on.  So I changed all
      these functions to this form:
      
      void rts_eval (/* inout */ Capability **cap,
                     /* in    */ HaskellObj p,
                     /* out */   HaskellObj *ret)
      
      It's much harder to use this version incorrectly, because you have to
      pass the Capability in by reference.
      8b75acd3
  11. 02 Dec, 2011 3 commits
    • Ian Lynagh's avatar
      Fix header installation · 11a614ff
      Ian Lynagh authored
      11a614ff
    • Ian Lynagh's avatar
      Move includes/DerivedConstants.h and includes/GHCConstants.h into dist dirs · e8723129
      Ian Lynagh authored
      When they existed, they were getting included in the includes_H_FILES
      variable (as it uses wildcard to find all header files). But the
      .depends files for the programs that generate the headers depend on
      $(includes_H_FILES), so the .depends files looked out-of-date once the
      headers had been created. This caused unnecessary make reinvocations.
      
      So now we put them in dist* directories, where they ought to be anyway.
      e8723129
    • Simon Marlow's avatar
      More changes aimed at improving call stacks. · 1469f1eb
      Simon Marlow authored
        - Attach a SrcSpan to every CostCentre.  This had the side effect
          that CostCentres that used to be merged because they had the same
          name are now considered distinct; so I had to add a Unique to
          CostCentre to give them distinct object-code symbols.
      
        - New flag: -fprof-auto-calls.  This flag adds an automatic SCC to
          every call site (application, to be precise).  This is typically
          more useful for call stacks than annotating whole functions.
      
      Various tidy-ups at the same time: removed unused NoCostCentre
      constructor, and refactored a bit in Coverage.lhs.
      
      The call stack we get from traceStack now looks like this:
      
      Stack trace:
        Main.CAF (<entire-module>)
        Main.main.xs (callstack002.hs:18:12-24)
        Main.map (callstack002.hs:13:12-16)
        Main.map.go (callstack002.hs:15:21-34)
        Main.map.go (callstack002.hs:15:21-23)
        Main.f (callstack002.hs:10:7-43)
      1469f1eb
  12. 01 Dec, 2011 1 commit
    • Simon Marlow's avatar
      Fix a scheduling bug in the threaded RTS · 6d18141d
      Simon Marlow authored
      The parallel GC was using setContextSwitches() to stop all the other
      threads, which sets the context_switch flag on every Capability.  That
      had the side effect of causing every Capability to also switch
      threads, and since GCs can be much more frequent than context
      switches, this increased the context switch frequency.  When context
      switches are expensive (because the switch is between two bound
      threads or a bound and unbound thread), the difference is quite
      noticeable.
      
      The fix is to have a separate flag to indicate that a Capability
      should stop and return to the scheduler, but not switch threads.  I've
      called this the "interrupt" flag.
      6d18141d
  13. 29 Nov, 2011 1 commit
    • Simon Marlow's avatar
      Make profiling work with multiple capabilities (+RTS -N) · 50de6034
      Simon Marlow authored
      This means that both time and heap profiling work for parallel
      programs.  Main internal changes:
      
        - CCCS is no longer a global variable; it is now another
          pseudo-register in the StgRegTable struct.  Thus every
          Capability has its own CCCS.
      
        - There is a new built-in CCS called "IDLE", which records ticks for
          Capabilities in the idle state.  If you profile a single-threaded
          program with +RTS -N2, you'll see about 50% of time in "IDLE".
      
        - There is appropriate locking in rts/Profiling.c to protect the
          shared cost-centre-stack data structures.
      
      This patch does enough to get it working, I have cut one big corner:
      the cost-centre-stack data structure is still shared amongst all
      Capabilities, which means that multiple Capabilities will race when
      updating the "allocations" and "entries" fields of a CCS.  Not only
      does this give unpredictable results, but it runs very slowly due to
      cache line bouncing.
      
      It is strongly recommended that you use -fno-prof-count-entries to
      disable the "entries" count when profiling parallel programs. (I shall
      add a note to this effect to the docs).
      50de6034
  14. 25 Nov, 2011 1 commit
    • Simon Marlow's avatar
      Time handling overhaul · 6b109851
      Simon Marlow authored
      Terminology cleanup: the type "Ticks" has been renamed "Time", which
      is an StgWord64 in units of TIME_RESOLUTION (currently nanoseconds).
      The terminology "tick" is now used consistently to mean the interval
      between timer signals.
      
      The ticker now always ticks in realtime (actually CLOCK_MONOTONIC if
      we have it).  Before it used CPU time in the non-threaded RTS and
      realtime in the threaded RTS, but I've discovered that the CPU timer
      has terrible resolution (at least on Linux) and isn't much use for
      profiling.  So now we always use realtime.  This should also fix
      
      The default tick interval is now 10ms, except when profiling where we
      drop it to 1ms.  This gives more accurate profiles without affecting
      runtime too much (<1%).
      
      Lots of cleanups - the resolution of Time is now in one place
      only (Rts.h) rather than having calculations that depend on the
      resolution scattered all over the RTS.  I hope I found them all.
      6b109851
  15. 22 Nov, 2011 3 commits
  16. 19 Nov, 2011 1 commit
    • Ian Lynagh's avatar
      Improve the way we call "rm" in the build system; fixes trac #4916 · 80e9070c
      Ian Lynagh authored
      We avoid calling "rm -rf" with no file arguments; this fixes cleaning
      on Solaris, where that fails.
      
      We also check for suspicious arguments: anything containing "..",
      starting "/", or containing a "*" (you need to call $(wildcard ...)
      yourself now if you really want globbing). This should make things
      a little safer.
      80e9070c
  17. 16 Nov, 2011 1 commit
    • Simon Marlow's avatar
      Generate the C main() function when linking a binary (fixes #5373) · 1df28a80
      Simon Marlow authored
      Rather than have main() be statically compiled as part of the RTS, we
      now generate it into the tiny C file that we compile when linking a
      binary.
      
      The main motivation is that we want to pass the settings for the
      -rtsotps and -with-rtsopts flags into the RTS, rather than relying on
      fragile linking semantics to override the defaults, which don't work
      with DLLs on Windows (#5373).  In order to do this, we need to extend
      the API for initialising the RTS, so now we have:
      
      void hs_init_ghc (int *argc, char **argv[],   // program arguments
                        RtsConfig rts_config);      // RTS configuration
      
      hs_init_ghc() can optionally be used instead of hs_init(), and allows
      passing in configuration options for the RTS.  RtsConfig is a struct,
      which currently has two fields:
      
      typedef struct {
          RtsOptsEnabledEnum rts_opts_enabled;
          const char *rts_opts;
      } RtsConfig;
      
      but might have more in the future.  There is a default value for the
      struct, defaultRtsConfig, the idea being that you start with this and
      override individual fields as necessary.
      
      In fact, main() was in a separate static library, libHSrtsmain.a.
      That's now gone.
      1df28a80
  18. 06 Nov, 2011 1 commit
  19. 04 Nov, 2011 1 commit
    • Duncan Coutts's avatar
      Add eventlog event for thread labels · c739d845
      Duncan Coutts authored
      The existing GHC.Conc.labelThread will now also emit the the thread
      label into the eventlog. Profiling tools like ThreadScope could then
      use the thread labels rather than thread numbers.
      c739d845
  20. 02 Nov, 2011 1 commit
    • Simon Marlow's avatar
      Overhaul of infrastructure for profiling, coverage (HPC) and breakpoints · 7bb0447d
      Simon Marlow authored
      User visible changes
      ====================
      
      Profilng
      --------
      
      Flags renamed (the old ones are still accepted for now):
      
        OLD            NEW
        ---------      ------------
        -auto-all      -fprof-auto
        -auto          -fprof-exported
        -caf-all       -fprof-cafs
      
      New flags:
      
        -fprof-auto              Annotates all bindings (not just top-level
                                 ones) with SCCs
      
        -fprof-top               Annotates just top-level bindings with SCCs
      
        -fprof-exported          Annotates just exported bindings with SCCs
      
        -fprof-no-count-entries  Do not maintain entry counts when profiling
                                 (can make profiled code go faster; useful with
                                 heap profiling where entry counts are not used)
      
      Cost-centre stacks have a new semantics, which should in most cases
      result in more useful and intuitive profiles.  If you find this not to
      be the case, please let me know.  This is the area where I have been
      experimenting most, and the current solution is probably not the
      final version, however it does address all the outstanding bugs and
      seems to be better than GHC 7.2.
      
      Stack traces
      ------------
      
      +RTS -xc now gives more information.  If the exception originates from
      a CAF (as is common, because GHC tends to lift exceptions out to the
      top-level), then the RTS walks up the stack and reports the stack in
      the enclosing update frame(s).
      
      Result: +RTS -xc is much more useful now - but you still have to
      compile for profiling to get it.  I've played around a little with
      adding 'head []' to GHC itself, and +RTS -xc does pinpoint the problem
      quite accurately.
      
      I plan to add more facilities for stack tracing (e.g. in GHCi) in the
      future.
      
      Coverage (HPC)
      --------------
      
       * derived instances are now coloured yellow if they weren't used
       * likewise record field names
       * entry counts are more accurate (hpc --fun-entry-count)
       * tab width is now correct (markup was previously off in source with
         tabs)
      
      Internal changes
      ================
      
      In Core, the Note constructor has been replaced by
      
              Tick (Tickish b) (Expr b)
      
      which is used to represent all the kinds of source annotation we
      support: profiling SCCs, HPC ticks, and GHCi breakpoints.
      
      Depending on the properties of the Tickish, different transformations
      apply to Tick.  See CoreUtils.mkTick for details.
      
      Tickets
      =======
      
      This commit closes the following tickets, test cases to follow:
      
        - Close #2552: not a bug, but the behaviour is now more intuitive
          (test is T2552)
      
        - Close #680 (test is T680)
      
        - Close #1531 (test is result001)
      
        - Close #949 (test is T949)
      
        - Close #2466: test case has bitrotted (doesn't compile against current
          version of vector-space package)
      7bb0447d
  21. 27 Oct, 2011 1 commit
  22. 26 Oct, 2011 1 commit
    • Duncan Coutts's avatar
      Add new eventlog EVENT_WALL_CLOCK_TIME for time matching · 4856d15a
      Duncan Coutts authored
      Eventlog timestamps are elapsed times (in nanoseconds) relative to the
      process start. To be able to merge eventlogs from multiple processes we
      need to be able to align their timelines. If they share a clock domain
      (or a user judges that their clocks are sufficiently closely
      synchronised) then it is sufficient to know how the eventlog timestamps
      match up with the clock.
      
      The EVENT_WALL_CLOCK_TIME contains the clock time with (up to)
      nanosecond precision. It is otherwise an ordinary event and so contains
      the usual timestamp for the same moment in time. It therefore enables
      us to match up all the eventlog timestamps with clock time.
      4856d15a
  23. 19 Oct, 2011 2 commits
  24. 18 Oct, 2011 1 commit
  25. 17 Oct, 2011 1 commit
  26. 07 Oct, 2011 2 commits
    • dmp's avatar
      Add autoconf support to detect an LLVM-based C compiler · 6247b59e
      dmp authored
      This patch adds support to the autoconf scripts to detect
      when we are using a C compiler that uses an LLVM back end.
      An LLVM back end does not support all of the extensions use
      by GCC, so we need to perform some conditional compilation
      in the runtime, particularly for handling thread local
      storage and global register variables.
      
      The changes here will set the CC_LLVM_BACKEND in the
      autoconf scripts if we detect an llvm-based compiler. We use
      this variable to define the llvm_CC_FLAVOR variable that we
      can use in the runtime code to conditionally compile for
      LLVM.
      6247b59e
    • dmp's avatar
      Enable pthread_getspecific() tls for LLVM compiler · dba72545
      dmp authored
      LLVM does not support the __thread attribute for thread
      local storage and may generate incorrect code for global
      register variables. We want to allow building the runtime with
      LLVM-based compilers such as llvm-gcc and clang,
      particularly for MacOS.
      
      This patch changes the gct variable used by the garbage
      collector to use pthread_getspecific() for thread local
      storage when an llvm based compiler is used to build the
      runtime.
      dba72545
  27. 02 Sep, 2011 1 commit
    • Simon Peyton Jones's avatar
      Increase the "context stack depth" to 200 (from 20) · d3a31e48
      Simon Peyton Jones authored
      This parameter controls the allowed depth of reasoning in the
      type constraint solver.  Perfectly well-behaved programs can
      use deep stacks, and 20 is obviously too small.  (Indeed, if
      you don't have UndecidableInstances, the constraint solver
      is supposed to terminate, so no limit should be needed.)
      
      Responding to Trac #5395 this patch increases the default
      to 200.
      d3a31e48
  28. 25 Aug, 2011 1 commit