1. 28 Feb, 2006 2 commits
    • Allow C argument regs to be used as global regs (R1, R2, etc.) · 14a5c62a
      Simon Marlow authored
      The problem here was that we generated C calls with expressions
      involving R1 etc. as parameters.  When some of the R registers are
      also C argument registers, both GCC and the native code generator
      generate incorrect code.  The hacky workaround is to assign
      problematic arguments to temporaries first; fortunately this works
      with both GCC and the NCG, but we have to be careful not to undo this
      with later optimisations (see changes to CmmOpt).
    • pass arguments to unknown function calls in registers · 04db0e9f
      Simon Marlow authored
      We now have more stg_ap entry points: stg_ap_*_fast, which take
      arguments in registers according to the platform calling convention.
      This is faster if the function being called is evaluated and has the
      right arity, which is the common case (see the eval/apply paper for
      measurements).  
      
      We still need the stg_ap_*_info entry points for stack-based
      application, such as an overflow when a function is applied to too
      many arguments.  The stg_ap_*_fast functions actually just check for
      an evaluated function, and if they don't find one, push the args on
      the stack and invoke stg_ap_*_info.  (This might be slightly slower in
      some cases, but not the common case.)
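
      A minimal sketch of the dispatch described above, written as a toy
      Haskell model rather than the real runtime code.  The types Closure
      and Route and the function applyFast are hypothetical names invented
      for illustration; only the decision (an evaluated function of the
      right arity takes the register path, everything else falls back to
      the stack-based path) comes from the commit message.

```haskell
module Main (main) where

-- A toy closure: either an evaluated function of known arity, or anything
-- else (a thunk, a partial application, ...).
data Closure
  = Function Int      -- evaluated function with its arity
  | Unevaluated       -- any closure that needs the generic path
  deriving (Eq, Show)

-- Which route an application of n arguments takes.
data Route
  = RegisterCall      -- arguments stay in registers, jump straight to the code
  | StackApply        -- push the arguments and use the stg_ap_*_info path
  deriving (Eq, Show)

-- Hypothetical model of an stg_ap_*_fast entry point for n arguments.
applyFast :: Int -> Closure -> Route
applyFast n (Function arity)
  | arity == n = RegisterCall
applyFast _ _  = StackApply

main :: IO ()
main = do
  print (applyFast 2 (Function 2))   -- common case: exact arity
  print (applyFast 3 (Function 2))   -- too many arguments
  print (applyFast 2 Unevaluated)    -- function not yet evaluated
```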
  2. 24 Feb, 2006 1 commit
  3. 09 Feb, 2006 1 commit
  4. 08 Feb, 2006 1 commit
    • make the smp way RTS-only, normal libraries now work with -smp · beb5737b
      Simon Marlow authored
      We had to bite the bullet here and add an extra word to every thunk,
      to enable running ordinary libraries on SMP.  Otherwise, we would have
      needed to ship an extra set of libraries with GHC 6.6 in addition to
      the two sets we already ship (normal + profiled), and all Cabal
      packages would have to be compiled for SMP too.  We decided it best
      just to take the hit now, making SMP easily accessible to everyone in
      GHC 6.6.
      
      Incidentally, although this increases allocation by around 12% on
      average, the performance hit is around 5%, and much less if your inner
      loop doesn't use any laziness.
  5. 24 Jan, 2006 1 commit
  6. 17 Jan, 2006 2 commits
    • [project @ 2006-01-17 16:13:18 by simonmar] · 91b07216
      simonmar authored
      Improve the GC behaviour of IORefs (see Ticket #650).
      
      This is a small change to the way IORefs interact with the GC, which
      should improve GC performance for programs with plenty of IORefs.
      
      Previously we had a single closure type for mutable variables,
      MUT_VAR.  Mutable variables were *always* on the mutable list in older
      generations, and always traversed on every GC.
      
      Now, we have two closure types: MUT_VAR_CLEAN and MUT_VAR_DIRTY.  The
      latter is on the mutable list, but the former is not.  (NB. this
      differs from MUT_ARR_PTRS_CLEAN and MUT_ARR_PTRS_DIRTY, both of which
      are on the mutable list).  writeMutVar# now implements a write
      barrier by calling dirty_MUT_VAR() in the runtime, which changes
      MUT_VAR_CLEAN into MUT_VAR_DIRTY and adds the variable to the
      mutable list if necessary.
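
      The clean/dirty scheme can be sketched as a small pure model.  This
      is an illustration only: Heap, Var, writeVar and minorGC are
      hypothetical names, every variable is assumed to live in an old
      generation, and the collector here simply resets everything to
      clean, which is simpler than what the real GC does.

```haskell
module Main (main) where

data Status = Clean | Dirty deriving (Eq, Show)

-- A mutable variable living in an old generation (an assumption of this model).
data Var = Var { varId :: Int, varStatus :: Status, varValue :: Int }
  deriving (Eq, Show)

-- The heap: all old-generation variables plus the mutable list of variable ids.
data Heap = Heap { heapVars :: [Var], heapMutList :: [Int] }
  deriving (Eq, Show)

-- Write barrier: store the value, and if the variable was clean, mark it
-- dirty and record it on the mutable list.
writeVar :: Int -> Int -> Heap -> Heap
writeVar i x (Heap vars ml) = Heap (map update vars) ml'
  where
    update v
      | varId v == i = v { varStatus = Dirty, varValue = x }
      | otherwise    = v
    wasClean = any (\v -> varId v == i && varStatus v == Clean) vars
    ml' | wasClean  = i : ml
        | otherwise = ml

-- In this model a minor collection visits only what is on the mutable list;
-- afterwards every variable is clean again and the list is empty.
minorGC :: Heap -> (Int, Heap)
minorGC (Heap vars ml) = (length ml, Heap (map clean vars) [])
  where clean v = v { varStatus = Clean }

exampleHeap :: Heap
exampleHeap = Heap [Var n Clean 0 | n <- [1 .. 1000]] []

main :: IO ()
main = do
  let h = writeVar 7 42 (writeVar 7 41 exampleHeap)
  print (heapMutList h)     -- variable 7 is recorded once, not twice
  print (fst (minorGC h))   -- variables visited by the collector
```

      With a thousand variables and one write, the model's collector visits
      a single variable instead of all of them, which is the effect the
      commit is after.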
      
      This results in some pretty dramatic speedups for GHC itself.  I've
      just measured a 30% overall speedup compiling a 31-module program
      (anna) with the default heap settings :-D
    • [project @ 2006-01-17 16:03:47 by simonmar] · da69fa9c
      simonmar authored
      Improve the GC behaviour of IOArrays/STArrays
      
      See Ticket #650
      
      This is a small change to the way mutable arrays interact with the GC,
      that can have a dramatic effect on performance, and make tricks with
      unsafeThaw/unsafeFreeze redundant.  Data.HashTable should be faster
      now (I haven't measured it yet).
      
      We now have two mutable array closure types, MUT_ARR_PTRS_CLEAN and
      MUT_ARR_PTRS_DIRTY.  Both are on the mutable list if the array is in
      an old generation.  writeArray# sets the type to MUT_ARR_PTRS_DIRTY.
      The garbage collector can set the type to MUT_ARR_PTRS_CLEAN if it
      finds that no element of the array points into a younger generation
      (discovering this required a small addition to evacuate(), but rough
      tests indicate that it doesn't measurably affect performance).
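
      A rough model of the array case, for comparison with the mutable
      variable case.  The names MutArr, writeArr and scavengeArr are
      hypothetical, elements are reduced to "points into the young
      generation" or "points into the old generation", and the array is
      assumed to be in an old generation and therefore on the mutable list.

```haskell
module Main (main) where

data ArrStatus = ArrClean | ArrDirty deriving (Eq, Show)

-- Where an array element points: into the young or the old generation.
data Target = YoungGen | OldGen deriving (Eq, Show)

data MutArr = MutArr { arrStatus :: ArrStatus, arrElems :: [Target] }
  deriving (Eq, Show)

-- Every write marks the array dirty (the index is assumed to be in range).
writeArr :: Int -> Target -> MutArr -> MutArr
writeArr i t (MutArr _ elems) =
  MutArr ArrDirty (take i elems ++ [t] ++ drop (i + 1) elems)

-- What the collector does with an old-generation array on the mutable list.
-- Returns the number of elements it had to look at and the updated array.
scavengeArr :: MutArr -> (Int, MutArr)
scavengeArr arr@(MutArr ArrClean _) = (0, arr)
scavengeArr (MutArr ArrDirty elems)
  | YoungGen `elem` elems = (length elems, MutArr ArrDirty elems)
  | otherwise             = (length elems, MutArr ArrClean elems)

main :: IO ()
main = do
  let arr0 = MutArr ArrClean (replicate 4 OldGen)
  print (scavengeArr arr0)                        -- clean: nothing to scan
  print (scavengeArr (writeArr 1 OldGen arr0))    -- dirty, becomes clean
  print (scavengeArr (writeArr 1 YoungGen arr0))  -- dirty, stays dirty
```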
      
      NOTE: none of this affects unboxed arrays (IOUArray/STUArray), only
      boxed arrays (IOArray/STArray).
      
      We could go further and extend the DIRTY bit to be per-block rather
      than for the whole array, but for now this is an easy improvement.
  7. 10 Jan, 2006 2 commits
  8. 09 Jan, 2006 1 commit
  9. 06 Jan, 2006 1 commit
    • [project @ 2006-01-06 16:30:17 by simonmar] · 9d7da331
      simonmar authored
      Add support for UTF-8 source files
      
      GHC finally has support for full Unicode in source files.  Source
      files are now assumed to be UTF-8 encoded, and the full range of
      Unicode characters can be used, with classifications recognised using
      the implementation from Data.Char.  This incidentally means that only
      the stage2 compiler will recognise Unicode in source files, because I
      was too lazy to port the unicode classifier code into libcompat.
      
      Additionally, the following synonyms for keywords are now recognised:
      
        forall symbol        (U+2200)  forall
        right arrow          (U+2192)  ->
        left arrow           (U+2190)  <-
        horizontal ellipsis  (U+22EF)  ..

      There are probably more things we could add here.
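
      A small example of three of these synonyms in use.  It is written
      for current GHC releases, where the Unicode spellings are enabled by
      the UnicodeSyntax extension (and the explicit quantifier by
      ExplicitForAll); the pragma line is therefore an assumption about the
      reader's compiler, not something this commit describes.  The file
      must be saved as UTF-8.

```haskell
{-# LANGUAGE UnicodeSyntax, ExplicitForAll #-}
module Main (main) where

-- U+2200 in place of `forall`, U+2192 in place of `->`.
swapPair :: ∀ a b. (a, b) → (b, a)
swapPair (x, y) = (y, x)

main :: IO ()
main = do
  -- U+2190 in place of `<-`.
  word ← return "naïve"
  -- The literal is decoded from UTF-8, so it has five characters.
  print (length word)
  print (swapPair (1 :: Int, True))
```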
      
      This will break some source files if Latin-1 characters are being used.
      In most cases this should result in a UTF-8 decoding error.  Later on
      if we want to support more encodings (perhaps with a pragma to specify
      the encoding), I plan to do it by recoding into UTF-8 before parsing.
      
      Internally, there were some pretty big changes:
      
        - FastStrings are now stored in UTF-8
      
        - Z-encoding has been moved right to the back end.  Previously we
          used to Z-encode every identifier on the way in for simplicity,
          and only decode when we needed to show something to the user.
          Instead, we now keep every string in its UTF-8 encoding, and
          Z-encode right before printing it out.  To avoid Z-encoding the
          same string multiple times, the Z-encoding is cached inside the
          FastString the first time it is requested.
      
          This speeds up the compiler - I've measured some definite
          improvement in parsing at least, and I expect compilations overall
          to be faster too.  It also cleans up a lot of cruft from the
          OccName interface.  Z-encoding is nicely hidden inside the
          Outputable instance for Names & OccNames now.
      
        - StringBuffers are UTF-8 too, and are now represented as
          ForeignPtrs.
      
        - I've put together some test cases, not by any means exhaustive,
          but there are some interesting UTF-8 decoding error cases that
          aren't obvious.  Also, take a look at unicode001.hs for a demo.
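
      The caching of the Z-encoding mentioned above can be sketched with
      an ordinary lazy record field.  This is not the FastString
      implementation: Name, mkName and zEncode are hypothetical names, and
      the encoder covers only a handful of the Z-encoding escapes, passing
      every other character through unchanged.

```haskell
module Main (main) where

-- A hypothetical stand-in for FastString: the original text plus its
-- Z-encoding.  The second field is lazy, so it is computed on first use
-- and then shared by every later use.
data Name = Name
  { nameText     :: String
  , nameZEncoded :: String
  }

mkName :: String -> Name
mkName s = Name { nameText = s, nameZEncoded = zEncode s }

-- Z-encode a string using a small subset of the escapes; any other
-- character is passed through unchanged in this sketch.
zEncode :: String -> String
zEncode = concatMap encodeChar
  where
    encodeChar 'z' = "zz"
    encodeChar 'Z' = "ZZ"
    encodeChar '.' = "zi"
    encodeChar '_' = "zu"
    encodeChar '$' = "zd"
    encodeChar '+' = "zp"
    encodeChar '-' = "zm"
    encodeChar ':' = "ZC"
    encodeChar c   = [c]

main :: IO ()
main = do
  let name = mkName "GHC.Base.++"
  putStrLn (nameText name)       -- the form shown to the user
  putStrLn (nameZEncoded name)   -- the form used when printing symbols
```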
  10. 16 Nov, 2005 1 commit
    • [project @ 2005-11-16 12:55:58 by simonpj] · cdea9949
      simonpj authored
      Two significant changes to the representation of types
      
      1. Change the representation of type synonyms
      
          Up to now, type synonym applications have been held in
          *both* expanded *and* un-expanded form.  Unfortunately, this
          has exponential (!) behaviour when type synonyms are deeply
          nested.  E.g.
      	    type P a b = (a,b)
      	    f :: P a (P b (P c (P d e)))
          
          This showed up in a program of Joel Reymont, now immortalised
          as typecheck/should_compile/syn-perf.hs
      
          So now synonyms are held as ordinary TyConApps, and expanded
          only on demand.  
      
          SynNote has disappeared altogether, so the only remaining TyNote
          is an FTVNote.  I'm not sure if it's even useful.
      
      2. Eta-reduce newtypes
      
          See the Note [Newtype eta] in TyCon.lhs
          
          If we have 
      	    newtype T a b = MkT (S a b)
          
          then, in Core land, we would like S = T, even though the application
          of T is then not saturated. This commit eta-reduces T's RHS, and
          keeps that inside the TyCon (in nt_etad_rhs).  Result is that 
          coreEqType can be simpler, and has less need of expanding newtypes.
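
      Both changes can be illustrated on a toy type representation.  The
      sketch below is not GHC's Type or TyCon: the data type, the synonym
      environment, and the functions expandOnce and etaReduce are
      hypothetical and much simplified.  It shows a synonym application
      stored as an ordinary TyConApp and expanded one step on demand, and
      a newtype right-hand side being eta-reduced.

```haskell
module Main (main) where

data Type
  = TyVar String
  | TyConApp String [Type]
  deriving (Eq, Show)

-- A synonym definition: its parameters and its right-hand side.
type SynEnv = [(String, ([String], Type))]

substitute :: [(String, Type)] -> Type -> Type
substitute env (TyVar v)          = maybe (TyVar v) id (lookup v env)
substitute env (TyConApp tc args) = TyConApp tc (map (substitute env) args)

-- Expand the outermost constructor by one step if it is a saturated synonym.
expandOnce :: SynEnv -> Type -> Maybe Type
expandOnce env (TyConApp tc args)
  | Just (params, rhs) <- lookup tc env
  , length params == length args
  = Just (substitute (zip params args) rhs)
expandOnce _ _ = Nothing

freeVars :: Type -> [String]
freeVars (TyVar v)         = [v]
freeVars (TyConApp _ args) = concatMap freeVars args

-- Eta-reduce a newtype right-hand side: drop a trailing argument when it is
-- the last parameter and that parameter occurs nowhere else.
etaReduce :: [String] -> Type -> ([String], Type)
etaReduce params (TyConApp tc args)
  | not (null params), not (null args)
  , TyVar v <- last args
  , v == last params
  , v `notElem` concatMap freeVars (init args)
  = etaReduce (init params) (TyConApp tc (init args))
etaReduce params ty = (params, ty)

synonyms :: SynEnv
synonyms = [("P", (["a", "b"], TyConApp "(,)" [TyVar "a", TyVar "b"]))]

main :: IO ()
main = do
  let nested = TyConApp "P" [TyVar "x", TyConApp "P" [TyVar "y", TyVar "z"]]
  print nested                         -- stored unexpanded
  print (expandOnce synonyms nested)   -- expanded only when asked
  print (etaReduce ["a", "b"] (TyConApp "S" [TyVar "a", TyVar "b"]))
```

      Note that expandOnce leaves the inner application of P untouched, so
      nesting synonyms does not multiply the size of the stored type.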
  11. 28 Oct, 2005 1 commit
    • [project @ 2005-10-28 11:35:35 by simonmar] · 55495951
      simonmar authored
      Change the default executable name to match the basename of the source
      file containing the Main module (or the module specified by -main-is),
      if there is one.  On Windows, the .exe extension is added.
      
      As requested on the ghc-users list, and as implemented by Tomasz
      Zielonka <tomasz.zielonka at gmail.com>, with modifications by me.
      
      I changed the type of the mainModIs field of DynFlags from Maybe
      String to Module, which removed some duplicate code.
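
      The naming rule can be sketched as follows.  defaultExeName is a
      hypothetical helper, not the function in the driver; it uses
      System.FilePath from the filepath package that ships with GHC, takes
      the platform as an explicit Boolean, and deals only with the file
      name, not with the directory the executable is written to.

```haskell
module Main (main) where

import System.FilePath (takeBaseName, (<.>))

-- Hypothetical helper: pick the default executable name from the path of
-- the source file that contains the Main module.
defaultExeName :: Bool -> FilePath -> FilePath
defaultExeName onWindows mainSource
  | onWindows = base <.> "exe"
  | otherwise = base
  where
    base = takeBaseName mainSource

main :: IO ()
main = do
  putStrLn (defaultExeName False "src/Hello.hs")
  putStrLn (defaultExeName True  "src/Hello.hs")
```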
  12. 27 Oct, 2005 1 commit
  13. 02 Aug, 2005 1 commit
  14. 25 Jul, 2005 1 commit
    • [project @ 2005-07-25 14:12:48 by simonmar] · e792bb84
      simonmar authored
      Remove the ForeignObj# type, and all its PrimOps.  The new efficient
      representation of ForeignPtr doesn't use ForeignObj# underneath, and
      there seems no need to keep it.
  15. 07 Jul, 2005 1 commit
    • [project @ 2005-07-07 13:50:40 by simonmar] · cca5f22b
      simonmar authored
      small performance fix: in via-C mode we previously always created a
      switch instead of a conditional tree for a multi-branch case.  Refine
      this slightly so that 2-branch switches turn into conditionals again,
      since gcc doesn't do a good job of optimising the equivalent switch.
  16. 21 Jun, 2005 1 commit
    • [project @ 2005-06-21 10:44:37 by simonmar] · 0c53bd0e
      simonmar authored
      Relax the restrictions on conflicting packages.  This should address
      many of the traps that people have been falling into with the current
      package story.
      
      Now, a local module can shadow a module in an exposed package, as long
      as the package is not otherwise required by the program.  GHC checks
      for conflicts when it knows the dependencies of the module being
      compiled.
      
      Also, we now check for module conflicts in exposed packages only when
      importing a module: if an import can be satisfied from multiple
      packages, that's an error.  It's not possible to prevent GHC from
      starting by installing packages now (unless you install another base
      package).
      
      It seems to be possible to confuse GHCi by having a local module
      shadowing a package module that goes away and comes back again.  I
      think it's nearly right, but strange happenings have been observed.
      
      I'll try to merge this into the STABLE branch.
  17. 18 May, 2005 2 commits
  18. 17 May, 2005 2 commits
    • [project @ 2005-05-17 13:47:39 by simonmar] · 8eee9100
      simonmar authored
      closureDescription: remove duplicate module name for external names,
      and include the unique for local names.  This makes profiling with -hd
      more useful.
    • [project @ 2005-05-17 12:22:37 by simonmar] · ce2fc604
      simonmar authored
      Profiling: the type_descr and closure_descr were the wrong way around,
      so +RTS -hy behaves like +RTS -hd, and vice-versa.  How on earth that
      happened I have no idea.
  19. 15 May, 2005 1 commit
  20. 12 May, 2005 1 commit
  21. 28 Apr, 2005 3 commits
    • [project @ 2005-04-28 16:05:54 by simonpj] · 91944423
      simonpj authored
      Re-plumb the connections between TidyPgm and the various
      code generators.  There's a new type, CgGuts, to mediate this,
      which has the happy effect that ModGuts can die earlier.
      
      The non-O route still isn't quite right, because default methods
      are being lost.  I'm working on it.
    • [project @ 2005-04-28 15:28:05 by simonmar] · c84e392e
      simonmar authored
      Small code-size optimisation: I forgot to add a specialised case for
      functions with no argument words (which might happen if the function
      takes a void argument, for example).
    • [project @ 2005-04-28 10:09:41 by simonpj] · dd313897
      simonpj authored
      This big commit does several things at once (aeroplane hacking)
      which change the format of interface files.  
      
      	So you'll need to recompile your libraries!
      
      1. The "stupid theta" of a newtype declaration
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      Retain the "stupid theta" in a newtype declaration.
      For some reason this was being discarded, and putting it
      back in meant changing TyCon and IfaceSyn slightly.
         
      
      2. Overlap flags travel with the instance
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      Arrange that the ability to support overlap and incoherence
      is a property of the *instance declaration* rather than the
      module that imports the instance decl.  This allows a library
      writer to define overlapping instance decls without the
      library client having to know.  
      
      The implementation is that in an Instance we store the
      overlap flag, and preserve that across interface files.
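
      For a feel of what "overlap as a property of the instance" looks
      like from the library writer's side, here is a sketch using the
      per-instance OVERLAPPABLE and OVERLAPPING pragmas of later GHC
      releases.  Those pragmas are not the mechanism introduced by this
      commit, and the class Describe is invented for the example; the
      point is only that code calling describe needs no overlap-related
      settings of its own.

```haskell
{-# LANGUAGE FlexibleInstances #-}
module Main (main) where

class Describe a where
  describe :: a -> String

-- General instance, marked as one that may be overlapped.
instance {-# OVERLAPPABLE #-} Describe [a] where
  describe xs = "a list of " ++ show (length xs) ++ " elements"

-- More specific instance for strings, marked as overlapping.
instance {-# OVERLAPPING #-} Describe [Char] where
  describe s = "the string " ++ show s

main :: IO ()
main = do
  putStrLn (describe [1, 2, 3 :: Int])
  putStrLn (describe "hello")
```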
      
      
      3. Nuke the "instnce pool" and "rule pool"
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      A major tidy-up and simplification of the way that instances
      and rules are sucked in from interface files.  Up till now
      an instance decl has been held in a "pool" until its "gates" 
      (a set of Names) are in play, when the instance is typechecked
      and added to the InstEnv in the ExternalPackageState.  
      This is complicated and error-prone; it's easy to suck in 
      too few (and miss an instance) or too many (and thereby be
      forced to suck in its type constructors, etc).
      
      Now, as we load an instance from an interface file, we 
      put it straight in the InstEnv... but the Instance we put in
      the InstEnv has some Names (the "rough-match" names) that 
      can be used on lookup to say "this Instance can't match".
      The detailed dfun is only read lazily, and the rough-match
      thing means it isn't poked on until it has a chance of
      being needed.
      
      This simply continues the successful idea for Ids, whereby
      they are loaded straightaway into the TypeEnv, but their
      TyThing is a lazy thunk, not poked on until the thing is looked
      up.
      
      Just the same idea applies to Rules.
      
      On the way, I made CoreRule and Instance into full-blown records
      with lots of info, with the same kind of key status as TyCon or 
      DataCon or Class.  And got rid of IdCoreRule altogether.   
      It's all much more solid and uniform, but it meant touching
      a *lot* of modules.
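
      The rough-match idea can be sketched with a lazy field standing in
      for the dfun.  Instance here is a hypothetical three-field record,
      not GHC's; the rough-match names are plain strings, with Nothing
      playing the role of a type variable.  The second entry's dfun is an
      error thunk, to show that a failed rough match never forces it.

```haskell
module Main (main) where

-- Nothing stands for a type variable, which can match anything.
type RoughNames = [Maybe String]

data Instance = Instance
  { instClass :: String
  , instRough :: RoughNames
  , instDFun  :: String      -- lazy: stands in for the lazily-read dfun
  }

-- Cheap test that can only say "definitely no match" or "maybe".
roughMatch :: RoughNames -> RoughNames -> Bool
roughMatch inst query =
  length inst == length query && and (zipWith compatible inst query)
  where
    compatible (Just a) (Just b) = a == b
    compatible _        _        = True

-- Only instances that survive the rough match have their dfun looked at.
lookupInst :: [Instance] -> String -> RoughNames -> [String]
lookupInst env cls query =
  [ instDFun inst
  | inst <- env
  , instClass inst == cls
  , roughMatch (instRough inst) query
  ]

instEnv :: [Instance]
instEnv =
  [ Instance "Eq" [Just "Int"]  "$fEqInt"
  , Instance "Eq" [Just "Bool"] (error "dfun for Eq Bool was forced")
  ]

main :: IO ()
main = print (lookupInst instEnv "Eq" [Just "Int"])
```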
      
      
      4. Allow instance decls in hs-boot files
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      Allowing instance decls in hs-boot files is jolly useful, because
      in a big mutually-recursive bunch of data types, you want to give
      the instances with the data type declarations.  To achieve this
      
      * The hs-boot file makes a provisional name for the dict-fun, something
        like $fx9.
      
      * When checking the "mother module", we check that the instance
        declarations line up (by type) and generate bindings for the 
        boot dfuns, such as
      	$fx9 = $f2
        where $f2 is the dfun generated by the mother module
      
      * In doing this I decided that it's cleaner to have DFunIds get their
        final External Name at birth.  To do that they need a stable OccName,
        so I have an integer-valued dfun-name-supply in the TcM monad.
        That keeps it simple.
      
      This feature is hardly tested yet.
      
      
      5. Tidy up tidying, and Iface file generation
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      main/TidyPgm now has two entry points:
      
        simpleTidyPgm is for hi-boot files, when typechecking only
        (not yet implemented), and potentially when compiling without -O.
        It ignores the bindings, and generates a nice small TypeEnv.
      
        optTidyPgm is the normal case: compiling with -O.  It generates a
        TypeEnv rich in IdInfo.
      
      MkIface.mkIface now only generates a ModIface.  A separate
      procedure, MkIface.writeIfaceFile, writes the file out to disk.
  22. 27 Apr, 2005 1 commit
  23. 22 Apr, 2005 1 commit
    • [project @ 2005-04-22 16:01:53 by sof] · a584b4ff
      sof authored
      Until the GHCi linker is made capable of handling .ctors sections in
      PEi object files, stick with __stginits. Being a bit sloppy by
      using 'mingw32_HOST_OS' to test for this.
  24. 21 Apr, 2005 1 commit
    • [project @ 2005-04-21 15:28:20 by simonmar] · effd3425
      simonmar authored
      SMP: thunks get an extra header word so that the payload doesn't
      occupy the same space as the updated value.  This is the sum total of
      the changes to compiler/, which are pleasingly few.
  25. 15 Apr, 2005 1 commit
    • [project @ 2005-04-15 05:29:48 by wolfgang] · eab7055a
      wolfgang authored
      Initialise foreign exports from GNU C __attribute__((constructor)) functions
      in the stub C file, rather than from __stginit_ functions.
      For non-profiling ways, leave out __stginit_ altogether.
  26. 12 Apr, 2005 1 commit
    • [project @ 2005-04-12 19:58:56 by wolfgang] · 48ea897f
      wolfgang authored
      Dynamic Linking:
      On non-Win32, we can store cross-dylib pointers in static data, so disable
      a Win32-specific hack on the other platforms, to prevent code bloat.
  27. 11 Apr, 2005 1 commit
    • [project @ 2005-04-11 08:52:29 by simonmar] · 5ba806d8
      simonmar authored
      When generating a switch for:
      
        case e of
          0 -> A
          1 -> B
      
      instead of generating
      
        if (e < 1) then goto A
        B
      
      generate
      
        if (e >= 1) then goto B
        A
      
      because this helps the NCG to generate better code.  In particular, if
      e is a comparison, then we don't need to reverse the sense of the
      comparison to eliminate the comparison against 1 (the NCG does try to
      reverse the comparison, but floating-point comparisons can't be
      reversed).
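
      The two shapes can be compared in a toy model.  Branch, oldTwoWay,
      newTwoWay and run are hypothetical names; the model only checks that
      both shapes send each scrutinee value to the same alternative, and
      says nothing about the machine code the NCG produces.

```haskell
module Main (main) where

data Test = LessThan Int | AtLeast Int deriving (Eq, Show)

-- "if (test e) then goto taken", otherwise fall through.
data Branch = Branch
  { brTest        :: Test
  , brTaken       :: String
  , brFallThrough :: String
  } deriving (Eq, Show)

-- Old shape: test for the low alternative, fall through to the high one.
oldTwoWay :: Int -> String -> String -> Branch
oldTwoWay pivot low high = Branch (LessThan pivot) low high

-- New shape: test for the high alternative, fall through to the low one.
newTwoWay :: Int -> String -> String -> Branch
newTwoWay pivot low high = Branch (AtLeast pivot) high low

-- Where control ends up for a given scrutinee value.
run :: Branch -> Int -> String
run (Branch test taken fallThrough) e
  | holds test = taken
  | otherwise  = fallThrough
  where
    holds (LessThan n) = e < n
    holds (AtLeast n)  = e >= n

main :: IO ()
main = do
  print (newTwoWay 1 "A" "B")
  print [run (newTwoWay 1 "A" "B") e | e <- [0, 1]]
  print [run (oldTwoWay 1 "A" "B") e | e <- [0, 1]]
```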
  28. 07 Apr, 2005 1 commit
  29. 31 Mar, 2005 1 commit
    • [project @ 2005-03-31 10:16:33 by simonmar] · 853e20a3
      simonmar authored
      Tweaks to get the GHC sources through Haddock.  Doesn't quite work
      yet, because Haddock complains about the recursive modules.  Haddock
      needs to understand SOURCE imports (it can probably just ignore them
      as a first attempt).
  30. 24 Mar, 2005 1 commit
  31. 21 Mar, 2005 1 commit
    • [project @ 2005-03-21 10:50:22 by simonmar] · 50159f6c
      simonmar authored
      Complete the transition of -split-objs into a dynamic flag (looks like I
      half-finished it in the last commit).
      
      Also: complete the transition of -tmpdir into a dynamic flag, which
      involves some rearrangement of code from SysTools into DynFlags.
      
      Someday, initSysTools should move wholesale into initDynFlags, because
      most of the state that it initialises is now part of the DynFlags
      structure, and the rest could be moved in easily.
  32. 18 Mar, 2005 2 commits
    • [project @ 2005-03-18 13:37:27 by simonmar] · d1c1b7d0
      simonmar authored
      Flags cleanup.
      
      Basically the purpose of this commit is to move more of the compiler's
      global state into DynFlags, which is moving in the direction we need
      to go for the GHC API which can have multiple active sessions
      supported by a single GHC instance.
      
      Before:
      
      $ grep 'global_var' */*hs | wc -l
           78
      
      After:
      
      $ grep 'global_var' */*hs | wc -l
           27
      
      Well, it's an improvement.  Most of what's left won't really affect
      our ability to host multiple sessions.
      
      Lots of static flags have become dynamic flags (yay!).  Notably lots
      of flags that we used to think of as "driver" flags, like -I and -L,
      are now dynamic.  The most notable static flags left behind are the
      "way" flags, eg. -prof.  It would be nice to fix this, but it isn't
      urgent.
      
      On the way, lots of cleanup has happened.  Everything related to
      static and dynamic flags lives in StaticFlags and DynFlags
      respectively, and they share a common command-line parser library in
      CmdLineParser.  The flags related to modes (--make, --interactive
      etc.) are now private to the front end: in fact private to Main
      itself, for now.
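
      Why moving state out of global variables matters for multiple
      sessions can be sketched with a much reduced flags record.  Flags,
      applyFlag and parseFlags are hypothetical names and bear no relation
      to the real DynFlags fields or to CmdLineParser; only -I and -L are
      handled, and unknown arguments are silently ignored.

```haskell
module Main (main) where

import Data.List (stripPrefix)

-- A hypothetical, much reduced flags record; the field names are invented.
data Flags = Flags
  { includePaths :: [FilePath]
  , libraryPaths :: [FilePath]
  } deriving (Eq, Show)

defaultFlags :: Flags
defaultFlags = Flags { includePaths = [], libraryPaths = [] }

-- Apply one command-line argument to a flags value.  Unknown arguments
-- are ignored in this sketch.
applyFlag :: Flags -> String -> Flags
applyFlag flags arg
  | Just dir <- stripPrefix "-I" arg =
      flags { includePaths = includePaths flags ++ [dir] }
  | Just dir <- stripPrefix "-L" arg =
      flags { libraryPaths = libraryPaths flags ++ [dir] }
  | otherwise = flags

parseFlags :: [String] -> Flags
parseFlags = foldl applyFlag defaultFlags

main :: IO ()
main = do
  -- Two sessions in one process, each with its own settings.
  let sessionA = parseFlags ["-Iinclude", "-Llib"]
      sessionB = parseFlags ["-Iother/include"]
  print sessionA
  print sessionB
```

      Because each session carries its own record, neither can disturb the
      other's include or library paths, which a single global variable
      could not guarantee.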
    • [project @ 2005-03-18 11:19:27 by simonmar] · 6a51f7df
      simonmar authored
      merge rev. 1.6.2.1, simplified slightly:
      
        Initialise a CostCentreStack by generating SIZEOF_CostCentreStack
        (gotten from the C compiler) zeros, padded to the nearest word.
        Improves on the previous fixes for unpredictable padding (see comment).
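
      The padding arithmetic, as a sketch.  It assumes that "padded to the
      nearest word" means rounding the size up to a whole number of words;
      roundUpToWords and zeroInit are hypothetical names, and 44 is a
      made-up size, not the real SIZEOF_CostCentreStack.

```haskell
module Main (main) where

-- Number of bytes after rounding a size up to a whole number of words.
roundUpToWords :: Int -> Int -> Int
roundUpToWords wordSize size =
  ((size + wordSize - 1) `div` wordSize) * wordSize

-- The static initialiser in this model: that many zero bytes.
zeroInit :: Int -> Int -> [Int]
zeroInit wordSize size = replicate (roundUpToWords wordSize size) 0

main :: IO ()
main = do
  print (roundUpToWords 8 44)       -- 8-byte words
  print (length (zeroInit 4 44))    -- 4-byte words
```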