1. 25 Aug, 2006 1 commit
  2. 11 Aug, 2006 1 commit
  3. 25 Jul, 2006 1 commit
    • Simon Marlow's avatar
      Generalise Package Support · 61d2625a
      Simon Marlow authored
      This patch pushes through one fundamental change: a module is now
      identified by the pair of its package and module name, whereas
      previously it was identified by its module name alone.  This means
      that now a program can contain multiple modules with the same name, as
      long as they belong to different packages.
      
      This is a language change - the Haskell report says nothing about
      packages, but it is now necessary to understand packages in order to
      understand GHC's module system.  For example, a type T from module M
      in package P is different from a type T from module M in package Q.
      Previously this wasn't an issue because there could only be a single
      module M in the program.
      
      The "module restriction" on combining packages has therefore been
      lifted, and a program can contain multiple versions of the same
      package.
      
      Note that none of the proposed syntax changes have yet been
      implemented, but the architecture is geared towards supporting import
      declarations qualified by package name, and that is probably the next
      step.
      
      It is now necessary to specify the package name when compiling a
      package, using the -package-name flag (which has been un-deprecated).
      Fortunately Cabal still uses -package-name.
      
      Certain packages are "wired in".  Currently the wired-in packages are:
      base, haskell98, template-haskell and rts, and are always referred to
      by these versionless names.  Other packages are referred to with full
      package IDs (eg. "network-1.0").  This is because the compiler needs
      to refer to entities in the wired-in packages, and we didn't want to
      bake the version of these packages into the comiler.  It's conceivable
      that someone might want to upgrade the base package independently of
      GHC.
      
      Internal changes:
      
        - There are two module-related types:
      
              ModuleName      just a FastString, the name of a module
              Module          a pair of a PackageId and ModuleName
      
          A mapping from ModuleName can be a UniqFM, but a mapping from Module
          must be a FiniteMap (we provide it as ModuleEnv).
      
        - The "HomeModules" type that was passed around the compiler is now
          gone, replaced in most cases by the current package name which is
          contained in DynFlags.  We can tell whether a Module comes from the
          current package by comparing its package name against the current
          package.
      
        - While I was here, I changed PrintUnqual to be a little more useful:
          it now returns the ModuleName that the identifier should be qualified
          with according to the current scope, rather than its original
          module.  Also, PrintUnqual tells whether to qualify module names with
          package names (currently unused).
      
      Docs to follow.
      61d2625a
  4. 04 Jul, 2006 1 commit
  5. 07 Apr, 2006 1 commit
    • Simon Marlow's avatar
      Reorganisation of the source tree · 0065d5ab
      Simon Marlow authored
      Most of the other users of the fptools build system have migrated to
      Cabal, and with the move to darcs we can now flatten the source tree
      without losing history, so here goes.
      
      The main change is that the ghc/ subdir is gone, and most of what it
      contained is now at the top level.  The build system now makes no
      pretense at being multi-project, it is just the GHC build system.
      
      No doubt this will break many things, and there will be a period of
      instability while we fix the dependencies.  A straightforward build
      should work, but I haven't yet fixed binary/source distributions.
      Changes to the Building Guide will follow, too.
      0065d5ab
  6. 28 Feb, 2006 1 commit
    • Simon Marlow's avatar
      pass arguments to unknown function calls in registers · 04db0e9f
      Simon Marlow authored
      We now have more stg_ap entry points: stg_ap_*_fast, which take
      arguments in registers according to the platform calling convention.
      This is faster if the function being called is evaluated and has the
      right arity, which is the common case (see the eval/apply paper for
      measurements).  
      
      We still need the stg_ap_*_info entry points for stack-based
      application, such as an overflows when a function is applied to too
      many argumnets.  The stg_ap_*_fast functions actually just check for
      an evaluated function, and if they don't find one, push the args on
      the stack and invoke stg_ap_*_info.  (this might be slightly slower in
      some cases, but not the common case).
      04db0e9f
  7. 24 Feb, 2006 1 commit
  8. 17 Jan, 2006 2 commits
    • simonmar's avatar
      [project @ 2006-01-17 16:13:18 by simonmar] · 91b07216
      simonmar authored
      Improve the GC behaviour of IORefs (see Ticket #650).
      
      This is a small change to the way IORefs interact with the GC, which
      should improve GC performance for programs with plenty of IORefs.
      
      Previously we had a single closure type for mutable variables,
      MUT_VAR.  Mutable variables were *always* on the mutable list in older
      generations, and always traversed on every GC.
      
      Now, we have two closure types: MUT_VAR_CLEAN and MUT_VAR_DIRTY.  The
      latter is on the mutable list, but the former is not.  (NB. this
      differs from MUT_ARR_PTRS_CLEAN and MUT_ARR_PTRS_DIRTY, both of which
      are on the mutable list).  writeMutVar# now implements a write
      barrier, by calling dirty_MUT_VAR() in the runtime, that does the
      necessary modification of MUT_VAR_CLEAN into MUT_VAR_DIRY, and adding
      to the mutable list if necessary.
      
      This results in some pretty dramatic speedups for GHC itself.  I've
      just measureed a 30% overall speedup compiling a 31-module program
      (anna) with the default heap settings :-D
      91b07216
    • simonmar's avatar
      [project @ 2006-01-17 16:03:47 by simonmar] · da69fa9c
      simonmar authored
      Improve the GC behaviour of IOArrays/STArrays
      
      See Ticket #650
      
      This is a small change to the way mutable arrays interact with the GC,
      that can have a dramatic effect on performance, and make tricks with
      unsafeThaw/unsafeFreeze redundant.  Data.HashTable should be faster
      now (I haven't measured it yet).
      
      We now have two mutable array closure types, MUT_ARR_PTRS_CLEAN and
      MUT_ARR_PTRS_DIRTY.  Both are on the mutable list if the array is in
      an old generation.  writeArray# sets the type to MUT_ARR_PTRS_DIRTY.
      The garbage collector can set the type to MUT_ARR_PTRS_CLEAN if it
      finds that no element of the array points into a younger generation
      (discovering this required a small addition to evacuate(), but rough
      tests indicate that it doesn't measurably affect performance).
      
      NOTE: none of this affects unboxed arrays (IOUArray/STUArray), only
      boxed arrays (IOArray/STArray).
      
      We could go further and extend the DIRTY bit to be per-block rather
      than for the whole array, but for now this is an easy improvement.
      da69fa9c
  9. 06 Jan, 2006 1 commit
    • simonmar's avatar
      [project @ 2006-01-06 16:30:17 by simonmar] · 9d7da331
      simonmar authored
      Add support for UTF-8 source files
      
      GHC finally has support for full Unicode in source files.  Source
      files are now assumed to be UTF-8 encoded, and the full range of
      Unicode characters can be used, with classifications recognised using
      the implementation from Data.Char.  This incedentally means that only
      the stage2 compiler will recognise Unicode in source files, because I
      was too lazy to port the unicode classifier code into libcompat.
      
      Additionally, the following synonyms for keywords are now recognised:
      
        forall symbol 	(U+2200)	forall
        right arrow   	(U+2192)	->
        left arrow   		(U+2190)	<-
        horizontal ellipsis 	(U+22EF)	..
      
      there are probably more things we could add here.
      
      This will break some source files if Latin-1 characters are being used.
      In most cases this should result in a UTF-8 decoding error.  Later on
      if we want to support more encodings (perhaps with a pragma to specify
      the encoding), I plan to do it by recoding into UTF-8 before parsing.
      
      Internally, there were some pretty big changes:
      
        - FastStrings are now stored in UTF-8
      
        - Z-encoding has been moved right to the back end.  Previously we
          used to Z-encode every identifier on the way in for simplicity,
          and only decode when we needed to show something to the user.
          Instead, we now keep every string in its UTF-8 encoding, and
          Z-encode right before printing it out.  To avoid Z-encoding the
          same string multiple times, the Z-encoding is cached inside the
          FastString the first time it is requested.
      
          This speeds up the compiler - I've measured some definite
          improvement in parsing at least, and I expect compilations overall
          to be faster too.  It also cleans up a lot of cruft from the
          OccName interface.  Z-encoding is nicely hidden inside the
          Outputable instance for Names & OccNames now.
      
        - StringBuffers are UTF-8 too, and are now represented as
          ForeignPtrs.
      
        - I've put together some test cases, not by any means exhaustive,
          but there are some interesting UTF-8 decoding error cases that
          aren't obvious.  Also, take a look at unicode001.hs for a demo.
      9d7da331
  10. 21 Jun, 2005 1 commit
    • simonmar's avatar
      [project @ 2005-06-21 10:44:37 by simonmar] · 0c53bd0e
      simonmar authored
      Relax the restrictions on conflicting packages.  This should address
      many of the traps that people have been falling into with the current
      package story.
      
      Now, a local module can shadow a module in an exposed package, as long
      as the package is not otherwise required by the program.  GHC checks
      for conflicts when it knows the dependencies of the module being
      compiled.
      
      Also, we now check for module conflicts in exposed packages only when
      importing a module: if an import can be satisfied from multiple
      packages, that's an error.  It's not possible to prevent GHC from
      starting by installing packages now (unless you install another base
      package).
      
      It seems to be possible to confuse GHCi by having a local module
      shadowing a package module that goes away and comes back again.  I
      think it's nearly right, but strange happenings have been observed.
      
      I'll try to merge this into the STABLE branch.
      0c53bd0e
  11. 23 May, 2005 1 commit
  12. 18 Mar, 2005 1 commit
    • simonmar's avatar
      [project @ 2005-03-18 13:37:27 by simonmar] · d1c1b7d0
      simonmar authored
      Flags cleanup.
      
      Basically the purpose of this commit is to move more of the compiler's
      global state into DynFlags, which is moving in the direction we need
      to go for the GHC API which can have multiple active sessions
      supported by a single GHC instance.
      
      Before:
      
      $ grep 'global_var' */*hs | wc -l
           78
      
      After:
      
      $ grep 'global_var' */*hs | wc -l
           27
      
      Well, it's an improvement.  Most of what's left won't really affect
      our ability to host multiple sessions.
      
      Lots of static flags have become dynamic flags (yay!).  Notably lots
      of flags that we used to think of as "driver" flags, like -I and -L,
      are now dynamic.  The most notable static flags left behind are the
      "way" flags, eg. -prof.  It would be nice to fix this, but it isn't
      urgent.
      
      On the way, lots of cleanup has happened.  Everything related to
      static and dynamic flags lives in StaticFlags and DynFlags
      respectively, and they share a common command-line parser library in
      CmdLineParser.  The flags related to modes (--makde, --interactive
      etc.) are now private to the front end: in fact private to Main
      itself, for now.
      d1c1b7d0
  13. 10 Feb, 2005 1 commit
    • simonmar's avatar
      [project @ 2005-02-10 13:01:52 by simonmar] · e7c3f957
      simonmar authored
      GC changes: instead of threading old-generation mutable lists
      through objects in the heap, keep it in a separate flat array.
      
      This has some advantages:
      
        - the IND_OLDGEN object is now only 2 words, so the minimum
          size of a THUNK is now 2 words instead of 3.  This saves
          some amount of allocation (about 2% on average according to
          my measurements), and is more friendly to the cache by
          squashing objects together more.
      
        - keeping the mutable list separate from the IND object
          will be necessary for our multiprocessor implementation.
      
        - removing the mut_link field makes the layout of some objects
          more uniform, leading to less complexity and special cases.
      
        - I also unified the two mutable lists (mut_once_list and mut_list)
          into a single mutable list, which lead to more simplifications
          in the GC.
      e7c3f957
  14. 28 Jan, 2005 1 commit
    • simonmar's avatar
      [project @ 2005-01-28 12:55:17 by simonmar] · 153b9cb9
      simonmar authored
      Rationalise the BUILD,HOST,TARGET defines.
      
      Recall that:
      
        - build is the platform we're building on
        - host is the platform we're running on
        - target is the platform we're generating code for
      
      The change is that now we take these definitions as applying from the
      point of view of the particular source code being built, rather than
      the point of view of the whole build tree.
      
      For example, in RTS and library code, we were previously testing the
      TARGET platform.  But under the new rule, the platform on which this
      code is going to run is the HOST platform.  TARGET only makes sense in
      the compiler sources.
      
      In practical terms, this means that the values of BUILD, HOST & TARGET
      may vary depending on which part of the build tree we are in.
      
      Actual changes:
      
       - new file: includes/ghcplatform.h contains platform defines for
         the RTS and library code.
      
       - new file: includes/ghcautoconf.h contains the autoconf settings
         only (HAVE_BLAH).  This is so that we can get hold of these
         settings independently of the platform defines when necessary
         (eg. in GHC).
      
       - ghcconfig.h now #includes both ghcplatform.h and ghcautoconf.h.
      
       - MachRegs.h, which is included into both the compiler and the RTS,
         now has to cope with the fact that it might need to test either
         _TARGET_ or _HOST_ depending on the context.
      
       - the compiler's Makefile now generates
           stage{1,2,3}/ghc_boot_platform.h
         which contains platform defines for the compiler.  These differ
         depending on the stage, of course: in stage2, the HOST is the
         TARGET of stage1.  This was wrong before.
      
       - The compiler doesn't get platform info from Config.hs any more.
         Previously it did (sometimes), but unless we want to generate
         a new Config.hs for each stage we can't do this.
      
       - GHC now helpfully defines *_{BUILD,HOST}_{OS,ARCH} automatically
         in CPP'd Haskell source.
      
       - ghcplatform.h defines *_TARGET_* for backwards compatibility
         (ghcplatform.h is included by ghcconfig.h, which is included by
         config.h, so code which still #includes config.h will get the TARGET
         settings as before).
      
       - The Users's Guide is updated to mention *_HOST_* rather than
         *_TARGET_*.
      
       - coding-style.html in the commentary now contains a section on
         platform defines.  There are further doc updates to come.
      
      Thanks to Wolfgang Thaller for pointing me in the right direction.
      153b9cb9
  15. 23 Jan, 2005 1 commit
    • wolfgang's avatar
      [project @ 2005-01-23 06:10:15 by wolfgang] · 6f985ae8
      wolfgang authored
      Add support for the dead code stripping feature of recent Apple linkers.
      If your code is compiled using the NCG, you can now specify
      -optl-W,-dead_strip on the GHC command line when linking.
      It will have basically the same effect as using split-objs to build the
      libraries.
      
      Advantages over split-objs:
          * No evil perl script involved
          * Requires no special handling when building libraries
      
      Disadvantages:
          * The current version of Apple's linker is slow when given the
            -dead_strip flag. _REALLY_ slow.
          * Mac OS X only.
      
      This works by making the NCG emit the .subsections_via_symbols directive.
      Additionally, we have to add an extra label at the top of every info table,
      and make sure that the entry code references it (otherwise the info table
      will be considered part of the preceding entry code).
      The mangler just removes the .subsections_via_symbols directive.
      6f985ae8
  16. 16 Jan, 2005 1 commit
    • wolfgang's avatar
      [project @ 2005-01-16 05:31:39 by wolfgang] · 7a1b0a6c
      wolfgang authored
      A first stab at position independent code generation for i386-linux.
      It doesn't work yet, but it shouldn't break anything.
      
      What we need now is one or both of the following:
      a) A volunteer to implement PIC for x86 -fvia-C
          (I definitely refuse to touch any piece of code that contains
           both Perl and x86 assembly).
      b) A volunteer to improve the NCG to the point where it can compile
         the RTS (so we won't need point a).
      7a1b0a6c
  17. 29 Nov, 2004 2 commits
  18. 26 Nov, 2004 1 commit
    • simonmar's avatar
      [project @ 2004-11-26 16:19:45 by simonmar] · ef5b4b14
      simonmar authored
      Further integration with the new package story.  GHC now supports
      pretty much everything in the package proposal.
      
        - GHC now works in terms of PackageIds (<pkg>-<version>) rather than
          just package names.  You can still specify package names without
          versions on the command line, as long as the name is unambiguous.
      
        - GHC understands hidden/exposed modules in a package, and will refuse
          to import a hidden module.  Also, the hidden/eposed status of packages
          is taken into account.
      
        - I had to remove the old package syntax from ghc-pkg, backwards
          compatibility isn't really practical.
      
        - All the package.conf.in files have been rewritten in the new syntax,
          and contain a complete list of modules in the package.  I've set all
          the versions to 1.0 for now - please check your package(s) and fix the
          version number & other info appropriately.
      
        - New options:
      
      	-hide-package P    sets the expose flag on package P to False
      	-ignore-package P  unregisters P for this compilation
      
      	For comparison, -package P sets the expose flag on package P
              to True, and also causes P to be linked in eagerly.
      
              -package-name is no longer officially supported.  Unofficially, it's
      	a synonym for -ignore-package, which has more or less the same effect
      	as -package-name used to.
      
      	Note that a package may be hidden and yet still be linked into
      	the program, by virtue of being a dependency of some other package.
      	To completely remove a package from the compiler's internal database,
              use -ignore-package.
      
      	The compiler will complain if any two packages in the
              transitive closure of exposed packages contain the same
              module.
      
      	You *must* use -ignore-package P when compiling modules for
              package P, if package P (or an older version of P) is already
              registered.  The compiler will helpfully complain if you don't.
      	The fptools build system does this.
      
         - Note: the Cabal library won't work yet.  It still thinks GHC uses
           the old package config syntax.
      
      Internal changes/cleanups:
      
         - The ModuleName type has gone away.  Modules are now just (a
           newtype of) FastStrings, and don't contain any package information.
           All the package-related knowledge is in DynFlags, which is passed
           down to where it is needed.
      
         - DynFlags manipulation has been cleaned up somewhat: there are no
           global variables holding DynFlags any more, instead the DynFlags
           are passed around properly.
      
         - There are a few less global variables in GHC.  Lots more are
           scheduled for removal.
      
         - -i is now a dynamic flag, as are all the package-related flags (but
           using them in {-# OPTIONS #-} is Officially Not Recommended).
      
         - make -j now appears to work under fptools/libraries/.  Probably
           wouldn't take much to get it working for a whole build.
      ef5b4b14
  19. 20 Oct, 2004 1 commit
  20. 18 Oct, 2004 1 commit
  21. 07 Oct, 2004 1 commit
    • wolfgang's avatar
      [project @ 2004-10-07 15:54:03 by wolfgang] · b4d045ae
      wolfgang authored
      Position Independent Code and Dynamic Linking Support, Part 1
      
      This commit allows generation of position independent code (PIC) that fully supports dynamic linking on Mac OS X and PowerPC Linux.
      Other platforms are not yet supported, and there is no support for actually linking or using dynamic libraries - so if you use the -fPIC or -dynamic code generation flags, you have to type your (platform-specific) linker command lines yourself.
      
      
      nativeGen/PositionIndependentCode.hs:
      New file. Look here for some more comments on how this works.
      
      cmm/CLabel.hs:
      Add support for DynamicLinkerLabels and PIC base labels - for use inside the NCG.
      needsCDecl: Case alternative labels now need C decls, see the codeGen/CgInfoTbls.hs below for details
      
      cmm/Cmm.hs:
      Add CmmPicBaseReg (used in NCG),
      and CmmLabelDiffOff (used in NCG and for offsets in info tables)
      
      cmm/CmmParse.y:
      support offsets in info tables
      
      cmm/PprC.hs:
      support CmmLabelDiffOff
      Case alternative labels now need C decls (see the codeGen/CgInfoTbls.hs for details), so we need to pprDataExterns for info tables.
      
      cmm/PprCmm.hs:
      support CmmLabelDiffOff
      
      codeGen/CgInfoTbls.hs:
      no longer store absolute addresses in info tables, instead, we store offsets.
      Also, for vectored return points, emit the alternatives _after_ the vector table. This is to work around a limitation in Apple's as, which refuses to handle label differences where one label is at the end of a section. Emitting alternatives after vector info tables makes sure this never happens in GHC generated code. Case alternatives now require prototypes in hc code, though (see changes in PprC.hs, CLabel.hs).
      
      main/CmdLineOpts.lhs:
      Add a new option, -fPIC.
      
      main/DriverFlags.hs:
      Pass the correct options for PIC to gcc, depending on the platform. Only for powerpc for now.
      
      nativeGen/AsmCodeGen.hs:
      Many changes...
      Mac OS X-specific management of import stubs is no longer, it's now part of a general mechanism to handle such things for all platforms that need it (Darwin [both ppc and x86], Linux on ppc, and some platforms we don't support).
      Move cmmToCmm into its own monad which can accumulate a list of imported symbols. Make it call cmmMakeDynamicReference at the right places.
      
      nativeGen/MachCodeGen.hs:
      nativeGen/MachInstrs.hs:
      nativeGen/MachRegs.lhs:
      nativeGen/PprMach.hs:
      nativeGen/RegAllocInfo.hs:
      Too many changes to enumerate here, PowerPC specific.
      
      nativeGen/NCGMonad.hs:
      NatM still tracks imported symbols, as more labels can be created during code generation (float literals, jump tables; on some platforms all data access has to go through the dynamic linking mechanism).
      
      driver/mangler/ghc-asm.lprl:
      Mangle absolute addresses in info tables to offsets.
      Correctly pass through GCC-generated PIC for Mac OS X and powerpc linux.
      
      includes/Cmm.h:
      includes/InfoTables.h:
      includes/Storage.h:
      includes/mkDerivedConstants.c:
      rts/GC.c:
      rts/GCCompact.c:
      rts/HeapStackCheck.cmm:
      rts/Printer.c:
      rts/RetainerProfile.c:
      rts/Sanity.c:
      Adapt to the fact that info tables now contain offsets.
      
      rts/Linker.c:
      Mac-specific: change machoInitSymbolsWithoutUnderscore to support PIC.
      b4d045ae
  22. 10 Sep, 2004 2 commits
  23. 20 Aug, 2004 1 commit
  24. 13 Aug, 2004 1 commit