1. 02 Dec, 2011 1 commit
    • Simon Marlow's avatar
      More changes aimed at improving call stacks. · 1469f1eb
      Simon Marlow authored
        - Attach a SrcSpan to every CostCentre.  This had the side effect
          that CostCentres that used to be merged because they had the same
          name are now considered distinct; so I had to add a Unique to
          CostCentre to give them distinct object-code symbols.
      
        - New flag: -fprof-auto-calls.  This flag adds an automatic SCC to
          every call site (application, to be precise).  This is typically
          more useful for call stacks than annotating whole functions.
      
      Various tidy-ups at the same time: removed unused NoCostCentre
      constructor, and refactored a bit in Coverage.lhs.
      
      The call stack we get from traceStack now looks like this:
      
      Stack trace:
        Main.CAF (<entire-module>)
        Main.main.xs (callstack002.hs:18:12-24)
        Main.map (callstack002.hs:13:12-16)
        Main.map.go (callstack002.hs:15:21-34)
        Main.map.go (callstack002.hs:15:21-23)
        Main.f (callstack002.hs:10:7-43)
      1469f1eb
  2. 29 Nov, 2011 1 commit
    • Simon Marlow's avatar
      Make profiling work with multiple capabilities (+RTS -N) · 50de6034
      Simon Marlow authored
      This means that both time and heap profiling work for parallel
      programs.  Main internal changes:
      
        - CCCS is no longer a global variable; it is now another
          pseudo-register in the StgRegTable struct.  Thus every
          Capability has its own CCCS.
      
        - There is a new built-in CCS called "IDLE", which records ticks for
          Capabilities in the idle state.  If you profile a single-threaded
          program with +RTS -N2, you'll see about 50% of time in "IDLE".
      
        - There is appropriate locking in rts/Profiling.c to protect the
          shared cost-centre-stack data structures.
      
      This patch does enough to get it working, I have cut one big corner:
      the cost-centre-stack data structure is still shared amongst all
      Capabilities, which means that multiple Capabilities will race when
      updating the "allocations" and "entries" fields of a CCS.  Not only
      does this give unpredictable results, but it runs very slowly due to
      cache line bouncing.
      
      It is strongly recommended that you use -fno-prof-count-entries to
      disable the "entries" count when profiling parallel programs. (I shall
      add a note to this effect to the docs).
      50de6034
  3. 07 Nov, 2011 1 commit
    • Simon Marlow's avatar
      Cost centre names are now in UTF-8 (#5559) · 630b8955
      Simon Marlow authored
      So the .prof file will be UTF-8.  This is mostly ok, except that the
      RTS doesn't calculate the column widths correctly (it assumes bytes =
      chars).
      
      hp2ps doesn't do anything sensible with Unicode strings, it just dumps
      the bytes into the .ps file.
      630b8955
  4. 04 Nov, 2011 1 commit
  5. 02 Nov, 2011 1 commit
    • Simon Marlow's avatar
      Overhaul of infrastructure for profiling, coverage (HPC) and breakpoints · 7bb0447d
      Simon Marlow authored
      User visible changes
      ====================
      
      Profilng
      --------
      
      Flags renamed (the old ones are still accepted for now):
      
        OLD            NEW
        ---------      ------------
        -auto-all      -fprof-auto
        -auto          -fprof-exported
        -caf-all       -fprof-cafs
      
      New flags:
      
        -fprof-auto              Annotates all bindings (not just top-level
                                 ones) with SCCs
      
        -fprof-top               Annotates just top-level bindings with SCCs
      
        -fprof-exported          Annotates just exported bindings with SCCs
      
        -fprof-no-count-entries  Do not maintain entry counts when profiling
                                 (can make profiled code go faster; useful with
                                 heap profiling where entry counts are not used)
      
      Cost-centre stacks have a new semantics, which should in most cases
      result in more useful and intuitive profiles.  If you find this not to
      be the case, please let me know.  This is the area where I have been
      experimenting most, and the current solution is probably not the
      final version, however it does address all the outstanding bugs and
      seems to be better than GHC 7.2.
      
      Stack traces
      ------------
      
      +RTS -xc now gives more information.  If the exception originates from
      a CAF (as is common, because GHC tends to lift exceptions out to the
      top-level), then the RTS walks up the stack and reports the stack in
      the enclosing update frame(s).
      
      Result: +RTS -xc is much more useful now - but you still have to
      compile for profiling to get it.  I've played around a little with
      adding 'head []' to GHC itself, and +RTS -xc does pinpoint the problem
      quite accurately.
      
      I plan to add more facilities for stack tracing (e.g. in GHCi) in the
      future.
      
      Coverage (HPC)
      --------------
      
       * derived instances are now coloured yellow if they weren't used
       * likewise record field names
       * entry counts are more accurate (hpc --fun-entry-count)
       * tab width is now correct (markup was previously off in source with
         tabs)
      
      Internal changes
      ================
      
      In Core, the Note constructor has been replaced by
      
              Tick (Tickish b) (Expr b)
      
      which is used to represent all the kinds of source annotation we
      support: profiling SCCs, HPC ticks, and GHCi breakpoints.
      
      Depending on the properties of the Tickish, different transformations
      apply to Tick.  See CoreUtils.mkTick for details.
      
      Tickets
      =======
      
      This commit closes the following tickets, test cases to follow:
      
        - Close #2552: not a bug, but the behaviour is now more intuitive
          (test is T2552)
      
        - Close #680 (test is T680)
      
        - Close #1531 (test is result001)
      
        - Close #949 (test is T949)
      
        - Close #2466: test case has bitrotted (doesn't compile against current
          version of vector-space package)
      7bb0447d
  6. 25 Aug, 2011 2 commits
  7. 12 Apr, 2011 1 commit
    • Simon Marlow's avatar
      Change the way module initialisation is done (#3252, #4417) · a52ff761
      Simon Marlow authored
      Previously the code generator generated small code fragments labelled
      with __stginit_M for each module M, and these performed whatever
      initialisation was necessary for that module and recursively invoked
      the initialisation functions for imported modules.  This appraoch had
      drawbacks:
      
       - FFI users had to call hs_add_root() to ensure the correct
         initialisation routines were called.  This is a non-standard,
         and ugly, API.
      
       - unless we were using -split-objs, the __stginit dependencies would
         entail linking the whole transitive closure of modules imported,
         whether they were actually used or not.  In an extreme case (#4387,
         #4417), a module from GHC might be imported for use in Template
         Haskell or an annotation, and that would force the whole of GHC to
         be needlessly linked into the final executable.
      
      So now instead we do our initialisation with C functions marked with
      __attribute__((constructor)), which are automatically invoked at
      program startup time (or DSO load-time).  The C initialisers are
      emitted into the stub.c file.  This means that every time we compile
      with -prof or -hpc, we now get a stub file, but thanks to #3687 that
      is now invisible to the user.
      
      There are some refactorings in the RTS (particularly for HPC) to
      handle the fact that initialisers now get run earlier than they did
      before.
      
      The __stginit symbols are still generated, and the hs_add_root()
      function still exists (but does nothing), for backwards compatibility.
      a52ff761
  8. 24 Jan, 2011 1 commit
    • Simon Marlow's avatar
      Merge in new code generator branch. · 889c084e
      Simon Marlow authored
      This changes the new code generator to make use of the Hoopl package
      for dataflow analysis.  Hoopl is a new boot package, and is maintained
      in a separate upstream git repository (as usual, GHC has its own
      lagging darcs mirror in http://darcs.haskell.org/packages/hoopl).
      
      During this merge I squashed recent history into one patch.  I tried
      to rebase, but the history had some internal conflicts of its own
      which made rebase extremely confusing, so I gave up. The history I
      squashed was:
      
        - Update new codegen to work with latest Hoopl
        - Add some notes on new code gen to cmm-notes
        - Enable Hoopl lag package.
        - Add SPJ note to cmm-notes
        - Improve GC calls on new code generator.
      
      Work in this branch was done by:
         - Milan Straka <fox@ucw.cz>
         - John Dias <dias@cs.tufts.edu>
         - David Terei <davidterei@gmail.com>
      
      Edward Z. Yang <ezyang@mit.edu> merged in further changes from GHC HEAD
      and fixed a few bugs.
      889c084e
  9. 06 Nov, 2009 1 commit
    • Ben.Lippmeier@anu.edu.au's avatar
      * Refactor CLabel.RtsLabel to CLabel.CmmLabel · a02e7f40
      Ben.Lippmeier@anu.edu.au authored
      The type of the CmmLabel ctor is now
        CmmLabel :: PackageId -> FastString -> CmmLabelInfo -> CLabel
        
       - When you construct a CmmLabel you have to explicitly say what
         package it is in. Many of these will just use rtsPackageId, but
         I've left it this way to remind people not to pretend labels are
         in the RTS package when they're not. 
         
       - When parsing a Cmm file, labels that are not defined in the 
         current file are assumed to be in the RTS package. 
         
         Labels imported like
            import label
         are assumed to be in a generic "foreign" package, which is different
         from the current one.
         
         Labels imported like
            import "package-name" label
         are marked as coming from the named package.
         
         This last one is needed for the integer-gmp library as we want to
         refer to labels that are not in the same compilation unit, but
         are in the same non-rts package.
         
         This should help remove the nasty #ifdef __PIC__ stuff from
         integer-gmp/cbits/gmp-wrappers.cmm
         
      a02e7f40
  10. 18 Oct, 2009 1 commit
  11. 02 Aug, 2009 1 commit
    • Simon Marlow's avatar
      RTS tidyup sweep, first phase · a2a67cd5
      Simon Marlow authored
      The first phase of this tidyup is focussed on the header files, and in
      particular making sure we are exposinng publicly exactly what we need
      to, and no more.
      
       - Rts.h now includes everything that the RTS exposes publicly,
         rather than a random subset of it.
      
       - Most of the public header files have moved into subdirectories, and
         many of them have been renamed.  But clients should not need to
         include any of the other headers directly, just #include the main
         public headers: Rts.h, HsFFI.h, RtsAPI.h.
      
       - All the headers needed for via-C compilation have moved into the
         stg subdirectory, which is self-contained.  Most of the headers for
         the rest of the RTS APIs have moved into the rts subdirectory.
      
       - I left MachDeps.h where it is, because it is so widely used in
         Haskell code.
       
       - I left a deprecated stub for RtsFlags.h in place.  The flag
         structures are now exposed by Rts.h.
      
       - Various internal APIs are no longer exposed by public header files.
      
       - Various bits of dead code and declarations have been removed
      
       - More gcc warnings are turned on, and the RTS code is more
         warning-clean.
      
       - More source files #include "PosixSource.h", and hence only use
         standard POSIX (1003.1c-1995) interfaces.
      
      There is a lot more tidying up still to do, this is just the first
      pass.  I also intend to standardise the names for external RTS APIs
      (e.g use the rts_ prefix consistently), and declare the internal APIs
      as hidden for shared libraries.
      a2a67cd5
  12. 07 Jul, 2009 1 commit
  13. 17 Dec, 2008 1 commit
  14. 13 Oct, 2008 1 commit
    • dias@eecs.harvard.edu's avatar
      Big collection of patches for the new codegen branch. · e6243a81
      dias@eecs.harvard.edu authored
      o Fixed bug that emitted the copy-in code for closure entry
        in the wrong place -- at the initialization of the closure.
      o Refactored some of the closure entry code.
      o Added code to check that no LocalRegs are live-in to a procedure
         -- trip up some buggy programs earlier
      o Fixed environment bindings for thunks
         -- we weren't (re)binding the free variables in a thunk
      o Fixed a bug in proc-point splitting that dropped some updates
        to the entry block in a procedure.
      o Fixed improper calls to code that generates CmmLit's for strings
      o New invariant on cg_loc in CgIdInfo: the expression is always tagged
      o Code to load free vars on entry to a thunk was (wrongly) placed before
        the heap check.
      o Some of the StgCmm code was redundantly passing around Id's
        along with CgIdInfo's; no more.
      o Initialize the LocalReg's that point to a closure before allocating and
        initializing the closure itself -- otherwise, we have problems with
        recursive closure bindings
      o BlockEnv and BlockSet types are now abstract.
      o Update frames:
        - push arguments in Old call area
        - keep track of the return sp in the FCode monad
        - keep the return sp in every call, tail call, and return
            (because it might be different at different call sites,
             e.g. tail calls to the gc after a heap check are performed
                  before pushing the update frame)
        - set the sp appropriately on returns and tail calls
      o Reduce call, tail call, and return to a single LastCall node
      o Added slow entry code, using different calling conventions on entry and tail call
      o More fixes to the calling convention code.
        The tricky stuff is all about the closure environment: it must be passed in R1,
        but in non-closures, there is no such argument, so we can't treat all arguments
        the same way: the closure environment is special. Maybe the right step forward
        would be to define a different calling convention for closure arguments.
      o Let-no-escapes need to be emitted out-of-line -- otherwise, we drop code.
      o Respect RTS requirement of word alignment for pointers
        My stack allocation can pack sub-word values into a single word on the stack,
        but it wasn't requiring word-alignment for pointers. It does now,
        by word-aligning both pointer registers and call areas.
      o CmmLint was over-aggresively ruling out non-word-aligned memory references,
        which may be kosher now that we can spill small values into a single word.
      o Wrong label order on a conditional branch when compiling switches.
      o void args weren't dropped in many cases.
        To help prevent this kind of mistake, I defined a NonVoid wrapper,
        which I'm applying only to Id's for now, although there are probably
        other good candidates.
      o A little code refactoring: separate modules for procpoint analysis splitting, 
        stack layout, and building infotables.
      o Stack limit check: insert along with the heap limit check, using a symbolic
        constant (a special CmmLit), then replace it when the stack layout is known.
      o Removed last node: MidAddToContext 
      o Adding block id as a literal: means that the lowering of the calling conventions
        no longer has to produce labels early, which was inhibiting common-block elimination.
        Will also make it easier for the non-procpoint-splitting path.
      o Info tables: don't try to describe the update frame!
      o Over aggressive use of NonVoid!!!!
        Don't drop the non-void args before setting the type of the closure!!!
      o Sanity checking:
        Added a pass to stub dead dead slots on the stack
        (only ~10 lines with the dataflow framework)
      o More sanity checking:
        Check that incoming pointer arguments are non-stubbed.
        Note: these checks are still subject to dead-code removal, but they should
        still be quite helpful.
      o Better sanity checking: why stop at function arguments?
        Instead, in mkAssign, check that _any_ assignment to a pointer type is non-null
        -- the sooner the crash, the easier it is to debug.
        Still need to add the debugging flag to turn these checks on explicitly.
      o Fixed yet another calling convention bug.
        This time, the calls to the GC were wrong. I've added a new convention
        for GC calls and invoked it where appropriate.
        We should really straighten out the calling convention stuff:
          some of the code (and documentation) is spread across the compiler,
          and there's some magical use of the node register that should really
          be handled (not avoided) by calling conventions.
      o Switch bug: the arms in mkCmmLitSwitch weren't returning to a single join point.
      o Environment shadowing problem in Stg->Cmm:
        When a closure f is bound at the top-level, we should not bind f to the
        node register on entry to the closure.
        Why? Because if the body of f contains a let-bound closure g that refers
        to f, we want to make sure that it refers to the static closure for f.
        Normally, this would all be fine, because when we compile a closure,
        we rebind free variables in the environment. But f doesn't look like
        a free variable because it's a static value. So, the binding for f
        remains in the environment when we compile g, inconveniently referring
        to the wrong thing.
        Now, I bind the variable in the local environment only if the closure is not
        bound at the top level. It's still okay to make assumptions about the
        node holding the closure environment; we just won't find the binding
        in the environment, so code that names the closure will now directly
        get the label of the static closure, not the node register holding a
        pointer to the static closure.
      o Don't generate bogus Cmm code containing SRTs during the STG -> Cmm pass!
        The tables made reference to some labels that don't exist when we compute and
        generate the tables in the back end.
      o Safe foreign calls need some special treatment (at least until we have the integrated
        codegen). In particular:
        o they need info tables
        o they are not procpoints -- the successor had better be in the same procedure
        o we cannot (yet) implement the calling conventions early, which means we have
          to carry the calling-conv info all the way to the end
      o We weren't following the old convention when registering a module.
        Now, we use update frames to push any new modules that have to be registered
        and enter the youngest one on the stack.
        We also use the update frame machinery to specify that the return should pop
        the return address off the stack.
      o At each safe foreign call, an infotable must be at the bottom of the stack,
        and the TSO->sp must point to it.
      o More problems with void args in a direct call to a function:
        We were checking the args (minus voids) to check whether the call was saturated,
        which caused problems when the function really wasn't saturated because it
        took an extra void argument.
      o Forgot to distinguish integer != from floating != during Stg->Cmm
      o Updating slotEnv and areaMap to include safe foreign calls
        The dataflow analyses that produce the slotEnv and areaMap give
        results for each basic block, but we also need the results for
        a safe foreign call, which is a middle node.
        After running the dataflow analysis, we have another pass that
        updates the results to includ any safe foreign calls.
      o Added a static flag for the debugging technique that inserts
        instructions to stub dead slots on the stack and crashes when
        a stubbed value is loaded into a pointer-typed LocalReg.
      o C back end expects to see return continuations before their call sites.
        Sorted the flowgraphs appropriately after splitting.
      o PrimOp calling conventions are special -- unlimited registers, no stack
        Yet another calling convention...
      o More void value problems: if the RHS of a case arm is a void-typed variable,
        don't try to return it.
      o When calling some primOp, they may allocate memory; if so, we need to
        do a heap check when we return from the call.
      e6243a81
  15. 14 Aug, 2008 1 commit
    • dias@eecs.harvard.edu's avatar
      Merging in the new codegen branch · 176fa33f
      dias@eecs.harvard.edu authored
      This merge does not turn on the new codegen (which only compiles
      a select few programs at this point),
      but it does introduce some changes to the old code generator.
      
      The high bits:
      1. The Rep Swamp patch is finally here.
         The highlight is that the representation of types at the
         machine level has changed.
         Consequently, this patch contains updates across several back ends.
      2. The new Stg -> Cmm path is here, although it appears to have a
         fair number of bugs lurking.
      3. Many improvements along the CmmCPSZ path, including:
         o stack layout
         o some code for infotables, half of which is right and half wrong
         o proc-point splitting
      176fa33f
  16. 03 May, 2008 1 commit
  17. 12 Apr, 2008 1 commit
  18. 04 Jan, 2008 1 commit
  19. 21 Sep, 2007 1 commit
  20. 04 Sep, 2007 1 commit
  21. 03 Sep, 2007 1 commit
  22. 01 Sep, 2007 1 commit
  23. 27 Jul, 2007 1 commit
    • Simon Marlow's avatar
      Pointer Tagging · 6015a94f
      Simon Marlow authored
        
      This patch implements pointer tagging as per our ICFP'07 paper "Faster
      laziness using dynamic pointer tagging".  It improves performance by
      10-15% for most workloads, including GHC itself.
      
      The original patches were by Alexey Rodriguez Yakushev
      <mrchebas@gmail.com>, with additions and improvements by me.  I've
      re-recorded the development as a single patch.
      
      The basic idea is this: we use the low 2 bits of a pointer to a heap
      object (3 bits on a 64-bit architecture) to encode some information
      about the object pointed to.  For a constructor, we encode the "tag"
      of the constructor (e.g. True vs. False), for a function closure its
      arity.  This enables some decisions to be made without dereferencing
      the pointer, which speeds up some common operations.  In particular it
      enables us to avoid costly indirect jumps in many cases.
      
      More information in the commentary:
      
      http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/HaskellExecution/PointerTagging
      6015a94f
  24. 27 Jun, 2007 2 commits
  25. 11 Oct, 2006 1 commit
    • Simon Marlow's avatar
      Module header tidyup, phase 1 · 49c98d14
      Simon Marlow authored
      This patch is a start on removing import lists and generally tidying
      up the top of each module.  In addition to removing import lists:
      
         - Change DATA.IOREF -> Data.IORef etc.
         - Change List -> Data.List etc.
         - Remove $Id$
         - Update copyrights
         - Re-order imports to put non-GHC imports last
         - Remove some unused and duplicate imports
      49c98d14
  26. 03 Aug, 2006 1 commit
  27. 25 Jul, 2006 1 commit
    • Simon Marlow's avatar
      Generalise Package Support · 61d2625a
      Simon Marlow authored
      This patch pushes through one fundamental change: a module is now
      identified by the pair of its package and module name, whereas
      previously it was identified by its module name alone.  This means
      that now a program can contain multiple modules with the same name, as
      long as they belong to different packages.
      
      This is a language change - the Haskell report says nothing about
      packages, but it is now necessary to understand packages in order to
      understand GHC's module system.  For example, a type T from module M
      in package P is different from a type T from module M in package Q.
      Previously this wasn't an issue because there could only be a single
      module M in the program.
      
      The "module restriction" on combining packages has therefore been
      lifted, and a program can contain multiple versions of the same
      package.
      
      Note that none of the proposed syntax changes have yet been
      implemented, but the architecture is geared towards supporting import
      declarations qualified by package name, and that is probably the next
      step.
      
      It is now necessary to specify the package name when compiling a
      package, using the -package-name flag (which has been un-deprecated).
      Fortunately Cabal still uses -package-name.
      
      Certain packages are "wired in".  Currently the wired-in packages are:
      base, haskell98, template-haskell and rts, and are always referred to
      by these versionless names.  Other packages are referred to with full
      package IDs (eg. "network-1.0").  This is because the compiler needs
      to refer to entities in the wired-in packages, and we didn't want to
      bake the version of these packages into the comiler.  It's conceivable
      that someone might want to upgrade the base package independently of
      GHC.
      
      Internal changes:
      
        - There are two module-related types:
      
              ModuleName      just a FastString, the name of a module
              Module          a pair of a PackageId and ModuleName
      
          A mapping from ModuleName can be a UniqFM, but a mapping from Module
          must be a FiniteMap (we provide it as ModuleEnv).
      
        - The "HomeModules" type that was passed around the compiler is now
          gone, replaced in most cases by the current package name which is
          contained in DynFlags.  We can tell whether a Module comes from the
          current package by comparing its package name against the current
          package.
      
        - While I was here, I changed PrintUnqual to be a little more useful:
          it now returns the ModuleName that the identifier should be qualified
          with according to the current scope, rather than its original
          module.  Also, PrintUnqual tells whether to qualify module names with
          package names (currently unused).
      
      Docs to follow.
      61d2625a
  28. 04 Jul, 2006 1 commit
  29. 07 Apr, 2006 1 commit
    • Simon Marlow's avatar
      Reorganisation of the source tree · 0065d5ab
      Simon Marlow authored
      Most of the other users of the fptools build system have migrated to
      Cabal, and with the move to darcs we can now flatten the source tree
      without losing history, so here goes.
      
      The main change is that the ghc/ subdir is gone, and most of what it
      contained is now at the top level.  The build system now makes no
      pretense at being multi-project, it is just the GHC build system.
      
      No doubt this will break many things, and there will be a period of
      instability while we fix the dependencies.  A straightforward build
      should work, but I haven't yet fixed binary/source distributions.
      Changes to the Building Guide will follow, too.
      0065d5ab
  30. 06 Jan, 2006 1 commit
    • simonmar's avatar
      [project @ 2006-01-06 16:30:17 by simonmar] · 9d7da331
      simonmar authored
      Add support for UTF-8 source files
      
      GHC finally has support for full Unicode in source files.  Source
      files are now assumed to be UTF-8 encoded, and the full range of
      Unicode characters can be used, with classifications recognised using
      the implementation from Data.Char.  This incedentally means that only
      the stage2 compiler will recognise Unicode in source files, because I
      was too lazy to port the unicode classifier code into libcompat.
      
      Additionally, the following synonyms for keywords are now recognised:
      
        forall symbol 	(U+2200)	forall
        right arrow   	(U+2192)	->
        left arrow   		(U+2190)	<-
        horizontal ellipsis 	(U+22EF)	..
      
      there are probably more things we could add here.
      
      This will break some source files if Latin-1 characters are being used.
      In most cases this should result in a UTF-8 decoding error.  Later on
      if we want to support more encodings (perhaps with a pragma to specify
      the encoding), I plan to do it by recoding into UTF-8 before parsing.
      
      Internally, there were some pretty big changes:
      
        - FastStrings are now stored in UTF-8
      
        - Z-encoding has been moved right to the back end.  Previously we
          used to Z-encode every identifier on the way in for simplicity,
          and only decode when we needed to show something to the user.
          Instead, we now keep every string in its UTF-8 encoding, and
          Z-encode right before printing it out.  To avoid Z-encoding the
          same string multiple times, the Z-encoding is cached inside the
          FastString the first time it is requested.
      
          This speeds up the compiler - I've measured some definite
          improvement in parsing at least, and I expect compilations overall
          to be faster too.  It also cleans up a lot of cruft from the
          OccName interface.  Z-encoding is nicely hidden inside the
          Outputable instance for Names & OccNames now.
      
        - StringBuffers are UTF-8 too, and are now represented as
          ForeignPtrs.
      
        - I've put together some test cases, not by any means exhaustive,
          but there are some interesting UTF-8 decoding error cases that
          aren't obvious.  Also, take a look at unicode001.hs for a demo.
      9d7da331
  31. 02 Aug, 2005 1 commit
  32. 18 Mar, 2005 2 commits
    • simonmar's avatar
      [project @ 2005-03-18 13:37:27 by simonmar] · d1c1b7d0
      simonmar authored
      Flags cleanup.
      
      Basically the purpose of this commit is to move more of the compiler's
      global state into DynFlags, which is moving in the direction we need
      to go for the GHC API which can have multiple active sessions
      supported by a single GHC instance.
      
      Before:
      
      $ grep 'global_var' */*hs | wc -l
           78
      
      After:
      
      $ grep 'global_var' */*hs | wc -l
           27
      
      Well, it's an improvement.  Most of what's left won't really affect
      our ability to host multiple sessions.
      
      Lots of static flags have become dynamic flags (yay!).  Notably lots
      of flags that we used to think of as "driver" flags, like -I and -L,
      are now dynamic.  The most notable static flags left behind are the
      "way" flags, eg. -prof.  It would be nice to fix this, but it isn't
      urgent.
      
      On the way, lots of cleanup has happened.  Everything related to
      static and dynamic flags lives in StaticFlags and DynFlags
      respectively, and they share a common command-line parser library in
      CmdLineParser.  The flags related to modes (--makde, --interactive
      etc.) are now private to the front end: in fact private to Main
      itself, for now.
      d1c1b7d0
    • simonmar's avatar
      [project @ 2005-03-18 11:19:27 by simonmar] · 6a51f7df
      simonmar authored
      merge rev. 1.6.2.1, simplified slightly:
      
        Initialise a CostCentreStack by generating SIZEOF_CostCentreStack
        (gotten from the C compiler) zeros, padded to the nearest word.
        Improves on the previous fixes for unpredictable padding (see comment).
      6a51f7df
  33. 17 Mar, 2005 1 commit
  34. 28 Jan, 2005 1 commit
    • simonmar's avatar
      [project @ 2005-01-28 12:55:17 by simonmar] · 153b9cb9
      simonmar authored
      Rationalise the BUILD,HOST,TARGET defines.
      
      Recall that:
      
        - build is the platform we're building on
        - host is the platform we're running on
        - target is the platform we're generating code for
      
      The change is that now we take these definitions as applying from the
      point of view of the particular source code being built, rather than
      the point of view of the whole build tree.
      
      For example, in RTS and library code, we were previously testing the
      TARGET platform.  But under the new rule, the platform on which this
      code is going to run is the HOST platform.  TARGET only makes sense in
      the compiler sources.
      
      In practical terms, this means that the values of BUILD, HOST & TARGET
      may vary depending on which part of the build tree we are in.
      
      Actual changes:
      
       - new file: includes/ghcplatform.h contains platform defines for
         the RTS and library code.
      
       - new file: includes/ghcautoconf.h contains the autoconf settings
         only (HAVE_BLAH).  This is so that we can get hold of these
         settings independently of the platform defines when necessary
         (eg. in GHC).
      
       - ghcconfig.h now #includes both ghcplatform.h and ghcautoconf.h.
      
       - MachRegs.h, which is included into both the compiler and the RTS,
         now has to cope with the fact that it might need to test either
         _TARGET_ or _HOST_ depending on the context.
      
       - the compiler's Makefile now generates
           stage{1,2,3}/ghc_boot_platform.h
         which contains platform defines for the compiler.  These differ
         depending on the stage, of course: in stage2, the HOST is the
         TARGET of stage1.  This was wrong before.
      
       - The compiler doesn't get platform info from Config.hs any more.
         Previously it did (sometimes), but unless we want to generate
         a new Config.hs for each stage we can't do this.
      
       - GHC now helpfully defines *_{BUILD,HOST}_{OS,ARCH} automatically
         in CPP'd Haskell source.
      
       - ghcplatform.h defines *_TARGET_* for backwards compatibility
         (ghcplatform.h is included by ghcconfig.h, which is included by
         config.h, so code which still #includes config.h will get the TARGET
         settings as before).
      
       - The Users's Guide is updated to mention *_HOST_* rather than
         *_TARGET_*.
      
       - coding-style.html in the commentary now contains a section on
         platform defines.  There are further doc updates to come.
      
      Thanks to Wolfgang Thaller for pointing me in the right direction.
      153b9cb9
  35. 26 Nov, 2004 1 commit
    • simonmar's avatar
      [project @ 2004-11-26 16:19:45 by simonmar] · ef5b4b14
      simonmar authored
      Further integration with the new package story.  GHC now supports
      pretty much everything in the package proposal.
      
        - GHC now works in terms of PackageIds (<pkg>-<version>) rather than
          just package names.  You can still specify package names without
          versions on the command line, as long as the name is unambiguous.
      
        - GHC understands hidden/exposed modules in a package, and will refuse
          to import a hidden module.  Also, the hidden/eposed status of packages
          is taken into account.
      
        - I had to remove the old package syntax from ghc-pkg, backwards
          compatibility isn't really practical.
      
        - All the package.conf.in files have been rewritten in the new syntax,
          and contain a complete list of modules in the package.  I've set all
          the versions to 1.0 for now - please check your package(s) and fix the
          version number & other info appropriately.
      
        - New options:
      
      	-hide-package P    sets the expose flag on package P to False
      	-ignore-package P  unregisters P for this compilation
      
      	For comparison, -package P sets the expose flag on package P
              to True, and also causes P to be linked in eagerly.
      
              -package-name is no longer officially supported.  Unofficially, it's
      	a synonym for -ignore-package, which has more or less the same effect
      	as -package-name used to.
      
      	Note that a package may be hidden and yet still be linked into
      	the program, by virtue of being a dependency of some other package.
      	To completely remove a package from the compiler's internal database,
              use -ignore-package.
      
      	The compiler will complain if any two packages in the
              transitive closure of exposed packages contain the same
              module.
      
      	You *must* use -ignore-package P when compiling modules for
              package P, if package P (or an older version of P) is already
              registered.  The compiler will helpfully complain if you don't.
      	The fptools build system does this.
      
         - Note: the Cabal library won't work yet.  It still thinks GHC uses
           the old package config syntax.
      
      Internal changes/cleanups:
      
         - The ModuleName type has gone away.  Modules are now just (a
           newtype of) FastStrings, and don't contain any package information.
           All the package-related knowledge is in DynFlags, which is passed
           down to where it is needed.
      
         - DynFlags manipulation has been cleaned up somewhat: there are no
           global variables holding DynFlags any more, instead the DynFlags
           are passed around properly.
      
         - There are a few less global variables in GHC.  Lots more are
           scheduled for removal.
      
         - -i is now a dynamic flag, as are all the package-related flags (but
           using them in {-# OPTIONS #-} is Officially Not Recommended).
      
         - make -j now appears to work under fptools/libraries/.  Probably
           wouldn't take much to get it working for a whole build.
      ef5b4b14
  36. 10 Nov, 2004 1 commit
  37. 30 Sep, 2004 1 commit
    • simonpj's avatar
      [project @ 2004-09-30 10:35:15 by simonpj] · 23f40f0e
      simonpj authored
      ------------------------------------
      	Add Generalised Algebraic Data Types
      	------------------------------------
      
      This rather big commit adds support for GADTs.  For example,
      
          data Term a where
       	  Lit :: Int -> Term Int
      	  App :: Term (a->b) -> Term a -> Term b
      	  If  :: Term Bool -> Term a -> Term a
      	  ..etc..
      
          eval :: Term a -> a
          eval (Lit i) = i
          eval (App a b) = eval a (eval b)
          eval (If p q r) | eval p    = eval q
          		    | otherwise = eval r
      
      
      Lots and lots of of related changes throughout the compiler to make
      this fit nicely.
      
      One important change, only loosely related to GADTs, is that skolem
      constants in the typechecker are genuinely immutable and constant, so
      we often get better error messages from the type checker.  See
      TcType.TcTyVarDetails.
      
      There's a new module types/Unify.lhs, which has purely-functional
      unification and matching for Type. This is used both in the typechecker
      (for type refinement of GADTs) and in Core Lint (also for type refinement).
      23f40f0e