1. 15 Jun, 2000 9 commits
    • [project @ 2000-06-15 20:22:53 by panne] · cc0a715a
      panne authored
      Quick workaround for Reuben's M$ configuration problems
    • [project @ 2000-06-15 15:56:51 by simonmar] · 9f60c2c4
      simonmar authored
      sigh, fix the ordering of the phases so that splitting works again.
    • [project @ 2000-06-15 13:23:51 by daan] · 3d124552
      daan authored
      Added new primitives and bytecodes that support
      code generation for XMLambda. All additions are
      surrounded by #ifdef XMLAMBDA.
      
      Most important additions:
      - Rows (n-tuples), which are implemented on top of Frozen Mutarrays
      - Inj (variant sums), which is implemented using a new constructor
        called Inj which contains both the value and an unboxed int
        which represents the index.
    • [project @ 2000-06-15 13:18:08 by daan] · b619d74d
      daan authored
      Added definition of int64 to make it compilable with both gcc and VisualC++
      
      Added functions to the bytecode assembler that
      support code generation for XMLambda. All additions for
      XMLambda are surrounded by #ifdef XMLAMBDA.
    • [project @ 2000-06-15 13:16:16 by daan] · d188050a
      daan authored
      Added definition of int64 to make it compilable with both gcc and VisualC++
    • [project @ 2000-06-15 11:50:14 by simonmar] · 93c0c44f
      simonmar authored
      urk! the arity of a record selector Id didn't take into account any
      dictionary arguments due to the context on the datatype...
      
      (fixes bug on H/OpenGL reported by Sven)
    • [project @ 2000-06-15 11:17:41 by sewardj] · 0779a545
      sewardj authored
      Emit slightly better x86 floating point code for comparisons, +, -,
      * and /, in the common case where one of the source fake FP regs
      is the same as the destination reg.
    • [project @ 2000-06-15 09:24:49 by rrt] · 8c0949c9
      rrt authored
      Fixed typo: .hs left out of .SUFFIXES list.
    • [project @ 2000-06-15 08:38:25 by sewardj] · 665229e5
      sewardj authored
      Major thing: new register allocator.  Brief description follows.
      Should correctly handle code with loops in, even though we don't
      generate any such at the moment.  A lot of comments.  The previous
      machinery for spilling is retained, as is the idea of a fast-and-easy
      initial allocation attempt intended to deal with the majority of code
      blocks (about 60% on x86) very cheaply.  Many comments explaining
      in detail how it works :-)
      
      The Stix inliner is now on by default.  Integer code seems to run
      within about 1% of that of -fvia-C.  x86 fp code is significantly worse,
      up to about 30% slower, depending on the amount of fp activity.
      
      Minor thing: lazyfication of the top-level NCG plumbing, so that the
      NCG doesn't require any greater residency than compiling to C, just a
      bit more time.  Created lazyThenUs and lazyMapUs for this purpose.
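A minimal sketch of what lazy combinators over a splittable unique supply can look like. The names lazyThenUs and lazyMapUs come from the message above; the Supply type, mkSupply, getUniq and label are hypothetical stand-ins invented for this illustration, and the definitions are an assumption about the general technique, not the code that was committed.

```haskell
-- Toy unique supply: an infinite binary tree of distinct Ints.
-- (Hypothetical stand-in; not the compiler's own supply type.)
data Supply = Supply Int Supply Supply

-- Heap-style numbering: starting from 1, every node gets a distinct Int.
mkSupply :: Int -> Supply
mkSupply n = Supply n (mkSupply (2 * n)) (mkSupply (2 * n + 1))

-- A computation that consumes a supply.  No supply is threaded back
-- out, so nothing forces one step to finish before the next starts.
type UniqM a = Supply -> a

getUniq :: UniqM Int
getUniq (Supply n _ _) = n

-- Lazy bind: split the supply and hand one half to each side.
lazyThenUs :: UniqM a -> (a -> UniqM b) -> UniqM b
lazyThenUs m k (Supply _ s1 s2) = k (m s1) s2

-- Lazy map: each result is produced only when the consumer demands it,
-- so the whole output list never has to be resident at once.
lazyMapUs :: (a -> UniqM b) -> [a] -> UniqM [b]
lazyMapUs _ [] _ = []
lazyMapUs f (x : xs) (Supply _ s1 s2) = f x s1 : lazyMapUs f xs s2

-- Example worker: tag an item with a fresh unique.
label :: a -> UniqM (a, Int)
label x = lazyThenUs getUniq (\u _ -> (x, u))
```

Because the supply is split rather than threaded, an expression such as take 3 (lazyMapUs label [10 ..] (mkSupply 1)) terminates even though the input list is infinite, which is the property that keeps residency low.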
      
      The new allocator is somewhat, although not catastrophically, slower
      than the old one.  Fixing of the long-standing NCG space leak more
      than makes up for it; overall hsc run-time is down about 5%, due to
      significantly reduced GC time.
      
      --------------------------------------------------------------------
      
      Instructions are numbered sequentially, starting at zero.
      
      A flow edge (FE) is a pair of insn numbers (MkFE Int Int) denoting
      a possible flow of control from the first insn to the second.
      
      The input to the register allocator is a list of instructions, which
      mention Regs.  A Reg can be a RealReg -- a real machine reg -- or a
      VirtualReg, which carries a unique.  After allocation, all the
      VirtualReg references will have been converted into RealRegs, and
      possibly some spill code will have been inserted.
      
      The heart of the register allocator works in four phases.
      
      1.  (find_flow_edges) Calculate all the FEs for the code list.
          Return them not as a [FE], but implicitly, as a pair of
          Array Int [Int], being the successor and predecessor maps
          for instructions.
      
      2.  (calc_liveness) Returns a FiniteMap FE RegSet.  For each
          FE, indicates the set of registers live on that FE.  Note
          that the set includes both RealRegs and VirtualRegs.  The
          former appear because the code could mention fixed register
          usages, and we need to take them into account from the start.
      
      3.  (calc_live_range_sets) Invert the above mapping, giving a
          FiniteMap Reg FeSet, indicating, for each virtual and real
          reg mentioned in the code, which FEs it is live on.
      
      4.  (calc_vreg_to_rreg_mapping) For each virtual reg, try and find
          an allocatable real register for it.  Each real register has
          a "current commitment", indicating the set of FEs it is
          currently live on.  A virtual reg v can be assigned to
          real reg r iff v's live-fe-set does not intersect with r's
          current commitment fe-set.  If the assignment is made,
          v's live-fe-set is union'd into r's current commitment fe-set.
          There is also the minor restriction that v and r must be of
          the same register class (integer or floating).
      
          Once this mapping is established, we simply apply it to the
          input insns, and that's it.
      
          If no suitable real register can be found, the vreg is mapped
          to itself, and we deem allocation to have failed.  The partially
          allocated code is returned.  The higher echelons of the allocator
          (doGeneralAlloc and runRegAlloc) then cooperate to insert spill
          code and re-run allocation, until a successful allocation is found.
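Phases 3 and 4 above can be sketched roughly as follows. This is an illustration under stated assumptions, not the committed allocator: it uses Data.Map and Data.Set from the containers package in place of FiniteMap, the names liveRangeSets and assignRegs are hypothetical, real registers are identified by a plain Int, and each real register's initial commitment is assumed to be its own live range from phase 3.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set
import Data.List (find)

-- A flow edge: control may pass from the first insn number to the second.
data FE = MkFE Int Int deriving (Eq, Ord, Show)

data RegClass = IntClass | FloatClass deriving (Eq, Ord, Show)

-- Real regs are named by number here (an assumption); virtual regs
-- carry a unique.
data Reg = RealReg RegClass Int | VirtualReg RegClass Int
  deriving (Eq, Ord, Show)

type FeSet = Set.Set FE

regClass :: Reg -> RegClass
regClass (RealReg c _) = c
regClass (VirtualReg c _) = c

isVirtual :: Reg -> Bool
isVirtual (VirtualReg _ _) = True
isVirtual _ = False

-- Phase 3: invert "FE -> live regs" into "reg -> FEs it is live on".
liveRangeSets :: Map.Map FE (Set.Set Reg) -> Map.Map Reg FeSet
liveRangeSets liveness =
  Map.fromListWith Set.union
    [ (r, Set.singleton fe)
    | (fe, regs) <- Map.toList liveness
    , r <- Set.toList regs
    ]

-- Phase 4: greedily map each virtual reg to the first allocatable real
-- reg of the same class whose current commitment does not intersect the
-- vreg's live range.  A vreg with no such reg is mapped to itself and
-- the Bool result becomes False (allocation failed; spilling needed).
assignRegs :: [Reg] -> Map.Map Reg FeSet -> (Map.Map Reg Reg, Bool)
assignRegs allocatable ranges = (mapping, ok)
  where
    -- Assumed starting point: fixed uses of real regs found in phase 3.
    initialCommit =
      Map.fromList
        [ (r, Map.findWithDefault Set.empty r ranges) | r <- allocatable ]
    vregs = [ (v, fes) | (v, fes) <- Map.toList ranges, isVirtual v ]
    (mapping, _, ok) = foldl step (Map.empty, initialCommit, True) vregs

    step (m, commit, good) (v, fes) =
      case find (fits v fes commit) allocatable of
        Just r ->
          (Map.insert v r m, Map.insertWith Set.union r fes commit, good)
        Nothing -> (Map.insert v v m, commit, False)

    fits v fes commit r =
      regClass r == regClass v
        && Set.null
             (Set.intersection fes (Map.findWithDefault Set.empty r commit))
```

A caller would feed the phase 2 liveness map through liveRangeSets, pass the result to assignRegs together with the list of allocatable real registers, and, when the returned flag is False, insert spill code and try again, as the message describes for doGeneralAlloc and runRegAlloc.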
  2. 14 Jun, 2000 13 commits
  3. 13 Jun, 2000 18 commits
    • [project @ 2000-06-13 16:10:00 by simonmar] · 0a423b55
      simonmar authored
      forgot one file
    • [project @ 2000-06-13 16:07:20 by simonmar] · 877aad48
      simonmar authored
      New Driver
      ==========
      
      Most things work now, so I'm committing this for a shake down.
      Doubtless there'll be some breakage but things should be back to
      normal by the end of the week.
      
      NOTE: GHC 4.06 won't work to build this driver at the moment, due to a
      bug in its parser.  I'll commit a workaround shortly.
      
      There are several improvements here:
      
      	- the driver is written in Haskell, so is allegedly
      	  more maintainable than the previous one.  It's a bit shorter,
      	  at any rate.
      
      	- the package system has been generalised, so that eg.
      	  the RTS is a package, as is GMP and the prelude.  Packages
      	  are now configured via a configuration file, package.conf.
      	  Two versions of package.conf are automatically generated by
      	  PackageSrc.hs, one for ghc-inplace and one for the installed ghc.
      
      	- So that we only have to build the driver once, there's some
      	  special hackery to deal with locations of utilities, and
      	  other configuration stuff:
      
      	  ghc now has a -B option, which is used in a similar way
      	  to gcc's.  eg.
      
      		ghc -B/home/blah/fptools
      
      	  will run ghc in-place in the specified fptools tree, using
      	  /home/blah/fptools/ghc/utils/mkdependHS to find mkdependHS
      	  for example.  ghc-inplace is now a small shell script that
      	  simply invokes the above.  Whereas
      
      		ghc -B/usr/local/lib/ghc-4.07
      
      	  also works, for an installed copy of ghc in
      	  /usr/local/lib/ghc-4.07.
      
      	- the mangler, object splitter and GC stats gatherer are separate
      	  scripts in subdirectories of ghc/driver.  ghc-asm.lprl and
      	  ghc-split.lprl have been copied in the CVS repository to maintain
      	  the history (fingers crossed; I've never done this before)
      
      
      Other notes:
      
      	- Java support isn't there yet.  Andy: don't update for the time
      	  being until I can sort this.
      
      	- Windows support is also broken, but will be fixed in due course.
    • [project @ 2000-06-13 15:35:29 by sof] · eab8ac17
      sof authored
      x86: Relativise register table offsets for Hp, R1, R2 and SpA
    • [project @ 2000-06-13 15:35:29 by sof] · 8da2e6d8
      sof authored
      x86: Catch fast entry points fallthroughs via %esi and %edi
    • [project @ 2000-06-13 15:35:29 by sof] · 56f7d139
      sof authored
      m68k-*-nextstep3 updates
    • [project @ 2000-06-13 15:35:29 by sof] · 64d3966d
      sof authored
      When doing -monly-x-regs, fix up entry and exit from PerformGC_wrapper
    • [project @ 2000-06-13 15:35:29 by sof] · 8ef57b89
      sof authored
      Tweaked __fexp regexps
    • [project @ 2000-06-13 15:35:29 by sof] · fe4c050b
      sof authored
      PPC updates
    • [project @ 2000-06-13 15:35:29 by sof] · 90840304
      sof authored
      On mingw32, which is the only 'platform' where we support producing
      DLLs, prefix each static closure with a zero word. This is needed so
      that we can distinguish between pointers to (reversed!) info tables
      and static closures just by checking whether there's a zero word just
      above the pointed-to entity. Wish there was a better way..
    • [project @ 2000-06-13 15:35:29 by sof] · 54077fbc
      sof authored
      HPUX fix to allow non-empty consistency chunks to pass through OK
    • [project @ 2000-06-13 15:35:29 by sof] · cc962601
      sof authored
      Groks output from cygwin32-gcc
    • [project @ 2000-06-13 15:35:29 by sof] · 7f4068af
      sof authored
      Fixed consist pattern for cygwin32
    • [project @ 2000-06-13 15:35:29 by sof] · c625c85f
      sof authored
      - include mingw32 in the list of x86 platforms supported.
      - weed out ecoff debug information.
    • [project @ 2000-06-13 15:35:29 by simonpj] · 77bdc0d1
      simonpj authored
      More small changes towards 2.02
    • [project @ 2000-06-13 15:35:29 by simonmar] · c87a6127
      simonmar authored
      put SRTs in the text section.
    • [project @ 2000-06-13 15:35:29 by simonmar] · a18ef0ff
      simonmar authored
      gcc 2.95 on Sparc changed the assembly output slightly.  This should
      fix it.
    • [project @ 2000-06-13 15:35:29 by simonmar] · aa2cadf5
      simonmar authored
      freebsd3 ==> freebsd
    • [project @ 2000-06-13 15:35:29 by simonmar] · 4ccbb70f
      simonmar authored
      Push directives over literal chunks when attempting to move them to
      the following chunk on x86.  Occasionally gcc generates a .glob
      directive some distance before the symbol it refers to, and we were
      ending up with a whole load of .glob directives attached to strings,
      and duplicated in each .o file when splitting.
      
      This change reduces the size of my libHSstd_p.a from 43M (!!!) to 9M.
      I think this problem must have appeared with gcc 2.95.2, but it's a
      little strange that I didn't notice it until now.