1. 15 Jun, 2000 3 commits
    • [project @ 2000-06-15 11:17:41 by sewardj] · 0779a545
      sewardj authored
      Emit slightly better x86 floating point code for comparisons, +, -,
      * and /, in the common case where one of the source fake FP regs
      is the same as the destination reg.
    • [project @ 2000-06-15 09:24:49 by rrt] · 8c0949c9
      rrt authored
      Fixed typo: .hs left out of .SUFFIXES list.
    • [project @ 2000-06-15 08:38:25 by sewardj] · 665229e5
      sewardj authored
      Major thing: new register allocator.  Brief description follows.
      Should correctly handle code with loops in, even though we don't
      generate any such at the moment.  A lot of comments.  The previous
      machinery for spilling is retained, as is the idea of a fast-and-easy
      initial allocation attempt intended to deal with the majority of code
      blocks (about 60% on x86) very cheaply.  Many comments explaining
      in detail how it works :-)
      The Stix inliner is now on by default.  Integer code seems to run
      within about 1% of -fvia-C code.  x86 fp code is significantly worse,
      up to about 30% slower, depending on the amount of fp activity.
      Minor thing: lazyfication of the top-level NCG plumbing, so that the
      NCG doesn't require any greater residency than compiling to C, just a
      bit more time.  Created lazyThenUs and lazyMapUs for this purpose.
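The lazification idea can be sketched as follows. This is a minimal illustration, not the combinators as committed: `Supply` is a hypothetical stand-in (a plain counter) for the compiler's unique supply, and `freshTag` is an invented helper. The point is only that binding the result pair with a lazy `let` lets a consumer pull results one at a time instead of forcing the whole list first.

```haskell
-- Hypothetical stand-in for a unique supply: just a counter.
type Supply = Int

-- A computation that threads the supply through.
type UniqSM a = Supply -> (a, Supply)

returnUs :: a -> UniqSM a
returnUs x us = (x, us)

-- Lazy bind: the let-bound pair is an irrefutable pattern, so the first
-- computation is only run when its result or its supply is demanded.
lazyThenUs :: UniqSM a -> (a -> UniqSM b) -> UniqSM b
lazyThenUs m k us =
  let (r, us') = m us
  in  k r us'

-- Lazy map: the head of the result list is available before the tail
-- has been computed.
lazyMapUs :: (a -> UniqSM b) -> [a] -> UniqSM [b]
lazyMapUs _ []       = returnUs []
lazyMapUs f (x : xs) =
  f x            `lazyThenUs` \r  ->
  lazyMapUs f xs `lazyThenUs` \rs ->
  returnUs (r : rs)

-- Invented helper: pair a value with the next number from the supply.
freshTag :: a -> UniqSM (a, Int)
freshTag x us = ((x, us), us + 1)
```

Because of the lazy binding, `take 3 (fst (lazyMapUs freshTag [0 ..] 100))` terminates even though the input list is infinite; a bind that pattern-matched strictly on the pair would not.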
      The new allocator is somewhat, although not catastrophically, slower
      than the old one.  Fixing of the long-standing NCG space leak more
      than makes up for it; overall hsc run-time is down about 5%, due to
      significantly reduced GC time.
      Instructions are numbered sequentially, starting at zero.
      A flow edge (FE) is a pair of insn numbers (MkFE Int Int) denoting
      a possible flow of control from the first insn to the second.
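As a rough illustration of flow edges over sequentially numbered instructions, here is a small sketch. The `Insn` type is a hypothetical, heavily simplified stand-in (the real NCG instruction types are far richer), and `flowEdges` is an invented name; only `MkFE Int Int` comes from the description above.

```haskell
-- Hypothetical, simplified instruction type for illustration only.
data Insn
  = Plain          -- ordinary insn: control falls through to the next one
  | Jump Int       -- unconditional jump to the given insn number
  | CondJump Int   -- may jump to the given insn number or fall through
  | Ret            -- control leaves the code block
  deriving (Eq, Show)

-- A flow edge: control may pass from the first insn number to the second.
data FE = MkFE Int Int
  deriving (Eq, Ord, Show)

-- Number the instructions from zero and list every possible flow edge.
flowEdges :: [Insn] -> [FE]
flowEdges insns = concat (zipWith edgesFrom [0 ..] insns)
  where
    n = length insns
    -- A fall-through edge exists only if there is a next instruction.
    fallThrough i = [MkFE i (i + 1) | i + 1 < n]
    edgesFrom i Plain        = fallThrough i
    edgesFrom i (Jump t)     = [MkFE i t]
    edgesFrom i (CondJump t) = MkFE i t : fallThrough i
    edgesFrom _ Ret          = []
```

For example, `flowEdges [Plain, CondJump 0, Ret]` yields `[MkFE 0 1, MkFE 1 0, MkFE 1 2]`: a small loop between insns 0 and 1, plus the exit path to insn 2.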
      The input to the register allocator is a list of instructions, which
      mention Regs.  A Reg can be a RealReg -- a real machine reg -- or a
      VirtualReg, which carries a unique.  After allocation, all the
      VirtualReg references will have been converted into RealRegs, and
      possibly some spill code will have been inserted.
      The heart of the register allocator works in four phases.
      1.  (find_flow_edges) Calculate all the FEs for the code list.
          Return them not as a [FE], but implicitly, as a pair of
          Array Int [Int], being the successor and predecessor maps
          for instructions.
      2.  (calc_liveness) Returns a FiniteMap FE RegSet.  For each
          FE, indicates the set of registers live on that FE.  Note
          that the set includes both RealRegs and VirtualRegs.  The
          former appear because the code could mention fixed register
          usages, and we need to take them into account from the start.
      3.  (calc_live_range_sets) Invert the above mapping, giving a
          FiniteMap Reg FeSet, indicating, for each virtual and real
          reg mentioned in the code, which FEs it is live on.
      4.  (calc_vreg_to_rreg_mapping) For each virtual reg, try to find
          an allocatable real register for it.  Each real register has
          a "current commitment", indicating the set of FEs it is
          currently live on.  A virtual reg v can be assigned to
          real reg r iff v's live-fe-set does not intersect with r's
          current commitment fe-set.  If the assignment is made,
          v's live-fe-set is union'd into r's current commitment fe-set.
          There is also the minor restriction that v and r must be of
          the same register class (integer or floating).
          Once this mapping is established, we simply apply it to the
          input insns, and that's it.
          If no suitable real register can be found, the vreg is mapped
          to itself, and we deem allocation to have failed.  The partially
          allocated code is returned.  The higher echelons of the allocator
          (doGeneralAlloc and runRegAlloc) then cooperate to insert spill
          code and re-run allocation, until a successful allocation is found.
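Phases 3 and 4 above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the allocator's actual code: it uses `Data.Map` and `Data.Set` from the containers library in place of `FiniteMap` and the real set types, identifies registers by a class and an `Int`, and the camel-case function names are invented to mirror the phase names. Spilling and the retry loop are left out.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import Data.List (foldl')

data FE = MkFE Int Int
  deriving (Eq, Ord, Show)

data RegClass = IntClass | FloatClass
  deriving (Eq, Ord, Show)

-- Hypothetical simplification: a register is a class plus a number.
data Reg = RealReg RegClass Int | VirtualReg RegClass Int
  deriving (Eq, Ord, Show)

type FeSet = S.Set FE

regClass :: Reg -> RegClass
regClass (RealReg c _)    = c
regClass (VirtualReg c _) = c

isVirtual :: Reg -> Bool
isVirtual (VirtualReg _ _) = True
isVirtual _                = False

-- Phase 3: invert "FE -> live regs" into "reg -> FEs it is live on".
calcLiveRangeSets :: M.Map FE (S.Set Reg) -> M.Map Reg FeSet
calcLiveRangeSets liveness =
  M.fromListWith S.union
    [ (r, S.singleton fe) | (fe, regs) <- M.toList liveness, r <- S.toList regs ]

-- Phase 4: greedily give each virtual reg the first allocatable real reg
-- of the same class whose current commitment does not intersect the
-- virtual reg's live-FE set.  A vreg that cannot be placed maps to itself.
calcVregToRregMapping :: [Reg] -> M.Map Reg FeSet -> M.Map Reg Reg
calcVregToRregMapping allocatable ranges =
  snd (foldl' assign (initial, M.empty) vregs)
  where
    vregs = [ (v, fes) | (v, fes) <- M.toList ranges, isVirtual v ]
    -- Real regs mentioned in the code start out already committed.
    initial = M.fromList
      [ (r, M.findWithDefault S.empty r ranges) | r <- allocatable ]
    free fes commit r =
      S.null (S.intersection fes (M.findWithDefault S.empty r commit))
    assign (commit, mapping) (v, fes) =
      case [ r | r <- allocatable, regClass r == regClass v, free fes commit r ] of
        (r : _) -> (M.insertWith S.union r fes commit, M.insert v r mapping)
        []      -> (commit, M.insert v v mapping)   -- allocation failed for v
```

In this sketch a failed allocation shows up as a virtual reg mapped to itself; a caller would detect that, insert spill code, and run the phases again, which corresponds to the cooperation between doGeneralAlloc and runRegAlloc described above.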
  2. 14 Jun, 2000 13 commits
  3. 13 Jun, 2000 24 commits
    • [project @ 2000-06-13 16:10:00 by simonmar] · 0a423b55
      simonmar authored
      forgot one file
    • [project @ 2000-06-13 16:07:20 by simonmar] · 877aad48
      simonmar authored
      New Driver
      Most things work now, so I'm committing this for a shake down.
      Doubtless there'll be some breakage but things should be back to
      normal by the end of the week.
      NOTE: GHC 4.06 won't work to build this driver at the moment, due to a
      bug in its parser.  I'll commit a workaround shortly.
      There are several improvements here:
      	- the driver is written in Haskell, so is allegedly
      	  more maintainable than the previous one.  It's a bit shorter,
      	  at any rate.
      	- the package system has been generalised, so that eg.
      	  the RTS is a package, as is GMP and the prelude.  Packages
      	  are now configured via a configuration file, package.conf.
      	  Two versions of package.conf are automatically generated by
      	  PackageSrc.hs, one for ghc-inplace and one for the installed ghc.
      	- So that we only have to build the driver once, there's some
      	  special hackery to deal with locations of utilities, and
      	  other configuration stuff:
      	  ghc now has a -B option, which is used in a similar way
      	  to gcc's.  eg.
      		ghc -B/home/blah/fptools
      	  will run ghc in-place in the specified fptools tree, using
      	  /home/blah/fptools/ghc/utils/mkdependHS to find mkdependHS
      	  for example.  ghc-inplace is now a small shell script that
      	  simply invokes the above.  Whereas
      		ghc -B/usr/local/lib/ghc-4.07
      	  also works, for an installed copy of ghc in
      	- the mangler, object splitter and GC stats gatherer are separate
      	  scripts in subdirectories of ghc/driver.  ghc-asm.lprl and
      	  ghc-split.lprl have been copied in the CVS repository to maintain
      	  the history (fingers crossed; I've never done this before)
      Other notes:
      	- Java support isn't there yet.  Andy: don't update for the time
      	  being until I can sort this.
      	- Windows support is also broken, but will be fixed in due course.
    • [project @ 2000-06-13 15:35:29 by sof] · eab8ac17
      sof authored
      x86: Relativise register table offsets for Hp, R1, R2 and SpA
    • [project @ 2000-06-13 15:35:29 by sof] · 8da2e6d8
      sof authored
      x86: Catch fast entry points fallthroughs via %esi and %edi
    • [project @ 2000-06-13 15:35:29 by sof] · 56f7d139
      sof authored
      m68k-*-nextstep3 updates
    • [project @ 2000-06-13 15:35:29 by sof] · 64d3966d
      sof authored
      When doing -monly-x-regs, fix up entry and exit from PerformGC_wrapper
    • [project @ 2000-06-13 15:35:29 by sof] · 8ef57b89
      sof authored
      Tweaked __fexp regexps
    • [project @ 2000-06-13 15:35:29 by sof] · fe4c050b
      sof authored
      PPC updates
    • [project @ 2000-06-13 15:35:29 by sof] · 90840304
      sof authored
      On mingw32, which is the only 'platform' where we support producing
      DLLs, prefix each static closure with a zero word. This is needed so
      that we can distinguish between pointers to (reversed!) info tables
      and static closures just by checking whether there's a zero word just
      above the pointed-to entity. Wish there was a better way..
    • [project @ 2000-06-13 15:35:29 by sof] · 54077fbc
      sof authored
      HPUX fix to allow non-empty consistency chunks to pass through OK
    • [project @ 2000-06-13 15:35:29 by sof] · cc962601
      sof authored
      Groks output from cygwin32-gcc
    • [project @ 2000-06-13 15:35:29 by sof] · 7f4068af
      sof authored
      Fixed consist pattern for cygwin32
    • [project @ 2000-06-13 15:35:29 by sof] · c625c85f
      sof authored
      - include mingw32 in the list of x86 platforms supported.
      - weed out ecoff debug information.
    • [project @ 2000-06-13 15:35:29 by simonpj] · 77bdc0d1
      simonpj authored
      More small changes towards 2.02
    • [project @ 2000-06-13 15:35:29 by simonmar] · c87a6127
      simonmar authored
      put SRTs in the text section.
    • [project @ 2000-06-13 15:35:29 by simonmar] · a18ef0ff
      simonmar authored
      gcc 2.95 on Sparc changed the assembly output slightly.  This should
      fix it.
    • [project @ 2000-06-13 15:35:29 by simonmar] · aa2cadf5
      simonmar authored
      freebsd3 ==> freebsd
    • [project @ 2000-06-13 15:35:29 by simonmar] · 4ccbb70f
      simonmar authored
      Push directives over literal chunks when attempting to move them to
      the following chunk on x86.  Occasionally gcc generates a .glob
      directive some distance before the symbol it refers to, and we were
      ending up with a whole load of .glob directives attached to strings,
      and duplicated in each .o file when splitting.
      This change reduces the size of my libHSstd_p.a from 43M (!!!) to 9M.
      I think this problem must have appeared with gcc 2.95.2, but it's a
      little strange that I didn't notice it until now.
    • [project @ 2000-06-13 15:35:29 by simonmar] · c71969ee
      simonmar authored
      Oops, back out most of last revision.  Other changes crept in by mistake.
    • [project @ 2000-06-13 15:35:29 by simonmar] · cd7b1451
      simonmar authored
      Make object file splitting simpler, in preparation for conversion to
      the new driver.
      The "inject split markers" phase is now omitted; instead we generate
      the split markers directly.
      Driver: also removed now-defunct -fpedantic-bottoms flag.
    • [project @ 2000-06-13 15:35:29 by simonmar] · a25cf6cb
      simonmar authored
      Fix a bug in the previous commit: some .globls were getting thrown away.
    • [project @ 2000-06-13 15:35:29 by simonmar] · 06862d5c
      simonmar authored
      Fix -monly-3-regs problem.
    • [project @ 2000-06-13 15:35:29 by simonmar] · 65b0474c
      simonmar authored
      Crude allocation-counting extension to ticky-ticky profiling.
      Allocations are counted against the closest lexically enclosing
      function closure, so you need to map the output back to the STG code.
    • [project @ 2000-06-13 15:35:29 by simonmar] · 266047a8
      simonmar authored
      Change the convention for cost-centre labels to be <name>_cc and
      cost-centre stacks to be <name>_ccs.  This makes cost-centre labels
      more consistent with our other naming conventions, and fixes some
      problems caused by cost-centre labels being misinterpreted by the
      This fixes one cause of profiled programs crashing; if you're seeing
      this symptom then this patch may help.