1. 27 Jan, 2020 1 commit
  2. 25 Jan, 2020 1 commit
  3. 13 Sep, 2019 1 commit
  4. 09 Sep, 2019 1 commit
    • Module hierarchy: StgToCmm (#13009) · 447864a9
      Sylvain Henry authored
      Add StgToCmm module hierarchy. Platform modules that are used in several
      other places (NCG, LLVM codegen, Cmm transformations) are put into
  5. 07 Aug, 2019 1 commit
  6. 20 Jun, 2019 1 commit
    • Move 'Platform' to ghc-boot · bff2f24b
      John Ericson authored
      ghc-pkg needs to be aware of platforms so it can figure out which
      subdir within the user package db to use. This is admittedly
      roundabout, but maybe Cabal could use the same notion of a platform as
      GHC to good effect too.
  7. 31 May, 2019 1 commit
  8. 15 Mar, 2019 1 commit
    • PPC NCG: Use liveness information in CmmCall · 83e09d3c
      Peter Trommler authored
      We make liveness information for global registers
      available on `JMP` and `BCTR`, which were the last instructions
      missing. With complete liveness information we do not need to
      reserve global registers in `freeReg` anymore. Moreover we
      assign R9 and R10 to callee saves registers.
      Cleanup by removing `Reg_Su`, which was unused, from `freeReg`
      and removing unused register definitions.
      The calculation of the number of floating point registers is too
      conservative. Just follow X86 and specify the constants directly.
      Overall on PowerPC this results in 0.3 % smaller code size in nofib
      while runtime is slightly better in some tests.
  9. 17 Jan, 2019 1 commit
  10. 16 Jan, 2019 1 commit
  11. 01 Jan, 2019 1 commit
    • PPC NCG: Remove Darwin support · 374e4470
      Peter Trommler authored
      Support for Mac OS X on PowerPC was dropped by Apple years ago. We
      follow suit and remove PowerPC support for Darwin.
      Fixes #16106.
  12. 17 Nov, 2018 1 commit
    • NCG: New code layout algorithm. · 912fd2b6
      Andreas Klebinger authored
      This patch implements a new code layout algorithm.
      It has been tested for x86 and is disabled on other platforms.
      Performance varies slightly by CPU/machine but in general seems to be better
      by around 2%.
      Nofib shows only small differences of about +/- ~0.5% overall depending on
      flags/machine. Performance in other benchmarks improved significantly.
      Other benchmarks include at least the benchmarks of: aeson, vector, megaparsec, attoparsec,
      containers, text and xeno.
      While the magnitude of gains differed, three different CPUs were tested with
      all getting faster although to differing degrees. I tested: Sandy Bridge(Xeon), Haswell,
      * Library benchmark results summarized:
        * containers: ~1.5% faster
        * aeson: ~2% faster
        * megaparsec: ~2-5% faster
        * xml library benchmarks: 0.2%-1.1% faster
        * vector-benchmarks: 1-4% faster
        * text: 5.5% faster
      On average GHC compile times go down, as GHC compiled with the new layout
      is faster than the overhead introduced by using the new layout algorithm,
      Things this patch does:
      * Move code responsible for block layout into its own module.
      * Move the NcgImpl Class into the NCGMonad module.
      * Extract a control flow graph from the input cmm.
      * Update this cfg to keep it in sync with changes during
        asm codegen. This has been tested on x64 but should work on x86.
        Other platforms still use the old codelayout.
      * Assign weights to the edges in the CFG based on type and limited static
        analysis which are then used for block layout.
      * Once we have the final code layout eliminate some redundant jumps.
        In particular turn a sequence of:
            jne .foo
            jmp .bar
            je bar
      Test Plan: ci
      Reviewers: bgamari, jmct, jrtc27, simonmar, simonpj, RyanGlScott
      Reviewed By: RyanGlScott
      Subscribers: RyanGlScott, trommler, jmct, carter, thomie, rwbarton
      GHC Trac Issues: #15124
      Differential Revision: https://phabricator.haskell.org/D4726
  13. 18 Jul, 2018 1 commit
    • stack: fix stack allocations on Windows · d0bbe1bf
      Tamar Christina authored
      On Windows one is not allowed to drop the stack by more than a page size.
      The reason for this is that the OS only allocates enough stack till what
      the TEB specifies. After that a guard page is placed and the rest of the
      virtual address space is unmapped.
      The intention is that doing stack allocations will cause you to hit the
      guard which will then map the next page in and move the guard.  This is
      done to prevent what in the Linux world is known as stack clash
      vulnerabilities https://access.redhat.com/security/cve/cve-2017-1000364.
      There are modules in GHC for which the liveness analysis thinks the
      reserved 8KB of spill slots isn't enough.  One is DynFlags and the
      other is Cabal.
      Though I think the Cabal one is likely a bug:
        4d6544:       81 ec 00 46 00 00       sub    $0x4600,%esp
        4d654a:       8d 85 94 fe ff ff       lea    -0x16c(%ebp),%eax
        4d6550:       3b 83 1c 03 00 00       cmp    0x31c(%ebx),%eax
        4d6556:       0f 82 de 8d 02 00       jb     4ff33a <_cLpg_info+0x7a>
        4d655c:       c7 45 fc 14 3d 50 00    movl   $0x503d14,-0x4(%ebp)
        4d6563:       8b 75 0c                mov    0xc(%ebp),%esi
        4d6566:       83 c5 fc                add    $0xfffffffc,%ebp
        4d6569:       66 f7 c6 03 00          test   $0x3,%si
        4d656e:       0f 85 a6 d7 02 00       jne    503d1a <_cLpb_info+0x6>
        4d6574:       81 c4 00 46 00 00       add    $0x4600,%esp
      It allocates nearly 18KB of spill slots for a simple 4 line function
      and doesn't even use it.  Note that this doesn't happen on x64 or
      when making a validate build.  Only when making a build without a
      validate and build.mk.
      This and the allocation in DynFlags means the stack allocation will jump
      over the guard page into unmapped memory areas and GHC or an end program
      The pagesize on x86 Windows is 4KB which means we hit it very easily for
      these two modules, which explains the total DOA of GHC 32bit for the past
      3 releases and the "random" segfaults on Windows.
      0:000> bp 00503d29
      0:000> gn
      Breakpoint 0 hit
      WARNING: Stack overflow detected. The unwound frames are extracted from outside
               normal stack bounds.
      eax=03b6b9c9 ebx=00dc90f0 ecx=03cac48c edx=03cac43d esi=03b6b9c9 edi=03abef40
      eip=00503d29 esp=013e96fc ebp=03cf8f70 iopl=0         nv up ei pl nz na po nc
      cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202
      00503d29 89442440        mov     dword ptr [esp+40h],eax ss:002b:013e973c=????????
      WARNING: Stack overflow detected. The unwound frames are extracted from outside
               normal stack bounds.
      WARNING: Stack overflow detected. The unwound frames are extracted from outside
               normal stack bounds.
      0:000> !teb
      TEB at 00384000
          ExceptionList:        013effcc
          StackBase:            013f0000
          StackLimit:           013eb000
      This doesn't fix the liveness analysis but does fix the allocations, by
      emitting a function call to `__chkstk_ms` when doing allocations larger
      than a page. This makes sure the stack is probed every page so the kernel
      maps in the next page.
      `__chkstk_ms` is provided by `libGCC`, which is under the
      `GNU runtime exclusion license`, so it's safe to link against it, even for
      proprietary code. (Technically we already do since we link compiled C code in.)
      For allocations smaller than a page we drop the stack and probe the new address.
      This avoids the function call and still makes sure we hit the guard if needed.
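      The choice between the two strategies can be sketched as follows. This is a
      minimal model, not the code generator's implementation: the names
      `StackAlloc`, `planAlloc` and `probeOffsets` are hypothetical, the 4KB page
      size is the x86 Windows value quoted in this message, and treating an
      allocation of exactly one page as the inline case is an assumption.

```haskell
-- Hypothetical model of the allocation strategy described in the commit message.
data StackAlloc
  = ProbeInline Int  -- drop the stack by n bytes, then touch the new address
  | CallChkstk Int   -- call __chkstk_ms, which probes the stack page by page
  deriving (Eq, Show)

-- Page size on x86 Windows, as stated in the commit message.
pageSize :: Int
pageSize = 4096

-- Allocations larger than a page go through __chkstk_ms.
-- Assumption: an allocation of exactly one page uses the inline probe.
planAlloc :: Int -> StackAlloc
planAlloc n
  | n > pageSize = CallChkstk n
  | otherwise    = ProbeInline n

-- Offsets below the old stack pointer that a page-by-page probe would touch:
-- one per full page, then the final address.
probeOffsets :: Int -> [Int]
probeOffsets n = takeWhile (< n) [pageSize, 2 * pageSize ..] ++ [n]
```

      For the 0x4600-byte allocation in the disassembly above, `planAlloc 0x4600`
      selects the `__chkstk_ms` call, and `probeOffsets 0x4600` lists four full
      pages followed by the final address.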
      PS: In case anyone is wondering why we didn't notice this before, it's because we
      only test x86_64 and on Windows 10.  On x86_64 the page size is 8KB and also the
      kernel is a bit more lenient on Windows 10 in that it seems to catch the segfault
      and resize the stack if it was unmapped:
      0:000> t
      eax=03b6b9c9 ebx=00dc90f0 ecx=03cac48c edx=03cac43d esi=03b6b9c9 edi=03abef40
      eip=00503d2d esp=013e96fc ebp=03cf8f70 iopl=0         nv up ei pl nz na po nc
      cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202
      00503d2d 8b461b          mov     eax,dword ptr [esi+1Bh] ds:002b:03b6b9e4=03cac431
      0:000> !teb
      TEB at 00384000
          ExceptionList:        013effcc
          StackBase:            013f0000
          StackLimit:           013e9000
      Likely Windows 10 has a guard page larger than previous versions.
      This fixes the stack allocations, and as soon as I get the time I will look at
      the liveness analysis. I find it highly unlikely that a simple Cabal function
      requires ~2200 spill slots.
      Test Plan: ./validate
      Reviewers: simonmar, bgamari
      Reviewed By: bgamari
      Subscribers: AndreasK, rwbarton, thomie, carter
      GHC Trac Issues: #15154
      Differential Revision: https://phabricator.haskell.org/D4917
  14. 09 Nov, 2017 1 commit
    • Fix PPC NCG after blockID patch · f8e7fece
      Peter Trommler authored
      Commit rGHC8b007ab assigns the same label to the first basic block
      of a proc and to the proc entry point. This violates the PPC 64-bit ELF
      v. 1.9 and v. 2.0 ABIs and leads to duplicate symbols.
      This patch fixes duplicate symbols caused by block labels
      In commit rGHCd7b8da1 an info table label is generated from a block id.
      Getting the entry label from that info label leads to an undefined
      symbol because of a suffix "_entry" that is not present in the block label.
      To fix that issue add a new info table label flavour for labels
      derived from block ids. Converting such a label with toEntryLabel
      produces the original block label.
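      The idea of a separate label flavour can be illustrated with a toy model.
      The `Label` type, its constructors, and `toEntrySketch` below are
      hypothetical stand-ins, not GHC's `CLabel` or `toEntryLabel`. The sketch only
      shows why a block-derived info label must convert back to the bare block
      label instead of gaining an "_entry" suffix.

```haskell
-- Toy label type; constructor names are illustrative, not GHC's CLabel.
data Label
  = InfoLabel String       -- info table of a proc, printed as <name>_info
  | BlockInfoLabel String  -- info table label derived from a block id
  | EntryLabel String      -- printed as <name>_entry
  | BlockLabel String      -- printed as the bare block name
  deriving (Eq, Show)

-- Convert an info label to the label of the code it describes.
toEntrySketch :: Label -> Label
toEntrySketch (InfoLabel n)      = EntryLabel n
toEntrySketch (BlockInfoLabel n) = BlockLabel n  -- no "_entry" suffix here
toEntrySketch l                  = l

-- Print a label the way it would appear as an assembler symbol in this model.
render :: Label -> String
render (InfoLabel n)      = n ++ "_info"
render (BlockInfoLabel n) = n ++ "_info"
render (EntryLabel n)     = n ++ "_entry"
render (BlockLabel n)     = n
```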
      Fixes #14311
      Test Plan: ./validate
      Reviewers: austin, bgamari, simonmar, erikd, hvr, angerman
      Reviewed By: bgamari
      Subscribers: rwbarton, thomie
      GHC Trac Issues: #14311
      Differential Revision: https://phabricator.haskell.org/D4149
  15. 02 Nov, 2017 1 commit
    • PPC NCG: Impl branch prediction, atomic ops. · 1130c67b
      Peter Trommler authored
      Implement AtomicRMW ops, atomic read, atomic write
      in PowerPC native code generator. Also implement
      branch prediction because we need it in atomic ops
      This patch improves the issue in #12537 a bit but
      does not fix it entirely.
      The fallback operations for atomicread and atomicwrite
      in libraries/ghc-prim/cbits/atomic.c are incorrect.
      This patch avoids those functions by implementing the
      operations directly in the native code generator. This
      is also what the x86/amd64 NCG and the LLVM backend do.
      Test Plan: validate on AIX and PowerPC (32-bit) Linux
      Reviewers: erikd, hvr, austin, bgamari, simonmar
      Reviewed By: hvr, bgamari
      Subscribers: rwbarton, thomie
      GHC Trac Issues: #12537
      Differential Revision: https://phabricator.haskell.org/D3984
  16. 19 Sep, 2017 1 commit
    • compiler: introduce custom "GhcPrelude" Prelude · f63bc730
      Herbert Valerio Riedel authored
      This switches the compiler/ component to get compiled with
      -XNoImplicitPrelude and a `import GhcPrelude` is inserted in all
      This is motivated by the upcoming "Prelude" re-export of
      `Semigroup((<>))` which would cause lots of name clashes in every
      module which also imports `Outputable`.
      Reviewers: austin, goldfire, bgamari, alanz, simonmar
      Reviewed By: bgamari
      Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari
      Differential Revision: https://phabricator.haskell.org/D3989
  17. 23 Jun, 2017 1 commit
    • Hoopl: remove dependency on Hoopl package · 42eee6ea
      Michal Terepeta authored
      This copies the subset of Hoopl's functionality needed by GHC to
      `cmm/Hoopl` and removes the dependency on the Hoopl package.
      The main motivation for this change is the confusing/noisy interface
      between GHC and Hoopl:
      - Hoopl has `Label` which is GHC's `BlockId` but different than
        GHC's `CLabel`
      - Hoopl has `Unique` which is different than GHC's `Unique`
      - Hoopl has `Unique{Map,Set}` which are different than GHC's
      - GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is
        needed just to filter the exposed functions (filter out some of the
        Hoopl's and add the GHC ones)
      With this change, we'll be able to simplify this significantly.
      It'll also be much easier to do invasive changes (Hoopl is a public
      package on Hackage with users that depend on the current behavior)
      This should introduce no changes in functionality - it merely
      copies the relevant code.
      Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
      Test Plan: ./validate
      Reviewers: austin, bgamari, simonmar
      Reviewed By: bgamari, simonmar
      Subscribers: simonpj, kavon, rwbarton, thomie
      Differential Revision: https://phabricator.haskell.org/D3616
  18. 01 May, 2017 1 commit
  19. 25 Apr, 2017 1 commit
    • PPC NCG: Implement callish prim ops · 89a3241f
      Peter Trommler authored
      Provide PowerPC optimised implementations of callish prim ops.
      The generic implementation of quotient remainder prim ops uses
      a division and a remainder operation. There is no remainder on
      PowerPC and so we need to implement remainder "by hand" which
      results in a duplication of the divide operation when using the
      generic code.
      Avoid this duplication by implementing the prim op in the native
      code generator.
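      The arithmetic behind avoiding the duplicated divide can be sketched in a
      few lines. The function name `quotRemOneDivide` is hypothetical. It models
      the idea of one divide followed by a multiply and a subtract, and is not the
      instruction sequence the code generator emits.

```haskell
-- Quotient and remainder using a single division:
-- the remainder is recovered as n - q * d.
-- Like quotRem, this fails for a zero divisor.
quotRemOneDivide :: Int -> Int -> (Int, Int)
quotRemOneDivide n d = (q, n - q * d)
  where
    q = n `quot` d  -- the only divide
```

      For non-zero divisors the result agrees with the Prelude's `quotRem`,
      including for negative operands, because `quot` truncates towards zero.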
      Use PowerPC's instructions for long multiplication.
      Addition and subtraction
      Use PowerPC add/subtract with carry/overflow instructions
      MO_Clz and MO_Ctz
      Use PowerPC's CNTLZ instruction and implement count trailing
      zeros using count leading zeros
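      One well-known way to express count trailing zeros through count leading
      zeros is shown below for 64-bit words. Whether the patch uses exactly this
      identity is an assumption, and `ctzViaClz` is a hypothetical name.

```haskell
import Data.Bits (complement, countLeadingZeros, (.&.))
import Data.Word (Word64)

-- complement x .&. (x - 1) has a 1 exactly in the positions of the trailing
-- zeros of x, so the count is 64 minus its number of leading zeros.
ctzViaClz :: Word64 -> Int
ctzViaClz x = 64 - countLeadingZeros (complement x .&. (x - 1))
```

      For an input of zero the mask is all ones, so the function returns 64,
      which matches `countTrailingZeros` from `Data.Bits` on `Word64`.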
      Implement an algorithm given by Henry Warren in "Hacker's Delight"
      using PowerPC divide instruction. TODO: Use long division instructions
      when available (POWER7 and later).
      Test Plan: validate on AIX and 32-bit Linux
      Reviewers: simonmar, erikd, hvr, austin, bgamari
      Reviewed By: erikd, hvr, bgamari
      Subscribers: trofi, kgardas, thomie
      Differential Revision: https://phabricator.haskell.org/D2973
  20. 08 Dec, 2016 1 commit
  21. 31 Aug, 2016 1 commit
    • PPC NCG: Implement minimal stack frame header. · 010b07aa
      Peter Trommler authored
      According to the ABI specifications a minimal stack frame consists
      of a header and a minimum size parameter save area. We reserve the
      minimal size for each ABI.
      On PowerPC 64-bit Linux and AIX the parameter save area can accommodate
      up to eight parameters. So calls with eight parameters and fewer
      can be done without allocating a new stack frame and deallocating
      that stack frame after the call. On AIX one additional spill slot
      is available on the stack.
      Code size for all nofib benchmarks is 0.3 % smaller on powerpc64.
      Test Plan: validate on AIX
      Reviewers: hvr!, erikd, austin, simonmar, bgamari
      Reviewed By: bgamari
      Subscribers: thomie
      Differential Revision: https://phabricator.haskell.org/D2445
  22. 29 Apr, 2016 1 commit
    • PPC NCG: Improve pointer de-tagging code · b725fe0a
      Peter Trommler authored
      Generate a clrr[wd]i instruction to clear the tag bits in a pointer.
      This saves one instruction and one temporary register.
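      In terms of bit operations, de-tagging clears the low tag bits of a pointer
      in a single step. The sketch below models that on 64-bit words. The names
      `untag` and `getTag` are hypothetical, and the choice of three tag bits
      assumes a 64-bit platform.

```haskell
import Data.Bits (complement, shiftL, (.&.))
import Data.Word (Word64)

-- Assumption: three tag bits, as on 64-bit platforms.
tagBits :: Int
tagBits = 3

tagMask :: Word64
tagMask = (1 `shiftL` tagBits) - 1

-- Clear the rightmost tag bits, which is what a single
-- "clear right immediate" instruction does.
untag :: Word64 -> Word64
untag p = p .&. complement tagMask

-- Extract only the tag.
getTag :: Word64 -> Word64
getTag p = p .&. tagMask
```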
      Optimize signed comparison with zero after an andi. operation. This saves
      one instruction when comparing a pointer tag with zero.
      This reduces code size by 0.6 % in all nofib benchmarks.
      Test Plan: validate on AIX and 32-bit Linux
      Reviewed By: erikd, hvr
      Differential Revision: https://phabricator.haskell.org/D2093
  23. 02 Oct, 2015 1 commit
    • nativeGen PPC: fix > 16 bit offsets in stack handling · b29f20ed
      Peter Trommler authored
      Implement access to spill slots at offsets larger than 16 bits.
      Also allocation and deallocation of spill slots was restricted to
      16 bit offsets. Now 32 bit offsets are supported on all PowerPC
      The implementation of 32 bit offsets requires more than one instruction
      but the native code generator wants one instruction. So we implement
      pseudo-instructions that are pretty printed into multiple assembly
      With pseudo-instructions for spill slot allocation and deallocation
      we can also implement handling of the back chain pointer according
      to the ELF ABIs.
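      A common way to reach a 32 bit offset with 16 bit immediates is to split it
      into an adjusted high half and a signed low half, so that one instruction
      adds the high part and the next uses the low part as a displacement. The
      sketch below shows that arithmetic only. `splitOffset` and `rejoin` are
      hypothetical names, and that the pseudo-instructions expand along these
      lines is an assumption.

```haskell
import Data.Bits (shiftL, shiftR)
import Data.Int (Int16, Int32)

-- Split a 32-bit offset into (high, low) so that
-- (high `shiftL` 16) + low gives back the offset.
-- The low half is signed, so the high half is adjusted to compensate.
splitOffset :: Int32 -> (Int32, Int16)
splitOffset off = (hi, lo)
  where
    lo = fromIntegral off :: Int16            -- low 16 bits, read as signed
    hi = (off - fromIntegral lo) `shiftR` 16  -- high half after compensation

-- Recombine the halves using 32-bit wrap-around arithmetic.
rejoin :: (Int32, Int16) -> Int32
rejoin (hi, lo) = (hi `shiftL` 16) + fromIntegral lo
```

      An offset such as 0x12348000 shows why the adjustment matters: its low half
      reads as a negative number, so the high half becomes 0x1235 instead of 0x1234.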
      Test Plan: validate (especially on powerpc (32 bit))
      Reviewers: bgamari, austin, erikd
      Reviewed By: erikd
      Subscribers: thomie
      Differential Revision: https://phabricator.haskell.org/D1296
      GHC Trac Issues: #7830
  24. 21 Aug, 2015 1 commit
    • Delete FastBool · 3452473b
      thomie authored
      This reverses some of the work done in Trac #1405, and assumes GHC is
      smart enough to do its own unboxing of booleans now.
      I would like to do some more performance measurements, but the code
      changes can be reviewed already.
      Test Plan:
      With a perf build:
      ./inplace/bin/ghc-stage2 nofib/spectral/simple/Main.hs -fforce-recomp
      +RTS -t --machine-readable
        [("bytes allocated", "1300744864")
        ,("num_GCs", "302")
        ,("average_bytes_used", "8811118")
        ,("max_bytes_used", "24477464")
        ,("num_byte_usage_samples", "9")
        ,("peak_megabytes_allocated", "64")
        ,("init_cpu_seconds", "0.001")
        ,("init_wall_seconds", "0.001")
        ,("mutator_cpu_seconds", "2.833")
        ,("mutator_wall_seconds", "4.283")
        ,("GC_cpu_seconds", "0.960")
        ,("GC_wall_seconds", "0.961")
        [("bytes allocated", "1301088064")
        ,("num_GCs", "310")
        ,("average_bytes_used", "8820253")
        ,("max_bytes_used", "24539904")
        ,("num_byte_usage_samples", "9")
        ,("peak_megabytes_allocated", "64")
        ,("init_cpu_seconds", "0.001")
        ,("init_wall_seconds", "0.001")
        ,("mutator_cpu_seconds", "2.876")
        ,("mutator_wall_seconds", "4.474")
        ,("GC_cpu_seconds", "0.965")
        ,("GC_wall_seconds", "0.979")
      CPU time seems to be up a bit, but I'm not sure. Unfortunately CPU time
      measurements are rather noisy.
      Reviewers: austin, bgamari, rwbarton
      Subscribers: nomeata
      Differential Revision: https://phabricator.haskell.org/D1143
      GHC Trac Issues: #1405
  25. 07 Jul, 2015 1 commit
  26. 03 Jul, 2015 1 commit
    • Implement PowerPC 64-bit native code backend for Linux · d3c1dda6
      Peter Trommler authored
      Extend the PowerPC 32-bit native code generator for "64-bit
      PowerPC ELF Application Binary Interface Supplement 1.9" by
      Ian Lance Taylor and "Power Architecture 64-Bit ELF V2 ABI Specification --
      OpenPOWER ABI for Linux Supplement" by IBM.
      The latter ABI is mainly used on POWER7/7+ and POWER8
      Linux systems running in little-endian mode. The code generator
      supports both static and dynamic linking. PowerPC 64-bit
      code for ELF ABI 1.9 and 2 is mostly position independent
      anyway, and thus so is all the code emitted by the code
      generator. In other words, -fPIC does not make a difference.
      rts/stg/SMP.h support is implemented.
      Following the spirit of the introductory comment in
      PPC/CodeGen.hs, the rest of the code is a straightforward
      extension of the 32-bit implementation.
      * Code is generated only in the medium code model, which
        is also gcc's default
      * Local symbols are not accessed directly, which seems to
        also be the case for 32-bit
      * LLVM does not work, but this does not work on 32-bit either
      * Must use the system runtime linker in GHCi, because the
        GHC linker for "static" object files (rts/Linker.c) for
        PPC 64-bit is not implemented. The system runtime
        (dynamic) linker works.
      * The handling of the system stack (register 1) is not ELF-
        compliant so stack traces break. Instead of allocating a new
        stack frame, spill code should use the "official" spill area
        in the current stack frame and deallocation code should restore
        the back chain
      * DWARF support is missing
      Fixes #9863
      Test Plan: validate (on powerpc, too)
      Reviewers: simonmar, trofi, erikd, austin
      Reviewed By: trofi
      Subscribers: bgamari, arnons1, kgardas, thomie
      Differential Revision: https://phabricator.haskell.org/D629
      GHC Trac Issues: #9863
  27. 14 Dec, 2014 1 commit
    • powerpc: fix and enable shared libraries by default on linux · fa31e8f4
      Sergei Trofimovich authored
      And fix things all the way down to it. Namely:
          - remove 'r30' from free registers, it's an .LCTOC1 register
            for gcc. generated .plt stubs expect it to be initialised.
          - fix PicBase computation, which originally forgot to use 'tmp'
            reg in 'initializePicBase_ppc.fetchPC'
          - mark 'ForeignTarget's as implicitly using 'PicBase' register
            (see comment for details)
          - add 64-bit MO_Sub and test on alloclimit3/4 regtests
          - fix dynamic label offsets to match with .LCTOC1 offset
      Signed-off-by: Sergei Trofimovich <siarheit@google.com>
      Test Plan: validate passes equal amount of vanilla/dyn tests
      Reviewers: simonmar, erikd, austin
      Reviewed By: erikd, austin
      Subscribers: carter, thomie
      Differential Revision: https://phabricator.haskell.org/D560
      GHC Trac Issues: #8024, #9831
  28. 27 Sep, 2014 1 commit
    • Stop exporting, and stop using, functions marked as deprecated · 51aa2fa3
      thomie authored
      Don't export `getUs` and `getUniqueUs`. `UniqSM` has a `MonadUnique` instance:
          instance MonadUnique UniqSM where
              getUniqueSupplyM = getUs
              getUniqueM  = getUniqueUs
              getUniquesM = getUniquesUs
      Commandline-fu used:
          git grep -l 'getUs\>' |
              grep -v compiler/basicTypes/UniqSupply.lhs |
              xargs sed -i 's/getUs/getUniqueSupplyM/g'
          git grep -l 'getUniqueUs\>' |
              grep -v compiler/basicTypes/UniqSupply.lhs |
              xargs sed -i 's/getUniqueUs/getUniqueM/g'
      Follow up on b522d3a3
      Reviewed By: austin, hvr
      Differential Revision: https://phabricator.haskell.org/D220
  29. 15 May, 2014 1 commit
    • Add LANGUAGE pragmas to compiler/ source files · 23892440
      Herbert Valerio Riedel authored
      In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
      reorganized, while following the convention, to
      - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
        any `{-# OPTIONS_GHC #-}`-lines.
      - Moreover, if the list of language extensions fits into a single
        `{-# LANGUAGE ... #-}`-line (shorter than 80 characters), keep it on one
        line. Otherwise split into `{-# LANGUAGE ... #-}`-lines for each
        individual language extension. In both cases, try to keep the
        enumeration alphabetically ordered.
        (The latter layout is preferable as it's more diff-friendly)
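      A file header following this convention might look like the sketch below.
      The extensions, the `-Wall` option, and the function `sumStrict` are
      arbitrary examples chosen for illustration and are not taken from any GHC
      source file.

```haskell
{-# LANGUAGE BangPatterns, ScopedTypeVariables #-}
{-# OPTIONS_GHC -Wall #-}

-- The LANGUAGE line comes first, fits in 80 characters, and lists its
-- extensions alphabetically; the OPTIONS_GHC line follows it.

-- A small function that uses BangPatterns for a strict accumulator.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go !acc []       = acc
    go !acc (x : xs) = go (acc + x) xs
```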
      While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
      occurrences by `{-# OPTIONS_GHC ... #-}` pragmas.
  30. 11 Feb, 2013 1 commit
  31. 02 Feb, 2013 2 commits
  32. 07 Jan, 2013 1 commit
    • Fix bugs in allocMoreStack (#7498, #7510) · 03d360f2
      Simon Marlow authored
      There were four bugs here.  Clearly I didn't test this enough to
      expose the bugs - it appeared to work on x86/Linux, but completely by
      accident it seems.
      1. the delta was wrong by a factor of the slot size (as noted on #7498)
      2. we weren't correctly aligning the stack pointer (sp needs to be
      16-byte aligned on x86/x86_64)
      3. we were doing the adjustment multiple times in the case of a block
      that was both a return point and a local branch target.  To fix this I
      had to add new shim blocks to adjust the stack pointer, and retarget
      the original branches.  See comment for details.
      4. we were doing the adjustment for CALL instructions, which is
      unnecessary and wrong; only JMPs should be preceded by a stack
      (Someone with a PPC box will need to update the PPC version of
      allocMoreStack to fix the above bugs, using the x86 version as a
  33. 15 Dec, 2012 2 commits
  34. 12 Nov, 2012 1 commit
    • Remove OldCmm, convert backends to consume new Cmm · d92bd17f
      Simon Marlow authored
      This removes the OldCmm data type and the CmmCvt pass that converts
      new Cmm to OldCmm.  The backends (NCGs, LLVM and C) have all been
      converted to consume new Cmm.
      The main difference between the two data types is that conditional
      branches in new Cmm have both true/false successors, whereas in OldCmm
      the false case was a fallthrough.  To generate slightly better code we
      occasionally need to invert a conditional to ensure that the
      branch-not-taken becomes a fallthrough; this was previously done in
      CmmCvt, and it is now done in CmmContFlowOpt.
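      Inverting a conditional so that the not-taken case falls through can be
      modelled on a toy branch type. Everything below (`Cond`, `Branch`,
      `invert`, `orient`) is a hypothetical sketch and not the representation
      used in Cmm or in CmmContFlowOpt.

```haskell
-- Toy conditions, each with an exact inverse.
data Cond = Eq | Ne | Lt | Ge
  deriving (Eq, Show)

invert :: Cond -> Cond
invert Eq = Ne
invert Ne = Eq
invert Lt = Ge
invert Ge = Lt

-- A two-way branch: condition, target if true, target if false.
data Branch = CondBranch Cond String String
  deriving (Eq, Show)

-- Given the label of the block laid out next, flip the branch when the
-- true target is that block, so the false target becomes the fallthrough.
orient :: String -> Branch -> Branch
orient next b@(CondBranch c t f)
  | t == next = CondBranch (invert c) f t
  | otherwise = b
```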
      We could go further and use the Hoopl Block representation for native
      code, which would mean that we could use Hoopl's postorderDfs and
      analyses for native code, but for now I've left it as is, using the
      old ListGraph representation for native code.
  35. 20 Sep, 2012 1 commit
    • Teach the linear register allocator how to allocate more stack if necessary · 0b0a41f9
      Simon Marlow authored
      This squashes the "out of spill slots" panic that occasionally happens
      on x86, by adding instructions to bump and retreat the C stack pointer
      as necessary.  The panic has become more common since the new codegen,
      because we lump code into larger blocks, and the register allocator
      isn't very good at reusing stack slots for spilling (see Note [extra
      spill slots]).
  36. 14 Sep, 2012 1 commit
  37. 28 Aug, 2012 2 commits