  1. Feb 19, 2024
  2. Jan 31, 2023
    • Cmm: track the type of global registers · 5618fc21
      sheaf authored and Marge Bot committed
      This patch tracks the type of Cmm global registers. This is needed
      in order to lint uses of polymorphic registers, such as SIMD vector
      registers that can be used both for floating-point and integer values.
      
      This change allows us to refactor VanillaReg to not store VGcPtr,
      as that information is instead stored in the type of the usage of the
      register.
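
A minimal sketch of the idea, written in Python rather than GHC's Haskell. All names here (`CmmType`, `GlobalReg`, `GlobalRegUse`, `lint_use`, the `allowed` field) are hypothetical and only illustrate the shape of the change: the register carries no type of its own, each use does, and a lint pass can then check every use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CmmType:
    category: str  # "gcptr", "int" or "float" (simplified for this sketch)
    width: int     # width in bits


@dataclass(frozen=True)
class GlobalReg:
    name: str            # e.g. "R1" or "XMM1"; no type is stored on the register
    allowed: frozenset   # categories the register may hold (hypothetical field)
    width: int           # register width in bits


@dataclass(frozen=True)
class GlobalRegUse:
    reg: GlobalReg
    ty: CmmType          # the type lives on the use, not on the register


def lint_use(use: GlobalRegUse) -> bool:
    """Accept a use only if its type is one the register can hold."""
    return use.ty.category in use.reg.allowed and use.ty.width <= use.reg.width


R1 = GlobalReg("R1", frozenset({"gcptr", "int"}), 64)
XMM1 = GlobalReg("XMM1", frozenset({"int", "float"}), 128)

# The same register can appear at different types in different uses.
r1_as_pointer = GlobalRegUse(R1, CmmType("gcptr", 64))
r1_as_word = GlobalRegUse(R1, CmmType("int", 64))
xmm1_as_float_vec = GlobalRegUse(XMM1, CmmType("float", 128))
xmm1_as_int_vec = GlobalRegUse(XMM1, CmmType("int", 128))
xmm1_as_pointer = GlobalRegUse(XMM1, CmmType("gcptr", 64))  # ill-typed use
```

In this model, whether `R1` holds a GC pointer is a property of `r1_as_pointer.ty`, which is why the register constructor no longer needs a pointer flag.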
      
      Fixes #22297
  3. Dec 09, 2022
  4. Nov 11, 2022
  5. Aug 10, 2022
  6. Feb 01, 2022
  7. Jan 29, 2022
  8. Aug 09, 2021
    • Move `/includes` to `/rts/include`, sort per package better · d5de970d
      John Ericson authored and Marge Bot committed
      In order to make the packages in this repo "reinstallable", we need to
      associate source code with a specific package. Having a top level
      `/includes` dir that mixes concerns (which packages' includes?) gets in
      the way of this.
      
      To start, I have moved everything to `rts/`, which is mostly correct.
      There are a few things however that really don't belong in the rts (like
      the generated constants haskell type, `CodeGen.Platform.h`). Those
      needed to be manually adjusted.
      
      Things of note:
      
       - No symlinking for the sake of Windows, so we hard-link at configure time.
      
       - `CodeGen.Platform.h` no longer has a `.hs` extension (in addition to
         being moved to `compiler/`) so as not to confuse anyone, since it is
         next to Haskell files.
      
       - Blanket `-Iincludes` is gone in both build systems, include paths now
         more strictly respect per-package dependencies.
      
       - `deriveConstants` has been taught to not require a `--target-os` flag
         when generating the platform-agnostic Haskell type. Make takes
         advantage of this, but Hadrian has yet to.
  9. Jun 05, 2021
    • Adds AArch64 Native Code Generator · 3b1aa7db
      Moritz Angermann authored and Marge Bot committed
      In which we add a new code generator to the Glasgow Haskell
      Compiler. This codegen supports ELF and Mach-O targets, thus covering
      Linux, macOS, and BSDs in principle.  It was tested only on macOS and
      Linux.  The NCG follows a structure similar to the other native code
      generators we already have, and should therefore be relatively easy to
      follow.
      
      It supports most of the features required for a proper native code
      generator, but does not claim to be perfect or fully optimised.  There
      are still opportunities for optimisations.
      
      Metric Decrease:
          ManyAlternatives
          ManyConstructors
          MultiLayerModules
          PmSeriesG
          PmSeriesS
          PmSeriesT
          PmSeriesV
          T10421
          T10421a
          T10858
          T11195
          T11276
          T11303b
          T11374
          T11822
          T12227
          T12545
          T12707
          T13035
          T13253
          T13253-spj
          T13379
          T13701
          T13719
          T14683
          T14697
          T15164
          T15630
          T16577
          T17096
          T17516
          T17836
          T17836b
          T17977
          T17977b
          T18140
          T18282
          T18304
          T18478
          T18698a
          T18698b
          T18923
          T1969
          T3064
          T5030
          T5321FD
          T5321Fun
          T5631
          T5642
          T5837
          T783
          T9198
          T9233
          T9630
          T9872d
          T9961
          WWRec
      Metric Increase:
          T4801
  10. Mar 05, 2021
  11. Nov 15, 2020
    • AArch64/arm64 adjustments · 8887102f
      Moritz Angermann authored and Marge Bot committed
      This adds the necessary logic to support aarch64 on elf, as well
      as aarch64 on mach-o, which Apple calls arm64.
      
      We change architecture name to AArch64, which is the official arm
      naming scheme.
  12. Jun 01, 2020
  13. Apr 26, 2020
  14. Feb 25, 2020
  15. Jan 25, 2020
  16. Oct 22, 2019
    • Implement s390x LLVM backend. · fd8b666a
      Stefan Schulze Frielinghaus authored and Marge Bot committed
      This patch adds support for the s390x architecture for the LLVM code
      generator. The patch includes a register mapping of STG registers onto
      s390x machine registers which enables a registerised build.
  17. Sep 25, 2019
  18. Jul 16, 2019
  19. Jul 03, 2019
  20. May 24, 2019
    • Add PlainPanic for throwing exceptions without depending on pprint · d9dfbde3
      Michael Sloan authored and Matthew Pickering committed
      This commit splits out a subset of GhcException which do not depend on
      pretty printing (SDoc), as a new datatype called
      PlainGhcException. These exceptions can be caught as GhcException,
      because 'fromException' will convert them.
      
      The motivation for this change is that the Panic module
      transitively depends on many modules, primarily due to pretty printing
      code.  It's on the order of about 130 modules.  This large set of
      dependencies has a few implications:
      
      1. To avoid cycles / use of boot files, these dependencies cannot
      throw GhcException.
      
      2. There are some utility modules that use UnboxedTuples and also use
      `panic`. This means that when loading GHC into GHCi, about 130
      additional modules would need to be compiled instead of
      interpreted. Splitting the non-pprint exception throwing into a new
      module resolves this issue. See #13101
  21. Apr 11, 2019
    • removing x87 register support from native code gen · 42504f4a
      Carter Schonwald authored
      * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors
      * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding
        behavior in 32bit haskell code
      * removes the 80bit floating point representation from the supported float sizes
      * there's still 1 tiny bit of x87 support needed,
        for handling float and double return values in FFI calls wrt the C ABI on x86_32,
        but this one piece does not leak into the rest of NCG.
      * Lots of code that's not been touched in a long time got deleted as a
        consequence of all of this
      
      all in all, this change paves the way towards a lot of future further
      improvements in how GHC handles floating point computations, along with
      making the native code gen more accessible to a larger pool of contributors.
  22. Mar 15, 2019
    • PPC NCG: Use liveness information in CmmCall · 83e09d3c
      Peter Trommler authored and Marge Bot committed
      We make liveness information for global registers
      available on `JMP` and `BCTR`, which were the last instructions
      missing. With complete liveness information we do not need to
      reserve global registers in `freeReg` anymore. Moreover we
      assign R9 and R10 to callee saves registers.
      
      Cleanup by removing `Reg_Su`, which was unused, from `freeReg`
      and removing unused register definitions.
      
      The calculation of the number of floating point registers is too
      conservative. Just follow X86 and specify the constants directly.
      
      Overall on PowerPC this results in 0.3 % smaller code size in nofib
      while runtime is slightly better in some tests.
  23. Jan 16, 2019
  24. Jan 01, 2019
  25. Nov 02, 2018
    • Add Int8# and Word8# · 2c959a18
      Michal Terepeta authored and Ben Gamari committed
      This is the first step of implementing:
      https://github.com/ghc-proposals/ghc-proposals/pull/74
      
      
      
      The main highlights/changes:
      
          primops.txt.pp gets two new sections for two new primitive types for
          signed and unsigned 8-bit integers (Int8# and Word8# respectively) along
          with basic arithmetic and comparison operations. PrimRep/RuntimeRep get
          two new constructors for them. All of the primops translate into the
          existing MachOPs.
      
          For CmmCalls the codegen will now zero-extend the values at call
          site (so that they can be moved to the right register) and then truncate
          them back to their original width.
      
          x86 native codegen needed some updates, since it wasn't able to deal
          with the new widths, but all the changes are quite localized. LLVM
          backend seems to just work.
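
The widen-then-truncate step described for CmmCalls can be illustrated with plain integer arithmetic. This is a sketch in Python, not the code generator's implementation; the helper names (`bit_pattern`, `zero_extend`, `truncate`, `as_signed`) and the 64-bit register width are assumptions made for the example.

```python
def bit_pattern(value: int, bits: int) -> int:
    """Two's-complement bit pattern of `value` in a field of `bits` bits."""
    return value & ((1 << bits) - 1)


def zero_extend(pattern: int, from_bits: int, to_bits: int) -> int:
    """Widen a pattern by filling the new upper bits with zeros."""
    assert from_bits <= to_bits
    assert 0 <= pattern < (1 << from_bits)
    return pattern  # the upper bits of a non-negative Python int are already zero


def truncate(pattern: int, to_bits: int) -> int:
    """Keep only the low `to_bits` bits."""
    return pattern & ((1 << to_bits) - 1)


def as_signed(pattern: int, bits: int) -> int:
    """Reinterpret a bit pattern as a signed two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (pattern ^ sign_bit) - sign_bit


# An 8-bit signed value of -1 has the bit pattern 0xFF.
narrow = bit_pattern(-1, 8)
# At the call site the value is widened to register width with zeros on top.
wide = zero_extend(narrow, 8, 64)
# Afterwards it is truncated back to its original width.
back = truncate(wide, 8)
```

Because truncation discards the upper bits, the round trip preserves the 8-bit pattern for signed and unsigned values alike: `as_signed(back, 8)` recovers the original signed value.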
      
      This is the second attempt at merging this, after the first attempt in
      D4475 had to be backed out due to regressions on i386.
      
      Bumps binary submodule.
      
      Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
      
      Test Plan: ./validate (on both x86-{32,64})
      
      Reviewers: bgamari, hvr, goldfire, simonmar
      
      Subscribers: rwbarton, carter
      
      Differential Revision: https://phabricator.haskell.org/D5258
  26. Oct 01, 2018
  27. Apr 29, 2017
  28. Apr 05, 2017
  29. Apr 04, 2017
  30. Nov 01, 2015
  31. Oct 02, 2015
    • nativeGen PPC: fix > 16 bit offsets in stack handling · b29f20ed
      Peter Trommler authored and Ben Gamari committed
      Implement access to spill slots at offsets larger than 16 bits.
      Also allocation and deallocation of spill slots was restricted to
      16 bit offsets. Now 32 bit offsets are supported on all PowerPC
      platforms.
      
      The implementation of 32 bit offsets requires more than one instruction
      but the native code generator wants one instruction. So we implement
      pseudo-instructions that are pretty printed into multiple assembly
      instructions.
      
      With pseudo-instructions for spill slot allocation and deallocation
      we can also implement handling of the back chain pointer according
      to the ELF ABIs.
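
To see why a 32-bit offset needs more than one instruction, note that PowerPC load/store and add-immediate instructions carry a signed 16-bit immediate. One common way to reach a larger offset is to split it into a high-adjusted half and a sign-extended low half (the `@ha`/`@l` convention, typically paired as `addis` followed by an instruction using the low half). The sketch below, in Python, shows only that arithmetic. The commit message does not say which instruction sequence the pseudo-instructions expand to, so treat the pairing as an assumption; `split_offset` and `fits_signed_16` are hypothetical names.

```python
from typing import Tuple


def fits_signed_16(value: int) -> bool:
    """True if `value` can be encoded as a signed 16-bit immediate."""
    return -0x8000 <= value <= 0x7FFF


def split_offset(offset: int) -> Tuple[int, int]:
    """Split an offset into (high, low) with (high << 16) + low == offset.

    `low` is the low 16 bits read as a signed number, so `high` is adjusted
    upward by one whenever `low` comes out negative.
    """
    low = ((offset & 0xFFFF) ^ 0x8000) - 0x8000  # sign-extend the low half
    high = (offset - low) >> 16                  # exact: offset - low is a multiple of 65536
    if not fits_signed_16(high):
        raise ValueError("offset out of range for a two-instruction sequence")
    return high, low


small = split_offset(100)       # fits in one immediate: high part is 0
large = split_offset(100000)    # needs both halves
negative = split_offset(-40000)
```

For example, 100000 splits into a high part of 2 and a low part of -31072, since 2 * 65536 - 31072 = 100000.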
      
      Test Plan: validate (especially on powerpc (32 bit))
      
      Reviewers: bgamari, austin, erikd
      
      Reviewed By: erikd
      
      Subscribers: thomie
      
      Differential Revision: https://phabricator.haskell.org/D1296
      
      GHC Trac Issues: #7830
  32. Aug 21, 2015
    • Delete FastBool · 3452473b
      Thomas Miedema authored and Ben Gamari committed
      This reverses some of the work done in Trac #1405, and assumes GHC is
      smart enough to do its own unboxing of booleans now.
      
      I would like to do some more performance measurements, but the code
      changes can be reviewed already.
      
      Test Plan:
      With a perf build:
      ./inplace/bin/ghc-stage2 nofib/spectral/simple/Main.hs -fforce-recomp
      +RTS -t --machine-readable
      
      before:
      ```
        [("bytes allocated", "1300744864")
        ,("num_GCs", "302")
        ,("average_bytes_used", "8811118")
        ,("max_bytes_used", "24477464")
        ,("num_byte_usage_samples", "9")
        ,("peak_megabytes_allocated", "64")
        ,("init_cpu_seconds", "0.001")
        ,("init_wall_seconds", "0.001")
        ,("mutator_cpu_seconds", "2.833")
        ,("mutator_wall_seconds", "4.283")
        ,("GC_cpu_seconds", "0.960")
        ,("GC_wall_seconds", "0.961")
        ]
      ```
      
      after:
      ```
        [("bytes allocated", "1301088064")
        ,("num_GCs", "310")
        ,("average_bytes_used", "8820253")
        ,("max_bytes_used", "24539904")
        ,("num_byte_usage_samples", "9")
        ,("peak_megabytes_allocated", "64")
        ,("init_cpu_seconds", "0.001")
        ,("init_wall_seconds", "0.001")
        ,("mutator_cpu_seconds", "2.876")
        ,("mutator_wall_seconds", "4.474")
        ,("GC_cpu_seconds", "0.965")
        ,("GC_wall_seconds", "0.979")
        ]
      ```
      
      CPU time seems to be up a bit, but I'm not sure. Unfortunately CPU time
      measurements are rather noisy.
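
For a rough sense of scale, the relative change in a few of the counters above can be computed directly. This is a small Python sketch using only figures copied from the two listings; `percent_change` is a helper introduced for the example, and a single run per configuration says nothing about noise.

```python
# Selected figures from the "before" and "after" listings.
before = {
    "bytes allocated": 1300744864,
    "num_GCs": 302,
    "mutator_cpu_seconds": 2.833,
    "mutator_wall_seconds": 4.283,
    "GC_cpu_seconds": 0.960,
    "GC_wall_seconds": 0.961,
}
after = {
    "bytes allocated": 1301088064,
    "num_GCs": 310,
    "mutator_cpu_seconds": 2.876,
    "mutator_wall_seconds": 4.474,
    "GC_cpu_seconds": 0.965,
    "GC_wall_seconds": 0.979,
}


def percent_change(old: float, new: float) -> float:
    """Relative change from `old` to `new`, in percent."""
    return (new - old) / old * 100.0


changes = {key: percent_change(before[key], after[key]) for key in before}

for key, value in changes.items():
    print(f"{key:22s} {value:+.2f}%")
```

On these numbers, allocation is up by under 0.03% and mutator CPU time by roughly 1.5%, which is small enough to be consistent with the measurement noise mentioned above.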
      
      Reviewers: austin, bgamari, rwbarton
      
      Subscribers: nomeata
      
      Differential Revision: https://phabricator.haskell.org/D1143
      
      GHC Trac Issues: #1405
  33. Jul 03, 2015
    • Implement PowerPC 64-bit native code backend for Linux · d3c1dda6
      Peter Trommler authored and Ben Gamari committed
      Extend the PowerPC 32-bit native code generator for "64-bit
      PowerPC ELF Application Binary Interface Supplement 1.9" by
      Ian Lance Taylor and "Power Architecture 64-Bit ELF V2 ABI Specification --
      OpenPOWER ABI for Linux Supplement" by IBM.
      The latter ABI is mainly used on POWER7/7+ and POWER8
      Linux systems running in little-endian mode. The code generator
      supports both static and dynamic linking. PowerPC 64-bit
      code for ELF ABI 1.9 and 2 is mostly position independent
      anyway, and thus so is all the code emitted by the code
      generator. In other words, -fPIC does not make a difference.
      
      rts/stg/SMP.h support is implemented.
      
      Following the spirit of the introductory comment in
      PPC/CodeGen.hs, the rest of the code is a straightforward
      extension of the 32-bit implementation.
      
      Limitations:
      * Code is generated only in the medium code model, which
        is also gcc's default
      * Local symbols are not accessed directly, which seems to
        also be the case for 32-bit
      * LLVM does not work, but this does not work on 32-bit either
      * Must use the system runtime linker in GHCi, because the
        GHC linker for "static" object files (rts/Linker.c) for
        PPC 64-bit is not implemented. The system runtime
        (dynamic) linker works.
      * The handling of the system stack (register 1) is not ELF-
        compliant so stack traces break. Instead of allocating a new
        stack frame, spill code should use the "official" spill area
        in the current stack frame and deallocation code should restore
        the back chain
      * DWARF support is missing
      
      Fixes #9863
      
      Test Plan: validate (on powerpc, too)
      
      Reviewers: simonmar, trofi, erikd, austin
      
      Reviewed By: trofi
      
      Subscribers: bgamari, arnons1, kgardas, thomie
      
      Differential Revision: https://phabricator.haskell.org/D629
      
      GHC Trac Issues: #9863
  34. Dec 14, 2014
    • powerpc: fix and enable shared libraries by default on linux · fa31e8f4
      Sergei Trofimovich authored
      
      Summary:
      And fix things all the way down to it. Namely:
          - remove 'r30' from free registers, it's an .LCTOC1 register
            for gcc. generated .plt stubs expect it to be initialised.
          - fix PicBase computation, which originally forgot to use 'tmp'
            reg in 'initializePicBase_ppc.fetchPC'
          - mark 'ForeignTarget's as implicitly using 'PicBase' register
            (see comment for details)
          - add 64-bit MO_Sub and test on alloclimit3/4 regtests
          - fix dynamic label offsets to match with .LCTOC1 offset
      
      Signed-off-by: Sergei Trofimovich <siarheit@google.com>
      
      Test Plan: validate passes equal amount of vanilla/dyn tests
      
      Reviewers: simonmar, erikd, austin
      
      Reviewed By: erikd, austin
      
      Subscribers: carter, thomie
      
      Differential Revision: https://phabricator.haskell.org/D560
      
      GHC Trac Issues: #8024, #9831
  35. Nov 19, 2014
  36. May 13, 2014
  37. May 04, 2014
  38. May 02, 2014
    • Per-thread allocation counters and limits · b0534f78
      Simon Marlow authored
      This tracks the amount of memory allocation by each thread in a
      counter stored in the TSO.  Optionally, when the counter drops below
      zero (it counts down), the thread can be sent an asynchronous
      exception: AllocationLimitExceeded.  When this happens, the thread is given a
      small additional limit so that it can handle the exception.  See
      documentation in GHC.Conc for more details.
      
      Allocation limits are similar to timeouts, but
      
        - timeouts use real time, not CPU time.  Allocation limits do not
          count anything while the thread is blocked or in foreign code.
      
        - timeouts don't re-trigger if the thread catches the exception,
          allocation limits do.
      
        - timeouts can catch non-allocating loops, if you use
          -fno-omit-yields.  This doesn't work for allocation limits.
      
      I couldn't measure any impact on benchmarks with these changes, even
      for nofib/smp.
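
The count-down behaviour described above can be modelled in a few lines. This is a toy sketch in Python, not the RTS implementation: the class `SketchThread`, its fields, and the size of the extra allowance are hypothetical, and the exception is raised synchronously here, whereas the commit describes an asynchronous exception sent to the thread.

```python
class AllocationLimitExceeded(Exception):
    """Stands in for the exception named in the commit message."""


class SketchThread:
    def __init__(self, limit: int, grace: int, limit_enabled: bool = True):
        self.counter = limit          # counts down as the thread allocates
        self.grace = grace            # small extra allowance for the handler
        self.limit_enabled = limit_enabled

    def allocate(self, nbytes: int) -> None:
        """Charge an allocation to this thread's counter."""
        self.counter -= nbytes
        if self.limit_enabled and self.counter < 0:
            # Give the thread a little room to run its handler, then signal.
            self.counter = self.grace
            raise AllocationLimitExceeded()


thread = SketchThread(limit=100, grace=10)
thread.allocate(60)                   # counter is now 40
events = []
for size in (60, 5, 10):
    try:
        thread.allocate(size)
        events.append("ok")
    except AllocationLimitExceeded:
        events.append("limit")
```

The loop records `limit`, `ok`, `limit`: the first over-allocation triggers the exception, the handler's small allocation fits in the extra allowance, and a further allocation triggers the exception again, matching the point that allocation limits re-trigger after being caught.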