Commits on Source (28)
-
Serge S. Gulin authored
The main purpose is to add tests for the distribution size of a minimal app (hello world and its variations, e.g. one that uses Unicode). Many platforms support distribution in compressed form via gzip, so it is useful to record how large the executable bundle is for each platform in this minimal case. Two groups of tests are added:
  1. We extend the JavaScript backend size tests with gzip-enabled versions for all cases where an optimizing compiler is used (for now this is the Google Closure Compiler).
  2. We add trivial hello world tests with gzip-enabled versions for all other platforms in the CI pipeline where no external optimizing compiler is used.
eb1cb536 -
Fixes point 1 in #25052
d94410f8 -
Fixes #25052
bfe600f5 -
This partially addresses #25082.
62650d9f -
It was added earlier but hadn't appeared in any release notes yet. Partially addresses #25082.
5f0e23fd -
  - beef6135 enabled the use of MO_Add/MO_Sub for 64-bit operations in the C and LLVM backends
  - 6755d833 did the same for the x86 NCG backend
However we store some literal values as `Int` in the compiler. As a result, some Cmm optimizations transformed target 64-bit literals into compiler `Int`. If the compiler is 32-bit, this leads to computing with wrong literals (see #24893 and #24700).
This patch disables these Cmm optimizations for 32-bit compilers. This is unsatisfying (optimizations shouldn't be compiler-word-size dependent) but it fixes the bug and it makes the patch easy to backport. A proper fix would be much more invasive but it shall be implemented in the future.
Co-authored-by: amesgen <amesgen@amesgen.de>
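To illustrate the failure mode described in this commit message, here is a minimal sketch that models a 32-bit compiler `Int` with `Int32`. The names `targetLit`, `asHostInt32`, `foldAdd64` and `foldAddVia32` are hypothetical and do not appear in GHC; the sketch only shows how folding a 64-bit literal through a 32-bit host integer yields a wrong constant.

```haskell
import Data.Int (Int32, Int64)

-- A 64-bit target literal that does not fit in 32 bits.
targetLit :: Int64
targetLit = 2 ^ (32 :: Int) + 5

-- Hypothetical model of a 32-bit compiler storing a literal in its own Int.
-- fromIntegral wraps around, silently dropping the upper 32 bits.
asHostInt32 :: Int64 -> Int32
asHostInt32 = fromIntegral

-- Constant folding performed at the target's width (the intended result).
foldAdd64 :: Int64 -> Int64 -> Int64
foldAdd64 = (+)

-- The same folding after both literals have passed through a 32-bit Int.
foldAddVia32 :: Int64 -> Int64 -> Int64
foldAddVia32 x y = fromIntegral (asHostInt32 x + asHostInt32 y)

main :: IO ()
main = do
  print (foldAdd64 targetLit 1)    -- 4294967302
  print (foldAddVia32 targetLit 1) -- 6
```

Under this model the two results differ whenever a literal exceeds the 32-bit range, which is consistent with the commit's decision to disable the affected optimizations on 32-bit compilers.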
7446a09a -
Add a section on "types in terms" that were implemented in 8b2f70a2 and remove the now outdated suggestion of using `type` for them.
d59faaf2 -
39fd6714
-
Need to extend JSC externs with Emscripten RTS definitions to avoid JSC_UNDEFINED_VARIABLE errors when linking without the emcc rts.
Fix #25138
Some recompilation avoidance tests now fail. This is tracked with the other instances of this failure in #23013. My hunch is that they were working by chance when we used the emcc linker.
Metric Decrease: T24602_perf_size
e7764575 -
d1a40233
-
Fix #24377
610840eb -
We build these two packages as regular boot library dependencies rather than using the `in-ghc-tree` flag to include the source files into the haddock executable. The `in-ghc-tree` flag is moved into haddock-api to ensure that haddock built from hackage can still find the location of the GHC bindist using `ghc-paths`.
Addresses #24834
This causes a metric decrease under non-release flavours because under these flavours libraries are compiled with optimisation but executables are not. Since we move the bulk of the code from the haddock executable to the haddock-api library, we see a metric decrease on the validate flavours.
Metric Decrease: haddock.Cabal haddock.base haddock.compiler
6ae4b76a -
This is the Right Thing to Do™. And it prepares for storing a multiplicity coercion there. First step of the plan outlined here and below !12947 (comment 573091)
51ffba5d -
4d2faeeb
-
Omitted fields were simply ignored in the type checker and produced incorrect Core code. Fixes #24961 Metric Increase: RecordUpdPerf
623b4337 -
This patch is part of the patches upstreamed from haskell.nix. See https://github.com/input-output-hk/haskell.nix/pull/1960 for the original report/patch.
c749bdfd -
682a6a41
-
sheaf authored
This commit adds support for 128 bit wide SIMD vectors and vector operations to GHC's X86 native code generator.
Main changes:
  - Introduction of vector formats (`GHC.CmmToAsm.Format`)
  - Introduction of 128-bit virtual register (`GHC.Platform.Reg`), and removal of unused Float virtual register.
  - Refactor of `GHC.Platform.Reg.Class.RegClass`: it now only contains two classes, `RcInteger` (for general purpose registers) and `RcFloatOrVector` (for registers that can be used for scalar floating point values as well as vectors).
  - Modify `GHC.CmmToAsm.X86.Instr.regUsageOfInstr` to keep track of which format each register is used at, so that the register allocator can know if it needs to spill the entire vector register or just the lower 64 bits.
  - Modify spill/load/reg-2-reg code to account for vector registers (`GHC.CmmToAsm.X86.Instr.{mkSpillInstr, mkLoadInstr, mkRegRegMoveInstr, takeRegRegMoveInstr}`).
  - Modify the register allocator code (`GHC.CmmToAsm.Reg.*`) to propagate the format we are storing in any given register, for instance changing `Reg` to `RegFormat` or `GlobalReg` to `GlobalRegUse`.
  - Add logic to lower vector `MachOp`s to X86 assembly (see `GHC.CmmToAsm.X86.CodeGen`)
  - Minor cleanups to genprimopcode, to remove the llvm_only attribute which is no longer applicable.
Tests for this feature are provided in the "testsuite/tests/simd" directory.
Fixes #7741
Keeping track of register formats adds a small memory overhead to the register allocator (in particular, regUsageOfInstr now allocates more to keep track of the `Format` each register is used at). This explains the following metric increases.
-------------------------
Metric Increase: T12707 T13035 T13379 T3294 T4801 T5321FD T5321Fun T783
-------------------------
f046a759 -
sheaf authored
This commit updates genapply to use xmm, ymm and zmm registers, for stg_ap_v16/stg_ap_v32/stg_ap_v64, respectively. It also updates the Cmm lexer and parser to produce Cmm vectors rather than 128/256/512 bit wide scalars for V16/V32/V64, removing bits128, bits256 and bits512 in favour of vectors. The Cmm Lint check is weakened for vectors, as (in practice, e.g. on X86) it is okay to use a single vector register to hold multiple different types of data, and we don't know just from seeing e.g. "XMM1" how to interpret the 128 bits of data within. Fixes #25062
fae71b33 -
sheaf authored
This commit adds fused multiply add operations such as `fmaddDoubleX2#`. These are handled both in the X86 NCG and the LLVM backends.
92b728cf -
sheaf authored
This adds vector shuffle primops, such as
```
shuffleFloatX4# :: FloatX4# -> FloatX4# -> (# Int#, Int#, Int#, Int# #) -> FloatX4#
```
which shuffle the components of the two input vectors into the output vector.
NB: the indices must be compile time literals, to match the X86 SHUFPD instruction immediate and the LLVM shufflevector instruction.
These are handled in the X86 NCG and the LLVM backend. Tested in simd009.
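As a rough illustration of what a four-lane shuffle computes, the following sketch models it on ordinary tuples. It assumes that each index selects a lane from the concatenation of the two inputs (0 to 3 from the first vector, 4 to 7 from the second), as LLVM's shufflevector does; `Vec4` and `shuffle4` are hypothetical names and this is not the primop itself.

```haskell
-- Hypothetical boxed stand-in for a 4-lane vector.
type Vec4 a = (a, a, a, a)

-- Model of a 4-lane shuffle: each index picks one lane out of the
-- eight lanes formed by concatenating the two input vectors.
shuffle4 :: Vec4 a -> Vec4 a -> (Int, Int, Int, Int) -> Vec4 a
shuffle4 (a0, a1, a2, a3) (b0, b1, b2, b3) (i, j, k, l) =
  (pick i, pick j, pick k, pick l)
  where
    lanes = [a0, a1, a2, a3, b0, b1, b2, b3]
    pick n
      | n >= 0 && n < 8 = lanes !! n
      | otherwise       = error "shuffle4: index out of range"

main :: IO ()
main =
  -- Lanes 0 and 3 come from the first vector, lanes 4 and 7 from the second.
  print (shuffle4 (1, 2, 3, 4) (10, 20, 30, 40) (0, 4, 3, 7) :: Vec4 Double)
  -- (1.0,10.0,4.0,40.0)
```

In the model the indices are ordinary run-time values; the real primops require them to be compile-time literals, as noted in the commit message.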
f3386a59 -
sheaf authored
This adds proper MachOps for broadcast instructions, allowing us to produce better code for broadcasting a value than simply packing that value (doing many vector insertions in a row). These are lowered in the X86 NCG and LLVM backends. In the LLVM backend, it uses the previously introduced shuffle instructions.
3b3dfb92 -
sheaf authored
This commit fixes the handling of signed zero in floating-point vector negation. A slight hack was introduced to work around the fact that Cmm doesn't currently have a notion of signed floating point literals (see get_float_broadcast_value_reg). This can be removed once CmmFloat can express the value -0.0. The simd006 test has been updated to use a stricter notion of equality of floating-point values, which ensures the validity of this change.
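The scalar sketch below shows why signed zero needs care and why ordinary equality cannot catch a mistake here. It uses subtraction from zero purely as an example of a negation that loses the sign of zero; it is not a claim about how the previous code generator worked. `negateViaSub` and `strictEq` are hypothetical names, and `strictEq` is only one possible stricter equality (it makes no attempt to treat NaN specially).

```haskell
-- A negation that looks right but maps 0.0 to 0.0 instead of -0.0.
negateViaSub :: Double -> Double
negateViaSub x = 0 - x

-- Hypothetical stricter equality: equal values that also agree on
-- whether they are negative zero.
strictEq :: Double -> Double -> Bool
strictEq x y = x == y && isNegativeZero x == isNegativeZero y

main :: IO ()
main = do
  print (isNegativeZero (negate 0.0 :: Double))    -- True
  print (isNegativeZero (negateViaSub 0.0))        -- False
  print (negate 0.0 == negateViaSub 0.0)           -- True: (==) cannot tell them apart
  print (strictEq (negate 0.0) (negateViaSub 0.0)) -- False
```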
aa5820fd -
sheaf authored
This commit adds min/max primops, such as
    minDouble# :: Double# -> Double# -> Double#
    minFloatX4# :: FloatX4# -> FloatX4# -> FloatX4#
    minWord16X8# :: Word16X8# -> Word16X8# -> Word16X8#
These are supported in:
  - the X86, AArch64 and PowerPC NCGs,
  - the LLVM backend,
  - the WebAssembly and JavaScript backends.
Fixes #25120
cec5908c -
sheaf authored
This commit modularises the RegClass datatype, allowing it to be used with architectures that have different register architectures, e.g. RISC-V which has separate floating-point and vector registers. The two modules GHC.Platform.Reg.Class.Unified and GHC.Platform.Reg.Class.Separate implement the two register architectures we currently support (corresponding to the two constructors of the GHC.Platform.Reg.Class.RegArch datatype).
22176f8b -
sheaf authored
941add96
-
sheaf authored
This commit fixes the code generation for C calls, to take into account the calling convention. This is particularly tricky on Windows, where all vectors are expected to be passed by reference. See Note [The Windows X64 C calling convention] in GHC.CmmToAsm.X86.CodeGen.
606c72e4 -
sheaf authored
This commit clarifies that the GHC calling convention, on X86_64, uses xmm1, ..., xmm6 for argument passing. It does not use xmm0, because that's the convention we asked the LLVM compiler authors to define for usage with GHC. This unfortunately means a discrepancy with the C calling convention (which does use xmm0, for the first argument and for the result). Fixes #25156
7d1a3cc5
Showing
- compiler/GHC/Builtin/primops.txt.pp 61 additions, 31 deletions
- compiler/GHC/ByteCode/Asm.hs 4 additions, 3 deletions
- compiler/GHC/Cmm.hs 1 addition, 1 deletion
- compiler/GHC/Cmm/CallConv.hs 34 additions, 31 deletions
- compiler/GHC/Cmm/Graph.hs 15 additions, 13 deletions
- compiler/GHC/Cmm/Lexer.x 20 additions, 18 deletions
- compiler/GHC/Cmm/Lint.hs 1 addition, 1 deletion
- compiler/GHC/Cmm/Liveness.hs 2 additions, 2 deletions
- compiler/GHC/Cmm/MachOp.hs 75 additions, 27 deletions
- compiler/GHC/Cmm/Node.hs 10 additions, 9 deletions
- compiler/GHC/Cmm/Opt.hs 23 additions, 3 deletions
- compiler/GHC/Cmm/Parser.y 19 additions, 17 deletions
- compiler/GHC/Cmm/ProcPoint.hs 1 addition, 1 deletion
- compiler/GHC/Cmm/Reg.hs 26 additions, 18 deletions
- compiler/GHC/Cmm/Sink.hs 1 addition, 1 deletion
- compiler/GHC/Cmm/Type.hs 20 additions, 14 deletions
- compiler/GHC/CmmToAsm.hs 4 additions, 3 deletions
- compiler/GHC/CmmToAsm/AArch64.hs 2 additions, 2 deletions
- compiler/GHC/CmmToAsm/AArch64/CodeGen.hs 138 additions, 10 deletions
- compiler/GHC/CmmToAsm/AArch64/Instr.hs 33 additions, 13 deletions