- 24 Sep, 2013 2 commits
-
-
Gabor Greif authored
-
Iavor S. Diatchki authored
This is used in the definition of `ToNat1` in the `base` library (module GHC.TypeLits).
-
- 23 Sep, 2013 31 commits
-
-
Krzysztof Gogolewski authored
-
parcs authored
whitehole_spin is only defined when PROF_SPIN is set.
-
parcs authored
*p is both read and written to by the cmpxchg instruction, and therefore should be given the '+' constraint modifier. (In GCC's extended ASM language, '+' means that the operand is both read and written to whereas '=' means that it is only written to.) Otherwise, the compiler is allowed to rewrite something like

    SpinLock lock;
    initSpinLock(&lock); /* sets lock = 1 */
    ACQUIRE_SPIN_LOCK(&lock);

into

    SpinLock lock;
    ACQUIRE_SPIN_LOCK(&lock);

because according to the asm statement, the previous value of 'lock' is not important.
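A minimal sketch of the constraint in question, assuming GCC or Clang. The names `cas_word`, `init_spin_lock`, `acquire_spin_lock`, and `release_spin_lock` are hypothetical stand-ins for this example, not the RTS definitions, and the non-x86 fallback to a compiler builtin is an addition of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the lock type: 1 = free, 0 = held. */
typedef volatile uintptr_t SpinLock;

/* Compare-and-swap: if *p == expected, store desired. Returns the old *p. */
static inline uintptr_t cas_word(volatile uintptr_t *p,
                                 uintptr_t expected, uintptr_t desired)
{
#if defined(__x86_64__) || defined(__i386__)
    uintptr_t result;
    __asm__ __volatile__(
        "lock; cmpxchg %3, %1"
        : "=a"(result),   /* old value comes back in the accumulator */
          "+m"(*p)        /* '+': *p is read AND written by cmpxchg */
        : "0"(expected), "r"(desired)
        : "cc", "memory");
    return result;
#else
    /* Assumption: on other targets, defer to the compiler builtin. */
    return __sync_val_compare_and_swap(p, expected, desired);
#endif
}

static inline void init_spin_lock(SpinLock *lock)
{
    *lock = 1;
}

static inline void acquire_spin_lock(SpinLock *lock)
{
    /* Spin until we are the one who changes the lock from 1 to 0. */
    while (cas_word(lock, 1, 0) != 1) {
        /* busy-wait */
    }
}

static inline void release_spin_lock(SpinLock *lock)
{
    assert(*lock == 0);     /* must currently be held */
    __sync_synchronize();   /* full barrier before handing the lock back */
    *lock = 1;
}
```

With `"=m"(*p)` in place of `"+m"(*p)`, the asm statement would claim that the old contents of `*p` are never read, which is what would let the compiler drop the store made by the initializer.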
-
Krzysztof Gogolewski authored
It has long been deprecated and was already removed from the output of ghc --help.
-
Simon Marlow authored
See also #5435. Now we have to remember the StablePtrs that get created by the module initializer so that we can free them again in unloadObj().
-
Simon Marlow authored
The problem with unreachable code is that it might refer to undefined registers. This happens accidentally: a block can be orphaned by an optimisation, for example when the result of a comparison becomes known. The register allocator panics when it finds an undefined register, because they shouldn't occur in generated code. So we need to also discard unreachable code to prevent this panic being triggered by optimisations. The register allocator already does a strongly-connected component analysis, so it ought to be easy to make it discard unreachable code as part of that traversal. It turns out that we need a different variant of the SCC algorithm to do that (see Digraph); however, the new variant also generates slightly better code by putting the blocks within a loop in a better order for register allocation.
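As a rough illustration of finding orphaned blocks, the sketch below marks every block reachable from an entry block using a plain depth-first traversal. It is not the SCC variant from Digraph mentioned above, and `Block`, `MAX_BLOCKS`, `MAX_SUCCS`, and `mark_reachable` are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BLOCKS 16
#define MAX_SUCCS  2

/* Hypothetical basic block: up to two successor indices, -1 means "none". */
typedef struct {
    int succs[MAX_SUCCS];
} Block;

/* Set reachable[i] to true for every block reachable from `entry`. */
static void mark_reachable(const Block *blocks, size_t n, int entry,
                           bool *reachable)
{
    int stack[MAX_BLOCKS];
    size_t top = 0;

    assert(n <= MAX_BLOCKS);
    for (size_t i = 0; i < n; i++) {
        reachable[i] = false;
    }
    if (entry < 0 || (size_t)entry >= n) {
        return;
    }

    /* Blocks are marked when pushed, so each is pushed at most once. */
    reachable[entry] = true;
    stack[top++] = entry;
    while (top > 0) {
        int b = stack[--top];
        for (int s = 0; s < MAX_SUCCS; s++) {
            int next = blocks[b].succs[s];
            if (next >= 0 && (size_t)next < n && !reachable[next]) {
                reachable[next] = true;
                stack[top++] = next;
            }
        }
    }
}
```

Any block left unmarked has no path from the entry and can be dropped before register allocation, so a reference to an undefined register inside it never reaches the allocator.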
-
Krzysztof Gogolewski authored
-
gmainlan@microsoft.com authored
-
gmainlan@microsoft.com authored
This merge revises and extends the current SIMD support in GHC. Notable features:

* Support for AVX, AVX2, and AVX-512. Support for AVX-512 is untested.

* SIMD primops are currently LLVM-only and documented in compiler/prelude/primops.txt.pp.

* By default only 128-bit wide SIMD vectors are passed in registers, and then only on the X86_64 architecture. There is a "hidden" flag, -fllvm-pass-vectors-in-regs, that causes GHC to generate LLVM code that assumes all vectors are passed in registers by LLVM. This can be used with a suitably patched version of LLVM, and if we get LLVM 3.4 patched, we can consider turning it on by default for LLVM 3.4+. This would mean that we couldn't mix LLVM <3.4-compiled object files with LLVM >=3.4-compiled object files, but I don't see that as much of a problem.

* utils/genprimopcode has been hacked up to allow us to write vector operations once and have them instantiated at multiple vector types. I'm not thrilled with this solution, but after discussing with Simon PJ, what I've implemented seems to be the minimal reasonable solution to the problem of exploding primop boilerplate. The changes are documented in compiler/prelude/primops.txt.pp.

* Error handling is sub-optimal. My patch checks to make sure that vector primops can be compiled efficiently based on the current set of dynamic flags. For example, if -mavx is not specified and the user tries to use a primop that adds together two 256-bit wide vectors of double-precision elements, the user will see an error message like:

    ghc-stage2: sorry! (unimplemented feature or known bug)
      (GHC version 7.7.20130916 for x86_64-unknown-linux):
        256-bit wide floating point SIMD vector instructions require at least -mavx.
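The flag check described in the last bullet might look roughly like the following in C. This is only a sketch: `SimdFlags` and `check_float_vector` are hypothetical names, the real check is part of the compiler and is not reproduced here, and only the 256-bit floating-point case and its message are taken from the text above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant dynamic flags. */
typedef struct {
    bool avx;   /* true when -mavx was given on the command line */
} SimdFlags;

/*
 * Returns NULL when a floating-point vector operation of the given width
 * can be compiled under `flags`, otherwise a message naming the missing
 * flag. Only the case stated in the commit message is modelled.
 */
static const char *check_float_vector(unsigned width_bits, SimdFlags flags)
{
    if (width_bits == 256 && !flags.avx) {
        return "256-bit wide floating point SIMD vector instructions "
               "require at least -mavx.";
    }
    return NULL;
}
```

Rejecting the operation up front, rather than falling back to a slower encoding, matches the stated goal of only accepting primops that can be compiled efficiently under the current flags.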
-
gmainlan@microsoft.com authored
SIMD vector instructions currently require the LLVM back-end. The set of available instructions also depends on the set of architecture flags specified on the command line.
-
gmainlan@microsoft.com authored
This sets the SSE "version" to 1.0.
-
gmainlan@microsoft.com authored
LLVM's GHC calling convention only allows 128-bit SIMD vectors to be passed in machine registers on X86-64. This may change in LLVM 3.4; the hidden flag -fllvm-pass-vectors-in-regs causes all SIMD vector widths to be passed in registers on both X86-64 and X86-32.
-
gmainlan@microsoft.com authored
-
gmainlan@microsoft.com authored
-
gmainlan@microsoft.com authored
-
gmainlan@microsoft.com authored
-
gmainlan@microsoft.com authored
-
gmainlan@microsoft.com authored
-
gmainlan@microsoft.com authored
-
gmainlan@microsoft.com authored
LLVM uses aligned AVX moves to spill values onto the stack, which requires 32-byte aligned stacks. Since the stack is only 16-byte aligned, LLVM inserts extra instructions that munge the stack pointer. This is very bad for the GHC calling convention, so we tell LLVM to assume the stack is 32-byte aligned. This patch rewrites the spill instructions that LLVM generates so they do not require an aligned stack.
-
gmainlan@microsoft.com authored
-
gmainlan@microsoft.com authored
-
gmainlan@microsoft.com authored
-
gmainlan@microsoft.com authored
-
gmainlan@microsoft.com authored
-
gmainlan@microsoft.com authored
width and element type. SIMD primops are now polymorphic in vector size and element type, but only internally to the compiler. More specifically, utils/genprimopcode has been extended so that it "knows" about SIMD vectors. This allows us to, for example, write a single definition for the "add two vectors" primop in primops.txt.pp and have it instantiated at many vector types. This generates a primop in GHC.Prim for each vector type at which "add two vectors" is instantiated, but only one data constructor for the PrimOp data type, so the code generator is much, much simpler.
-
gmainlan@microsoft.com authored
GHC.PrimopWrappers is only used by GHCi, which cannot evaluate LLVM-only primops in any case.
-
Note that this will only work with the LLVM back end pending LLVM patches to change the GHC calling convention appropriately.
-
On x86-32, the C calling convention specifies that when SSE2 is enabled, vector arguments are passed in xmm* registers; however, float and double arguments are still passed on the stack. This patch allows us to make the same choice for GHC. Even when SSE2 is enabled, we don't want to pass Float and Double arguments in registers because this would change the ABI and break the ability to link with code that was compiled without -msse2. The next patch will enable passing vector arguments in xmm registers on x86-32.
-
gmainlan@microsoft.com authored
-
Austin Seipp authored
GHCi now runs constructors for linked libraries.
Signed-off-by: Austin Seipp <austin@well-typed.com>
-
- 22 Sep, 2013 3 commits
-
-
Austin Seipp authored
This commit exposes GHC's internal compiler pipeline through a `Hooks` module in the GHC API. It currently allows you to hook:

* Foreign import/export declarations
* The frontend up to type checking
* The one-shot compilation mode
* Core compilation, and the module iface
* Linking and the phases in DriverPhases.hs
* Quasiquotation

Authored-by: Luite Stegeman <stegeman@gmail.com>
Authored-by: Edsko de Vries <edsko@well-typed.com>
Signed-off-by: Austin Seipp <austin@well-typed.com>
-
-
Also a small formatting change in GHCi :help
-
- 21 Sep, 2013 2 commits
-
-
Herbert Valerio Riedel authored
-
Krzysztof Gogolewski authored
-
- 20 Sep, 2013 2 commits
-
-
-
Krzysztof Gogolewski authored
-