Commit 42504f4a authored by Carter Schonwald

removing x87 register support from native code gen

* simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 constructors
* makes -msse2 assumed/default for x86 platforms, fixing a long-standing nondeterminism in rounding
behavior in 32-bit Haskell code
* removes the 80-bit floating point representation from the supported float sizes
* there's still one tiny bit of x87 support needed,
for handling float and double return values of FFI calls wrt the C ABI on x86_32,
but this one piece does not leak into the rest of the NCG.
* lots of code that hasn't been touched in a long time got deleted as a
consequence of all of this

All in all, this change paves the way for many further improvements in how
GHC handles floating point computations, and makes the native code gen more
accessible to a larger pool of contributors.
parent be0dde8e
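For reference, a minimal sketch of the simplified width and format types this patch leaves behind (constructors taken from the hunks below; the integer formats II8/II16/II32 and the exact deriving clauses are assumptions):

data Width = W8 | W16 | W32 | W64
           | W128
           | W256
           | W512
           deriving (Eq, Ord, Show)

data Format
   = II8 | II16 | II32 | II64
   | FF32
   | FF64
   deriving (Show, Eq)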
......@@ -81,7 +81,6 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments)
| passFloatInXmm -> k (RegisterParam (DoubleReg s), (vs, fs, ds, ls, ss))
(W64, (vs, fs, d:ds, ls, ss))
| not passFloatInXmm -> k (RegisterParam d, (vs, fs, ds, ls, ss))
(W80, _) -> panic "F80 unsupported register type"
_ -> (assts, (r:rs))
int = case (w, regs) of
(W128, _) -> panic "W128 unsupported register type"
......@@ -100,6 +99,7 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments)
passFloatArgsInXmm :: DynFlags -> Bool
passFloatArgsInXmm dflags = case platformArch (targetPlatform dflags) of
ArchX86_64 -> True
ArchX86 -> False
_ -> False
-- We used to spill vector registers to the stack since the LLVM backend didn't
......
......@@ -474,6 +474,9 @@ instance Eq GlobalReg where
FloatReg i == FloatReg j = i==j
DoubleReg i == DoubleReg j = i==j
LongReg i == LongReg j = i==j
-- NOTE: XMM, YMM and ZMM registers overlap: XMM i is the low part of YMM i,
-- so a value stored via YMM i can be read back from XMM i,
-- and similarly for ZMM.
XmmReg i == XmmReg j = i==j
YmmReg i == YmmReg j = i==j
ZmmReg i == ZmmReg j = i==j
......@@ -584,6 +587,9 @@ globalRegType dflags (VanillaReg _ VNonGcPtr) = bWord dflags
globalRegType _ (FloatReg _) = cmmFloat W32
globalRegType _ (DoubleReg _) = cmmFloat W64
globalRegType _ (LongReg _) = cmmBits W64
-- TODO: improve the internal model of SIMD/vectorized registers
-- the right design should improve handling of float and double code too.
-- see remarks in "NOTE [SIMD Design for the future]" in StgCmmPrim
globalRegType _ (XmmReg _) = cmmVec 4 (cmmBits W32)
globalRegType _ (YmmReg _) = cmmVec 8 (cmmBits W32)
globalRegType _ (ZmmReg _) = cmmVec 16 (cmmBits W32)
......
......@@ -166,9 +166,6 @@ isFloat64 _other = False
-----------------------------------------------------------------------------
data Width = W8 | W16 | W32 | W64
| W80 -- Extended double-precision float,
-- used in x86 native codegen only.
-- (we use Ord, so it'd better be in this order)
| W128
| W256
| W512
......@@ -185,7 +182,7 @@ mrStr W64 = sLit("W64")
mrStr W128 = sLit("W128")
mrStr W256 = sLit("W256")
mrStr W512 = sLit("W512")
mrStr W80 = sLit("W80")
-------- Common Widths ------------
......@@ -222,7 +219,7 @@ widthInBits W64 = 64
widthInBits W128 = 128
widthInBits W256 = 256
widthInBits W512 = 512
widthInBits W80 = 80
widthInBytes :: Width -> Int
widthInBytes W8 = 1
......@@ -232,7 +229,7 @@ widthInBytes W64 = 8
widthInBytes W128 = 16
widthInBytes W256 = 32
widthInBytes W512 = 64
widthInBytes W80 = 10
widthFromBytes :: Int -> Width
widthFromBytes 1 = W8
......@@ -242,7 +239,7 @@ widthFromBytes 8 = W64
widthFromBytes 16 = W128
widthFromBytes 32 = W256
widthFromBytes 64 = W512
widthFromBytes 10 = W80
widthFromBytes n = pprPanic "no width for given number of bytes" (ppr n)
-- log_2 of the width in bytes, useful for generating shifts.
......@@ -254,7 +251,7 @@ widthInLog W64 = 3
widthInLog W128 = 4
widthInLog W256 = 5
widthInLog W512 = 6
widthInLog W80 = panic "widthInLog: F80"
-- widening / narrowing
......
......@@ -1727,8 +1727,38 @@ vecElemProjectCast dflags WordVec W32 = Just (mo_u_32ToWord dflags)
vecElemProjectCast _ WordVec W64 = Nothing
vecElemProjectCast _ _ _ = Nothing
-- NOTE [SIMD Design for the future]
-- Check to make sure that we can generate code for the specified vector type
-- given the current set of dynamic flags.
-- Currently these checks are specific to x86 and x86_64 architecture.
-- This should be fixed!
-- In particular,
-- 1) Add better support for other architectures! (this may require a redesign)
-- 2) Decouple design choices from LLVM's pseudo SIMD model!
--    The high-level LLVM naive representation makes per-CPU-family SIMD
--    generation its own optimization problem, and hides important differences
--    in e.g. ARM vs x86_64 SIMD.
-- 3) Depending on the architecture, the SIMD registers may also support
--    general computations on Float/Double/Word/Int scalars, but currently, on
--    for example x86_64, we always put Word/Int (or sized) values in GPRs
--    (general purpose registers). Would relaxing that allow for
--    useful optimization opportunities?
--    Phrased differently, it is worth experimenting with supporting
--    different register mapping strategies than we currently have, especially
--    if someday we want SIMD to be a first-class denizen in GHC along with
--    scalar values!
--    The current design with respect to register mapping of scalars could
--    very well be the best, but exploring the design space and doing careful
--    measurements is the only way to validate that.
--    In some next-generation CPU ISAs, notably RISC-V, the SIMD extension
--    includes support for a sort of run-time, CPU-dependent vectorization
--    parameter, where a loop may act upon a single scalar each iteration OR
--    some 2, 4, 8, ... element chunk! Time will tell if that direction sees
--    wide adoption, but it is from that context that unifying our handling of
--    SIMD and scalars may benefit. It is not likely to benefit current
--    architectures, though it may very well be a design perspective that
--    helps guide improving the NCG.
checkVecCompatibility :: DynFlags -> PrimOpVecCat -> Length -> Width -> FCode ()
checkVecCompatibility dflags vcat l w = do
when (hscTarget dflags /= HscLlvm) $ do
......
......@@ -97,7 +97,6 @@ cmmToLlvmType ty | isVecType ty = LMVector (vecLength ty) (cmmToLlvmType (vecE
widthToLlvmFloat :: Width -> LlvmType
widthToLlvmFloat W32 = LMFloat
widthToLlvmFloat W64 = LMDouble
widthToLlvmFloat W80 = LMFloat80
widthToLlvmFloat W128 = LMFloat128
widthToLlvmFloat w = panic $ "widthToLlvmFloat: Bad float size: " ++ show w
......
......@@ -5833,20 +5833,24 @@ data SseVersion = SSE1
isSseEnabled :: DynFlags -> Bool
isSseEnabled dflags = case platformArch (targetPlatform dflags) of
ArchX86_64 -> True
ArchX86 -> sseVersion dflags >= Just SSE1
ArchX86 -> True
_ -> False
isSse2Enabled :: DynFlags -> Bool
isSse2Enabled dflags = case platformArch (targetPlatform dflags) of
ArchX86_64 -> -- SSE2 is fixed on for x86_64. It would be
-- possible to make it optional, but we'd need to
-- fix at least the foreign call code where the
-- calling convention specifies the use of xmm regs,
-- and possibly other places.
True
ArchX86 -> sseVersion dflags >= Just SSE2
-- We assume SSE1 and SSE2 operations are available on both
-- x86 and x86_64. Historically we didn't default to SSE1 and
-- SSE2 on x86, which resulted in de facto nondeterminism in how
-- rounding behaves in the associated x87 floating point instructions,
-- because variations in the spill/FPU-stack placement of arguments for
-- operations would change the precision and final result of what
-- would otherwise be the same expressions with respect to single- or
-- double-precision IEEE floating point computations.
ArchX86_64 -> True
ArchX86 -> True
_ -> False
isSse4_2Enabled :: DynFlags -> Bool
isSse4_2Enabled dflags = sseVersion dflags >= Just SSE42
......
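A minimal illustration (not taken from GHC or its testsuite) of the nondeterminism described above: with IEEE 64-bit (SSE2) arithmetic each addition rounds to Double, so the expression below evaluates to 2^53, whereas if the intermediate sum is kept at 80-bit x87 precision and only the final value is rounded, it evaluates to 2^53 + 2. Whether an intermediate got spilled (and therefore rounded to 64 bits) depended on register pressure, hence the nondeterminism.

module Main where

-- (2^53 + 1) + 1: rounding each step to Double gives 2^53;
-- keeping 80-bit precision throughout gives 2^53 + 2.
doubleRounding :: Double
doubleRounding = (9007199254740992 + 1) + 1

main :: IO ()
main = print doubleRounding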
......@@ -179,7 +179,7 @@ nativeCodeGen dflags this_mod modLoc h us cmms
x86NcgImpl :: DynFlags -> NcgImpl (Alignment, CmmStatics)
X86.Instr.Instr X86.Instr.JumpDest
x86NcgImpl dflags
= (x86_64NcgImpl dflags) { ncg_x86fp_kludge = map x86fp_kludge }
= (x86_64NcgImpl dflags)
x86_64NcgImpl :: DynFlags -> NcgImpl (Alignment, CmmStatics)
X86.Instr.Instr X86.Instr.JumpDest
......@@ -194,7 +194,6 @@ x86_64NcgImpl dflags
,pprNatCmmDecl = X86.Ppr.pprNatCmmDecl
,maxSpillSlots = X86.Instr.maxSpillSlots dflags
,allocatableRegs = X86.Regs.allocatableRegs platform
,ncg_x86fp_kludge = id
,ncgAllocMoreStack = X86.Instr.allocMoreStack platform
,ncgExpandTop = id
,ncgMakeFarBranches = const id
......@@ -215,7 +214,6 @@ ppcNcgImpl dflags
,pprNatCmmDecl = PPC.Ppr.pprNatCmmDecl
,maxSpillSlots = PPC.Instr.maxSpillSlots dflags
,allocatableRegs = PPC.Regs.allocatableRegs platform
,ncg_x86fp_kludge = id
,ncgAllocMoreStack = PPC.Instr.allocMoreStack platform
,ncgExpandTop = id
,ncgMakeFarBranches = PPC.Instr.makeFarBranches
......@@ -236,7 +234,6 @@ sparcNcgImpl dflags
,pprNatCmmDecl = SPARC.Ppr.pprNatCmmDecl
,maxSpillSlots = SPARC.Instr.maxSpillSlots dflags
,allocatableRegs = SPARC.Regs.allocatableRegs
,ncg_x86fp_kludge = id
,ncgAllocMoreStack = noAllocMoreStack
,ncgExpandTop = map SPARC.CodeGen.Expand.expandTop
,ncgMakeFarBranches = const id
......@@ -680,19 +677,10 @@ cmmNativeGen dflags this_mod modLoc ncgImpl us fileIds dbgMap cmm count
foldl' (\m (from,to) -> addImmediateSuccessor from to m )
cfgWithFixupBlks stack_updt_blks
---- x86fp_kludge. This pass inserts ffree instructions to clear
---- the FPU stack on x86. The x86 ABI requires that the FPU stack
---- is clear, and library functions can return odd results if it
---- isn't.
----
---- NB. must happen before shortcutBranches, because that
---- generates JXX_GBLs which we can't fix up in x86fp_kludge.
let kludged = {-# SCC "x86fp_kludge" #-} ncg_x86fp_kludge ncgImpl alloced
---- generate jump tables
let tabled =
{-# SCC "generateJumpTables" #-}
generateJumpTables ncgImpl kludged
generateJumpTables ncgImpl alloced
dumpIfSet_dyn dflags
Opt_D_dump_cfg_weights "CFG Update information"
......@@ -787,12 +775,6 @@ checkLayout procsUnsequenced procsSequenced =
getBlockIds (CmmProc _ _ _ (ListGraph blocks)) =
setFromList $ map blockId blocks
x86fp_kludge :: NatCmmDecl (Alignment, CmmStatics) X86.Instr.Instr -> NatCmmDecl (Alignment, CmmStatics) X86.Instr.Instr
x86fp_kludge top@(CmmData _ _) = top
x86fp_kludge (CmmProc info lbl live (ListGraph code)) =
CmmProc info lbl live (ListGraph $ X86.Instr.i386_insert_ffrees code)
-- | Compute unwinding tables for the blocks of a procedure
computeUnwinding :: Instruction instr
=> DynFlags -> NcgImpl statics instr jumpDest
......
......@@ -47,7 +47,6 @@ data Format
| II64
| FF32
| FF64
| FF80
deriving (Show, Eq)
......@@ -70,7 +69,7 @@ floatFormat width
= case width of
W32 -> FF32
W64 -> FF64
W80 -> FF80
other -> pprPanic "Format.floatFormat" (ppr other)
......@@ -80,7 +79,6 @@ isFloatFormat format
= case format of
FF32 -> True
FF64 -> True
FF80 -> True
_ -> False
......@@ -101,7 +99,7 @@ formatToWidth format
II64 -> W64
FF32 -> W32
FF64 -> W64
FF80 -> W80
formatInBytes :: Format -> Int
formatInBytes = widthInBytes . formatToWidth
......@@ -76,7 +76,6 @@ data NcgImpl statics instr jumpDest = NcgImpl {
pprNatCmmDecl :: NatCmmDecl statics instr -> SDoc,
maxSpillSlots :: Int,
allocatableRegs :: [RealReg],
ncg_x86fp_kludge :: [NatCmmDecl statics instr] -> [NatCmmDecl statics instr],
ncgExpandTop :: [NatCmmDecl statics instr] -> [NatCmmDecl statics instr],
ncgAllocMoreStack :: Int -> NatCmmDecl statics instr
-> UniqSM (NatCmmDecl statics instr, [(BlockId,BlockId)]),
......
......@@ -1593,7 +1593,7 @@ genCCall'
-> [CmmActual] -- arguments (of mixed type)
-> NatM InstrBlock
{-
PowerPC Linux uses the System V Release 4 Calling Convention
for PowerPC. It is described in the
"System V Application Binary Interface PowerPC Processor Supplement".
......@@ -1906,7 +1906,7 @@ genCCall' dflags gcp target dest_regs args
FF32 -> (1, 1, 4, fprs)
FF64 -> (2, 1, 8, fprs)
II64 -> panic "genCCall' passArguments II64"
FF80 -> panic "genCCall' passArguments FF80"
GCP32ELF ->
case cmmTypeFormat rep of
II8 -> (1, 0, 4, gprs)
......@@ -1916,7 +1916,6 @@ genCCall' dflags gcp target dest_regs args
FF32 -> (0, 1, 4, fprs)
FF64 -> (0, 1, 8, fprs)
II64 -> panic "genCCall' passArguments II64"
FF80 -> panic "genCCall' passArguments FF80"
GCP64ELF _ ->
case cmmTypeFormat rep of
II8 -> (1, 0, 8, gprs)
......@@ -1928,7 +1927,6 @@ genCCall' dflags gcp target dest_regs args
-- the FPRs.
FF32 -> (1, 1, 8, fprs)
FF64 -> (1, 1, 8, fprs)
FF80 -> panic "genCCall' passArguments FF80"
moveResult reduceToFF32 =
case dest_regs of
......
......@@ -161,7 +161,7 @@ pprReg r
RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u
RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u
RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u
RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u
where
ppr_reg_no :: Int -> SDoc
ppr_reg_no i
......@@ -179,8 +179,7 @@ pprFormat x
II32 -> sLit "w"
II64 -> sLit "d"
FF32 -> sLit "fs"
FF64 -> sLit "fd"
_ -> panic "PPC.Ppr.pprFormat: no match")
FF64 -> sLit "fd")
pprCond :: Cond -> SDoc
......@@ -365,7 +364,6 @@ pprInstr (LD fmt reg addr) = hcat [
II64 -> sLit "d"
FF32 -> sLit "fs"
FF64 -> sLit "fd"
_ -> panic "PPC.Ppr.pprInstr: no match"
),
case addr of AddrRegImm _ _ -> empty
AddrRegReg _ _ -> char 'x',
......@@ -405,7 +403,6 @@ pprInstr (LA fmt reg addr) = hcat [
II64 -> sLit "d"
FF32 -> sLit "fs"
FF64 -> sLit "fd"
_ -> panic "PPC.Ppr.pprInstr: no match"
),
case addr of AddrRegImm _ _ -> empty
AddrRegReg _ _ -> char 'x',
......
......@@ -131,7 +131,7 @@ regDotColor reg
RcInteger -> text "blue"
RcFloat -> text "red"
RcDouble -> text "green"
RcDoubleSSE -> text "yellow"
-- immediates ------------------------------------------------------------------
......
......@@ -56,7 +56,7 @@ data VirtualReg
| VirtualRegHi {-# UNPACK #-} !Unique -- High part of 2-word register
| VirtualRegF {-# UNPACK #-} !Unique
| VirtualRegD {-# UNPACK #-} !Unique
| VirtualRegSSE {-# UNPACK #-} !Unique
deriving (Eq, Show)
-- This is laborious, but necessary. We can't derive Ord because
......@@ -69,15 +69,14 @@ instance Ord VirtualReg where
compare (VirtualRegHi a) (VirtualRegHi b) = nonDetCmpUnique a b
compare (VirtualRegF a) (VirtualRegF b) = nonDetCmpUnique a b
compare (VirtualRegD a) (VirtualRegD b) = nonDetCmpUnique a b
compare (VirtualRegSSE a) (VirtualRegSSE b) = nonDetCmpUnique a b
compare VirtualRegI{} _ = LT
compare _ VirtualRegI{} = GT
compare VirtualRegHi{} _ = LT
compare _ VirtualRegHi{} = GT
compare VirtualRegF{} _ = LT
compare _ VirtualRegF{} = GT
compare VirtualRegD{} _ = LT
compare _ VirtualRegD{} = GT
instance Uniquable VirtualReg where
......@@ -87,16 +86,18 @@ instance Uniquable VirtualReg where
VirtualRegHi u -> u
VirtualRegF u -> u
VirtualRegD u -> u
VirtualRegSSE u -> u
instance Outputable VirtualReg where
ppr reg
= case reg of
VirtualRegI u -> text "%vI_" <> pprUniqueAlways u
VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u
VirtualRegF u -> text "%vF_" <> pprUniqueAlways u
VirtualRegD u -> text "%vD_" <> pprUniqueAlways u
VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u
-- this code is slightly wrong on x86,
-- because Float and Double occupy the same register set,
-- namely the SSE2 registers xmm0 .. xmm15
VirtualRegF u -> text "%vFloat_" <> pprUniqueAlways u
VirtualRegD u -> text "%vDouble_" <> pprUniqueAlways u
renameVirtualReg :: Unique -> VirtualReg -> VirtualReg
......@@ -106,7 +107,6 @@ renameVirtualReg u r
VirtualRegHi _ -> VirtualRegHi u
VirtualRegF _ -> VirtualRegF u
VirtualRegD _ -> VirtualRegD u
VirtualRegSSE _ -> VirtualRegSSE u
classOfVirtualReg :: VirtualReg -> RegClass
......@@ -116,7 +116,7 @@ classOfVirtualReg vr
VirtualRegHi{} -> RcInteger
VirtualRegF{} -> RcFloat
VirtualRegD{} -> RcDouble
VirtualRegSSE{} -> RcDoubleSSE
-- Determine the upper-half vreg for a 64-bit quantity on a 32-bit platform
......
......@@ -134,6 +134,10 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcInteger conflicts excl
trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclusions
| let cALLOCATABLE_REGS_FLOAT
= (case platformArch platform of
-- On x86_64 and x86, Float and Double values
-- use the same (XMM) registers,
-- so we use only RcDouble to represent the
-- register allocation problem for both types.
ArchX86 -> 0
ArchX86_64 -> 0
ArchPPC -> 0
......@@ -160,8 +164,14 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclus
trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclusions
| let cALLOCATABLE_REGS_DOUBLE
= (case platformArch platform of
ArchX86 -> 6
ArchX86_64 -> 0
ArchX86 -> 8
-- in x86 32-bit mode with SSE2 there are only
-- 8 XMM registers, xmm0 ... xmm7
ArchX86_64 -> 10
-- on x86_64 there are 16 XMM registers,
-- xmm0 .. xmm15; here 10 is a
-- "don't need to solve conflicts" count that
-- was chosen at some point in the past.
ArchPPC -> 26
ArchSPARC -> 11
ArchSPARC64 -> panic "trivColorable ArchSPARC64"
......@@ -183,31 +193,7 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclu
= count3 < cALLOCATABLE_REGS_DOUBLE
trivColorable platform virtualRegSqueeze realRegSqueeze RcDoubleSSE conflicts exclusions
| let cALLOCATABLE_REGS_SSE
= (case platformArch platform of
ArchX86 -> 8
ArchX86_64 -> 10
ArchPPC -> 0
ArchSPARC -> 0
ArchSPARC64 -> panic "trivColorable ArchSPARC64"
ArchPPC_64 _ -> 0
ArchARM _ _ _ -> panic "trivColorable ArchARM"
ArchARM64 -> panic "trivColorable ArchARM64"
ArchAlpha -> panic "trivColorable ArchAlpha"
ArchMipseb -> panic "trivColorable ArchMipseb"
ArchMipsel -> panic "trivColorable ArchMipsel"
ArchJavaScript-> panic "trivColorable ArchJavaScript"
ArchUnknown -> panic "trivColorable ArchUnknown")
, count2 <- accSqueeze 0 cALLOCATABLE_REGS_SSE
(virtualRegSqueeze RcDoubleSSE)
conflicts
, count3 <- accSqueeze count2 cALLOCATABLE_REGS_SSE
(realRegSqueeze RcDoubleSSE)
exclusions
= count3 < cALLOCATABLE_REGS_SSE
-- Specification Code ----------------------------------------------------------
......
......@@ -18,7 +18,6 @@ data RegClass
= RcInteger
| RcFloat
| RcDouble
| RcDoubleSSE -- x86 only: the SSE regs are a separate class
deriving Eq
......@@ -26,10 +25,8 @@ instance Uniquable RegClass where
getUnique RcInteger = mkRegClassUnique 0
getUnique RcFloat = mkRegClassUnique 1
getUnique RcDouble = mkRegClassUnique 2
getUnique RcDoubleSSE = mkRegClassUnique 3
instance Outputable RegClass where
ppr RcInteger = Outputable.text "I"
ppr RcFloat = Outputable.text "F"
ppr RcDouble = Outputable.text "D"
ppr RcDoubleSSE = Outputable.text "S"
......@@ -384,7 +384,6 @@ sparc_mkSpillInstr dflags reg _ slot
RcInteger -> II32
RcFloat -> FF32
RcDouble -> FF64
_ -> panic "sparc_mkSpillInstr"
in ST fmt reg (fpRel (negate off_w))
......@@ -405,7 +404,6 @@ sparc_mkLoadInstr dflags reg _ slot
RcInteger -> II32
RcFloat -> FF32
RcDouble -> FF64
_ -> panic "sparc_mkLoadInstr"
in LD fmt (fpRel (- off_w)) reg
......@@ -454,7 +452,6 @@ sparc_mkRegRegMoveInstr platform src dst
RcInteger -> ADD False False src (RIReg g0) dst
RcDouble -> FMOV FF64 src dst
RcFloat -> FMOV FF32 src dst
_ -> panic "sparc_mkRegRegMoveInstr"
| otherwise
= panic "SPARC.Instr.mkRegRegMoveInstr: classes of src and dest not the same"
......
......@@ -143,7 +143,7 @@ pprReg reg
VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u
VirtualRegF u -> text "%vF_" <> pprUniqueAlways u
VirtualRegD u -> text "%vD_" <> pprUniqueAlways u
VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u
RegReal rr
-> case rr of
......@@ -211,8 +211,7 @@ pprFormat x
II32 -> sLit ""
II64 -> sLit "d"
FF32 -> sLit ""
FF64 -> sLit "d"
_ -> panic "SPARC.Ppr.pprFormat: no match")
FF64 -> sLit "d")
-- | Pretty print a format for an instruction suffix.
......@@ -226,8 +225,8 @@ pprStFormat x
II32 -> sLit ""
II64 -> sLit "x"
FF32 -> sLit ""
FF64 -> sLit "d"
_ -> panic "SPARC.Ppr.pprFormat: no match")
FF64 -> sLit "d")
-- | Pretty print a condition code.
......
......@@ -104,7 +104,6 @@ virtualRegSqueeze cls vr
VirtualRegD{} -> 1
_other -> 0
_other -> 0
{-# INLINE realRegSqueeze #-}
realRegSqueeze :: RegClass -> RealReg -> Int
......@@ -135,7 +134,6 @@ realRegSqueeze cls rr
RealRegPair{} -> 1
_other -> 0
-- | All the allocatable registers in the machine,
-- including register pairs.
......
......@@ -10,7 +10,7 @@
module X86.Instr (Instr(..), Operand(..), PrefetchVariant(..), JumpDest(..),
getJumpDestBlockId, canShortcut, shortcutStatics,
shortcutJump, i386_insert_ffrees, allocMoreStack,
shortcutJump, allocMoreStack,
maxSpillSlots, archWordFormat )
where
......@@ -240,46 +240,14 @@ data Instr
| BT Format Imm Operand
| NOP
-- x86 Float Arithmetic.
-- Note that we cheat by treating G{ABS,MOV,NEG} of doubles
-- as single instructions right up until we spit them out.
-- all the 3-operand fake fp insns are src1 src2 dst
-- and furthermore are constrained to be fp regs only.
-- IMPORTANT: keep is_G_insn up to date with any changes here
| GMOV Reg Reg -- src(fpreg), dst(fpreg)
| GLD Format AddrMode Reg -- src, dst(fpreg)
| GST Format Reg AddrMode -- src(fpreg), dst
| GLDZ Reg -- dst(fpreg)
| GLD1 Reg -- dst(fpreg)
| GFTOI Reg Reg -- src(fpreg), dst(intreg)
| GDTOI Reg Reg -- src(fpreg), dst(intreg)
| GITOF Reg Reg -- src(intreg), dst(fpreg)
| GITOD Reg Reg -- src(intreg), dst(fpreg)
| GDTOF Reg Reg -- src(fpreg), dst(fpreg)
| GADD Format Reg Reg Reg -- src1, src2, dst
| GDIV Format Reg Reg Reg -- src1, src2, dst
| GSUB Format Reg Reg Reg -- src1, src2, dst
| GMUL Format Reg Reg Reg -- src1, src2, dst
-- FP compare. Cond must be `elem` [EQQ, NE, LE, LTT, GE, GTT]
-- Compare src1 with src2; set the Zero flag iff the numbers are
-- comparable and the comparison is True. Subsequent code must
-- test the %eflags zero flag regardless of the supplied Cond.
| GCMP Cond Reg Reg -- src1, src2
| GABS Format Reg Reg -- src, dst
| GNEG Format Reg Reg -- src, dst
| GSQRT Format Reg Reg -- src, dst
| GSIN Format CLabel CLabel Reg Reg -- src, dst
| GCOS Format CLabel CLabel Reg Reg -- src, dst
| GTAN Format CLabel CLabel Reg Reg -- src, dst
| GFREE -- do ffree on all x86 regs; an ugly hack
-- We need to support the FSTP (x87 store and pop) instruction
-- so that we can correctly read off the return value of an
-- x86 CDECL C function call when it's floating point.
-- So we don't include a register argument and just use st(0);
-- this instruction is used ONLY for return values of C FFI calls
-- in the x86_32 ABI.
| X87Store Format AddrMode -- st(0), dst
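
To illustrate the one remaining use of x87 mentioned in the commit message, here is a small, hypothetical Haskell FFI example (not from the tree): on 32-bit x86 the C calling convention returns the double result of c_sqrt in st(0), which is exactly the value X87Store exists to move out into memory so the rest of the SSE2-based code generator never touches x87.

{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

-- On x86_32 the C ABI returns floating point results on the x87 stack
-- in st(0); the NCG emits X87Store to write that value to memory before
-- the SSE2-based code picks it up.
foreign import ccall unsafe "math.h sqrt"
  c_sqrt :: Double -> Double

main :: IO ()
main = print (c_sqrt 2.0)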