WIP: AArch64 NCG
- Basic NCG that can compile trivial Haskell programs.
- Add an Enum Set with optimisation flags to NatM, and the corresponding `-fasm-opt-<flag>` flags. These should eventually all be on by default, but having separate flags will allow us to measure performance impacts precisely and write tests that verify that certain optimisations actually happen.
  - `jumptbl`: Generate Jump Tables
  - `regoff`: Destructure Reg Offsets
  - `zeroreg`: Return `wzr`, `xzr` for 0 loads, instead of assigning them to registers.
  - `immload`: Try to load immediates in as few instructions as possible.
  - ...
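A minimal sketch of how such a flag set could look, assuming a plain bit set stands in for the Enum Set and that flag names follow the list above. `AsmOpt`, `AsmOptSet`, `parseAsmOpt` and `fromArgs` are hypothetical names for illustration, not existing GHC API.

```haskell
import Data.Bits (setBit, testBit)
import Data.List (foldl', stripPrefix)
import Data.Word (Word64)

-- Hypothetical optimisation flags, named after the list above.
data AsmOpt = JumpTbl | RegOff | ZeroReg | ImmLoad
  deriving (Show, Eq, Enum, Bounded)

-- A tiny enum set: bit n is set iff the flag with fromEnum == n is enabled.
newtype AsmOptSet = AsmOptSet Word64
  deriving (Eq, Show)

emptyOpts :: AsmOptSet
emptyOpts = AsmOptSet 0

enable :: AsmOpt -> AsmOptSet -> AsmOptSet
enable f (AsmOptSet w) = AsmOptSet (setBit w (fromEnum f))

enabled :: AsmOpt -> AsmOptSet -> Bool
enabled f (AsmOptSet w) = testBit w (fromEnum f)

-- The <flag> part of -fasm-opt-<flag>.
flagName :: AsmOpt -> String
flagName JumpTbl = "jumptbl"
flagName RegOff  = "regoff"
flagName ZeroReg = "zeroreg"
flagName ImmLoad = "immload"

-- Map an argument such as "-fasm-opt-jumptbl" to a flag, if it is one.
parseAsmOpt :: String -> Maybe AsmOpt
parseAsmOpt arg = do
  name <- stripPrefix "-fasm-opt-" arg
  lookup name [ (flagName f, f) | f <- [minBound .. maxBound] ]

-- Collect all recognised flags from an argument list; others are ignored.
fromArgs :: [String] -> AsmOptSet
fromArgs = foldl' (\s a -> maybe s (`enable` s) (parseAsmOpt a)) emptyOpts
```

A code generation step could then guard each optimisation with something like `enabled JumpTbl opts`, which is what would make per-flag measurements and tests possible.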
- Add `ANN SDoc Inst`, that prints as `ppr $inst <pad to 80># $comment`, to allow adding inline comments to the assembly for better readability. The `COMMENT` pseudo instruction isn't very good for that.
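The padded layout could be sketched as follows, assuming plain `String`s in place of `SDoc` and an already pretty-printed instruction. `annotate` is a hypothetical helper, not part of the NCG.

```haskell
-- Render an instruction followed by a comment that starts at column `col`.
-- If the instruction is already wider than `col`, fall back to one space.
annotate :: Int -> String -> String -> String
annotate col inst comment = inst ++ replicate pad ' ' ++ "# " ++ comment
  where
    pad = max 1 (col - length inst)
```

With `annotate 80 "mov x0, x1" "copy argument"`, the `#` lands right after column 80, so comments line up across instructions of different widths.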
- Add fuse phase. This would need to run after the register allocator, and turn consecutive LDR/STR instructions into LDP/STP instructions. Basically an OL -> OL fold.
  A fuse phase would allow more general transformations. However, for the LDR/STR situation having a mass-spill, mass-reload hook could also work.
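A rough sketch of the pairing step, assuming a heavily simplified instruction type, 64-bit registers only, and a plain list in place of an OL. `Instr`, `Reg`, `fuse` and `pairOffset` are hypothetical names. The offset check reflects the LDP/STP immediate form for 64-bit registers (a multiple of 8 in the range -512..504).

```haskell
data Reg = X Int | SP
  deriving (Eq, Show)

data Instr
  = LDR Reg Reg Int      -- ldr rt, [base, #off]
  | STR Reg Reg Int      -- str rt, [base, #off]
  | LDP Reg Reg Reg Int  -- ldp rt1, rt2, [base, #off]
  | STP Reg Reg Reg Int  -- stp rt1, rt2, [base, #off]
  | Other String         -- anything else, kept as-is
  deriving (Eq, Show)

-- Offsets that fit the signed, 8-byte-scaled 7-bit immediate of LDP/STP.
pairOffset :: Int -> Bool
pairOffset o = o `mod` 8 == 0 && o >= -512 && o <= 504

-- Fuse two adjacent accesses to [base, #o] and [base, #o + 8] into a pair.
fuse :: [Instr] -> [Instr]
fuse (LDR a b1 o1 : LDR c b2 o2 : rest)
  -- The first load must not clobber the base, and the targets must differ.
  | b1 == b2, o2 == o1 + 8, a /= c, a /= b1, pairOffset o1
  = LDP a c b1 o1 : fuse rest
fuse (STR a b1 o1 : STR c b2 o2 : rest)
  | b1 == b2, o2 == o1 + 8, pairOffset o1
  = STP a c b1 o1 : fuse rest
fuse (i : rest) = i : fuse rest
fuse [] = []
```

For example, `fuse [LDR (X 0) SP 16, LDR (X 1) SP 24]` gives `[LDP (X 0) (X 1) SP 16]`. This only looks at directly adjacent instructions in ascending offset order; a real phase would likely also need to handle the descending case and 32-bit accesses.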
- Add spill/reload counter/statistics. On machines with many registers, modules that produce a high number of spills/reloads would provide good points for investigation. Why do we need to spill so much? Can we optimise the register usage, assignment, instruction order?
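One way such statistics could be gathered, sketched under the assumption that spills and reloads are recognisable as distinct instructions after register allocation. The `Instr` type here is a stand-in, and `SpillStats`, `stats` and `worst` are hypothetical names.

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))

-- Stand-in for the real instruction type: only spills and reloads matter.
data Instr = Spill | Reload | Other
  deriving (Eq, Show)

data SpillStats = SpillStats { spills :: !Int, reloads :: !Int }
  deriving (Eq, Show)

instance Semigroup SpillStats where
  SpillStats a b <> SpillStats c d = SpillStats (a + c) (b + d)

instance Monoid SpillStats where
  mempty = SpillStats 0 0

countInstr :: Instr -> SpillStats
countInstr Spill  = SpillStats 1 0
countInstr Reload = SpillStats 0 1
countInstr Other  = mempty

-- Statistics for one module's instruction stream.
stats :: [Instr] -> SpillStats
stats = foldMap countInstr

total :: SpillStats -> Int
total s = spills s + reloads s

-- Modules ordered by spills + reloads, worst first.
worst :: [(String, [Instr])] -> [(String, SpillStats)]
worst = sortOn (Down . total . snd) . map (fmap stats)
```

The modules at the head of `worst` would be the first candidates for looking at register usage and instruction order.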
- Drop unused basic block labels.
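A small sketch of the idea, assuming every label reference shows up as an explicit jump and that exported labels are known up front. `Line` and `dropUnusedLabels` are hypothetical; a real pass would also have to treat jump tables and address-taking instructions as uses.

```haskell
type Label = String

-- Simplified assembly stream: label definitions, jumps, everything else.
data Line = LabelDef Label | Jump Label | Plain String
  deriving (Eq, Show)

-- Keep a label definition only if it is exported or some jump targets it.
dropUnusedLabels :: [Label] -> [Line] -> [Line]
dropUnusedLabels exported ls = filter keep ls
  where
    used = exported ++ [ l | Jump l <- ls ]
    keep (LabelDef l) = l `elem` used
    keep _            = True
```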
- Build an llvm and a ncg version of GHC, with `-keep-s-file`, then compute a statistic of [module] [llvm generated instructions] [ncg generated instructions], check the ratio and investigate modules where the ncg ends up producing a lot more instructions. Also look at instruction distributions. Does the llvm backend generate interesting alternative instructions that the ncg should learn about?
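The counting part could be sketched roughly like this, assuming one instruction per line and treating blank lines, directives (leading `.`), labels (trailing `:`) and comment lines as non-instructions. All function names are hypothetical, and lines of the form `label: instr` are not handled.

```haskell
import Data.Char (isSpace)
import Data.List (group, isPrefixOf, isSuffixOf, sort)

-- Trim leading and trailing whitespace.
strip :: String -> String
strip = f . f
  where f = reverse . dropWhile isSpace

-- All lines of an assembly file that look like instructions.
instrLines :: String -> [String]
instrLines = filter isInstr . map strip . lines
  where
    isInstr l = not (null l)
             && not ("." `isPrefixOf` l)   -- directives and local labels
             && not (":" `isSuffixOf` l)   -- labels
             && not ("//" `isPrefixOf` l)  -- comment lines
             && not ("#" `isPrefixOf` l)

countInstructions :: String -> Int
countInstructions = length . instrLines

-- Ratio of ncg to llvm instruction counts for one module.
ratio :: Int -> Int -> Double
ratio ncg llvm = fromIntegral ncg / fromIntegral llvm

-- Instruction distribution: how often each mnemonic occurs.
histogram :: String -> [(String, Int)]
histogram asm =
  [ (m, length g) | g@(m : _) <- group (sort mnemonics) ]
  where mnemonics = map (takeWhile (not . isSpace)) (instrLines asm)
```

Running `countInstructions` over the `.s` file of each module from both builds gives the two columns, and comparing the `histogram` outputs would show mnemonics that only one backend emits.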
- PIC/no-PIC? Fixed, but requires !3433 (closed) for full linker support.
  This is a bit tricky. We use pc relative loads with ranges +-4GB; this is essentially what `-fPIC` will produce by default. However, we might run into issues if we try to link code with the rts linker that wasn't built with `-fPIC`. E.g. Haskell Module (essentially always PIC) -> C Module (no `-fPIC`) -> symbol. The C Module might reference something that's out of reach (e.g. `environ`, `stdout`, ...), and the rts linker has no way of relocating those symbols correctly.
- Research: Is the linear register allocator the best suited one?
  See Andreas' comment below.
- Replace `FileCheck` with some tool, such that we do not necessarily depend on LLVM's toolchain just for `FileCheck`.
- Integrate the cmm to asm test-suite in tests into the ghc testsuite somehow?