Consider doing a CPS→SSA conversion in the backend
GHC currently generates code in continuation-passing style (CPS). This is difficult to map to LLVM, and even in the native code generator (NCG) it may make suboptimal use of the CPU, since CPUs are optimized for code compiled from C-like languages.
It might be worthwhile to convert from CPS to SSA form in the backend and then optimize the SSA form. This would require substantial changes to the RTS and add an entirely new IR, but it could enable optimizations that would otherwise be impossible.
To avoid a big compile-time regression, the SSA IR should be represented as a flat array, rather than a sea of linked nodes.
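To make the "flat array" idea concrete, here is a rough sketch of what such a representation could look like. It is written in Python purely for illustration and is not a proposal for GHC's actual data structures; the opcodes, the `FlatSSA` class, and `evaluate` are all hypothetical names invented for this example. Each SSA value is identified by its index into a set of parallel arrays, and operands are plain integer indices rather than pointers to node objects.

```python
from array import array

# Hypothetical opcodes, for illustration only.
OP_CONST, OP_ADD, OP_MUL = 0, 1, 2


class FlatSSA:
    """SSA values stored in parallel flat arrays; a value's id is its index."""

    def __init__(self):
        self.op = array("B")  # opcode of each value
        self.a = array("q")   # first operand: a value id, or the literal for OP_CONST
        self.b = array("q")   # second operand: a value id, or -1 if unused

    def emit(self, op, a, b=-1):
        # Appending keeps definitions in order, so operands always
        # refer to smaller indices than the value that uses them.
        self.op.append(op)
        self.a.append(a)
        self.b.append(b)
        return len(self.op) - 1

    def const(self, n):
        return self.emit(OP_CONST, n)

    def add(self, x, y):
        return self.emit(OP_ADD, x, y)

    def mul(self, x, y):
        return self.emit(OP_MUL, x, y)


def evaluate(ir):
    """Walk the arrays once, front to back, with no pointer chasing."""
    vals = [0] * len(ir.op)
    for i in range(len(ir.op)):
        op, a, b = ir.op[i], ir.a[i], ir.b[i]
        if op == OP_CONST:
            vals[i] = a
        elif op == OP_ADD:
            vals[i] = vals[a] + vals[b]
        elif op == OP_MUL:
            vals[i] = vals[a] * vals[b]
        else:
            raise ValueError(f"unknown opcode {op}")
    return vals


# Build (2 + 3) * (2 + 3) as four SSA values.
ir = FlatSSA()
x = ir.const(2)
y = ir.const(3)
s = ir.add(x, y)
p = ir.mul(s, s)
print(evaluate(ir)[p])  # 25
```

The point of the layout is that a pass becomes a linear scan over contiguous memory, and the whole IR can be allocated or discarded in a few operations instead of one per node. A real backend IR would also need basic blocks, block parameters or phi nodes, and variable-arity instructions, none of which this sketch attempts.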