Experiment with different encodings for IRs for faster transformations
Data types and fine and dandy, and we can reason quite well about them. Well, except for occassional hiccups about laziness.
But as an encoding for an IR like Core, they are quite inefficient! Each pass tears down the incoming Core AST and immediately builds up another one. Thus, the ideal Core to Core pass fits the framework of recursion schemes, a fold (catamorphism) over the AST directly intertwined with an unfold (anamorphism) of the output AST. If there are multiple succeeding Core to Core passes, like simpl $ floatOut $ ...
, we could try to find an encoding that would allow us to fuse the unfold of floatOut
with the fold of simpl
!
Of course, this is easier said than done. For example, simpl
is generally not defined compositionally (e.g. defining pattern-matches are at most one level deep, primitively recursive), because it might see if the App
head is a Var
, or call functions like exprIsTrivial
etc. that consume entire expressions without really building a new one. I think that we could still get leverage through generalisation to paramorphism (generalised folds) and apomorphisms (generalised unfolds).
A few ideas:
-
Tagless final allows fusion simply through type class specialisation. Each Core to Core pass has its own Type class instance
CoreFold
, unfolding its output into anotherCoreFold
as it folds over its input. Non-compositional pattern matching seems to be possible, though inefficient. I think that this encoding can be syntactically sugared up with pattern synonyms to great effect. - Extract
type CoreF a = ExprF CoreBndr a
, then use a general recursion-schemes flavoured solution. The popular recursion-schemes library shows an established way to getpara
andapo
schemes that we'd need for non-compositional definitions.
Ideally, we'd combine both approaches, so that we get good fusion/specialisation properties from (1) and ideas for efficient non-compositional pattern matches from (2).