Improvements to Parser.y
I believe that the parser needs some love. Although yacc syntax might be considered inherently arcane and some people may not want to touch it with a ten-foot pole, I still think there is vast room for improvement in its design.
Shift/Reduce Conflicts
Currently, the parser has about 250 shift/reduce conflicts and 50 reduce/reduce conflicts (apparently introduced by me, apologies). That's a lot!
Apparently we came to the conclusion that these conflicts are OK and are automatically resolved exactly as we want them to be. There's even a list of hopefully all the conflicts with examples.
We currently use an %expect declaration to make sure that changes to the parser don't change the number of conflicts. I argue that this is a pretty bad metric. For example, this patch rejects the idea of having an opt_instance production (reducing to whether or not the 'instance' keyword occurred) solely on the grounds of "it introduces more conflicts and I don't know why and how to fix it". I suspect that opt_instance wasn't even the direct cause of those conflicts, but rather some combinatorial interaction with pre-existing conflicts. Thus, the patch chose to just duplicate a production (a technique that I think could be implemented by the parser generator, see Menhir's %inline declaration). You might think that this is OK, but I'm currently in the process of implementing the UnliftedDatatypes proposal, where I plan to introduce opt_levity for optional unlifted or lifted keywords. Following similar reasoning, I'd have to duplicate the productions yet again! Clearly we don't want to maintain 4 productions of deceptively similar form.
So why not consolidate the list by providing explicit %precedence (%nonassoc/%left/%right) declarations? At least that's the accepted way to resolve shift/reduce conflicts in bison. This is much more explicit than just letting happy decide what works best for us.
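To illustrate what explicit declarations buy us, here is a minimal sketch of a toy expression grammar in happy syntax. It has nothing to do with GHC's actual Parser.y: the module name Calc, the token constructors, and the Exp type are all made up for the example. It uses only %left and %prec, which happy supports; without the three %left lines the grammar is ambiguous and happy would have to pick a resolution for every conflict on its own.

```yacc
{
-- Hypothetical module; names are illustrative only.
module Calc where
}

%name parseExp
%tokentype { Token }
%error { parseError }

-- With the precedences below, no unresolved conflicts should remain,
-- so we can ask happy to insist on exactly zero.
%expect 0

%token
  int { TInt $$ }
  '+' { TPlus }
  '-' { TMinus }
  '*' { TTimes }
  '/' { TDiv }

-- Later lines bind tighter. All binary operators are left-associative.
%left '+' '-'
%left '*' '/'
-- NEG is a pseudo-token that exists only to name a precedence level.
%left NEG

%%

Exp : Exp '+' Exp        { Add $1 $3 }
    | Exp '-' Exp        { Sub $1 $3 }
    | Exp '*' Exp        { Mul $1 $3 }
    | Exp '/' Exp        { Div $1 $3 }
    | '-' Exp %prec NEG  { Neg $2 }   -- unary minus binds tightest
    | int                { Lit $1 }

{
data Token = TInt Int | TPlus | TMinus | TTimes | TDiv

data Exp
  = Add Exp Exp | Sub Exp Exp | Mul Exp Exp | Div Exp Exp
  | Neg Exp | Lit Int
  deriving Show

parseError :: [Token] -> a
parseError _ = error "parse error"
}
```

The point of the sketch is that each resolution is written down next to the grammar and can be reviewed, instead of being implied by a conflict count.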
Make use of modern parser generator features
Implement %inline
The %inline feature from Menhir that I mentioned above is one feature I'd like to see in happy. I opened this ticket to track it.
Make use of "parameterized productions"
But as @Ericson2314 points out, happy has had parameterized productions for a long time now, without anyone on the GHC team having made use of them. There are a bunch of instances like opt_instance and opt_family, and probably a number of productions that are simply there to collect (non-empty) lists of things, each specialised to one kind of thing. We should rewrite them with this feature!
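As a rough sketch of what such a rewrite could look like, the following small happy grammar defines generic opt(p) and list1(p) productions once and reuses them. The Decl rule, the token constructors, and the module name OptSketch are hypothetical and much simpler than anything in GHC's Parser.y; only the parameterized-production syntax itself is taken from happy.

```yacc
{
-- Hypothetical module; names are illustrative only.
module OptSketch where

import Data.Maybe (isJust)
}

%name parseDecl
%tokentype { Token }
%error { parseError }

%token
  'data'     { TData }
  'type'     { TType }
  'instance' { TInstance }
  'family'   { TFamily }
  conid      { TConId $$ }

%%

-- One generic "optional" production replaces opt_instance, opt_family, ...
opt(p) : p   { Just $1 }
       |     { Nothing }

-- One generic non-empty list production, built left-recursively.
rev_list1(p) : p               { [$1] }
             | rev_list1(p) p  { $2 : $1 }

list1(p) : rev_list1(p)  { reverse $1 }

-- Toy declarations that reuse the generic productions.
Decl : 'data' opt('instance') list1(conid)  { DataDecl (isJust $2) $3 }
     | 'type' opt('family')   list1(conid)  { TypeDecl (isJust $2) $3 }

{
data Token = TData | TType | TInstance | TFamily | TConId String

data Decl
  = DataDecl Bool [String]   -- Bool: was 'instance' present?
  | TypeDecl Bool [String]   -- Bool: was 'family' present?
  deriving Show

parseError :: [Token] -> a
parseError _ = error "parse error"
}
```

Whether a rewrite along these lines changes the conflict count in the real grammar would have to be checked against Parser.y itself; an opt_levity production could presumably be expressed the same way.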