Improve parse error reporting.
More accurate parse errors
Motivation
The current parser gives limited information when it hits a parse error and stops immediately, which prevents the type checker from reporting any further errors.
For example, the following:
foo x y z = bar
  (x, y, z)
  (y, x, z)
  (x, x, y
  (y, y, y)

data Chicken = CotCotCot

bar a b c d = print ((a, b, c, d) + "hello")
Fails with the following error message:
Parse.hs:7:1: error:
    parse error (possibly incorrect indentation or mismatched brackets)
  |
7 | data Chicken = CotCotCot
  | ^
In this example, the actual mistake (the unclosed parenthesis) is 3 lines above the reported location, which makes it hard to find and fix.
Proposal
I'd like more hints about the cause of the parse error, for example an error message like:
- parse error (possibly incorrect indentation or mismatched brackets)
- Expected ')'. Opening context was:
  |
4 |   (x, x, y
  |   ^
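To make the idea of an "opening context" concrete, here is a minimal sketch of a bracket scanner that remembers where each bracket was opened. It is not how GHC's parser works: the names `BracketError`, `closerOf` and `checkBrackets` are hypothetical, and the scanner ignores layout, string literals and comments.

```haskell
-- Hypothetical sketch, not GHC's parser: a plain bracket scanner.

-- | Source position as (line, column), both 1-based.
type Pos = (Int, Int)

data BracketError
  = Unclosed Char Pos              -- ^ opening bracket that is never closed
  | Unexpected Char Pos            -- ^ closing bracket with no opener
  | Mismatched Char Pos Char Pos   -- ^ opener and its position, then the wrong closer
  deriving (Eq, Show)

-- | The closing bracket that matches an opening one, if any.
closerOf :: Char -> Maybe Char
closerOf '(' = Just ')'
closerOf '[' = Just ']'
closerOf '{' = Just '}'
closerOf _   = Nothing

isCloser :: Char -> Bool
isCloser c = c `elem` ")]}"

-- | Scan the text while keeping a stack of open brackets and their positions.
-- Returns the first problem found, or Nothing if all brackets match.
checkBrackets :: String -> Maybe BracketError
checkBrackets src = go [] positioned
  where
    -- Every character paired with its (line, column).
    positioned =
      [ (c, (l, k))
      | (l, line) <- zip [1 ..] (lines src)
      , (k, c)    <- zip [1 ..] line
      ]

    go [] [] = Nothing
    go ((o, p) : _) [] = Just (Unclosed o p)  -- innermost bracket still open
    go stack ((c, p) : rest)
      | Just _ <- closerOf c = go ((c, p) : stack) rest
      | isCloser c =
          case stack of
            [] -> Just (Unexpected c p)
            ((o, op) : stack')
              | closerOf o == Just c -> go stack' rest
              | otherwise            -> Just (Mismatched o op c p)
      | otherwise = go stack rest
```

Applied to the `foo` example above (with the continuation lines indented by two spaces), `checkBrackets` returns `Just (Unclosed '(' (4,3))`, which is the position the proposed hint points at. A real implementation would have to get this information from the parser's own state rather than from a separate scan.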
Error recovery
Motivation
Moreover, the last line contains a type error (i.e. + between a tuple and a String) that I'd like to see at the same time.
When I'm developing, it is really common that, in the middle of a function definition, I decide to introduce another type or function. To keep the REPL, type checking and editor integration working during that subtask, I need to make sure that the function I was writing does not produce a syntax error. Most of the time I have to comment out my work in progress.
If the parser were able to recover and let type checking continue on the rest of my file, I could stop right in the middle of an expression and start something else in the file. This could dramatically improve the workflow in an editor that implements check-as-you-type using ghci.
For example, the following code:
bar = 1 * "hello"
data Point = Point Float Float Float
Triggers the following changes of the editor status as I type the last line:
d*ata P|oint =* P|oint F#oat| F#oat| F#oat
Each of the characters *, | and # marks a change in the error reporting:
- * is a syntax error, meaning the type error on the first line is not reported
- | means the syntax is OK, and GHC happily reports the type error on the first line
- # means the syntax is OK, but GHC only reports the not-in-scope constructor F(oat)
Proposal
I'd like the parser to be able to recover from parse errors and pass an incomplete AST to the type checker, so that parse errors and type errors can be reported together.
For comparison, I wrote a simple C++ program:
class Foo
{}
int main()
{
int v[][3] = {
{1,2,3},
{4,5,6,
{7,8,9},
};
Foo() * 2;
}
This program contains a few syntax errors and a type error. This is the gcc output:
Parse.cpp:2:3: error: expected ‘;’ after class definition
{}
^
;
Parse.cpp: In function ‘int main()’:
Parse.cpp:10:8: error: expected ‘}’ before ‘;’ token
};
^
Parse.cpp:10:8: error: too many initializers for ‘int [3]’
Parse.cpp:11:11: error: no match for ‘operator*’ (operand types are ‘Foo’ and ‘int’)
Foo() * 2;
~~~~~~^~~
Here gcc is able to find the two syntax errors and gives precise position information. But gcc also shows the type error (i.e. Foo() * 2) alongside.
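The kind of recovery proposed for GHC can be illustrated with a toy sketch that resynchronizes at the next top-level declaration. Everything here is an assumption made for illustration: `Chunk`, `ParseError`, `splitTopLevel`, `parseChunk` and `parseWithRecovery` are hypothetical names, the "parser" only checks that parentheses balance, and a declaration is taken to start at any non-blank line that begins in column 1.

```haskell
import Data.Char (isSpace)
import Data.Either (partitionEithers)

-- | A top-level chunk: the line it starts on and its text lines.
data Chunk = Chunk { chunkLine :: Int, chunkText :: [String] }
  deriving (Eq, Show)

-- | Hypothetical stand-in for a parse error, located at the chunk start.
data ParseError = ParseError { errLine :: Int, errMsg :: String }
  deriving (Eq, Show)

-- | Split the source into chunks: a non-indented line starts a new chunk,
-- and the indented lines that follow belong to it. Blank lines are dropped.
splitTopLevel :: String -> [Chunk]
splitTopLevel = go . filter (not . all isSpace . snd) . zip [1 ..] . lines
  where
    go [] = []
    go ((n, l) : rest) =
      let (cont, rest') = span (startsIndented . snd) rest
      in  Chunk n (l : map snd cont) : go rest'

    startsIndented (c : _) = isSpace c
    startsIndented []      = False

-- | Toy "parser": accepts a chunk only if its parentheses balance.
parseChunk :: Chunk -> Either ParseError Chunk
parseChunk ch
  | balanced 0 (concat (chunkText ch)) = Right ch
  | otherwise = Left (ParseError (chunkLine ch) "mismatched brackets")
  where
    balanced :: Int -> String -> Bool
    balanced n [] = n == 0
    balanced n (c : cs)
      | c == '('  = balanced (n + 1) cs
      | c == ')'  = n > 0 && balanced (n - 1) cs
      | otherwise = balanced n cs

-- | Parse every chunk independently, returning both the errors and the
-- declarations that did parse, so later stages can still look at them.
parseWithRecovery :: String -> ([ParseError], [Chunk])
parseWithRecovery = partitionEithers . map parseChunk . splitTopLevel
```

On the `foo` / `Chicken` / `bar` example from the first section, this reports one error for the chunk starting at `foo` and still returns the `data Chicken` and `bar` chunks. In a real compiler those surviving declarations are what would be handed to the type checker, so the `+` type error in `bar` could be reported next to the parse error.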
Conclusion
I think these changes would improve the development experience by giving the developer more useful feedback and helping them focus on the task that matters. Sometimes fixing a syntax error is less important than writing a utility function, and I'd like GHC to help me even if another part of my file contains a syntax error.