GHC Lexer reports error in string literal including ZWJ
Summary
The GHC lexer reports a lexical error for a string literal that contains a literal zero-width joiner (ZWJ, U+200D).
Steps to reproduce
- Create the following file, foo.hs:
    main = do
        putStrLn "\128104\8205\128105\8205\128103\8205\128102"
        putStrLn "👨‍👩‍👧‍👦"
- Compile it. GHC reports a lexical error at the ZWJ:
    $ ghc foo.hs
    [1 of 1] Compiling Main ( foo.hs, foo.o )
    foo.hs:3:16: error:
        lexical error in string/character literal at character '\8205'
      |
    3 |     putStrLn "👨‍👩‍👧‍👦"
      |                ^
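The character named in the error, '\8205', is U+200D ZERO WIDTH JOINER. The sketch below is a diagnostic aid only: it uses Data.Char to show the code point and Unicode general category of the joiner and of the first emoji in the string. The names zwj, man and codePoint are made up for this illustration, and it is an assumption, not something this report establishes, that the lexer's rejection is related to the joiner's category.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, ord, toUpper)
import Numeric (showHex)

-- The joiner, written as an escape so that this file lexes on affected GHC versions.
zwj :: Char
zwj = '\8205'

-- The first character of the string in the report (decimal 128104).
man :: Char
man = '\128104'

-- Render a character as a U+XXXX code point label.
codePoint :: Char -> String
codePoint c = "U+" ++ map toUpper (showHex (ord c) "")

main :: IO ()
main = do
  -- Print each code point next to its general category.
  putStrLn (codePoint zwj ++ " " ++ show (generalCategory zwj))
  putStrLn (codePoint man ++ " " ++ show (generalCategory man))
```

The joiner is reported as Format, which is a non-printing category; this may be why the lexer treats it differently from the surrounding emoji.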
Expected behavior
We expect the following two lines to behave identically,
    putStrLn "\128104\8205\128105\8205\128103\8205\128102"
and
    putStrLn "👨‍👩‍👧‍👦"
with neither producing a lexical error.
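Until the literal form is accepted, a possible workaround is to keep the emoji literal and write only the joiners as escapes. This is a minimal sketch, not a tested fix: it assumes the emoji themselves are accepted by the lexer, which the error position (column 16, the joiner, rather than column 15, the first emoji) suggests. The names escaped and mixed are hypothetical.

```haskell
import Data.Char (ord)

-- Fully escaped form, taken from the report.
escaped :: String
escaped = "\128104\8205\128105\8205\128103\8205\128102"

-- Literal emoji with only the zero-width joiners escaped.
mixed :: String
mixed = "👨\8205👩\8205👧\8205👦"

main :: IO ()
main = do
  -- Show the code points so the result does not depend on terminal rendering.
  print (map ord escaped)
  -- Both spellings should denote the same seven-character string.
  print (mixed == escaped)
```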
Environment
- GHC versions used: 8.10.7, 9.0.2, and 9.2.2
Optional:
- Operating System: Ubuntu 20.04
- System Architecture: x86_64