GHC is inconsistent with the Haskell Report on which Unicode characters are allowed in string and character literals

GHC is inconsistent with the Haskell Report on which Unicode characters are allowed in string and character literals. (And I don't like either option: why exclude any characters from strings unnecessarily?)

Examples from ghci 7.6.3 (also tested with lambdabot on IRC):

Prelude> "​" -- Unicode char \8203, Format class.

<interactive>:10:2:
    lexical error in string/character literal at character '\8203'
Prelude> " " -- Unicode char \8202, Space class.
"\8202"
Prelude> "t\ \est" -- Unicode char \8202 in a string gap.

<interactive>:14:4:
    lexical error in string/character literal at character '\8202'
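
As a workaround until this is settled, the rejected characters can be written as numeric escapes, and string gaps can be restricted to ASCII whitespace. A minimal sketch; the names `zeroWidthSpace`, `hairSpace` and `gapped` are only illustrative:

```haskell
-- U+200B (decimal 8203) written as a hexadecimal escape instead of
-- the raw character; "\8203" denotes the same one-character string.
zeroWidthSpace :: String
zeroWidthSpace = "\x200B"

-- U+200A (decimal 8202) written as a decimal escape.
hairSpace :: String
hairSpace = "\8202"

-- A string gap that contains only an ASCII space between the two
-- backslashes; the gap contributes nothing to the string.
gapped :: String
gapped = "t\ \est"
```

The gap in `gapped` is dropped entirely, so the value is the same as `"test"`.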

My reading of http://www.haskell.org/onlinereport/haskell2010/haskellch2.html (sections 2.2 and 2.6):

  • The report's BNF token "graphic", which is what literals are built from, indirectly includes many Unicode classes, but uniWhite is not one of them. Thus the only whitespace character allowed to represent itself in a literal is the ASCII space.
  • Unicode formatting characters are not mentioned anywhere in the BNF that I can see, so they are not allowed in literals.
  • String gaps are built from the report's BNF token whitechar, which does include uniWhite.
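
To make this reading concrete, here is a rough sketch of it as Haskell predicates. It is my approximation, not the report's or GHC's definition: the function names are made up, "graphic" is mapped onto `Data.Char` general categories by a literal reading of section 2.2, and `isSpace` is assumed to be a reasonable stand-in for whitechar/uniWhite.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, isSpace)

-- Approximation of the report's "graphic" token: cased letters,
-- decimal digits, symbols and punctuation. Space and Format are
-- deliberately absent, as are all categories the BNF does not name.
isReportGraphic :: Char -> Bool
isReportGraphic c = generalCategory c `elem` graphicCategories
  where
    graphicCategories =
      [ UppercaseLetter, LowercaseLetter, TitlecaseLetter
      , DecimalNumber
      , ConnectorPunctuation, DashPunctuation, OpenPunctuation
      , ClosePunctuation, InitialQuote, FinalQuote, OtherPunctuation
      , MathSymbol, CurrencySymbol, ModifierSymbol, OtherSymbol
      ]

-- May the character stand for itself inside a string literal?
-- Per section 2.6: graphic (minus the double quote and backslash)
-- or the ASCII space.
allowedInString :: Char -> Bool
allowedInString c =
  c == ' ' || (isReportGraphic c && c /= '"' && c /= '\\')

-- May the character appear inside a string gap?
-- Assumption: isSpace approximates the report's whitechar.
allowedInGap :: Char -> Bool
allowedInGap = isSpace
```

Under these assumptions, `allowedInString '\8203'` and `allowedInString '\8202'` are both `False`, while `allowedInGap '\8202'` is `True`, which is the "Report" column in the table below.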

Who wants what:

                              GHC   Report   Me
Format in string              No    No       Yes
Space/uniWhite in string      Yes   No       Yes
Space/uniWhite in string gap  No    Yes      Dunno

In short, GHC's behavior is buggy and/or annoying in two opposite ways:

  • It disallows some Unicode characters in string and character literals, presumably because the report says so.
  • It allows some characters the report says it *shouldn't*, and refuses some characters the report says it *should*.

Trac metadata

Trac field              Value
Version                 7.6.3
Type                    Bug
TypeOfFailure           OtherFailure
Priority                normal
Resolution              Unresolved
Component               Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
Edited by Douglas Wilson