
The GHC User's Guide does not document GHC's extensions to valid whitespace

Summary

The Haskell Language Report documents that comments are valid whitespace (section 2.3). GHC also treats the following as valid whitespace, except immediately after do, but the GHC User's Guide does not document this:

  • Lines beginning #!
  • Lines beginning  #! (that is, #! preceded by an initial space character)
  • Lines beginning #pragma
  • Lines beginning #line <line> "<file>", where <line> is a positive integer and <file> can comprise zero or more characters
  • Lines beginning #<line> "<file>", where <line> is a positive integer and <file> can comprise zero or more characters

In each case, the line must end with a line feed character.
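
For illustration, here is a minimal sketch of a module that uses each of the forms listed above before the module header. Based on my reading of the lexer rules, I would expect GHC to accept it without any extensions enabled. The file name Example.hs, the line numbers 10 and 20, and the text after #pragma are made up for the example.

```haskell
#!/usr/bin/env runghc
 #! a second shebang-style line, with an initial space character
#pragma any text up to the end of the line
#line 10 "Example.hs"
#20 "Example.hs"
module Main where

-- The five lines above the module header are each expected to be
-- skipped as whitespace by GHC's lexer (the two line pragmas also
-- reset the reported source location).
main :: IO ()
main = putStrLn "The lines above the module header were treated as whitespace."
```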

See compiler/GHC/Parser/Lexer.x where it introduces <bol>.

Proposed improvements or changes

I propose that the GHC User's Guide document what else GHC treats as valid whitespace in source files. One place to put that content would be a new subsection of the 6.19 Miscellaneous part of the 6. Language extensions chapter.

For example:

6.20 Whitespace

In addition to Haskell comments, lines (which must end with a line feed character) that begin as follows are valid whitespace in source code, except immediately after the do keyword:

  • #! (this accommodates 'shebang' interpreter directives in scripts on Unix-like operating systems)
  • #! (with an initial space character)
  • #pragma
  • #line <line> "<file>", where <line> is a positive integer and <file> can comprise zero or more characters
  • #<line> "<file>", where <line> is a positive integer and <file> can comprise zero or more characters

If others agree, I'll raise a merge request accordingly.

Edited by Mike Pilgrem