Nested Multiline Haddock Comments
Today, as I was working on exactprinting for Haddock comments in GHC (so we can exact print modules compiled with the -haddock option), I came across this dark corner of the lexer:
{-| This is a doc comment for 'foo'
{- this part will not show up in the docstring -}
doc comment continued
-}
foo :: _
In this case, the haddock docstring for 'foo' will be
"This is a doc comment for 'foo'
doc comment continued"
However, you cannot make use of this "feature" if you use single line comment syntax:
-- | This is a doc comment for 'foo'
-- {- this part will show up in the comment -}
-- doc comment continued
foo :: _
Does anyone actually rely on this feature (i.e. using nested comments inside doc comments to hide sections from the rendered haddocks)? Is this intended behaviour or simply an emergent property of the implementation?
It spells a bit of trouble for my goal, since now the Haddock comment is actually two (nested?!!?) tokens instead of a single one. As of now we try to treat comments as a single token in the lexer.
However, nested multiline comments would need to emit arbitrarily many tokens (one for the actual haddock docstring, and ω for all the embedded nested comments) if we want to support exactprinting this.
As of now, the exactprinter doesn't support pretty printing an AST lexed using the -haddock flag, so the entire comment is usually treated as a single non-haddock comment token.
In -haddock mode, nested comments are simply thrown away, so that the token ends up being the actual contents of the docstring without the nested portions.
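To make the token problem concrete, here is a rough sketch of the splitting that exact printing would need. It is a toy model, not GHC's lexer: the names `Segment`, `segments`, `docstring` and `exactText` are made up for illustration, and it only looks at the body of a block doc comment (the text between the opening `{-|` and the final `-}`). Whitespace handling in the real lexer may well differ.

```haskell
module Main where

-- | One piece of a block doc comment body (hypothetical type, not GHC's).
data Segment
  = DocText String  -- ^ text that ends up in the docstring
  | Nested String   -- ^ a top-level nested comment, delimiters included
  deriving (Eq, Show)

-- | Split a doc comment body into docstring text and nested comments.
segments :: String -> [Segment]
segments = go ""
  where
    -- 'acc' holds the current run of docstring text, in reverse.
    go acc [] = flush acc []
    go acc ('{':'-':rest) =
      let (inner, rest') = nested 1 "-{" rest
      in flush acc (Nested inner : go "" rest')
    go acc (c:cs) = go (c:acc) cs

    flush acc xs
      | null acc  = xs
      | otherwise = DocText (reverse acc) : xs

    -- Consume input up to the "-}" that closes nesting depth 1.
    -- The accumulator is reversed; an unterminated comment takes the rest.
    nested :: Int -> String -> String -> (String, String)
    nested _ acc [] = (reverse acc, [])
    nested d acc ('{':'-':cs) = nested (d + 1) ('-':'{':acc) cs
    nested d acc ('-':'}':cs)
      | d == 1    = (reverse ('}':'-':acc), cs)
      | otherwise = nested (d - 1) ('}':'-':acc) cs
    nested d acc (c:cs) = nested d (c:acc) cs

-- | What -haddock mode keeps in this model: the nested parts are dropped.
docstring :: String -> String
docstring body = concat [t | DocText t <- segments body]

-- | What an exact printer needs: every segment, in order, unchanged.
exactText :: String -> String
exactText = concatMap render
  where
    render (DocText t) = t
    render (Nested t)  = t

main :: IO ()
main = do
  let body = " This is a doc comment for 'foo'\n"
          ++ "{- this part will not show up in the docstring -}\n"
          ++ "doc comment continued\n"
  mapM_ print (segments body)
  putStr (docstring body)
```

The point of the sketch is that `docstring` loses information while `exactText` needs all of it, so a single token carrying only the docstring is not enough to reproduce the source.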
I think the easiest option (for the sanity of people maintaining the lexer) would be to just remove this feature. Does anyone have objections?
A naive regex search over the latest releases of all packages on hackage revealed nobody using this feature as intended, and quite a few unintentional uses. For example, from the lvish package:
https://hackage.haskell.org/package/lvish-1.1.4/docs/Control-LVish-DeepFrz.html
{-|
The `DeepFrz` module provides a way to return arbitrarily complex data
structures containing LVars from `Par` computations.
...
> {-# LANGUAGE TypeFamilies #-}
> import Control.LVish.DeepFrz
...
-}
The intention was to have the LANGUAGE pragma included in the rendered code snippet, but it was stripped out due to the current behaviour.
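If you want to check your own code for this pattern, a scan along the following lines should find it. This is a small hand-written scanner rather than a regex, and it is only a sketch: `nestedDocComments` is a made-up name, and it does not understand string literals or line comments, so expect some false positives.

```haskell
module Main where

import Data.List (isPrefixOf, tails)

-- | Bodies of block doc comments ("{-|" up to the matching "-}") that
-- contain at least one nested "{-". Pragmas such as "{-# ... #-}" count
-- as nested comments here, since they open with "{-" and close with "-}".
nestedDocComments :: String -> [String]
nestedDocComments src =
  [ body
  | rest <- tails src
  , "{-|" `isPrefixOf` rest
  , let (body, hadNested) = scan (1 :: Int) False "" (drop 3 rest)
  , hadNested
  ]
  where
    -- Track nesting depth; the accumulator is kept in reverse.
    scan _ seen acc [] = (reverse acc, seen)
    scan d _ acc ('{':'-':cs) = scan (d + 1) True ('-':'{':acc) cs
    scan d seen acc ('-':'}':cs)
      | d == 1    = (reverse acc, seen)
      | otherwise = scan (d - 1) seen ('}':'-':acc) cs
    scan d seen acc (c:cs) = scan d seen (c:acc) cs

main :: IO ()
main = do
  let src = "{-|\n> {-# LANGUAGE TypeFamilies #-}\n> import Control.LVish.DeepFrz\n-}\nfoo = 1\n"
  mapM_ putStrLn (nestedDocComments src)
```

Running it over a module like the lvish one above reports the doc comment, because the embedded LANGUAGE pragma is lexed as a nested comment.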