GHC issues
https://gitlab.haskell.org/ghc/ghc/-/issues
Updated: 2023-12-19T13:58:20Z

EPA: incorrect comment location for module declaration
https://gitlab.haskell.org/ghc/ghc/-/issues/23984
Updated: 2023-12-19T13:58:20Z
Author: Alan Zimmerman

The code
```hs
{-# LANGUAGE NamedFieldPuns #-}
{- -} module Main where
main = putStr "foo"
```
puts the comment on the `main` `FunDecl` rather than in the header comments.

Assignee: Alan Zimmerman

Tidyup: Move HasLoc instances into their own module from compiler/GHC/Iface/Ext/Ast.hs
https://gitlab.haskell.org/ghc/ghc/-/issues/23595
Updated: 2023-07-04T14:07:49Z
Author: Alan Zimmerman

This is a follow-up from !10743 which moved the `HasLoc` class from `compiler/GHC/Iface/Ext/Ast.hs` to `compiler/GHC/Parser/Annotation.hs`.
This had the unfortunate effect of introducing orphans in `Ast.hs`. Remove the orphan instances, probably by moving the `HasLoc` instances into their own module, as discussed in https://gitlab.haskell.org/ghc/ghc/-/merge_requests/10743#note_508331

Include ghc-exactprint in head.hackage
https://gitlab.haskell.org/ghc/ghc/-/issues/22752
Updated: 2023-06-05T19:29:17Z
Author: David Thrane Christiansen

`ghc-exactprint` and its transitive dependencies should be included in head.hackage. This will allow tools that depend on it to be updated for GHC releases prior to the release occurring, and it will allow incompatibilities to be detected earlier.
This is related to: https://gitlab.haskell.org/ghc/ghc/-/issues/21355
That ticket, however, discusses the test infrastructure for the exactprint annotations, rather than the library itself.

EPA : Clean up and support RenamedSource
https://gitlab.haskell.org/ghc/ghc/-/issues/22319
Updated: 2022-12-06T22:11:08Z
Author: Alan Zimmerman

I am contemplating the next steps for EPA.
There are some issues around using the current version in tooling, arising from being limited to `ParsedSource`:
1. The fixities are not resolved in the `ParsedSource`. This means multiple tools have separately implemented a way to impose limited fixity adjustments, mostly based on known fixities from base.
2. Names are not yet resolved. So any tool-driven AST modification needs to link up the name in the `ParsedSource` to the one in `RenamedSource`, which is currently done using the wrapped `SrcSpan`, included on every name for this very purpose.
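The span-based linking described in point 2 can be sketched as follows. This is a minimal illustration only: `Span`, `ParsedName`, `RenamedName` and `resolveBySpan` are hypothetical stand-ins invented for this example, not GHC types or functions.

```haskell
-- Simplified, hypothetical stand-ins; these are NOT the GHC types.
data Span = Span { spanLine :: Int, spanColStart :: Int, spanColEnd :: Int }
  deriving (Eq, Show)

-- A name as the parser sees it: just text, not yet resolved.
newtype ParsedName = ParsedName String
  deriving (Eq, Show)

-- A name after renaming: it now knows which module it comes from.
data RenamedName = RenamedName { rnModule :: String, rnOcc :: String }
  deriving (Eq, Show)

-- | Find the resolved name that sits at the same source span as a
-- parsed name. The span is the only key shared by the two trees.
resolveBySpan :: [(Span, RenamedName)] -> (Span, ParsedName) -> Maybe RenamedName
resolveBySpan renamed (sp, _) = lookup sp renamed

main :: IO ()
main = do
  let sp      = Span 3 8 13
      renamed = [(sp, RenamedName "System.IO" "putStr")]
  print (resolveBySpan renamed (sp, ParsedName "putStr"))
  -- A span that the renamed tree does not know about yields Nothing.
  print (resolveBySpan renamed (Span 9 1 4, ParsedName "foo"))
```

The lookup only works as long as the span of a name is left untouched, which is why a tool that edits the `ParsedSource` has to be careful with locations.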
These can both be improved if we can also exact print the `RenamedSource`. But to do so we need to make sure that we bring over the required exact print annotations, as well as any missing information from the renamer transformation. What exactly is missing is currently unknown.
But before tackling this, I propose a cleanup/simplification of the exact print annotations, especially with respect to doubling up on the `SrcSpan` for location.
The first step would be to simplify `LocatedN` to correspond to the `EpAnn` constructor in `EpAnn` only.
So instead of the complex structure listed in https://gitlab.haskell.org/ghc/ghc/-/issues/21264#note_416594, we have
```hs
type instance XRec (GhcPass p) a = GenLocated (Anno a) a
type instance Anno RdrName = EpAnnS NameAnn
type instance Anno Name = EpAnnS NameAnn
type instance Anno Id = EpAnnS NameAnn
data NameAnn = ...quite an elaborate type...
data EpAnnS ann = EpAnn { entry:: !Anchor, anns:: !ann, comments :: !EpAnnComments}
```
And in general we can migrate to using `Anchor` instead of `SrcSpan` for locations throughout. It needs the following attributes:
1. It must correspond to the `SrcSpan.RealSrcSpan` constructor, i.e. must contain a `RealSrcSpan` and `Maybe BufSpan`, for use in haddock comment attachment.
2. It should have the `AnchorOperation`, so we can edit the AST and still print it if it has moved.
So
```hs
data Anchor = Anchor { anchor :: !RealSrcSpan
                       -- ^ Base location for the start of
                       -- the syntactic element holding
                       -- the annotations
                     , anchor_bufspan :: !(Strict.Maybe BufSpan)
                       -- ^ Needed to be able to convert
                       -- back to a @SrcSpan@ while parsing.
                     , anchor_op :: !AnchorOperation }
        deriving (Data, Eq, Show)
```
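To make the two requirements concrete, here is a small self-contained sketch. All types are simplified stand-ins defined locally (plain `Maybe` instead of `Strict.Maybe`, an assumed two-`Int` payload for `MovedAnchor`), and `anchorToSrcSpan` and `hasMoved` are hypothetical helper names, not GHC functions.

```haskell
-- Simplified local stand-ins for the types named in the text.
data RealSrcSpan = RealSrcSpan'
  { srcSpanSLine, srcSpanSCol, srcSpanELine, srcSpanECol :: Int }
  deriving (Eq, Show)

newtype BufSpan = BufSpan (Int, Int)
  deriving (Eq, Show)

-- Assumed shape: a moved anchor records a line and column delta.
data AnchorOperation = UnchangedAnchor | MovedAnchor Int Int
  deriving (Eq, Show)

data Anchor = Anchor
  { anchor         :: RealSrcSpan
  , anchor_bufspan :: Maybe BufSpan
  , anchor_op      :: AnchorOperation
  } deriving (Eq, Show)

-- Mirrors the shape of the SrcSpan.RealSrcSpan constructor (requirement 1).
data SrcSpan
  = RealSrcSpan RealSrcSpan (Maybe BufSpan)
  | UnhelpfulSpan String
  deriving (Eq, Show)

-- | Requirement 1: an anchor carries enough to rebuild a real SrcSpan.
anchorToSrcSpan :: Anchor -> SrcSpan
anchorToSrcSpan a = RealSrcSpan (anchor a) (anchor_bufspan a)

-- | Requirement 2: the operation records whether the element was moved.
hasMoved :: Anchor -> Bool
hasMoved a = anchor_op a /= UnchangedAnchor

main :: IO ()
main = do
  let a = Anchor (RealSrcSpan' 1 1 1 20) (Just (BufSpan (0, 19))) UnchangedAnchor
  print (anchorToSrcSpan a)
  print (hasMoved a)
  print (hasMoved (a { anchor_op = MovedAnchor 1 0 }))
```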
At the moment, we have
```hs
data EpAnn ann
  = EpAnn { ... same as `EpAnnS` }
  | EpAnnNotUsed
```
If we do this, we can possibly get rid of the `EpAnnNotUsed` constructor completely, and so use `EpAnn` only, with no separate `EpAnnS` type.
Note: We add the `anchor_bufspan` field to `Anchor` because it is not in `RealSrcSpan`. An alternative would be to modify `SrcLoc`:
```hs
data RealSrcSpan
  = RealSrcSpan'
      { srcSpanFile    :: !FastString,
        srcSpanSLine   :: {-# UNPACK #-} !Int,
        srcSpanSCol    :: {-# UNPACK #-} !Int,
        srcSpanELine   :: {-# UNPACK #-} !Int,
        srcSpanECol    :: {-# UNPACK #-} !Int,
        srcSpanBufSpan :: !(Strict.Maybe BufSpan)
      }
  deriving Eq

data SrcSpan =
    RealSrcSpan !RealSrcSpan
  | UnhelpfulSpan !UnhelpfulSpanReason
```

Assignee: Alan Zimmerman

EPA : ghc-exactprint logistics
https://gitlab.haskell.org/ghc/ghc/-/issues/21355
Updated: 2023-01-13T13:21:55Z
Author: Alan Zimmerman

When the exact print annotations were initially brought in, they were accompanied by a series of tests.
But these tests did not exercise the actual intended usage of the annotations, which took place in the
`ghc-exactprint` library, hosted separately on github.
Historically, there has been a process which kicks off around when a new GHC release branch is cut to add support
for the pending release to the `ghc-exactprint` library. This includes generating issues on GHC when it is found that
the annotations need to be tweaked to properly round trip the extensive test suite.
Now that the exact print annotations (EPAs) are in the GHC `ParsedSource` (since GHC 9.2.1), there is a
full-functionality source code round tripper in the GHC source tree, under `../utils/check-exact`.
But the `ghc-exactprint` library has been updated separately from the one in GHC, and published to hackage.
At the same time, the one in GHC has been updated to keep track of the ongoing changes to GHC.
This means `ghc-exactprint` is in a similar situation to haddock, in that changes to GHC require changes to
`ghc-exactprint`, but it is (currently) published separately from GHC.
Here are some options as to how to proceed, and I am sure there are other ways of doing it too, hence this issue.
1. Status Quo, continue as before.
In practical terms, this means creating a branch for GHC head on the ghc-exactprint library, and merging in
the changes from GHC head. Once this branch is stable, it gets merged back into the GHC utils/check-exact
version, ready for the next round.
2. Similar to haddock. ghc-exactprint becomes something inside the GHC source
This means `utils/check-exact` becomes a thin driver over the main library, which must then be kept up to date.
Having experienced the friction of landing MRs on haddock this does not feel like a very good solution.
3. Some sort of split of ghc-exactprint, into a library part fully inside the GHC source, and the other
parts still inside the github repo.
Any other suggestions?

Assignee: Alan Zimmerman

EPA: LANGUAGE pragmas in the body of a module have wrong PSpan
https://gitlab.haskell.org/ghc/ghc/-/issues/20720
Updated: 2021-11-23T16:17:10Z
Author: Alan Zimmerman

In
```hs
{-# LANGUAGE GADTs #-}
module PragmaSpans where
{-# LANGUAGE TypeFamilies #-}
```
The `PSpan` for the second pragma is the beginning of its line, rather than the location of the `where` token as expected.

Assignee: Alan Zimmerman

EPA: semicolon attached to wrong place
https://gitlab.haskell.org/ghc/ghc/-/issues/20715
Updated: 2021-11-23T08:22:37Z
Author: Alan Zimmerman

For the code
```hs
foo x = x
-- comment
;
```
the `AddSemiAnn` should be attached to the top level `ValD`, but it is attached to the `Match` instead.
Note: if the function becomes `foo = x` then it is attached in the right place.

Assignee: Alan Zimmerman

EPA: Document and/or restructure NameAnn
https://gitlab.haskell.org/ghc/ghc/-/issues/20413
Updated: 2021-09-29T19:45:51Z
Author: Richard Eisenberg (rae@richarde.dev)

In reviewing some code, I decided to educate myself a bit on the new EPA structure, and I stumbled.
Very specifically: the comment on the `EpAnnNotUsed` constructor says that it's used in generated code, such as TH or `deriving`. Yet, this is not true in practice: the `sL1n` function is used extensively in the parser for user-written code, and it calls `noAnnSrcSpan`, which uses `EpAnnNotUsed`. So, at a minimum, the comment on `EpAnnNotUsed` should be updated.
But I think there is some restructuring that we could do here.
* I like the idea of `EpAnnNotUsed`, as currently documented (but not as used): it says when there is no useful EPA to give. But shouldn't it always be paired with `UnhelpfulSpan`? If I'm right -- that `EpAnnNotUsed` and `UnhelpfulSpan` go together -- that suggests that the `EpAnn`/`SrcSpanAnn'` structure is mis-aligned. That is, maybe `SrcSpanAnn'` (what is that prime doing there?!) should have two constructors, one for real spans and one for unhelpful ones. This would be instead of giving `EpAnn` two constructors.
* I was quite surprised reading `NameAnn`: I kept looking for a constructor that would apply for normal names. I had to poke around to find out that the normal case is covered by `EpAnnNotUsed`... which (to me) doesn't communicate "normal name".
* There's also something that feels off about the five different `SrcSpanAnn?` types. In particular, there is `NameAnn`, useful for names, and `AnnListItem`, useful for things that can be stored in lists. But what if I have a list of names? This happens in fixity declarations, for example. Looking at `data FixitySig` -- which just stores a `[LIdP pass]` (which stores `NameAnn`) -- I don't see how it's possible to store the commas correctly. That is, how do we tell the difference between `infix 5 *, +` and `infix 5 * ,+`? I really don't know! And it seems impossible to store this with the current `NameAnn`/`AnnListItem` split. So something seems wrong here.
* Perhaps related: each constructor of `NameAnn` has a `nann_trailing` field. Lots of other annotations have a similar field, too. This seems like something that should be abstracted into its own definition.
Aha! I just unlocked new understanding. The last two points are intimately related. `AnnListItem` is, essentially, a least-common-denominator for these annotations, containing just `[TrailingAnn]` (still not 100% sure what that is). This relates to some wisdom I read somewhere, which said "If you need a `Located`, use `LocatedA` now." -- that's because `AnnListItem` is in some sense the least informative annotation. But because all sorts of other things can also appear in lists, all the other annotation forms also include the `[TrailingAnn]` -- in particular, `NameAnn`. So names *can* appear in a list: they just use the `nann_trailing` to act like the `lann_trailing` field of an `AnnListItem`.
This all strongly suggests that the `[TrailingAnn]` information is misplaced. Either it should be part of the enclosing list AST node somehow, or, failing that, it should be part of `EpAnn`, which frequently (always?) wraps the individual annotations (like `NameAnn` and `AnnListItem`). With this change, `AnnListItem` stores no information and can become a type isomorphic to `()`.
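One way to picture this suggestion is the sketch below. It is only an illustration under simplifying assumptions: the types are cut-down local stand-ins (no locations or comments), the `trailing` field on `EpAnn` is the hypothetical hoisted field, and `nameAdornment`, `toUnit`, `fromUnit` and `hasTrailingComma` are names invented here.

```haskell
-- Cut-down stand-in: the real trailing annotations also carry locations.
data TrailingAnn = AddCommaAnn | AddSemiAnn
  deriving (Eq, Show)

-- Hypothetical wrapper with the trailing annotations hoisted into it,
-- so the individual annotation types no longer need their own copy.
data EpAnn ann = EpAnn
  { anns     :: ann
  , trailing :: [TrailingAnn]
  } deriving (Eq, Show)

-- With the trailing list gone, AnnListItem carries no information.
data AnnListItem = AnnListItem
  deriving (Eq, Show)

-- A name annotation without a per-constructor trailing field.
newtype NameAnn = NameAnn { nameAdornment :: Maybe String }
  deriving (Eq, Show)

-- Witnesses that AnnListItem is isomorphic to ().
toUnit :: AnnListItem -> ()
toUnit AnnListItem = ()

fromUnit :: () -> AnnListItem
fromUnit () = AnnListItem

-- | Works for every annotation type, names and list items alike.
hasTrailingComma :: EpAnn ann -> Bool
hasTrailingComma = elem AddCommaAnn . trailing

main :: IO ()
main = do
  -- Two names in a fixity declaration: the first is followed by a comma.
  let fixityNames = [EpAnn (NameAnn Nothing) [AddCommaAnn], EpAnn (NameAnn Nothing) []]
  print (map hasTrailingComma fixityNames)
  print (hasTrailingComma (EpAnn AnnListItem [AddSemiAnn]))
```

Because `hasTrailingComma` is polymorphic in `ann`, a list of names and a list of plain items are handled by the same code, which is the point of moving the field.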
I'm sure there's more to work out here, but at the very very least, let's redocument `EpAnnNotUsed` (and maybe rename it?) to say what it's really for. Thanks!

Assignee: Alan Zimmerman

mkLHsOpTy is discarding API Annotations
https://gitlab.haskell.org/ghc/ghc/-/issues/15309
Updated: 2020-01-23T19:18:02Z
Author: Alan Zimmerman

For
```hs
ft :: (->) a b
```
the `AnnRarrow` API annotation is discarded.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | ------------ |
| Version | 8.5 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>

Assignee: Alan Zimmerman

Add AnnTypeAt to distinguish between '@' for type application
https://gitlab.haskell.org/ghc/ghc/-/issues/14726
Updated: 2019-07-07T18:15:52Z
Author: Alan Zimmerman

At the moment all uses of '@' have the same annotation.
It simplifies tools to provide a different one for the `@` in visible type application.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | ----------------- |
| Version | 8.2.2 |
| Type | Task |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler (Parser) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>

Assignee: Alan Zimmerman

Data.Either doesn't export INLINABLE short functions like "rights"
https://gitlab.haskell.org/ghc/ghc/-/issues/13689
Updated: 2019-07-07T18:20:35Z
Author: varosi

Currently, if I use Data.Either's simple functions like "rights", "isLeft", etc., I see in Core/Cmm with -O2 that they are called like external functions and not inlined. This is because they are not marked as INLINABLE in the library itself.
It would be great if such functions in base were marked as INLINABLE, so that the optimizer/inliner can generate more efficient code.
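For illustration, this is what the requested marking looks like on user-level definitions. `myRights` and `myIsLeft` are local re-implementations written for this sketch; they are not the definitions in base, and whether GHC actually inlines a given call still depends on its optimisation decisions.

```haskell
-- Local re-implementations for illustration; NOT the code in base.

-- The INLINABLE pragma asks GHC to keep the definition's unfolding in
-- the interface file, so that importing modules can inline or
-- specialise it.
{-# INLINABLE myRights #-}
myRights :: [Either a b] -> [b]
myRights xs = [b | Right b <- xs]

{-# INLINABLE myIsLeft #-}
myIsLeft :: Either a b -> Bool
myIsLeft (Left _)  = True
myIsLeft (Right _) = False

main :: IO ()
main = do
  let values = [Left "e", Right (1 :: Int), Right 2]
  print (myRights values)
  print (map myIsLeft values)
```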
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 8.0.2 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Core Libraries |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>

Milestone: ⊥

Remove comments about API annotations
https://gitlab.haskell.org/ghc/ghc/-/issues/13521
Updated: 2019-07-07T18:21:31Z
Author: Matthew Pickering

There are lots of comments scattered around the code which look like
```
-- - 'ApiAnnotation.AnnKeywordId's : 'ApiAnnotation.AnnOpen',
-- 'ApiAnnotation.AnnClose',
-- 'ApiAnnotation.AnnComma',
-- 'ApiAnnotation.AnnType'
-- For details on above see note [Api annotations] in ApiAnnotation
```
these are meant to tell you which annotations are associated with each syntax element.
They were added in an heroic effort by alanz but are very hard to keep up to date and to verify for correctness.
It would be much better if we had a programmatic description of which annotations could be attached to which syntax elements and thankfully one already exists in `ghc-exactprint`. This library isn't in the source tree but I don't think this really matters, users can just depend on it if they want to use them.
I think we should remove all these comments to clean up the source a bit and point users to `ghc-exactprint` from the `ApiAnnotations` page.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | ------------ |
| Version | 8.0.1 |
| Type | Task |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | alanz |
| Operating system | |
| Architecture | |
</details>

ApiAnnotations : make annotation for shebang
https://gitlab.haskell.org/ghc/ghc/-/issues/11092
Updated: 2020-01-23T19:37:51Z
Author: Alan Zimmerman

Currently a valid haskell file can have a first line of the form
```lhs
#!/usr/bin/env runhaskell
```
This does not end up as anything that can have an API Annotation, and so gets lost on round tripping.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | --------------- |
| Version | 7.10.2 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | hvr, mpickering |
| Operating system | |
| Architecture | |
</details>

Assignee: Alan Zimmerman

Empty Haddock comments no longer occur in the AST as `HsDoc`
https://gitlab.haskell.org/ghc/ghc/-/issues/23935
Updated: 2023-09-14T19:20:21Z
Author: amesgen

## Summary
Consider the following two type signatures.
```haskell
foo :: {- |-} A -> B
bar :: {- | -} A -> B
```
Comparing the AST (with `-haddock`) of `foo` and `bar`, note that `foo` does not contain a `HsDoc` (search for `WithHsDocIdentifiers`), but `bar` does:
<table>
<tr><th>
`foo`</th><th>
`bar`</th></tr>
<tr>
<td>
```haskell
(L
(SrcSpanAnn (EpAnn
(Anchor
{ <interactive>:1:1-20 }
(UnchangedAnchor))
(AnnListItem
[])
(EpaComments
[])) { <interactive>:1:1-20 })
(SigD
(NoExtField)
(TypeSig
(EpAnn
(Anchor
{ <interactive>:1:1-3 }
(UnchangedAnchor))
(AnnSig
(AddEpAnn AnnDcolon (EpaSpan { <interactive>:1:5-6 }))
[])
(EpaComments
[]))
[(L
(SrcSpanAnn (EpAnnNotUsed) { <interactive>:1:1-3 })
(Unqual
{OccName: foo}))]
(HsWC
(NoExtField)
(L
(SrcSpanAnn (EpAnnNotUsed) { <interactive>:1:15-20 })
(HsSig
(NoExtField)
(HsOuterImplicit
(NoExtField))
(L
(SrcSpanAnn (EpAnnNotUsed) { <interactive>:1:15-20 })
(HsFunTy
(EpAnn
(Anchor
{ <interactive>:1:15 }
(UnchangedAnchor))
(NoEpAnns)
(EpaComments
[]))
(HsUnrestrictedArrow
(L
(TokenLoc
(EpaSpan { <interactive>:1:17-18 }))
(HsNormalTok)))
(L
(SrcSpanAnn (EpAnnNotUsed) { <interactive>:1:15 })
(HsTyVar
(EpAnn
(Anchor
{ <interactive>:1:15 }
(UnchangedAnchor))
[]
(EpaComments
[]))
(NotPromoted)
(L
(SrcSpanAnn (EpAnnNotUsed) { <interactive>:1:15 })
(Unqual
{OccName: A}))))
(L
(SrcSpanAnn (EpAnnNotUsed) { <interactive>:1:20 })
(HsTyVar
(EpAnn
(Anchor
{ <interactive>:1:20 }
(UnchangedAnchor))
[]
(EpaComments
[]))
(NotPromoted)
(L
(SrcSpanAnn (EpAnnNotUsed) { <interactive>:1:20 })
(Unqual
{OccName: B}))))))))))))
```
</td>
<td>
```haskell
(L
(SrcSpanAnn (EpAnn
(Anchor
{ <interactive>:1:1-21 }
(UnchangedAnchor))
(AnnListItem
[])
(EpaComments
[])) { <interactive>:1:1-21 })
(SigD
(NoExtField)
(TypeSig
(EpAnn
(Anchor
{ <interactive>:1:1-3 }
(UnchangedAnchor))
(AnnSig
(AddEpAnn AnnDcolon (EpaSpan { <interactive>:1:5-6 }))
[])
(EpaComments
[]))
[(L
(SrcSpanAnn (EpAnnNotUsed) { <interactive>:1:1-3 })
(Unqual
{OccName: bar}))]
(HsWC
(NoExtField)
(L
(SrcSpanAnn (EpAnnNotUsed) { <interactive>:1:16-21 })
(HsSig
(NoExtField)
(HsOuterImplicit
(NoExtField))
(L
(SrcSpanAnn (EpAnnNotUsed) { <interactive>:1:16-21 })
(HsFunTy
(EpAnn
(Anchor
{ <interactive>:1:16 }
(UnchangedAnchor))
(NoEpAnns)
(EpaComments
[]))
(HsUnrestrictedArrow
(L
(TokenLoc
(EpaSpan { <interactive>:1:18-19 }))
(HsNormalTok)))
(L
(SrcSpanAnn (EpAnnNotUsed) { <interactive>:1:16 })
(HsDocTy
(EpAnnNotUsed)
(L
(SrcSpanAnn (EpAnnNotUsed) { <interactive>:1:16 })
(HsTyVar
(EpAnn
(Anchor
{ <interactive>:1:16 }
(UnchangedAnchor))
[]
(EpaComments
[]))
(NotPromoted)
(L
(SrcSpanAnn (EpAnnNotUsed) { <interactive>:1:16 })
(Unqual
{OccName: A}))))
(L
{ <interactive>:1:8-14 }
(WithHsDocIdentifiers
(NestedDocString
(HsDocStringNext)
(L
{ <interactive>:1:8-14 }
(HsDocStringChunk
" ")))
[]))))
(L
(SrcSpanAnn (EpAnnNotUsed) { <interactive>:1:21 })
(HsTyVar
(EpAnn
(Anchor
{ <interactive>:1:21 }
(UnchangedAnchor))
[]
(EpaComments
[]))
(NotPromoted)
(L
(SrcSpanAnn (EpAnnNotUsed) { <interactive>:1:21 })
(Unqual
{OccName: B}))))))))))))
```
</td>
</tr>
</table>
Is there a particular reason for this? In GHC 8.10, the AST contained Haddock comments in both cases.
Concrete effects of this behavior:
- It makes the job of formatters like Ormolu (see issues [1068](https://github.com/tweag/ormolu/pull/1068), [1065](https://github.com/tweag/ormolu/issues/1065), [726](https://github.com/tweag/ormolu/issues/726)) that automatically check for AST discrepancies harder than necessary, as e.g. a natural rewrite from
```haskell
foo ::
-- |
--
A ->
B
```
to
```haskell
foo ::
-- |
A ->
B
```
contains a Haddock comment in the AST in the first snippet, but not in the second.
- A nice Haddock trick by @tomjaguarpaw1 ([blog post](http://h2.jaguarpaw.co.uk/posts/improving-the-typed-process-documentation/), search for "Forced type signatures to wrap") [no longer works](https://github.com/tweag/ormolu/pull/1068#issuecomment-1707237587).
Ideally, the behavior would be changed back to what it was in 8.10; I could try to do that in case this behavior is not intentional.
## Environment
* GHC version used: Any GHC since 9.0 (I think this change is due to !2377)

Exploring in-tree API Annotations via TTG
https://gitlab.haskell.org/ghc/ghc/-/issues/17638
Updated: 2021-03-09T14:16:54Z
Author: Alan Zimmerman

## Motivation
At the moment the API Annotations are produced as a separate artifact of the parsing process, and are processed via external libraries to re-attach them to their relevant locations.
Also, in order to support the convention used to associate an API Annotation with an hsSyn AST element, the hsSyn AST has a lot more locations stored in it.
One of the purposes of the Trees That Grow mechanism is that it can be used to bring in additional information, as required.
This issue explores a way of bringing the API Annotations into the hsSyn AST, as a precursor to migrating the bulk of the `ghc-exactprint` functionality directly into GHC.
As a by-product, it will enable reproducing the original source text for GHC-issued diagnostics.
## Proposal
In addition to emitting the API Annotations as currently, also store them inside the TTG extension points.
We use a structure as per below to capture these
```haskell
-- | The API Annotations are now kept in the HsSyn AST for the GhcPs
-- phase. We do not always have API Annotations though, only for
-- parsed code. This type captures that, and allows the
-- representation decision to be easily revisited as it evolves.
data AA = AA [AddAnn] -- ^ Annotations added by the Parser
        | AANotUsed   -- ^ No Annotation for generated code, e.g. from
                      -- TH, deriving, etc.
  deriving (Data, Show)
noAnn :: AA
noAnn = AANotUsed
```

Assignee: Alan Zimmerman

EPA: DecBrL should have an AnnList exactprint annotation
https://gitlab.haskell.org/ghc/ghc/-/issues/20257
Updated: 2022-08-15T16:10:06Z
Author: Alan Zimmerman

The code
```
emptyBK = [d| {} |]
```
results in the annotations for the braces being in an awkward position, and it is actually a normal list, which can have internal semicolons too.
So use an `EpAnn AnnList` specifically for it.

Milestone: 9.4.3
Assignee: Alan Zimmerman

Refactor Anno type family and explicitly mark out annotation types in the AST.
https://gitlab.haskell.org/ghc/ghc/-/issues/20039
Updated: 2021-06-29T14:45:01Z
Author: Zubin

The `Anno` type family is currently used to attach exactprint meta data to `SrcSpan`s via the `XRec` wrapper type family:
```haskell
type instance XRec (GhcPass p) a = GenLocated (Anno a) a
```
There is currently a [plan to move this information directly into the syntax tree](https://gitlab.haskell.org/ghc/ghc/-/wikis/api-annotations#token-information-in-the-syntax-tree-plan-b). However, this will be a major refactoring and has an indeterminate ETA. Meanwhile,
these instances get in the way of other people trying to use TTG to define custom "passes" over the AST (like @fendor's WIP rework of !3866) and result in terrible type errors (https://paste.tomsmeding.com/PtDRKhYf).
There are a lot of instances of `Anno` defined for general purpose types, like `[..]`, `Maybe ...`, `(..,..)`, which might be used in many contexts in the AST, requiring different annotations for each. However, currently these are tied to a single context and will always be given a single type of exactprint annotations.
For example, the following instances of `Anno` are OK, since they are defined for a specific kind of AST construct:
```haskell
type instance Anno (RuleBndr (GhcPass p)) = SrcSpan
type instance Anno (RuleDecl (GhcPass p)) = SrcSpanAnnA
type instance Anno (DerivStrategy (GhcPass p)) = SrcSpan
```
However, some instances are for more generic types that might be used in different contexts:
```haskell
-- For CompleteMatchSig
type instance Anno [LocatedN RdrName] = SrcSpan
type instance Anno [LocatedN Name] = SrcSpan
type instance Anno [LocatedN Id] = SrcSpan
```
`[LocatedN Name]` is a very general purpose type that may occur in multiple places in the AST. We can easily imagine some kind of future
construct which would require a `[LocatedN Name]` field with a `SrcSpanAnnA` annotation instead. However, `Anno` implicitly ties it to the
current usage in `CompleteMatchSig`, which means that this type cannot currently be used anywhere else without significant refactoring.
Then there are even more horrific instances of `Anno`...
```haskell
type instance Anno [LocatedA ((StmtLR (GhcPass pl) (GhcPass pr) (LocatedA (HsExpr (GhcPass pr)))))] = SrcSpanAnnL
type instance Anno [LocatedA ((StmtLR (GhcPass pl) (GhcPass pr) (LocatedA (HsCmd (GhcPass pr)))))] = SrcSpanAnnL
```
One way to fix this could be to define a newtype wrapper that has a phantom type that determines the type of the annotation:
```haskell
newtype Annotated a x = Annotated { unAnnotate :: x }
type instance Anno (Annotated a x) = a
```
Then we can use `Annotated SrcSpan [LocatedN Name]` when we don't need additional annotations, `Annotated SrcSpanAnnA [LocatedN Name]` when we need `[LocatedN Name]` annotated with list annotations and so on.
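To make the effect of the wrapper concrete, here is a minimal, self-contained sketch. It is not GHC code: `SrcSpan`, `SrcSpanAnnA`, `GenLocated` and `XRec` below are simplified stand-ins, the payload is `[String]` rather than `[LocatedN Name]`, and `plainNames`, `listNames` and `payload` are hypothetical names used only for illustration. It shows the same payload type carrying two different annotation types, selected by the phantom parameter.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Toy stand-ins for GHC's types (assumptions for this sketch only);
-- the real SrcSpan, SrcSpanAnnA and GenLocated carry far more information.
newtype SrcSpan = SrcSpan (Int, Int) deriving (Eq, Show)
data SrcSpanAnnA = SrcSpanAnnA SrcSpan [String] deriving (Eq, Show)

data GenLocated l e = L l e deriving (Eq, Show)

-- Maps a syntax type to the type of annotation attached to it.
type family Anno a

-- The wrapper from the proposal: the phantom parameter a picks the annotation.
newtype Annotated a x = Annotated { unAnnotate :: x } deriving (Eq, Show)
type instance Anno (Annotated a x) = a

-- Toy XRec without a pass parameter: wrap a value with its Anno-determined location.
type XRec a = GenLocated (Anno a) a

-- The same payload type, [String], used with two different annotation types.
plainNames :: XRec (Annotated SrcSpan [String])
plainNames = L (SrcSpan (1, 5)) (Annotated ["foo", "bar"])

listNames :: XRec (Annotated SrcSpanAnnA [String])
listNames = L (SrcSpanAnnA (SrcSpan (2, 9)) ["comma"]) (Annotated ["foo", "bar"])

-- Strip the location and the wrapper to get the payload back.
payload :: GenLocated l (Annotated a x) -> x
payload (L _ (Annotated x)) = x
```

Because the only `Anno` instance is the one for `Annotated`, no instance is tied to `[String]` itself, so a third use site with yet another annotation type needs no new instance.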
~~I also propose the following refactoring to `GenLocated`, since we always need a `SrcSpan` regardless of the annotation type:~~(Scratch this idea, it doesn't seem to be worth it)
```haskell
data GenLocated a l e = L a l e
type instance XRec (GhcPass p) a = GenLocated (Anno a) SrcSpan a
type LocatedA = GenLocated AnnListItem SrcSpan
type LocatedN = GenLocated NameAnn SrcSpan
type Located = GenLocated () SrcSpan
-- For elements without annotations:
type instance Anno (RuleBndr (GhcPass p)) = ()
-- Or possibly
data NoAnn = NoAnn
type Located = GenLocated NoAnn SrcSpan
type instance Anno (RuleBndr (GhcPass p)) = NoAnn
```
~~Then we can get rid of the `type SrcAnn ann = SrcSpanAnn' (EpAnn ann)` and `type SrcSpanAnnA = SrcAnn AnnListItem` etc. types.~~
/cc @alanz @int-index @fendorHannes SiebenhandlHannes Siebenhandlhttps://gitlab.haskell.org/ghc/ghc/-/issues/19706EPA: Opt_KeepRawTokenStream is no longer used2022-05-30T17:12:27ZAlan ZimmermanEPA: Opt_KeepRawTokenStream is no longer usedFollow-up from !2418, see #19579
The flag `Opt_KeepRawTokenStream` is no longer used. It should be removed.
Is there a policy around deprecation, or can it just be removed?https://gitlab.haskell.org/ghc/ghc/-/issues/23447Where should "tokens" live in the abstract syntax tree?2023-12-21T21:18:04ZSimon Peyton JonesWhere should "tokens" live in the abstract syntax tree?Language.Haskell.Syntax is a *compiler-independent* data type for the Haskell abstract
syntax tree. It is designed to be [extensible using Trees that Grow](https://gitlab.haskell.org/ghc/ghc/-/wikis/implementing-trees-that-grow).
The question that this ticket addresses is **where should we store information about the precise position of the keywords and punctuation of the program?**
Progress:
* !11716: move tokens for `HsLet` into the extension field, and `EpAnn` stuff into the `<xrec-stuff>` field
* !11756: move tokens into `GhcPs` extension fields
There has been some discussion in the past:
* The ["API annotations" wiki page](https://gitlab.haskell.org/ghc/ghc/-/wikis/api-annotations)
* #19623
* #22558
* MRs in flight: !9476 !9477
* [ghc-devs discussion thread (July 2023)](https://mail.haskell.org/pipermail/ghc-devs/2023-July/021305.html)
## Tokens
We use the term **tokens** for the "keywords and punctuation".
We already have the type `HsToken` in `Language.Haskell.Syntax`,
defined as follows:
```
type LHsToken tok p = XRec p (HsToken tok)
data HsToken (tok :: Symbol) = HsTok
```
So `LHsToken "wombat" p` represents the keyword `wombat`, with the "wombat" in the type giving some helpful documentation. The main payload is the `XRec` part, which allows a client to record the location of the token.
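As a rough illustration of how such a type-level token behaves, the sketch below repeats the two declarations and adds a toy `XRec` family with two hypothetical passes, `Located` and `Bare`, which are assumptions of this example and not GHC passes. `tokenText`, `letAt` and `letBare` are likewise invented names. It uses `symbolVal` from `GHC.TypeLits` to recover the token text from the type.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables, TypeFamilies #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

-- Same shape as the declarations above.
data HsToken (tok :: Symbol) = HsTok

-- Toy XRec: each pass decides how a node is wrapped.
type family XRec p a
type LHsToken tok p = XRec p (HsToken tok)

-- Hypothetical pass that records a (line, column) for every wrapped node.
data Located
type instance XRec Located a = ((Int, Int), a)

-- Hypothetical pass that records nothing at all.
data Bare
type instance XRec Bare a = a

-- The token text lives in the type, so it can be recovered without storing it.
tokenText :: forall tok. KnownSymbol tok => HsToken tok -> String
tokenText _ = symbolVal (Proxy :: Proxy tok)

-- The same token under the two passes.
letAt :: LHsToken "let" Located
letAt = ((3, 1), HsTok)

letBare :: LHsToken "let" Bare
letBare = HsTok
```

The value `HsTok` carries no data of its own; everything a client learns about the token comes either from the type-level string or from whatever the pass's `XRec` instance wraps around it.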
## Motivation
Why do we want to store those tokens in the syntax tree at all? Use cases:
1. Refactoring tools could parse the source program, modify a small part of it, and print it back into the source file. The formatting of unmodified parts should be preserved, so we need the locations of every token (that's called "exactprinting").
2. Haddock needs to associate documentation comments with AST nodes. Doing so in the parser is very difficult, so we just accumulate the comments in a list and insert them back into the tree in a separate pass. We need token location information to do this.
In other cases, those tokens are an annoyance:
1. `template-haskell`, as well as any other GHC API client that generates ASTs, doesn't have token locations and has to fill them with `noHsTok`.
2. The renamer, the type checker, and the desugarer have no use for those tokens. Passing them around is a distraction from the actual renaming/type-checking/desugaring logic.
## Possible Approaches
There are two general approaches:
* **Token Plan A**. Tokens are not part of the *abstract* syntax tree, and do not belong in Language.Haskell.Syntax at all. If you want to store that stuff, do it in an extension field.
* **Token Plan B**. It is often helpful to be able to reproduce *precisely* what the
programmer wrote (so-called "exact-print"). That means knowing precisely where the keywords and
punctuation were. Rather than duplicating this rendering/pretty-printing code separately for each tool, it would be nice to do it once, in Language.Haskell.Syntax.
One might argue that this makes our AST less abstract, so it’s actually a concrete syntax tree. But Language.Haskell.Syntax already retains some information uncharacteristic of a proper AST, such as parentheses (with `HsPar`), so adding token information is arguably appropriate.
Currently in GHC HEAD we have mainly Plan A, with a sprinkling of Plan B. For example:
```
data HsExpr p
  = ...
  | HsPar (XPar p)
          !(LHsToken "(" p)
          (LHsExpr p)  -- ^ Parenthesised expr; see Note [Parens in HsSyn]
          !(LHsToken ")" p)
```
But we have no clear decision or plan. Hence this ticket.
## Details about Plan A
To put the token information in the extension fields, a client of Language.Haskell.Syntax
would do something like this. Here is the declaration of `HsExpr`:
```haskell
data HsExpr p
  = ...
  | HsLet (XLet p)
          (LHsLocalBinds p)
          (LHsExpr p)
```
The question is then how downstream API users should consume these annotations while still being able to extend the AST themselves. I think this can be accomplished by introducing a new pass transformer, `WithExact`:
```haskell
-- | A pass @p@ augmented with information necessary for exact-printing.
data WithExact p
```
We can then introduce the appropriate type family instances to capture tokens as necessary. For instance, `let` might look like:
```haskell
type instance XLet (WithExact p) = (XLet p, (LHsToken "let" p, LHsToken "in" p))
```
The various GHC passes would then be defined as:
```haskell
type GhcPs = WithExact (GhcPass 'Parsed)
type GhcRn = WithExact (GhcPass 'Renamed)
type GhcTc = WithExact (GhcPass 'Typechecked)
```
Now the extension field is always a pair, of the previous `XLet p` information, and a tuple of tokens. If there are no tokens for a constructor `K` one could say
```
type instance XK (WithExact p) = XK p
```
Alternatively one could use a data type with named fields:
```
type instance XLet (WithExact p) = ExactLet p
data ExactLet p = ExactLet { exactLetLet :: !(LHsToken "let" p)
                           , exactLetIn  :: !(LHsToken "in" p)
                           , exactLetX   :: XLet p
                           }
```
but that seems overkill: the tokens are already self-documenting.
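The following self-contained sketch shows how the pair-shaped extension field might play out on a toy AST. Everything in it is an assumption made for illustration: `Expr`, `Base`, `LTok`, `freeVars` and `letTokenLocs` are hypothetical, located tokens are plain pairs instead of going through `XRec`, and `let` is treated as non-recursive to keep `freeVars` short.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies #-}

import GHC.TypeLits (Symbol)

data HsToken (tok :: Symbol) = HsTok

type Loc = (Int, Int)

-- Simplification: a located token is always a pair here, not an XRec.
type LTok (tok :: Symbol) = (Loc, HsToken tok)

-- Extension-field family and a tiny expression type with one extension point.
type family XLet p

data Expr p
  = Var String
  | Let (XLet p) [(String, Expr p)] (Expr p)

-- A base pass with no extra information, and the pass transformer.
data Base
data WithExact p

type instance XLet Base = ()
type instance XLet (WithExact p) = (XLet p, (LTok "let", LTok "in"))

-- A client that ignores tokens never inspects the extension field,
-- so it works unchanged at every pass.
freeVars :: Expr p -> [String]
freeVars (Var x) = [x]
freeVars (Let _ binds body) =
  concatMap (freeVars . snd) binds
    ++ filter (`notElem` map fst binds) (freeVars body)

-- A client that wants tokens fixes the pass to WithExact and projects them out.
letTokenLocs :: Expr (WithExact p) -> [(Loc, Loc)]
letTokenLocs (Var _) = []
letTokenLocs (Let (_, ((l, _), (i, _))) binds body) =
  (l, i) : concatMap (letTokenLocs . snd) binds ++ letTokenLocs body

-- Roughly "let x = y in x", with made-up token positions.
example :: Expr (WithExact Base)
example = Let ((), (((1, 1), HsTok), ((1, 11), HsTok))) [("x", Var "y")] (Var "x")
```

In this toy version `freeVars` illustrates the Plan A claim that token-agnostic clients can skip the exact-print data entirely, while `letTokenLocs` shows the cost on the other side: it has to know the pair layout of the extension field.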
## Details about Plan B
In Plan B we directly put the tokens in the tree, *not* in extension fields.
We can do so using two different styles:
* **Token Plan B1**: keep the tokens together in a tuple.
* **Token Plan BN**: spread the tokens across the data constructor in suggestive places.
For example, for `HsLet` here is what Plan B1 looks like:
```
data HsExpr p
  = ...
  ...
  | HsLet (XLet p)
+         (LHsToken "let" p, LHsToken "in" p)
          (LHsLocalBinds p)
          (LHsExpr p)
```
And here is the same for Plan BN:
```
data HsExpr p
  = ...
  ...
  | HsLet (XLet p)
+         (LHsToken "let" p)
          (LHsLocalBinds p)
+         (LHsToken "in" p)
          (LHsExpr p)
```
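To compare the two styles side by side, here is a small sketch on a toy expression type. The types `ExprB1` and `ExprBN`, the column-only locations, and the functions `upperB1`, `upperBN`, `tokenColsB1` and `tokenColsBN` are all hypothetical and only approximate the shape of the real `HsExpr`. The "pass" simply upper-cases names; it has no interest in tokens but still has to thread them through.

```haskell
{-# LANGUAGE DataKinds, KindSignatures #-}

import Data.Char (toUpper)
import GHC.TypeLits (Symbol)

data HsToken (tok :: Symbol) = HsTok

-- Toy location: just a column number.
type Col = Int
type LTok (tok :: Symbol) = (Col, HsToken tok)

-- Plan B1: all tokens of a constructor kept together in one tuple field.
data ExprB1
  = VarB1 String
  | LetB1 (LTok "let", LTok "in") (String, ExprB1) ExprB1

-- Plan BN: tokens interleaved with the other fields, as in the concrete syntax.
data ExprBN
  = VarBN String
  | LetBN (LTok "let") (String, ExprBN) (LTok "in") ExprBN

-- With B1 the pass copies a single token field.
upperB1 :: ExprB1 -> ExprB1
upperB1 (VarB1 x) = VarB1 (map toUpper x)
upperB1 (LetB1 toks (x, rhs) body) =
  LetB1 toks (map toUpper x, upperB1 rhs) (upperB1 body)

-- With BN the pass copies one field per token.
upperBN :: ExprBN -> ExprBN
upperBN (VarBN x) = VarBN (map toUpper x)
upperBN (LetBN tLet (x, rhs) tIn body) =
  LetBN tLet (map toUpper x, upperBN rhs) tIn (upperBN body)

-- Token columns in source order, as an exact-printer would visit them.
tokenColsB1 :: ExprB1 -> [Col]
tokenColsB1 (VarB1 _) = []
tokenColsB1 (LetB1 ((l, _), (i, _)) (_, rhs) body) =
  l : tokenColsB1 rhs ++ i : tokenColsB1 body

tokenColsBN :: ExprBN -> [Col]
tokenColsBN (VarBN _) = []
tokenColsBN (LetBN (l, _) (_, rhs) (i, _) body) =
  l : tokenColsBN rhs ++ i : tokenColsBN body

exampleB1 :: ExprB1
exampleB1 = LetB1 ((1, HsTok), (11, HsTok)) ("x", VarB1 "y") (VarB1 "x")

exampleBN :: ExprBN
exampleBN = LetBN (1, HsTok) ("x", VarBN "y") (11, HsTok) (VarBN "x")
```

Both encodings hold the same information; the difference is only in how many fields every traversal has to name and copy.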
## Comparing plans
Plan A advantages:
* Clients can completely ignore all the exact-print stuff. With Plan B they have to handle those fields, if only to pass them on. With Plan B1 that is not too bad (one field), but it's pretty tiresome with Plan BN.
* Runtime: Plan A doesn't have to pay for exact-print information if it doesn't use it. Plan B allocates more: every data constructor gets more fields, and each pass needs to copy those fields into a new copy of the constructor. Plan B1 is better than Plan BN in this respect.
* Generated code: some clients (such as GHC) *generate* HsSyn, e.g. by desugaring source. For this generated code, the location of the tokens makes no sense. Plan A does not force programmers to invent fake tokens; Plan B does.
Plan B advantages:
* The Big Advantage is to be able to write a single, client-independent exact-print pretty-printer.
* The data type declaration for Plan BN looks quite perspicuous: the tokens appear in the data type interspersed with the non-token arguments, just as in the concrete syntax.
* When a GHC pass uses the extension field, it doesn't need to worry about pairing it up with the exact-print information.
## Missing information
The main benefit of Plan B is that we can make a single exact-print implementation,
in Language.Haskell.Syntax. But that means more than putting `LHsToken` in the
tree: it means that exact-print has to be able to get `SrcSpan`s out of `XRec`.
How does it do that? We need to see that design; otherwise we don't know if
we'll get the payoff.