Add annotations to HsSyn that explicitly track the locations of all non-blank source code elements, so that tools can parse a Haskell file, modify the AST, and then produce an updated version of the source that preserves the layout of the unchanged parts.
Discussion of the feature is at GhcAstAnnotations.
Note: an early effort was at D246, but this was abandoned as unworkable.
Alan writes (in email): I have decided to first tackle adding a type parameter to the entire AST, so that tool writers can add custom information as required. My first stab at this is as follows:
    data HsModule r name
      = HsModule {
          ann :: r,
            -- ^ Annotation for external tool writers
          hsmodName :: Maybe (Located ModuleName),
            -- ^ @Nothing@: \"module X where\" is omitted (in which case the next
            --     field is Nothing too)
          hsmodExports :: Maybe [LIE name],
          ....
Salient points
It comes as the first type parameter, and is called r
It gets added as the first field of the syntax element
It is always called ann
Before undertaking this particular change, I would appreciate some feedback.
In ticket:9628#comment:88519 the issue appears to be about attaching an annotation of client-specified type to every node in the tree.
These seem quite orthogonal to me.
For the latter I would suggest looking at the Located type, instead of what you suggest in ticket:9628#comment:88519. The Located type is wrapped around almost every node in the tree, and if you want to add some ubiquitous annotation type, it would be the place to do so.
Firstly, I agree that this is in fact two orthogonal things, but I think it is good to discuss them together because some potential solutions could couple them together.
Located is something that I considered for the annotation parameter, but I am concerned that it is so baked in to everything else that if it changed it would cause other unforeseen problems.
In order to use it in this way it would have to become a parameter to the AST too, effectively replacing all instances of Located with GenLocated.
Reference:
    data GenLocated l e = L l e
      deriving (Eq, Ord, Typeable, Data)

    type Located e = GenLocated SrcSpan e
Yes, that's right. Looks much nicer to me. The syntax tree is already heavily decorated with SrcSpans. Just parameterise over that, and you can decorate with something else instead.
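To make the "parameterise over that" idea concrete, here is a minimal, self-contained sketch. It is not GHC code: `SrcSpan` is a simplified stand-in, and `Expr`, `mapLoc` and `ToolAnn` are hypothetical names invented for illustration. Only the shape of `GenLocated` is taken from the definition quoted above.

```haskell
module Main where

-- Simplified stand-in for GHC's SrcSpan (assumption: the real type is richer).
data SrcSpan = SrcSpan { startCol :: Int, endCol :: Int }
  deriving (Eq, Show)

-- Same shape as the GenLocated quoted above.
data GenLocated l e = L l e
  deriving (Eq, Show)

type Located e = GenLocated SrcSpan e

-- Hypothetical toy syntax tree, parameterised over the decoration type l.
data Expr l
  = Var String
  | App (GenLocated l (Expr l)) (GenLocated l (Expr l))
  deriving (Eq, Show)

-- Replace the decoration on every node of the tree.
mapLoc :: (l -> l') -> GenLocated l (Expr l) -> GenLocated l' (Expr l')
mapLoc f (L l e) = L (f l) (go e)
  where
    go (Var v)   = Var v
    go (App a b) = App (mapLoc f a) (mapLoc f b)

-- Hypothetical tool-specific decoration: the span plus free-form notes.
data ToolAnn = ToolAnn { annSpan :: SrcSpan, annNotes :: [String] }
  deriving (Eq, Show)

-- What the ordinary pipeline would see: decorations locked down to SrcSpan.
example :: Located (Expr SrcSpan)
example =
  L (SrcSpan 1 4)
    (App (L (SrcSpan 1 2) (Var "f")) (L (SrcSpan 3 4) (Var "x")))

-- What a tool could work with instead: the same tree, richer decorations.
annotated :: GenLocated ToolAnn (Expr ToolAnn)
annotated = mapLoc (\s -> ToolAnn s []) example

main :: IO ()
main = print annotated
```

The cost of this design is that every function over the tree has to be polymorphic in `l`, or has to fix `l` to `SrcSpan` explicitly.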
An alternative would be to insist that there was always a SrcSpan, plus perhaps something else:
    data GenLocated l e = L SrcSpan l e

    type Located e = GenLocated () e
That's tiresome because there are lots of () values, but does mean you can always find a SrcSpan. One would have to explore use cases.
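A small sketch of this second variant shows both the benefit and the `()` noise. As before, `SrcSpan` is a simplified stand-in, and `spanOf` and `noAnn` are hypothetical helpers made up for illustration. In particular, `spanOf` is not GHC's `getLoc`.

```haskell
module Main where

-- Simplified stand-in for GHC's SrcSpan (assumption).
data SrcSpan = SrcSpan Int Int
  deriving (Eq, Show)

-- The alternative quoted above: a SrcSpan is always present,
-- and l is an extra slot for whatever a client wants to attach.
data GenLocated l e = L SrcSpan l e
  deriving (Eq, Show)

type Located e = GenLocated () e

-- Hypothetical accessor: works for every choice of l,
-- which is the selling point of this variant.
spanOf :: GenLocated l e -> SrcSpan
spanOf (L s _ _) = s

-- Hypothetical smart constructor hiding the () that every
-- ordinary node would otherwise have to spell out.
noAnn :: SrcSpan -> e -> Located e
noAnn s e = L s () e

-- An ordinary node, as the compiler proper would build it.
plain :: Located String
plain = noAnn (SrcSpan 1 5) "where"

-- A tool-decorated node carrying extra information alongside the span.
tooled :: GenLocated [String] String
tooled = L (SrcSpan 1 5) ["preceded by a comment"] "where"

main :: IO ()
main = do
  print (spanOf plain)
  print (spanOf tooled)
```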
On further reflection, option 1 will allow for example a smart editor which uses the annotated ParsedSource as a primary data structure and submits changes through for incremental renaming/type checking and eventual code generation.
This way the normal command line invocation will lock the annotation down as a SrcSpan, but it can be replaced all the way through when used for tooling.
You'll need to be much more explicit before anyone is likely to venture an opinion. What is the impact of "managed separately"? What is the motivation for making a change at all?
There are two different Phab tickets: D246 is linked to this ticket, but D297 (I believe) may supersede it. If so, please let's redirect the "Differential revision" field of this ticket, and explicitly mark the moribund one as moribund.
The wiki page GhcAstAnnotations does not appear to reflect any of the discussion. Indeed it appears to describe only the first bullet from ticket:9628#comment:88527.
There has been quite a lot of traffic on ghc-devs that is not captured anywhere. That's fine: an email list is good for discussion. But my input bandwidth is low and I struggle to make sense of it all. And the conclusions from the discussion may be useful.
Alan has posted a useful summary to Haskell Cafe, which isn't captured on a wiki anywhere.
Alan has done some work identifying users for the new features, and written some email notes about that; again this would be useful to capture.
I am too slow to take a big patch and try to reverse-engineer the thought process that went into it. Would it be possible to update the wiki page (presumably GhcAstAnnotations) to state
The problem we are trying to solve
The user-visible (or at least visible-to-client-of-GHC-API) design
Other notes about the implementation.
Covering the larger picture about the GHC API improvements you are making (eg no landmines) would be helpful. Maybe you need more than one page.
I'm delighted you are doing this. But I don't want to throw a lot of code into GHC without a clear, shared consensus about what it is we are trying to do, and how we are doing it.
I'm concerned about the proliferation of data types. As I read it you intend to have a new data type for each constructor of each data type in HsSyn. That's a LOT of new data types! And I bet you'll soon want Eq, Ord, Data instances for them as well as Typeable. Indeed you say
I wonder if something simpler and more dynamically-typed might do. Suppose you had
lookupApiAnns :: Typeable value => ApiAnns -> SrcSpan -> String -> Maybe value
so that ApiAnns is really a map from (SrcSpan, String, TypeRep) to values, where TypeRep there is the TypeRep of the value. The String is the dynamic bit. Now you could say
    processHsClassDecl :: ApiAnns -> LTyClDecl n -> ...
    processHsClassDecl anns (L loc (ClassDecl { .. })) = r
      where
        Just kwd_loc = lookupApiAnns anns loc "class-keyword" :: Maybe SrcSpan
        Just mb_loc  = lookupApiAnns anns loc "class-mwhere"  :: Maybe (Maybe SrcSpan)
        ...
OK, so you might type those strings in wrong, but if you do, the lookup will fail.
I don't want this to sink under the sheer weight of gratuitous declarations.
Oh and you could use the same string in lots of places. e.g. "where-keyword" might be used in a number of constructs.
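For concreteness, here is one way the dynamically-typed map could be sketched using `Data.Dynamic` from base. This is only an illustration under assumptions, not a proposed implementation. `SrcSpan` is a simplified stand-in, and the representation of `ApiAnns`, plus the helpers `emptyApiAnns` and `addApiAnn`, are invented here. Only the signature of `lookupApiAnns` comes from the comment above.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

import           Data.Dynamic  (Dynamic, fromDynamic, toDyn)
import qualified Data.Map      as Map
import           Data.Proxy    (Proxy (..))
import           Data.Typeable (TypeRep, Typeable, typeOf, typeRep)

-- Simplified stand-in for GHC's SrcSpan (assumption).
data SrcSpan = SrcSpan Int Int
  deriving (Eq, Ord, Show)

-- A map from (SrcSpan, String, TypeRep) to values, where the TypeRep
-- is the TypeRep of the stored value.
newtype ApiAnns = ApiAnns (Map.Map (SrcSpan, String, TypeRep) Dynamic)

-- Hypothetical helper: no annotations.
emptyApiAnns :: ApiAnns
emptyApiAnns = ApiAnns Map.empty

-- Hypothetical helper: record a value under a span and a string key.
addApiAnn :: Typeable value => SrcSpan -> String -> value -> ApiAnns -> ApiAnns
addApiAnn sp key v (ApiAnns m) =
  ApiAnns (Map.insert (sp, key, typeOf v) (toDyn v) m)

-- The lookup from the comment above: the requested result type selects
-- the TypeRep component of the key.
lookupApiAnns :: forall value. Typeable value
              => ApiAnns -> SrcSpan -> String -> Maybe value
lookupApiAnns (ApiAnns m) sp key =
  Map.lookup (sp, key, typeRep (Proxy :: Proxy value)) m >>= fromDynamic

classSpan, kwSpan, whereSpan :: SrcSpan
classSpan = SrcSpan 1 40
kwSpan    = SrcSpan 1 5
whereSpan = SrcSpan 36 40

anns :: ApiAnns
anns = addApiAnn classSpan "class-keyword" kwSpan
     $ addApiAnn classSpan "class-mwhere" (Just whereSpan)
     $ emptyApiAnns

main :: IO ()
main = do
  print (lookupApiAnns anns classSpan "class-keyword" :: Maybe SrcSpan)
  print (lookupApiAnns anns classSpan "class-mwhere"  :: Maybe (Maybe SrcSpan))
  -- A misspelled key, or a request at the wrong type, yields Nothing.
  print (lookupApiAnns anns classSpan "class-keywrod" :: Maybe SrcSpan)
  print (lookupApiAnns anns classSpan "class-keyword" :: Maybe Int)
```

Because the `TypeRep` is part of the key, the same span and string can hold values of different types side by side. A lookup at the wrong type fails in the same way as a lookup with a mistyped string.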