Skip to content

GHC API should preserve and provide access to comments

one class of applications of the GHC API are program transformations (refactoring, source to source optimisation, partial evaluation, ..) and code layouters (pretty-print, 2html, syntax-colouring, ..). but, even ignoring layout, parsing and pretty-printing with the GHC API does not currently preserve the source (nor does it generate syntactically valid code..).

consider this simple test: we want to parse a module, then pretty-print it (we might want to adjust the layout, or switch between layout and explicit braces). applying the attached code to itself gives this result:

$ /cygdrive/c/fptools/ghc/compiler/stage2/ghc-inplace -package ghc -e main API_Layout.hs
module API where
import DynFlags
import GHC
import PprTyThing
import System.Process
import System.IO
import Outputable
import Data.Maybe
instance Num () where
    []
    []
    { fromInteger = undefined }
mode = CompManager
compileToCoreFlag = False
writer >| cmd = runInteractiveCommand cmd >>= \ (i, o, e, p) -> writer i
cmd |> reader = runInteractiveCommand cmd >>= \ (i, o, e, p) -> reader o
ghcDir = "c:/fptools/ghc/compiler/stage2/ghc-inplace --print-libdir"
       |>
         (fmap dropLineEnds . hGetContents)
       where
           dropLineEnds = filter (not . (`elem` "\r\n"))
main = defaultErrorHandler defaultDynFlags
     $ do s <- newSession . Just =<< ghcDir
          flags <- getSessionDynFlags s
          (flags, _) <- parseDynamicFlags flags ["-package ghc"]
            GHC.defaultCleanupHandler flags
          $ do setSessionDynFlags s (flags {hscTarget = HscInterpreted})
                 addTarget s =<< guessTarget "API_Layout.hs" Nothing
               load s LoadAllTargets
               prelude <- findModule s (mkModuleName "Prelude") Nothing
               usermod <- findModule s (mkModuleName "API") Nothing
               setContext s [usermod] [prelude]
               Just cm <- checkModule s (mkModuleName "API") compileToCoreFlag
               unqual <- getPrintUnqual s
                   printForUser stdout unqual $ ppr $ parsedSource cm

this has lost all comments, including pragmas, and is syntactically invalid!

one suggestion, to avoid upsetting the rest of ghc, would be to preserve the comments, with source locations, but to separate them from the main abstract syntax tree. there would also need to be a way to link ast fragments to comments, which might be slightly awkward. perhaps something like:

-- was there a comment just preceeding the current AST fragment?
commentsBefore :: AST -> Maybe String
-- was there a comment immediately following the current AST fragment?
commentsAfter :: AST -> Maybe String
Trac metadata
Trac field Value
Version 6.9
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component GHC API
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system Unknown
Architecture Unknown
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information