    [project @ 2003-12-10 14:15:16 by simonmar] · 55042138
    Simon Marlow authored
    Add accurate source location annotations to HsSyn
    -------------------------------------------------
    
    Every syntactic entity in HsSyn is now annotated with a SrcSpan, which
    details the exact beginning and end points of that entity in the
    original source file.  All honest compilers should do this, and it was
    about time GHC did the right thing.
    
    The most obvious benefit is that we now have much more accurate error
    messages; when running GHC inside emacs for example, the cursor will
    jump to the exact location of an error, not just a line somewhere
    nearby.  We haven't put a huge amount of effort into making sure all
    the error messages are accurate yet, so there could be some tweaking
    still needed, although the majority of messages I've seen have been
    spot-on.
    
    Error messages now contain a column number in addition to the line
    number, eg.
    
       read001.hs:25:10: Variable not in scope: `+#'
    
    To get the full text span info, use the new option -ferror-spans.  eg.
    
       read001.hs:25:10-11: Variable not in scope: `+#'
    
    I'm not sure whether we should do this by default.  Emacs won't
    understand the new error format, for one thing.
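
    As a rough illustration of the two message formats above, here is a
    minimal sketch.  OneLineSpan, renderLoc and renderError are
    hypothetical names invented for this example, not GHC's actual
    code.  The sketch only covers spans that sit on a single line, and
    it assumes the end column is inclusive, as the `+#' example
    suggests.

```haskell
-- Hypothetical, simplified span that starts and ends on the same line.
data OneLineSpan = OneLineSpan
  { olFile     :: FilePath
  , olLine     :: Int
  , olStartCol :: Int
  , olEndCol   :: Int  -- assumed inclusive
  } deriving (Eq, Show)

-- Render "file:line:col", or "file:line:col-endcol" when the caller asks
-- for full spans (the behaviour described for -ferror-spans).
renderLoc :: Bool -> OneLineSpan -> String
renderLoc showSpan (OneLineSpan file line startCol endCol)
  | showSpan  = start ++ "-" ++ show endCol
  | otherwise = start
  where
    start = file ++ ":" ++ show line ++ ":" ++ show startCol

-- Prefix a message with its rendered location.
renderError :: Bool -> OneLineSpan -> String -> String
renderError showSpan sp msg = renderLoc showSpan sp ++ ": " ++ msg
```

    With the span OneLineSpan "read001.hs" 25 10 11, passing False
    gives the line:column form and passing True gives the
    line:column-column form.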
    
    In a more elaborate editor setting (eg. Visual Studio), we can arrange
    to actually highlight the subexpression containing an error.  Eventually
    this information will be used so we can find elements in the abstract
    syntax corresponding to text locations, for performing high-level editor
    functions (eg. "tell me the type of this expression I just highlighted").
    
    Performance of the compiler doesn't seem to be adversely affected.
    Parsing is still quicker than in 6.0.1, for example.
    
    Implementation:
    
    This was an excruciatingly painful change to make: Simon P.J. and I
    have both been working on it for the last three weeks or so.  The
    basic changes are:
    
     - A new datatype SrcSpan, which represents a beginning and end position
       in a source file.
    
     - To reduce the pain as much as possible, we also defined:
    
          data Located e = L SrcSpan e
    
     - Every datatype in HsSyn has an equivalent Located version.  eg.
    
          type LHsExpr id = Located (HsExpr id)
    
       and pretty much everywhere we used to use HsExpr we now use
       LHsExpr.  Believe me, we thought about this long and hard, and
       all the other options were worse :-)
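
    To make the Located pattern concrete, here is a small
    self-contained sketch.  The SrcSpan fields, the toy Expr type, and
    the helpers unLoc, getLoc and varSpans are simplified stand-ins
    chosen for illustration; the real definitions in the compiler are
    richer than this.

```haskell
-- Simplified, hypothetical SrcSpan: begin and end positions in one file.
data SrcSpan = SrcSpan
  { spanFile      :: FilePath
  , spanStartLine :: Int
  , spanStartCol  :: Int
  , spanEndLine   :: Int
  , spanEndCol    :: Int
  } deriving (Eq, Show)

-- The wrapper from the commit message: a value paired with its span.
data Located e = L SrcSpan e
  deriving (Eq, Show)

-- Mapping over the payload keeps the location unchanged.
instance Functor Located where
  fmap f (L sp e) = L sp (f e)

-- Illustrative accessors for the two halves of a Located value.
unLoc :: Located e -> e
unLoc (L _ e) = e

getLoc :: Located e -> SrcSpan
getLoc (L sp _) = sp

-- Toy expression type; subexpressions are stored in their Located form,
-- mirroring the HsExpr / LHsExpr pairing.
data Expr id
  = Var id
  | App (LExpr id) (LExpr id)
  deriving (Eq, Show)

type LExpr id = Located (Expr id)

-- Collect every variable occurrence together with its exact span.
varSpans :: LExpr id -> [(id, SrcSpan)]
varSpans (L sp (Var v))   = [(v, sp)]
varSpans (L _  (App f x)) = varSpans f ++ varSpans x
```

    Because every subexpression carries its own span, a traversal like
    varSpans can report a location for each node without threading any
    extra position state through the tree.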
    
    
    Additional changes/cleanups we made at the same time:
    
      - The abstract syntax for bindings is now less arcane.  MonoBinds
        and HsBinds with their built-in list constructors have gone away,
        replaced by HsBindGroup and HsBind (see HsSyn/HsBinds.lhs).
    
      - The various HsSyn type synonyms have now gone away (eg. RdrNameHsExpr,
        RenamedHsExpr, and TypecheckedHsExpr are now HsExpr RdrName,
        HsExpr Name, and HsExpr Id respectively).
    
      - Utilities over HsSyn are now collected in a new module HsUtils.
        More stuff still needs to be moved in here.
    
      - MachChar now has a real Char instead of an Int.  All GHC versions that
        can compile GHC now support 32-bit Chars, so this was a simplification.
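
    The removal of the type synonyms can be pictured with a toy syntax
    tree parameterised over its identifier type.  ToyRdrName, ToyName,
    ToyExpr and renameToy below are hypothetical, heavily simplified
    stand-ins for this sketch only (location annotations are left out
    for brevity); they are not the compiler's definitions.

```haskell
-- Hypothetical stand-ins for identifiers before and after renaming.
newtype ToyRdrName = ToyRdrName String
  deriving (Eq, Show)

data ToyName = ToyName String Int  -- source name plus a unique number
  deriving (Eq, Show)

-- One expression type, parameterised over the kind of identifier it holds.
data ToyExpr id
  = ToyVar id
  | ToyApp (ToyExpr id) (ToyExpr id)
  deriving (Eq, Show)

-- A pass is then simply a function between instantiations of the same
-- type, with no per-phase synonym needed in the signature.
renameToy :: (ToyRdrName -> ToyName) -> ToyExpr ToyRdrName -> ToyExpr ToyName
renameToy f (ToyVar v)   = ToyVar (f v)
renameToy f (ToyApp a b) = ToyApp (renameToy f a) (renameToy f b)
```

    Writing the parameter out in each signature makes it visible at a
    glance which phase's identifiers a function consumes and produces.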