    Fix a bug in 'alexInputPrevChar' · 821adee1
    Alec Theriault authored and Ben Gamari committed
    The lexer hacks around Unicode by squishing every character into a 'Word8'
    and storing the actual character in its state. This happens in
    'alexGetByte'.
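
A minimal sketch of that squishing step, for illustration only. The names `squish`, `Input`, `getByte` and the tag values are hypothetical and are not taken from the GHC lexer; the real code distinguishes more categories.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, isAscii, ord)
import Data.Word (Word8)

-- Hypothetical placeholder bytes standing in for whole Unicode categories.
otherTag, upperTag, lowerTag, digitTag :: Word8
otherTag = 0x00
upperTag = 0x01
lowerTag = 0x02
digitTag = 0x03

-- Squish any character into one byte. Low control characters are folded
-- into 'otherTag' so they cannot collide with the tags; remaining ASCII
-- passes through; everything else becomes the tag of its category.
squish :: Char -> Word8
squish c
  | c <= '\x03' = otherTag
  | isAscii c   = fromIntegral (ord c)
  | otherwise   = case generalCategory c of
      UppercaseLetter -> upperTag
      LowercaseLetter -> lowerTag
      DecimalNumber   -> digitTag
      _               -> otherTag

-- Toy lexer input: the real (unsquished) previous character is kept in
-- the state, next to the characters that are still to be lexed.
data Input = Input
  { prevChar :: Char
  , rest     :: String
  }

-- The automaton only ever sees the squished byte.
getByte :: Input -> Maybe (Word8, Input)
getByte (Input _ [])       = Nothing
getByte (Input _ (c : cs)) = Just (squish c, Input c cs)
```

In this sketch every non-ASCII lowercase letter reaches the automaton as the same byte, so a byte-oriented character set can treat it like any other lowercase letter.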
    
    That is all well and good, but we must make sure that the characters we
    retrieve via 'alexInputPrevChar' also follow this convention.
    
    In fact, #13986 exposes nicely what can go wrong: the regex in the left
    context of the type application rule uses the '$idchar' character set
    which relies on the unicode hack. However, a left context corresponds
    to a call to 'alexInputPrevChar', so we end up passing full-blown
    Unicode characters to '$idchar', even though it is not equipped to deal
    with them.
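
The mismatch can be sketched as follows. This is an illustration under simplifying assumptions, not the actual patch: `squishChar`, `inIdChar`, `prevCharRaw`, `prevCharSquished` and the single `letterTag` are hypothetical names, and `inIdChar` is only a stand-in for a set such as '$idchar'.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, isAlphaNum, isAscii)

-- Hypothetical tag shared by all non-ASCII letters in this sketch.
letterTag :: Char
letterTag = '\x01'

-- Map a character into the small range that the character sets understand.
squishChar :: Char -> Char
squishChar c
  | c <= '\x01' = '\x00'          -- keep raw control characters off the tag
  | isAscii c   = c
  | otherwise   = case generalCategory c of
      UppercaseLetter -> letterTag
      LowercaseLetter -> letterTag
      _               -> '\x00'

-- Stand-in for an identifier-character set: it is defined only over
-- squished characters, so it knows 'letterTag' but no raw Unicode letter.
inIdChar :: Char -> Bool
inIdChar c =
  c == letterTag || c == '_' || c == '\'' || (isAscii c && isAlphaNum c)

-- Problematic variant: the left context is given the raw previous character.
prevCharRaw :: Char -> Char
prevCharRaw = id

-- Consistent variant: the previous character goes through the same
-- mapping as the bytes handed to the automaton.
prevCharSquished :: Char -> Char
prevCharSquished = squishChar
```

With the raw variant, `inIdChar (prevCharRaw '\x3bb')` is `False` even though U+03BB is a lowercase letter; with the squished variant it is `True`, so a left context built on the set sees the same classification as the rest of the lexer.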
    
    Test Plan: Added a regression test case
    
    Reviewers: austin, bgamari
    
    Reviewed By: bgamari
    
    Subscribers: rwbarton, thomie
    
    GHC Trac Issues: #13986
    
    Differential Revision: https://phabricator.haskell.org/D4105