Avoid predicates in varsym lexing rules (#22201)

Vladislav Zavialov requested to merge wip/op-ws-value into master

This merge request contains two commits:

Lexer: define varsym without predicates (#22201)

Before this patch, the varsym lexing rules were defined as follows:

        <0> {
          @varsym / { precededByClosingToken `alexAndPred` followedByOpeningToken } { varsym_tight_infix }
          @varsym / { followedByOpeningToken }  { varsym_prefix }
          @varsym / { precededByClosingToken }  { varsym_suffix }
          @varsym                               { varsym_loose_infix }
        }

Unfortunately, this meant that the predicates 'precededByClosingToken' and
'followedByOpeningToken' were recomputed several times before we could figure
out the whitespace context.

With this patch, we check for whitespace context directly in the lexer
action:

        <0> {
          @varsym { with_op_ws varsym }
        }

The check for opening/closing tokens now happens in 'with_op_ws',
which is part of the lexer action rather than the lexer predicate.
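
The idea of classifying the whitespace context once, inside the action, can be sketched as follows. This is a simplified model, not GHC's code: the `OpWs` type, `classifyOpWs`, `opWsAt`, and the character classes used by the two checks are assumptions made for illustration, and a plain `String` stands in for the lexer input.

```haskell
-- Hypothetical, simplified model of the whitespace-context check.
import Data.Char (isAlphaNum)

-- The four contexts an operator occurrence can be in.
data OpWs
  = OpWsPrefix      -- a !b
  | OpWsSuffix      -- a! b
  | OpWsTightInfix  -- a!b
  | OpWsLooseInfix  -- a ! b
  deriving (Eq, Show)

-- Stand-ins for the real checks; the character sets are assumptions.
precededByClosingToken :: Maybe Char -> Bool
precededByClosingToken (Just c) = isAlphaNum c || c `elem` ")]}\"'_"
precededByClosingToken Nothing  = False

followedByOpeningToken :: Maybe Char -> Bool
followedByOpeningToken (Just c) = isAlphaNum c || c `elem` "([{\"'_"
followedByOpeningToken Nothing  = False

-- Each check is evaluated once, then the pair is matched,
-- mirroring the order of the four original lexing rules.
classifyOpWs :: Maybe Char -> Maybe Char -> OpWs
classifyOpWs prev next =
  case (precededByClosingToken prev, followedByOpeningToken next) of
    (True,  True)  -> OpWsTightInfix
    (False, True)  -> OpWsPrefix
    (True,  False) -> OpWsSuffix
    (False, False) -> OpWsLooseInfix

-- Classify the operator of length 'len' starting at offset 'off'.
opWsAt :: String -> Int -> Int -> OpWs
opWsAt src off len = classifyOpWs prev next
  where
    prev = if off > 0 then Just (src !! (off - 1)) else Nothing
    next = case drop (off + len) src of
             (c:_) -> Just c
             []    -> Nothing
```

For example, `opWsAt "a !b" 2 1` classifies the `!` as a prefix occurrence, since it is preceded by a space and followed by an identifier character.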

Lexer: pass updated buffer to actions (#22201)

In the lexer, predicates have the following type:

        { ... } :: user       -- predicate state
                -> AlexInput  -- input stream before the token
                -> Int        -- length of the token
                -> AlexInput  -- input stream after the token
                -> Bool       -- True <=> accept the token

This is documented in the Alex manual.

Predicates thus have access to the input stream both before and after
the token. But when the time comes to construct the token, GHC passes
only the initial string buffer to the lexer action. This patch fixes that:

        - type Action = PsSpan -> StringBuffer -> Int ->                 P (PsLocated Token)
        + type Action = PsSpan -> StringBuffer -> Int -> StringBuffer -> P (PsLocated Token)

Now lexer actions have access to the string buffer both before and after
the token, just like the predicates. It's just a matter of passing an
additional function parameter throughout the lexer.
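
The effect of the extra parameter can be illustrated with a toy model. Everything here is an assumption made for the sketch: `Buffer` is a plain `String` standing in for `StringBuffer`, the span and the `P` monad are dropped, and `Token`, `symAction`, and `runAction` are hypothetical names.

```haskell
-- Toy model: a String stands in for GHC's StringBuffer.
type Buffer = String

-- Old shape: the action sees only the buffer at the token start.
type ActionOld tok = Buffer -> Int -> tok

-- New shape: the action also receives the buffer after the token.
type ActionNew tok = Buffer -> Int -> Buffer -> tok

-- Operator text, plus whether a non-space character follows directly.
data Token = Sym String Bool
  deriving (Eq, Show)

-- An action that inspects the input after the token without
-- having to re-derive it from the initial buffer and the length.
symAction :: ActionNew Token
symAction before len after = Sym (take len before) followedTight
  where
    followedTight = case after of
      (c:_) -> c /= ' '
      []    -> False

-- Hypothetical driver: the lexer already knows where the token ends,
-- so it hands that position to the action as the extra argument.
runAction :: ActionNew tok -> Buffer -> Int -> tok
runAction act buf len = act buf len (drop len buf)
```

With these definitions, `runAction symAction "+x" 1` yields `Sym "+" True`, while `runAction symAction "+ x" 1` yields `Sym "+" False`.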

fixes #22201 (closed)