
Implement multiline strings

Brandon Chinn requested to merge wip/multiline-strings into master

Resolves #24390. I'm marking this feature for 9.12, since 9.10 is being cut next week, which seems tight.

Corresponding MR on the Haddock side: haddock!51

I opened a proposal to change multiline strings so that they do not include the trailing newline: https://github.com/ghc-proposals/ghc-proposals/pull/637. We can start reviewing this MR in parallel with that conversation.
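For reviewers who want something concrete to try, here is a minimal sketch of the feature in use. It assumes a GHC build that includes this MR (targeted at 9.12 above) and the extension name -XMultilineStrings from the post-merge tasks; the binding name `greeting` is purely illustrative.

```haskell
{-# LANGUAGE MultilineStrings #-}
module Main where

-- A multiline string literal delimited by triple quotes.
-- Per the GHC proposal, the common leading indentation is stripped
-- and the newline right after the opening delimiter is dropped.
greeting :: String
greeting =
  """
  Hello,
    world!
  """

main :: IO ()
main = print greeting
```

Whether the printed value ends in "\n" depends on the outcome of the proposal amendment linked above, so no expected output is given here.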

Merge instructions:

  • Rebase Haddock branch onto ghc-head
  • Update MR to new commit in Haddock repo
  • Make sure CI passes
  • Assign to Marge AND fast-forward merge your Haddock patch into ghc-head
    • Q: What exactly is the ordering here? Do I fast forward the haddock MR after Marge merges? Or immediately after assigning it to Marge?

Post-merge tasks:

  • Add -XMultilineStrings to Cabal in Cabal/Language/Haskell/Extension.hs
    • After merging, update Cabal submodule and remove from expectedGhcOnlyExtensions

Reviewer instructions

Probably easiest to review commit-by-commit. Notable commits:

  • "Add test cases for MultilineStrings"

    • Encodes all the examples and other edge cases as tests
  • "Break out common lex_magic_hash logic for strings and chars"

    • Deduplicate the logic for -XMagicHash and get rid of LexedString
    • Refactor only. Should not contain any behavior changes; if you see any, it's a bug
  • "Factor out string processing functions"

    • Big refactor, apologies in advance. Should not contain any behavior changes - if you see any behavior changes, it's a bug
    • Breaks out a new GHC.Parser.String module that effectively operates on [(Char, loc)] (where loc ~ AlexInput, but left polymorphic to avoid circular dependency)
    • Lexing strings now does the bare minimum of "parse everything between quotes, excluding escaped quotes" and builds up this list with alexGetChar'. The string is then processed at the very end to remove string gaps, resolve \&, and resolve other escape characters
    • Lexing characters also uses the same logic to resolve escape characters
    • Benchmarked the change here; there shouldn't be any performance regressions when GHC is compiled with optimizations
  • "Implement MultilineStrings"

    • The main feature change adding support for multiline string literals
    • Adds logic to the new GHC.Parser.String module to post-process multiline strings per the GHC proposal

  • if your MR may break existing programs (e.g. touches base or causes the compiler to reject programs), please describe the expected breakage and add the user-facing label. This will run ghc/head.hackage to characterise the effect of your change on Hackage.

  • ensure that your commits are either individually buildable or squashed

  • ensure that your commit messages describe what they do (referring to tickets using #NNNN syntax when appropriate)

  • add source comments describing your change. For larger changes you likely should add a Note and cross-reference it from the relevant places.

  • add a testcase to the testsuite.

  • update the users guide if applicable

  • mention new features in the release notes for the next release
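
The two-phase string lexing described under "Factor out string processing functions" can be sketched roughly as follows. This is an illustration only, not the actual GHC.Parser.String code: the names `LexedChar`, `collapseGaps`, `resolveEscapes`, and `processString` are hypothetical, only a handful of single-character escapes are handled, and using the location to report a bad escape is an assumption about why it is carried along. An `Int` offset stands in for `AlexInput`.

```haskell
module Main where

import Data.Char (isSpace)

-- | A raw source character paired with its location.
-- 'loc' is left polymorphic, mirroring the description above.
type LexedChar loc = (Char, loc)

-- | Remove string gaps: a backslash, some whitespace, then a backslash.
collapseGaps :: [LexedChar loc] -> [LexedChar loc]
collapseGaps (('\\', _) : rest@((c, _) : _))
  | isSpace c
  , (('\\', _) : rest') <- dropWhile (isSpace . fst) rest
  = collapseGaps rest'
-- Keep any other backslash pair intact so that an escaped backslash
-- followed by whitespace is not mistaken for a gap.
collapseGaps (x@('\\', _) : y : rest) = x : y : collapseGaps rest
collapseGaps (x : rest) = x : collapseGaps rest
collapseGaps [] = []

-- | Escapes handled by this sketch (a small subset of the real set).
simpleEscapes :: [(Char, Char)]
simpleEscapes = [('n', '\n'), ('t', '\t'), ('\\', '\\'), ('"', '"')]

-- | Resolve \& (which produces no character) and the simple escapes.
-- An unknown escape is reported with the location of its backslash.
resolveEscapes :: [LexedChar loc] -> Either loc String
resolveEscapes (('\\', loc) : (c, _) : rest)
  | c == '&' = resolveEscapes rest
  | Just c' <- lookup c simpleEscapes = (c' :) <$> resolveEscapes rest
  | otherwise = Left loc
resolveEscapes [('\\', loc)] = Left loc
resolveEscapes ((c, _) : rest) = (c :) <$> resolveEscapes rest
resolveEscapes [] = Right []

-- | Post-process the characters collected between the quotes.
processString :: [LexedChar loc] -> Either loc String
processString = resolveEscapes . collapseGaps

main :: IO ()
main = do
  -- Source text between the quotes: ab\   \cd\&e\n
  let raw = "ab\\   \\cd\\&e\\n"
  print (processString (zip raw [0 :: Int ..]))
  -- prints: Right "abcde\n"
```

Keeping the location next to each character lets the post-processing stage stay independent of the lexer while still being able to point at the offending position.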

Edited by Brandon Chinn
