Unique sometimes has tag '\0' (nul byte)
Summary
Unique values in GHC sometimes have a tag (topmost byte) that is zero. The Outputable instance of Unique simply prints the tag as a Char, so such zero-tag Unique values result in a zero byte in the output.
Steps to reproduce
I do not have a small reproducer, sorry. The following works for me:
$ git clone https://git.tomsmeding.com/chad-fast
$ cd chad-fast
$ git checkout bbf7e83f8c33941d37a6344e6a79c42af2d0a6a3
$ cabal build chad-fast --ghc-options=-ddump-tc-trace | grep -aP '\0'
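The grep call filters for lines containing zero bytes: -a makes grep treat the output as text even though it contains binary data, and -P enables Perl-compatible regular expressions, in which \0 matches a zero byte.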
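As a cross-check that does not rely on grep's PCRE support, the zero bytes in a saved trace can also be counted with tr and wc. This is only a minimal sketch: count_nul_bytes and the file name trace.log are hypothetical names chosen for illustration, not part of GHC or cabal.

```shell
# Hypothetical helper: print the number of NUL (zero) bytes in the file "$1".
count_nul_bytes() {
  # tr -cd '\000' deletes every byte except NUL; wc -c counts what is left.
  # The final tr removes the padding some wc implementations add.
  tr -cd '\000' < "$1" | wc -c | tr -d ' '
}

# Example use (assumes the trace was saved to trace.log first):
#   cabal build chad-fast --ghc-options=-ddump-tc-trace > trace.log
#   count_nul_bytes trace.log
```

A non-zero count means the trace contains at least one zero byte.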
If you want a more reduced reproducer, ping me.
One consequence of zero bytes in GHC's output is that Emacs, for example, does not recognise GHC debug output as UTF-8. A zero byte is technically valid UTF-8, but Emacs by default takes it as a sign of binary data. It then falls back to ascii/latin-1 and displays ugly multi-byte sequences for things like GHC's curly quotes in diagnostics.
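Until this is fixed, one possible workaround is to strip the zero bytes from the trace before opening it in an editor. The following is a minimal sketch, assuming a POSIX tr; strip_nul and trace.log are hypothetical names, not part of GHC or cabal.

```shell
# Hypothetical helper: copy stdin to stdout with all NUL (zero) bytes removed.
strip_nul() {
  tr -d '\000'
}

# Example use:
#   cabal build chad-fast --ghc-options=-ddump-tc-trace | strip_nul > trace.log
```

Note that this drops the tag character from the affected Unique values, so the cleaned trace is only suitable for reading, not for comparing uniques byte for byte.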
Expected behavior
GHC should never output zero bytes in text output, not even in debug output.
Environment
- GHC version used: 9.10.1
Optional:
- Operating System: Linux
- System Architecture: n/a
Edited by Tom Smeding