Unicode output in GHC
Unicode output is somewhat broken in GHC as a whole. We should fix it properly.
Most output is generated by the Pretty module. Pretty has two ways to output:
- printLeftRender, which is used when the rendering mode is LeftMode. This
  method uses the BufWrite module to speed up output. For FastStrings the
  output will be in UTF-8; for strings and other characters the output takes
  the low 8 bits of each character.
- printDoc, which is used in modes other than LeftMode (e.g. for things like
  error messages). This uses hPutStr for strings, which uses the prevailing
  encoding on stdout. However, it calls hPutFS for FastStrings, which always
  emits UTF-8.
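To make the mismatch concrete, here is a minimal sketch of the two byte-level behaviours described above: truncating each character to its low 8 bits versus encoding it as UTF-8. The names low8 and utf8 are hypothetical and are not taken from Pretty or BufWrite; the encoder is hand-written for illustration and does not reject surrogate code points.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Keep only the low 8 bits of each character. This mirrors what the text
-- says happens to ordinary strings on the LeftMode path (hypothetical name).
low8 :: String -> [Word8]
low8 = map (fromIntegral . ord)

-- Hand-written UTF-8 encoder, for comparison (hypothetical name).
-- Assumes valid code points; surrogates are not rejected.
utf8 :: String -> [Word8]
utf8 = concatMap (enc . ord)
  where
    enc c
      | c < 0x80    = [fromIntegral c]
      | c < 0x800   = [0xC0 .|. hi 6, cont 0]
      | c < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
      | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
      where
        -- Leading bits of the code point, placed in the first byte.
        hi n   = fromIntegral (c `shiftR` n)
        -- A continuation byte carrying six bits of the code point.
        cont n = 0x80 .|. (fromIntegral (c `shiftR` n) .&. 0x3F)
```

For a non-ASCII character such as U+03BB, low8 yields the single byte 0xBB while utf8 yields 0xCE 0xBB, so the same character can reach the output as different bytes depending on which path printed it.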
In GHCi, there is an additional layer due to Haskeline, which pipes all the
output through its own decoder (or tries to; I think there are cases it does
not cover).
This is all a bit of a mess.
We should be using the Unicode layer in the IO library for all encoding/decoding now. I suggest that:
- We leave printLeftRender alone. It is used for printing things like the
  .s file, and never outputs any Unicode characters because everything is
  Z-encoded.
- printDoc, instead of hPutFS, should use hPutStr . decodeFS.
- We get rid of the Haskeline decoding layer.
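The second suggestion can be sketched as follows: decode the UTF-8 bytes to a String first, then hand the String to hPutStr so that the Handle's own encoding decides what is written. This is only an illustration under stated assumptions. Utf8Bytes is a hypothetical stand-in for a FastString, decodeBytes is a hypothetical stand-in for decodeFS, and the decoder assumes well-formed UTF-8 and does no validation.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)
import System.IO (Handle, hPutStr)

-- Hypothetical stand-in for a FastString: just its UTF-8 bytes.
newtype Utf8Bytes = Utf8Bytes [Word8]

-- Hypothetical stand-in for decodeFS.
-- Assumes well-formed UTF-8; malformed input is not detected.
decodeBytes :: Utf8Bytes -> String
decodeBytes (Utf8Bytes ws) = go ws
  where
    go [] = []
    go (b : bs)
      | b < 0x80  = chr (fromIntegral b) : go bs
      | b < 0xE0  = multi 1 (b .&. 0x1F) bs   -- 2-byte sequence
      | b < 0xF0  = multi 2 (b .&. 0x0F) bs   -- 3-byte sequence
      | otherwise = multi 3 (b .&. 0x07) bs   -- 4-byte sequence
    -- Combine the lead bits with n continuation bytes (six bits each).
    multi n lead bs =
      let (conts, rest) = splitAt n bs
          cp = foldl (\acc c -> (acc `shiftL` 6) .|. fromIntegral (c .&. 0x3F))
                     (fromIntegral lead) conts :: Int
      in chr cp : go rest

-- Decode first, then let hPutStr apply whatever encoding the Handle has.
putDecoded :: Handle -> Utf8Bytes -> IO ()
putDecoded h = hPutStr h . decodeBytes
```

With this shape, strings and FastStrings both go through hPutStr, so the encoding is chosen in one place (the Handle) rather than being fixed to UTF-8 on one path and left to the Handle on the other.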
However, this will introduce a regression on Windows, because the Haskeline encoding layer currently does code-page encoding. Judah has mentioned looking at doing code-page encoding in the GHC IO library, so let's see what happens there.
Once this is done, we can do #2507 (closed) (quotation characters in error messages).