Document that getFileSystemEncoding can return ASCII
Summary
The documentation at https://gitlab.haskell.org/ghc/ghc/-/blob/9fc0fe008c13782cb7b1962b0ebed0bb09ecfb6f/libraries/base/GHC/IO/Encoding.hs#L112-123 claims that GHC.IO.Encoding.getFileSystemEncoding returns the "Unicode encoding of the current locale". This seems to be incorrect: on some systems it returns ASCII, which is not a Unicode encoding.
Steps to reproduce
On an Ubuntu host run a virtual s390x machine:
apt install -y docker.io && \
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes && \
docker run --rm -i -t s390x/ubuntu
Then on the virtual machine:
apt update && \
apt install -y ghc && \
ghc --version && \
ghci
and finally, in ghci on the virtual machine:
> enc <- GHC.IO.Encoding.getFileSystemEncoding
> enc
ASCII
> GHC.Foreign.newCStringLen enc "\160"
*** Exception: recoverEncode: invalid argument (invalid character)
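The same check can also be run as a compiled program instead of a ghci session. The following is only a minimal sketch: `tryEncode` is a hypothetical helper name introduced here, and the result depends on the locale of the machine it runs on, so no particular output is claimed.

```haskell
import Control.Exception (IOException, try)
import Foreign.Marshal.Alloc (free)
import qualified GHC.Foreign as GHC
import GHC.IO.Encoding (getFileSystemEncoding)

-- Hypothetical helper: marshal a string with the file system encoding
-- and return either the exception or the number of encoded bytes.
tryEncode :: String -> IO (Either IOException Int)
tryEncode s = do
  enc <- getFileSystemEncoding
  r <- try (GHC.newCStringLen enc s)
  case r of
    Left e -> return (Left e)
    -- newCStringLen allocates with malloc, so release the buffer.
    Right (ptr, len) -> free ptr >> return (Right len)

main :: IO ()
main = do
  enc <- getFileSystemEncoding
  putStrLn ("file system encoding: " ++ show enc)
  r <- tryEncode "\160"
  case r of
    Left e -> putStrLn ("encoding failed: " ++ show e)
    Right n -> putStrLn ("encoded " ++ show n ++ " bytes")
```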
Proposed changes
I'm filing this as a documentation issue because, first and foremost, the comment for getFileSystemEncoding should not mislead developers into thinking that it returns only Unicode encodings. It would also be nice to make the exception from newCStringLen more helpful, at least by mentioning the target encoding and ideally by including a stack trace as well. I hit the issue while running the bytestring test suite on s390x, and as one can imagine, debugging was not pleasant at all.
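To illustrate what a more helpful exception might look like, here is a rough user-level sketch that wraps newCStringLen and appends the target encoding to the error description. The name `newCStringLenVerbose` and the message format are assumptions made for illustration, not an existing or proposed base API, and the sketch does not address the stack trace.

```haskell
import Control.Exception (catch, throwIO, try)
import Foreign.C.String (CStringLen)
import Foreign.Marshal.Alloc (free)
import qualified GHC.Foreign as GHC
import GHC.IO.Encoding (TextEncoding, getFileSystemEncoding)
import GHC.IO.Exception (IOException (..))

-- Hypothetical wrapper: behaves like GHC.Foreign.newCStringLen, but on an
-- IOException rethrows it with the encoding name added to the description.
newCStringLenVerbose :: TextEncoding -> String -> IO CStringLen
newCStringLenVerbose enc s = GHC.newCStringLen enc s `catch` rethrow
  where
    rethrow :: IOException -> IO CStringLen
    rethrow e =
      throwIO (e { ioe_description =
                     ioe_description e ++ " (target encoding: " ++ show enc ++ ")" })

main :: IO ()
main = do
  enc <- getFileSystemEncoding
  r <- try (newCStringLenVerbose enc "\160")
  case r of
    Left e -> putStrLn ("failed: " ++ show (e :: IOException))
    -- The buffer is allocated with malloc, so release it after use.
    Right (ptr, _) -> free ptr >> putStrLn "encoded successfully"
```

A wrapper like this only helps callers who opt in; changing the message inside base itself would cover every caller, which is what the request above is about.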
Environment
- GHC version: 8.6.5
- Platform: s390x