hpc markup error: hGetContents: invalid argument (invalid byte sequence)
Summary
hpc depends on the system locale when reading source files, while ghc expects those files to be encoded in UTF-8.
Steps to reproduce
When compiling a simple Main.hs file with some Unicode characters:
module Main where
main :: IO ()
main = putStrLn "Добрый день"
compilation succeeds, but running hpc markup fails with an error unless the locale uses a Unicode encoding:
$ rm -fr .hpc Main.tix && LANG=ASCII ghc -fhpc Main.hs -fforce-recomp && ./Main && hpc report Main && LANG=ASCII hpc markup Main
[1 of 1] Compiling Main ( Main.hs, Main.o )
Linking Main ...
Добрый день
100% expressions used (2/2)
100% boolean coverage (0/0)
100% guards (0/0)
100% 'if' conditions (0/0)
100% qualifiers (0/0)
100% alternatives used (0/0)
100% local declarations used (0/0)
100% top-level declarations used (1/1)
Writing: Main.hs.html
hpc: ./Main.hs: hGetContents: invalid argument (invalid byte sequence)
Expected behavior
hpc should not rely on the system locale for source file encoding, since source files are expected to be in UTF-8 and will not compile otherwise. Moreover, when hpc markup generates the HTML files, it queries the system locale with System.IO.localeEncoding and sets the result as the charset in the Content-Type header. This is also incorrect, since .hs files will be in UTF-8. Expected outcome:
$ LANG=en_US.UTF-8 hpc markup Main
Writing: Main.hs.html
Writing: hpc_index.html
Writing: hpc_index_fun.html
Writing: hpc_index_alt.html
Writing: hpc_index_exp.html
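One possible direction for a fix, as a minimal sketch and not the actual hpc code: set the handle encoding to UTF-8 explicitly before reading, and emit a fixed charset instead of the value of System.IO.localeEncoding. The names readFileUtf8 and charsetMeta are hypothetical and only serve the illustration.

```haskell
import System.Environment (getArgs)
import System.IO

-- Hypothetical helper: read a source file as UTF-8 regardless of LANG.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path = withFile path ReadMode $ \h -> do
  hSetEncoding h utf8            -- override the locale-derived default encoding
  contents <- hGetContents h
  -- hGetContents is lazy; force the whole string before withFile closes the handle.
  length contents `seq` return contents

-- Hypothetical constant: a Content-Type declaration with a fixed charset,
-- rather than one derived from the system locale.
charsetMeta :: String
charsetMeta =
  "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">"

-- Read each file named on the command line and report its length in characters.
main :: IO ()
main = do
  paths <- getArgs
  mapM_ report paths
  where
    report p = do
      s <- readFileUtf8 p
      putStrLn (p ++ ": " ++ show (length s) ++ " characters")
```

With a helper along these lines, reading Main.hs should no longer depend on whether LANG selects a Unicode encoding. The generated HTML would need to be written out as UTF-8 as well, so that it matches the declared charset.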
Environment
- GHC version used: 8.6.5 and others
Optional:
- Operating System: Ubuntu 18.04LTS
- System Architecture: x86_64