Incorrect encoding for "error" strings on Windows
Summary
Cyrillic characters passed to error get mangled; Cyrillic characters printed via hPutStrLn stderr are fine.
Steps to reproduce
The following code
module Main where

import System.IO
import GHC.IO.Encoding

main :: IO ()
main = do
    putStr "stdout encoding: " >> hGetEncoding stdout >>= print
    putStr "stderr encoding: " >> hGetEncoding stderr >>= print
    putStr "getForeignEncoding: " >> getForeignEncoding >>= print
    putStr "getLocaleEncoding: " >> getLocaleEncoding >>= print
    putStr "getFileSystemEncoding: " >> getFileSystemEncoding >>= print
    putStrLn "latin"
    putStrLn "кириллица 1"
    hPutStrLn stderr "кириллица 2"
    error "кириллица 3"
is built with ghc-9.0.1 via stack and resolver: nightly-2021-09-09.
Running it with stack exec -- nameOfexecutable (optionally with +RTS --io-manager=native) yields
stdout encoding: Just UTF-8
stderr encoding: Just UTF-8
getForeignEncoding: UTF-8
getLocaleEncoding: UTF-8
getFileSystemEncoding: UTF-8
latin
кириллица 1
кириллица 2
codepage-exe.EXE: кириллица 3
CallStack (from HasCallStack):
error, called at app\Main.hs:17:5 in main:Main
Expected behavior
All cyrillic characters are displayed correctly.
Environment
- GHC version used: 9.0.1
- Terminal: Windows Terminal
- Shell: powershell-7.1.4 with the following $OutputEncoding:
Preamble :
BodyName : utf-8
EncodingName : Unicode (UTF-8)
HeaderName : utf-8
WebName : utf-8
WindowsCodePage : 1200
IsBrowserDisplay : True
IsBrowserSave : True
IsMailNewsDisplay : True
IsMailNewsSave : True
IsSingleByte : False
EncoderFallback : System.Text.EncoderReplacementFallback
DecoderFallback : System.Text.DecoderReplacementFallback
IsReadOnly : True
CodePage : 65001
The git-bash shell exhibits the same behavior.
Optional:
- Operating System: Windows 10
- System Architecture: x86_64
Judging by the form of the above mojibake, it is UTF-8-encoded Cyrillic interpreted as CP1251. When I add mkTextEncoding "CP1251" >>= setForeignEncoding before error, everything is displayed correctly.
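For reference, the workaround can be sketched as a complete program. This is only a minimal illustration: it assumes the console really does interpret the error message as CP1251, which is a guess based on the shape of the mojibake, and useForeignEncoding is a hypothetical helper name introduced here, not something from base.

```haskell
module Main where

import GHC.IO.Encoding (getForeignEncoding, mkTextEncoding, setForeignEncoding)

-- | Hypothetical helper (not part of base): look up a text encoding by
-- name and install it as the process-wide foreign encoding.
useForeignEncoding :: String -> IO ()
useForeignEncoding name = mkTextEncoding name >>= setForeignEncoding

main :: IO ()
main = do
    -- Assumption: the error message is being interpreted as CP1251,
    -- so the foreign encoding is switched to match before calling error.
    useForeignEncoding "CP1251"
    -- Print the foreign encoding to confirm the switch took effect.
    getForeignEncoding >>= print
    error "кириллица 3"
```

Hard-coding the code page is exactly what makes this a workaround rather than a fix: it would have to be changed for any machine using a different code page.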
Is there a setting that would make the program detect the correct encoding (foreign locale?) for error output regardless of OS and/or Windows flavor? Here the assumption that the foreign locale is UTF-8 fails.
Cheers!