UTF-16//ROUNDTRIP encoding behaves weirdly

Try this program:

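module Main where

import System.IO

main :: IO ()
main = do
    -- "UTF16" (no hyphen) is not a builtin name, so this encoding comes from iconv
    roundtrip_enc <- mkTextEncoding "UTF16//ROUNDTRIP"
    h <- openFile "out.temp" WriteMode
    hSetEncoding h roundtrip_enc
    -- \xEFE8 is the character that should exercise the roundtrip path
    hPutStr h "Hi\xEFE8Hi"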

It fails with:

hSetEncoding: invalid argument (Invalid argument)

If you change UTF16 to UTF-16 (so that the builtin encoding is used rather than iconv), hSetEncoding succeeds, but the output file contains only the first Hi.
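To compare the two spellings side by side, here is a minimal sketch of a harness that tries each encoding name, reports any IOException, and then dumps the raw bytes that ended up in the file. The helper names (hexBytes, writeWith, dumpFile) are made up for illustration, and the file is opened before mkTextEncoding is called so that there is always something to dump. What it prints will depend on the GHC version and platform, so no expected output is given.

```haskell
module Main where

import Control.Exception (IOException, try)
import Data.Char (ord)
import Numeric (showHex)
import System.IO

-- Render each byte (read back as a Char in binary mode) as two hex digits.
hexBytes :: String -> String
hexBytes = unwords . map (pad . flip showHex "" . ord)
  where
    pad s = replicate (2 - length s) '0' ++ s

-- Write the test string using the named encoding and return any IOException
-- raised by mkTextEncoding, hSetEncoding, hPutStr or the final flush on close.
writeWith :: String -> FilePath -> IO (Either IOException ())
writeWith name path = try $
    withFile path WriteMode $ \h -> do
        enc <- mkTextEncoding name
        hSetEncoding h enc
        hPutStr h "Hi\xEFE8Hi"

-- Read the file back without any decoding and print its bytes.
dumpFile :: FilePath -> IO ()
dumpFile path = withBinaryFile path ReadMode $ \h -> do
    bytes <- hGetContents h
    putStrLn ("  bytes: " ++ hexBytes bytes)

main :: IO ()
main = mapM_ check ["UTF16//ROUNDTRIP", "UTF-16//ROUNDTRIP"]
  where
    check name = do
        r <- writeWith name "out.temp"
        putStrLn (name ++ ": " ++ either show (const "no exception") r)
        dumpFile "out.temp"
```

Unlike the original program, this version closes the handle (via withFile), so an error that only shows up when the buffer is flushed should be reported as well, rather than silently leaving a truncated file.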

I think part of what is going on here is that iconv does not report EILSEQ for identity transformations, such as the one between a UTF-16 text file and our UTF-16 CharBuffers. Since we never get that error, we never get the chance to fix up the lone surrogates we use to encode roundtrip characters.

Edited by batterseapower