Cabal/Distribution/Utils/String.hs · ed3290740fd116318edec77b3ab3b628bac4460a · Glasgow Haskell Compiler / Packages / Cabal

Fix thinko in `decodeStringUtf8` · ed329074

Herbert Valerio Riedel authored Oct 04, 2016

The thinko caused some two-byte UTF-8 encodings (e.g. that of U+0142)
to be decoded as U+FFFD unintentionally.

With this fix, the property

    [ c | c <- [minBound..maxBound]
        , c < '\xD800' || c >= '\xE000' -- exclude surrogate code points
        , (decodeStringUtf8 . encodeStringUtf8) [c] /= [c]
        ] == ['\xfffe','\xffff']

holds. It's not clear to me why U+FFFE and U+FFFF ought to be singled
out. Needs more investigation.
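
For illustration only, here is a minimal, self-contained sketch of the same round-trip check using nothing but `base`. The names `encodeCharUtf8`, `decodeUtf8`, `multi` and `roundTripFailures` are hypothetical and are not Cabal's `encodeStringUtf8`/`decodeStringUtf8`; the decoder is a simplified strict one written for this example. Unlike the behaviour described above, this sketch does not treat U+FFFE and U+FFFF specially, so its failure list should come out empty.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper (not Cabal's encoder): shortest-form UTF-8 for one Char.
encodeCharUtf8 :: Char -> [Word8]
encodeCharUtf8 c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [lead 0xC0 6, cont n]
  | n < 0x10000 = [lead 0xE0 12, cont (n `shiftR` 6), cont n]
  | otherwise   = [lead 0xF0 18, cont (n `shiftR` 12), cont (n `shiftR` 6), cont n]
  where
    n = ord c
    -- Lead byte: marker bits plus the top payload bits of the code point.
    lead :: Int -> Int -> Word8
    lead marker s = fromIntegral (marker .|. (n `shiftR` s))
    -- Continuation byte: 10xxxxxx carrying six payload bits.
    cont :: Int -> Word8
    cont x = fromIntegral (0x80 .|. (x .&. 0x3F))

-- Hypothetical strict decoder: malformed input becomes U+FFFD.
decodeUtf8 :: [Word8] -> String
decodeUtf8 [] = []
decodeUtf8 (b:bs)
  | b < 0x80           = chr (fromIntegral b) : decodeUtf8 bs
  | b .&. 0xE0 == 0xC0 = multi 1 0x80    (fromIntegral (b .&. 0x1F)) bs
  | b .&. 0xF0 == 0xE0 = multi 2 0x800   (fromIntegral (b .&. 0x0F)) bs
  | b .&. 0xF8 == 0xF0 = multi 3 0x10000 (fromIntegral (b .&. 0x07)) bs
  | otherwise          = '\xFFFD' : decodeUtf8 bs

-- Consume k continuation bytes; lo is the smallest code point that may
-- legitimately use this sequence length (rejects overlong encodings).
multi :: Int -> Int -> Int -> [Word8] -> String
multi k lo acc bs =
  case splitAt k bs of
    (cs, rest)
      | length cs == k, all isCont cs ->
          let n = foldl (\a c -> (a `shiftL` 6) .|. fromIntegral (c .&. 0x3F)) acc cs
          in (if valid n then chr n else '\xFFFD') : decodeUtf8 rest
    _ -> '\xFFFD' : decodeUtf8 bs
  where
    isCont c = c .&. 0xC0 == 0x80
    valid n  = n >= lo && n <= 0x10FFFF && not (n >= 0xD800 && n < 0xE000)

-- All non-surrogate code points that fail to survive encode-then-decode.
roundTripFailures :: [Char]
roundTripFailures =
  [ c | c <- [minBound .. maxBound]
      , c < '\xD800' || c >= '\xE000'
      , decodeUtf8 (encodeCharUtf8 c) /= [c]
      ]
```

In this sketch U+0142 encodes to the two bytes 0xC5 0x82 (0x142 is 5 * 64 + 2, so the lead byte is 0xC0 + 5 and the continuation byte is 0x80 + 2), which is the kind of two-byte sequence the message above refers to.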

TODO: testsuite coverage
