  • Herbert Valerio Riedel's avatar
    Modify replacement properties of `encodeStringUtf8`/`decodeStringUtf8` · 4aedd00c
    Herbert Valerio Riedel authored
    This changes `decodeStringUtf8` so that it no longer replaces U+FFFE and
    U+FFFF with U+FFFD, while `encodeStringUtf8` now replaces surrogate
    code points (i.e. U+D800 through U+DFFF, which cannot be encoded in
    well-formed UTF-8) with U+FFFD.
    
    Consequently, `decodeStringUtf8 . encodeStringUtf8` now round-trips
    every Unicode scalar value
    (i.e. [U+0000..U+D7FF] ∪ [U+E000..U+10FFFF]).
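    
    The replacement rule on the encoding side can be modelled with a small
    standalone sketch. It does not call the library: `isScalarValue` and
    `replaceSurrogates` are hypothetical names introduced here only to
    illustrate the substitution described above, under the assumption that
    surrogates are the only code points being replaced.

    ```haskell
    -- | True for Unicode scalar values, i.e. every code point outside the
    -- surrogate range U+D800..U+DFFF.
    isScalarValue :: Char -> Bool
    isScalarValue c = not (c >= '\xD800' && c <= '\xDFFF')

    -- | Hypothetical helper (not part of the library): models the
    -- substitution the encoder is described as applying, mapping each
    -- surrogate code point to U+FFFD and leaving everything else alone.
    replaceSurrogates :: String -> String
    replaceSurrogates = map (\c -> if isScalarValue c then c else '\xFFFD')

    main :: IO ()
    main = do
      -- Scalar values, including U+FFFE and U+FFFF, pass through unchanged.
      print (replaceSurrogates "a\xFFFE\xFFFF\x10FFFF" == "a\xFFFE\xFFFF\x10FFFF")
      -- A lone surrogate is replaced by U+FFFD.
      print (replaceSurrogates "\xD800" == "\xFFFD")
      -- Every scalar value is a fixed point of the substitution, which is
      -- what the round-trip property relies on in this model.
      print (all (\c -> replaceSurrogates [c] == [c])
                 (filter isScalarValue [minBound .. maxBound]))
    ```

    In this model a `String` containing only scalar values is never altered
    before encoding, so a lossless decoder is enough for the round trip.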
    
    This should finally address #4644