
Incorrect UTF16 decoding with the version of text bundled with 9.2.1 on aarch64

Summary

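GHC 9.2.1 on aarch64, using the bundled version of text, incorrectly decodes "\0\216" as a little-endian UTF-16 encoded bytestring. The two bytes form the code unit 0xD800, an unpaired high surrogate, so decoding should fail, but no exception is raised. (Similarly for "\216\0" decoded as big-endian.)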
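To illustrate why this input is invalid, here is a minimal sketch that assembles the UTF-16 code unit from the two bytes and checks whether it falls in the high-surrogate range. It uses only base; the names codeUnitLE, codeUnitBE and isHighSurrogate are made up for this illustration and are not part of text.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word16, Word8)

-- Hypothetical helper: combine two bytes into one UTF-16 code unit,
-- little-endian (low byte first).
codeUnitLE :: Word8 -> Word8 -> Word16
codeUnitLE lo hi = (fromIntegral hi `shiftL` 8) .|. fromIntegral lo

-- Hypothetical helper: the same, big-endian (high byte first).
codeUnitBE :: Word8 -> Word8 -> Word16
codeUnitBE hi lo = codeUnitLE lo hi

-- High surrogates occupy 0xD800..0xDBFF and are only valid when
-- followed by a low surrogate (0xDC00..0xDFFF).
isHighSurrogate :: Word16 -> Bool
isHighSurrogate u = u >= 0xD800 && u <= 0xDBFF

main :: IO ()
main = do
  let le = codeUnitLE 0 216   -- the bytes of "\0\216"
      be = codeUnitBE 216 0   -- the bytes of "\216\0"
  print (le == 0xD800, isHighSurrogate le)
  print (be == 0xD800, isHighSurrogate be)
```

Both lines print (True,True): either byte order yields 0xD800 with no low surrogate following it, which is why a conforming decoder has to reject the input.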

GHC 9.0.1 on aarch64 has the correct behavior.

GHC 9.2.1 on x86_64 has the correct behavior.

According to @Bodigrim both the Hackage and git versions of text have the correct behavior with 9.2.1 on aarch64.

Issue in text: https://github.com/haskell/text/issues/380

Steps to reproduce

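Evaluate decodeUtf16LE "\0\216" (from Data.Text.Lazy.Encoding, with OverloadedStrings enabled) on GHC 9.2.1 with the bundled version of text on aarch64. The expected result is an "Invalid UTF-16LE stream" exception, as on x86_64; on aarch64 the expression evaluates without error.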

aarch64

$ ghci -XOverloadedStrings
GHCi, version 9.2.0.20210821: https://www.haskell.org/ghc/  :? for help
ghci> import Data.Text.Lazy.Encoding
ghci> let !x = decodeUtf16LE "\0\216"
ghci>
Leaving GHCi.
$ uname -a
Linux aarch64.nixos.community 5.10.66 #1-NixOS SMP Thu Sep 16 10:51:23 UTC 2021 aarch64 GNU/Linux

x86_64

❯ ghci
GHCi, version 9.2.0.20210821: https://www.haskell.org/ghc/  :? for help
macro 'doc' overwrites builtin command.  Use ':def!' to overwrite.
Loaded GHCi configuration from /nix/store/r9mqkkpyb2abjwq94l84zg63x83fg76z-ghci
λ> import Data.Text.Lazy.Encoding
λ> let !x = decodeUtf16LE "\0\216"
*** Exception: Cannot decode input: Data.Text.Lazy.Encoding.Fusion.streamUtf16LE: Invalid UTF-16LE stream
λ>
Leaving GHCi.

❯ uname -a
Linux orion 5.10.71 #1-NixOS SMP Wed Oct 6 13:56:04 UTC 2021 x86_64 GNU/Linux
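The GHCi sessions above could also be captured as a standalone program, which may be easier to run on several machines. This is only a sketch: it assumes the text and bytestring packages that ship with GHC, the name rejectsLoneSurrogate and the printed messages are made up here, and it builds the input with BL.pack instead of relying on OverloadedStrings.

```haskell
import Control.Exception (evaluate, try)
import qualified Data.ByteString.Lazy as BL
import Data.Text.Encoding.Error (UnicodeException)
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Encoding (decodeUtf16LE)

-- The same two bytes as "\0\216": code unit 0xD800 in little-endian order.
loneSurrogateLE :: BL.ByteString
loneSurrogateLE = BL.pack [0x00, 0xD8]

-- Hypothetical check: True when decoding raises a UnicodeException
-- (the behavior reported as correct), False when the invalid input
-- is silently accepted.
rejectsLoneSurrogate :: IO Bool
rejectsLoneSurrogate = do
  -- TL.length forces the whole decoded text, so a decoding error
  -- cannot stay hidden in an unevaluated thunk.
  r <- try (evaluate (TL.length (decodeUtf16LE loneSurrogateLE)))
  case r of
    Left e -> do
      putStrLn ("rejected: " ++ show (e :: UnicodeException))
      pure True
    Right n -> do
      putStrLn ("accepted, decoded length " ++ show n)
      pure False

main :: IO ()
main = do
  ok <- rejectsLoneSurrogate
  putStrLn (if ok then "correct behavior" else "incorrect behavior")
```

On a platform with the correct behavior this should take the "rejected" branch; on the affected aarch64 setup it would be expected to take the "accepted" branch instead.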

Environment

  • GHC 9.2.1
  • aarch64
Edited by Ellie Hermaszewska