Segfault when ignoring an invalid byte sequence while decoding with UTF8//IGNORE

The code below segfaults on a variety of platforms and GHC versions. I've tried 7.4.1 and 7.6.1, on both Windows and Linux.

It seems to depend on two things: (a) the specific choice of UTF8, since it doesn't happen with UTF16, UTF32, etc., and (b) the invalid byte sequence being at the end of the input being decoded.

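import qualified Data.ByteString as B
import System.IO

-- Scratch file holding the invalid input.
tempFile :: FilePath
tempFile = "temp"

main :: IO ()
main = do
   utf8Ignore <- mkTextEncoding "UTF8//IGNORE"
   -- 0x80 is a lone continuation byte, so it is invalid UTF-8 on its own.
   B.writeFile tempFile (B.pack [128])
   h <- openFile tempFile ReadMode
   hSetEncoding h utf8Ignore
   -- The segfault happens while the contents are decoded.
   hGetContents h >>= putStrLn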
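
To probe observations (a) and (b) above, the reproduction can be parameterised over the encoding name and the byte payload. This is only a sketch: `decodeVia`, `probeFile` and the particular byte lists are hypothetical names and inputs chosen for illustration, not part of the original report. On an affected GHC, the first case would be expected to crash rather than print anything.

```haskell
import Control.Exception (evaluate)
import qualified Data.ByteString as B
import Data.Word (Word8)
import System.IO

-- Hypothetical scratch file name, reused from the report above.
probeFile :: FilePath
probeFile = "temp"

-- Hypothetical helper: write the bytes to a file, then read them back
-- through a handle that uses the named encoding.
decodeVia :: String -> [Word8] -> IO String
decodeVia encName bytes = do
  enc <- mkTextEncoding encName
  B.writeFile probeFile (B.pack bytes)
  h <- openFile probeFile ReadMode
  hSetEncoding h enc
  s <- hGetContents h
  _ <- evaluate (length s)  -- force the lazy read before closing the handle
  hClose h
  return s

main :: IO ()
main = mapM_ probe
  [ ("UTF8//IGNORE",  [97, 128])  -- invalid byte at the end
  , ("UTF8//IGNORE",  [128, 97])  -- invalid byte at the start
  , ("UTF16//IGNORE", [128])      -- a different encoding
  ]
  where
    probe (encName, bytes) = do
      s <- decodeVia encName bytes
      putStrLn (encName ++ " " ++ show bytes ++ " -> " ++ show s)
```

Running each case separately makes it easier to see which combination of encoding and byte position triggers the crash.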
Trac metadata
| Trac field | Value |
| --- | --- |
| Version | 7.6.1 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |