encoding errors could be handled better

With the new Unicode I/O library, consider the following program (badchar.hs):

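import System.IO

main :: IO ()
main = do
    -- Disable input buffering so each byte reaches the decoder as it is typed.
    hSetBuffering stdin NoBuffering
    -- Read one decoded character and show it.
    getChar >>= print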

If the terminal's LANG specifies UTF-8 but a Latin-1 non-ASCII character is typed, the program appears to hang and does not raise an error until three more bytes have been entered. Since NoBuffering is set, I'd expect the program to report the decoding error immediately rather than wait for more input.
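
To illustrate the expectation, here is a minimal sketch of how a decoder could classify the first byte of a UTF-8 sequence using the ranges from RFC 3629. The function name `utf8SeqLen` is hypothetical and this is not the code used by GHC's I/O library; it only shows that some bytes (such as 249, i.e. 0xF9) can be rejected without reading further input, while others (such as 0xE9, Latin-1 'é') are valid lead bytes and need at least one more byte before an error can be detected.

```haskell
import Data.Word (Word8)

-- | Expected total length of a UTF-8 sequence, judged from its first byte
-- alone (RFC 3629 ranges). 'Nothing' means the byte can never start a
-- valid sequence, so a decoder could report an error without reading on.
-- NOTE: hypothetical helper for illustration, not GHC's actual decoder.
utf8SeqLen :: Word8 -> Maybe Int
utf8SeqLen b
  | b <= 0x7F              = Just 1   -- ASCII
  | b >= 0xC2 && b <= 0xDF = Just 2   -- lead byte of a 2-byte sequence
  | b >= 0xE0 && b <= 0xEF = Just 3   -- lead byte of a 3-byte sequence
  | b >= 0xF0 && b <= 0xF4 = Just 4   -- lead byte of a 4-byte sequence
  | otherwise              = Nothing  -- 0x80..0xC1 and 0xF5..0xFF
```

In GHCi, `utf8SeqLen 0xF9` evaluates to `Nothing` and `utf8SeqLen 0xE9` to `Just 3`. Under this classification, how long a strict decoder has to wait depends on which Latin-1 byte was typed.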

Furthermore, if end of input is reached instead, the invalid byte is accepted without any error. For example, in a UTF-8 terminal:

dhcp-19-155:tmp judah$ ghc -e "putStrLn \"\\249\"" | ./badchar
'\249'

Trac metadata

Version: 6.11
Type: Bug
TypeOfFailure: OtherFailure
Priority: normal
Resolution: Unresolved
Component: libraries/base
Test case, Differential revisions, BlockedBy, Related, Blocking, CC, Operating system, Architecture: (not set)