encoding errors could be handled better

With the new Unicode I/O library, using the following program (badchar.hs):

import System.IO
main = do
    hSetBuffering stdin NoBuffering
    getChar >>= print

If the terminal's LANG is utf-8 but a latin-1 non-ASCII character is typed, then the program hangs and doesn't throw an error until three more bytes are entered. Since NoBuffering is set, I'd expect the program to report the decoding error immediately rather than waiting for more input.
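To illustrate why an immediate error seems feasible, here is a minimal sketch of what a decoder can conclude from the first byte alone, using the lead-byte ranges from RFC 3629. The names `Lead` and `classifyLead` are made up for this example and are not part of the I/O library; this is not how the library's decoder is actually written.

```haskell
import Data.Word (Word8)

-- | What a UTF-8 decoder can conclude from the first byte alone.
-- (Hypothetical type, for illustration only.)
data Lead
  = Complete        -- ASCII: the byte is a whole character
  | NeedsMore Int   -- valid lead byte: this many continuation bytes must follow
  | Invalid         -- can never start a UTF-8 sequence
  deriving (Eq, Show)

-- | Classify a lead byte according to the RFC 3629 ranges.
classifyLead :: Word8 -> Lead
classifyLead b
  | b <= 0x7F = Complete
  | b <= 0xC1 = Invalid       -- 0x80-0xBF are continuation bytes; 0xC0, 0xC1 are overlong
  | b <= 0xDF = NeedsMore 1
  | b <= 0xEF = NeedsMore 2
  | b <= 0xF4 = NeedsMore 3
  | otherwise = Invalid       -- 0xF5-0xFF never appear in UTF-8

main :: IO ()
main = mapM_ (print . classifyLead) [0x41, 0xE9, 0xF9]
```

Under these ranges, byte 249 (0xF9) is invalid on its own, so in principle no further input is needed to reject it. A byte such as 0xE9 is a valid lead byte, so a decoder does have to look at the next byte, but it could still fail as soon as that byte turns out not to be a continuation byte.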

Furthermore, if the end of input is reached then the invalid byte is accepted without error. For example, in a utf-8 terminal:

dhcp-19-155:tmp judah$ ghc -e "putStrLn \"\\249\"" | ./badchar
'\249'

Trac metadata

Trac field	Value
Version	6.11
Type	Bug
TypeOfFailure	OtherFailure
Priority	normal
Resolution	Unresolved
Component	libraries/base
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
