
winio: do not re-translate input when handle is uncooked

Tamar Christina requested to merge Phyx/ghc:wip/T21488 into master

The documentation for KEY_EVENT_RECORD is rather terse [1], but for the UnicodeChar field it says the value is a "Translated Unicode character".

It was unclear what this actually meant. Since the type is still a WCHAR, I assumed the value was still a Unicode character, i.e. a code point.

Instead, it means that any surrogate pairs have already been interpreted and the output is actually a stream of bytes. The type is still WCHAR purely for type-checking purposes.

Essentially this means that on input we get a stream such as:

utf8_decode: 27
utf8_decode: 91
utf8_decode: 55
utf8_decode: 51
utf8_decode: 59
utf8_decode: 49
utf8_decode: 82
utf8_decode: 13
utf8_decode: 13

However, we then try to re-translate them, which interprets two bytes as a single WCHAR. So after translation we get:

utf8_decode: 229
utf8_decode: 227
utf8_decode: 59
utf8_decode: 0
utf8_decode: 0

which, when decoded with the UTF-8 decoder, becomes:

readBuf (9) after 9
readTextDevice after reading: bbuf=0x7ef4a08ec010@buf8192(0-9) (>=9) [[229,172,155,227,140,183,59,0,0]]
readTextDevice after decoding: cbuf=0x7ef4a08e9010@buf2048(0-5) (>=0) ["\23323\13111;\NUL\NUL"] bbuf=0x7ef4a08ec010@buf8192(0-0) (>=9) [[]]

So our 9 input bytes become 5 decoded characters.
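To make the first two garbled characters concrete, here is a minimal Python sketch of the effect. It is not the winio code: `retranslate` is a hypothetical stand-in that assumes the extra step reads each pair of bytes as one little-endian WCHAR and re-encodes it as UTF-8. It is applied only to the first four bytes, since those are the ones whose result can be checked against the log above.

```python
# Bytes delivered by the console, copied from the log above.
raw = bytes([27, 91, 55, 51, 59, 49, 82, 13, 13])


def retranslate(data: bytes) -> bytes:
    """Hypothetical model of the unwanted step: read each byte pair as one
    little-endian 16-bit WCHAR, then encode the result as UTF-8."""
    even = data[: len(data) // 2 * 2]  # drop a trailing odd byte, if any
    return even.decode("utf-16-le").encode("utf-8")


# Re-translating the first four bytes (27, 91, 55, 51) pairs them up.
garbled = retranslate(raw[:4])
print(list(garbled))                              # [229, 172, 155, 227, 140, 183]
print([ord(c) for c in garbled.decode("utf-8")])  # [23323, 13111]

# Passing the bytes straight to the decoder keeps one character per byte.
print(len(raw.decode("utf-8")))                   # 9
```

The six bytes and the two code points match the start of the `readTextDevice` output, which shows `[229,172,155,227,140,183,...]` decoding to `"\23323\13111..."`.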

This is a long-winded way of saying that for uncooked input we shouldn't translate the buffer to an MBS (Multi-Byte Sequence), since the API call itself has already done that. Instead we should pass it directly to the decode buffer.

Fixes #21488 (closed)

Note that I can't write a test for this, since the testsuite runs inside an msys2 process, which doesn't support console APIs because it uses named pipes for its stdhandles.

So I'd appreciate a review.

[1] https://docs.microsoft.com/en-us/windows/console/key-event-record-str

Edited by Tamar Christina
