GHC issues
https://gitlab.haskell.org/ghc/ghc/-/issues
2019-07-07T18:04:10Z

GHC.IO.Encoding not flushing partially converted input
https://gitlab.haskell.org/ghc/ghc/-/issues/15553
2019-07-07T18:04:10Z, Masahiro Sakai

Conversion by `GHC.IO.Encoding` produces incomplete output for some encodings because it does not flush *partially converted input* at the end of the string.
[iconv(3)](https://manpages.debian.org/stretch/manpages-dev/iconv.3) provides an API for this flushing:
> In each series of calls to iconv(), the last should be one with inbuf or \*inbuf equal to NULL, in order to flush out any partially converted input.
But `GHC.IO.Encoding` does not perform this flush, which can leave the conversion result incomplete.
I found two cases in which it actually produces incomplete output, but there may be more.
# Case 1: EUC-JISX0213
For example, the following code is expected to output the two bytes `0xa4 0xb1`, but it outputs nothing.
```hs
import System.IO
main = do
  enc <- mkTextEncoding "EUC-JISX0213"
  withFile "test.txt" WriteMode $ \h -> hSetEncoding h enc >> hPutStr h "\x3051"
```
The problem happens because of the following mapping between Unicode and EUC-JISX0213.
<table><tr><td>Unicode</td>
<td>EUC-JISX0213</td></tr>
<tr><td>U+3051 U+309A</td>
<td>0xa4 0xfa</td></tr>
<tr><td>U+3051</td>
<td>0xa4 0xb1</td></tr></table>
After seeing the code point U+3051, the converter cannot determine which of the two byte sequences to output until it sees the next character or *the end of the string*. But `GHC.IO.Encoding` does not call the above-mentioned *flushing* API, so the converter never learns that the string has ended.
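The buffering behaviour described above can be illustrated with a small pure model. This is only a sketch, not the iconv or `GHC.IO.Encoding` implementation: the names `step`, `flush`, `encodeNoFlush` and `encodeWithFlush` are hypothetical, and the model knows just the two mappings from the table (every other character becomes `?`).

```haskell
import Data.Word (Word8)

-- | True while a U+3051 has been consumed but its bytes are not yet emitted.
type Pending = Bool

-- | Feed one character: returns the bytes that can be emitted now
-- and the new converter state.
step :: Pending -> Char -> ([Word8], Pending)
step True  '\x309A' = ([0xa4, 0xfa], False)          -- U+3051 U+309A as one unit
step True  c        = let (bytes, p) = step False c  -- U+3051 stood alone
                      in ([0xa4, 0xb1] ++ bytes, p)
step False '\x3051' = ([], True)                     -- undecided: wait for more input
step False _        = ([0x3f], False)                -- toy model: anything else is '?'

-- | End-of-input signal: emit whatever is still buffered.
flush :: Pending -> [Word8]
flush True  = [0xa4, 0xb1]
flush False = []

-- | Run the converter over a string, returning output and final state.
convert :: String -> ([Word8], Pending)
convert = go False
  where
    go p []       = ([], p)
    go p (c : cs) = let (b, p')   = step p c
                        (bs, p'') = go p' cs
                    in (b ++ bs, p'')

-- | What happens when nobody signals the end of the string.
encodeNoFlush :: String -> [Word8]
encodeNoFlush = fst . convert

-- | What happens when the final flush is performed.
encodeWithFlush :: String -> [Word8]
encodeWithFlush s = let (bs, p) = convert s in bs ++ flush p

main :: IO ()
main = do
  print (encodeNoFlush "\x3051")          -- the buffered character is lost
  print (encodeWithFlush "\x3051")        -- bytes 0xa4 0xb1 (printed in decimal)
  print (encodeWithFlush "\x3051\x309A")  -- bytes 0xa4 0xfa (printed in decimal)
```

In this model the only difference between the two entry points is the final call to `flush`, which mirrors the role of the last `iconv()` call with a NULL input buffer.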
# Case 2: ISO-2022-JP
Similarly, the following code is expected to output the byte sequence `0x1b 0x24 0x42` `0x24 0x22` `0x1b 0x28 0x42`, but the last three bytes `0x1b 0x28 0x42` are not produced.
```hs
import System.IO
main = do
  enc <- mkTextEncoding "ISO-2022-JP"
  withFile "test.txt" WriteMode $ \h -> hSetEncoding h enc >> hPutStr h "\x3042"
```
ISO-2022-JP is a stateful encoding, and [RFC 1468](https://www.ietf.org/rfc/rfc1468.txt) requires that the state be reset to the initial state at the end of the string. The missing three bytes `0x1b 0x28 0x42` are the escape sequence for that purpose. But again `GHC.IO.Encoding` does not call the above-mentioned *flushing* API, so the converter cannot recognize the end of the string and cannot reset the state.
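The state reset can be sketched in the same way. Again this is a toy model under stated assumptions, not real ISO-2022-JP support: the names `Charset`, `escapeTo`, `encodeChar`, `finish` and `encode` are hypothetical, and the encoder only knows ASCII and U+3042.

```haskell
import Data.Char (ord)
import Data.Word (Word8)

-- | The character set the output stream is currently switched to.
data Charset = Ascii | Jisx0208
  deriving (Eq, Show)

-- | Escape sequence that switches the stream to the given character set.
escapeTo :: Charset -> [Word8]
escapeTo Ascii    = [0x1b, 0x28, 0x42]
escapeTo Jisx0208 = [0x1b, 0x24, 0x42]

-- | Toy mapping: the character set a character needs and its bytes there.
encodeChar :: Char -> (Charset, [Word8])
encodeChar '\x3042' = (Jisx0208, [0x24, 0x22])
encodeChar c
  | ord c < 0x80 = (Ascii, [fromIntegral (ord c)])
  | otherwise    = (Ascii, [0x3f])  -- unknown characters become '?'

-- | Encode one character, switching character set first if necessary.
step :: Charset -> Char -> ([Word8], Charset)
step current c =
  let (needed, bytes) = encodeChar c
      switch          = if needed == current then [] else escapeTo needed
  in (switch ++ bytes, needed)

-- | End-of-input handling: return to the initial (ASCII) state.
finish :: Charset -> [Word8]
finish Ascii = []
finish _     = escapeTo Ascii

-- | Encode a string; the flag selects whether the final reset is performed.
encode :: Bool -> String -> [Word8]
encode doFlush = go Ascii
  where
    go st []       = if doFlush then finish st else []
    go st (c : cs) = let (bs, st') = step st c in bs ++ go st' cs

main :: IO ()
main = do
  print (encode False "\x3042")  -- stream is left in the JIS X 0208 state
  print (encode True  "\x3042")  -- ends with the reset escape sequence
```

Without the final reset, the output ends while still in the JIS X 0208 state, which is the situation the issue reports.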
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 8.4.3 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Core Libraries |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
Milestone: 8.6.1

Asynchronous exception discarded after safe FFI call
https://gitlab.haskell.org/ghc/ghc/-/issues/5611
2019-07-07T18:54:23Z, joeyadams

**Note:** This bug appears to be fixed already, as it does not appear with GHC 7.2.1. I'm submitting a bug report anyway, to document its presence.
The bug is: when an asynchronous exception is thrown to a thread making a (safe) foreign call, the thread throwing the exception blocks like it should, but then the exception isn't actually delivered. For example:
```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Control.Concurrent
import Foreign.C
import System.IO

foreign import ccall safe "unistd.h sleep"
    sleep :: CUInt -> IO CUInt

main :: IO ()
main = do
    hSetBuffering stdout LineBuffering

    tid <- forkIO $ do
        putStrLn "child: Sleeping"
        _ <- sleep 1

        -- The following lines should not happen after the throwTo from the
        -- parent thread completes. However, they do...
        putStrLn "child: Done sleeping"
        threadDelay 1000000
        putStrLn "child: Done waiting"

    threadDelay 100000
    putStrLn $ "parent: Throwing exception to thread " ++ show tid
    throwTo tid $ userError "Exception delivered successfully"
    putStrLn "parent: Done throwing exception"

    threadDelay 2000000
```
When the bug is present, the program prints:
```
child: Sleeping
parent: Throwing exception to thread ThreadId 4
child: Done sleeping
parent: Done throwing exception
child: Done waiting
```
"child: Done waiting" should not be printed after completion of the throwTo, and the exception message should appear. On GHC 7.2.1, the program prints:
```
child: Sleeping
parent: Throwing exception to thread ThreadId 4
parent: Done throwing exception
ffi-sleep: user error (Exception delivered successfully)
```
This bug has been reproduced in GHC 7.0.3, on both Linux and Windows. It has also been reproduced on GHC 7.0.4.
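For comparison, the expected delivery semantics can be observed without any foreign call. The sketch below is not a reproduction of the bug (the child blocks in `threadDelay`, not in a safe FFI call); it only shows what "delivered" looks like. The helper name `throwAndObserve` is hypothetical.

```haskell
import Control.Concurrent
import Control.Exception

-- | Fork a child that blocks for a second, throw an exception to it after
-- 0.1 s, and report how the child ended.
throwAndObserve :: IO (Either SomeException ())
throwAndObserve = do
  outcome <- newEmptyMVar
  -- forkFinally runs the handler whether the child returns or is killed,
  -- so the parent always gets a result to inspect.
  tid <- forkFinally (threadDelay 1000000) (putMVar outcome)
  threadDelay 100000
  throwTo tid (userError "Exception delivered successfully")
  takeMVar outcome

main :: IO ()
main = do
  result <- throwAndObserve
  case result of
    Left e   -> putStrLn ("child: terminated by " ++ show e)
    Right () -> putStrLn "child: finished normally (exception was lost)"
```

With correct delivery the `Left` branch is taken; the behaviour reported above corresponds to the child running to completion even though `throwTo` has returned.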
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.0.3 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
8.6.1, Simon Marlow