GHC issues — https://gitlab.haskell.org/ghc/ghc/-/issues

https://gitlab.haskell.org/ghc/ghc/-/issues/10046 (updated 2019-07-07)
Linker script patch in rts/Linker.c doesn't work for (non-C or non-en..) locales — Howard B. Golden

Please see [ticket:2615\#comment:95729](https://gitlab.haskell.org//ghc/ghc/issues/2615#note_95729) and replies.
A bug is illustrated by this Haskell program:
```hs
import ObjLink
import Foreign
import Foreign.C.Types
import Foreign.C.String

foreign import ccall "setlocale" c_setlocale :: CInt -> CString -> IO CString

main = do
    withCString "zh_CN.UTF-8" $ \lc -> c_setlocale 5 lc
    r <- loadDLL "/usr/lib/libc.so"
    putStrLn (show r)
```
which outputs:
```
Just "/usr/lib/libc.so: \26080\25928\30340 ELF \22836"
```
The "\\26080\\25928\\30340 ELF \\22836" part is "无效的ELF头" ("invalid ELF header") in Chinese.
This error only occurs on systems where linker scripts are used. The linker script patch (as it has evolved) assumes that the error messages it will receive are in English. This would be true if the locale (LC_MESSAGES) is C or en (or one of the en variants). However, in other locales, the message will be in a different language. Unfortunately, the semantics of POSIX dlerror() specify that the error is returned as a pointer to a human-readable text string, rather than an error code. The string returned depends on the locale.
The code could be made more robust by momentarily changing the locale (LC_MESSAGES) to C before calling dlerror() and reverting it to its previous value immediately afterwards. This has been tested on a zh_CN.utf-8 system (see [ticket:2615\#comment:95752](https://gitlab.haskell.org//ghc/ghc/issues/2615#note_95752)) and works. My only concern is that multithreaded code _might_ be affected if it runs while the locale is changed. I don't know enough to tell whether this is a real issue, nor how to deal with it if necessary.
Also see #9237 for another corner case in the linker script code that should be dealt with at the same time.

Milestone: 8.0.1. Assignee: Simon Marlow.

https://gitlab.haskell.org/ghc/ghc/-/issues/9839 (updated 2019-07-07)
RTS options parser silently accepts invalid flags — Adam Gundry

RTS options that do not take arguments (such as `-T`) silently ignore anything that comes afterwards. For example, `+RTS -T,-s` or `+RTS -T-s` turns on collection of GC statistics (`-T`) but does not print out a summary (`-s`). Instead, this should produce an error message. Otherwise, users may mistakenly think that options have been applied.
(This has just bitten us in a production system.)
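The fix the ticket asks for amounts to rejecting any trailing characters after an argument-less flag. A toy parser illustrating the idea (hypothetical code; the real parser lives in rts/RtsFlags.c and handles far more flags):

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Sketch: for flags that take no argument (here just -T and -s), the flag
 * must be the whole word; "-T,-s" or "-T-s" is an error rather than "-T". */
static bool parse_no_arg_flag(const char *arg, char *flag_out)
{
    static const char no_arg_flags[] = "Ts";
    if (arg[0] != '-' || arg[1] == '\0' || strchr(no_arg_flags, arg[1]) == NULL)
        return false;
    if (arg[2] != '\0') {
        fprintf(stderr, "rts: unknown RTS option: %s\n", arg);
        return false;   /* trailing junk: reject instead of silently ignoring */
    }
    *flag_out = arg[1];
    return true;
}
```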
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.8.3 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | simonmar |
| Operating system | |
| Architecture | |
</details>
Milestone: 8.0.1. Assignee: nkartashov.

https://gitlab.haskell.org/ghc/ghc/-/issues/9706 (updated 2020-09-14)
New block-structured heap organization for 64-bit — Edward Z. Yang

I was having some discussion about GHC's block-structured heap with Sergio Benitez and Adam Belay, and during the discussion it was suggested that the way GHC manages the block-structured heap is suboptimal when we're on 64-bit architectures.
At the moment, we allocate memory from the operating system per-megablock, storing metadata in the very first megablock. We have to do this because, on 32-bit, we can't generally be too picky about what address our memory ends up living at. On 64-bit, we have a lot more flexibility.
Here is the proposal:
1. Statically decide on a maximum heap size in a power of two.
1. Next, probe for an appropriately aligned chunk of available virtual address space for this. On POSIX, we can mmap /dev/null using PROT_NONE and MAP_NORESERVE. On Windows, we can use VirtualAlloc with MEM_RESERVE. (There are a few other runtimes which do this trick, including GCC Go.)
1. Divide this region into blocks as before. The maximum heap size is now the megablock size, and the block size is still the same as before. Masking to find the block descriptor works as before.
1. To allocate, we keep track of the high-watermark, and mmap in 1MB pages as they are requested. We also keep track of how much metadata we need, and mmap extra pages to store metadata as necessary.
We still want to request memory from the operating system in conveniently sized chunks, but we can now abolish the notion of a megablock and the megablock allocator, and work purely with block coalescing. Additionally, the recorded heap location means that we can check if a pointer is HEAP_ALLOCED using a mask and equality check, solving #8199.
What do people think?
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.8.3 |
| Type | Task |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | simonmar |
| Operating system | |
| Architecture | |
</details>
Milestone: 8.0.1. Assignee: gcampax.

https://gitlab.haskell.org/ghc/ghc/-/issues/9579 (updated 2019-07-07)
Runtime suggests using +RTS when that's not possible — gintas

I just ran into a stack overflow in an application, and GHC told me:
```
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
```
I tried to specify +RTS and got an error saying "Most RTS options are disabled. Link with -rtsopts to enable them."
The runtime should not suggest using +RTS when that is not possible.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.9 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | low |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | simonmar |
| Operating system | |
| Architecture | |
</details>
Milestone: 8.0.1.

https://gitlab.haskell.org/ghc/ghc/-/issues/9516 (updated 2019-07-07)
unsafeUnmask unmasks even inside uninterruptibleMask — edsko@edsko.net

Control.Exception exports
```hs
allowInterrupt :: IO ()
allowInterrupt = unsafeUnmask $ return ()
```
with documentation:
*When invoked inside `mask`, this function allows a blocked asynchronous exception to be raised, if one exists. It is equivalent to performing an interruptible operation, but does not involve any actual blocking. When called outside `mask`, or inside `uninterruptibleMask`, this function has no effect.*
However, this is not actually true: `unsafeUnmask` unmasks exceptions even inside `uninterruptibleMask`, as the attached test demonstrates (the test uses a foreign call just to have something non-interruptible but still observable; in particular, doing a `print` *is* interruptible because it uses an `MVar` under the hood).
I think it is possible to define a better `unsafeUnmask` in user-land:
```hs
interruptible :: IO a -> IO a
interruptible act = do
st <- getMaskingState
case st of
Unmasked -> act
MaskedInterruptible -> unsafeUnmask act
MaskedUninterruptible -> act
```
but it still seems to me that we should either (i) change the behaviour of `unsafeUnmask`, or (ii) provide a version of `unsafeUnmask` with the behaviour as described and change `allowInterrupt` to use that new version, or at the very least (iii) change the documentation.
(One question with the above definition of `interruptible` is what happens when we *nest* `mask` and `uninterruptibleMask`?)
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.8.2 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | simonmar |
| Operating system | |
| Architecture | |
</details>
Milestone: 8.0.1. Assignee: Simon Marlow.

https://gitlab.haskell.org/ghc/ghc/-/issues/9261 (updated 2019-07-07)
-S prints incorrect number of bound tasks — edsko@edsko.net

In `rts/Stats.c` we have:
```
statsPrintf(" TASKS: %d (%d bound, %d peak workers (%d total), using -N%d)\n",
            taskCount, taskCount - workerCount,
            peakWorkerCount, workerCount,
            n_capabilities);
```
but I think `taskCount - workerCount` must be wrong, because `taskCount` is the _current_ number of tasks, while `workerCount` is the _total_ (accumulated) number of workers. I think it should be:
```
statsPrintf(" TASKS: %d (%d bound, %d peak workers (%d total), using -N%d)\n",
            taskCount, taskCount - currentWorkerCount,
            peakWorkerCount, workerCount,
            n_capabilities);
```

Milestone: 8.0.1. Assignee: Thomas Miedema.

https://gitlab.haskell.org/ghc/ghc/-/issues/9105 (updated 2019-07-07)
Profiling binary consumes CPU even when idle on Linux — robinp

The program is
```hs
import Control.Concurrent
import Control.Monad (forever)
main :: IO ()
main = forever $ threadDelay 1000000 >> return ()
```
Compiled with 32bit GHC 7.6.3 or 7.8.2 on Debian (inside a VM), GHC 7.4.1 on Ubuntu (not VM).
The non-profiling binary doesn't consume CPU while idle; the profiling one consumes ~10% (of a 2 GHz machine). I am running with `+RTS -I0`, so this is not the idle GC.
When strace-ing, the profiling one seems to receive a constant flow of SIGVTALRM, while the normal receives one burst each second.
I see I can switch off the "master tick interval" with `-V0`, and then no CPU is used, but the consequences of this are not well documented (apart from context switching becoming deterministic).
Interestingly, if I compile using profiling on Windows (latest haskell-platform, 64-bit), it doesn't use more CPU than the non-profiling.
So, the question is: why does this happen on Linux, and can it be avoided somehow?

Milestone: 8.0.1. Assignee: Ben Gamari.

https://gitlab.haskell.org/ghc/ghc/-/issues/8785 (updated 2019-07-07)
Replace hooks API in the RTS with something better — Simon Marlow

Hooks rely on static linking behaviour which doesn't always work: we have to disable -Bsymbolic for the RTS on Linux (see `compiler/main/SysTools.lhs`) and it apparently doesn't work at all on Mac (#8754).
So instead of hooks we should be passing in information when we initialize the RTS, like we already do for some other things (`-rtsopts` etc.).
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.6.3 |
| Type | Task |
| TypeOfFailure | OtherFailure |
| Priority | high |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | simonmar |
| Operating system | |
| Architecture | |
</details>
Milestone: 8.0.1. Assignee: Simon Marlow.

https://gitlab.haskell.org/ghc/ghc/-/issues/8733 (updated 2019-07-07)
I/O manager causes unnecessary syscalls in send/recv loops — tibbe

Network applications often call `send` followed by `recv`, to send a message and then read an answer. This causes syscall traces like this one:
```
recvfrom(13, ) -- Haskell thread A
sendto(13, ) -- Haskell thread A
recvfrom(13, ) = -1 EAGAIN -- Haskell thread A
epoll_ctl(3, ) -- Haskell thread A (a job for the IO manager)
recvfrom(14, ) -- Haskell thread B
sendto(14, ) -- Haskell thread B
recvfrom(14, ) = -1 EAGAIN -- Haskell thread B
epoll_ctl(3, ) -- Haskell thread B (a job for the IO manager)
```
The `recvfrom` call always fails, as the response from the partner we're communicating with won't be available right after we send the request.
We ought to consider descheduling the thread as soon as sending is "done". The hard part is to figure out when that is.
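The failing recvfrom is easy to reproduce in isolation: on a non-blocking socket with no data pending, recv returns immediately with EAGAIN. A standalone sketch (not Warp's or the I/O manager's actual code):

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sketch: send a "request", then immediately try to read the "reply".
 * Since the peer hasn't written anything, recv fails with EAGAIN --
 * the wasted syscall described above. Returns 1 if recv would block. */
static int recv_right_after_send_blocks(void)
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;
    fcntl(fds[0], F_SETFL, O_NONBLOCK);

    const char req[] = "request";
    send(fds[0], req, sizeof req, 0);              /* send succeeds... */

    char buf[64];
    ssize_t n = recv(fds[0], buf, sizeof buf, 0);  /* ...but no reply yet */
    int would_block = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

    close(fds[0]);
    close(fds[1]);
    return would_block;
}
```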
See http://www.yesodweb.com/blog/2014/02/new-warp for a real-world example.

Milestone: 8.0.1.

https://gitlab.haskell.org/ghc/ghc/-/issues/8732 (updated 2019-07-07)
Global big object heap allocator lock causes contention — tibbe

The lock `allocate` takes when allocating big objects hurts scalability of I/O-bound applications. `Network.Socket.ByteString.recv` is typically called with a buffer size of 4096, which causes a `ByteString` of that size to be allocated. The size of this `ByteString` causes it to be allocated from the big object space, which causes contention on the global lock that guards that space.
See http://www.yesodweb.com/blog/2014/02/new-warp for a real-world example.

Milestone: 8.0.1. Assignee: Simon Marlow.

https://gitlab.haskell.org/ghc/ghc/-/issues/8623 (updated 2021-03-31)
Strange slowness when using async library with FFI callbacks — JohnWiegley

I've attached a Haskell and a C file, when compiled as such:
```
ghc -DSPEED_BUG=0 -threaded -O2 -main-is SpeedTest SpeedTest.hs SpeedTestC.c
```
You should find that with 7.4.2, 7.6.3, or a recent build of 7.8, building with `SPEED_BUG=0` produces an executable that takes more than a second to run, while building with `SPEED_BUG=1` runs very quickly. I've also attached the Core for both scenarios.

Milestone: 8.0.1. Assignee: Simon Marlow.

https://gitlab.haskell.org/ghc/ghc/-/issues/8309 (updated 2019-07-07)
traceEvent truncates to 512 bytes — duncan

`Debug.Trace.traceEvent` (& `traceEventIO`) use a code path that unnecessarily goes through printf format strings and a fixed-size 512-byte buffer, hence truncating all user trace messages at that size.
Here is the call path:
- `Debug.Trace.traceEvent` (& `traceEventIO`) call `traceEvent#` primop
- cmm impl of the primop `stg_traceEventzh` calls C RTS function `traceUserMsg`
- `traceUserMsg` calls `traceFormatUserMsg(cap, "%s", msg);`, using the printf format string
- `traceFormatUserMsg` uses `postUserMsg` which eventually calls `postLogMsg`
- `postLogMsg` does the printf stuff
Here's what `postLogMsg` does:
```
#define BUF 512

void postLogMsg(EventsBuf *eb, EventTypeNum type, char *msg, va_list ap)
{
    char buf[BUF];
    nat size;

    size = vsnprintf(buf,BUF,msg,ap);
    if (size > BUF) {
        buf[BUF-1] = '\0';
        size = BUF;
    }
    ....
```
This is obviously designed for RTS-internal users, not the user path.
So the problem starts with this bit:
```
void traceUserMsg(Capability *cap, char *msg)
{
    traceFormatUserMsg(cap, "%s", msg);
}
```
It just should not use that code path. It should call something that directly posts the message, without any silly printf strings. In fact, the code path from the primop down should really be changed to use an explicit given length, rather than using a null-terminated string and having to call strlen.
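A length-based entry point along the lines suggested could look like this (hypothetical sketch; the real buffer code in rts/eventlog/EventLog.c writes binary event records rather than growing a flat buffer):

```c
#include <stdlib.h>
#include <string.h>

/* Sketch: post a user message with an explicit length.  The bytes are
 * copied verbatim -- no printf formatting, no 512-byte cap, no strlen. */
typedef struct {
    char  *data;
    size_t len, cap;
} EventsBuf;

static void postUserMsgN(EventsBuf *eb, const char *msg, size_t size)
{
    if (eb->len + size > eb->cap) {        /* grow instead of truncating */
        size_t cap = eb->cap ? eb->cap : 512;
        while (eb->len + size > cap)
            cap *= 2;
        eb->data = realloc(eb->data, cap);
        eb->cap = cap;
    }
    memcpy(eb->data + eb->len, msg, size);
    eb->len += size;
}
```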
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.6.3 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
Milestone: 8.0.1.

https://gitlab.haskell.org/ghc/ghc/-/issues/8303 (updated 2019-07-07)
defer StackOverflow exceptions (rather than dropping them) when exceptions are masked — rwbarton

See http://www.reddit.com/r/haskell/comments/1luan1/strange_io_sequence_behaviour/ for a very simple program (`main'`) that accidentally evades the stack size limit, running to completion even though it has allocated hundreds of megabytes of stack chunks, and [my comment](http://www.reddit.com/r/haskell/comments/1luan1/strange_io_sequence_behaviour/cc32ec4) for an explanation of this behavior.
ryani suggested that when a thread exceeds its stack limit but it is currently blocking exceptions, the RTS shouldn't simply drop the `StackOverflow` exception, but rather deliver it when the `mask` operation completes. That sounds sensible to me and it would give a nice guarantee that when any individual `mask` operation uses a small amount of stack, the stack size limit is approximately enforced.
(I know that the default stack size limit may go away or essentially go away, but it can still be nice when developing to use a small stack size limit, so that one's system isn't run into the ground by infinite recursion quickly gobbling up tons of memory.)
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.6.3 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
Milestone: 8.0.1. Assignee: thoughtpolice.

https://gitlab.haskell.org/ghc/ghc/-/issues/8238 (updated 2023-03-27)
Implement unloading of shared libraries — Simon Marlow

In #8039 we added support for unloading static objects from the runtime linker, with the GC detecting when there are no references left. We could add this functionality for shared libraries too, using `dl_iterate_phdr`.
@heisenbug's comment from #8039 with the relevant pointers:
[Eli Bendersky's article](http://eli.thegreenplace.net/2011/08/25/load-time-relocation-of-shared-libraries) suggests using the [dl_iterate_phdr](http://linux.die.net/man/3/dl_iterate_phdr) function to find information about loaded libraries. It seems to be Linux-only, but there is a [workaround on OS X](http://stackoverflow.com/questions/10009043/dl-iterate-phdr-equivalent-on-mac) described on StackOverflow.
And here is how the Boehm–Demers–Weiser GC implements a [callback function for dl_iterate_phdr](https://github.com/ivmai/bdwgc/blob/master/dyn_load.c#L451).
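For reference, the enumeration step is small — a minimal `dl_iterate_phdr` walk (Linux/glibc only; this sketches only the iteration, not the unload bookkeeping):

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>

/* Sketch: dl_iterate_phdr calls the callback once per loaded object,
 * including the main executable (whose dlpi_name is the empty string). */
static int count_object(struct dl_phdr_info *info, size_t size, void *data)
{
    int *n = data;
    (*n)++;
    printf("object at %#lx: %s\n",
           (unsigned long)info->dlpi_addr,
           info->dlpi_name[0] ? info->dlpi_name : "(main executable)");
    return 0;   /* nonzero would stop the iteration early */
}

static int count_loaded_objects(void)
{
    int n = 0;
    dl_iterate_phdr(count_object, &n);
    return n;
}
```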
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.7 |
| Type | Task |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
Milestone: 8.0.1.

https://gitlab.haskell.org/ghc/ghc/-/issues/7930 (updated 2019-07-07)
Nested STM Invariants are lost — Ryan Yates

Invariants from a successful nested transaction should be merged with the parent.
```
import Control.Concurrent
import Control.Concurrent.STM
main = do
x <- atomically $
do a <- newTVar True
(always (readTVar a) >> retry) `orElse` return ()
return a
atomically (writeTVar x False) -- Should not and does not fail
y <- atomically $
do a <- newTVar True
always (readTVar a) `orElse` return ()
return a
atomically (writeTVar y False) -- Should fail, but does not!
putStrLn "Ahhh!"
z <- atomically $
do a <- newTVar True
always (readTVar a)
return a
atomically (writeTVar z False) -- should and does fail
```
I know how to fix this. I'll have a patch with some tests and a fix soon.
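The intended rule — invariants proposed in a nested transaction survive when that branch commits, and are dropped when it retries — can be sketched with a toy model (hypothetical types and names; the real bookkeeping lives in the RTS's transaction records):

```haskell
-- Toy model of per-transaction invariant bookkeeping (illustrative only).
newtype TRec = TRec { invariants :: [String] } deriving (Eq, Show)

-- A nested transaction starts with no proposed invariants of its own.
startNested :: TRec
startNested = TRec []

-- A nested transaction that commits (e.g. the left arm of orElse returning
-- normally) must merge its proposed invariants into the parent. Losing this
-- merge is the bug: the 'y' case above silently drops its invariant.
commitNested :: TRec -> TRec -> TRec
commitNested parent child = TRec (invariants parent ++ invariants child)

-- A nested transaction that retries is aborted; its proposed invariants are
-- rightly discarded (the 'x' case above).
abortNested :: TRec -> TRec -> TRec
abortNested parent _ = parent
```

In this model the 'x', 'y' and 'z' cases correspond to `abortNested`, `commitNested`, and registering directly on the parent, respectively.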
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.6.3 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
<!-- {"blocked_by":[],"summary":"Nested STM Invariants are lost","status":"New","operating_system":"","component":"Runtime System","related":[],"milestone":"","resolution":"Unresolved","owner":{"tag":"OwnedBy","contents":"fryguybob"},"version":"7.6.3","keywords":["STM"],"differentials":[],"test_case":"","architecture":"","cc":[""],"type":"Bug","description":"Invariants from a successful nested transaction should be merged with the parent.\r\n\r\n{{{\r\nimport Control.Concurrent\r\nimport Control.Concurrent.STM\r\n\r\nmain = do\r\n x <- atomically $\r\n do a <- newTVar True\r\n (always (readTVar a) >> retry) `orElse` return ()\r\n return a\r\n atomically (writeTVar x False) -- Should not and does not fail\r\n\r\n y <- atomically $\r\n do a <- newTVar True\r\n always (readTVar a) `orElse` return ()\r\n return a\r\n atomically (writeTVar y False) -- Should fail, but does not!\r\n\r\n putStrLn \"Ahhh!\"\r\n\r\n z <- atomically $\r\n do a <- newTVar True\r\n always (readTVar a)\r\n return a\r\n atomically (writeTVar z False) -- should and does fail\r\n}}}\r\n\r\nI know how to fix this. I'll have a patch with some tests and a fix soon.","type_of_failure":"OtherFailure","blocking":[]} -->8.0.1Ryan YatesRyan Yateshttps://gitlab.haskell.org/ghc/ghc/-/issues/7919Heap corruption (segfault) from large 'let' expression2020-09-10T13:31:11ZduncanHeap corruption (segfault) from large 'let' expressionThe attached test program reliably triggers an assertion in the storage manager with the `-debug` rts.
```
LargeUse: internal error: ASSERTION FAILED: file rts/sm/GCUtils.c, line 208
(GHC version 7.6.3 for x86_64_unknown_linux)
...The attached test program reliably triggers an assertion in the storage manager with the `-debug` rts.
```
LargeUse: internal error: ASSERTION FAILED: file rts/sm/GCUtils.c, line 208
(GHC version 7.6.3 for x86_64_unknown_linux)
Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
```
This behaviour is reproducible with many recent ghc versions (tried 7.6.3, 7.4.2, 6.12.3) and all fail at the same assertion when using the `-debug` rts. (Without `-debug` we get a more random variety of segfaults and GC errors.)
It looks like a pretty clear case of heap corruption. I'll explain why...
The test program uses TH to generate a program that looks like this:
```
data Large = Large Int Int ... -- 512 non-strict Int fields
test =
let step (Large i1 i2 ... i512) =
let j1 = i1 + i4
j2 = i2 + i7
...
j511 = i511 + i510
j512 = i512 + i1
in Large j1 j2 ... j512
in runSteps step 100000 (Large 1 1 ... 1)
-- basically an unfoldr:
runSteps :: (state -> (state, Int)) -> Int -> state -> [Int]
runSteps f n i | n <= 0 = []
| otherwise = case f i of
(i', r) -> r : runSteps f (n - 1) i'
```
We use TH to generate this program, with a "size" parameter that determines the size of the data constructor (and the corresponding letrec). This makes it easy to find the size threshold at which it fails.
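For reference, here is a hand-written size-4 instance of the generated program (field names are illustrative; the real test generates 512 fields, and this small version runs fine). `step` returns the new state paired with one result field, matching the type `runSteps` expects:

```haskell
-- Hand-written size-4 analogue of the TH-generated program.
data Large = Large Int Int Int Int

step :: Large -> (Large, Int)
step (Large i1 i2 i3 i4) =
  let j1 = i1 + i4
      j2 = i2 + i3
      j3 = i3 + i2
      j4 = i4 + i1
  in (Large j1 j2 j3 j4, j1)

-- basically an unfoldr, as in the report:
runSteps :: (state -> (state, Int)) -> Int -> state -> [Int]
runSteps f n i | n <= 0    = []
               | otherwise = case f i of
                   (i', r) -> r : runSteps f (n - 1) i'
```

Starting from `Large 1 1 1 1`, every field doubles each step, so the results are 2, 4, 8, and so on.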
For small sizes this program works fine, and for larger sizes it triggers the assert. With ghc 7.6.3 on an x86-64 machine, the magic threshold is 511: the program works fine with size 510 and hits the assertion at size 511. That is suspiciously close to 512, and on a 64-bit machine 512 \* 8 is 4k, which is the storage manager's block size. And the failing assertion is in a part of the storage manager that deals with blocks...
```
// If this block does not have enough space to allocate the
// current object, but it also doesn't have any work to push, then
// push it on to the scanned list. It cannot be empty, because
// then there would be enough room to copy the current object.
if (bd->u.scan == bd->free)
{
ASSERT(bd->free != bd->start);
push_scanned_block(bd, ws);
}
```
So it looks very much like we have a situation where something is writing over the end of a block and messing up the SM's data structures.
But it is not nearly as simple as the data constructor being too big: we can demonstrate other programs that use much larger data constructors without any problem at all. Our suspicion falls on the big letrec.
Indeed, if we change the data constructor to have strict fields, the program no longer fails, and we can run it with much larger data constructor sizes. What is different between strict and non-strict fields here? One observation is that with strict fields GHC can (and does) turn the code into a big cascade of case expressions, while with non-strict fields the STG code is all 'let's.
```
case tpl_s6jQ of tpl_s6Ak {
__DEFAULT ->
case tpl_s6jS of tpl_s6Al {
__DEFAULT ->
case tpl_s6jU of tpl_s6Am {
-- etc for all 500+ elements
```
versus
```
let {
sat_s5UK :: GHC.Types.Int
[LclId] =
\u [] GHC.Num.$fNumInt_$c+ i511_s5Ly i1_s5E9; } in
let {
sat_s62X :: GHC.Types.Int
[LclId] =
\u [] GHC.Num.$fNumInt_$c+ i510_s5R2 i509_s5UG; } in
let {
sat_s62W :: GHC.Types.Int
[LclId] =
\u [] GHC.Num.$fNumInt_$c+ i509_s5UG i506_s5UC; } in
-- etc for all 500+ elements
```
Note also that it has nothing to do with the obvious space leak here. If we modify the code to generate an NFData instance and to use deepseq at each iteration, we eliminate the space leak but keep the big STG 'let', and it still fails.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.6.3 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
<!-- {"blocked_by":[],"summary":"Heap corruption (segfault) from large 'let' expression","status":"New","operating_system":"","component":"Runtime System","related":[],"milestone":"","resolution":"Unresolved","owner":{"tag":"Unowned"},"version":"7.6.3","keywords":[],"differentials":[],"test_case":"","architecture":"","cc":[""],"type":"Bug","description":"The attached test program reliably triggers an assertion in the storage manager with the `-debug` rts.\r\n\r\n{{{\r\nLargeUse: internal error: ASSERTION FAILED: file rts/sm/GCUtils.c, line 208\r\n\r\n (GHC version 7.6.3 for x86_64_unknown_linux)\r\n Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug\r\n}}}\r\n\r\nThis behaviour is reproducible with many recent ghc versions (tried 7.6.3, 7.4.2, 6.12.3) and all fail at the same assertion when using the `-debug` rts. (Without `-debug` we get a more random variety of segfaults and GC errors.)\r\n\r\nIt looks like a pretty clear case of heap corruption. I'll explain why...\r\n\r\nThe test program uses TH to generate a program that looks like this:\r\n{{{\r\ndata Large = Large Int Int ... -- 512 non-strict Int fields\r\n\r\ntest =\r\n let step (Large i1 i2 ... i512) =\r\n let j1 = i1 + i4\r\n j2 = i2 + i7\r\n ...\r\n j511 = i511 + i510\r\n j512 = i512 + i1\r\n in Large j1 j2 ... j512\r\n\r\n in runSteps step 100000 (Large 1 1 ... 1)\r\n\r\n-- basically an unfoldr:\r\nrunSteps :: (state -> (state, Int)) -> Int -> state -> [Int]\r\nrunSteps f n i | n <= 0 = []\r\n | otherwise = case f i of\r\n (i', r) -> r : runSteps f (n - 1) i'\r\n}}}\r\n\r\nWe use TH to generate this program, and we use a \"size\" parameter that determines size of the data constructor (and corresponding letrec). This makes it easy to find the size threshold where it fails.\r\n\r\nFor small sizes this program works fine, and for larger values it triggers the assert. 
With ghc 7.6.3 on a x86-64 machine, the magic threshold is 511: that is, the program works fine with size 510 and hits the assertion at size 511. This is suspiciously close to 512. And of course on a 64bit machine 512 * 8 is 4k, which is the storage manager's block size. And the failing assertion is in a bit of the storage manager that is dealing with blocks...\r\n\r\n{{{\r\n // If this block does not have enough space to allocate the\r\n // current object, but it also doesn't have any work to push, then \r\n // push it on to the scanned list. It cannot be empty, because\r\n // then there would be enough room to copy the current object.\r\n if (bd->u.scan == bd->free)\r\n {\r\n ASSERT(bd->free != bd->start);\r\n push_scanned_block(bd, ws);\r\n }\r\n}}}\r\n\r\nSo it looks very much like we have a situation where something is writing over the end of a block and messing up the SM's data structures.\r\n\r\nBut, it is not nearly as simple as the data constructor being too big. We can demonstrate other programs that use much larger data constructors without any problem at all. Our suspicion falls on the big letrec.\r\n\r\nIndeed with this program if we change the data constructor to have strict fields then it no longer fails, and we can run it with much larger data constructor sizes. What would be different between strict and non-strict fields here? 
Well, one observation is that when it is strict then ghc can (and does) turn the code into a big cascade of case expressions, while when it is non-strict then the STG code is all 'let's.\r\n\r\n{{{\r\n case tpl_s6jQ of tpl_s6Ak {\r\n __DEFAULT ->\r\n case tpl_s6jS of tpl_s6Al {\r\n __DEFAULT ->\r\n case tpl_s6jU of tpl_s6Am {\r\n -- etc for all 500+ elements\r\n}}}\r\nversus\r\n{{{\r\n let {\r\n sat_s5UK :: GHC.Types.Int\r\n [LclId] =\r\n \\u [] GHC.Num.$fNumInt_$c+ i511_s5Ly i1_s5E9; } in\r\n let {\r\n sat_s62X :: GHC.Types.Int\r\n [LclId] =\r\n \\u [] GHC.Num.$fNumInt_$c+ i510_s5R2 i509_s5UG; } in\r\n let {\r\n sat_s62W :: GHC.Types.Int\r\n [LclId] =\r\n \\u [] GHC.Num.$fNumInt_$c+ i509_s5UG i506_s5UC; } in\r\n -- etc for all 500+ elements\r\n}}}\r\n\r\nNote also, that it is nothing to do with the obvious space leak here. If we modify the code to generate an NFData instance and to use deepseq at each iteration then we eliminate the space leak, but we keep the big stg 'let', and it still fails.","type_of_failure":"OtherFailure","blocking":[]} -->8.0.1https://gitlab.haskell.org/ghc/ghc/-/issues/7670StablePtrs should be organized by generation for efficient minor collections2019-07-07T18:48:41ZEdward Z. YangStablePtrs should be organized by generation for efficient minor collectionsCurrently, stable pointers are all in one giant pointer table (see markStablePtrTable); this results in pretty bad GC behavior when you create a lot of stable pointers (Peaker has a test-case which he thinks is suffering due to repeated ...Currently, stable pointers are all in one giant pointer table (see markStablePtrTable); this results in pretty bad GC behavior when you create a lot of stable pointers (Peaker has a test-case which he thinks is suffering due to repeated traversal of the stable pointers list.) We should partition them up into generations like we do for mutable lists. There might be some trickiness keeping the table up-to-date after GCs.
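A generation-partitioned table can be sketched as follows (illustrative only — bucket layout, names, and the list representation are assumptions; the real table is a C array in the RTS):

```haskell
-- Generation 0 is the youngest; each stable pointer lives in the bucket for
-- the generation of the object it points at.
type StablePtrTable a = [[a]]

-- A minor collection of generations 0..gen only scans the young buckets,
-- instead of the whole table as markStablePtrTable does today.
markUpTo :: Int -> StablePtrTable a -> [a]
markUpTo gen table = concat (take (gen + 1) table)

-- After a minor GC, survivors of generation 0 move into generation 1,
-- mirroring the promotion of the objects themselves.
promote :: StablePtrTable a -> StablePtrTable a
promote (g0 : g1 : rest) = [] : (g1 ++ g0) : rest
promote t                = t
```

The payoff is that `markUpTo 0` touches only the youngest bucket, however many old stable pointers exist; the "trickiness" mentioned above is keeping entries in the right bucket as objects are promoted.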
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.7 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
<!-- {"blocked_by":[],"summary":"StablePtrs should be organized by generation for efficient minor collections","status":"New","operating_system":"","component":"Runtime System","related":[],"milestone":"","resolution":"Unresolved","owner":{"tag":"Unowned"},"version":"7.7","keywords":[],"differentials":[],"test_case":"","architecture":"","cc":[""],"type":"Bug","description":"Currently, stable pointers are all in one giant pointer table (see markStablePtrTable); this results in pretty bad GC behavior when you create a lot of stable pointers (Peaker has a test-case which he thinks is suffering due to repeated traversal of the stable pointers list.) We should partition them up into generations like we do for mutable lists. There might be some trickiness keeping the table up-to-date after GCs.","type_of_failure":"OtherFailure","blocking":[]} -->8.0.1https://gitlab.haskell.org/ghc/ghc/-/issues/7606Stride scheduling for Haskell threads with priorities2019-07-07T18:49:01ZEdward Z. YangStride scheduling for Haskell threads with prioritiesCurrently, GHC uses a round-robin scheduler for Haskell threads, with some heuristics for when threads should be bumped to the front of the queue. This patch set replaces this scheduler with 'stride scheduling', as described by [Waldspur...Currently, GHC uses a round-robin scheduler for Haskell threads, with some heuristics for when threads should be bumped to the front of the queue. This patch set replaces this scheduler with 'stride scheduling', as described by [Waldspurger and Weihl '95](http://read.seas.harvard.edu/~kohler/class/aosref/waldspurger95stride.pdf), which is an efficient, deterministic method for scheduling processes with differing priorities. Priorities are assigned by giving 'tickets' to threads; a thread with twice as many tickets as another will run twice as often. I’d like to replace the round-robin scheduler completely with this scheduler.
Here are nofib benchmarks comparing the old scheduler to the new:
```
--------------------------------------------------------------------------------
Program Size Allocs Runtime Elapsed TotalMem
--------------------------------------------------------------------------------
Min -0.0% -52.2% -18.8% -18.6% -83.1%
Max +1.0% +4.2% +4.9% +5.1% +7.7%
Geometric Mean +0.1% -2.8% -0.9% -0.9% -2.9%
```
Here are some technical details:
- Without any changes to ticket values, scheduling behavior should be functionally identical to round-robin. (By default, new threads, including the IO thread, get allocated the maximum number of tickets possible.) In practice it is not quite identical, since our heap does not have the FIFO property (see below).
- The current patch set uses a very simple (undergrad-level) resizable-array-backed heap to implement the priority queue. We can play some tricks to reduce its memory footprint (e.g. using the container_of macro to eliminate the extra key store), and a fancier data structure would make it easier to surgically remove or reweight entries, but I wanted to keep overhead low. If anyone has a pet implementation of priority queues in C, feel free to swap it in. Right now this only affects the uses of promoteInRunQueue() in Messages.c; I still need to check whether #3838 has regressed.
- We get three new primops: `setTickets#`, `getTickets#` and `modifyTickets#`. We don't support creating threads with specific numbers of tickets (mostly because that would have added an annoyingly large set of extra primops); instead, you're expected to spawn a thread which gets max-ticket allocation, and then weight it down.
- `_link` is no longer used for linking TSOs in the run queue. I have tried my best to stamp out any code which operated on this assumption, but I may have missed some.
- Modifying a TSO's tickets takes out the scheduler lock; the hope is that this operation is quick and rare enough that a global lock here is not too bad.
- We are unfortunately stuck with some magic constants w.r.t. ticket values: 1 \<\< 20 is the maximum number of tickets our implementation is hard-coded to support.
- Sleeping and blocked tasks do not get any 'bonus' for being blocked.
- In an ideal world, when a thread hits a black hole, it should temporarily give its tickets to the thread evaluating the black hole, so it will unblock more quickly. Unfortunately, implementing this is pretty complicated (the blackhole evaluating thread could die; or it could get stuck on a blackhole itself and need to gift its tickets; it shouldn't be able to give away the tickets it was gifted.) So this implementation leaves that out. Similar semantics for MVars are probably possible, but will require userland assistance too.
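The core of stride scheduling is small. Here is a sketch in Haskell (the names, the list-based run queue, and removal-by-name are illustrative assumptions; the patch itself uses an array-backed heap in C, keyed on pass values):

```haskell
import Data.Bits (shiftL)
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Strides are computed from a large fixed numerator so that integer division
-- stays precise; this mirrors the 1 << 20 max-ticket constant above.
stride1 :: Int
stride1 = 1 `shiftL` 20

-- Names are assumed unique, so a thread can be removed from the queue by name.
data Thread = Thread { name :: String, tickets :: Int, pass :: Int }

-- Make n scheduling decisions: always run the thread with the lowest pass
-- value, then advance its pass by stride1 / tickets (fewer tickets -> bigger
-- stride -> runs less often).
schedule :: Int -> [Thread] -> [String]
schedule 0 _  = []
schedule _ [] = []
schedule n ts =
  let t    = minimumBy (comparing pass) ts
      t'   = t { pass = pass t + stride1 `div` tickets t }
      rest = filter (\x -> name x /= name t) ts
  in  name t : schedule (n - 1) (t' : rest)
```

With threads "a" (2 tickets) and "b" (1 ticket) starting at pass 0, "a" is chosen twice as often as "b", deterministically — the 2:1 property described above.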
I haven't prepared a patch to base yet with a user-level API, but here is a proposed draft:
```
type Tickets = Int
-- | Maximum number of tickets we support a thread having. (Currently 1 << 20.)
-- Note that this doesn't bound the *global* maximum tickets.
maxTickets :: Tickets
-- | Changes the number of tickets allocated to a thread. The ticket count must
-- not be less than or equal to zero, or greater than maxTickets. (Corresponds
-- to setTickets# primop)
setTickets :: ThreadId -> Tickets -> IO ()
-- | Returns the number of tickets currently allocated to a thread. (Corresponds to
-- getTickets# primop)
getTickets :: ThreadId -> IO Tickets
-- | Atomically performs a linear transformation on the number of tickets of a
-- thread: scaling the number of tickets by a rational number, adding another
-- fixed number of tickets, and then returning the number of 'leftover' tickets
-- from the operation; if the net number of tickets is reduced, the returned
-- result is positive; if the net number of tickets is increased, it is negative.
-- In the absence of concurrent threads, the following property holds forall
-- t, m and x:
--
-- do r <- getTickets t
-- d <- scaleTickets t m x
-- r' <- getTickets t
-- return (r == r' + d)
--
-- If the scaling would reduce the number of tickets below zero or increase the
-- number of tickets beyond maxTickets, this function will instead fail with
-- an exception. This function will be subject to some rounding error; powers of two
-- are, however, likely to be exact. (Corresponds to modifyTickets# primop; note
-- that the sentinel value for failure is maxTickets + 1, since it is impossible for
-- a thread's ticket value to change by that much.)
modifyTickets :: ThreadId -> Ratio Int -> Tickets -> IO Tickets
-- | Forks a new thread, transferring some percentage of tickets from the current
-- thread to it (so the net number of tickets stays constant.) Fails if the rational
-- is greater than 1 or less than or equal to zero, or if there are not enough tickets
-- in the current thread.
forkIOWith :: IO a -> Ratio Int -> IO ThreadId
```

Milestone: 8.0.1
Assignee: Edward Z. Yang

https://gitlab.haskell.org/ghc/ghc/-/issues/7482
GHC.Event overwrites main IO manager's hooks to RTS
2019-07-07 · AndreasVoellmy

The IO manager registers two file descriptors with the RTS, which the RTS uses to send control and wakeup signals to the IO manager. The main IO manager is started up by default and registers some file descriptors that it has allocated with the RTS.
The base package also exposes a GHC.Event module which, when initialized, also registers file descriptors with the RTS, overwriting the main IO manager's descriptors. The RTS can then no longer signal the main IO manager.
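The failure mode is a single global slot being clobbered by a second registration. A toy model (the slot representation and fd numbers are made up for illustration):

```haskell
import Data.IORef

-- One mutable slot stands in for the RTS's wakeup/control-fd hook:
-- whoever registers last wins.
registerHook :: IORef (Maybe Int) -> Int -> IO ()
registerHook slot fd = writeIORef slot (Just fd)

demo :: IO (Maybe Int)
demo = do
  slot <- newIORef Nothing
  registerHook slot 3   -- main IO manager registers its fd
  registerHook slot 7   -- GHC.Event initialization registers its own fd
  readIORef slot        -- the main manager's fd is gone; signals now go to 7
```

After the second registration the slot holds only the later fd, so signals meant for the main IO manager are lost.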
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.4.1 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | libraries/base |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
<!-- {"blocked_by":[],"summary":"GHC.Event overwrites main IO managers hooks to RTS","status":"New","operating_system":"","component":"libraries/base","related":[],"milestone":"","resolution":"Unresolved","owner":{"tag":"Unowned"},"version":"7.4.1","keywords":["IO","Manager,","RTS"],"differentials":[],"test_case":"","architecture":"","cc":[""],"type":"Bug","description":"The IO manager registers two file descriptors with the RTS which the RTS uses to send control and wakeup signals to the IO manager. The main IO manager is started up by default and registers some file descriptors that it has allocated with the RTS. \r\n\r\nThe base package also exposes a GHC.Event module which when initialized will also register files with the RTS, overwriting the main IO manager's files. Now the RTS can no longer signal the main IO manager.","type_of_failure":"OtherFailure","blocking":[]} -->8.0.1https://gitlab.haskell.org/ghc/ghc/-/issues/7320GHC crashes when building on 32-bit Linux in a Linode2019-07-07T18:50:18ZbenlGHC crashes when building on 32-bit Linux in a LinodeTrying to build `haskell-src-exts` crashes in a 32-bit Linux Linode (under Xen).
GHC 7.0.4 seems ok, but I've tried GHC 7.2.2, 7.4.1 and 7.6.1 and they all either segfault or issue a bad instruction when building. This also happens when...Trying to build `haskell-src-exts` crashes in a 32-bit Linux Linode (under Xen).
GHC 7.0.4 seems ok, but I've tried GHC 7.2.2, 7.4.1 and 7.6.1 and they all either segfault or issue a bad instruction when building. This also happens when building GHC from source.
GHC 7.6.1 crashes more frequently than the others. When run under GDB it usually dies at the same program point (0x093c6263). Running GHC with -v says it's in the 'Tidy Core' stage.
I've tried under the latest Arch, as well as Debian. The Linode has 1GB of real RAM and 4GB of swap, so it shouldn't be running out of memory.
The Linode has been stable otherwise, it runs apache and trac, and has had uptimes of 3 months or more.
Running Ubuntu-32bit under Parallels (using OSX as the host) with the same memory configuration seems fine.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | ------------ |
| Version | 7.6.1 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
<!-- {"blocked_by":[],"summary":"GHC crashes when building on 32-bit Linux in a Linode","status":"New","operating_system":"","component":"Compiler","related":[],"milestone":"","resolution":"Unresolved","owner":{"tag":"Unowned"},"version":"7.6.1","keywords":[],"differentials":[],"test_case":"","architecture":"","cc":[""],"type":"Bug","description":"\r\nTrying to build {{{haskell-src-exts}}} crashes in a 32-bit Linux Linode (under Xen).\r\n\r\nGHC 7.0.4 seems ok, but I've tried GHC 7.2.2, 7.4.1 and 7.6.1 and they all either segfault or issue a bad instruction when building. This also happens when building GHC from source. \r\n\r\nGHC 7.6.1 crashes more frequently than the others. When run under GDB it usually dies at the same program point (0x093c6263). Running GHC with -v says it's in the 'Tidy Core' stage.\r\n\r\nI've tried under the latest Arch, as well as Debian. The Linode has 1GB of real RAM and 4GB of swap, so it shouldn't be running out of memory.\r\n\r\nThe Linode has been stable otherwise, it runs apache and trac, and has had uptimes of 3 months or more.\r\n\r\nRunning Ubuntu-32bit under Parallels (using OSX as the host) with the same memory configuration seems fine.\r\n","type_of_failure":"OtherFailure","blocking":[]} -->8.0.1Simon MarlowSimon Marlow