GHC issues
https://gitlab.haskell.org/ghc/ghc/-/issues

https://gitlab.haskell.org/ghc/ghc/-/issues/11299
T5435_gcc_v fails on ARM (Ben Gamari, 2023-02-14)
The `rts/T5435_gcc_v` testcase sometimes fails on ARM. As far as I can tell there are two principal failure modes: hanging and segmentation faulting.
Milestone: 8.0.1. Assignee: Ben Gamari.

https://gitlab.haskell.org/ghc/ghc/-/issues/11223
Runtime linker performs eager loading of all object files (Tamar Christina, 2021-11-11)
The runtime linker seems to be re-exporting some of the symbols of `libmingwex` from the rts archive (using `SymI_HasProto`). Only a very small subset of symbols are re-exported.
If a symbol is needed that isn't re-exported (e.g. `log1p`), then such code can't be run in GHCi, because it will result in a duplicate symbol error.
### A workaround
The `rts` seems to be a special case again. The linker seems to ignore the `extra-libraries` from the `package.conf`, which explains why you can put anything you want in there and it'll still compile.
```hs
emptyPLS :: DynFlags -> PersistentLinkerState
emptyPLS _ = PersistentLinkerState {
                   closure_env = emptyNameEnv,
                   itbl_env    = emptyNameEnv,
                   pkgs_loaded = init_pkgs,
                   bcos_loaded = [],
                   objs_loaded = [],
                   temp_sos    = [] }
  -- Packages that don't need loading, because the compiler
  -- shares them with the interpreted program.
  --
  -- The linker's symbol table is populated with RTS symbols using an
  -- explicit list. See rts/Linker.c for details.
  where init_pkgs = [rtsUnitId]
```
I've tried two approaches, neither of which has worked completely:
1. I tried removing the symbols from the export list, adding `libmingwex` to the rts's `package.conf`, and having it just link against it. But it turns out the `rts`'s `package.conf` is ignored on all platforms. I didn't want to make an exception for Windows here, and I don't know why the other platforms also ignore it, so I abandoned this approach.
1. I tried marking the symbols we're re-exporting as weak symbols, so existing code wouldn't change and you could still link against `libmingwex`. But unfortunately, because of when the other libraries specified by `-l` are linked in, some of the symbols have already been used and thus aren't weak anymore. So I still get duplicate link errors.
What I want to try now is leaving them as weak symbols, but loading `libmingwex.a` at `rts` initialization time. Much like how `kernel32` is loaded. This is hopefully early enough that the symbols haven't been used yet.
### Example
```hs
-- LogFloat.hs
module Main (main) where
import Data.Number.LogFloat (log1p)
main :: IO ()
main = print $ log1p 1.0
```
`runghc LogFloat.hs` will fail:
```
Loading package logfloat-0.13.3.3 ...
linking ...
LogFloat.hs: ...\x86_64-windows-ghc-7.11.20151123\logfloat-0.13.3.3-4JZYNCXKwghOD60rvMUAcn\HSlogfloat-0.13.3.3-4JZYNCXKwghOD60rvMUAcn.o: unknown symbol `log1p'
LogFloat.hs: LogFloat.hs: unable to load package `logfloat-0.13.3.3'
```
Milestone: 8.0.1. Assignee: Tamar Christina.

https://gitlab.haskell.org/ghc/ghc/-/issues/3869
RTS GC Statistics from -S should be logged via the eventlog system (cjs, 2021-03-05)
The -Sfilename option to the RTS gives useful GC statistics, but it's hard to correlate these with other events, particularly to see if GC is interrupting critical sections in mutator threads. If the same information were instead logged via the eventlog system (perhaps enabled via a "-lg" option) one could get more insight into the garbage generation and collection behaviour of one's program.
Note that it's probably not necessary to also store the information given at the end of the run by both "-s" and "-S", though it may be interesting to contemplate moving this sort of thing into the eventlog file as well.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 6.12.1 |
| Type | FeatureRequest |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
Milestone: 8.0.1.

https://gitlab.haskell.org/ghc/ghc/-/issues/9706
New block-structured heap organization for 64-bit (Edward Z. Yang, 2020-09-14)
I was having some discussion about GHC's block structured heap with Sergio Benitez and Adam Belay, and during the discussion it was suggested that the way GHC manages the block structured heap is suboptimal when we're on 64-bit architectures.
At the moment, we allocate memory from the operating system per-megablock, storing metadata in the very first megablock. We have to do this because, on 32-bit, we can't generally be too picky about what address our memory ends up living at. On 64-bit, we have a lot more flexibility.
Here is the proposal:
1. Statically decide on a maximum heap size that is a power of two.
1. Next, probe for some appropriately aligned chunk of available virtual address space for this. On POSIX, we can mmap /dev/null using PROT_NONE and MAP_NORESERVE. On Windows, we can use VirtualAlloc with MEM_RESERVE. (There are a few other runtimes which do this trick, including GCC Go.)
1. Divide this region into blocks as before. The maximum heap size is now the megablock size, and the block size is still the same as before. Masking to find the block descriptor works as before.
1. To allocate, we keep track of the high-watermark, and mmap in 1MB pages as they are requested. We also keep track of how much metadata we need, and mmap extra pages to store metadata as necessary.
We still want to request memory from the operating system in conveniently sized chunks, but we can now abolish the notion of a megablock and the megablock allocator, and work purely with block coalescing. Additionally, the recorded heap location means that we can check if a pointer is HEAP_ALLOCED using a mask and equality check, solving #8199.
What do people think?
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.8.3 |
| Type | Task |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | simonmar |
| Operating system | |
| Architecture | |
</details>
Milestone: 8.0.1. Assignee: gcampax.

https://gitlab.haskell.org/ghc/ghc/-/issues/7919
Heap corruption (segfault) from large 'let' expression (duncan, 2020-09-10)
The attached test program reliably triggers an assertion in the storage manager with the `-debug` rts.
```
LargeUse: internal error: ASSERTION FAILED: file rts/sm/GCUtils.c, line 208
(GHC version 7.6.3 for x86_64_unknown_linux)
Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
```
This behaviour is reproducible with many recent ghc versions (tried 7.6.3, 7.4.2, 6.12.3) and all fail at the same assertion when using the `-debug` rts. (Without `-debug` we get a more random variety of segfaults and GC errors.)
It looks like a pretty clear case of heap corruption. I'll explain why...
The test program uses TH to generate a program that looks like this:
```
data Large = Large Int Int ... -- 512 non-strict Int fields
test =
let step (Large i1 i2 ... i512) =
let j1 = i1 + i4
j2 = i2 + i7
...
j511 = i511 + i510
j512 = i512 + i1
in Large j1 j2 ... j512
in runSteps step 100000 (Large 1 1 ... 1)
-- basically an unfoldr:
runSteps :: (state -> (state, Int)) -> Int -> state -> [Int]
runSteps f n i | n <= 0 = []
| otherwise = case f i of
(i', r) -> r : runSteps f (n - 1) i'
```
We use TH to generate this program, and we use a "size" parameter that determines the size of the data constructor (and the corresponding letrec). This makes it easy to find the size threshold where it fails.
For small sizes this program works fine, and for larger values it triggers the assert. With ghc 7.6.3 on an x86-64 machine, the magic threshold is 511: that is, the program works fine with size 510 and hits the assertion at size 511. This is suspiciously close to 512. And of course on a 64-bit machine 512 * 8 is 4k, which is the storage manager's block size. And the failing assertion is in a bit of the storage manager that is dealing with blocks...
```
// If this block does not have enough space to allocate the
// current object, but it also doesn't have any work to push, then
// push it on to the scanned list. It cannot be empty, because
// then there would be enough room to copy the current object.
if (bd->u.scan == bd->free)
{
ASSERT(bd->free != bd->start);
push_scanned_block(bd, ws);
}
```
So it looks very much like we have a situation where something is writing over the end of a block and messing up the SM's data structures.
But, it is not nearly as simple as the data constructor being too big. We can demonstrate other programs that use much larger data constructors without any problem at all. Our suspicion falls on the big letrec.
Indeed with this program if we change the data constructor to have strict fields then it no longer fails, and we can run it with much larger data constructor sizes. What would be different between strict and non-strict fields here? Well, one observation is that when it is strict then ghc can (and does) turn the code into a big cascade of case expressions, while when it is non-strict then the STG code is all 'let's.
```
case tpl_s6jQ of tpl_s6Ak {
__DEFAULT ->
case tpl_s6jS of tpl_s6Al {
__DEFAULT ->
case tpl_s6jU of tpl_s6Am {
-- etc for all 500+ elements
```
versus
```
let {
sat_s5UK :: GHC.Types.Int
[LclId] =
\u [] GHC.Num.$fNumInt_$c+ i511_s5Ly i1_s5E9; } in
let {
sat_s62X :: GHC.Types.Int
[LclId] =
\u [] GHC.Num.$fNumInt_$c+ i510_s5R2 i509_s5UG; } in
let {
sat_s62W :: GHC.Types.Int
[LclId] =
\u [] GHC.Num.$fNumInt_$c+ i509_s5UG i506_s5UC; } in
-- etc for all 500+ elements
```
Note also that it has nothing to do with the obvious space leak here. If we modify the code to generate an NFData instance and use deepseq at each iteration, we eliminate the space leak, but we keep the big STG 'let', and it still fails.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.6.3 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
Milestone: 8.0.1.

https://gitlab.haskell.org/ghc/ghc/-/issues/3937
Cannot killThread in listen/accept on Windows threaded runtime (guest, 2020-07-27)
The killThread is not able to kill threads accepting socket connections on Windows.
Run the attached file either in GHCi:
```
runghc.exe ListenOn.hs
```
or compile threaded:
```
ghc --make -threaded ListenOn.hs
```
Resulting binary hangs.
Expected behavior: should finish without problems.
The non-threaded runtime produces the expected behavior, and everything seems to work correctly on Linux.
Affected: ghc-6.10.4 and ghc-6.12.1.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 6.12.1 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
Milestone: 8.0.1.

https://gitlab.haskell.org/ghc/ghc/-/issues/3946
Better diagnostic when entering a GC'd CAF (Simon Marlow, 2019-11-14)
Currently a GC'd CAF contains a dangling pointer, so entering it will result in a segfault or some other random failure. It would be better to give a useful diagnostic in this case (see #3900), not only to detect when `keepCAFs` is needed, but also to help find bugs in the code generator's SRT table generation and GC bugs.
Here is one way it could be done. We use the static link field of a static closure to indicate whether the closure is live or not:
- link field is non-zero if and only if the closure was reachable at the last GC, otherwise it is zero
If an `IND_STATIC` closure has a zero link field, then we know for sure that the closure pointed to by the `IND_STATIC` is invalid, and entering the `IND_STATIC` should give a helpful error message.
To implement this:
- on entering a CAF, set the link field to 1.
- at the beginning of (major) GC, set all the link fields for static closures that were reachable during the last major GC to zero
- during GC, link fields for reachable static closures get set to non-zero as they are linked onto first the static_objects list and then the scavenged_static_objects list.
One problem is that this scheme means traversing and writing to all the static closures at the beginning of GC, when some of them may be dead, and many will not be in the cache. The current way of doing this at the end of GC is better from a cache perspective. To refine the above approach, we could do the extra zeroing phase at the beginning of GC for `IND_STATIC` closures only, and the others would get the current treatment.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 6.12.1 |
| Type | Task |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
Milestone: 8.0.1.

https://gitlab.haskell.org/ghc/ghc/-/issues/485
AdjustorAsm.S doesn't build on AIX (jgoerzen, 2019-07-07)
OK, this is a weird one.
I'm building GHC 6.4.1 on AIX and using IBM's assembler, since GNU binutils is known to have issues on AIX.
When the build reached AdjustorAsm.S, I got:
```
imer.h -#include ProfHeap.h -#include LdvProfile.h
-#include Profiling.h -#include Apply.h -fvia-C -dcmm-lint -c AdjustorAsm.S -o AdjustorAsm.o
Assembler:
/tmp//ccq7dlbU.s: line 15: 1252-016 The specified opcode or pseudo-op is not valid.
        Use supported instructions or pseudo-ops only.
/tmp//ccq7dlbU.s: line 48: 1252-149 Instruction subf is not implemented in the current assembly mode COM.
/tmp//ccq7dlbU.s: line 52: 1252-142 Syntax error.
/tmp//ccq7dlbU.s: line 53: 1252-142 Syntax error.
/tmp//ccq7dlbU.s: line 58: 1252-142 Syntax error.
/tmp//ccq7dlbU.s: line 59: 1252-142 Syntax error.
make[2]: *** [AdjustorAsm.o] Error 1
```
After some research, I added `-opta -Wa,-mppc`, which reduced the errors to:
```
/tmp//ccA1yNhC.s: line 15: 1252-016 The specified opcode or pseudo-op is not valid.
        Use supported instructions or pseudo-ops only.
/tmp//ccA1yNhC.s: line 52: 1252-142 Syntax error.
/tmp//ccA1yNhC.s: line 53: 1252-142 Syntax error.
/tmp//ccA1yNhC.s: line 58: 1252-142 Syntax error.
/tmp//ccA1yNhC.s: line 59: 1252-142 Syntax error.
```
I examined the temp files and found that line 15 contains only the word ".text".
I was finally able to work around the problem by adding -opta -save-temps to the command line, then using GNU as like so:
```
as -mppc -I. AdjustorAsm.s -o AdjustorAsm.o
```
I then copied the resulting .o file to the thr, p, debug, etc. .o files. The build was then able to complete.
Milestone: 8.0.1.

https://gitlab.haskell.org/ghc/ghc/-/issues/1820
Windows segfault-catching only works for the main thread (Simon Marlow, 2019-07-07)
On Windows, the RTS tries to catch segmentation faults and divide-by-zero exceptions using structured exception handling in `rts/Main.c`. Unfortunately this only works for the main thread, so if the exception occurs in another thread it won't be caught. GHCi runs all its computations in a separate thread, hence `derefnull(ghci)` and `divbyzero(ghci)` are failing.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 6.8.1 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | Unknown |
</details>
Milestone: 8.0.1.

https://gitlab.haskell.org/ghc/ghc/-/issues/3693
Show stack traces (jpet, 2019-07-07)
Debugging stack overflows can be very difficult, because GHC gives very little information as to exactly what is overflowing. Showing a basic stack dump (on crash, or in the ghci debugger) would be enormously helpful.
(Entered after spending two days trying to determine the cause of a stack overflow, before discovering it was a GHC bug (#3677), which would have been apparent immediately if I could only have seen a call stack.)
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 6.10.4 |
| Type | FeatureRequest |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
Milestone: 8.0.1. Assignee: Tarrasch.

https://gitlab.haskell.org/ghc/ghc/-/issues/4520
startup code on Windows should use SetDllDirectory("") (duncan, 2019-07-07)
See [Raymond's blog](http://blogs.msdn.com/b/oldnewthing/archive/2010/11/10/10088566.aspx) about (un)safe dll loading. He points to a [support article](http://support.microsoft.com/kb/2389418) which recommends that new programs use `SetDllDirectory("")` to prevent the problem (it's not the default because that'd break old programs).
In the GHC context, this could go in the startup code for standalone executables. It is a process-scope property so changing it is not appropriate for DLL startup.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | ---------------- |
| Version | 7.0.1 |
| Type | FeatureRequest |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | Unknown/Multiple |
| Architecture | |
</details>
8.0.1

---

**SEH exception handler not implemented on Win64**
https://gitlab.haskell.org/ghc/ghc/-/issues/6079
2019-07-07T18:52:17Z · Ian Lynagh <igloo@earth.li>

In `RtsMain.c` we only enable `BEGIN_CATCH`/`END_CATCH` on Win32. I think we need a completely different implementation for Win64.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.5 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | x86_64 (amd64) |
</details>
8.0.1 · Tamar Christina

---

**GHC crashes when building on 32-bit Linux in a Linode**
https://gitlab.haskell.org/ghc/ghc/-/issues/7320
2019-07-07T18:50:18Z · benl

Trying to build `haskell-src-exts` crashes in a 32-bit Linux Linode (under Xen).
GHC 7.0.4 seems ok, but I've tried GHC 7.2.2, 7.4.1 and 7.6.1 and they all either segfault or issue a bad instruction when building. This also happens when building GHC from source.
GHC 7.6.1 crashes more frequently than the others. When run under GDB it usually dies at the same program point (0x093c6263). Running GHC with -v says it's in the 'Tidy Core' stage.
I've tried under the latest Arch, as well as Debian. The Linode has 1GB of real RAM and 4GB of swap, so it shouldn't be running out of memory.
The Linode has been stable otherwise, it runs apache and trac, and has had uptimes of 3 months or more.
Running Ubuntu-32bit under Parallels (using OSX as the host) with the same memory configuration seems fine.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | ------------ |
| Version | 7.6.1 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
8.0.1 · Simon Marlow

---

**GHC.Event overwrites main IO managers hooks to RTS**
https://gitlab.haskell.org/ghc/ghc/-/issues/7482
2019-07-07T18:49:33Z · AndreasVoellmy

The IO manager registers two file descriptors with the RTS, which the RTS uses to send control and wakeup signals to the IO manager. The main IO manager is started up by default and registers some file descriptors that it has allocated with the RTS.
The base package also exposes a GHC.Event module which, when initialized, will also register its file descriptors with the RTS, overwriting the main IO manager's. The RTS can then no longer signal the main IO manager.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.4.1 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | libraries/base |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
8.0.1

---

**Nested STM Invariants are lost**
https://gitlab.haskell.org/ghc/ghc/-/issues/7930
2019-07-07T18:47:24Z · Ryan Yates

Invariants from a successful nested transaction should be merged with the parent.
```hs
import Control.Concurrent
import Control.Concurrent.STM
main = do
x <- atomically $
do a <- newTVar True
(always (readTVar a) >> retry) `orElse` return ()
return a
atomically (writeTVar x False) -- Should not and does not fail
y <- atomically $
do a <- newTVar True
always (readTVar a) `orElse` return ()
return a
atomically (writeTVar y False) -- Should fail, but does not!
putStrLn "Ahhh!"
z <- atomically $
do a <- newTVar True
always (readTVar a)
return a
atomically (writeTVar z False) -- should and does fail
```
I know how to fix this. I'll have a patch with some tests and a fix soon.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.6.3 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
8.0.1 · Ryan Yates

---

**traceEvent truncates to 512 bytes**
https://gitlab.haskell.org/ghc/ghc/-/issues/8309
2019-07-07T18:45:37Z · duncan

The `Debug.Trace.traceEvent` (and `traceEventIO`) functions use a code path that unnecessarily goes through printf format strings and a fixed-size 512-byte buffer, truncating all user trace messages at that size.
Here is the call path:
- `Debug.Trace.traceEvent` (and `traceEventIO`) call the `traceEvent#` primop
- the Cmm implementation of the primop, `stg_traceEventzh`, calls the C RTS function `traceUserMsg`
- `traceUserMsg` calls `traceFormatUserMsg(cap, "%s", msg);`, using a printf format string
- `traceFormatUserMsg` uses `postUserMsg`, which eventually calls `postLogMsg`
- `postLogMsg` does the printf formatting
Here's what `postLogMsg` does:
```c
#define BUF 512
void postLogMsg(EventsBuf *eb, EventTypeNum type, char *msg, va_list ap)
{
char buf[BUF];
nat size;
size = vsnprintf(buf,BUF,msg,ap);
if (size > BUF) {
buf[BUF-1] = '\0';
size = BUF;
}
....
```
This is obviously designed for RTS-internal users, not the user path.
So the problem starts with this bit:
```c
void traceUserMsg(Capability *cap, char *msg)
{
traceFormatUserMsg(cap, "%s", msg);
}
```
It just should not use that code path. It should call something that directly posts the message, without any silly printf strings. In fact, the code path from the primop down should really be changed to use an explicit given length, rather than using a null-terminated string and having to call strlen.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.6.3 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
8.0.1

---

**Replace hooks API in the RTS with something better**
https://gitlab.haskell.org/ghc/ghc/-/issues/8785
2019-07-07T18:43:28Z · Simon Marlow

Hooks rely on static linking behaviour which doesn't always work: we have to disable `-Bsymbolic` for the RTS on Linux (see `compiler/main/SysTools.lhs`) and it apparently doesn't work at all on Mac (#8754).
So instead of hooks we should be passing in information when we initialize the RTS, like we already do for some other things (`-rtsopts` etc.).
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.6.3 |
| Type | Task |
| TypeOfFailure | OtherFailure |
| Priority | high |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | simonmar |
| Operating system | |
| Architecture | |
</details>
8.0.1 · Simon Marlow

---

**Profiling binary consumes CPU even when idle on Linux.**
https://gitlab.haskell.org/ghc/ghc/-/issues/9105
2019-07-07T18:41:54Z · robinp

The program is
```hs
import Control.Concurrent
import Control.Monad (forever)
main :: IO ()
main = forever $ threadDelay 1000000 >> return ()
```
Compiled with 32-bit GHC 7.6.3 or 7.8.2 on Debian (inside a VM), and with GHC 7.4.1 on Ubuntu (not in a VM).
The non-profiling binary doesn't consume CPU, while the profiling one consumes ~10% (of a 2GHz machine). It is run with `+RTS -I0`, so this is not the idle GC.
When strace-ing, the profiling binary seems to receive a constant flow of SIGVTALRM, while the normal one receives one burst each second.
I see I can switch off "master tick interval" with `-V0`, and then CPU is not used, but the consequences of this are not very well documented (apart from context switching becoming deterministic).
Interestingly, if I compile using profiling on Windows (latest haskell-platform, 64-bit), it doesn't use more CPU than the non-profiling.
So, the question is why this happens on Linux, and whether it can be avoided somehow.

8.0.1 · Ben Gamari

---

**-S prints incorrect number of bound tasks**
https://gitlab.haskell.org/ghc/ghc/-/issues/9261
2019-07-07T18:41:09Z · edsko@edsko.net

In `rts/Stats.c` we have:
```c
statsPrintf(" TASKS: %d (%d bound, %d peak workers (%d total), using -N%d)\n",
taskCount, taskCount - workerCount,
peakWorkerCount, workerCount,
n_capabilities);
```
but I think `taskCount - workerCount` must be wrong, because `taskCount` is the _current_ number of tasks, while `workerCount` is the _total_ (accumulating) number of workers. I think it should be:
```c
statsPrintf(" TASKS: %d (%d bound, %d peak workers (%d total), using -N%d)\n",
taskCount, taskCount - currentWorkerCount,
peakWorkerCount, workerCount,
n_capabilities);
```

8.0.1 · Thomas Miedema

---

**unsafeUnmask unmasks even inside uninterruptibleMask**
https://gitlab.haskell.org/ghc/ghc/-/issues/9516
2019-07-07T18:40:12Z · edsko@edsko.net

Control.Exception exports
```hs
allowInterrupt :: IO ()
allowInterrupt = unsafeUnmask $ return ()
```
with documentation:
*When invoked inside `mask`, this function allows a blocked asynchronous exception to be raised, if one exists. It is equivalent to performing an interruptible operation, but does not involve any actual blocking. When called outside `mask`, or inside `uninterruptibleMask`, this function has no effect.*
However, this is not actually true: `unsafeUnmask` unmasks exceptions even inside `uninterruptibleMask`, as the attached test demonstrates (the test uses a foreign call just to have something non-interruptible but still observable; in particular, doing a `print` *is* interruptible because it uses an `MVar` under the hood).
I think it is possible to define a better `unsafeUnmask` in user-land:
```hs
interruptible :: IO a -> IO a
interruptible act = do
st <- getMaskingState
case st of
Unmasked -> act
MaskedInterruptible -> unsafeUnmask act
MaskedUninterruptible -> act
```
but it still seems that we should either (i) change the behaviour of `unsafeUnmask`, or (ii) provide a version of `unsafeUnmask` with the behaviour as described and change `allowInterrupt` to use that new version, or at the very least (iii) change the documentation.
(One question with the above definition of `interruptible` is what happens when we *nest* `mask` and `uninterruptibleMask`?)
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.8.2 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | simonmar |
| Operating system | |
| Architecture | |
</details>
8.0.1 · Simon Marlow