GHC issueshttps://gitlab.haskell.org/ghc/ghc/-/issues2024-02-08T20:09:11Zhttps://gitlab.haskell.org/ghc/ghc/-/issues/24287ACQUIRE_LOCK deadlock with ipe2024-02-08T20:09:11ZZubinACQUIRE_LOCK deadlock with ipeWhen GHC is compiled with both profiling and ipe information, we try to acquire the eventlog mutex twice in the same thread.
```
ghc: internal error: ACQUIRE_LOCK(0x1dc2e7e0) failed (rts/eventlog/EventLog.c:1572): 35
(GHC version 9....When GHC is compiled with both profiling and ipe information, we try to acquire the eventlog mutex twice in the same thread.
```
ghc: internal error: ACQUIRE_LOCK(0x1dc2e7e0) failed (rts/eventlog/EventLog.c:1572): 35
(GHC version 9.9.20231213 for x86_64_unknown_linux)
Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
```
GHC is compiled using `--flavour=perf+profiled_ghc+no_dynamic_ghc+ipe+debug_ghc+debug_info`
Then we run ghc with both heap profiling and eventlogs enabled: `ghc +RTS -hb -l`
The lock is initially acquired in `postIPE`:
```c
void postIPE(const InfoProvEnt *ipe)
{
ACQUIRE_LOCK(&eventBufMutex);
```
Then we attempt to acquire it again in `flushAllCapsEventsBufs`
```c
void flushAllCapsEventsBufs(void)
{
if (!event_log_writer) {
return;
}
ACQUIRE_LOCK(&eventBufMutex);
printAndClearEventBuf(&eventBuf);
RELEASE_LOCK(&eventBufMutex);
```https://gitlab.haskell.org/ghc/ghc/-/issues/24197`dumpIPEToEventLog` disregards maximum event length2024-01-30T11:44:36ZBen Gamari`dumpIPEToEventLog` disregards maximum event lengthBuilding a compiler in the `default+ipe+debug_info+ghc_debug` flavour will quickly reveal that `dumpIPEToEventLog` makes no attempt to honor the maximum event length:
```
ghc: internal error: ASSERTION FAILED: file rts/eventlog/EventLog....Building a compiler in the `default+ipe+debug_info+ghc_debug` flavour will quickly reveal that `dumpIPEToEventLog` makes no attempt to honor the maximum event length:
```
ghc: internal error: ASSERTION FAILED: file rts/eventlog/EventLog.c, line 183
Stack trace:
0x110e52c8 set_initial_registers (rts/Libdw.c:294.5)
0x7ffff7b826b8 dwfl_thread_getframes (/nix/store/dbvlgp6fsp0ar1pn9chyc41j8y2lh2jx-elfutils-0.189/lib/libdw-0.189.so)
0x7ffff7b8220b get_one_thread_cb (/nix/store/dbvlgp6fsp0ar1pn9chyc41j8y2lh2jx-elfutils-0.189/lib/libdw-0.189.so)
0x7ffff7b8251a dwfl_getthreads (/nix/store/dbvlgp6fsp0ar1pn9chyc41j8y2lh2jx-elfutils-0.189/lib/libdw-0.189.so)
0x7ffff7b82a47 dwfl_getthread_frames (/nix/store/dbvlgp6fsp0ar1pn9chyc41j8y2lh2jx-elfutils-0.189/lib/libdw-0.189.so)
0x110e519b libdwGetBacktrace (rts/Libdw.c:263.15)
0x1107def8 rtsFatalInternalErrorFn (rts/RtsMessages.c:175.22)
0x1107dab7 barf (rts/RtsMessages.c:49.3)
0x1107db1a errorBelch (rts/RtsMessages.c:68.1)
0x1109959c postString (rts/eventlog/EventLog.c:184.9)
0x1109c6ca postIPE (rts/eventlog/EventLog.c:1449.36)
0x11095807 traceIPE (rts/Trace.c:702.1)
0x11074957 dumpIPEToEventLog (rts/IPE.c:122.50)
0x11099a1d postInitEvent (rts/eventlog/EventLog.c:321.5)
0x1107e721 hs_init_ghc (rts/RtsStartup.c:403.5)
0x1107d93e hs_main (rts/RtsMain.c:57.5)
0x751bc1 (null) (/mnt/data/exp/ghc/ghc-landing/_build/stage1/bin/ghc)
0x7ffff7c09ace __libc_start_call_main (/nix/store/dg8mpqqykmw9c7l0bgzzb5znkymlbfjw-glibc-2.37-8/lib/libc.so.6)
0x7ffff7c09b89 __libc_start_main@@GLIBC_2.34 (/nix/store/dg8mpqqykmw9c7l0bgzzb5znkymlbfjw-glibc-2.37-8/lib/libc.so.6)
0x409055 _start (/mnt/data/exp/ghc/ghc-landing/_build/stage1/bin/ghc)
(GHC version 9.9.20231025 for x86_64_unknown_linux)
Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
```
The offending IPE field appears to be `ty_desc`, which in this case is 76314 characters long (being the `Generic` representation of the `GHC.Tc.Errors.Types.TcRnMessage` type).
Without RTS assertions enabled this results in the program crashing due to non-obvious heap corruption.https://gitlab.haskell.org/ghc/ghc/-/issues/21647Crash on shutdown when eventlog is used2023-09-07T09:35:25ZAdam GundryCrash on shutdown when eventlog is used## Summary
When using the eventlog in a large server application, I observe
```
munmap_chunk(): invalid pointer
Aborted (core dumped)
```
when the program exits. The problem does not arise without the `-l` flag. The resulting eventlog c...## Summary
When using the eventlog in a large server application, I observe
```
munmap_chunk(): invalid pointer
Aborted (core dumped)
```
when the program exits. The problem does not arise without the `-l` flag. The resulting eventlog cannot be read by `eventlog2html`, presumably because it is incomplete or corrupted.
A backtrace indicates the problem is that `finishCapEventLogging` tries to call `stgFree` on a buffer that has already been freed.
```
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007fcc82cde859 in __GI_abort () at abort.c:79
#2 0x00007fcc82d4926e in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7fcc82e73298 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3 0x00007fcc82d512fc in malloc_printerr (str=str@entry=0x7fcc82e751e0 "munmap_chunk(): invalid pointer") at malloc.c:5347
#4 0x00007fcc82d5154c in munmap_chunk (p=<optimised out>) at malloc.c:2830
#5 0x00007fcc834004e6 in finishCapEventLogging () at rts/eventlog/EventLog.c:778
#6 0x00007fcc833d1a94 in hs_exit_ (wait_foreign=false) at rts/RtsStartup.c:526
#7 0x00007fcc833d2055 in shutdownHaskellAndExit (n=1, fastExit=<optimised out>) at rts/RtsStartup.c:652
#8 0x00007fcc84206680 in ?? () from /opt/ghc/9.2.2/lib/ghc-9.2.2/base-4.16.1.0/libHSbase-4.16.1.0-ghc9.2.2.so
#9 0x0000000000000000 in ?? ()
```
I don't see why `finishCapEventLogging` would be called twice though.
## Steps to reproduce
Unfortunately this is a large commercial codebase so I can't post a public reproducer, and I'm not sure where to start with extracting a minimal example.
## Expected behavior
`finishCapEventLogging` should not double free the buffer.
## Environment
* GHC version used: 9.2.2
* Operating System: Linux
* System Architecture: x86_649.8.1https://gitlab.haskell.org/ghc/ghc/-/issues/20221Very long command line leads to an invalid eventlog2023-05-03T11:31:31ZMatthew PickeringVery long command line leads to an invalid eventlogReproduce with:
```
echo "module A100 where" > A100.hs
echo -optP-D__F{1..10000}__ > T14697-flags
ghc --make A100 $(cat T14697-flags) +RTS -l
```
Produces an eventlog which `ghc-events` fails to parse.
```
[nix-shell:~/eventlog2htm...Reproduce with:
```
echo "module A100 where" > A100.hs
echo -optP-D__F{1..10000}__ > T14697-flags
ghc --make A100 $(cat T14697-flags) +RTS -l
```
Produces an eventlog which `ghc-events` fails to parse.
```
[nix-shell:~/eventlog2html]$ cabal run eventlog2html -- ghc.eventlog
eventlog2html: Ix{Int}.index: Index (24415) out of range ((0,212))
```https://gitlab.haskell.org/ghc/ghc/-/issues/22603Wasm backend testsuite failure: T200062023-02-01T13:31:42ZCheng ShaoWasm backend testsuite failure: T20006`T20006` fails with:
```
Wrong exit code for T20006(normal)(expected 0 , actual 1 )
Stderr ( T20006 ):
error: Uncaught RuntimeError: divide by zero
at setupRtsFlags (wasm://wasm/005c2cca:1:781073)
at hs_init_ghc (wasm://wasm/005...`T20006` fails with:
```
Wrong exit code for T20006(normal)(expected 0 , actual 1 )
Stderr ( T20006 ):
error: Uncaught RuntimeError: divide by zero
at setupRtsFlags (wasm://wasm/005c2cca:1:781073)
at hs_init_ghc (wasm://wasm/005c2cca:1:809573)
at hs_main (wasm://wasm/005c2cca:1:790284)
at main (wasm://wasm/005c2cca:1:27177)
at __main_void (wasm://wasm/005c2cca:1:1116161)
at _start (wasm://wasm/005c2cca:1:26644)
at Context.start (https://deno.land/std@0.167.0/wasi/snapshot_preview1.ts:1765:7)
at file:///workspace/.ghc-wasm/wasm-run/bin/wasm-run.js:24:9
*** unexpected failure for T20006(normal)
```
`T20006.hs`:
```haskell
module Main where
main :: IO ()
main = return ()
```
Run opts:
```python
test('T20006', [extra_run_opts('+RTS --eventlog-flush-interval=1')],
compile_and_run,
[''])
```Cheng ShaoCheng Shaohttps://gitlab.haskell.org/ghc/ghc/-/issues/21813`-la` doesn't enable all event types2022-11-23T18:18:12ZTeo Camarasu`-la` doesn't enable all event types
If you look at the implementation of `read_trace_flags` (https://gitlab.haskell.org/ghc/ghc/-/blob/master/rts/RtsFlags.c#L2335) you will see that `-la` doesn't in fact turn on all classes of events.
It seems to be missing `nonmoving_g...
If you look at the implementation of `read_trace_flags` (https://gitlab.haskell.org/ghc/ghc/-/blob/master/rts/RtsFlags.c#L2335) you will see that `-la` doesn't in fact turn on all classes of events.
It seems to be missing `nonmoving_gc` and `ticky`.
I'd be happy to submit a fix.Teo CamarasuTeo Camarasuhttps://gitlab.haskell.org/ghc/ghc/-/issues/21537Add fragmentation statistics to GHC.Stats2022-09-22T10:35:26ZTeo CamarasuAdd fragmentation statistics to GHC.Stats## Motivation
Fragmentation is a key statistic when using the non-moving garbage collector. The event log has information about heap fragmentation, but it would be helpful to have the same information available from `GHC.Stats`.
As far...## Motivation
Fragmentation is a key statistic when using the non-moving garbage collector. The event log has information about heap fragmentation, but it would be helpful to have the same information available from `GHC.Stats`.
As far as I understand we can already get at this statistic by taking `gcdetails_mem_in_use_bytes - gcdetails_live_bytes `. But I'm not sure that's right. And even if that's true, I think it would be more user friendly to make this stat available in an explicit way.
## Proposal
Add a field to `GHC.Stats.GCDetails` called `gcdetails_fragmentation_bytes` that gives the same stats as the event log fragmentation counter.
I'm happy to implement this.9.6.1Teo CamarasuTeo Camarasuhttps://gitlab.haskell.org/ghc/ghc/-/issues/21070gen_event_types requires an audit2022-04-01T17:27:27ZMatthew Pickeringgen_event_types requires an auditDue to the recent refactor of the eventlog event generation I discovered two bugs due to incorrect information in `gen_event_types.py`, in particular the size for `EVENT_IPE` and `EVENT_MEM_RETURN` were incorrect. The whole file should b...Due to the recent refactor of the eventlog event generation I discovered two bugs due to incorrect information in `gen_event_types.py`, in particular the size for `EVENT_IPE` and `EVENT_MEM_RETURN` were incorrect. The whole file should be audited to ensure there are no more of these bugs.9.4.1Ben GamariBen Gamarihttps://gitlab.haskell.org/ghc/ghc/-/issues/19382Interupting doesn't seem to flush eventlog buffers2021-06-19T15:18:51ZMatthew PickeringInterupting doesn't seem to flush eventlog buffersWhen heap profiling long-running applications it seems that killing a process with `Ctrl-C` doesn't cause the eventlog buffers to be flushed. The signal handle seems to be designed to flush "profiling output" but the control flow around ...When heap profiling long-running applications it seems that killing a process with `Ctrl-C` doesn't cause the eventlog buffers to be flushed. The signal handle seems to be designed to flush "profiling output" but the control flow around this area made my head spin, so I couldn't see where the flush was supposed to happen.https://gitlab.haskell.org/ghc/ghc/-/issues/3869RTS GC Statistics from -S should be logged via the eventlog system2021-03-05T21:34:44ZcjsRTS GC Statistics from -S should be logged via the eventlog systemThe -Sfilename option to the RTS gives useful GC statistics, but it's hard to correlate these with other events, particularly to see if GC is interrupting critical sections in mutator threads. If the same information were instead logged ...The -Sfilename option to the RTS gives useful GC statistics, but it's hard to correlate these with other events, particularly to see if GC is interrupting critical sections in mutator threads. If the same information were instead logged via the eventlog system (perhaps enabled via a "-lg" option) one could get more insight into the garbage generation and collection behaviour of one's program.
Note that it's probably not necessary also to store the information given at the end of the run with both "-s" and "-S", though it may be interesting to contemplate moving this sort of thing into the eventlog file as well.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 6.12.1 |
| Type | FeatureRequest |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
<!-- {"blocked_by":[],"summary":"RTS GC Statistics from -S should be logged via the eventlog system","status":"New","operating_system":"","component":"Runtime System","related":[],"milestone":"","resolution":"Unresolved","owner":{"tag":"Unowned"},"version":"6.12.1","keywords":["collection,","garbage","gc,","statistics"],"differentials":[],"test_case":"","architecture":"","cc":[""],"type":"FeatureRequest","description":"The -Sfilename option to the RTS gives useful GC statistics, but it's hard to correlate these with other events, particularly to see if GC is interrupting critical sections in mutator threads. If the same information were instead logged via the eventlog system (perhaps enabled via a \"-lg\" option) one could get more insight into the garbage generation and collection behaviour of one's program.\r\n\r\nNote that it's probably not necessary also to store the information given at the end of the run with both \"-s\" and \"-S\", though it may be interesting to contemplate moving this sort of thing into the eventlog file as well.","type_of_failure":"OtherFailure","blocking":[]} -->8.0.1https://gitlab.haskell.org/ghc/ghc/-/issues/19078Restarting event log results in invalid output2021-03-05T21:33:31ZBen GamariRestarting event log results in invalid outputCurrently if one calls `endEventLogging()`, followed by `beginEventLogging()`, the eventlog that results has block markers before the header, rendering the output unreadable to `ghc-events`. The problem is that `endEventLogging()` calls ...Currently if one calls `endEventLogging()`, followed by `beginEventLogging()`, the eventlog that results has block markers before the header, rendering the output unreadable to `ghc-events`. The problem is that `endEventLogging()` calls `printAndClearEventsBuf(&eventBuf)`, which posts a marker after flushing the buffer. `postHeaderEvents()` then fails to clear this marker before posting the header.
The solution seems clear: call `resetEventsBuf()` in either `endEventLogging()` or `postHeaderEvents()`.https://gitlab.haskell.org/ghc/ghc/-/issues/19436Eventlog flushing in non-threaded is broken2021-03-01T16:19:18ZMatthew PickeringEventlog flushing in non-threaded is broken`flushEventLog` in non-threaded way doesn't flush the `capEventBuf[0]` event buffer.`flushEventLog` in non-threaded way doesn't flush the `capEventBuf[0]` event buffer.Matthew PickeringMatthew Pickeringhttps://gitlab.haskell.org/ghc/ghc/-/issues/19435Eventlog flushing with threaded is broken2021-03-01T15:19:39ZMatthew PickeringEventlog flushing with threaded is brokenCalling `flushEventLog` doesn't flush the buffers for each capability.
This is because the flush is in the wrong place. `yieldCapability` is not called on the normal code path for `stopAllCapabilities`, `yieldCapability` is only called ...Calling `flushEventLog` doesn't flush the buffers for each capability.
This is because the flush is in the wrong place. `yieldCapability` is not called on the normal code path for `stopAllCapabilities`, `yieldCapability` is only called when a sync has already been requested for another reason.
The probably correct place to flush is in between the `stop/releaseAllCapabilities` call.https://gitlab.haskell.org/ghc/ghc/-/issues/16710Work out and document how to build GHC with eventlog enabled2021-02-26T16:30:26ZMatthew PickeringWork out and document how to build GHC with eventlog enabledI want to build GHC with tracing (the eventlog) enabled. So far I have not succeeded by trying to replicate the approach taken
by `ghcDebugged`.
The criteria for success is invoking the resulting executable with `+RTS -l` would produc...I want to build GHC with tracing (the eventlog) enabled. So far I have not succeeded by trying to replicate the approach taken
by `ghcDebugged`.
The criteria for success is invoking the resulting executable with `+RTS -l` would produce an eventlog.Make removalMatthew PickeringAlp MestanogullariMatthew Pickeringhttps://gitlab.haskell.org/ghc/ghc/-/issues/18043RTS flushTrace function doesn't flush the eventlog2020-11-24T10:08:02ZDmitry IvanovRTS flushTrace function doesn't flush the eventlog## Summary
The `flushTrace` function flushes the eventlog file descriptor, but doesn't flush the event buffers to that descriptor.
For example, here's a fragment of RTS startup code:
```
/* Trace some basic information about the p...## Summary
The `flushTrace` function flushes the eventlog file descriptor, but doesn't flush the event buffers to that descriptor.
For example, here's a fragment of RTS startup code:
```
/* Trace some basic information about the process */
traceWallClockTime(); // (1)
traceOSProcessInfo(); // (2)
flushTrace(); // (3)
```
The intention of this code is to have this info at the very beginning of the eventlog. But I observe these events near the very end for my programs. Looks like (1) and (2) write to some memory buffers but don't touch the eventlog file descriptor, and (3) is flushing said descriptor but doesn't flush said memory buffer.
## Steps to reproduce
Compile some meaningful multithreaded program with `-eventlog -threaded` and run with `+RTS -l -N`. I could probably attach something small but sufficient if necessary.
Observe that the Wallclock event is somewhere not at the beginning of the eventlog.
## Expected behavior
WallClock event being at the beginning of the eventlog file, before events emitted by the application code like `Debug.Trace.traceEventIO`.
## Environment
* GHC version used: 8.8.3, 8.10.1
Optional:
* Operating System: macOS high sierra
* System Architecture: x86_649.2.1https://gitlab.haskell.org/ghc/ghc/-/issues/18210Eventlog flush racy?2020-05-27T07:58:03ZBen GamariEventlog flush racy?While responding to a thread on `ghc-devs` I noticed that there is apparently zero locking surrounding eventlog flushing. This means that if multiple capabilities fill their capabilities' event buffers simultaneously, begin flushing, and...While responding to a thread on `ghc-devs` I noticed that there is apparently zero locking surrounding eventlog flushing. This means that if multiple capabilities fill their capabilities' event buffers simultaneously, begin flushing, and those flushes require multiple `write`s we may get interleaving of event data chunks. This seems quite bad.https://gitlab.haskell.org/ghc/ghc/-/issues/10350Should be able to specify path for eventlog output.2020-04-10T13:38:41ZliteronShould be able to specify path for eventlog output.Currently it is a fixed path, but the binary might run in an env where write access is restricted.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Vers...Currently it is a fixed path, but the binary might run in an env where write access is restricted.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.10.1 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | simonmar |
| Operating system | |
| Architecture | |
</details>
<!-- {"blocked_by":[],"summary":"Should be able to specify path for eventlog output.","status":"New","operating_system":"","component":"Runtime System","related":[],"milestone":"","resolution":"Unresolved","owner":{"tag":"OwnedBy","contents":"simonmar"},"version":"7.10.1","keywords":[],"differentials":[],"test_case":"","architecture":"","cc":["simonmar"],"type":"Bug","description":"Currently it is a fixed path, but the binary might run in an env where write access is restricted.","type_of_failure":"OtherFailure","blocking":[]} -->Simon MarlowSimon Marlowhttps://gitlab.haskell.org/ghc/ghc/-/issues/17322Add eventlog events for `-p` profiling2019-10-23T17:19:28ZMatthew PickeringAdd eventlog events for `-p` profilingThe summary provided by `-p` profiling doesn't give an extremely accurate picture of the execution of a program. You can see generally how long something takes but you don't know what time it was called.
If we want to use tools like [sp...The summary provided by `-p` profiling doesn't give an extremely accurate picture of the execution of a program. You can see generally how long something takes but you don't know what time it was called.
If we want to use tools like [speedscope](https://www.speedscope.app/) the information needs to be more fine grained.
I can think of two potential options.
1. On each tick dump the current call stack to the eventlog (easy)
2. Log an event each time a cost centre is entered (more overhead)
Implementation notes:
1. Modifying `Proftimer.handleProfTick`
2. Modifying `ENTER_CCS_THUNK (perhaps)?`https://gitlab.haskell.org/ghc/ghc/-/issues/16908Profiling samples on the eventlog are all given very similar timestamp2019-09-18T08:03:44ZMatthew PickeringProfiling samples on the eventlog are all given very similar timestampOn a simple test program is seems that events are not properly added to the eventlog buffer. I would at least like to understand why this is.
Starting from !1340
```
module Main where ...On a simple test program is seems that events are not properly added to the eventlog buffer. I would at least like to understand why this is.
Starting from !1340
```
module Main where
main = print $ foldl (+) 0 [1..10000000]
```
```
ghc Run.hs -prof
./Run +RTS -hb -l-au
```
The samples in the resulting eventlog are all too close together. It seems the time comes from when the buffer is flushed rather than
when the events are added to the buffer.
https://gist.github.com/mpickering/6d4a9e47047aa41f452930121688ab27
The `.hp` file looks fine.
https://gist.github.com/e69563c8a3712a3423ba1d287ae9f330https://gitlab.haskell.org/ghc/ghc/-/issues/16698Documentation is not very clear about the nature of an eventlog timestamp2019-07-03T21:37:49ZMatthew PickeringDocumentation is not very clear about the nature of an eventlog timestampWhat is the timestamp in an eventlog? Is it supposed to be POSIX time or just an incrementing counter only useful for ordering?
The documentation doesn't really clarify this but is useful to know if you want to plot events.
https://dow...What is the timestamp in an eventlog? Is it supposed to be POSIX time or just an incrementing counter only useful for ordering?
The documentation doesn't really clarify this but is useful to know if you want to plot events.
https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/runtime_control.html#tracing