GHC issueshttps://gitlab.haskell.org/ghc/ghc/-/issues2019-06-07T16:35:53Zhttps://gitlab.haskell.org/ghc/ghc/-/issues/16697Eventlog Heap Profiling events and profiling time2019-06-07T16:35:53ZMatthew PickeringEventlog Heap Profiling events and profiling timeI'm not sure how to properly reconstruct a .hp graph from an eventlog.
The problem is that the times in `.hp` file subtract the time spend profiling from the sample times. If you use the
timestamps in the eventlog and the heap profile ...I'm not sure how to properly reconstruct a .hp graph from an eventlog.
The problem is that the times in `.hp` file subtract the time spend profiling from the sample times. If you use the
timestamps in the eventlog and the heap profile events then the timestamps clearly don't discount the time actually spent profiling.
There are also no event which happens at the end of a sample so you can't work out how long the sampling period was afterwards unless there is a guarantee that only events related to heap profiling will happen after the `HeapProfSampleBegin` and before the end of the sample.
I'm not sure if this skew really matters too much, the graphs have approximately the same shape.
Some solutions would be
1. Add a `HeapProfEndSample` event so that future times can be adjusted
2. Add the computed sample start time as a field to `HeapProfBeginSample`
The first sounds preferable so that other events can be more easily correlated with heap profiles.
What do you think @bgamari ?https://gitlab.haskell.org/ghc/ghc/-/issues/16744Teach the RTS to enable tracing (eventlog) when built in a profiling-enabled way2019-06-11T23:15:14ZAlp MestanogullariTeach the RTS to enable tracing (eventlog) when built in a profiling-enabled wayAs a follow-up to !1070 and !1093: in addition to having `-debug => -eventlog` and `-prof => -eventlog` implications hardcoded in both build systems, the RTS itself is "aware" that building it in a debug-enabled way also makes it capable...As a follow-up to !1070 and !1093: in addition to having `-debug => -eventlog` and `-prof => -eventlog` implications hardcoded in both build systems, the RTS itself is "aware" that building it in a debug-enabled way also makes it capable of producing eventlogs, thanks to these lines: https://gitlab.haskell.org/ghc/ghc/blob/master/includes/rts/Config.h#L29-37
``` c
/* DEBUG implies TRACING and TICKY_TICKY */
#if defined(DEBUG)
#if !defined(TRACING)
#define TRACING
#endif
...
#endif
```
Building the RTS with profiling support should have a similar effect (defining `TRACING`), but that is not the case at the moment. I'm making this issue as a reminder for myself or others to implement this. It should be a fairly straightforward patch, unless I'm missing something.8.10.1https://gitlab.haskell.org/ghc/ghc/-/issues/16698Documentation is not very clear about the nature of an eventlog timestamp2019-07-03T21:37:49ZMatthew PickeringDocumentation is not very clear about the nature of an eventlog timestampWhat is the timestamp in an eventlog? Is it supposed to be POSIX time or just an incrementing counter only useful for ordering?
The documentation doesn't really clarify this but is useful to know if you want to plot events.
https://dow...What is the timestamp in an eventlog? Is it supposed to be POSIX time or just an incrementing counter only useful for ordering?
The documentation doesn't really clarify this but is useful to know if you want to plot events.
https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/runtime_control.html#tracinghttps://gitlab.haskell.org/ghc/ghc/-/issues/16908Profiling samples on the eventlog are all given very similar timestamp2019-09-18T08:03:44ZMatthew PickeringProfiling samples on the eventlog are all given very similar timestampOn a simple test program is seems that events are not properly added to the eventlog buffer. I would at least like to understand why this is.
Starting from !1340
```
module Main where ...On a simple test program is seems that events are not properly added to the eventlog buffer. I would at least like to understand why this is.
Starting from !1340
```
module Main where
main = print $ foldl (+) 0 [1..10000000]
```
```
ghc Run.hs -prof
./Run +RTS -hb -l-au
```
The samples in the resulting eventlog are all too close together. It seems the time comes from when the buffer is flushed rather than
when the events are added to the buffer.
https://gist.github.com/mpickering/6d4a9e47047aa41f452930121688ab27
The `.hp` file looks fine.
https://gist.github.com/e69563c8a3712a3423ba1d287ae9f330https://gitlab.haskell.org/ghc/ghc/-/issues/17322Add eventlog events for `-p` profiling2019-10-23T17:19:28ZMatthew PickeringAdd eventlog events for `-p` profilingThe summary provided by `-p` profiling doesn't give an extremely accurate picture of the execution of a program. You can see generally how long something takes but you don't know what time it was called.
If we want to use tools like [sp...The summary provided by `-p` profiling doesn't give an extremely accurate picture of the execution of a program. You can see generally how long something takes but you don't know what time it was called.
If we want to use tools like [speedscope](https://www.speedscope.app/) the information needs to be more fine grained.
I can think of two potential options.
1. On each tick dump the current call stack to the eventlog (easy)
2. Log an event each time a cost centre is entered (more overhead)
Implementation notes:
1. Modifying `Proftimer.handleProfTick`
2. Modifying `ENTER_CCS_THUNK (perhaps)?`https://gitlab.haskell.org/ghc/ghc/-/issues/10350Should be able to specify path for eventlog output.2020-04-10T13:38:41ZliteronShould be able to specify path for eventlog output.Currently it is a fixed path, but the binary might run in an env where write access is restricted.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Vers...Currently it is a fixed path, but the binary might run in an env where write access is restricted.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 7.10.1 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | simonmar |
| Operating system | |
| Architecture | |
</details>
<!-- {"blocked_by":[],"summary":"Should be able to specify path for eventlog output.","status":"New","operating_system":"","component":"Runtime System","related":[],"milestone":"","resolution":"Unresolved","owner":{"tag":"OwnedBy","contents":"simonmar"},"version":"7.10.1","keywords":[],"differentials":[],"test_case":"","architecture":"","cc":["simonmar"],"type":"Bug","description":"Currently it is a fixed path, but the binary might run in an env where write access is restricted.","type_of_failure":"OtherFailure","blocking":[]} -->Simon MarlowSimon Marlowhttps://gitlab.haskell.org/ghc/ghc/-/issues/18210Eventlog flush racy?2020-05-27T07:58:03ZBen GamariEventlog flush racy?While responding to a thread on `ghc-devs` I noticed that there is apparently zero locking surrounding eventlog flushing. This means that if multiple capabilities fill their capabilities' event buffers simultaneously, begin flushing, and...While responding to a thread on `ghc-devs` I noticed that there is apparently zero locking surrounding eventlog flushing. This means that if multiple capabilities fill their capabilities' event buffers simultaneously, begin flushing, and those flushes require multiple `write`s we may get interleaving of event data chunks. This seems quite bad.https://gitlab.haskell.org/ghc/ghc/-/issues/18043RTS flushTrace function doesn't flush the eventlog2020-11-24T10:08:02ZDmitry IvanovRTS flushTrace function doesn't flush the eventlog## Summary
The `flushTrace` function flushes the eventlog file descriptor, but doesn't flush the event buffers to that descriptor.
For example, here's a fragment of RTS startup code:
```
/* Trace some basic information about the p...## Summary
The `flushTrace` function flushes the eventlog file descriptor, but doesn't flush the event buffers to that descriptor.
For example, here's a fragment of RTS startup code:
```
/* Trace some basic information about the process */
traceWallClockTime(); // (1)
traceOSProcessInfo(); // (2)
flushTrace(); // (3)
```
The intention of this code is to have this info at the very beginning of the eventlog. But I observe these events near the very end for my programs. Looks like (1) and (2) write to some memory buffers but don't touch the eventlog file descriptor, and (3) is flushing said descriptor but doesn't flush said memory buffer.
## Steps to reproduce
Compile some meaningful multithreaded program with `-eventlog -threaded` and run with `+RTS -l -N`. I could probably attach something small but sufficient if necessary.
Observe that the Wallclock event is somewhere not at the beginning of the eventlog.
## Expected behavior
WallClock event being at the beginning of the eventlog file, before events emitted by the application code like `Debug.Trace.traceEventIO`.
## Environment
* GHC version used: 8.8.3, 8.10.1
Optional:
* Operating System: macOS high sierra
* System Architecture: x86_649.2.1https://gitlab.haskell.org/ghc/ghc/-/issues/16710Work out and document how to build GHC with eventlog enabled2021-02-26T16:30:26ZMatthew PickeringWork out and document how to build GHC with eventlog enabledI want to build GHC with tracing (the eventlog) enabled. So far I have not succeeded by trying to replicate the approach taken
by `ghcDebugged`.
The criteria for success is invoking the resulting executable with `+RTS -l` would produc...I want to build GHC with tracing (the eventlog) enabled. So far I have not succeeded by trying to replicate the approach taken
by `ghcDebugged`.
The criteria for success is invoking the resulting executable with `+RTS -l` would produce an eventlog.Make removalMatthew PickeringAlp MestanogullariMatthew Pickeringhttps://gitlab.haskell.org/ghc/ghc/-/issues/19435Eventlog flushing with threaded is broken2021-03-01T15:19:39ZMatthew PickeringEventlog flushing with threaded is brokenCalling `flushEventLog` doesn't flush the buffers for each capability.
This is because the flush is in the wrong place. `yieldCapability` is not called on the normal code path for `stopAllCapabilities`, `yieldCapability` is only called ...Calling `flushEventLog` doesn't flush the buffers for each capability.
This is because the flush is in the wrong place. `yieldCapability` is not called on the normal code path for `stopAllCapabilities`, `yieldCapability` is only called when a sync has already been requested for another reason.
The probably correct place to flush is in between the `stop/releaseAllCapabilities` call.https://gitlab.haskell.org/ghc/ghc/-/issues/19436Eventlog flushing in non-threaded is broken2021-03-01T16:19:18ZMatthew PickeringEventlog flushing in non-threaded is broken`flushEventLog` in non-threaded way doesn't flush the `capEventBuf[0]` event buffer.`flushEventLog` in non-threaded way doesn't flush the `capEventBuf[0]` event buffer.Matthew PickeringMatthew Pickeringhttps://gitlab.haskell.org/ghc/ghc/-/issues/19078Restarting event log results in invalid output2021-03-05T21:33:31ZBen GamariRestarting event log results in invalid outputCurrently if one calls `endEventLogging()`, followed by `beginEventLogging()`, the eventlog that results has block markers before the header, rendering the output unreadable to `ghc-events`. The problem is that `endEventLogging()` calls ...Currently if one calls `endEventLogging()`, followed by `beginEventLogging()`, the eventlog that results has block markers before the header, rendering the output unreadable to `ghc-events`. The problem is that `endEventLogging()` calls `printAndClearEventsBuf(&eventBuf)`, which posts a marker after flushing the buffer. `postHeaderEvents()` then fails to clear this marker before posting the header.
The solution seems clear: call `resetEventsBuf()` in either `endEventLogging()` or `postHeaderEvents()`.https://gitlab.haskell.org/ghc/ghc/-/issues/3869RTS GC Statistics from -S should be logged via the eventlog system2021-03-05T21:34:44ZcjsRTS GC Statistics from -S should be logged via the eventlog systemThe -Sfilename option to the RTS gives useful GC statistics, but it's hard to correlate these with other events, particularly to see if GC is interrupting critical sections in mutator threads. If the same information were instead logged ...The -Sfilename option to the RTS gives useful GC statistics, but it's hard to correlate these with other events, particularly to see if GC is interrupting critical sections in mutator threads. If the same information were instead logged via the eventlog system (perhaps enabled via a "-lg" option) one could get more insight into the garbage generation and collection behaviour of one's program.
Note that it's probably not necessary also to store the information given at the end of the run with both "-s" and "-S", though it may be interesting to contemplate moving this sort of thing into the eventlog file as well.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | -------------- |
| Version | 6.12.1 |
| Type | FeatureRequest |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>
<!-- {"blocked_by":[],"summary":"RTS GC Statistics from -S should be logged via the eventlog system","status":"New","operating_system":"","component":"Runtime System","related":[],"milestone":"","resolution":"Unresolved","owner":{"tag":"Unowned"},"version":"6.12.1","keywords":["collection,","garbage","gc,","statistics"],"differentials":[],"test_case":"","architecture":"","cc":[""],"type":"FeatureRequest","description":"The -Sfilename option to the RTS gives useful GC statistics, but it's hard to correlate these with other events, particularly to see if GC is interrupting critical sections in mutator threads. If the same information were instead logged via the eventlog system (perhaps enabled via a \"-lg\" option) one could get more insight into the garbage generation and collection behaviour of one's program.\r\n\r\nNote that it's probably not necessary also to store the information given at the end of the run with both \"-s\" and \"-S\", though it may be interesting to contemplate moving this sort of thing into the eventlog file as well.","type_of_failure":"OtherFailure","blocking":[]} -->8.0.1https://gitlab.haskell.org/ghc/ghc/-/issues/19382Interupting doesn't seem to flush eventlog buffers2021-06-19T15:18:51ZMatthew PickeringInterupting doesn't seem to flush eventlog buffersWhen heap profiling long-running applications it seems that killing a process with `Ctrl-C` doesn't cause the eventlog buffers to be flushed. The signal handle seems to be designed to flush "profiling output" but the control flow around ...When heap profiling long-running applications it seems that killing a process with `Ctrl-C` doesn't cause the eventlog buffers to be flushed. The signal handle seems to be designed to flush "profiling output" but the control flow around this area made my head spin, so I couldn't see where the flush was supposed to happen.https://gitlab.haskell.org/ghc/ghc/-/issues/21070gen_event_types requires an audit2022-04-01T17:27:27ZMatthew Pickeringgen_event_types requires an auditDue to the recent refactor of the eventlog event generation I discovered two bugs due to incorrect information in `gen_event_types.py`, in particular the size for `EVENT_IPE` and `EVENT_MEM_RETURN` were incorrect. The whole file should b...Due to the recent refactor of the eventlog event generation I discovered two bugs due to incorrect information in `gen_event_types.py`, in particular the size for `EVENT_IPE` and `EVENT_MEM_RETURN` were incorrect. The whole file should be audited to ensure there are no more of these bugs.9.4.1Ben GamariBen Gamarihttps://gitlab.haskell.org/ghc/ghc/-/issues/21537Add fragmentation statistics to GHC.Stats2022-09-22T10:35:26ZTeo CamarasuAdd fragmentation statistics to GHC.Stats## Motivation
Fragmentation is a key statistic when using the non-moving garbage collector. The event log has information about heap fragmentation, but it would be helpful to have the same information available from `GHC.Stats`.
As far...## Motivation
Fragmentation is a key statistic when using the non-moving garbage collector. The event log has information about heap fragmentation, but it would be helpful to have the same information available from `GHC.Stats`.
As far as I understand we can already get at this statistic by taking `gcdetails_mem_in_use_bytes - gcdetails_live_bytes `. But I'm not sure that's right. And even if that's true, I think it would be more user friendly to make this stat available in an explicit way.
## Proposal
Add a field to `GHC.Stats.GCDetails` called `gcdetails_fragmentation_bytes` that gives the same stats as the event log fragmentation counter.
I'm happy to implement this.9.6.1Teo CamarasuTeo Camarasuhttps://gitlab.haskell.org/ghc/ghc/-/issues/21813`-la` doesn't enable all event types2022-11-23T18:18:12ZTeo Camarasu`-la` doesn't enable all event types
If you look at the implementation of `read_trace_flags` (https://gitlab.haskell.org/ghc/ghc/-/blob/master/rts/RtsFlags.c#L2335) you will see that `-la` doesn't in fact turn on all classes of events.
It seems to be missing `nonmoving_g...
If you look at the implementation of `read_trace_flags` (https://gitlab.haskell.org/ghc/ghc/-/blob/master/rts/RtsFlags.c#L2335) you will see that `-la` doesn't in fact turn on all classes of events.
It seems to be missing `nonmoving_gc` and `ticky`.
I'd be happy to submit a fix.Teo CamarasuTeo Camarasuhttps://gitlab.haskell.org/ghc/ghc/-/issues/22603Wasm backend testsuite failure: T200062023-02-01T13:31:42ZCheng ShaoWasm backend testsuite failure: T20006`T20006` fails with:
```
Wrong exit code for T20006(normal)(expected 0 , actual 1 )
Stderr ( T20006 ):
error: Uncaught RuntimeError: divide by zero
at setupRtsFlags (wasm://wasm/005c2cca:1:781073)
at hs_init_ghc (wasm://wasm/005...`T20006` fails with:
```
Wrong exit code for T20006(normal)(expected 0 , actual 1 )
Stderr ( T20006 ):
error: Uncaught RuntimeError: divide by zero
at setupRtsFlags (wasm://wasm/005c2cca:1:781073)
at hs_init_ghc (wasm://wasm/005c2cca:1:809573)
at hs_main (wasm://wasm/005c2cca:1:790284)
at main (wasm://wasm/005c2cca:1:27177)
at __main_void (wasm://wasm/005c2cca:1:1116161)
at _start (wasm://wasm/005c2cca:1:26644)
at Context.start (https://deno.land/std@0.167.0/wasi/snapshot_preview1.ts:1765:7)
at file:///workspace/.ghc-wasm/wasm-run/bin/wasm-run.js:24:9
*** unexpected failure for T20006(normal)
```
`T20006.hs`:
```haskell
module Main where
main :: IO ()
main = return ()
```
Run opts:
```python
test('T20006', [extra_run_opts('+RTS --eventlog-flush-interval=1')],
compile_and_run,
[''])
```Cheng ShaoCheng Shaohttps://gitlab.haskell.org/ghc/ghc/-/issues/20221Very long command line leads to an invalid eventlog2023-05-03T11:31:31ZMatthew PickeringVery long command line leads to an invalid eventlogReproduce with:
```
echo "module A100 where" > A100.hs
echo -optP-D__F{1..10000}__ > T14697-flags
ghc --make A100 $(cat T14697-flags) +RTS -l
```
Produces an eventlog which `ghc-events` fails to parse.
```
[nix-shell:~/eventlog2htm...Reproduce with:
```
echo "module A100 where" > A100.hs
echo -optP-D__F{1..10000}__ > T14697-flags
ghc --make A100 $(cat T14697-flags) +RTS -l
```
Produces an eventlog which `ghc-events` fails to parse.
```
[nix-shell:~/eventlog2html]$ cabal run eventlog2html -- ghc.eventlog
eventlog2html: Ix{Int}.index: Index (24415) out of range ((0,212))
```https://gitlab.haskell.org/ghc/ghc/-/issues/21647Crash on shutdown when eventlog is used2023-09-07T09:35:25ZAdam GundryCrash on shutdown when eventlog is used## Summary
When using the eventlog in a large server application, I observe
```
munmap_chunk(): invalid pointer
Aborted (core dumped)
```
when the program exits. The problem does not arise without the `-l` flag. The resulting eventlog c...## Summary
When using the eventlog in a large server application, I observe
```
munmap_chunk(): invalid pointer
Aborted (core dumped)
```
when the program exits. The problem does not arise without the `-l` flag. The resulting eventlog cannot be read by `eventlog2html`, presumably because it is incomplete or corrupted.
A backtrace indicates the problem is that `finishCapEventLogging` tries to call `stgFree` on a buffer that has already been freed.
```
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007fcc82cde859 in __GI_abort () at abort.c:79
#2 0x00007fcc82d4926e in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7fcc82e73298 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3 0x00007fcc82d512fc in malloc_printerr (str=str@entry=0x7fcc82e751e0 "munmap_chunk(): invalid pointer") at malloc.c:5347
#4 0x00007fcc82d5154c in munmap_chunk (p=<optimised out>) at malloc.c:2830
#5 0x00007fcc834004e6 in finishCapEventLogging () at rts/eventlog/EventLog.c:778
#6 0x00007fcc833d1a94 in hs_exit_ (wait_foreign=false) at rts/RtsStartup.c:526
#7 0x00007fcc833d2055 in shutdownHaskellAndExit (n=1, fastExit=<optimised out>) at rts/RtsStartup.c:652
#8 0x00007fcc84206680 in ?? () from /opt/ghc/9.2.2/lib/ghc-9.2.2/base-4.16.1.0/libHSbase-4.16.1.0-ghc9.2.2.so
#9 0x0000000000000000 in ?? ()
```
I don't see why `finishCapEventLogging` would be called twice though.
## Steps to reproduce
Unfortunately this is a large commercial codebase so I can't post a public reproducer, and I'm not sure where to start with extracting a minimal example.
## Expected behavior
`finishCapEventLogging` should not double free the buffer.
## Environment
* GHC version used: 9.2.2
* Operating System: Linux
* System Architecture: x86_649.8.1