Skip to content

Propagate breakpoints when inlining

The goal of this MR is to enable breakpoint preservation when inlining code as described in #23394 (closed).

Update: I've gained some confidence about the constraints involved in this topic and came up with a solution that seems reasonable to me. I've kept my previously confused thoughts about it in this description below in the collapsible section.

Quick summary of the changes:

  • Add constructor IfaceBreakpoint to IfaceTickish
  • Store breakpoint data in interface files
  • Store BreakArray for the breakpoint's module, not the current module, in BCOs
  • Store module name in BCOs instead of Unique, since the Unique from an Iface doesn't match the modules in GHCi's state
  • Allocate module name in ModBreaks, like BreakArray
  • Lookup breakpoint by module name in GHCi
  • Skip creating breakpoint instructions when no ModBreaks are available, rather than injecting ModBreaks in the linker when breakpoints are enabled, and panicking when ModBreaks is missing

I decided that the contortions made with the ModBreaks in the BCGen are unnecessary, so I implemented that when ModBreaks are Nothing (either from the arg given to byteCodeGen or the HPT for foreign-module breakpoints), we assume that breakpoints are undesired for this expression and completely omit them from the BCOs. My understanding is that ModBreaks are optional primarily because of TH (and pending !10466 (closed) because the user explicitly disabled them), but please disabuse me of that misconception if I'm mistaken. In any case, it doesn't seem dangerous to not panic when ModBreaks are missing for obscure reasons (there's a test in which a boot module is reloaded and ModBreaks is reset, which results in a core lint error when adding inlining, but that might better be handled in !10550 (closed) or !10563 (closed)

Work sponsored by Tweag

Previous description

My implementation is comically clunky since I've figured out how all of this works while writing it, so I'd appreciate any advice about more appropriate solutions.

The first assumption I made is that breakpoint data needs to be propagated across the Iface boundary, including the module in which it was first created, which was pretty straightforward to achieve.

As a result, an inlined function's breakpoint would show up in StgToByteCode.schemeER_wrk with its defining module attached. As a naive attempt, I replaced the Unique that was previously obtained from the currently processed module with the one from the breakpoint, but that did not have the desired result – the Unique stored in GHCi's state turned out to be different from the one in the Iface data.

My next assumptions about what needed to be done were:

  • Like cost centres, the breakpoint information relayed to GHCi must contain the module name instead of the Unique
  • The array that the RTS consults when encountering a break site needs to be the modBreaks_flags for the breakpoint's module, not the current module's referenced by BCOPtrBreakArray

For the first part, I had a hard time figuring out how to send a string from StgToByteCode through the RTS to GHCi, and after many trials and errors I achieved it by using the GHCi message mallocData. This is certainly a pretty bad solution, since it allocates a new string for each breakpoint with the same contents. A nicer solution might be to add another ForeignRef to ModBreaks, but that is only available as Maybe for the current module (although mallocData also isn't unconditionally available, so both solutions are equally bad in that regard).

On the GHCi side, I then use the module to lookup the breakpoints in the HPT, which works fine. I had some trouble figuring out what type to use for the BreakpointCallback argument that doesn't segfault, and only had success with a Word# that I then assembled into a RemotePtr. There's surely a better way to do this as well.

For the break array, I looked up the ModBreaks of the breakpoints module in the HPT in StgToByteCode. I don't know whether it's guaranteed that all local dependencies are present in there, and I assume that this won't work at all for Iface data from an object file – for the latter, I assume we can't use breakpoints anyway, so maybe the breakpoint should just be discarded in that case. Anyway, right now the code assumes that a missing ModBreaks means that the breakpoint is from the current module.

I modified BCOPtrBreakArray to take a Maybe (ForeignRef BreakArray) since that was the only way I managed to encode the pointer without segfaulting. Ultimately, I didn't have to make any changes to the RTS – all arguments have the same representation as before.

As you can see from the test I added, it works for a simple case! But there is certainly a lot to take care of still, or do differently, so I'd appreciate any suggestions and elucidations of my mistakes.

Edited by Torsten Schmits

Merge request reports