Propagate breakpoints when inlining
The goal of this MR is to enable breakpoint preservation when inlining code as described in #23394 (closed).
Update: I've gained some confidence about the constraints involved in this topic and came up with a solution that seems reasonable to me. I've kept my previously confused thoughts about it in this description below in the collapsible section.
Quick summary of the changes:
- Add constructor
IfaceBreakpoint
toIfaceTickish
- Store breakpoint data in interface files
- Store
BreakArray
for the breakpoint's module, not the current module, in BCOs - Store module name in BCOs instead of
Unique
, since theUnique
from anIface
doesn't match the modules in GHCi's state - Allocate module name in
ModBreaks
, likeBreakArray
- Lookup breakpoint by module name in GHCi
- Skip creating breakpoint instructions when no
ModBreaks
are available, rather than injectingModBreaks
in the linker when breakpoints are enabled, and panicking whenModBreaks
is missing
I decided that the contortions made with the ModBreaks
in the BCGen are unnecessary, so I implemented that when ModBreaks
are Nothing
(either from the arg given to byteCodeGen
or the HPT for foreign-module breakpoints), we assume that breakpoints are undesired for this expression and completely omit them from the BCOs.
My understanding is that ModBreaks
are optional primarily because of TH (and pending !10466 (closed) because the user explicitly disabled them), but please disabuse me of that misconception if I'm mistaken.
In any case, it doesn't seem dangerous to not panic when ModBreaks
are missing for obscure reasons (there's a test in which a boot module is reloaded and ModBreaks
is reset, which results in a core lint error when adding inlining, but that might better be handled in !10550 (closed) or !10563 (closed)
Work sponsored by Tweag
Previous description
My implementation is comically clunky since I've figured out how all of this works while writing it, so I'd appreciate any advice about more appropriate solutions.
The first assumption I made is that breakpoint data needs to be propagated across the Iface
boundary, including the module in which it was first created, which was pretty straightforward to achieve.
As a result, an inlined function's breakpoint would show up in StgToByteCode.schemeER_wrk
with its defining module attached.
As a naive attempt, I replaced the Unique
that was previously obtained from the currently processed module with the one from the breakpoint, but that did not have the desired result – the Unique
stored in GHCi's state turned out to be different from the one in the Iface
data.
My next assumptions about what needed to be done were:
- Like cost centres, the breakpoint information relayed to GHCi must contain the module name instead of the
Unique
- The array that the RTS consults when encountering a break site needs to be the
modBreaks_flags
for the breakpoint's module, not the current module's referenced byBCOPtrBreakArray
For the first part, I had a hard time figuring out how to send a string from StgToByteCode
through the RTS to GHCi, and after many trials and errors I achieved it by using the GHCi message mallocData
.
This is certainly a pretty bad solution, since it allocates a new string for each breakpoint with the same contents.
A nicer solution might be to add another ForeignRef
to ModBreaks
, but that is only available as Maybe
for the current module (although mallocData
also isn't unconditionally available, so both solutions are equally bad in that regard).
On the GHCi side, I then use the module to lookup the breakpoints in the HPT, which works fine.
I had some trouble figuring out what type to use for the BreakpointCallback
argument that doesn't segfault, and only had success with a Word#
that I then assembled into a RemotePtr
.
There's surely a better way to do this as well.
For the break array, I looked up the ModBreaks
of the breakpoints module in the HPT in StgToByteCode
.
I don't know whether it's guaranteed that all local dependencies are present in there, and I assume that this won't work at all for Iface
data from an object file – for the latter, I assume we can't use breakpoints anyway, so maybe the breakpoint should just be discarded in that case.
Anyway, right now the code assumes that a missing ModBreaks
means that the breakpoint is from the current module.
I modified BCOPtrBreakArray
to take a Maybe (ForeignRef BreakArray)
since that was the only way I managed to encode the pointer without segfaulting.
Ultimately, I didn't have to make any changes to the RTS – all arguments have the same representation as before.
As you can see from the test I added, it works for a simple case! But there is certainly a lot to take care of still, or do differently, so I'd appreciate any suggestions and elucidations of my mistakes.