Attach stack traces to exceptions
A few months ago I wrote a blog post where I described some minimal changes to GHC's exception infrastructure which would allow us to use DWARF stack information to annotate exceptions.
Today @domenkozar brought up the matter with me on IRC and said that tracking down exceptions is his greatest challenge in production.
Proposal
We modify SomeException
as follows:
data SomeException where
SomeExceptionWithLocation
:: forall e. Exception e
=> Maybe Backtrace -- ^ backtrace, if available
-> e -- ^ the exception
-> SomeException
and add a bidirectional pattern synonym for compatibility:
pattern SomeException e <- SomeExceptionWithLocation _ e
where
SomeException e = SomeExceptionWithLocation Nothing e
Backtrace
is a simple type capturing the various backtrace
data Backtrace
= CostCenterBacktrace GHC.Stack.CCS.CostCentreStack
-- ^ a cost center profiler backtrace
| HasCallStackBacktrace GHC.Stack.CallStack
-- ^ a stack from HasCallStack
| ExecutionBacktrace [GHC.ExecutionStack.Location]
-- ^ a stack unwinding (e.g. DWARF) backtrace
We can then implement a variety of throwIO
actions which attach stack traces and a catch
operation which provides access to the backtrace:
-- | Throws an exception with no stack trace.
throwIOWithoutBacktrace :: Exception e => e -> IO a
-- | Throws an exception with the current cost center stack.
throwIOWithCostCenterStack :: Exception e => e -> IO a
-- | Throws an exception with a stack trace captured via
-- 'GHC.Stack.getStackTrace' with the given stack limit
-- (or unlimited if 'Nothing').
throwIOWithExecutionStack :: Exception e => Maybe Int -> e -> IO a
-- | Throws an exception with a `HasCallStack` stack trace.
throwIOWithCallStack :: HasCallStack => e -> IO a
-- | Catch an exception, providing access to the exception's backtrace:
catchWithLocation :: IO a -> (Maybe Backtrace -> e -> IO a) -> IO a
throwIO :: Exception e => e -> IO a
throwIO = throwIOWithoutBacktrace
Appropriate analogues would be provided for the pure throw
function.
Making backtraces ubiquitous
So far we have discussed a mechanism for attaching backtraces to exceptions in a backwards-compatible manner. However, this doesn't help existing code which call throwIO
. This is where we need to be a bit careful as some of these backtrace-collection mechanisms are potentially expensive. For instance, throwIO = throwIOWithExecutionStack
may open the user to DOS attacks due to the slow nature of DWARF unwinding.
Given that backtraces are primarily a debugging tool, I think a pragmatic, albeit somewhat ugly, solution using global state is appropriate. That is, we expose:
-- | How to collect a backtrace when an exception is thrown.
data BacktraceMechanism
= NoBacktrace
-- ^ don't collect a backtrace
| CostCenterBacktrace
-- ^ collect a cost center stacktrace (only available when built with profiling)
| ExecutionStackBacktrace (Maybe Int)
-- ^ use execution stack unwinding with given limit
setGlobalBacktraceMechanism :: BacktraceMechanism -> IO ()
The default global backtrace mechanism will be NoBacktrace
.
We can then redefine throwIO
as:
throwIO :: Exception e => e -> IO a
throwIO exc = do
m <- getGlobalBacktraceMechanism
case m of
NoBacktraces -> throwIONoBacktrace exc
CostCenterBacktrace -> throwIOWithCostCenterStack exc
ExecutionStackBacktrace limit -> throwIOWithExecutionStack limit exc
This leaves the policy question of backtrace mechanism selection to the user while preserving backwards compatibility.
The only wrinkle here is that of toException
, since SomeException
values created in this way won't get a Backtrace
. One option here would be to rely on unsafePerformIO
to collect a backtrace when constructing SomeException
. That is rewrite the SomeException
pattern synonym as:
pattern SomeException e <- SomeExceptionWithLocation _ e
where
SomeException e = mkSomeException e
mkSomeException :: Exception e => e -> SomeException
mkSomeException e = unsafePerformIO $ do
bt <- collectBacktrace
return (SomeExceptionWithLocation bt e)
Setting mechanism from environment
Some languages require that the user set an environment variable to enable backtrace reporting (e.g. Rust requires RUST_BACKTRACE=1
). In principle we could expose such a mechanism. I'm a bit unsure of whether this would be a good thing or not.