TVar with unpacked machine integer
Motivation
Sometimes, a programmer wants to put an Int or a Word (both machine-sized) into a TVar. This may happen when there's a transactional variable that acts as a counter. It may also happen when the user has a data type with fewer than 2^32 inhabitants, and they are willing to manually do some bit arithmetic to pack their data better than the compiler will. In either situation, users who want to put Int/Word in a TVar are penalized. TVar can only hold lifted types. That is, even if the Int/Word is only ever used strictly, it is always boxed before being written and unboxed when being read.
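As a rough sketch of the manual bit packing mentioned above, written in C to match the runtime excerpts later in this proposal: a hypothetical two-field record is squeezed into a single machine word with shifts and masks. The names EventState, pack_event, unpack_event, and the field widths are illustrative assumptions, not taken from any existing library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record: which events a file descriptor is registered for,
   plus a small generation counter. Names and widths are assumptions. */
typedef struct {
    uint8_t  interest;    /* bit set: 1 = readable, 2 = writable */
    uint32_t generation;  /* must fit in 24 bits */
} EventState;

#define GENERATION_BITS 24u
#define GENERATION_MASK ((UINT32_C(1) << GENERATION_BITS) - 1u)

/* Pack both fields into the low 32 bits of a machine word. */
static uintptr_t pack_event(EventState s) {
    assert(s.generation <= GENERATION_MASK);
    return ((uintptr_t)s.interest << GENERATION_BITS) | (uintptr_t)s.generation;
}

/* Recover the two fields from a packed word. */
static EventState unpack_event(uintptr_t w) {
    EventState s;
    s.interest   = (uint8_t)((w >> GENERATION_BITS) & 0xFFu);
    s.generation = (uint32_t)(w & GENERATION_MASK);
    return s;
}
```

A value packed this way is exactly the kind of thing one would like to store in a transactional variable without boxing it.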
More concretely, I've recently been working on a networking library that rolls its own event manager. In my macrobenchmark, I've reached the point where the boxed Words in my event manager's TVars are one of the primary sources of allocations. I'd like to eliminate these allocations.
Proposal
User-Facing Changes
Introduce the following primitives and types:
data TIntVar# :: Type -> Type
readTIntVar# :: TIntVar# s -> State# s -> (# State# s, Int# #)
newTIntVar# :: Int# -> State# s -> (# State# s, TIntVar# s #)
writeTIntVar# :: TIntVar# s -> Int# -> State# s -> State# s
readTIntVarIO# :: TIntVar# s -> State# s -> (# State# s, Int# #)
These work just like their TVar# counterparts. However, they all use machine-word values instead of lifted values. The inclusion of "Int" in the names is inspired by the dichotomy established by casArray# and casIntArray#.
Implementation
I've not come up with a satisfactory way to do this. Ideally, it would be possible to reuse most of the existing implementation of TVar. Here are the definitions of relevant runtime data structures as they are implemented today in Closures.h:
typedef struct {
    StgHeader header;
    StgClosure *volatile current_value;
    StgTVarWatchQueue *volatile first_watch_queue_entry;
    StgInt volatile num_updates;
} StgTVar;
/* new_value == expected_value for read-only accesses */
/* new_value is a StgTVarWatchQueue entry when trec in state TREC_WAITING */
typedef struct {
    StgTVar *tvar;
    StgClosure *expected_value;
    StgClosure *new_value;
#if defined(THREADED_RTS)
    StgInt num_updates;
#endif
} TRecEntry;
#define TREC_CHUNK_NUM_ENTRIES 16
typedef struct StgTRecChunk_ {
    StgHeader header;
    struct StgTRecChunk_ *prev_chunk;
    StgWord next_entry_idx;
    TRecEntry entries[TREC_CHUNK_NUM_ENTRIES];
} StgTRecChunk;
And, from ClosureTypes.h, we have:
#define TVAR 41
#define TREC_CHUNK 54
I speculate that it is possible to just add #define TINTVAR NN and reuse all of the existing structs. In this approach, expected_value, new_value, and current_value would sometimes hold a machine int instead of a pointer. Only during GC would it be necessary to distinguish which one it really is (so that the collector knows whether or not to follow expected_value, new_value, and current_value). But this information could be recovered by looking in the header of the TVar (reachable from a TRecEntry through its tvar field) to see whether we are dealing with a TVAR or a TINTVAR. We must also take special care to handle the invariant mentioned in this comment:
/* new_value is a StgTVarWatchQueue entry when trec in state TREC_WAITING */
With a TINTVAR, this could still happen, meaning that while a TRec for a TINTVAR was in state TREC_WAITING, expected_value would be a machine int, but new_value would be a pointer. Even so, I think that only GC code would need to change, and everything else could be reused, performing the int-to-pointer casts right at the boundary in the C code.
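To make the idea above concrete, here is a minimal sketch that uses simplified stand-ins rather than the real RTS definitions. MockHeader, MockTVar, MockTRecEntry, the helper function names, and the number 99 chosen for TINTVAR are all hypothetical; the real closure-type number is left open (NN) above. The sketch shows a machine int being cast into and out of the pointer-typed current_value field at the boundary, and GC-style predicates that decide which fields to follow based on the closure type and on whether the TRec is waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the RTS types; not the real definitions. */
#define TVAR    41
#define TINTVAR 99   /* placeholder: the proposal leaves the number as NN */

typedef struct { int closure_type; } MockHeader;

typedef struct {
    MockHeader header;
    void *volatile current_value;   /* pointer for TVAR, raw int for TINTVAR */
} MockTVar;

typedef struct {
    MockTVar *tvar;
    void *expected_value;
    void *new_value;
} MockTRecEntry;

/* Boundary casts: the only places an int is reinterpreted as a pointer.
   Assumes intptr_t and pointers have the same width. */
static void write_int(MockTVar *tv, intptr_t v) {
    assert(tv->header.closure_type == TINTVAR);
    tv->current_value = (void *)v;
}

static intptr_t read_int(const MockTVar *tv) {
    assert(tv->header.closure_type == TINTVAR);
    return (intptr_t)tv->current_value;
}

/* GC side: may current_value / expected_value be followed as pointers? */
static bool value_is_pointer(const MockTVar *tv) {
    return tv->header.closure_type == TVAR;
}

/* new_value is a watch-queue pointer whenever the TRec is waiting,
   regardless of the TVar's kind; otherwise it follows the TVar's kind. */
static bool new_value_is_pointer(const MockTRecEntry *e, bool trec_waiting) {
    return trec_waiting || value_is_pointer(e->tvar);
}
```

In this sketch the waiting case is the only one where expected_value and new_value of the same entry are treated differently, which is the special case the invariant comment calls out.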
Interaction with other proposals
Improving STM Performance with Transactional Structs by Ryan Yates and Michael Scott comes to mind. If my reading of the paper is correct, TIntVar should still be better than TStruct in the single-machine-word case since it avoids an indirection.
Feedback
I'd like some feedback on:
- Is this even correct?
- Is it worth it? GC will be slightly negatively impacted. More importantly in my mind, adding these primitives means they must continue to be supported. If anyone ever tried to reimplement STM in Cmm (which I think I've seen Ryan Yates mention in a presentation), I think this might make it harder to do.