pmaier · 72e4170c
--- a/static-pointers/typesafe-decoding.md
+++ b/static-pointers/typesafe-decoding.md
+# Type-safe encoding and decoding of static pointers
+
+
+This page attempts to spell out in detail a design of the GHC
+`StaticPointers` language extension, in particular with regard to the
+assumptions for type-safe and robust encoding and decoding of static pointers.
+
+
+This proposal should be read in connection with
+[the design and implementation of static pointers](static-pointers),
+which explains what static pointers are and explores the broader design space
+but deliberately omits encoding details. Also relevant is the static pointers
+[implementation plan for GHC 7.10](static-pointers/implementation-plan).
+
+**Not discussed in this proposal:**
+
+- Type-safe encoding and decoding of polymorphic static pointers.
+  The [static pointers design](static-pointers) page sets out ideas
+  how to deal with rank-1 types.
+- Decoding static pointers with existentials.
+  This requires the [typed Typeable proposal](typeable).
+
+---
+
+### Proposed `StaticPointers` API
+
+```wiki
+module Data.StaticPtr
+  ( StaticKey
+  , StaticPtr
+  , DynStaticPtr(..)
+  , StaticInfo(..)
+  , staticKey
+  , deRefStaticPtr
+  , lookupStaticPtr
+  , lookupDynStaticPtr
+  , lookupStaticInfo
+  , sptKeys
+  , sptFingerprint
+  ) where
+
+-- A cryptographic hash as a key for static expressions; serialisable and instance of `Eq` and `Ord`.
+type StaticKey = Fingerprint
+
+-- A static pointer refering to a static expression of type `t`.
+data StaticPtr t
+
+-- A `StaticPtr t` wrapped with a TypeRep of `t`.
+data DynStaticPtr where
+  DSP :: (Typeable a) => StaticPtr a -> DynStaticPtr
+
+-- Human-comprehensible debug info identifying a static expression.
+data StaticInfo = StaticInfo
+       { staticModule :: String     -- context: mondule containing static expr
+       , staticLoc    :: (Int,Int)  -- src location (line/column) of static expr
+       } deriving (Eq, Ord, Show)
+
+-- All `StaticKey`s known to the SPT, in ascending order.
+sptKeys :: [StaticKey]
+
+-- Computes a fingerprint of the whole SPT.
+sptFingerprint :: Fingerprint
+
+-- Operations on `StaticPtr`s; constant time, very fast.
+staticKey      :: StaticPtr t -> StaticKey
+deRefStaticPtr :: StaticPtr t -> t
+
+-- SPT lookups; constant time, reasonably fast.
+lookupDynStaticPtr :: StaticKey -> Maybe DynStaticPtr
+lookupStaticInfo   :: StaticKey -> Maybe StaticInfo
+
+-- SPT lookup and dynamic type check.
+lookupStaticPtr :: (Typeable t) => StaticKey -> Either (Maybe StaticInfo) (StaticPtr t)
+lookupStaticPtr sk = case do { DSP sp <- lookupDynStaticPtr sk; cast sp } of
+  Nothing -> Left $! lookupStaticInfo sk  -- key unknown or dynamic typecheck failed
+  Just sp -> Right sp
+```
+
+
+Some remarks about the API:
+
+- The API isn't quite complete. Not listed is the expression former
+  `static <expr>`, which constructs a **static pointer** of type `StaticPtr t`
+  provided `<expr>` is a **static expression** (i.e. its free term
+  variables are top-level) of type `t` and `t` is `Typeable`.
+
+- The API is tied to the compiler and runtime system.
+  The compiler constructs all values of type `StaticKey`, `StaticInfo`,
+  `DynStaticPtr` and `StaticPtr`.
+  The runtime system maintains an immutable hash table mapping `StaticKey`s
+  to SPT entries, which `sptKeys` and the `lookup*` operations access.
+  Only the convenience function `lookupStaticPtr` (and possibly the watermark
+  `sptFingerprint`) could be moved to an independent library.
+
+- The CAF `sptFingerprint` is a fingerprint (cryptographic hash value) of
+  the whole SPT and can be used to check whether the SPTs on different nodes
+  are identical. If `StaticKey`s are sufficiently robust (see next section)
+  the fingerprint can be computed by hashing the list `sptKeys`.
+
+- The `StaticPtr` type is abstract but the other types are deliberately not.
+
+  - `StaticKey` is concrete so serialisation can be left to client code.
+  - `StaticInfo` is concrete so client code can produce human-readable error
+    messages and debug output.
+  - `DynStaticPtr` is concrete so client code can pattern match and implement
+    their own type casts (including unsafe casts bypassing the dynamic
+    type checks).
+
+- The API suggests
+
+  - that a `StaticPtr` has at least two fields: one for the `StaticKey`
+    and one holding the underlying static expression; and
+  - that an SPT entry has at least two fields: one holding a `DynStaticPtr`
+    and one storing the `StaticInfo`.
+
+- Naming: There is no `Ptr` in the type name `StaticInfo` because the info
+  relates to a particular *static expression* rather than a particular
+  *static pointer*. The same is true for `StaticKey` (see next section).
+
+### Computing `StaticKey`
+
+
+For every static expression `<expr>` there is an associated key.
+We postulate the following requirements for this `StaticKey`.
+
+- Primary requirement: The `StaticKey` must uniquely identify `<expr>`.
+- Secondary requirement: The `StaticKey` should not change unless `<expr>`
+  itself changes.
+
+
+The primary requirement is essential; thanks to it `StaticKey`s faithfully
+encode their static expressions.
+The secondary requirement is not essential but provides robustness
+in the face of simple source code changes like re-ordering of declarations.
+
+
+For the purpose of demonstration, here is a concrete proposal for how to
+compute a `StaticKey`.
+
+```wiki
+static_key = cryptographic_hash(unique_module_identifier(<expr>), canonical_source(<expr>))
+```
+
+
+That is, a `StaticKey` is a hash computed from two parts: the actual
+source code of the static `<expr>` and its unique module identifier
+(UMI).  The UMI provides the context for `<expr>`, namely the unique
+module defining the free (hence top-level) names occuring in `<expr>`.
+Like in Cabal, a UMI can be represented as a triple consisting of
+package name, package version and fully qualified module name.  For
+added robustness, the source code of `<expr>` should be converted into
+a canonical form before computing the hash, to abstract from inessentials
+like layout and choice of bound local names. (But note that used-provided
+type annotations are essential and must not be abstracted away.)
+
+
+Computing `StaticKey` as above meets the primary requirement thanks to
+cryptographic hash functions being virtually collision free. But what
+would happen in the unlikely event of a collision? That is, what happens
+if there are two inequivalent static expressions `expr1` and `expr2` that
+both happen to hash to the same key `sk`? 
+
+- Type-safety is not undermined. However collisions can lead to
+  hard-to-debug behaviour.
+
+  Suppose the SPT contains an entry for `expr1` but the intended result
+  when calling `lookupStaticPtr sk` is a static pointer to `expr2`.
+
+  - If the types of `expr1` and `expr2` coincide, the result will be
+    `Right sp1`  where `sp1` is a static pointer to `expr1`. This may
+    lead to surprising runtime behaviour (but it won't crash so the
+    collision might go unnoticed).
+  - If the types of `expr1` and `expr2` differ (which is the more likely
+    scenario), the dynamic typecheck will fail and the result will be
+    `Left (Just info)`. However, the `info` refers to `expr1` rather than
+    the expected `expr2`, possibly confusing the user.
+
+- To limit the potential of collisions, the compiler should check that
+  there are no collisions when building the local SPT. That could be done
+  at link time or even at RTS startup. (The check is expected never to fail
+  so doing it at RTS startup is defensible.)
+
+  Such a check can't guard against cross-site collisions when
+  different sites run different code and hence have differing
+  SPTs. Thus, `expr1` with key `sk` may be registered in the SPT on
+  node1, and `expr2` with the same key `sk` may be registered in the SPT
+  on node2. In this case, the collision can only occur if node2 sends
+  a message containing `sk` to node1 (or vice versa). However, since
+  `expr1` and `expr2` are inequivalent and not shared between node1 and
+  node2, such a message should not be sent in the first place.
+
+**Remark:**
+A `StaticKey` of `<expr>` could be represented by its unique source
+location, consisting of UMI and line/column number of `<expr>`. This
+representation meets the primary requirement. However, it falls short
+on the secondary requirement: Even minor changes, like adding comments
+might change the line/column number of `<expr>`, thus changing its
+`StaticKey` on re-compilation. Which would require re-compilation
+of all client modules referring to `<expr>` by its `StaticKey`.
+
+### Decoding `StaticPtr`
+
+
+The ultimate goal of static pointers is type-safe transmission over the network.
+Here is such a scenario spelt out in detail.
+
+- Node A wants to communicate `sp_A :: StaticPtr t_A`,
+  where `sp_A` refers to a static expression `expr_A :: t_A`.
+
+- Node A encodes `sp_A` by its key `staticKey sp_A`, and sends this
+  (suitably serialised) to node B.
+
+- Node B receives the message and deserialises `sk :: StaticKey`.
+
+- Node B looks up `sk` in its SPT, by calling `lookupDynStaticPtr sk`.
+  This may return `Nothing` in case B's SPT knows no key `sk`, which means
+  that B knows no static expression corresponding to `expr_A` on A.
+  However, if the lookup returns `Just dsp_B` then A and B do agree on the
+  key `sk`.
+
+- `dsp_B` wraps a static pointer `sp_B :: StaticPtr t_B`, which refers to
+  a static expression `expr_B :: t_B` on node B.
+  Since `expr_A` and `expr_B` share the static key `sk`, we infer (by
+  the primary requirement on `StaticKey`s) that they are equivalent.
+  (But note that the nature of the equivalence depends on how exactly
+  `StaticKey`s are constructed. It is possible that `expr_A` and `expr_B`
+  conincide statically but differ dynamically as the definitions of
+  their contexts may differ.)
+
+- To actually use the static pointer, node B must unwrap `dsp_B` and cast
+  to `sp_B`. This cast will succeed only if the existentially quantified
+  type in `dsp_B` matches the ambient type of `sp_B`. If so, dereferencing
+  `sp_B` is type-safe on node B.
+
+---
+
+# Closed World Encoding
+
+
+The `StaticPointers` API described so far guarantees type safety in an
+**open world**. That is, there is no assumption that all nodes must
+agree on the set of shared static expressions. Instead, nodes are
+expected to safely decode any `StaticKey`, failing gracefully if the key
+is not registered in the local SPT.
+
+
+A consequence of the open world assumption is that `StaticKey`s must
+be fairly large, for instance the size of a `FingerPrint` (currently
+128 bits). This may be wasteful when storing data structures with lots
+of static pointers, for example the nested Closures of HdpH.
+Fortunately, there is a more compact encoding of static pointers
+when restricting to a **closed world**, where all nodes agree on the
+exact set of shared static expressions.
+
+### `StaticPointers` API with Closed World Encoding
+
+
+The following *closed world API* is an alternative to the above
+*open world API*. Actually, the closed world API is almost identical
+to open world one. There are only two differences.
+First, the type `StaticKey`, which is a 0-based index into the SPT rather
+than a cryptographic hash, hence often serialisable into one or two bytes.
+Second, there is an additional function for `lift`ing `static` expressions
+from the open world to the closed world.
+
+```wiki
+module Data.StaticPtr.ClosedWorldEncoding
+  ( StaticKey
+  , ...
+  , sptFingerprint
+  , lift
+  ) where
+
+-- A 0-based index as a key for static expressions; serialisable and instance of `Eq`, `Ord`, `Ix` and `Bounded`.
+type StaticKey = Int
+
+-- Lifting open world static pointers to closed world ones.
+lift :: Data.StaticPtr.StaticPtr t -> StaticPtr t
+```
+
+
+The closed world API can can be implemented as a library on top of the
+open world API without further compiler or RTS modification.  The
+implementation essentially duplicates the SPT, creating an indexed
+table.  Note that the `sptFingerprint`s of open world and closed world
+API coincide - changing the indexing does not change the content of
+the SPT.
+
+
+The performance of the operations on static pointers and the SPT is
+the same as in the open world case; in fact, SPT lookups may be
+marginally faster since they involve a simple array access rather than
+a hash table lookup.  The only additional overhead is introduced by
+`lift`, which requires an extra lookup.  However, thanks to lazy
+evaluation, this overhead is bourne at most once per static
+expression.
+
+### Decoding `StaticPtr` in a Closed World
+
+
+The ultimate goal of closed world static pointers is type-safe
+transmission over the network, when all parties agree on the set of
+shared static expressions. This *closed world assumption* can
+be tested by comparing `sptFingerprint`s, aborting program execution
+if nodes with differing `sptFingerprint`s are discovered. Since
+`sptFingerprint`s don't change over time, it suffices to check once,
+e.g. on opening a connection between nodes.
+
+
+Here is a typical messaging scenario (after passing the closed world test).
+
+- Node A wants to communicate `sp_A :: StaticPtr t_A`,
+  where `sp_A` refers to a static expression `expr_A :: t_A`.
+
+- Node A encodes `sp_A` by its key `staticKey sp_A`, and sends this
+  (suitably serialised) to node B.
+
+- Node B receives the message and deserialises the index `i :: StaticKey`.
+
+- Node B looks up `i` in its SPT, by calling `lookupDynStaticPtr i`.
+  By closed world assumption this must return `Just dsp_B`
+  because A and B agree on the set of static expressions.
+
+- `dsp_B` wraps a static pointer `sp_B :: StaticPtr t_B`, which refers to
+  a static expression `expr_B :: t_B` on node B.
+  Since `expr_A` and `expr_B` share the static key `i`, we infer (by
+  the primary requirement on `StaticKey`s) that they are equivalent.
+
+- To actually use the static pointer, node B must unwrap `dsp_B` and cast
+  to `sp_B`. This cast will succeed only if the existentially quantified
+  type in `dsp_B` matches the ambient type of `sp_B`. If so, dereferencing
+  `sp_B` is type-safe on node B.
+
+
+It is worth noting that the closed world encoding does not compromise
+type safety, even if the closed world assumption fails. In this case,
+`lookupDynStaticPtr` may return `Nothing`, or casting the wrapped
+static pointer may fail. Yet, if casting succeeds, the resulting static
+pointer is safe to use on node B (even if it isn't pointing to the same
+static expression as on node A).
+
+---
+
+# Robustness
+
+
+This section discusses the robustness of the SPT and serialised
+`Statickey`s against a number of perturbations.
+It assumes that `StaticKey`s are constructed as proposed in Section
+"Computing Statickeys" above.
+
+### Compilation
+
+
+Static pointers are not automatically exported. Hence statics do not
+impact the module dependency graph.
+
+
+Changing the static expressions in a module only requires re-compiling
+that particular module since the global SPT is constructed by the RTS
+on the fly from per-module SPTs.
+
+
+Changing anything other than the static expressions of a module does
+not affect the SPT. Even certain changes of static expressions themselves,
+e.g. reformatting layout, should not affect the SPT.
+
+### Non-uniform binaries and non-uniform sets of static expressions
+
+
+The original CloudHaskell paper posited that all nodes execute the
+same binary, which implies that all nodes agree on a uniform set of
+statics.  There have since emerged use cases where neither assumption
+is true.  That is, nodes may run different binaries, and they may
+define *overlapping* sets of statics rather than agree on a uniform
+set.
+
+
+The proposed *open world* design can cope with this situation
+(assuming a high-quality cryptographic hash function for computing
+`StaticKey`s, making key collisions extremely unlikely).
+
+
+The *closed world* library assumes a uniform set of static expressions
+but not necessarily uniform binaries. Moreover, the closed world library
+may behave weirdly if the closed world assumption is violated but
+it will not sacrifice type safety, i.e. it can't segfault.
+
+### Non-uniform compilers
+
+
+In a multi-site distributed system binaries may have been produced by
+different versions of the GHC. Nodes can still exchange static pointers
+safely, provided they agree
+
+- on the static pointer's `StaticKey` (in the open world case), or
+
+- on the `sptFingerprint` (in the closed world case).
+
+
+Such agreement implies that all nodes, and hence all compiler versions, agree
+
+- on the static pointer's `StaticKey`, which implies they agree on the
+  static pointer's UMI, including the package version, and
+
+- on the implementation of the cryptographic hash used for fingerprinting.
+
+
+Decoding of static pointers will fail safely if compiler versions
+disagree on either of the above two items.