Optimise SizedSeq

SizedSeq is defined in ghc-boot:GHC.Data.SizedSeq. It is used to create BCOs (more precisely, sequences of ByteCode instructions for GHCi).

It hasn't changed much since it was introduced in 7b4c4250 and it looks particularly inefficient:

data SizedSeq a = SizedSeq {-# UNPACK #-} !Word [a]
  deriving (Generic, Show)

instance Functor SizedSeq where
  fmap f (SizedSeq sz l) = SizedSeq sz (fmap f l)

instance Foldable SizedSeq where
  foldr f c ss = foldr f c (ssElts ss)

instance Traversable SizedSeq where
  traverse f (SizedSeq sz l) = SizedSeq sz . reverse <$> traverse f (reverse l)

emptySS :: SizedSeq a
emptySS = SizedSeq 0 []

addToSS :: SizedSeq a -> a -> SizedSeq a
addToSS (SizedSeq n r_xs) x = SizedSeq (n+1) (x:r_xs)

addListToSS :: SizedSeq a -> [a] -> SizedSeq a
addListToSS (SizedSeq n r_xs) xs
  = SizedSeq (n + genericLength xs) (reverse xs ++ r_xs)

ssElts :: SizedSeq a -> [a]
ssElts (SizedSeq _ r_xs) = reverse r_xs

In particular ssElts returns the reversed list of elements. AFAICT all of its uses don't really need to be reversed:

we use it to create an array from a SizedSeq in: GHCi.CreateBCO.mkPtrsArray, GHC.ByteCode.Asm.assembleBCO and GHC.ByteCode.Linker.linkBCO. Just creating the array starting from the end would be better.
In GHC.ByteCode.Linker.linkBCO we do this:

  ptrs <- mapM (resolvePtr interp ie ce bco_ix breakarray) (ssElts ptrs0)
  return (ResolvedBCO isLittleEndian arity insns bitmap
              (listArray (0, fromIntegral (sizeSS lits0)-1) lits)
              (addListToSS emptySS ptrs))

So we reverse ptrs0 in ssElts and then we reverse it again in addListToSS. I don't think it matters if we call resolvePtr in the reverse order.

In GHC.ByteCode.Asm.bcoFreeNames we use ssElts to create a set like this:

bcoFreeNames :: UnlinkedBCO -> UniqDSet Name
bcoFreeNames bco
  = bco_refs bco `uniqDSetMinusUniqSet` mkNameSet [unlinkedBCOName bco]
  where
    bco_refs (UnlinkedBCO _ _ _ _ nonptrs ptrs)
        = unionManyUniqDSets (
             mkUniqDSet [ n | BCOPtrName n <- ssElts ptrs ] :
             mkUniqDSet [ n | BCONPtrItbl n <- ssElts nonptrs ] :
             map bco_refs [ bco | BCOPtrBCO bco <- ssElts ptrs ]
          )

Finally, we have:

instance Foldable SizedSeq where
  foldr f c ss = foldr f c (ssElts ss)

instance Traversable SizedSeq where
  traverse f (SizedSeq sz l) = SizedSeq sz . reverse <$> traverse f (reverse l)

I don't know if they are used but we could use foldl' to define foldr, and if we need traverse then there must be another better data structure somewhere to use instead of this.

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information