Add `primitive` as a boot dependency
There are 4 basic options in the Hackage ecosystem when you need an array data structure:
- The
array
package (already a boot dependency of GHC). The wired-in use ofIx
makes the API much harder to use and more bloated than necessary, just to accomodate rare use cases such as(Int, Int, Int)
indices or pretending that indices don't start from 0. The lack of beign able to say inIx
lingo that "this is one past the last index" means we have to fight with off-by-one errors when allocating an Array (newArray_ (0, len-1)
) for example. It's just a bloated and bad API all around. - Use
Array#
primops from GHC.Exts directly. Bad choice, because you have to wrap all mutating operations inIO
/ST
. And even if you do that locally in your module, you can't export anArray#
because then callers have to deal with the same mess, too. So you'll have to extract everything into a utility module that wraps aroundArray#
. - All this wrapping of
Array#
andSmallArray#
(which we'd like to have in #18927) has already been done by theprimitive
package (not currently a boot dependency). So we could copy most of the stuff from there. But then why not simply depend onprimitive
in the first place? - The
vector
package (not currently a boot dependency of GHC). Basicallyarray
done right, a higher-level API aroundprimitive
'sData.Primitive.Array
wrapper. If we depend onvector
we implicitly depend onprimitive
, so we might as well stick toprimitive
. - Other even higher-level array types such as those from
repa
. Serve an entirely different purpose, I just want a linear chunk of memory.
At the moment, we only have option (1) or (2) in GHC. I dislike both, to the point that I always hesitate to reach out to arrays when I better should. The obvious choice here is (3), a new boot depenency on primitive
, that is. Here is some motivation:
- Unlike now, I'd try to use an array in new code where an array would make sense, simply because I could import the API of
Data.Primitive.Array
andData.Primitive.SmallArray
. -
#18927 provides some existing use cases for
SmallArray#
- I think even if the length of the list that we (otherwise) build up is not known in advance or even fixed, we would benefit from continuously (à la
std::vector::push_back
or evenstd::deque::push_back
) building up an array rather than a list. Keep in mind that each new cons cell has an overhead of 3 words; even growing astd::vector
when out of capacity by a factor of 2 has less than 2 words of overhead and allows O(1) random access and length at the same time. The huge win of using lists is simplicity and fusion; the latter can be achieved for arrays, too (see #18927 (comment 312644)), while the former is a matter of providing good pattern synonyms for slices of arrays. - There is other useful stuff in
primitive
, such as abstraction overPrimMonad
(which hasST
andIO
instances)
What are the drawbacks?
- We'd have to maintain one more boot package. Given that
vector
is already quite a prominent Core package and thatvector
depends onprimitive
, I would think that the difference isn't too huge. Besides, we already have to provideprimitive
patches for the head.hackage overlay today, otherwise most of the ecosystem wouldn't even build. Patchingprimitive
as part of an MR that breaks it probably means less maintenance overhead overall. - Longer time to bootstrap. Fair enough.
- Intellectual overhead from using
PrimMonad
abstractions and other stuff we have to get used to. I don't think there's anything quite so complicated in the package that would pose a problem.
To me the cost/benefit ratio is clearly >1, so I propose to add primitive
as a boot dependency. I hereby ask for permission to do so.