Skip to content

GitLab

  • Menu
Projects Groups Snippets
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
  • Sign in / Register
  • GHC GHC
  • Project information
    • Project information
    • Activity
    • Labels
    • Members
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
    • Locked Files
  • Issues 4,865
    • Issues 4,865
    • List
    • Boards
    • Service Desk
    • Milestones
    • Iterations
  • Merge requests 455
    • Merge requests 455
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
    • Test Cases
  • Deployments
    • Deployments
    • Releases
  • Analytics
    • Analytics
    • Value stream
    • CI/CD
    • Code review
    • Insights
    • Issue
    • Repository
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • Glasgow Haskell Compiler
  • GHCGHC
  • Issues
  • #20299
Closed
Open
Created Aug 27, 2021 by Sebastian Graf@sgraf812Developer

Add `primitive` as a boot dependency

There are 4 basic options in the Hackage ecosystem when you need an array data structure:

  1. The array package (already a boot dependency of GHC). The wired-in use of Ix makes the API much harder to use and more bloated than necessary, just to accomodate rare use cases such as (Int, Int, Int) indices or pretending that indices don't start from 0. The lack of beign able to say in Ix lingo that "this is one past the last index" means we have to fight with off-by-one errors when allocating an Array (newArray_ (0, len-1)) for example. It's just a bloated and bad API all around.
  2. Use Array# primops from GHC.Exts directly. Bad choice, because you have to wrap all mutating operations in IO/ST. And even if you do that locally in your module, you can't export an Array# because then callers have to deal with the same mess, too. So you'll have to extract everything into a utility module that wraps around Array#.
  3. All this wrapping of Array# and SmallArray# (which we'd like to have in #18927) has already been done by the primitive package (not currently a boot dependency). So we could copy most of the stuff from there. But then why not simply depend on primitive in the first place?
  4. The vector package (not currently a boot dependency of GHC). Basically array done right, a higher-level API around primitive's Data.Primitive.Array wrapper. If we depend on vector we implicitly depend on primitive, so we might as well stick to primitive.
  5. Other even higher-level array types such as those from repa. Serve an entirely different purpose, I just want a linear chunk of memory.

At the moment, we only have option (1) or (2) in GHC. I dislike both, to the point that I always hesitate to reach out to arrays when I better should. The obvious choice here is (3), a new boot depenency on primitive, that is. Here is some motivation:

  1. Unlike now, I'd try to use an array in new code where an array would make sense, simply because I could import the API of Data.Primitive.Array and Data.Primitive.SmallArray.
  2. #18927 provides some existing use cases for SmallArray#
  3. I think even if the length of the list that we (otherwise) build up is not known in advance or even fixed, we would benefit from continuously (à la std::vector::push_back or even std::deque::push_back) building up an array rather than a list. Keep in mind that each new cons cell has an overhead of 3 words; even growing a std::vector when out of capacity by a factor of 2 has less than 2 words of overhead and allows O(1) random access and length at the same time. The huge win of using lists is simplicity and fusion; the latter can be achieved for arrays, too (see #18927 (comment 312644)), while the former is a matter of providing good pattern synonyms for slices of arrays.
  4. There is other useful stuff in primitive, such as abstraction over PrimMonad (which has ST and IO instances)

What are the drawbacks?

  1. We'd have to maintain one more boot package. Given that vector is already quite a prominent Core package and that vector depends on primitive, I would think that the difference isn't too huge. Besides, we already have to provide primitive patches for the head.hackage overlay today, otherwise most of the ecosystem wouldn't even build. Patching primitive as part of an MR that breaks it probably means less maintenance overhead overall.
  2. Longer time to bootstrap. Fair enough.
  3. Intellectual overhead from using PrimMonad abstractions and other stuff we have to get used to. I don't think there's anything quite so complicated in the package that would pose a problem.

To me the cost/benefit ratio is clearly >1, so I propose to add primitive as a boot dependency. I hereby ask for permission to do so.

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Assignee
Assign to
Time tracking