At the moment (6f4dc372), the documentation of the Data.Array.Byte module doesn't say anything except:
-- Derived from @primitive@ package.
This situation is not acceptable; I'd like to remedy it.
@andrewthad, could you please explain the use cases of this module and its data structures? Does it replace ByteString in one of the three domains that are covered by it? Does it fare better?
Hécate Kleidukos changed title from "Data.Array.Byte's documentation is unclear about its usefulness" to "Data.Array.Byte's documentation incredibly lacking".
The original motivation for this module was discussed here.
I only proposed we also include the Mutable wrapper. This was discussed here.
The core argument in favour from me during the CLC discussion was:
This would cut down on different names/types being used in different libraries for what amounts to the same thing.
Even if these libraries have to define their own operations for working with such a type, they could still pass such a type between them easier if a common definition in base existed.
Didn't seem useful to me to add this explicitly to the docs. It really is just a boxed wrapper around an unboxed thing.
@Bodigrim I haven't been involved in the primitive package, so while the two types are well-described in terms of what they do (boxed wrappers around an unboxed thing, as Andreas phrased it), it is very unclear what it is that they are best suited for.
While I understand that "bytes are bytes", there are many ways to use bytes that don't require the same things. If I were to use ByteArray to pass data through the FFI, would kittens die? Should I use it for binary serialisation? Would I commit a sin if I was to implement an IsString instance for it? :)
I really ask this with a lot of naïveté, because I don't feel like this module really answers any of them, and currently the ByteArray# declaration in Haddocks is not very concrete:
You ask excellent questions, but in a nutshell they are about ByteArray#, not about ByteArray. The GHC.Prim section reveals some answers, but this information is lost after re-export from GHC.Exts. A simple improvement would be to move that explanation into the haddock for ByteArray#, so that it is visible from base.
Ah yes indeed, looking at the structure of the Haddocks for GHC.Prim, it appears that this very helpful paragraph is "floating" and not actually attached to a specific declaration. Thank you very much, this is quite the easy fix. :)
To address the part of the original question about ByteArray vs ByteString, the differences are:
ByteArray has no support for zero-copy slicing. ByteString does.
ByteStrings can be backed by memory allocated by malloc (or any malloc-like replacement) in C code. They rely on GHC finalizers to call free (or the right free-like replacement) once the object is unreachable. They can also be backed by the pointer returned by mmap, although this use is rare and means that you're waiting on a finalizer to call munmap.
ByteArray, when unpacked into a data constructor, consumes a single machine word. It places the pointer (ByteArray#) to the memory directly into the data constructor. ByteString consumes three machine words: An Int# for the length, an Addr# for the pointer to the memory, and a ForeignPtrContents for finalizer stuff.
ByteString requires pinned memory. ByteArray supports both pinned and unpinned memory.
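The pinned/unpinned distinction in the list above can be observed directly. The following is a minimal sketch, assuming GHC 9.4 or later (where base ships Data.Array.Byte); the helper names allocUnpinned, allocPinned and isPinned are made up for illustration and are not part of base. Note that the runtime also treats large arrays as pinned, so the sketch deliberately uses a small size.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
module Main (main) where

import Data.Array.Byte (ByteArray (..))
import GHC.Exts
  ( Int (I#)
  , isByteArrayPinned#
  , isTrue#
  , newByteArray#
  , newPinnedByteArray#
  , unsafeFreezeByteArray#
  )
import GHC.IO (IO (..))

-- Hypothetical helper: allocate n bytes of unpinned memory
-- (contents are uninitialised) and freeze it into a ByteArray.
allocUnpinned :: Int -> IO ByteArray
allocUnpinned (I# n#) = IO (\s0 ->
  case newByteArray# n# s0 of
    (# s1, marr# #) -> case unsafeFreezeByteArray# marr# s1 of
      (# s2, arr# #) -> (# s2, ByteArray arr# #))

-- Hypothetical helper: same as above, but the memory is pinned,
-- so the garbage collector will not move it.
allocPinned :: Int -> IO ByteArray
allocPinned (I# n#) = IO (\s0 ->
  case newPinnedByteArray# n# s0 of
    (# s1, marr# #) -> case unsafeFreezeByteArray# marr# s1 of
      (# s2, arr# #) -> (# s2, ByteArray arr# #))

-- Hypothetical helper: ask the runtime whether the array is pinned.
isPinned :: ByteArray -> Bool
isPinned (ByteArray arr#) = isTrue# (isByteArrayPinned# arr#)

main :: IO ()
main = do
  unpinned <- allocUnpinned 16
  pinned   <- allocPinned 16
  putStrLn ("newByteArray# result pinned?       " ++ show (isPinned unpinned))
  putStrLn ("newPinnedByteArray# result pinned? " ++ show (isPinned pinned))
```

Both results have the same type, ByteArray, which is the point: pinnedness is a runtime property of the allocation, not something the type tracks.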
If you don't need slicing and you know that your memory is going to be the result of a GHC allocation, then ByteArray is better. I've written a library byteslice that augments ByteArray with support for slicing. I use this extensively, but I have no idea if it has any users outside of me. This library makes it possible to switch over to sliced memory without making a copy. It is not generally possible to convert ByteArray to ByteString without making a copy because ByteString requires that its contents are pinned.
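To make the slicing idea concrete, here is a rough sketch of how a slice can be layered on top of ByteArray without copying. The Slice type and every function below are hypothetical names chosen for this illustration; they are not the API of byteslice, and the sketch assumes GHC 9.4 or later.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main) where

import Data.Array.Byte (ByteArray (..))
import GHC.Exts (Int (I#), fromList, indexWord8Array#, sizeofByteArray#)
import GHC.Word (Word8 (W8#))

-- Hypothetical slice type: a view into a ByteArray.
-- With optimisation enabled, the UNPACK pragmas let the three
-- fields sit directly in the constructor.
data Slice = Slice
  { sliceArray  :: {-# UNPACK #-} !ByteArray
  , sliceOffset :: {-# UNPACK #-} !Int
  , sliceLength :: {-# UNPACK #-} !Int
  }

-- View a whole array as a slice; no bytes are copied.
fromByteArray :: ByteArray -> Slice
fromByteArray arr@(ByteArray arr#) = Slice arr 0 (I# (sizeofByteArray# arr#))

-- Drop up to n bytes from the front by adjusting offset and length only.
dropSlice :: Int -> Slice -> Slice
dropSlice n (Slice arr off len) =
  let k = max 0 (min n len)
  in Slice arr (off + k) (len - k)

-- Keep at most n bytes by adjusting the length only.
takeSlice :: Int -> Slice -> Slice
takeSlice n (Slice arr off len) = Slice arr off (max 0 (min n len))

-- Read one byte; the caller must pass an index inside the array.
indexByte :: ByteArray -> Int -> Word8
indexByte (ByteArray arr#) (I# i#) = W8# (indexWord8Array# arr# i#)

-- Materialise the bytes covered by the slice as a list.
sliceToList :: Slice -> [Word8]
sliceToList (Slice arr off len) = map (indexByte arr) [off .. off + len - 1]

main :: IO ()
main = do
  let whole = fromByteArray (fromList [10, 20, 30, 40, 50, 60])
      middle = takeSlice 3 (dropSlice 2 whole)
  print (sliceToList middle) -- prints [30,40,50]
```

The trade-off is the one described above: the slice costs three words instead of one, in exchange for take and drop that never touch the payload.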
@andrewthad From the documentation of ByteArray#, I read that it carries its own length. Do you know why it doesn't consume an extra machine word like ByteString? Or is it extra-magic? :)
The length is stored at the pointer. You have to dereference the pointer to get to it. Look up the array section of the Heap Objects wiki page for a visual of this. On a 64-bit machine, the pointer that represents a ByteArray# at runtime points to memory that is 16 bytes before the actual payload: 8 bytes for the header, 8 bytes for the length. With ByteString (assuming that you unpack it into some other constructors), there's no pointer chasing to get at the length. But with ByteArray, you've got to follow that pointer.
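As a small illustration of the point about the length, the sketch below recovers it from a ByteArray using the sizeofByteArray# primop, which reads the size stored in the heap object rather than a separate field. It assumes GHC 9.4 or later; byteArrayLength is a name invented here, not a function exported by base.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main) where

import Data.Array.Byte (ByteArray (..))
import GHC.Exts (Int (I#), fromList, sizeofByteArray#)

-- Hypothetical helper: the ByteArray constructor holds only the
-- ByteArray# pointer, so the length has to be read through it.
byteArrayLength :: ByteArray -> Int
byteArrayLength (ByteArray arr#) = I# (sizeofByteArray# arr#)

main :: IO ()
main = print (byteArrayLength (fromList [1, 2, 3, 4, 5])) -- prints 5
```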