Data.Array.Byte's documentation incredibly lacking.

added documentation label

changed title from Data.Array.Byte's documentation is unclear about its usefulness to Data.Array.Byte's documentation incredibly lacking.

@AndreasK Do you have any insight into this that could prove useful?

The original motivation for this module was discussed here.

I only proposed we also include the Mutable wrapper. This was discussed here.

The core argument in favour from me during the CLC discussion was:

This would cut down on different names/types being used in different libraries for what amounts to the same thing. Even if these libraries have to define their own operations for working with such a type, they could still pass such a type between them easier if a common definition in base existed.

It didn't seem useful to me to add this explicitly to the docs. It really is just a boxed wrapper around an unboxed thing.

Both exported entities have haddocks explaining their purpose. I'm not sure what else can be said.

@Bodigrim I haven't been involved in the primitive package, so while the two types are well-described in terms of what they do (boxed wrappers around an unboxed thing, as Andreas phrased it), it is very unclear what it is that they are best suited for.

While I understand that "bytes are bytes", there are many ways to use bytes that don't require the same things. If I were to use ByteArray to pass data through the FFI, would kittens die? Should I use it for binary serialisation? Would I commit a sin if I was to implement an IsString instance for it? :)

I really ask this with a lot of naïveté, because I don't feel like this module really answers any of them, and currently the ByteArray# declaration in Haddocks is nothing very concrete:

You ask excellent questions, but in a nutshell they are about ByteArray#, not about ByteArray. A section in GHC.Prim reveals some answers, but this information is lost after re-export from GHC.Exts. A simple improvement would be to move that explanation into the haddock for ByteArray#, so that it is visible from base.

Ah yes indeed, looking at the structure of the Haddocks for GHC.Prim, it appears that this very helpful paragraph is "floating" and not actually attached to a specific declaration. Thank you very much, this is quite the easy fix. :)

assigned to @Kleidukos

added Pnormal Tbug labels

mentioned in merge request !8067 (closed)

To address the part of the original question about ByteArray vs ByteString, the differences are:

ByteArray has no support for zero-copy slicing. ByteString does.
ByteStrings can be backed by memory allocated by malloc (or any malloc-like replacement) in C code. They rely on GHC finalizers to call free (or the right free-like replacement) once the object is unreachable. They can also be backed by the pointer returned by mmap, although this use is rare and means that you're waiting on a finalizer to call munmap.
ByteArray, when unpacked into a data constructor, consumes a single machine word. It places the pointer (ByteArray#) to the memory directly into the data constructor. ByteString consumes three machine words: An Int# for the length, an Addr# for the pointer to the memory, and a ForeignPtrContents for finalizer stuff.
ByteString requires pinned memory. ByteArray supports both pinned and unpinned memory.
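
To make the pinned versus unpinned point concrete, here is a minimal sketch that allocates a ByteArray either way and asks the runtime which kind it got. The helper names `newFrozen` and `isPinned` are made up for illustration and are not part of base; the primops come from GHC.Exts, and the array contents are left uninitialised because only the pinnedness is inspected.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
module Main where

import Data.Array.Byte (ByteArray (..))
import GHC.Exts
  ( Int (I#)
  , isByteArrayPinned#
  , isTrue#
  , newByteArray#
  , newPinnedByteArray#
  , unsafeFreezeByteArray#
  )
import GHC.IO (IO (..))

-- | Hypothetical helper: allocate @n@ uninitialised bytes, pinned or not,
-- and freeze the result into an immutable 'ByteArray'.
newFrozen :: Bool -> Int -> IO ByteArray
newFrozen pinned (I# n#) = IO (\s0 ->
  case alloc s0 of
    (# s1, mba# #) -> case unsafeFreezeByteArray# mba# s1 of
      (# s2, ba# #) -> (# s2, ByteArray ba# #))
  where
    alloc
      | pinned = newPinnedByteArray# n#
      | otherwise = newByteArray# n#

-- | Hypothetical helper: ask the runtime whether the array is pinned.
isPinned :: ByteArray -> Bool
isPinned (ByteArray ba#) = isTrue# (isByteArrayPinned# ba#)

main :: IO ()
main = do
  pinned <- newFrozen True 16
  unpinned <- newFrozen False 16
  print (isPinned pinned, isPinned unpinned)
```

Note that the runtime may report a sufficiently large array as pinned even when it was allocated with `newByteArray#`, so the distinction is only reliably visible for small arrays like the 16-byte ones used here.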

If you don't need slicing and you know that your memory is going to be the result of a GHC allocation, then ByteArray is better. I've written a library byteslice that augments ByteArray with support for slicing. I use this extensively, but I have no idea if it has any users outside of me. This library makes it possible to switch over to sliced memory without making a copy. It is not generally possible to convert ByteArray to ByteString without making a copy because ByteString requires that its contents are pinned.
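
As a rough illustration of the "not without making a copy" point, the simplest conversion goes through a list of bytes, which always copies. This is only a sketch: `toByteStringCopy` is a hypothetical name, it assumes the bytestring package is available, and it favours clarity over speed.

```haskell
module Main where

import Data.Array.Byte (ByteArray)
import qualified Data.ByteString as BS
import GHC.Exts (fromList, toList)

-- | Hypothetical helper: build a fresh ByteString from the bytes of a
-- ByteArray. The bytes are copied; no memory is shared with the input.
toByteStringCopy :: ByteArray -> BS.ByteString
toByteStringCopy = BS.pack . toList

main :: IO ()
main = do
  -- The IsList instance of ByteArray uses Word8 elements.
  let arr = fromList [104, 105] :: ByteArray
  print (toByteStringCopy arr) -- prints "hi"
```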

Wonderful, this is exactly the kind of information that is needed. :)

@andrewthad From the documentation of ByteArray#, I read that it carries its own length. Do you know why it doesn't consume an extra machine word like ByteString? Or is it extra-magic? :)

The length is stored at the pointer. You have to dereference the pointer to get to it. Look up the array section of the Heap Objects wiki page for a visual of this. On a 64-bit machine, the pointer that represents a ByteArray# at runtime points to memory that is 16 bytes before the actual payload: 8 bytes for the header, 8 bytes for the length. With ByteString (assuming that you unpack it into some other constructors), there's no pointer chasing to get at the length. But with ByteArray, you've got to follow that pointer.
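
For reference, reading that stored length from Haskell looks roughly like this. It is a small sketch in which `byteArrayLength` is a made-up name; `sizeofByteArray#` is the primop from GHC.Exts that returns the size in bytes.

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import Data.Array.Byte (ByteArray (..))
import GHC.Exts (Int (I#), fromList, sizeofByteArray#)

-- | Hypothetical helper: the size in bytes, read from the heap object
-- that the ByteArray# pointer refers to.
byteArrayLength :: ByteArray -> Int
byteArrayLength (ByteArray ba#) = I# (sizeofByteArray# ba#)

main :: IO ()
main = print (byteArrayLength (fromList [1, 2, 3, 4, 5])) -- prints 5
```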

mentioned in commit e2b25578

mentioned in commit a0d9861c

mentioned in commit da342964

mentioned in commit 344b1ba3

mentioned in commit 09de5a26
