Chunkify large records similar to large tuples

Motivation

During a conversation at ZuriHac, the problem of inefficiency when using large records came up (for a good write up of this problem, see this blog post from Well Typed). Towards the end of the blog post some concerns / ideas are raised:

The solution in the blog post involves using a library, it is not a solution baked into GHC
It may be possible to alleviate the problem with more sharing (see #20264)
Record pattern matches are still linear in the size of the record

Proposal

Thinking out loud I asked Simon whether we could do something similar to the treatment of large tuples, and have GHC keep a representation internally that was more nested.¹ Simon enthusiastically raised some points, which I will attempt to faithfully relay here:

We have machinery in the compiler for specifying a more efficient internal representation for data, mkDataConRep etc.
We already go the other way in GHC and expand records (e.g. using {-# UNPACK #-} on a strict tuple field)
Using a chunked representation means selector functions become a case tree
Record updates can share chunks of a record which are not changed

To give some examples (in deliberately invalid Haskell): a large record

{ x0 :: A0, ..., xn :: An }

would be internally represented as something similar to

{ { x0 :: A0, ..., xK-1 :: AK-1 }, { xK :: AK, ... }, { ..., xn :: An } }

where each chunk has K number of fields. The selector functions change accordingly

sel_xn_old r =
  case r of
    C0 _ ... xn -> xn -- Pattern linear to number of fields

sel_xn_new r =
  case r of
    C1 _ c1 -> -- Each item here is just a chunk
      case c1 of
        C2 _ ... xn -> xn -- Match on chunks until we find the element we want

Thinking about this the past week or so I have some more questions about this:

How would {-# UNPACK #-} play nicely with a chunked record representation? We do not have to think about this for tuples because it is not valid to write something like (Int, {-# UNPACK #-} !Int) in Haskell. Intuitively, I would guess that such strict unpacked fields would be floated to the top of the record alongside the chunks.
Would we generate new types for the chunks, or use tuples internally to represent the chunks. I feel like tuples would be simpler, but perhaps there is some reason to not favour this?

@simonpj As promised, I have written this up as an issue. Hopefully this is clear enough starting point for people to discuss.

Due to a limit set by the format of info tables, GHC only allows tuples of up to 64 elements internally. However, when writing code in the surface syntax of Haskell, we can use tuples larger than 64. Internally, GHC splits the tuple into a tuple of smaller tuples which can contain all the elements. This is transparent to the user writing the code. ↩

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information