    Refactor & optimise construction of index cache · db1ef505
    Herbert Valerio Riedel authored
    This commit was motivated by @dcoutts' code-review comment:
    
    > Originally, using `Sec.directoryEntries` gave us only the
    > final version of each file, i.e. not all intermediate revisions. And
    > previously our strategy was to go through the final versions of each
    > file, in file order, and look up just the ones we're interested in (which
    > in practice is 99% of them).
    >
    > Now for the new cache we want to go through all revisions, which means
    > all entries in file order. So instead of using `Sec.directoryEntries`
    > which reads from the tar index, we go straight for `Sec.directoryFirst`
    > which is block 0 and iterate through, using `lazyUnfold`.
    >
    > But we can now significantly simplify this and do it more
    > efficiently. Note that `indexLookupEntry` and `indexLookupFileEntry` are
    > expensive operations that seek in the tar file and read the tar entry at
    > that point. So let's do it exactly once per entry. The current code does
    > it once in the `lazyUnfold indexLookupEntry` and then again in `mk`. But
    > the old `mk` only did that because it had not previously looked up the
    > entry.
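
The "exactly once per entry" idea from the review comment can be illustrated with a small, pure sketch. Everything in it is a hypothetical stand-in rather than the actual `IndexUtils.hs` code or the hackage-security API: `Tar`, `Entry`, `lookupEntry`, `allEntries`, `mk` and `buildCache` are invented names, the "tar file" is an association list keyed by block number, and `unfoldr` plays the role of `lazyUnfold`. The point is only that the unfold hands the already-read entry to `mk`, so `mk` never has to perform a second lookup.

```haskell
module Main (main) where

import Data.List (isSuffixOf, unfoldr)
import Data.Maybe (mapMaybe)

-- Hypothetical stand-ins; not the real hackage-security or cabal-install types.
type BlockNo = Int

data Entry = Entry
  { entryPath :: FilePath
  , entryNext :: BlockNo  -- block number at which the following entry starts
  } deriving (Show, Eq)

-- Toy "tar file": maps a block number to the entry stored there.
type Tar = [(BlockNo, Entry)]

-- Stand-in for the expensive seek-and-read operation.
lookupEntry :: Tar -> BlockNo -> Maybe Entry
lookupEntry tar blockNo = lookup blockNo tar

-- Walk every entry in file order, starting at block 0.
-- Each entry is looked up exactly once and returned with its block number.
allEntries :: Tar -> [(BlockNo, Entry)]
allEntries tar = unfoldr step 0
  where
    step blockNo = do
      entry <- lookupEntry tar blockNo
      return ((blockNo, entry), entryNext entry)

data CacheEntry = CacheEntry BlockNo FilePath
  deriving (Show, Eq)

-- 'mk' receives the entry that was already read, so it performs no lookup.
mk :: (BlockNo, Entry) -> Maybe CacheEntry
mk (blockNo, entry)
  | ".cabal" `isSuffixOf` entryPath entry = Just (CacheEntry blockNo (entryPath entry))
  | otherwise                             = Nothing

buildCache :: Tar -> [CacheEntry]
buildCache = mapMaybe mk . allEntries

-- Two revisions of the same .cabal file, plus one file the cache ignores.
sampleTar :: Tar
sampleTar =
  [ (0, Entry "foo/1.0/foo.cabal"    3)
  , (3, Entry "foo/1.0/package.json" 5)
  , (5, Entry "foo/1.0/foo.cabal"    8)
  ]

main :: IO ()
main = mapM_ print (buildCache sampleTar)
```

Because the walk follows all entries in file order instead of consulting an index of final versions, both revisions of `foo/1.0/foo.cabal` end up in the resulting cache.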
IndexUtils.hs 30.3 KB