This project is mirrored from https://github.com/haskell/Cabal.
  1. 25 Sep, 2016 1 commit
    • Use hash-consing to optimise index cache (#3897) · e0dd63cc
      Herbert Valerio Riedel authored
      Without this optimisation, `cabal info somethingnonexisting` results in
      
           960,397,120 bytes allocated in the heap
           739,652,560 bytes copied during GC
            67,757,128 bytes maximum residency (24 sample(s))
             2,234,096 bytes maximum slop
                   147 MB total memory in use (0 MB lost due to fragmentation)
      
      with this optimisation:
      
         1,000,825,744 bytes allocated in the heap
           656,112,432 bytes copied during GC
            44,476,616 bytes maximum residency (24 sample(s))
             2,302,864 bytes maximum slop
                   109 MB total memory in use (0 MB lost due to fragmentation)
      
      So the total memory in use drops from 147 MB to 109 MB (about 26% lower). The
      total runtime is also slightly reduced, from
      
        INIT    time    0.001s  (  0.001s elapsed)
        MUT     time    0.683s  (  1.050s elapsed)
        GC      time    0.946s  (  0.946s elapsed)
        EXIT    time    0.005s  (  0.005s elapsed)
        Total   time    1.637s  (  2.002s elapsed)
      
      to
      
        INIT    time    0.001s  (  0.001s elapsed)
        MUT     time    0.664s  (  0.988s elapsed)
        GC      time    0.797s  (  0.797s elapsed)
        EXIT    time    0.004s  (  0.004s elapsed)
        Total   time    1.467s  (  1.789s elapsed)
      
      
      Note that there are currently ~80k cache entries, but only ~10k unique package names
      and ~6k unique versions. So hash-consing helps reduce the number of heap objects
      for both value types by roughly one order of magnitude, which among other benefits
      also reduces GC overhead.
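
      As a rough illustration of the idea (not the code from this commit), the
      sharing can be sketched as an intern table that maps every distinct value to
      one canonical copy. The names `InternTable`, `intern` and `internAll` are
      hypothetical, and the sketch assumes an ordered `Data.Map` (from the
      `containers` package) in place of a real hash table.

      ```haskell
      import Data.List (mapAccumL)
      import qualified Data.Map.Strict as Map

      -- | Hypothetical intern table: maps each distinct value to the one
      -- canonical copy that all later occurrences will share.
      type InternTable a = Map.Map a a

      -- | Return the canonical copy of a value, adding the value to the
      -- table if it has not been seen before.
      intern :: Ord a => InternTable a -> a -> (InternTable a, a)
      intern table x =
        case Map.lookup x table of
          Just shared -> (table, shared)
          Nothing     -> (Map.insert x x table, x)

      -- | Intern every element of a list, threading the table through.
      internAll :: Ord a => [a] -> (InternTable a, [a])
      internAll = mapAccumL intern Map.empty
      ```

      With many entries but few distinct names, the returned list is equal to the
      input, while the table holds only one heap object per distinct name.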
  2. 24 Sep, 2016 3 commits
  3. 21 Sep, 2016 1 commit
    • Refactor & optimise construction of index cache · db1ef505
      Herbert Valerio Riedel authored
      This commit was motivated by @dcoutts' code-review comment:
      
      > Originally with using the `Sec.directoryEntries` that gave us only the
      > final version of each file, ie not all intermediate revisions. And
      > previously our strategy was to go through the final versions of each
      > file, in file order, and lookup just the ones we're interested in (which
      > in practice is 99% of them).
      >
      > Now for the new cache we want to go through all revisions, which means
      > all entries in file order. So instead of using `Sec.directoryEntries`
      > which reads from the tar index, we go straight for `Sec.directoryFirst`
      > which is block 0 and iterate through, using `lazyUnfold`.
      >
      > But we can now significantly simplify this and do it more
      > efficiently. Note that `indexLookupEntry` and `indexLookupFileEntry` are
      > expensive operations that seek in the tar file and read the tar entry at
      > that point. So let's do it exactly once per entry. The current code does
      > it once in the `lazyUnfold indexLookupEntry` and then again in `mk`. But
      > the old `mk` only did that because it had not previously looked up the
      > entry.
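
      The "exactly once per entry" point from the comment above can be sketched
      with a plain `unfoldr`. This is only an illustration: `Lookup`, `entriesFrom`
      and `demoLookup` are hypothetical stand-ins, and the real code works with
      `lazyUnfold` and `indexLookupEntry` rather than these simplified types.

      ```haskell
      import Data.List (unfoldr)

      type BlockNo = Int

      -- | Stand-in for an expensive tar lookup: given a block number, return
      -- the entry stored there and the block number of the next entry, or
      -- Nothing once the end of the archive is reached.
      type Lookup entry = BlockNo -> Maybe (entry, BlockNo)

      -- | Walk all entries in file order starting from the given block,
      -- calling the lookup once per entry and keeping its result.
      entriesFrom :: Lookup entry -> BlockNo -> [(BlockNo, entry)]
      entriesFrom look = unfoldr step
        where
          step blockNo = do
            (entry, next) <- look blockNo
            pure ((blockNo, entry), next)

      -- | Toy archive with three entries at blocks 0, 1 and 2.
      demoLookup :: Lookup String
      demoLookup n
        | n < 3     = Just ("entry-" ++ show n, n + 1)
        | otherwise = Nothing
      ```

      Because each step returns the entry together with its block number, a later
      stage can build cache entries without looking anything up a second time.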
  4. 20 Sep, 2016 2 commits
    • Try to regenerate a corrupted 01-index.cache · 92c51628
      Herbert Valerio Riedel authored
      With this commit, if a corrupted index cache is detected, the
      `readIndexCache` function now regenerates the index cache and then
      reattempts to read the index once (and 'die's if it fails again).
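
      The regenerate-then-retry-once behaviour can be sketched as follows. This is
      a minimal sketch and not the actual `readIndexCache`: the function name
      `readWithRegeneration`, its two arguments and the `demo` helper are
      assumptions, and `ioError` stands in for cabal's `die`.

      ```haskell
      import Data.IORef (newIORef, readIORef, writeIORef)

      -- | Try to read a cache; if the contents are rejected as corrupt,
      -- rebuild once and read again. A second failure is fatal.
      readWithRegeneration
        :: IO (Either String cache)  -- hypothetical reader; Left means corrupt
        -> IO ()                     -- hypothetical action that rebuilds the cache
        -> IO cache
      readWithRegeneration readCache rebuild = do
        firstTry <- readCache
        case firstTry of
          Right cache -> pure cache
          Left _      -> do
            rebuild
            secondTry <- readCache
            case secondTry of
              Right cache -> pure cache
              Left err    -> ioError (userError ("index cache still corrupt: " ++ err))

      -- | Simulated cache that is corrupt until it has been rebuilt.
      demo :: IO Int
      demo = do
        ref <- newIORef (Left "bad magic")
        readWithRegeneration (readIORef ref) (writeIORef ref (Right 42))
      ```

      Limiting the retry to a single attempt avoids looping forever when the
      regenerated cache is rejected as well.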
    • Extend 01-index.cache & use 'Binary' encoding · f91e65a5
      Herbert Valerio Riedel authored
      This commit extends the index cache entries relevant for 01-index to
      include block numbers and timestamps, and makes them strict so recent
      GHCs unpack the fields:
      
          data IndexCacheEntry
              = CachePackageId PackageId BlockNo
              | CachePreference Dependency
              | CacheBuildTreeRef BuildTreeRefType BlockNo
      
      to
      
          data IndexCacheEntry
              = CachePackageId PackageId !BlockNo !Timestamp
              | CachePreference Dependency !BlockNo !Timestamp
              | CacheBuildTreeRef !BuildTreeRefType !BlockNo
      
      For the legacy `00-index.tar`s, the 'Timestamp' field is set to (-1),
      and the original 00-index.cache format is retained.
      
      For (secure) `01-index.tar`s, all of the `IndexCacheEntry` data is stored
      in the `01-index.cache` file.
      
      Moreover, to avoid having to write out and parse two new integers per
      cache entry, this patch switches to using `Binary` instances for
      encoding the `01-index.cache` file (while `00-index.cache` remains
      plain-text).
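
      To show what a `Binary`-encoded entry with strict fields might look like,
      here is a simplified sketch using the `binary` package. The `Entry` type, its
      constructors, the `Word32`/`Int64` field widths and the tag bytes are all
      assumptions made for illustration; they do not reproduce the real
      `IndexCacheEntry` instance or the on-disk format.

      ```haskell
      import Data.Binary (Binary (..), decode, encode, getWord8, putWord8)
      import Data.Int (Int64)
      import Data.Word (Word32)

      type BlockNo   = Word32  -- assumed width
      type Timestamp = Int64   -- assumed width

      -- | Sentinel used when an entry has no timestamp (legacy index).
      noTimestamp :: Timestamp
      noTimestamp = -1

      -- | Simplified stand-in for a cache entry; a String replaces PackageId.
      data Entry
        = EntryPackage String !BlockNo !Timestamp
        | EntryBuildTreeRef !BlockNo
        deriving (Eq, Show)

      instance Binary Entry where
        -- One tag byte per constructor, followed by its fields in order.
        put (EntryPackage name blockNo ts) =
          putWord8 0 >> put name >> put blockNo >> put ts
        put (EntryBuildTreeRef blockNo) =
          putWord8 1 >> put blockNo
        get = do
          tag <- getWord8
          case tag of
            0 -> EntryPackage <$> get <*> get <*> get
            1 -> EntryBuildTreeRef <$> get
            _ -> fail "unknown cache entry tag"

      -- | Encoding followed by decoding should give back the same entry.
      roundTrips :: Entry -> Bool
      roundTrips e = decode (encode e) == e
      ```

      The two integers are written as fixed-width binary fields, so no textual
      number parsing is needed when the cache is read back.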
  5. 19 Sep, 2016 1 commit
    • Store secure repo index data as 01-index.* (#3862) · dc889b17
      Herbert Valerio Riedel authored
      "Secure" cabal repositories use a new extended & incremental
      `01-index.tar`. In order to avoid issues resulting from clobbering
      new/old-style index data, we save them locally to different names.
      
      With this patch, secure repos generate/update the files below on `cabal update`:
      
      - `01-index.cache`
      - `01-index.tar`
      - `01-index.tar.gz`
      - `01-index.tar.idx`
      - `mirrors.json`
      - `root.json`
      - `snapshot.json`
      - `timestamp.json`
      
      ...while the legacy codepaths for non-secure repos operate on the files
      
      - `00-index.cache`
      - `00-index.tar`
      - `00-index.tar.gz`
      - `00-index.tar.gz.etag`
      
      This way the old/new codepaths don't interfere with each other anymore.
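
      The naming split can be summarised in a few lines. This is only a sketch of
      the scheme described above: `RepoKind`, `indexBaseName` and `indexFiles` are
      hypothetical names, and the secure repo's JSON metadata files are left out.

      ```haskell
      -- | Hypothetical tag for the two kinds of repository.
      data RepoKind = LegacyRepo | SecureRepo
        deriving (Eq, Show)

      -- | Base name of the locally stored index files.
      indexBaseName :: RepoKind -> FilePath
      indexBaseName LegacyRepo = "00-index"
      indexBaseName SecureRepo = "01-index"

      -- | Index-related files each code path operates on (JSON files omitted).
      indexFiles :: RepoKind -> [FilePath]
      indexFiles kind = map (indexBaseName kind ++) (suffixes kind)
        where
          suffixes LegacyRepo = [".cache", ".tar", ".tar.gz", ".tar.gz.etag"]
          suffixes SecureRepo = [".cache", ".tar", ".tar.gz", ".tar.idx"]
      ```

      Since the two base names differ, the two file lists never share a name,
      which is what keeps the code paths from clobbering each other's data.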
      
      Note: The format of the `01-index.cache` file will be extended by the upcoming `--index-state` feature.
      
      This trivially fixes #3854.
  6. 18 Sep, 2016 1 commit
  7. 06 Sep, 2016 2 commits
  8. 26 Apr, 2016 1 commit
  9. 25 Apr, 2016 1 commit
  10. 08 Apr, 2016 1 commit
  11. 07 Mar, 2016 1 commit
  12. 06 Mar, 2016 1 commit
  13. 07 Feb, 2016 1 commit
    • Add get{Source,Installed}PackagesMonitorFiles · a15318ce
      Duncan Coutts authored
      Re-export getInstalledPackagesMonitorFiles from Cabal lib and add
      getSourcePackagesMonitorFiles locally to D.C.IndexUtils.
      
      These are for tracking changes to these bits of the environment, so that
      it's possible for us to recompute things that depend on them.
  14. 16 Jan, 2016 1 commit
    • Distinguish between component ID and unit ID. · ef41f44e
      Edward Z. Yang authored
      
      
      GHC 8.0 is switching the state sponsored way to specify
      linker names from -this-package-key to -this-unit-id, so
      it behooves us to use the right one.  But it didn't make
      much sense to pass ComponentIds to a flag named UnitId,
      so I went ahead and finished a (planned) refactoring
      to distinguish ComponentIds from UnitIds.
      
      At the moment, there is NO difference between a ComponentId
      and a UnitId; they are identical.  But semantically, a
      component ID records what sources/flags we chose (giving us enough
      information to typecheck a package), whereas a unit ID records
      the component ID as well as how holes were instantiated
      (giving us enough information to build it.)  MOST code
      in the Cabal library wants unit IDs, but there are a few
      places (macros and configuration) where we really do
      want a component ID.
      
      Some other refactorings that got caught up in here:
      
          - Changed the type of componentCompatPackageKey to String, reflecting the
            fact that it's not truly a UnitId or ComponentId.
      
          - Changed the behavior of CURRENT_PACKAGE_KEY to unconditionally
            give the compatibility package key, which is actually what you
            want if you're using it for the template Haskell trick.  I also
            added a CURRENT_COMPONENT_ID macro for the actual component ID,
            which is something that the Cabal test-suite will find useful.
      
          - Added the correct feature test for GHC 8.0 ("Uses unit IDs").

      Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
  15. 12 Jan, 2016 1 commit
  16. 07 Jan, 2016 1 commit
    • Introduce RepoContext · ba5c55c4
      Edsko de Vries authored
      The RepoContext encapsulates the list of repositories, as well as some
      associated state. In particular, it also encapsulates the HttpTransport, which
      will be initialized on demand and cached thereafter.  This is important for two
      reasons:
      
      * For the hackage-security integration: in order to be able to use cabal's own
        HttpTransport API for the secure repo, we need to have access to that
        transport when we initialize the repo, but as things stood that was not
        possible (cabal was initializing repos ahead of time but the transport on
        demand).
      
      * For the integration with the nix-local-branch it is important that the Repo
        type remains Serializable. By passing RepoContext rather than a list of
        Repos, we can leave RepoSecure serializable and separately maintain a mapping
        from cabal's Repo type to hackage-security's (stateful) Repository type.
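
      The "initialized on demand and cached thereafter" behaviour can be sketched
      with an `MVar`. This is an illustration under assumptions, not the
      `RepoContext` implementation: `cachedOnDemand` and `demo` are hypothetical
      names, and a plain string stands in for the `HttpTransport`.

      ```haskell
      import Control.Concurrent.MVar (modifyMVar, newMVar)
      import Data.IORef (modifyIORef', newIORef, readIORef)

      -- | Build an accessor that runs the given initialiser on first use and
      -- hands back the cached result on every later call.
      cachedOnDemand :: IO a -> IO (IO a)
      cachedOnDemand initialise = do
        cell <- newMVar Nothing
        pure $ modifyMVar cell $ \cached ->
          case cached of
            Just value -> pure (cached, value)
            Nothing    -> do
              value <- initialise
              pure (Just value, value)

      -- | Ask for the (fake) transport twice and report how often the
      -- initialiser actually ran.
      demo :: IO (Int, String, String)
      demo = do
        initCount <- newIORef 0
        getTransport <- cachedOnDemand
          (modifyIORef' initCount (+ 1) >> pure "http-transport")
        a <- getTransport
        b <- getTransport
        n <- readIORef initCount
        pure (n, a, b)
      ```

      Handing out the accessor rather than the value lets code that initializes
      repos early hold on to the transport without forcing its creation.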
  17. 05 Jan, 2016 2 commits
  18. 04 Jan, 2016 1 commit
    • Switch to the tar package, drop builtin code · 0db3b216
      Duncan Coutts authored
      The current incarnation of the tar package originated as code inside
      cabal-install. That external tar package is now quite mature, with more
      features and is much faster. In particular the tar index features will
      be very useful for cabal-install, which currently has to maintain its
      own custom-format index/cache.
  19. 27 Dec, 2015 1 commit
  20. 21 Dec, 2015 1 commit
  21. 18 Dec, 2015 1 commit
  22. 17 Dec, 2015 4 commits
    • Address comments on #2949. · ef6fe247
      Edsko de Vries authored
      This changes the definition of `Index` to
      
      ``` haskell
      data Index =
          -- | The main index for the specified repository
          RepoIndex Repo
      
          -- | A sandbox-local repository
          -- Argument is the location of the index file
        | SandboxIndex FilePath
      ```
      
      with
      
      ``` haskell
      cacheFile (SandboxIndex index) = index `replaceExtension` "cache"
      ```
      
      This also renames `repoRemote'` to `maybeRepoRemote`.
      
      I believe this addresses all comments.
    • Change Repo type · a8056d47
      Edsko de Vries authored
      The old Repo type has a repoKind
      
          repoKind     :: Either RemoteRepo LocalRepo,
      
      where LocalRepo was isomorphic to unit:
      
          data LocalRepo = LocalRepo
      
      This commit changes Repo to
      
          data Repo =
              -- | Local repositories
              RepoLocal {
                  repoLocalDir :: FilePath
                }
      
              -- | Standard (unsecured) remote repositories
            | RepoRemote {
                  repoRemote   :: RemoteRepo
                , repoLocalDir :: FilePath
                }
      
      instead, which is a little more idiomatic and will make adding more repository
      types easier.
    • Introduce datatype for the Cabal index cache · 2cce2cb8
      Edsko de Vries authored
      Right now this just wraps the list of cache entries, but this might make it
      easier in the future to change the structure of the cache.
    • Introduce structured type for specifying index · 99454f73
      Edsko de Vries authored
      In particular, distinguish between the repo-global index and a (sandbox-)local
      index.
  23. 09 Dec, 2015 1 commit
  24. 09 Oct, 2015 1 commit
    • Implement ComponentId, replacing PackageKey and InstalledPackageId. · b083151f
      Edward Z. Yang authored
      
      
      Today in Cabal, when you build and install a package, it is
      uniquely identified using an InstalledPackageId which is computed
      using the ABI hash of the library that was installed.  There
      are a few problems with doing it this way:
      
          - In a Nix-like world, we should instead uniquely identify
            build products by some sort of hash on the inputs to the
            compilation (source files, dependencies, flags).  The ABI
            hash doesn't capture any of this!
      
          - An InstalledPackageId suggests that we can uniquely identify
            build products by hashing the source and dependencies of
            a package as a whole.  But Cabal packages contain many components:
            a library, test suite, executables, etc.  Currently, when
            we say InstalledPackageId, we are really just talking about
            the dependencies of the library; however, this is unacceptable
            if a Cabal package can install multiple libraries; we need
            different identifiers for each.
      
          - We've also needed to compute another ID, which we've called
            the "package key", which is to be used for linker symbols
            and type equality GHC-side.  It is confusing what the distinction
            between this ID and InstalledPackageIds is; the main reason
            we needed another ID was because the package key was needed
            prior to compilation, whereas the ABI hash was only available
            afterwards.
      
      This patch replaces InstalledPackageId and PackageKey with a
      new identifier called ComponentId, which has the following
      properties:
      
          - It is computed per-component, and consists of a package
            name, package version, hash of the ComponentIds
            of the dependencies it is built against, and the name
            of the component.  For example, "foo-0.1-abcdef" continues
            to identify the library of package foo-0.1, but
            "foo-0.1-123455-foo.exe" would identify the executable,
            and "foo-0.1-abcdef-bar" would identify a private sub-library
            named bar.
      
          - It is passed to GHC to be used for linker symbols and
            type equality.  So as far as GHC is concerned, this is
            the end-all be-all identifier.
      
          - Cabal the library has a simple, default routine for computing
            a ComponentId which DOES NOT hash source code;
            in a later patch Duncan is working on, cabal-install can
            specify a more detailed ComponentId for a package
            to be built with.
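
      The shape of the identifiers in the examples above can be sketched as a
      small rendering function. This only mirrors the example strings; the
      `Component` type and `renderComponentId` are hypothetical, and the hash is
      taken as a ready-made argument rather than computed.

      ```haskell
      -- | Hypothetical component selector: the package's main library, or a
      -- named component such as an executable or a private sub-library.
      data Component = MainLibrary | Named String

      -- | Join package name, version, dependency hash and (for anything but
      -- the main library) the component name with dashes.
      renderComponentId :: String -> String -> String -> Component -> String
      renderComponentId name version depsHash component =
        name ++ "-" ++ version ++ "-" ++ depsHash ++ suffix
        where
          suffix = case component of
            MainLibrary -> ""
            Named c     -> "-" ++ c
      ```

      Under these assumptions the main library keeps the familiar
      name-version-hash form, and only other components gain a trailing name.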
      
      Here are some knock-on effects:
      
          - 'id' is a ComponentId
      
          - 'depends' is now a list of ComponentIds
      
          - New 'abi' field to record what the ABI of a unit is (as it is no longer
            computed by looking at the output of ghc --abi-hash).
      
          - The 'HasInstalledPackageId' typeclass is renamed to
            'HasComponentId'.
      
          - GHC 7.10 has explicit compatibility handling with
            a 'compatPackageKey' (a 'ComponentId') which is
            in a compatible format.  The value of this is read out
            from the 'key' field.

      Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
  25. 17 Sep, 2015 3 commits
  26. 30 Jul, 2015 1 commit
  27. 17 Jun, 2015 1 commit
  28. 31 May, 2015 1 commit
  29. 20 May, 2015 1 commit
    • Handle multiple preferred-versions in the index tarball better · 36265fb1
      Duncan Coutts authored
      The existing code supports reading multiple preferred-versions files in
      the 00-index.tar and merging them. However, it doesn't do it quite right
      when the same file is updated: it merges them instead of letting the later
      one override the earlier one.
      
      This should make no difference right now because the 00-index.tar
      typically only contains a single preferred-versions file, with no
      updates.
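
      The difference between merging and overriding can be sketched with a map
      keyed by file path. This is an illustration, not the patched code:
      `latestByPath` and `mergedByPath` are hypothetical names, and file contents
      are modelled as lists of lines.

      ```haskell
      import Data.List (foldl')
      import qualified Data.Map.Strict as Map

      -- | Entries are given in archive order. A later entry for the same
      -- path replaces the earlier one.
      latestByPath :: [(FilePath, [String])] -> Map.Map FilePath [String]
      latestByPath =
        foldl' (\acc (path, body) -> Map.insert path body acc) Map.empty

      -- | The behaviour described as wrong above: contents of entries with
      -- the same path are concatenated instead of replaced.
      mergedByPath :: [(FilePath, [String])] -> Map.Map FilePath [String]
      mergedByPath = Map.fromListWith (flip (++))
      ```

      For an archive holding a single preferred-versions entry the two functions
      agree, which matches the note that the fix makes no difference right now.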
  30. 28 Mar, 2015 1 commit