This page describes how GHC depends on and makes use of Cabal.

# General

GHC uses Cabal in a few ways:

- GHC ships with the Cabal library pre-installed. This is a convenience to users, and was asked for in the original Cabal specification.
- The GHC build system makes use of the Cabal library. See [Building/Architecture/Idiom/Cabal](building/architecture/idiom/cabal).
- The external representation for installed packages as used by ghc-pkg is conceptually defined by the Cabal specification, and in practice defined by the Cabal library (its types, parsers, etc.).
- The ghc-pkg program depends on the Cabal library for the types, parser, etc. of installed package descriptions.
- Historically, the GHC library also depended on Cabal for the types of installed packages, but this is no longer the case.

# Removal of the GHC library dependency on the Cabal library

See ticket [\#8244](https://gitlab.haskell.org//ghc/ghc/issues/8244)

The GHC library used to depend on the Cabal library directly, for the representation of installed packages. This was convenient for implementation but had a number of drawbacks:

- Any package making use of the GHC library would be forced to use the same version of Cabal as GHC used. This was annoying because, while the parts of Cabal that GHC used were not very fast moving, other parts of the library were, and so other packages did want to use a different version of Cabal.
- Given the existing limitations and inconveniences of installing multiple versions of the same package, the GHC dependency on Cabal made it hard to upgrade Cabal separately. Of course, this is really more of a limitation of the packaging side of things.
- The fact that GHC depended directly on Cabal placed limitations on the implementation of Cabal. GHC must be very careful about which packages it needs to be able to build (so-called boot packages). Because Cabal was a boot package, it could itself only depend on other boot packages. In particular, Cabal needs a decent parser combinator library, but no such library is available as a boot package (and GHC developers were understandably reluctant to add dependencies on parsec, mtl, text, etc. as would be required).

## Design of GHC-library's non-dependency on Cabal

Under the new approach, the GHC library does not depend on Cabal, but the ghc-pkg tool still does. Recall that the GHC library depended on Cabal for the representation of installed packages (the `InstalledPackageInfo` type, plus Read/Show/Binary instances for the file formats).

The key idea is that the GHC library now uses its own types and file format for installed packages. These types are shared with ghc-pkg. The ghc-pkg tool still consumes and produces package descriptions using the Cabal types, parser and pretty-printer. The ghc-pkg tool performs the translation from Cabal's types into GHC's types.

The ghc-pkg tool (as required by the Cabal spec) must consume and regurgitate package descriptions in an external representation defined by the Cabal spec. So the ghc-pkg database must contain all the information for a Cabal installed package description, so that ghc-pkg can spit it out again. The simplest approach to doing that is to use the Cabal library and the types, parser and pretty-printer that it defines. In particular, this makes it easier to keep Cabal and GHC in sync when changes to the installed package descriptions are made, and makes interoperability with Cabal straightforward.

On the other hand, GHC must also be able to read (but not write) the ghc-pkg databases, to get the set of installed packages. We do not want GHC to have to use Cabal to read those packages. However, because GHC only reads and never writes the package database, and because it only needs a subset of the information (what is necessary to compile, without the metadata), it is practical to have a separate, simpler type for packages for GHC to consume.

The format of the ghc-pkg databases is designed to support this approach: it contains all the packages in two different representations, once using Cabal types and once using GHC's types. These are contained in two sections of the `package.cache` binary file inside each package database directory. One section contains the Cabal representation. This section is read back by ghc-pkg when reading the package database. The other section contains the GHC representation. This section is read by GHC. The format is such that GHC does not need to know the representation of the other section to be able to read its own section. The ghc-pkg tool knows about the representation of both sections and writes both.
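
To make the two-section idea concrete, here is a minimal sketch, not the real `package.cache` layout. It assumes a hypothetical layout of a length prefix, then the GHC section, then the ghc-pkg section, and it works on in-memory bytes rather than files. The names `GhcSection`, `writeDb`, `readForGhc`, `readForGhcPkg` and `demo` are invented for illustration; only the `binary` and `bytestring` packages are used.

```haskell
import Data.Binary (Binary, decode, encode)
import Data.Binary.Get
  (getLazyByteString, getRemainingLazyByteString, getWord32be, runGet, skip)
import Data.Binary.Put (putLazyByteString, putWord32be, runPut)
import qualified Data.ByteString.Lazy as BL

-- Hypothetical, much-reduced stand-in for GHC's package record:
-- a package id together with the ids it depends on.
type GhcSection = [(String, [String])]

-- Assumed layout for this sketch only:
--   <32-bit length of GHC section> <GHC section> <ghc-pkg section>
-- The writer knows both representations; the second one is only
-- required to be an instance of Binary.
writeDb :: Binary pkgs => GhcSection -> pkgs -> BL.ByteString
writeDb ghcPart pkgPart = runPut $ do
  let ghcBytes = encode ghcPart
  putWord32be (fromIntegral (BL.length ghcBytes))
  putLazyByteString ghcBytes
  putLazyByteString (encode pkgPart)

-- The GHC-side reader decodes its own section and stops there;
-- it never needs to know the type stored in the second section.
readForGhc :: BL.ByteString -> GhcSection
readForGhc = runGet $ do
  len <- getWord32be
  decode <$> getLazyByteString (fromIntegral len)

-- The ghc-pkg-side reader skips the GHC section using the length
-- prefix and decodes the remainder at whatever type it chooses.
readForGhcPkg :: Binary pkgs => BL.ByteString -> pkgs
readForGhcPkg = runGet $ do
  len <- getWord32be
  skip (fromIntegral len)
  decode <$> getRemainingLazyByteString

-- Round trip: plain strings stand in for Cabal's richer types.
demo :: (GhcSection, [String])
demo = (readForGhc db, readForGhcPkg db)
  where
    db = writeDb [("base-4.7", [])] ["name: base", "license: BSD3"]
```

The length prefix is what lets one reader step over a section it cannot interpret, so the two representations can evolve independently as long as the writer keeps them in sync.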

## Technical details

The ghc-pkg file format is defined by a library shared between GHC and ghc-pkg. It defines the GHC package type and provides functions to read each section and to write both sections:

```wiki
data InstalledPackageInfo   -- same name as used by Cabal, but simpler type

readPackageDbForGhc    :: FilePath -> IO [InstalledPackageInfo]

readPackageDbForGhcPkg :: Binary pkgs => FilePath -> IO pkgs

writePackageDb :: Binary pkgs => FilePath -> [InstalledPackageInfo] -> pkgs -> IO ()
```

Note that the concrete type of ghc-pkg's representation of packages is not fixed; it simply has to be an instance of `Binary`. This trick means the shared library does not have to depend on Cabal (which is vital, because GHC depends on this library), but still allows ghc-pkg to instantiate the type using Cabal's types.

The above types are a slight simplification: `InstalledPackageInfo` actually has a number of type parameters, which are used in its fields, e.g.:

```wiki
data InstalledPackageInfo instpkgid srcpkgid srcpkgname pkgkey modulename
   = InstalledPackageInfo {
       ...
       depends            :: [instpkgid],
       ...
       exposedModules     :: [modulename],
       ...
     }
```

The reason for this is that concrete types like `ModuleName` are defined within the GHC library, and we do not want to move their definition into this library. Inside GHC we instantiate this type like so:

```wiki
type PackageConfig = InstalledPackageInfo
                       InstalledPackageId
                       SourcePackageId
                       PackageName
                       Module.PackageKey
                       Module.ModuleName
```

In ghc-pkg, on the other hand, we instantiate it with `String` for all parameters. This works because these fields are all ultimately represented by strings of some sort, and all have external representations which are strings (UTF-8 on disk). We manage this using a little type class:

```wiki
class BinaryStringRep a where
  fromStringRep :: BS.ByteString -> a
  toStringRep   :: a -> BS.ByteString
```

The `readPackageDbForGhc` above is then actually:

```wiki
readPackageDbForGhc :: (BinaryStringRep a, BinaryStringRep b, BinaryStringRep c,
                        BinaryStringRep d, BinaryStringRep e) =>
                       FilePath -> IO [InstalledPackageInfo a b c d e]
```

So it uses the class to convert between the on-disk UTF-8 representation and the internal representation (`String` for ghc-pkg, and things like newtype'd `FastString`s in GHC).
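
As a rough illustration of how one parameterised record can serve both programs, the sketch below instantiates a cut-down record at `String` and at a newtype, with both going through the same UTF-8 byte representation. Everything except the `BinaryStringRep` class itself is an assumption made for the example: `PkgInfo` has only two parameters, `ModuleName` wraps `Text` rather than a `FastString`, `convert`, `retarget` and `ghcView` are hypothetical helpers, and the `text` package is assumed to be available for the UTF-8 conversion.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- The class from the text: conversion to and from on-disk bytes.
class BinaryStringRep a where
  fromStringRep :: BS.ByteString -> a
  toStringRep   :: a -> BS.ByteString

-- ghc-pkg side: plain strings, stored as UTF-8.
instance BinaryStringRep String where
  fromStringRep = T.unpack . TE.decodeUtf8
  toStringRep   = TE.encodeUtf8 . T.pack

-- GHC side: a hypothetical newtype standing in for a
-- FastString-backed ModuleName.
newtype ModuleName = ModuleName T.Text
  deriving (Eq, Show)

instance BinaryStringRep ModuleName where
  fromStringRep = ModuleName . TE.decodeUtf8
  toStringRep (ModuleName t) = TE.encodeUtf8 t

-- Cut-down record with two type parameters instead of five.
data PkgInfo pkgid modname = PkgInfo
  { depends        :: [pkgid]
  , exposedModules :: [modname]
  } deriving (Eq, Show)

-- Any two instances can be converted via the shared byte form.
convert :: (BinaryStringRep a, BinaryStringRep b) => a -> b
convert = fromStringRep . toStringRep

-- Re-instantiate a record at different field types.
retarget :: (BinaryStringRep a, BinaryStringRep b,
             BinaryStringRep c, BinaryStringRep d)
         => PkgInfo a b -> PkgInfo c d
retarget (PkgInfo ds ms) = PkgInfo (map convert ds) (map convert ms)

-- What ghc-pkg holds as strings, viewed at GHC-style types.
ghcView :: PkgInfo String ModuleName
ghcView = retarget (PkgInfo ["base-4.7"] ["Data.List"] :: PkgInfo String String)
```

Because the conversion is confined to the class instances, the shared library never has to mention GHC's own types or Cabal's.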