... | ... | @@ -37,41 +37,106 @@ For the purposes of this commentary, we are mostly concerned with GHC and `ghc-p |
|
|
|
|
|
## Identifying Packages
|
|
|
|
|
|
<table><tr><th>`PackageName`</th>
|
|
|
<table><tr><th>`PackageName` ("base")</th>
|
|
|
<td>
|
|
|
A string, e.g. "base". Defined in `Distribution.Package`. Does not uniquely identify a package: the package
|
|
|
A string. Defined in `Distribution.Package`. Does not uniquely identify a package: the package
|
|
|
database can contain several packages with the same name.
|
|
|
</td></tr></table>
|
|
|
|
|
|
<table><tr><th>`PackageIdentifier`</th>
|
|
|
<table><tr><th>`PackageIdentifier` ("base-4.1.0.0")</th>
|
|
|
<td>
|
|
|
A `PackageName` plus a `Version`. Does uniquely identify a package, but only by convention (we may lift
|
|
|
this restriction in the future). `InstalledPackageInfo` contains the field `package :: PackageIdentifier`.
|
|
|
</td></tr></table>
|
|
|
|
|
|
<table><tr><th>`InstalledPackageId`</th>
|
|
|
<table><tr><th>`InstalledPackageId` ("base-4.1.0.0-1mpgjN")</th>
|
|
|
<td>
|
|
|
An opaque string. Each package is uniquely identified by its `InstalledPackageId`. Dependencies
|
|
|
between installed packages are also identified by the `InstalledPackageId`.
|
|
|
(introduced in GHC 6.12 / Cabal 1.7.2) A string that uniquely identifies a package in the database. Dependencies
|
|
|
between installed packages are identified by the `InstalledPackageId`. An `InstalledPackageId` is currently
|
|
|
chosen by adding a random suffix to the string representing the `PackageIdentifier` when a package is registered.
|
|
|
</td></tr></table>
|
|
|
|
|
|
<table><tr><th>`PackageId`</th>
|
|
|
<table><tr><th>`PackageId` (these currently look like "base-4.1.0.0" in GHC 6.12)</th>
|
|
|
<td>
|
|
|
Inside GHC, we use the type `PackageId`, which is a `FastString` representation of `InstalledPackageId`.
|
|
|
The (Z-encoding of) `PackageId` prefixes each external symbol in the generated code, so that the modules of one package do
|
|
|
not clash with those of another package, even when the module names overlap.
|
|
|
Inside GHC, we use the type `PackageId`, which is a `FastString`. The (Z-encoding of) `PackageId` prefixes each
|
|
|
external symbol in the generated code, so that the modules of one package do not clash with those of another package,
|
|
|
even when the module names overlap.
|
|
|
</td></tr></table>
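The relationship between these identifiers can be sketched in a few lines of Haskell. This is only an illustration: the types below are simplified stand-ins, not the real definitions in `Distribution.Package` or GHC; `mkInstalledPackageId` and its `suffix` argument are hypothetical; and `zEncode` covers only a small subset of the Z-encoding (other characters are passed through unchanged).

```haskell
import Data.List (intercalate)

-- Simplified stand-ins for the identifiers in the tables above.
-- These are NOT the real Cabal or GHC definitions.
newtype PackageName = PackageName String
  deriving (Eq, Ord, Show)

data PackageIdentifier = PackageIdentifier
  { pkgName    :: PackageName
  , pkgVersion :: [Int]
  } deriving (Eq, Ord, Show)

newtype InstalledPackageId = InstalledPackageId String
  deriving (Eq, Ord, Show)

-- Render a PackageIdentifier as "name-1.2.3".
display :: PackageIdentifier -> String
display (PackageIdentifier (PackageName n) v) =
  n ++ "-" ++ intercalate "." (map show v)

-- Hypothetical helper: build an InstalledPackageId by appending a
-- suffix (chosen at registration time) to the PackageIdentifier string.
mkInstalledPackageId :: PackageIdentifier -> String -> InstalledPackageId
mkInstalledPackageId pid suffix =
  InstalledPackageId (display pid ++ "-" ++ suffix)

-- A partial Z-encoding, enough for typical package identifiers.
-- Assumption: only these five characters are handled; the full
-- encoding used by GHC covers many more.
zEncode :: String -> String
zEncode = concatMap enc
  where
    enc 'z' = "zz"
    enc 'Z' = "ZZ"
    enc '.' = "zi"
    enc '-' = "zm"
    enc '_' = "zu"
    enc c   = [c]
```

Under these assumptions, `zEncode (display (PackageIdentifier (PackageName "base") [4,1,0,0]))` yields `"basezm4zi1zi0zi0"`, which is the kind of string that would prefix external symbols.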
|
|
|
|
|
|
## Design constraints
|
|
|
|
|
|
The tools do not currently support having multiple packages with the same name and version. When re-installing an existing package, the new package should have a different `InstalledPackageId` from the previous version, even if the `PackageIdentifier`s are the same. In this way, we can detect when a package is broken because one of its dependencies has been recompiled and re-installed.
|
|
|
1. We want [Commentary/Compiler/RecompilationAvoidance](commentary/compiler/recompilation-avoidance) to work. This means that symbol names should not contain any information that varies too often, such as the ABI hash of the module or package. The ABI of an entity should depend only on its definition, and the definitions of the things it depends on.
|
|
|
|
|
|
## Design constraints
|
|
|
1. We want to be able to detect ABI incompatibility. If a package is recompiled and installed over the top of the old one, and the new version is ABI-incompatible with the old one, then packages that depended on the old version should be detectably broken using the tools.
|
|
|
|
|
|
1. ABI compatibility:
|
|
|
|
|
|
- We want repeatable compilations. Compiling a package with the same inputs should yield the same outputs.
|
|
|
- Furthermore, we want to be able to make compiled packages that expose an ABI that is compatible (e.g. a superset)
|
|
|
of an existing compiled package.
|
|
|
- Modular upgrades: we want to be able to upgrade an existing package without recompiling everything that depends
|
|
|
on it, by ensuring that the replacement is ABI-compatible.
|
|
|
- Shared library upgrades. We want to be able to substitute a new ABI-compatible shared library for an old one, and all the existing binaries linked against the old version continue to work.
|
|
|
- ABI compatibility is dependent on GHC too; changes to the compiler and RTS can introduce ABI incompatibilities. We
|
|
|
guarantee to only make ABI incompatible changes in a major release of GHC. Between major releases, ABI compatibility
|
|
|
is ensured; so for example it should be possible to use GHC 6.12.2 with the packages that came with GHC 6.12.1.
|
|
|
|
|
|
|
|
|
Right now, we do not have repeatable compilations, so while we cannot do (3), we keep it in mind.
|
|
|
|
|
|
## The Plan
|
|
|
|
|
|
|
|
|
We need to talk about some more package Ids:
|
|
|
|
|
|
- `PackageSymbolId`: the symbol prefix used in compiled code.
|
|
|
- `PackageLibId`: the package Id in the name of a compiled library file (static and shared).
|
|
|
|
|
|
### Detecting ABI incompatibility
|
|
|
|
|
|
- in the package database, dependencies specify the `InstalledPackageId`.
|
|
|
|
|
|
- The package database will contain at most one instance of a given package/version combination. The tools
|
|
|
are not currently able to cope with multiple instances (e.g. GHC's -package flag selects by name/version).
|
|
|
|
|
|
- If, say, package P-1.0 is recompiled and re-installed, the new instance of the package will almost
|
|
|
certainly have an ABI that is incompatible with that of the previous instance. We give the new package a distinct
|
|
|
`InstalledPackageId`, so that packages that depend on the old P-1.0 will now be detectably broken.
|
|
|
|
|
|
- `PackageSymbolId`: We do not use the `InstalledPackageId` as the symbol prefix in the compiled code, because
|
|
|
that interacts badly with [Commentary/Compiler/RecompilationAvoidance](commentary/compiler/recompilation-avoidance). Every time we pick a
|
|
|
new unique `InstalledPackageId` (e.g. when reconfiguring the package), we would have to recompile
|
|
|
the entire package. Hence, the `PackageSymbolId` is picked deterministically for the package, e.g.
|
|
|
it can be the `PackageIdentifier`.
|
|
|
|
|
|
- `PackageLibId`: we do want to put the `InstalledPackageId` in the name of a library file, however. This allows
|
|
|
ABI incompatibility to be detected by the linker. This is important for shared libraries too: we
|
|
|
want an ABI-incompatible shared library upgrade to be detected by the dynamic linker. Hence,
|
|
|
`PackageLibId` == `InstalledPackageId`.
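The detection scheme above can be sketched as follows. This is a minimal model, not the `ghc-pkg` implementation: the record `InstalledPackage`, its fields, and the functions `register` and `brokenPackages` are hypothetical names, and the database is modelled as a plain list holding at most one instance per name/version.

```haskell
-- Hypothetical model of a package database entry.
type InstalledPackageId = String   -- treated as opaque

data InstalledPackage = InstalledPackage
  { ipId      :: InstalledPackageId     -- unique per installation
  , ipSource  :: String                 -- name/version, e.g. "P-1.0"
  , ipDepends :: [InstalledPackageId]   -- dependencies by InstalledPackageId
  } deriving (Eq, Show)

type PackageDB = [InstalledPackage]

-- Registering a package replaces any existing instance with the same
-- name/version, so the database holds at most one instance of each.
register :: InstalledPackage -> PackageDB -> PackageDB
register p db = p : filter ((/= ipSource p) . ipSource) db

-- A package is broken if it depends on an InstalledPackageId that is
-- no longer present in the database.
brokenPackages :: PackageDB -> [(String, [InstalledPackageId])]
brokenPackages db =
  [ (ipSource p, missing)
  | p <- db
  , let missing = filter (`notElem` present) (ipDepends p)
  , not (null missing)
  ]
  where
    present = map ipId db
```

In this model, re-registering P-1.0 under a fresh `InstalledPackageId` leaves every package that recorded the old id in its `ipDepends` reported by `brokenPackages`, while the symbol prefix (derived from `ipSource`, not `ipId`) is unaffected.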
|
|
|
|
|
|
### Allowing ABI compatibility
|
|
|
|
|
|
1. We want RecompilationAvoidance to work. So that means symbol names should not contain any information that varies too often, such as the ABI hash of the module, or the package.
|
|
|
- The simplest scheme is to have an identifier for each distinct ABI, e.g. a pair of the package name and an integer
|
|
|
that is incremented each time an ABI change of any kind is made to the package. The ABI identifier
|
|
|
is declared by the package, and is used as the `PackageSymbolId`. Since packages with the same ABI identifier
|
|
|
are ABI-compatible, the `PackageLibId` can be the same as the `PackageSymbolId`.
|
|
|
|
|
|
1. We want to be able to compile a package that is compatible with another package; i.e. exports the same ABI. Right now it isn't possible to do this, but we hope to be able to do it in the future, and we should design the system with that in mind.
|
|
|
- The previous scheme does not allow ABI-compatible changes (e.g. ABI extension) to be made. Hence, we could
|
|
|
generalise it to a major/minor versioning scheme.
|
|
|
|
|
|
1. When a package is recompiled and installed, packages that depended on the old version should now be detectably broken (unless the newly compiled version is really compatible with the old one).
|
|
|
- the ABI major version is as before, the package name + an integer. This is also the `PackageSymbolId`.
|
|
|
- the ABI minor version is an integer that is incremented each time the ABI is extended in a compatible way.
|
|
|
- package dependencies in the database specify the major+minor ABI version they require, in addition to the
|
|
|
`InstalledPackageId`. They may be satisfied by a greater minor version; when upgrading a package with an
|
|
|
ABI-compatible replacement, ghc-pkg updates dependencies to point to the new `InstalledPackageId`.
|
|
|
- `PackageLibId` is the major version. In the case of shared libraries, we may name the library using the
|
|
|
major + minor versions, with a symbolic link from the major version to major+minor.
|
|
|
- the shared library `SONAME` is the major version.
|
|
|
|
|
|
- The previous scheme only allows ABI-compatible changes to be made in a linear sequence. If we want a tree-shaped
|
|
|
compatibility structure, then something more complex is needed (ToDo).
|
|
|
|
|
|
(3) means that dependencies in the package database should mention something unique about a package installation that changes when the package is installed. However, (1) means that we don't want to put such unique things in symbol names.
|
|
- The previous schemes only allow compatible ABI changes to be made. If we want to allow incompatible changes to be
|
|
|
made, then we need something like ELF's symbol versioning. This is probably overkill, since we will be making
|
|
|
incompatible ABI changes in the compiler and RTS at regular intervals anyway, so long-term ABI compatibility is
|
|
|
impractical at this stage.
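The major/minor ABI versioning scheme described above can be sketched as a small compatibility check. The type `AbiVersion` and the functions below are hypothetical and exist only to illustrate the rule; the string formats chosen for the ids are likewise assumptions, not something the plan specifies.

```haskell
-- Hypothetical representation of a declared ABI version.
data AbiVersion = AbiVersion
  { abiPackage :: String  -- package name
  , abiMajor   :: Int     -- incremented on any incompatible ABI change
  , abiMinor   :: Int     -- incremented on each compatible ABI extension
  } deriving (Eq, Show)

-- A dependency on `required` is satisfied by `provided` when the major
-- ABI version matches exactly and the minor version is at least as large.
satisfies :: AbiVersion -> AbiVersion -> Bool
satisfies provided required =
     abiPackage provided == abiPackage required
  && abiMajor provided   == abiMajor required
  && abiMinor provided   >= abiMinor required

-- The symbol prefix depends only on the major version (package name
-- plus an integer), so compatible extensions do not change symbol names.
-- Assumption: the "name-N" rendering is illustrative only.
packageSymbolId :: AbiVersion -> String
packageSymbolId v = abiPackage v ++ "-" ++ show (abiMajor v)

-- In this scheme the library id is also the major version.
packageLibId :: AbiVersion -> String
packageLibId = packageSymbolId

-- A shared library may additionally carry the minor version in its
-- name, with a link from the major-only name pointing at it.
sharedLibId :: AbiVersion -> String
sharedLibId v = packageLibId v ++ "." ++ show (abiMinor v)
```

For example, under these assumptions `satisfies (AbiVersion "P" 1 3) (AbiVersion "P" 1 2)` holds, whereas a provider with major version 2 does not satisfy a requirement on major version 1. Because the check is a simple ordering on the minor version, it only models the linear compatibility sequence, not a tree-shaped one.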