Package Reorg
In this page we collect proposals and design discussion for reorganising the packages that come with compilers, and the contents of those packages.
None of the ideas herein are claimed to belong to any particular person, many of the ideas have been extracted from mailing list discussions, eg.
http://www.haskell.org/pipermail/libraries/2006-November/006396.html
Some of the points are GHC-specific. Please feel free to insert points specific to other compilers.
Goals
- It would be good to have set of 'core' packages that is installed with every Haskell implementation. More on this at PackageReorg/Rationale page
- Forwards compatibility. Users would like their programs written against the 'core' packages to continue to work, without modification to source text or build system, after upgrading the compiler, or its packages, or switching to a different compiler.
- Backwards compatibility. Users would like to be able to take a program written against some version of the 'core' packages, and build it with an older compiler, accepting that they may have to install newer versions of the 'core' packages in order to do so.
It may not be possible to fully achieve these goals (in particular, backwards compatibility), but that does not mean we should not aim for them.
Proposal
Here's a straw-man proposal
-
There is a set of packages that come with every conforming Haskell implementation. Let's call these the Core Packages to avoid confusion (Bulat called these the "base packages", but that's an over-used term given that there is a package called
base
). The good thing about the Core Packages is that users know that they will be there, and they are consistent with each other. -
Any particular implementation may install more packages by default; for example GHC will install the
template-haskell
andstm
packages. Let's call these the GHC Install Packages, Hugs Install Packages etc; the Install Packages are a superset of the Core Packages.
What is in the Core Packages?
The Core Packages are installed with every conforming Haskell implementation. What should be in the Core? There is a tension:
- As much as possible; which means in practice widely-used and reasonably stable packages. It is convenient for programmers to have as much as possible in a consistent, bundle that is (a) known to work together bundle, and (b) known to work on all implementations.
- As little as possible; which in practice means enough to run Cabal so that you can run the Setup files that come when downloading new packages. As Ian puts it: the less we force the implementations to come with, the quicker compilation will be when developing, the smaller Debian packages (for example) can be, the lower the disk space requirements to build GHC, the lower the time wasted when a Debian package (for example) build fails and the fewer packages we are tangling up with compiler release schedules.
There's a real choice here: Bulat wants (1) and Ian wants (2).
Initial stab at (1):
base
Cabal
haskell98
- Some
regex
packages (precisely which?) -
unix
orWin32
. Questionable, partly because it means the Core interface becomes platform-dependent; and partly becauseWin32
would double the size of the Hugs distribution. parsec
mtl
time
network
-
QuickCheck
(questionable) -
HUnit
(questionable)
Initial stab at (2):
base
haskell98
Cabal
-
filepath
(?)
Bulat: i think that all regex packages should be included and of course libs that helps testing. overall, it should be any general-purpose lib that porters accept (enlarging this set makes users live easier, and porters live harder)
about unix/win32 - these libs provide access to OS internals, not some everywhere-portable API. moreover, other world-interfacing libs (i/o, networking) should use APIs provided by these libs with a conditional compilation (CPPery) tricks in order to provide portable APIs! current situation where such libs use FFI isn't ideal. WinHugs size problem is rather technical - it includes a lot of DLLs which contains almost the same code
i agree to start with minimal stub, and then proceed with discussing inclusion of each library. what we need now is requirements to include library in this set and lifetime support procedure. so:
Requirements to libraries to be included in core set
- BSD-licensed, and even belongs to Haskell community?
- portable (is sense of compiler and OS), may be just Haskell' compatible?
- already widely used
- shouldn't duplicate existing core libs functionality (?)
Exact inclusion, support and exclusion processes?
The base package
The base package is a bit special
-
Package
base
is rather big at the moment. -
From a user's point of view it would be nicer to give it a compiler-independent API. (A module like
GHC.Exts
would move to a new packageghc-base
.)
Thinking of GHC alone for a moment, we could have a package ghc-base
(which is pretty much the current base
) and a thin wrapper package
base
that re-exposes some, but not all, of what ghc-base
exposes.
To support this re-exposing, we need a small fix to both GHC and
Cabal, but one that is independently desirable.
Similarly, Hugs could build hugs-base
from the same souce code, by
using CPP-ery, exactly as now. The thin base
wrapper package
would not change.
To make base
smaller, we could remove stuff, and put it into
separate packages. But be careful: packages cannot be cyclic, so
anything that is moved out can't be used in base
.
Some chunks that would currently be easy to split off are:
- Data.ByteString.* (plus future packed Char strings)
- Control.Applicative (?), Data.Foldable, Data.Monoid (?), Data.Traversable, Data.Graph, Data.IntMap, Data.IntSet, Data.Map, Data.Sequence, Data.Set, Data.Tree
- System.Console.GetOpt
- Text.PrettyPrint.*
- Text.Printf
Some other things, such as arrays and concurrency, have nothing else depending on them, but are so closely coupled with GHC's internals that extracting them would require exposing these internals in the interface of base
.
Bulat: my ArrayRef library contains portable implementation of arrays. there is only thin ghc/hugs-specific layer which should be provided by ghcbase/hugsbase libs. except for MPTC problem (IArray/MArray classes has multiple parameters), this library should be easily portable to any other haskell compiler
See also BaseSplit.
Other packages
Other non-core packages would probably have their own existence. That
is, they don't come with an implementation; instead you use
cabal-get
, or some other mechanism, such as your OS's package
manager. Some of these currently come with GHC, and would no longer do
so
GLUT
ALUT
OpenAL
OpenGL
HGL
HUnit
ObjectIO
X11
arrows
cgi
fgl
html
xhtml
Bulat: i propose to unbundle only graphics/sound libs because these solves particular problems and tends to be large, non-portable (?) and some are just legacy ones - like ObjectIO. we should keep everything small & general purpose, including HUnit, arrows, fgl, html and xhtml, and include even more: ByteString, regex-*, Edison, Filepath, MissingH, NewBinary, QuickCheck, monads
Testing
We should separate out package-specifc tests, which should be part of the repository for each package. Currently they are all squashed together into the testsuite repository.
Implementation-specific notes
Notes about GHC
Currently GHC installs a set of packages by default, the so-called
GHC Boot Packages. They are graphed here, with arrows representing
dependencies between them:
These are exactly the libraries required to build GHC. That shouldn't be the criterion for the core packages.
One reason we do this is because it means that every GHC installation can build GHC. Less configure-script hacking. (NB: even today if you upgrade any of these packages, and then build GHC, the build might fail because the CPP-ery in GHC's sources uses only the version number of GHC, not the version number of the package.)
Still, for convenience we'd probably arrange that the GHC Install Packages included all the GHC Boot Packages.
Every GHC installation must include packages: base
, ghc-prim
,
integer
and
template-haskell
, else GHC itself will not work. (In fact
haskell98
is also required, but only because it is linked by
default.)
So GHC's Install Packages would be the Core Packages plus
template-haskell
editline
integer
ghc-prim
You can upgrade any package, including base
after installing GHC.
However, you need to take care. You must not change a number of things
that GHC "knows about". In particular, these things must not change
- Name
- Defining module
GHC knows even more about some things, where you must not change
- Type signature
- For data types, the names, types, and order of the constructors
The latter group are confined to packages base and template-haskell.
(Note: a few other packages are used by tests in GHC's test suite,
currently: mtl
, QuickCheck
. We should probably eliminate the mtl
dependency; but QuickCheck
is used as part of the test infrastructure
itself, so we'll make it a GHC Boot Package.)
Notes about Hugs
Recent distributions of Hugs come in two sizes, jumbo and minimal.
Minimal distributions include only the packages base
, haskell98
and Cabal
.
(Hugs includes another package hugsbase
containing interfaces to Hugs primitives.)
The requirements for this set are to
- run Haskell 98 programs
- allow packages to be added and upgraded using Cabal
(Currently cpphs
is a Haskell 98 program, so the latter implies the former.)
It should be possible to upgrade even the core packages using Cabal.