The Gitolite Switch
In case of TLDR, go straight to #Developerchanges.
Currently, the developer setup on ghc.haskell.org is a bit complicated and unfortunate. Most importantly, it hosts all of the Git repositories we use. When a developer for GHC is inducted, we:
Create them a user (i.e. a full shell account) on
Add their SSH public key
Add them to the darcs group (which owns the canonical, public-facing Git repositories).
Then, that user can clone from the server over SSH, and also push to the repositories directly with their new permissions.
This unfortunately has some downsides:
Every user needs a full shell account. While we probably won't be forkbombed by someone, few people actually need a full shell account, and the principle of least privilege applies here. Really, everybody is just pushing to Git.
Because of the previous point, group and access permissions on the repositories regularly get screwed up, causing situations where people don't have access (and can't push), or where we have to have post-receive hooks that modify the permissions. Both of these suck (this doesn't affect e.g. the Linux kernel developers, who have pull-based development models, because they can afford to).
- Due to the current permission scheme, all users in the darcs group are effectively at the level of Trac admins, which can manipulate the trac.db database (this is needed by the Git hooks to update the Trac tickets). Again, the principle of least privilege should apply here.
- Moreover, people invariably fix this but it's always slightly patchy, and so the repositories that need 'fixing' for things like permissions are inconsistent, and it's hard to keep track of what needs to be maintained.
All users can willy nilly create (and delete!) tags and branches, and perform some risky Git operations. Ideally, only release maintainers should have permission to do things like cut a release tag.
Git admins need to perform risky direct manipulations in the file-system, even for the simplest Git repository administration tasks. Also, Git hook scripts are not centrally managed right now, but placed individually in each git repository that needs them; thus to fix/improve a git hook script, one needs to update all git repositories using the affected script.
The proposed remedy is to use the Git-access wrapper Gitolite, which provides an authorization layer on top of Git and is executed as a separate system user (thus accessing the Git repositories with only one Unix UID). Users accessing Git repositories via SSH are distinguished by their SSH public key.
Gitolite also greatly simplifies user management, as managing a user is little more than adding/removing a file containing the user's public key and pushing that change to the administrative "gitolite-admin" Git repository (Gitolite is administered via Git itself!). Similarly, adding a new Git repository comes down to adding a few lines to the central Gitolite repository config file and pushing that file to the "gitolite-admin" Git repository.
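As a rough sketch of what such a change could look like, the following script builds a stand-in for a gitolite-admin working tree in a scratch directory. The user names (alice, bob), the groups (@committers, @release-managers), the placeholder key and the repository name ghc are all assumptions made for illustration. The rules use Gitolite's permission/refex syntax, but they are not the rules actually deployed.

```shell
#!/bin/sh
# Sketch only: builds a local stand-in for a gitolite-admin checkout.
set -eu

admin_dir=$(mktemp -d)
mkdir -p "$admin_dir/conf" "$admin_dir/keydir"

# One file per user in keydir/; the file name (minus .pub) is the Gitolite
# user name. "alice" and the key material are hypothetical placeholders.
printf '%s\n' 'ssh-rsa AAAAB3...placeholder alice@example.org' \
    > "$admin_dir/keydir/alice.pub"

# Repositories and access rules live in conf/gitolite.conf.
# Rules are checked in order: release managers may create and rewind tags,
# other committers are denied on refs/tags/ but may push everything else.
cat > "$admin_dir/conf/gitolite.conf" <<'EOF'
@release-managers = alice
@committers       = alice bob

repo ghc
    RW+ refs/tags/  = @release-managers
    -   refs/tags/  = @committers
    RW              = @committers
    R               = @all
EOF

# In a real clone of gitolite-admin the change would then be published with:
#   git add conf/gitolite.conf keydir/alice.pub
#   git commit -m "Add alice; add ghc repository"
#   git push
echo "$admin_dir"
```

The deny rule on refs/tags/ is one possible way to express "only release maintainers cut release tags"; whether it matches the final policy is an open choice for the admins.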
Below are some notes about how we (Austin & Herbert) would like to go about doing this.
Ideally, most of the new setup can be built alongside the existing one without disturbing it. Presumably 'the big switch' can happen in an hour or so of downtime, in which we take the old URIs offline, bring Gitolite online and tell people that this is the time to fix their push URLs.
The git.haskell.org DNS CNAME alias for ghc.haskell.org has not been activated yet. If you want to try out the new smart-protocol http:// URLs, you can append "126.96.36.199 git.haskell.org" to your /etc/hosts file. For the non-http:// protocols, you can just substitute ghc.haskell.org in the following instructions.
Anonymous read access
No immediate action needed
The old http://darcs.haskell.org/<repo-name>.git Git URLs will continue to work (still using the "dumb" HTTP protocol) for some time.
New clones (git clone) will use the new http://git.haskell.org/<repo-name>.git Git URLs, which provide the much faster "smart" Git/HTTP protocol. Access via git://git.haskell.org/<repo-name>.git is now provided as well.
To rewrite a checked-out tree to the new access URLs, you can use the sync-all script, e.g.:
./sync-all -r git://git.haskell.org remote set-url origin
Authenticated read/write access
For developers (with push permissions) who have already checked out repositories, the only change needed is to go over their repositories and update their git uris from
This can be accomplished by invoking the sync-all script. But before you do this, you need to make sure you have the latest version of sync-all as well - a few tiny changes were needed.
Many developers use older trees that may not have been updated before the switch. But never fear, it's quite easy to fix everything, using the GitHub mirror.
# make sure you're on a clean copy of master without any outstanding changes
git checkout master
# get the latest version of `./sync-all` from GitHub
git pull https://github.com/ghc/ghc.git master
./sync-all -r git://git.haskell.org remote set-url origin
./sync-all -r ssh://email@example.com remote set-url --push origin
This will correctly set up both push and pull URLs (use http:// instead of git:// if you're behind a firewall blocking port 9418).
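For a single repository, the effect of the two sync-all invocations should be roughly that of the plain git commands below. The sketch runs against a throwaway repository so that it needs no network access. The push URL ssh://git@git.example.org/ghc.git is a hypothetical placeholder, not the real address, and the repository name ghc is assumed.

```shell
#!/bin/sh
# Sketch only: split fetch/push URLs for one repository, using plain git.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Start from an old-style remote, as an existing checkout would have.
git remote add origin http://darcs.haskell.org/ghc.git

# Fetch anonymously over the git:// protocol ...
git remote set-url origin git://git.haskell.org/ghc.git
# ... and push over SSH (placeholder host, see the note above).
git remote set-url --push origin ssh://git@git.example.org/ghc.git

# Read the result back from the repository configuration.
fetch_url=$(git config --get remote.origin.url)
push_url=$(git config --get remote.origin.pushurl)
echo "fetch: $fetch_url"
echo "push:  $push_url"
```

Running git remote -v in a real checkout afterwards is a quick way to confirm that the fetch and push lines differ as intended.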
Additional user-visible changes
SSH public keys for Git access are separate from shell accounts, and will be managed by the Gitolite admins (the initial public keys will be populated from the currently authorized shell accounts'
Developers no longer need a shell account to push to Git repositories; so ideally, as ghc.haskell.org hosts the critical Trac and Git resources, only *required* shell accounts should remain.
'ssh git@…' shows list of currently accessible repositories (+ respective ACLs)
Optionally later-on: users can manage their pubkeys via "sskm" (self-service key management)
Setup (to be done by the Admins)
apt-get install Gitolite & dpkg-reconfigure (Debian 7 ships with Gitolite version 2.3 in its main pool); dpkg-reconfigure will ask about:
Set up Gitolite's permissions (-> umask!) in such a way that Gitolite-owned repositories can only be modified by Gitolite's git user, and that Gitolite-owned Git repositories remain world readable.
old git-repository locations in /home/darcs are made symlinks to the new Gitolite-owned Git repositories (which are to be moved to /home/git/repositories and permission-reset by Gitolite's management scripts)
git group and vice versa, so that Trac and Gitolite can interact with each other.
Git hook scripts need to be made Gitolite-friendly
Populate Gitolite users & SSH pubkeys from the ~/.ssh/authorized_keys files of members of
Register new CNAME
- HTTP vhost
  - new HTTP vhost git.haskell.org serving /home/git/repositories (optionally enable smart git http transport); Git uris will be
  - Cgit could be used as front-page on '/', cf. http://www.kernel.org/
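The permission and symlink steps in the list above can be tried out in a scratch directory first. The sketch below assumes that a umask of 0022 is what gives "writable by the owner only, readable by everyone", and uses directories under a temporary root as stand-ins for /home/git/repositories and /home/darcs; the repository name ghc.git is likewise only an example. In Gitolite 2.x the corresponding setting should be $REPO_UMASK in the rc file, but that is worth checking against the installed version.

```shell
#!/bin/sh
# Sketch only: umask and symlink behaviour, tried in a scratch directory.
set -eu

root=$(mktemp -d)

# New files and directories drop group/other write permission under 0022.
umask 0022
mkdir -p "$root/git/repositories/ghc.git" "$root/darcs"
touch "$root/git/repositories/ghc.git/HEAD"

# Capture the permission strings (first 10 columns of ls -l output).
dir_mode=$(ls -ld "$root/git/repositories/ghc.git" | cut -c1-10)
file_mode=$(ls -l "$root/git/repositories/ghc.git/HEAD" | cut -c1-10)

# The old location becomes a symlink to the Gitolite-owned repository.
ln -s "$root/git/repositories/ghc.git" "$root/darcs/ghc.git"
link_target=$(readlink "$root/darcs/ghc.git")

echo "$dir_mode $file_mode $link_target"
```

Existing repositories moved into place keep whatever modes they already have, so the umask only covers newly created files; the one-off permission reset mentioned above is still needed.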
See #8121 (closed)
Tangential: should we deprecate the darcs.haskell.org URL? Who uses it? The name was known to be a funny misnomer from the Git switchover times, but As Far As Austin Knows, only GHC developers really use it these days. Perhaps we could just retire it.
- Austin: both nhc and yhc use it, so Malcolm and Neil will need to be asked, at least.
Herbert: We can leave darcs.haskell.org as-is for legacy reasons, while creating a new vhost on http://git.haskell.org/ which would expose only the Gitolite-owned Git repositories (and maybe also a Cgit front-page as e.g. on http://git.kernel.org)
Who's actively committing, and does anybody beyond that actually need a shell account? It's unclear who uses ghc.haskell.org for what at the moment, beyond push access.
Tangential: The current directory setup is a total mess on darcs.haskell.org, especially since the old darcs repos hang around there (having both testsuite.git is confusing.) Maybe we should clean it up if we're going to use it for a browseable directory.
Austin Seipp (thoughtpolice) and Herbert Valerio Riedel (hvr) can be contacted about details or specifics. If you encounter any issues, you can also comment on #8121 (closed).