Commit 5fb764ff authored by Ben Gamari

Infrastructure status update

parent 27b31891
---
title: GHC Infrastructure Update
author: Ben Gamari
date: 2019-04-03T12:51:28
tags: status, infrastructure
---
Around five months ago I proposed that we undertake what became a comprehensive
rebuild of GHC's infrastructure. Since November we have been quietly working behind
the scenes to make this new infrastructure a reality. This has been a massive
project, but I'm happy to say that we are now emerging on the other side and
are very pleased with the result.
In this post I want to describe why this project was
needed, what it has entailed, and where it has brought us. Enjoy!
# Motivation
Most of our users are aware that GHC is an old project: the [earliest
release](https://haskell.org/ghc/download_ghc_029.shtml) I have found is GHC
0.29, released nearly 23 years ago.
In addition to being old, GHC is also a large project and, like most large
projects, has a significant amount of infrastructure to keep things moving
smoothly. Before the migration this infrastructure included:
* [gitolite](https://gitolite.com/gitolite/), our repository hosting service (`git.haskell.org`)
* home-grown infrastructure for maintaining git mirrors to and from GitHub
* our homepage (`haskell.org/ghc`)
* [Trac](https://trac.edgewall.org/), our issue tracker and wiki (`ghc.haskell.org/trac/ghc`)
* a number of home-grown linting scripts for ensuring code quality and
preventing mistakes
* [Phabricator](https://phabricator.com/), our code review tool (`phabricator.haskell.org`)
* our continuous integration services (Phabricator and more recently
CircleCI/Appveyor)
While this all worked reasonably well it was not without its share of pain-points.
These issues ranged from minor (e.g. a
constant trickle of effort going towards maintaining consistency between
Phabricator, `git`, and Trac) to serious (e.g. our servers being stuck on a
soon-to-be-deprecated Debian release; a need for constant fiddling to keep CI
builders running). Nevertheless, none of these issues seemed significant enough
to force any immediate change.
This calculation changed around twelve months ago, when we received word that
Rackspace would be ending its open-source software program, which had graciously provided
hosting for our servers for the last six years (thank you, Rackspace!).
Our first inclination was to simply rebuild GHC's existing services on a new
hosting provider. However, as I began this process in the summer of 2018 the scale of
the challenge became apparent. Not only was our primary server an organic mix
of scripts and configuration with varying degrees of documentation, but some of our services ranged
from unsupported (e.g. Phabricator, which no longer supported non-paying
customers), to infrequently maintained (e.g. many of the plugins
used by our Trac instance have not had a release in four years or more),
to obsolete (e.g. we still used gitolite v2, which has been
deprecated for years).
In light of this, it seemed clear that simply rebuilding our existing
infrastructure would prove to be a mistake: despite requiring a
significant investment in resources, the rebuild would solve none of the
friction that our existing infrastructure incurred and would likely once again
devolve into a difficult-to-maintain jungle of configuration. If there was
ever a time to change, it was now.
However, to complicate matters we were already deep in a rebuild of our continuous
integration infrastructure, building upon CircleCI and Appveyor, to
support our new six-month release cadence. While this change was in many ways a
great improvement over our previous CI infrastructure, it also introduced a
number of integration challenges (e.g. tying build results back to
Phabricator) that we were still in the process of trying to solve.
# Planning the migration
To replace our infrastructure there were two serious contenders: GitHub and
GitLab. Happily, both options would move us towards a more git-centric
contribution workflow, addressing one of the greatest concerns that potential
contributors expressed in our development priorities survey last fall.
While other projects (namely Rust) have demonstrated that it is possible
to maintain a large-scale open-source project on GitHub, it was far from clear
that GHC could pull this off with our comparatively limited resources and
significantly larger legacy migration needs (e.g. we concluded early on that any
migration must faithfully preserve GHC's ticket history, including ticket
numbers).
In addition, GitLab had been successfully adopted by
[GNOME](https://www.gnome.org/) and
[freedesktop.org](https://freedesktop.org/), with [KDE](https://kde.org/)
making motions towards a migration as well.
For these and other reasons that are well-covered
[elsewhere](https://mail.haskell.org/pipermail/ghc-devs/2018-October/016425.html)
GitLab seemed like a better fit for GHC's needs.
By early November 2018 there was consensus to move ahead with migrating
GHC to GitLab. To keep the migration manageable, we carefully limited the
[project's scope](https://gitlab.haskell.org/ghc/ghc/wikis/git-lab-migration)
to migrate code review, repository hosting, ticket tracking, and the wiki,
leaving any continuous integration migration for future work.
# Starting work
In mid-November we started work on the migration as two parallel efforts:
* *phase 1*: migrating repository hosting and code review
* *phase 2*: migrating ticket tracking and the wiki
Phase 1 was intended to be a small project, allowing us to migrate quickly to
our new code review platform and begin the process of decommissioning our old
systems.
Phase 2 was significantly riskier, involving the migration of 16,000 tickets,
carrying over 100,000 comments of human-written markup. Thankfully, we had the
benefit of being able to [build
upon](https://gitlab.haskell.org/bgamari/trac-to-remarkup) Matthew Pickering's
[previous](https://mail.haskell.org/pipermail/ghc-devs/2017-January/013500.html)
prototype infrastructure for migrating Trac tickets to Maniphest, Phabricator's
issue tracker.
# Phase 1 and scope creep
It is sometimes said that no plan survives first contact with reality; GHC's
migration plan was no exception. In early December 2018 we were
[notified](https://mail.haskell.org/pipermail/ghc-devops-group/2018-December/000273.html)
that imminent changes to CircleCI's pricing model would essentially preclude
further use of the platform. While GitLab provided us with a convenient
alternative to CircleCI, this development significantly enlarged the
previously carefully-bounded scope of our migration plan.
While CircleCI generously provided us with a two-week extension to our (free)
CI plan, the weeks that followed were a scramble to rebuild our CI
infrastructure before support vanished.
Regardless, by late December we had finished phase 1 of the migration, including
rudimentary CI support, and designated <https://gitlab.haskell.org/ghc/ghc>
as GHC's official upstream source repository.
Throughout this process we were lucky to have the support of GitLab's director of
community relations, David Planella. David has been an invaluable resource,
helping us plan our migration and quickly draw attention to the occasional bug
report.
Moreover, GitLab was remarkably responsive to the pain-points that we
encountered in our workflow. While many examples can be found via the GHC
migration [tracking
ticket](https://gitlab.com/gitlab-org/gitlab-ce/issues/55039), one in
particular stands out: soon after moving code review to GitLab we found
that our use of the "fast-forward only" merge strategy (necessary to preserve
bisectability), coupled with our long six-hour CI build times, resulted in a
[very poor patch merge
workflow](https://gitlab.com/gitlab-org/gitlab-ce/issues/55039#note_127882246).
While we adopted [marge-bot](https://github.com/smarkets/marge-bot) as a
near-term workaround, David and James of GitLab were happy to hear out and
reflect on our use-case, using our experience to design the [merge
train feature](https://gitlab.com/gitlab-org/gitlab-ee/issues/9186) that will be
available in a coming GitLab release.
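To make the cost of this workflow concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch under simplifying assumptions: every CI run takes the six hours quoted above, every run passes, and a queue can stack several merge requests into a single run. The function names and the batch size are hypothetical, and neither marge-bot nor GitLab's merge trains are modelled in any detail; in particular, real tools must also cope with a failing batch.

```python
import math

CI_HOURS = 6  # CI build time quoted in the post


def serial_merge_hours(n_requests: int, ci_hours: float = CI_HOURS) -> float:
    """Fast-forward-only merging, one request at a time.

    Each request must be rebased onto the new tip of the target branch and
    re-tested before it can land, so the CI runs cannot overlap.
    """
    return n_requests * ci_hours


def batched_merge_hours(
    n_requests: int, batch_size: int, ci_hours: float = CI_HOURS
) -> float:
    """Stack requests into batches and run CI once per batch.

    Assumes every batch passes; a failing batch would have to be split and
    re-tested, which this model ignores.
    """
    return math.ceil(n_requests / batch_size) * ci_hours


if __name__ == "__main__":
    print(serial_merge_hours(8))       # eight requests, one at a time
    print(batched_merge_hours(8, 4))   # the same requests, four per batch
```

Under these assumptions, eight queued merge requests take 48 hours to land one at a time but 12 hours in batches of four, which is why some form of queueing or batching matters so much when builds are long.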
# Wiki and ticket migration
Phase 2 of the migration involved migrating GHC's tickets and
wiki pages to GitLab. For the former we used a relatively straightforward
[parser](https://gitlab.haskell.org/bgamari/trac-to-remarkup/blob/master/src/Trac/Parser.hs)
of Trac's markup syntax and a simple
[pretty-printer](https://gitlab.haskell.org/bgamari/trac-to-remarkup/blob/master/src/Trac/Writer.hs).
While this worked well enough for ticket descriptions and comments, its
performance on wiki pages was unacceptably poor due to syntactic ambiguity and
the significantly richer markup used in wiki pages. To handle this we augmented
our conversion script with an HTML parser to back out Markdown from rendered
Trac wiki pages. While inelegant, this approach was significantly
better at preserving the rich markup found in many wiki pages.
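To give a flavour of what direct markup conversion involves, here is a small Python sketch that rewrites a few common Trac wiki constructs into Markdown, one line at a time. It is not the actual conversion tool (which is a Haskell parser and pretty-printer); the function name is hypothetical and only headings, bold, italics, inline code, and external links are handled.

```python
import re


def trac_line_to_markdown(line: str) -> str:
    """Convert one line of Trac wiki markup to Markdown (illustrative only)."""
    # Headings: "== Title ==" becomes "## Title".
    heading = re.fullmatch(r"\s*(=+)\s+(.*?)\s+=+\s*", line)
    if heading:
        return "#" * len(heading.group(1)) + " " + heading.group(2)

    # Inline code: {{{code}}} becomes `code`.
    line = re.sub(r"\{\{\{(.+?)\}\}\}", r"`\1`", line)
    # Bold must be handled before italics, since ''' contains ''.
    line = re.sub(r"'''(.+?)'''", r"**\1**", line)
    line = re.sub(r"''(.+?)''", r"*\1*", line)
    # External links: [http://url label] becomes [label](http://url).
    line = re.sub(r"\[(https?://\S+)\s+([^\]]+)\]", r"[\2](\1)", line)
    return line


if __name__ == "__main__":
    print(trac_line_to_markdown("== Building GHC =="))
    print(trac_line_to_markdown("See '''this''' and ''that'' in {{{ghc --make}}}"))
```

Even this toy version hints at the ambiguity problem: quote markers inside an inline code span would be rewritten incorrectly, and constructs that span multiple lines (tables, nested lists, code blocks) cannot be handled line by line at all.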
After several dry-run migrations in January and February the conversion
quality was deemed acceptable by early March, with the final migration being
carried out on 9 March 2019. This was a bit later than our goal of finishing
the final import by mid-February, but this wasn't surprising given the scale
of the task.
# Improving CI coverage
Testing the many configurations supported by GHC has been a challenge on each
of the CI platforms we have used. Most recently, while CircleCI offered many
benefits, it complicated testing of non-x86-64/Linux targets due to
the platform's limited operating system support, build time limits, and the
high cost of build time. In this respect the move to GitLab opened up a number
of opportunities.
Throughout February and March 2019 we focused on extending the CI
infrastructure we built in phase 1 to cover these non-standard configurations.
As of today, every GHC merge request is tested on over a dozen operating
system/architecture/build-parameter configurations, with binary distribution
artifacts produced and archived for most of these.
In addition to plain GHC builds, we have also realized the long-sought goal of
regularly testing GHC snapshots against user code. For this we have incorporated
Herbert Valerio Riedel's [head.hackage](http://github.com/hvr/head.hackage)
patch-set with Matthew Pickering's
[ghc-artefact-nix](https://github.com/mpickering/ghc-artefact-nix) Nix
expression. In conjunction with some glue logic this combination
gives us the ability to test CI-produced binary distributions against several
dozen Hackage packages, with more to be added in the future.
In addition to testing for regressions in correctness, our `head.hackage` CI
records compile-time and allocation metrics, allowing us to track compiler
performance on real-world Haskell code. We hope that this will provide better
insight into GHC's compile-time costs and expect to use this insight in future
work on improving compiler performance.
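As an illustration of how such metrics might be used, the following Python sketch compares a per-package metric (say, bytes allocated while compiling) between a baseline run and a candidate run and lists the packages that got noticeably worse. Everything here is an assumption made for the example: the function names, the dictionary-based data layout, and the 5% threshold are hypothetical and do not describe the actual `head.hackage` job.

```python
from typing import Dict, List


def relative_changes(
    baseline: Dict[str, float], candidate: Dict[str, float]
) -> Dict[str, float]:
    """Fractional change per package, for packages present in both runs."""
    return {
        pkg: (candidate[pkg] - baseline[pkg]) / baseline[pkg]
        for pkg in baseline
        if pkg in candidate and baseline[pkg] > 0
    }


def regressions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    threshold: float = 0.05,  # hypothetical tolerance: flag increases above 5%
) -> List[str]:
    """Names of packages whose metric grew by more than the threshold."""
    changes = relative_changes(baseline, candidate)
    return sorted(pkg for pkg, change in changes.items() if change > threshold)
```

A tolerance is needed because compile-time measurements in particular are noisy; allocation counts tend to be more stable and so can usually bear a tighter threshold.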
# Further automation
One of the major motivations for the move to GitLab was the promise of
consolidating and automating more project management tasks, removing
human bottlenecks and increasing GHC's bus factor. Towards this end we have
automated many of the more mundane aspects of GHC's infrastructure:
* CI-triggered builds of the [Docker
images](https://gitlab.haskell.org/ghc/ci-images) on which our CI processes
are built
* automated deployment of GHC [documentation snapshots](https://ghc.gitlab.haskell.org/ghc/doc/)
* automated generation and deployment of GHC's [website](http://gitlab.haskell.org/ghc/homepage) (including this blog)
* well-documented, version-controlled, and maintainable configuration for our
servers and CI runners built upon [NixOS](https://nixos.org/)
While each of these investments is small, we hope that making them now will pay
dividends in more development time to work on GHC itself in the years to come.
# Documenting the development process
The move to GitLab has required us to redesign many of the conventions and
protocols used in the course of GHC's development. In this process we have taken
the opportunity to more coherently document these conventions. From GHC's
[Working Conventions](https://gitlab.haskell.org/ghc/ghc/wikis/contributing#working-conventions) page, contributors will now find links to comprehensive
documentation describing GHC's [ticket
triage](https://gitlab.haskell.org/ghc/ghc/wikis/gitlab/issues) and [code
review](https://gitlab.haskell.org/ghc/ghc/wikis/gitlab/merge-requests)
protocols.
We have also rewritten our [newcomer's
documentation](https://gitlab.haskell.org/ghc/ghc/wikis/contributing#newcomers-to-ghc),
making it easier for someone new to GHC development to get from forking the
compiler to submitting a patch.
# What remains to be done
While the dust from the migration has started to settle, there is still
[plenty](https://gitlab.haskell.org/bgamari/gitlab-migration) to be done. While
the wiki import is done, there is still a great deal of cleanup that remains.
If you have a few idle minutes feel free to [browse
around](https://gitlab.haskell.org/ghc/ghc/wikis) looking for import mistakes and spurious Trac references.
There is also much to be done to further improve the state of GHC's
continuous integration jobs. From fixing broken tests on Windows, to contributing
a FreeBSD builder, to making the `head.hackage` job output more legible, there
are plenty of ways in which we would appreciate help.
Additionally, for the 8.10 release cycle we would like to greatly increase the size of the
package set tested by `head.hackage` and automate the publication of the
`head.hackage.org` package repository. This will allow all users to easily test
their packages against GHC snapshots and prereleases and will further shrink
GHC's development feedback cycle.
Finally, one of the casualties of the GitLab migration has been the 8.8 release
schedule, which was originally slated to culminate with the 8.8.1 release in mid-March.
However, this is a topic I will leave for another blog post.
# Closing thoughts
Needless to say, the last few months have been a whirlwind. However, we think
that the result is quite exciting. Not only is GHC's test
infrastructure both more reliable and thorough than it has ever been, but the
tools with which the project is developed, released, and maintained are more
sustainable and far more inviting to newcomers than they were only six short
months ago.
If you are interested in contributing to GHC, but so far have been intimidated
by our tools, we encourage you to give it another go. Start from the
[newcomer's
guide](https://gitlab.haskell.org/ghc/ghc/wikis/contributing#newcomers-to-ghc),
browse our list of [newcomer-friendly
tickets](https://gitlab.haskell.org/ghc/ghc/issues?label_name=newcomer), and
pick something that suits your skills. As always, if you ever get stuck or find
some documentation that is unclear, just ping us on `#ghc` on Freenode or on
[ghc-devs](https://mail.haskell.org/mailman/listinfo/ghc-devs).
# Acknowledgements
This migration would not have been possible without people both inside and
outside of the Haskell community:
* Matthew Pickering for his help in configuring and maintaining our GitLab instance, including many thankless hours debugging `marge-bot`
* Takenobu Tani for his attention to detail in spotting and fixing issues with GHC's wiki and documentation both before and after the migration
* Tobias Dammers for his work on the import script and help cleaning up the wiki
* [Packet](https://packet.com/) for generously offering their excellent hosting services
* Google X, Serokell, and Packet for their sponsorship of our CI infrastructure
* The members of the [GHC devops
group](https://gitlab.haskell.org/ghc/ghc/wikis/dev-ops-group-charter) for
their consideration and feedback over the course of the migration
* David Planella and everyone at GitLab for their help executing the migration
* All of GHC's users and contributors for their patience while we worked our way through this migration