Annual release cycle, its structure and long-term-support.
I think this should at some point be discussed as a GHC proposal. OTOH, the previous time the schedule was changed, it wasn't discussed as a GHC proposal. Therefore I first open an issue.
In summary: I'd like GHC to have an annual release cycle, i.e. one major release a year. Additionally, every second or third release would be supported for longer (LTS, long-term support). Note: these two topics are independent, even if closely related. There's also a third topic, how the release cycle is structured, which ties the first two together.
The previous discussion is at https://www.haskell.org/ghc/blog/20170801-2017-release-schedule.html
Let me also give an overview of the release practices of other projects (Ubuntu, Node.JS, Java SE, GCC, Python) and their reasoning, and then discuss why I think GHC should adopt something similar.
I argue that although we can have shorter release cycles, we should not. The release manager time saved should rather be spent on keeping some releases supported for extended periods of time.
And we should concentrate on the predictability of releases and bugfixes.
Haskell
Before we dive into other projects, let us see where GHC was and is now.
One statistic we can easily calculate is the time from the previous release, i.e. the difference in days between e.g. 8.6.1 and 8.4.1. Another metric is the time between the first and last release in a series. (The data is in a gist if you want to play with it or correct errors.)
Release | Date | Days from prev | Months (days / 30) | Span (days) | Notes |
---|---|---|---|---|---|
7.0 | 2010-11-16 | | | 211 | |
7.2 | 2011-08-09 | 266 | 8.8 | 67 | |
7.4 | 2012-02-02 | 177 | 5.9 | 129 | 443 (14), if you don't consider 7.2 |
7.6 | 2012-09-06 | 217 | 7.2 | 227 | |
7.8 | 2014-04-09 | 580 | 19.3 | 258 | |
7.10 | 2015-03-27 | 352 | 11.7 | 256 | |
8.0 | 2016-05-21 | 421 | 14.0 | 235 | |
8.2 | 2017-07-22 | 427 | 14.2 | 123 | |
8.4 | 2018-03-08 | 229 | 7.6 | 220 | 6 month cadence starts |
8.6 | 2018-09-21 | 197 | 6.6 | 214 | |
8.8 | 2019-08-25 | 338 | 11.3 | 183 | |
8.10 | 2020-03-24 | 212 | 7.0 | ? | |
8.12 | 2020-09-25 | 185 | 6.1 | ? | based on https://mail.haskell.org/pipermail/ghc-devs/2020-May/018851.html schedule |
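The day counts in the table can be recomputed from the first-release dates. The following is a minimal Python sketch, assuming only the dates listed above (a subset is used for brevity); the names `RELEASES` and `days_from_previous` are illustrative and are not taken from the gist.

```python
from datetime import date

# First-release dates of a few series, copied from the table above.
RELEASES = [
    ("7.6", date(2012, 9, 6)),
    ("7.8", date(2014, 4, 9)),
    ("7.10", date(2015, 3, 27)),
    ("8.0", date(2016, 5, 21)),
    ("8.2", date(2017, 7, 22)),
]


def days_from_previous(releases):
    """Return (series, days since the previous series' first release) pairs."""
    return [
        (name, (day - prev_day).days)
        for (_, prev_day), (name, day) in zip(releases, releases[1:])
    ]


for name, days in days_from_previous(RELEASES):
    # The table expresses the same gap in 30-day "months".
    print(f"{name}: {days} days, about {days / 30:.1f} thirty-day months")
```

The same function applied to the first and last release of one series would give the span column.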
We can visualise individual releases on the time axis
In 2017 the observations were:
- release cadence has swung rather wildly
- the release cycle has stretched in the last several releases
- time-between-releases generally tends to be on the order of a year
And Ben argues:
I personally think that a more stable, shorter release cycle would be better for developers and users alike,
- developers have a tighter feedback loop, inducing less pressure to get new features and non-critical bugfixes into minor releases
- release managers have fewer patches to cherry-pick
- users see new features and bugfixes more quickly
It has been three years since then, so it's time to re-evaluate.
- Is there really less pressure to get new features in, or is it rather the opposite, so that the difficult stuff is postponed to later releases? (I'm thinking about Linear Haskell, for example.)
- head.hackage provides feedback, but it is independent of release cadence.
The last point, that users see new features and bugfixes more quickly, deserves special attention.
One easy metric is how fast a Stackage LTS snapshot catches up with a new major GHC release, see https://github.com/commercialhaskell/stackage/wiki/Stackage-LTS-Releases. At best it was 3 months; 4 months seems to be a fair average. For GHC-8.8 it was 5.8 months, and for 8.10 it won't be less than two months (as there isn't a GHC-8.10 compatible Stackage LTS release yet).
My view on this is that users do indeed see new features and bugfixes quickly, but the upgrade cost of getting bugfixes might mean upgrading to the next major GHC. For example, it seems that the GHC-8.8 series (not even a year old) is not usable on the latest Windows (#17926).
Overview of policies other projects use
Ubuntu
The first example is Ubuntu. It's not a programming language implementation, but a lot of the same reasoning applies. Ubuntu has a page about their release cycle, which says
LTS or ‘Long Term Support’ releases are published every two years in April. LTS releases are the ‘enterprise grade’ releases of Ubuntu and are utilised the most. An estimated 95% of all Ubuntu installations are LTS releases.
Every six months between LTS versions, Canonical publishes an interim release of Ubuntu, with 19.10 being the latest example. These are production-quality releases and are supported for 9 months, with sufficient time provided for users to update, but these releases do not receive the long-term commitment of LTS releases.
Interim releases will introduce new capabilities from Canonical and upstream open source projects, they serve as a proving ground for these new capabilities. Many developers run interim releases because they provide newer compilers or access to newer kernels and newer libraries, and they are often used inside rapid devops processes like CI/CD pipelines where the lifespan of an artefact is likely to be less than the support period of the interim release. Interim releases receive full security maintenance for ‘main’ during their lifespan.
Note that Ubuntu's "parent", Debian, has a two-year release cycle as well. See the release statistics on the DebianReleases wiki page: the time from the previous release is around 600-800 days. I have no further insight into whether this is codified; the wording implies that things just happen to be this way.
What can be deduced from those data is that the "most-typical" Debian release:
- endures a freeze cycle of 7 +/- 1 months before getting released.
- is released about 2 years after the previous one (the often cited example of Debian Sarge being quite an exceptional event in Debian history).
- leaves users about 1 year to upgrade to the next one once this latter itself gets released.
- has (from release to the end of security updates) a total lifetime of about 3 years.
Node.JS
The Node.JS project has a releases page as well, which says
Major Node.js versions enter Current release status for six months, which gives library authors time to add support for them. After six months, odd-numbered releases (9, 11, etc.) become unsupported, and even-numbered releases (10, 12, etc.) move to Active LTS status and are ready for general use. LTS release status is "long-term support", which typically guarantees that critical bugs will be fixed for a total of 30 months. Production applications should only use Active LTS or Maintenance LTS releases.
If you have a problem parsing that: the even-numbered versions are current and actively supported for 1.5 years in total, and are then maintained for an additional year.
This is quite similar to the Ubuntu policy.
- The Current phase of a release that is to become LTS is similar to the interim release Ubuntu has prior to an LTS (e.g. 19.10).
- The Active phase of an LTS release is similar to the Ubuntu LTS release (a single version gets all the maintenance it needs).
- And then it enters maintenance mode.
The Node.JS time scales are different: LTS releases happen once a year, not once every two years.
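The Node.js phases described above can be written down as a small lookup. This is only a sketch of my reading of the quoted policy: the 6-month and 30-month boundaries come from the quote, while the 18-month boundary between Active LTS and Maintenance LTS is an assumption taken from the "1.5 years in total" summary. The function name `node_release_phase` is hypothetical and not part of any Node.js tooling.

```python
def node_release_phase(major: int, months_since_release: int) -> str:
    """Classify a Node.js major version by its age in months.

    Assumed boundaries: Current for 6 months, then (even versions only)
    Active LTS until month 18 and Maintenance LTS until month 30.
    """
    if months_since_release < 0:
        raise ValueError("release is in the future")
    if months_since_release < 6:
        return "Current"
    if major % 2 == 1:
        # Odd-numbered releases become unsupported after the Current phase.
        return "Unsupported"
    if months_since_release < 18:
        return "Active LTS"
    if months_since_release < 30:
        return "Maintenance LTS"
    return "Unsupported"


for major, age in [(11, 3), (11, 7), (12, 12), (12, 24), (12, 30)]:
    print(f"Node.js {major} at {age} months: {node_release_phase(major, age)}")
```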
Java SE
This topic cannot be covered without mentioning Java. Java recently changed its release policy, and it is documented on Oracle's site.
The time-scales are of the size of the Starship Enterprise. LTS releases happen less often:
Oracle will designate a release, every three years, as a Long-Term-Support (LTS) release.
but on the other hand, the support window is enormous. For example, the current Java 11 became generally available (GA) in September 2018 and will be supported until September 2023 (5 years); for an additional fee the support for your setup will be extended. And they even offer you maintenance for as long as you use your Oracle software.
GCC
I didn't find a GCC release policy described concisely in one place, but you can get an idea from the releases page.
There are various major branches, which get interleaved releases.
The GCC Development page has a timeline showing that in a more graphical (ASCII) form:
         +-- GCC 8 branch created --------+
         |                                 \
         |                                  v
    GCC 9 Stage 1 (starts 2018-04-25)     GCC 8.1 release (2018-05-02)
         |                                 \
         |                                  v
         |                                GCC 8.2 release (2018-07-26)
    GCC 9 Stage 3 (starts 2018-11-12)      \
         |                                  v
    GCC 9 Stage 4 (starts 2019-01-07)     GCC 8.3 release (2019-02-22)
         |                                 \
         |                                  v
         |                                GCC 8.4 release (2020-03-04)
         |
         +-- GCC 9 branch created --------+
         |                                 \
         |                                  v
    GCC 10 Stage 1 (starts 2019-04-25)    GCC 9.1 release (2019-05-03)
         |                                 \
         |                                  v
         |                                GCC 9.2 release (2019-08-12)
    GCC 10 Stage 3 (starts 2019-11-17)     \
         |                                  v
    GCC 10 Stage 4 (starts 2020-01-13)    GCC 9.3 release (2020-03-12)
         |
         |
         +-- GCC 10 branch created -------+
         |                                 \
         |                                  v
    GCC 11 Stage 1 (starts 2020-04-30)    GCC 10.1 release (2020-05-07)
         |
and also explains the process in some detail. GCC is the only project covered here that documents its process this way: there is not only a description of it, but also a rationale.
I include here what they say about the schedule.
Development on our main branch will proceed in three stages.
- Stage 1
- During this period, changes of any nature may be made to the compiler. In particular, major changes may be merged from branches. Stage 1 is feature driven and will last at least four months. In order to avoid chaos, the Release Managers will ask for a list of major projects proposed for the coming release cycle before the start of this stage. They will attempt to sequence the projects in such a way as to cause minimal disruption. The Release Managers will not reject projects that will be ready for inclusion before the end of Stage 1. Similarly, the Release Managers have no special power to accept a particular patch or branch beyond what their status as maintainers affords. The role of the Release Managers is merely to attempt to order the inclusion of major features in an organized manner.
- Stage 3
- During this two-month period, the only (non-documentation) changes that may be made are changes that fix bugs or new ports which do not require changes to other parts of the compiler. New functionality may not be introduced during this period.
- Stage 4
- During this period, the only (non-documentation) changes that may be made are changes that fix regressions. Other important bugs like wrong-code, rejects-valid or build issues may be fixed as well. All changes during this period should be done with extra care on not introducing new regressions - fixing bugs at all cost is not wanted. Note that the same constraints apply to release branches. This period lasts until stage 1 opens for the next release.
- Rationale
In order to produce releases on a regular schedule, we must ensure that the mainline is reasonably stable some time before we make the release. Therefore, more radical changes must be made earlier in the cycle, so that we have time to fix any problems that result.
In order to reach higher standards of quality, we must focus on fixing bugs; by working exclusively on bug-fixing through this stage and before branching for the release, we will have a higher quality source base as we prepare for a release.
Although maintaining a development branch, including merging new changes from the mainline, is somewhat burdensome, the absolute worst case is that such a branch will have to be maintained for a few months. During this period, the only mainline changes will be bug-fixes, so it is unlikely that many conflicts will occur.
Worthwhile mentions:
- Major branches are created once a year.
- Each release is an "LTS release".
- There is a section about patch reversion and bugfix releases.
The section on bugfix releases says:
By waiting for two months to make a bug-fix release, we will be able to accumulate fixes for the most important problems, and avoid spending undue amounts of time on release testing and packaging.
I think this individual point is something GHC could adopt right away. Given e.g. a GHC-8.12.1 initial release, users could expect GHC-8.12.2 to happen two months later, and GHC-8.12.3 four months later. No sooner, no later. If they don't happen, there was no reason for them. See the Release Methodology for a rationale.
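To make the two-month rule concrete, here is a minimal Python sketch that derives the expected bugfix release dates from an initial release date. It assumes the planned 8.12.1 date from the table (2020-09-25) and a fixed two-month interval; `add_months` and `bugfix_schedule` are hypothetical helper names, and clamping to the end of a shorter month is my own choice.

```python
import calendar
from datetime import date


def add_months(day: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the target month."""
    index = day.year * 12 + (day.month - 1) + months
    year, month_index = divmod(index, 12)
    last_day = calendar.monthrange(year, month_index + 1)[1]
    return date(year, month_index + 1, min(day.day, last_day))


def bugfix_schedule(initial: date, count: int, interval_months: int = 2):
    """Expected dates of the next `count` bugfix releases after `initial`."""
    return [add_months(initial, interval_months * n) for n in range(1, count + 1)]


# Assumed input: the planned GHC 8.12.1 date.
for minor, day in enumerate(bugfix_schedule(date(2020, 9, 25), count=2), start=2):
    print(f"GHC 8.12.{minor} expected around {day.isoformat()}")
```

A published schedule like this would let users plan upgrades even when a given slot ends up being skipped.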
Each major GCC release also gets bugfix releases beyond the ones mentioned, but the page doesn't explain the reasoning for them. They appear to consist only of regression fixes (backports?). See e.g. the list for GCC-7.5 (which happens to be the default GCC on the current Ubuntu 18.04 LTS).
Python
The last project I'll cover is Python. There is a recently accepted PEP 602 - Annual Release Cycle for Python.
The short summary is
- Seventeen months to develop a major version
- 1½ year of full support, 3½ more years of security fixes
- Annual release cadence
The PEP illustrates this with an example release calendar.
The "Seventeen months to develop a major version" section has four bullet points, which are very similar to the GCC release policy:
This PEP proposes that Python 3.X.0 will be developed for around 17 months:
1. The first five months overlap with Python 3.(X-1).0's beta and release candidate stages and are thus unversioned.
2. The next seven months are spent on versioned alpha releases where both new features are incrementally added and bug fixes are included.
This is like Stage 1 of the GCC process: everything can go in. Note that GCC's Stage 1 is versioned.
3. The following three months are spent on four versioned beta releases where no new features can be added but bug fixes are still included.
This is like Stage 3 of the GCC process (which lasts two months there).
4. The final two months are spent on two release candidates (or more, if necessary) and conclude with the release of the final release of Python 3.X.0.
Which sounds exactly like Stage 4 of the GCC process.
The details differ, but overall the process is divided quite similarly.
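The arithmetic behind the four phases is easy to check. The sketch below, using only the phase lengths quoted from the PEP and an assumed 12-month release interval, shows that the phases add up to seventeen months and that consecutive cycles therefore overlap by five months. The names `PHASES` and `phase_windows` are illustrative.

```python
# Phase lengths in months, as listed in the PEP 602 text quoted above.
PHASES = [
    ("unversioned development", 5),
    ("versioned alpha releases", 7),
    ("versioned beta releases", 3),
    ("release candidates", 2),
]
RELEASE_INTERVAL_MONTHS = 12  # annual cadence


def phase_windows(phases):
    """Return (name, start_month, end_month), counted from the start of development."""
    windows = []
    start = 0
    for name, length in phases:
        windows.append((name, start, start + length))
        start += length
    return windows


total_months = sum(length for _, length in PHASES)
overlap_months = total_months - RELEASE_INTERVAL_MONTHS

for name, start, end in phase_windows(PHASES):
    print(f"months {start:2d} to {end:2d}: {name}")
print(f"total {total_months} months, overlapping the previous cycle by {overlap_months}")
```

The five-month overlap matches the first, unversioned phase, which runs in parallel with the previous version's beta and release candidate stages.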
The rationale section of PEP 602 is a good read.
This change provides the following advantages:
- Creates a predictable calendar for releases where the final release is always in October (so after the annual core sprint), and the beta phase starts in late May (so after PyCon US sprints), which is especially important for core developers who need to plan to include Python involvement in their calendar;
This is the same reasoning as behind the current (not widely discussed) 6-month release cadence GHC uses. Yet Python settled on just one year (down from their previous 18 months). Especially notice the rejected idea "Double the release cadence to achieve 9 months between major versions", with the reasoning "This was originally proposed in PEP 596 and rejected as both too irregular and too short."
The PEP mentions other things to consider. For example, it discusses Dependent Policies. In the Haskell world we have the 3-version policy for changes to core libraries, which was accelerated due to the 6-month cadence.
Also worth pointing out is the last section
Long-Term Support Releases
Each version of Python is effectively long-term support: it's supported for five years, with the first eighteen months allowing regular bug fixes and security updates. For the remaining time security updates are accepted and promptly released.
GHC Haskell in the future
Most of the projects I covered share a one-release-a-year cadence, with two exceptions: Java and Ubuntu. Both exceptions are understandable. Java is what it is. And Ubuntu is an operating system, which you would expect to be stable for longer periods of time.
All projects seem to have some kind of real (i.e. with a dedicated version) yet preview releases. GHC doesn't have these. We have alphas and release candidates, but those don't have their own version numbers. Yet, for the last few GHC releases I have noticed that early adopters treat them as such.
The GHC Haskell setup is tricky, as base and template-haskell change, sometimes significantly, during the development phases, so assigning them proper PVP-compliant versions would make specifying version bounds tricky (one would need to exclude "preview" version ranges). This is a technical problem which I think can be solved.
Long term support
We naturally come to the question of how long each release is supported. Currently GHC doesn't have any kind of long-term support. The support periods of consecutive releases barely overlap. This is understandable given that we as a community simply don't have the resources. Yet Python is way ahead, as "Each version of Python is effectively long-term support". Debian explicitly says users have a year to migrate.
Therefore my proposal is to treat just some releases (one every two or three years) as "long-term support" releases. The expectation would be that LTS releases have
- long maintenance periods
- and an overlap in maintenance periods between consecutive versions
Users who want new features right now, and enjoy upgrading everything, could do that and use non-LTS releases. More conservative users could restrict themselves to LTS releases only.
The LTS releases would be beneficial for a lot of users:
- Beginners who don't know which GHC version to pick
  - Currently they probably default to the latest, and run into the surprise of it not being well supported
- Executable developers using Haskell (e.g. Agda maintainers)
- Linux distribution packagers
  - The LTS versions could be the ones we recommend for operating system packagers to include in their stable release repositories (Debian, Ubuntu). Red Hat Enterprise Linux is an exceptional outlier, but if their releases contained GHC LTS versions, that would be great.
- Possibly tool writers (e.g. Liquid Haskell) could limit themselves to supporting only GHC LTS releases
- Book and teaching material writers could use GHC LTS releases, as one would expect them to be supported.
In short, if we select a few GHC releases to have long-term support, the ecosystem at large would synchronize around them. A student could install GHC LTS on their new shiny macbook and the same version would be on the machines provided by the IT department. (Updated) book examples would work the same, and so on.
The ecosystem is constantly in rolling-update mode.
The span column in the table indicates that there is some kind of support for each release, but it lasts only around 8 months. I'd like to have some GHC releases actively maintained for longer: two or three years.
LTS releases would also offer an answer to the question of how many previous GHC versions a library maintainer should support: it could be as simple as "the currently active GHC LTS releases".
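As an illustration of how simple that answer could be, the sketch below picks the LTS releases whose support window contains a given day. Everything in it is hypothetical: the release labels, the dates, the three-year windows and the names `LtsRelease` and `active_lts` are invented for the example and do not describe any actual GHC plan.

```python
from datetime import date
from typing import NamedTuple


class LtsRelease(NamedTuple):
    version: str
    released: date
    end_of_support: date


# Hypothetical LTS releases: one every two years, each supported for three years.
LTS_RELEASES = [
    LtsRelease("LTS-1", date(2020, 1, 1), date(2023, 1, 1)),
    LtsRelease("LTS-2", date(2022, 1, 1), date(2025, 1, 1)),
    LtsRelease("LTS-3", date(2024, 1, 1), date(2027, 1, 1)),
]


def active_lts(releases, today):
    """Versions a library maintainer would be expected to support on `today`."""
    return [r.version for r in releases if r.released <= today < r.end_of_support]


print(active_lts(LTS_RELEASES, date(2022, 6, 1)))
```

With overlapping windows like these, a maintainer would support at most two LTS versions at any time, and users would get a year to migrate from one to the next.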
Release cycle structure
Let me try to illustrate the steps with the current 8.12 plan and what I can extrapolate for 8.14.
    GHC tree is at some state (master)
         |
    GHC 8.12 feature merge window (May-June 2020)
         |
         +-- GHC-8.12 branch created ---+ (2020-06-30 - late June)
         |                               \
         |                                v
         |                              GHC 8.12.0-alpha1 (2020-06-30)
         |                               \
         |                                v
         |                              GHC 8.12.0-alpha2 (2020-07-15)
         |                               \
         |                                v
         |                              GHC 8.12.0-alpha3 (2020-08-01)
         |                               \
         |                                v
         |                              GHC 8.12.0-rc1 (2020-09-01)
         |                               \
         |                                v
         |                              GHC 8.12.1 (2020-09-25)
         |                               \
         |                                v
         |                              GHC 8.12.2 (2020-xx-yy)
         |
         |
    GHC-8.14 feature merge window (November-December 2020)
         |
         +-- GHC-8.14 branch created ---+ (2020-12-31 - late December)
         |                               \
         |                                v
         |                              GHC 8.14.0-alpha1 (2020-12-30)
         |                               \
         |                                v
         |                              GHC 8.14.0-alpha2 (2021-01-15)
         |                               \
         |                                v
         |                              GHC 8.14.0-alpha3 (2021-02-01)
         |                               \
         |                                v
         |                              GHC 8.14.0-rc1 (2021-03-01)
         |                               \
         |                                v
         |                              GHC 8.14.1 (2021-03-25)
         |                               \
         |                                v
         |                              GHC 8.14.2 (2021-xx-yy)
         |
My question is what happens in master after the release branch is cut but before the next release's merge window opens. Between the 8.12 branch creation and the 8.14 feature merge window there are 4 months.
I think that period is a kind of GCC Stage 1, where "the Release Managers will not reject projects that will be ready for inclusion before the end of Stage 1", and the GHC merge window is nothing like GCC Stage 3, which is meant for stabilization of the trunk / master. GCC has the "New functionality may not be introduced during this period" requirement for Stage 3, whereas the GHC merge call explicitly encourages it.
The stabilisation effort then proceeds in the GHC release branch, yet the commits are backported from master. Therefore one naturally avoids making bigger changes in master right after the release branch is made, so that cherry-picking stays simpler. I don't think this is optimal.
If we require all major changes to be developed in branches (and I think this is true today), then we can have alpha releases cut from master. We would avoid extensive backporting of bugfixes and, importantly, avoid forgetting to cherry-pick some fix altogether.
Cycle length
The release cadence could still stay at 6 months. Yet I think that we can get the same level of benefit of "users see new features and bugfixes more quickly" with just annual releases. Critical bugfixes would appear in LTS releases anyway, and new features would be seen annually. But predictably so.
I don't know how GHC developers feel about shorter development and merge windows. As a Cabal maintainer I find them largely stressful. I have to reject merges because I don't believe they will be ready before we need to cut the release branch for GHC (cf. the GCC Stage 1 requirement). This results in larger chunks of code being prepared to be merged at once, which is another recipe for disaster.
I was thinking of various ways to decouple the GHC and Cabal releases, but I haven't yet managed to find a solution better than the status quo. The fact that Cabal is a library, and is required to support newer GHCs, cuts our option space.
As a library maintainer, even though I value the PVP, which I find an excellent policy to have in the ecosystem, I'm considering dropping upper bounds on base and template-haskell, and outsourcing version compatibility problems to downstream (and my fellow Hackage Trustees).
I already mentioned Stackage not keeping up.
As far as I follow other language communities, they seem to be happy with their release schedules. I mention Python's PEP 602 again. It makes points about smaller releases and sooner bug fixes, but also mentions that the change:
creates a predictable calendar for releases where the final release is always in October (so after the annual core sprint), and the beta phase starts in late May (so after PyCon US sprints), which is especially important for core developers who need to plan to include Python involvement in their calendar;
In my opinion, the annual calendar issue is important. I'd avoid having stabilisation periods during common western holiday periods (end of summer, December to the beginning of the next year). Also, including community events like ZuriHac (which will hopefully happen again in 2021), or taking into account paper submission deadlines for ICFP / PLDI / POPL, should be easier if the release cycle were annual.