CI strategy for GitLab

I have recently been working on the configuration of the staging GitLab instance. We managed to hook it up to both CircleCI and AppVeyor for the ghc/ghc (username/repo) project on our instance, and it all works.

Things get tricky when it comes to running CI jobs for forks, though. Our CircleCI script relies on some environment variables being defined in the project's CI settings. Those are secrets (e.g. the CircleCI API token) that we do not want to make public or encourage everyone to use, which means that, as things stand, CI simply won't work for forks out of the box. Ben and I have discussed this a bit and we see three different ways to go about it.

1) Forget about the fork-based workflow. We could have a reference repo where things get merged and a "sandbox" repo where people push their branches to get some CI running. They would then create MRs against the "real" repo to get patches merged. One way or another, this requires manually adding new contributors to the sandbox repo. It is also quite likely that they could get our secrets by simply tweaking .gitlab-ci.yml in their branch to print the right env vars.
2) Go for a "full GitLab CI" solution. This might require less work than it initially seems, as it could use the same Docker images and run roughly the same commands as CircleCI. It does require computational resources, though, and makes the transition less smooth. We can certainly consider doing some additional CI this way if we make the move definitive and end up refining our CI needs, but we definitely don't want to get rid of CircleCI/AppVeyor right now anyway.
3) Implement a "mediator" service that holds all our secrets and is hit by GitLab CI jobs to launch builds and report the results and artifacts. It could therefore behave the same regardless of the user/repo/branch that triggers it, without exposing anything. This could be put together somewhat quickly, but it requires enough work that we may not be able to set it up in time for the initial move. It is, however, the best solution from the user experience (GHC contributor) point of view. People with developer access to the main repo could push there or to their forks. Others would only push to their forks and create MRs. CI would work the same in all those cases.
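
To make the mediator idea a bit more concrete, here is a rough sketch of its core logic in Python. Nothing here is an existing implementation: the endpoint URL, the payload fields (repo, branch, commit) and the function names are all made up for illustration, and the actual HTTP call to the external CI service is left out. The only point is that the secret token is attached on the mediator side and never appears in what is sent back to the GitLab CI job.

```python
from dataclasses import dataclass

# Hypothetical endpoint; stands in for whatever external CI service is used.
EXTERNAL_CI_URL = "https://ci.example.invalid/trigger"


@dataclass(frozen=True)
class BuildRequest:
    repo: str    # e.g. "someuser/ghc" (a fork) or "ghc/ghc"
    branch: str
    commit: str


def outbound_request(req: BuildRequest, token: str) -> dict:
    """Describe the request the mediator would send to the external CI service.

    The token is attached here, on the mediator side only.
    """
    return {
        "url": EXTERNAL_CI_URL,
        "headers": {"Authorization": f"Bearer {token}"},
        "json": {"repo": req.repo, "branch": req.branch, "commit": req.commit},
    }


def handle_trigger(payload: dict, token: str) -> dict:
    """Handle a trigger coming from a GitLab CI job.

    The reply returned to the job never contains the token.
    """
    try:
        req = BuildRequest(payload["repo"], payload["branch"], payload["commit"])
    except KeyError as missing:
        return {"status": "rejected", "reason": f"missing field {missing}"}
    outbound = outbound_request(req, token)
    # A real service would send `outbound` over HTTP here and track the build.
    return {
        "status": "queued",
        "repo": req.repo,
        "branch": req.branch,
        "target": outbound["url"],
    }
```

A job in any fork would only need to know the mediator's address and send it something like {"repo": "someuser/ghc", "branch": "wip", "commit": "abc123"}. Since the token lives only in the mediator's own environment, editing .gitlab-ci.yml in a branch could not print it.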

In my opinion, going with 1) initially and then moving to 3) as soon as possible is the best plan in the long term. The mediator would make for a great contributor experience while requiring very little maintenance. It could also very easily live side by side with some "real" GitLab CI that doesn't offload the work to an external service.

Trac metadata

Trac field	Value
Version
Type	Task
TypeOfFailure	OtherFailure
Priority	normal
Resolution	Unresolved
Component	Continuous Integration
Test case
Differential revisions
BlockedBy
Related
Blocking
CC	bgamari
Operating system
Architecture
