I had this discussion with multiple people in private but I'd like to have it publicly as well.
For people outside of the GHC project, having Haddock on GitHub has the advantage of having it hosted on a well-known platform (ease of login), but we also get few "absolute newcomer" contributions, which makes me question the benefits of having Haddock's tree outside of the GHC tree. The hoops we have to jump through with the GitLab mirror and the submodule seem less and less worth it, especially when GHC distributes its own version that doesn't have an equivalent in the GitHub tree.
Most languages that have a documentation tool also have its code living with the compiler (off the top of my head I can think of Elixir, Rust, Python), and we can certainly simplify things a great deal if we remove indirections.
Moving Haddock into the GHC tree and also having to maintain and review Haddock MRs is undesirable, there is not really sufficient manpower to maintain anything else.
It makes updating haddock a bit easier because you don't have to update a submodule, but my personal opinion is that isn't much effort (compared to maintaining the whole of haddock).
The version of haddock bundled with GHC should be the same as one which is released from the github tree.
@mpickering Thank you very much, I'd like to address what you said:
Moving Haddock into the GHC tree and also having to maintain and review Haddock MRs is undesirable, there is not really sufficient manpower to maintain anything else.
I don't believe this has to fall on the GHC core team, and we have code owner mechanisms today that help with delimiting everyone's responsibilities. I'm not standing down as PM for Haddock, and I'm sorry if this ticket implied it.
It makes updating haddock a bit easier because you don't have to update a submodule, but my personal opinion is that isn't much effort (compared to maintaining the whole of haddock).
Again, I'm not proposing to hand you the maintenance of Haddock. And in any case it would also be easier for MRs to bundle GHC & Haddock changes together instead of one waiting for the other.
The version of haddock bundled with GHC should be the same as one which is released from the github tree.
Today this is not the case. I have no idea what Haddock 2.27.0 contains, since:
There is no tag on the GitHub repo
GHC has its own custom patches that are not testable in the GitHub CI without building a GHC from latest HEAD
And thus it's fairly hard to provide any kind of support to users when the binary that is shipped to them with the GHC bindists diverges from what I can observe on the GitHub repo.
And just once more: I am not proposing that your team takes the responsibility of Haddock. This is not a way to trick you into doing this.
I'd like to add that I'm aware that it's not a perfect solution and that there are tradeoffs for every choice, but having Haddock live in 2 parallel repos (neither of which is GHC's) is starting to be a bit unbearable, and unifying these tools would certainly help with my workload, since I would only have to work in one place and could react more quickly to new tickets.
I agree with this proposal. I don't think merging the 2 trees would create too many management conflicts, and it would make it much easier to keep the 2 projects in sync (they already need a lot of synchronization, so I feel there is indeed a need for that).
@Kleidukos and I discussed this at Zurihac and I am increasingly in agreement with their sense that merging Haddock into the tree would be a better trade-off than the current two-repository solution. In particular, the most recent experience trying to fork for 9.8 has been an excellent example of how the status quo can quickly devolve without care and coordination that is very hard to ensure in a distributed project.
@Kleidukos, @mpickering, Laurent, and I discussed this in a call. In short, none of us had any particular objections to merging Haddock back into the tree. Moreover, while the ongoing HSoC project supervised by Laurent may some day further decouple Haddock's AST from GHC's, this doesn't change the fact that GHC will still need to compile and ship Haddock and generated documentation. There was mutual understanding that GHC's developers were rather wary of being left responsible for the maintenance of Haddock in the future if the current maintainer pool dries up; @Kleidukos will engage with the HF to put in place a plan to mitigate this risk.
The plan is:
to merge haddock (and its git history) into the GHC tree for GHC 9.10.
add a Hadrian target to allow testing of Haddock using the stage1 compiler to avoid making Haddock development too burdensome
create a haddock group to which Haddock's maintainers will belong
add a CODEOWNERS entry covering the haddock tree
@Kleidukos will do some triage of the current GitHub tickets and migrate those that are still relevant to ghc/ghc>
Haddock releases will be made by GHC's maintainers in lockstep with GHC releases
Code review and other Haddock maintenance tasks will be performed by Haddock's maintainers
@Kleidukos, it would be great to perform the merge early in the GHC 9.10 cycle. I am happy to handle the merge but I think there were a few steps that needed to be done before we could do so. Specifically:
Come up with a backup maintainer plan
Triage the current set of GitHub tickets and move tickets to GitLab as needed
@bgamari I have been reforming the Haddock team. 4 new volunteers have expressed interest and we met today to discuss the aim and breadth of their involvement. The primary aim is to maintain proper reactivity to the GHC contributions and tickets related to Haddock. Mango, Yvan, Elland and I are constituting the first batch of stewards for Haddock.
I'd be more than happy to perform the migration of tickets once we're done with the haddock migration.
Unfortunately there are a few malformed git objects in Haddock's history which are currently rejected by the default git configuration:
$ git fsck
Checking object directories: 100% (256/256), done.
error in commit 2b07607c4562034359f52b42055f8d2af4721ca4: missingNameBeforeEmail: invalid author/committer line - missing space before email
error in commit 87777da468f9bdc8c01b40c5e097f0c5cac3356e: missingNameBeforeEmail: invalid author/committer line - missing space before email
error in commit 12f3a912d1bf310a1af498fb165f68aa8bbc74da: missingNameBeforeEmail: invalid author/committer line - missing space before email
I will rewrite the history to eliminate these issues and then add the result as a subtree to ghc/ghc>.
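For anyone who wants to double-check which commits are affected before or after the rewrite, here is a minimal sketch that pulls the flagged commit hashes out of `git fsck` output. It is not part of the migration itself: the function name `flagged_commits` is made up for illustration, and the regular expression simply assumes the line format shown in the output above.

```python
import re

# Assumed line shape, taken from the fsck output quoted above:
#   error in commit <40-hex-sha>: <msg-id>: <description>
FSCK_LINE = re.compile(r"^(?:error|warning) in commit ([0-9a-f]{40}): (\w+):")


def flagged_commits(fsck_output, msg_id="missingNameBeforeEmail"):
    """Return the commit hashes reported under `msg_id`, in order of appearance."""
    commits = []
    for line in fsck_output.splitlines():
        match = FSCK_LINE.match(line.strip())
        if match and match.group(2) == msg_id:
            commits.append(match.group(1))
    return commits


# The three error lines reported for Haddock's history.
sample = """\
Checking object directories: 100% (256/256), done.
error in commit 2b07607c4562034359f52b42055f8d2af4721ca4: missingNameBeforeEmail: invalid author/committer line - missing space before email
error in commit 87777da468f9bdc8c01b40c5e097f0c5cac3356e: missingNameBeforeEmail: invalid author/committer line - missing space before email
error in commit 12f3a912d1bf310a1af498fb165f68aa8bbc74da: missingNameBeforeEmail: invalid author/committer line - missing space before email
"""

for sha in flagged_commits(sample):
    print(sha)
```

Running the same function on the fsck output of the rewritten history should give an empty list if the rewrite removed every `missingNameBeforeEmail` error.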