Let's say we have types that would normally go into two separate TyClGroups:
    data T1 = MkT1
    data T2 = MkT2
Now we give a single TLKS to both of them:
    type T1, T2 :: Type
    data T1 = MkT1
    data T2 = MkT2
There are three designs I can think of:
1. Both T1 and T2 go into a single TyClGroup now.
2. The TLKS is duplicated and will appear in both TyClGroups.
3. The TLKS is split into two, as if the user had written type T1 :: Type, type T2 :: Type.
None of these seem satisfactory:
(1) has the downside that now unrelated types go into the same TyClGroup, so it no longer represents a SCC, and I'm not sure what the consequences of this would be.
Both (2) and (3) have the downside that the TLKS will be processed twice during type-checking.
(2) also has the downside that one of the TyClGroups will contain a TLKS that references a type which will only be defined in a later TyClGroup.
(3) also has the downside that we cannot recover the original definition to produce relevant error messages.
I agree that none of options (1) through (3) seem satisfactory. Let's see if there is an option (4) that doesn't have any of the previous options' drawbacks.
As is usually the case whenever an issue with TLKS is encountered, my first thought is "what do we do in terms"? Although there are no TyClGroups at the term level, term-level bindings are divided up into strongly-connected components during renaming, so these can be thought of as the spiritual counterparts to TyClGroups in terms. What's interesting is that unlike TyClGroups, these SCCs are just [(RecFlag, LHsBinds GhcRn)], and they do not contain the top-level type signatures for the bindings at all. Instead, all of the type signatures are checked before the SCCs are typechecked, in tcValBinds:
    tcValBinds :: TopLevelFlag
               -> [(RecFlag, LHsBinds GhcRn)] -> [LSig GhcRn]
               -> TcM thing
               -> TcM ([(RecFlag, LHsBinds GhcTcId)], thing)

    tcValBinds top_lvl binds sigs thing_inside
      = do  {   -- Typecheck the signatures
                -- It's easier to do so now, once for all the SCCs together
                -- because a single signature  f,g :: <type>
                -- might relate to more than one SCC
            ; (poly_ids, sig_fn) <- tcAddPatSynPlaceholders patsyns $
                                    tcTySigs sigs

                -- Extend the envt right away with all the Ids
                -- declared with complete type signatures
                -- ...
            ; tcExtendSigIds top_lvl poly_ids $ do
                { (binds', (extra_binds', thing))
                      <- tcBindGroups top_lvl sig_fn prag_fn binds $ do
                           { ... }
                ; return (binds' ++ extra_binds', thing) }}
(I've omitted some details for brevity.) The highlights of tcValBinds are that it first invokes tcTySigs to process all top-level type signatures and returns a sig_fn which maps the Names of bindings to the information needed to typecheck them. After this, tcValBinds typechecks the bindings (via tcBindGroups), passing along the sig_fn.
I wonder if we could do something similar for TLKSs. Instead of putting LTopKindSigs into TyClGroups, kind-check them all at once before kind-checking any declarations (perhaps in tcTyAndClassDecls). Have the function which kind-checks TLKSs return a NameEnv (much like you currently do in !1438 (merged) in tcTyClGroup) and pass that NameEnv to tcTyClGroup, using it as appropriate.
With this approach, LTopKindSigs would no longer need to go into TyClGroups at all, which avoids the downside of option (1). It only requires processing each TLKS once and does not require "splitting" TLKSs with multiple names, which avoids the downsides of options (2) and (3). A potential downside of my proposed option (i.e., option (4)) is that I haven't implemented it, so I don't know yet if it will actually work :)
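To make the shape of option (4) a bit more concrete, here is a toy model of it, not GHC code. Everything in it is hypothetical: `Name` and `Kind` are plain strings, `KindSig`, `Group`, `checkKindSigs`, `checkGroup` and `checkAll` are names invented for illustration, and a `Data.Map` stands in for GHC's NameEnv. Because kinds are opaque strings here, the sketch ignores any dependencies that a signature's kind might have on other declarations.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins for GHC's Name and kinds.
type Name = String
type Kind = String

-- A standalone kind signature may name several tycons,
-- e.g.  type T1, T2 :: Type
data KindSig = KindSig [Name] Kind

-- A declaration group after SCC analysis; it carries no signatures.
type Group = [Name]

-- Process every signature exactly once, up front, and record the
-- declared kind of each name it mentions.
checkKindSigs :: [KindSig] -> Map.Map Name Kind
checkKindSigs sigs =
  Map.fromList [ (n, k) | KindSig ns k <- sigs, n <- ns ]

-- Check one group, looking up declared kinds in the shared environment.
-- Nothing means the tycon has no signature and its kind must be inferred.
checkGroup :: Map.Map Name Kind -> Group -> [(Name, Maybe Kind)]
checkGroup env grp = [ (n, Map.lookup n env) | n <- grp ]

-- Signatures first, then each group in turn with the same environment.
checkAll :: [KindSig] -> [Group] -> [[(Name, Maybe Kind)]]
checkAll sigs groups = map (checkGroup env) groups
  where
    env = checkKindSigs sigs
```

In this model, `checkAll [KindSig ["T1", "T2"] "Type"] [["T1"], ["T2"]]` consults the single signature for both groups without duplicating or splitting it.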
Bah, I had a feeling that option (4) sounded too good to be true. Thank you for pointing out that that plan is dead in the water.
Perhaps there's an option (5) where we could change the way SCC analysis works for type-level declarations so that this idea becomes possible. It's entirely unclear to me how that would work, so I think I'll stop throwing out half-baked ideas now. Perhaps others more knowledgeable in this part of the codebase can produce a fully baked idea.
For now, restricting to one tycon per TLKS is fine. Let's get that working first.
But it should not be hard to support multiple tycons per TLKS:
1. Do SCC analysis of the decls
2. For each SCC:
   - process the TLKSs (which must not refer to any tycons in the SCC)
   - process the declarations themselves
We can refine step 1 a bit, like this:
1a. Do SCC analysis on the TLKSs (ignoring the data/class declarations they are associated with)
1b. process each TLKS in dependency order.
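The basic two-step scheme (without the 1a/1b refinement) can be sketched as a toy model using `stronglyConnComp` from `Data.Graph` in the containers package. This is not GHC code: `Decl`, `Step` and `plan` are hypothetical names, every declaration is assumed to have a TLKS, and I am assuming that a declaration depends on the tycons mentioned in its signature as well as those in its body.

```haskell
import Data.Graph (flattenSCC, stronglyConnComp)

type Name = String

-- Hypothetical declaration: its name, the tycons mentioned in its
-- kind signature, and the tycons mentioned in its definition.
data Decl = Decl
  { declName :: Name
  , sigDeps  :: [Name]
  , bodyDeps :: [Name]
  }

data Step = CheckSig Name | CheckDecl Name
  deriving (Eq, Show)

-- Step 1: SCC analysis of the declarations (dependencies come first).
-- Step 2: for each SCC, process its signatures, then its declarations.
-- A signature that mentions a tycon of its own SCC is rejected.
plan :: [Decl] -> Either String [Step]
plan ds = concat <$> mapM planGroup groups
  where
    groups =
      map flattenSCC
        (stronglyConnComp
           [ (d, declName d, sigDeps d ++ bodyDeps d) | d <- ds ])

    planGroup grp
      | (d, n) : _ <- bad =
          Left ("signature of " ++ d ++ " refers to " ++ n
                ++ ", which is in the same SCC")
      | otherwise =
          Right (map (CheckSig . declName) grp
                 ++ map (CheckDecl . declName) grp)
      where
        names = map declName grp
        bad   = [ (declName d, n)
                | d <- grp, n <- sigDeps d, n `elem` names ]
```

For example, if the signature of T2 mentions T1, then `plan` schedules the signature and declaration of T1 before those of T2, regardless of the order in which the two declarations are given.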
Ryan, Vlad and I have discussed this elsewhere. I forget the exact conclusion, but there seemed to be a way forward, along the lines of Simon's suggestions.