Feasibility of Splitting CLabel?
This isn't an issue per se, but more of an idea that I had while working on #15560. CLabel
is a bit of a beast; here is its description:
{- |
'CLabel' is an abstract type that supports the following operations:
- Pretty printing
- In a C file, does it need to be declared before use? (i.e. is it
guaranteed to be already in scope in the places we need to refer to it?)
- If it needs to be declared, what type (code or data) should it be
declared to have?
- Is it visible outside this object file or not?
- Is it "dynamic" (see details below)
- Eq and Ord, so that we can make sets of CLabels (currently only
used in outputting C as far as I can tell, to avoid generating
more than one declaration for any given label).
- Converting an info table label into an entry label.
CLabel usage is a bit messy in GHC as they are used in a number of different
contexts:
- By the C-- AST to identify labels
- By the unregisterised C code generator (\"PprC\") for naming functions (hence
the name 'CLabel')
- By the native and LLVM code generators to identify labels
For extra fun, each of these uses a slightly different subset of constructors
(e.g. 'AsmTempLabel' and 'AsmTempDerivedLabel' are used only in the NCG and
LLVM backends).
In general, we use 'IdLabel' to represent Haskell things early in the
pipeline. However, later optimization passes will often represent blocks they
create with 'LocalBlockLabel' where there is no obvious 'Name' to hang off the
label.
-}
The key part is the "is a bit messy in GHC" remark. My proposal is to split CLabel into parts that satisfy each of the needs listed in the note:
- For C-- AST labels
- For the C code generator
- For native and LLVM labels
and remove CostCentre, CostCenterStack, IPE_Label, and PicBaseLabel.
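To make the shape of the split concrete, here is a rough sketch. Every type, constructor, and field name below is hypothetical and invented for illustration; none of it is existing GHC API, and `Unique` is just an `Int` wrapper standing in for GHC's real uniques. The C generator fields mirror the questions listed in the description (code or data, externally visible or not).

```haskell
module Main where

-- Hypothetical stand-in for GHC's Unique; here it is only an Int wrapper.
newtype Unique = Unique Int
  deriving (Eq, Ord, Show)

-- Labels the C-- AST needs: Haskell-level things and compiler-made blocks.
data CmmLabel
  = CmmIdLabel Unique     -- plays the role of 'IdLabel'
  | CmmBlockLabel Unique  -- plays the role of 'LocalBlockLabel'
  deriving (Eq, Ord, Show)

-- What the unregisterised C backend additionally wants to know.
data CGenLabel = CGenLabel
  { cgenLabel             :: CmmLabel
  , cgenIsCode            :: Bool  -- declare as code (True) or data (False)
  , cgenExternallyVisible :: Bool  -- visible outside this object file?
  }
  deriving (Eq, Ord, Show)

-- Labels that only the native and LLVM backends create.
data AsmLabel
  = AsmFromCmm CmmLabel
  | AsmTemp Unique                -- plays the role of 'AsmTempLabel'
  | AsmTempDerived Unique String  -- plays the role of 'AsmTempDerivedLabel'
  deriving (Eq, Ord, Show)

-- Lowering from the C-- layer to a backend is an explicit, total conversion.
toAsmLabel :: CmmLabel -> AsmLabel
toAsmLabel = AsmFromCmm

main :: IO ()
main = do
  let blk = CmmBlockLabel (Unique 42)
  print (toAsmLabel blk)
  print (CGenLabel blk True False)
```

With something like this, each consumer only ever sees the constructors it can actually handle, instead of pattern matching on the full CLabel and ignoring the rest.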
What's the purpose?
The code generator does a lot of set manipulation and handling, but because of the above constructors these passes must use Data.Set, which is a much slower data structure than the IntMaps we use all over the compiler. So by redesigning CLabel we pave the way to use more efficient data structures and (possibly) speed up the StgToCmm pass and its optimizations.
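A minimal sketch of the difference, assuming that after the split every label in the C-- layer carries an integer key. `KeyedLabel` and `labelKey` are hypothetical names made up for this example, and plain `Data.IntSet` from containers stands in for the unique-keyed sets used inside GHC:

```haskell
module Main where

import qualified Data.IntSet as IntSet
import qualified Data.Set as Set

-- Hypothetical label that always carries an Int key
-- (a stand-in for a label built from a GHC Unique).
newtype KeyedLabel = KeyedLabel Int
  deriving (Eq, Ord, Show)

labelKey :: KeyedLabel -> Int
labelKey (KeyedLabel k) = k

-- Current shape: a balanced tree ordered by the label's Ord instance.
labelsViaSet :: [KeyedLabel] -> Set.Set KeyedLabel
labelsViaSet = Set.fromList

-- Shape the split would allow: a set keyed directly on the Int.
labelsViaIntSet :: [KeyedLabel] -> IntSet.IntSet
labelsViaIntSet = IntSet.fromList . map labelKey

main :: IO ()
main = do
  let labels = map KeyedLabel [3, 1, 3, 2]
  -- Both representations agree on which labels are present.
  print (Set.size (labelsViaSet labels) == IntSet.size (labelsViaIntSet labels))
  print (IntSet.member (labelKey (KeyedLabel 2)) (labelsViaIntSet labels))
```

The catch is that this only works if every constructor has such a key, which is why the constructors listed above would have to move out of the type first. Whether it actually pays off would need measuring.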