Speed optimizations for elimCommonBlocks
Use toBlockList instead of revPostorder.
Block elimination works on a given Cmm graph by:
- Getting a list of blocks.
- Looking for duplicates in these blocks.
- Removing all but one instance of duplicates.
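The three steps above can be sketched on a toy graph. This is only an illustration, not the code of elimCommonBlocks: the Label and Block types and the elimDuplicates function are hypothetical stand-ins, and blocks are compared by their whole body. A real pass must also redirect jumps that target a removed label, which is why the sketch returns a substitution.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (foldl')

-- Hypothetical stand-ins for Cmm labels and blocks.
type Label = Int

data Block = Block
  { blockLabel :: Label
  , blockBody  :: [String]   -- instructions, compared verbatim
  } deriving (Eq, Show)

-- Keep the first block seen for each distinct body. Every later block
-- with the same body is dropped and recorded in the substitution
-- (removed label -> kept label).
elimDuplicates :: [Block] -> ([Block], Map.Map Label Label)
elimDuplicates blocks = (reverse kept, subst)
  where
    (_, kept, subst) = foldl' step (Map.empty, [], Map.empty) blocks
    step (seen, acc, sub) b =
      case Map.lookup (blockBody b) seen of
        Just l  -> (seen, acc, Map.insert (blockLabel b) l sub)
        Nothing -> (Map.insert (blockBody b) (blockLabel b) seen, b : acc, sub)
```

For example, given blocks 1 and 3 with identical bodies, block 3 is dropped and the substitution maps 3 to 1.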
There are two (reasonable) ways to get the list of blocks.
- The fast way:
This just flattens the underlying map into a list.
- The convenient way:
Start at the entry label, scan for reachable blocks and return only these. This has the advantage of removing all dead code.
If there is dead code, the latter is better: work done on unreachable blocks is wasted. However, by the time we run the common block elimination pass, the input graph has already had all dead code removed. This is done during control flow optimization in CmmContFlowOpt, which is our first Cmm pass.
This means common block elimination is free to use toBlockList, because revPostorder would return the same blocks (although in a different order).
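The difference between the two ways can be sketched on a toy graph. The Graph type and both functions below are hypothetical, and reachableBlocks is a plain depth-first traversal rather than GHC's revPostorder. The point is only that the two agree as sets once no unreachable block is left.

```haskell
import qualified Data.IntMap.Strict as IM
import qualified Data.IntSet as IS

type Label = Int

-- Toy graph: each block label maps to the labels of its successors.
type Graph = IM.IntMap [Label]

-- The fast way: flatten the underlying map, with no traversal at all.
allBlocks :: Graph -> [Label]
allBlocks = IM.keys

-- The convenient way: depth-first search from the entry label,
-- returning only the blocks reachable from it (in visit order).
reachableBlocks :: Label -> Graph -> [Label]
reachableBlocks entry g = reverse (snd (go (IS.empty, []) entry))
  where
    go (seen, acc) l
      | l `IS.member` seen = (seen, acc)
      | otherwise =
          foldl go (IS.insert l seen, l : acc) (IM.findWithDefault [] l g)
```

On a graph that still contains a dead block, allBlocks returns it and reachableBlocks does not. Once the dead block has been removed, both return the same set of labels, possibly in a different order.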
Change the triemap used for grouping by a label list to
ListMap (GenMap IntMap).
- Using GenMap offers leaf compression, a trie optimization described
by the Note [Compressed TrieMap] in CoreSyn/TrieMap.hs
- Using an IntMap removes the overhead associated with UniqDFM.
Why this is still deterministic after the change:
- IntMap is deterministic given the same keys.
- Labels have an Int representation, so for the same Labels we get the
same keys, hence the same result for each run.
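The determinism argument can be illustrated with a minimal trie keyed by lists of Int, where every level is an IntMap. This is a sketch under assumptions: ListTrie, insertWith, trieToList and groupByKey are hypothetical names and not GHC's TrieMap API, and the sketch does no leaf compression.

```haskell
import qualified Data.IntMap.Strict as IM
import Data.Maybe (fromMaybe)

-- A minimal trie keyed by lists of Int, one IntMap per level.
data ListTrie a = ListTrie
  { here     :: Maybe a                 -- value for the key ending at this node
  , children :: IM.IntMap (ListTrie a)  -- one subtrie per next key element
  }

emptyTrie :: ListTrie a
emptyTrie = ListTrie Nothing IM.empty

-- Insert a value, combining it with an existing one as (f new old).
insertWith :: (a -> a -> a) -> [Int] -> a -> ListTrie a -> ListTrie a
insertWith f []       v t = t { here = Just (maybe v (f v) (here t)) }
insertWith f (k : ks) v t =
  t { children = IM.insert k (insertWith f ks v child) (children t) }
  where
    child = fromMaybe emptyTrie (IM.lookup k (children t))

-- Enumerate entries in ascending key order. The order depends only on
-- the Int keys present, not on the order in which they were inserted.
trieToList :: ListTrie a -> [([Int], a)]
trieToList t =
  [ ([], v) | Just v <- [here t] ] ++
  [ (k : ks, v)
  | (k, sub) <- IM.toAscList (children t)
  , (ks, v)  <- trieToList sub
  ]

-- Group values that share the same list key.
groupByKey :: [([Int], b)] -> [([Int], [b])]
groupByKey = trieToList . foldl (\t (k, v) -> insertWith (++) k [v] t) emptyTrie
```

Grouping the same keyed items in a different insertion order yields the groups in the same key order, which is the property the reasoning above relies on. The order of values inside one group still follows insertion order in this sketch.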