Commit 26a429c0 authored by apankiv, committed by Jan Stolarek

Repo cleanup. Fixes #6

  * remove unused modules
  * delete unnecessary tex comments
  * delete paper
  * update .gitignore: we no longer need to ignore LaTeX build artifacts,
    as the paper has been removed from the repository.  We also ignore
    cabal sandboxes
  * update readme to link to the paper.
parent b4477e82
# Cabal sandbox
# Coverage reports
Here's how to work on Hoopl without breaking GHC.
- Hoopl maintainers should work on the branch "develop", where they
need not validate changes against GHC.
- GHC uses the 'master' branch, and makes changes there from time to
time as necessary to keep GHC working.
- GHC central merges master with develop occasionally (e.g. before a
new GHC stable branch is created).
- Intermediate releases can be made from develop, although we have
to ensure that each GHC release ships with a released version of
the library. (As of May 2012, Ian Lynagh coordinates with the
package maintainers to make this happen.)
Problems with polymorphic transfer functions on the left of an arrow
using RankNTypes:
- Can't easily write generic debugging code to show facts
propagating through a graph, because
. can't tell the shape of the node
. therefore can't tell the type of the facts
- Can't write a generic dominator analysis that assumes only (Edges n)
- Can't use default cases (or at least not easily) in the transfer
- Harder to reuse common predicate transformers like
  . id
  . distributeFacts :: Edges n => n -> f -> FactBase f
      distributeFacts n f = mkFactBase [(l, f) | l <- successors n]
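As a sanity check, distributeFacts can be run standalone with toy stand-ins for Label, FactBase, and the successor list (all hypothetical here, not Hoopl's real definitions):

```haskell
import qualified Data.Map as M

-- Toy stand-ins (not Hoopl's real types):
type Label = Int
type FactBase f = M.Map Label f

mkFactBase :: [(Label, f)] -> FactBase f
mkFactBase = M.fromList

-- Send the same fact to every successor label:
distributeFacts :: [Label] -> f -> FactBase f
distributeFacts succs f = mkFactBase [(l, f) | l <- succs]
```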
Instructions given to NR's class:
If you consult the type definition of FwdTransfer,
you'll see that it requires a polymorphic function and uses a type
family which alters the types of arguments and results depending on
the shape of a node. If the type of a fact is 'f', then
- The predicate transformer for a closed/open node has type FactBase f -> f
- The predicate transformer for an open/open node has type f -> f
- The predicate transformer for an open/closed node has type f -> FactBase f
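One way to render this shape-indexed type with a type family, matching the first-node and last-node examples below (a sketch with toy stand-ins, not Hoopl's actual definitions):

```haskell
{-# LANGUAGE GADTs, TypeFamilies, RankNTypes, EmptyDataDecls #-}
import qualified Data.Map as M

-- Toy stand-ins (not Hoopl's real types):
data O  -- open
data C  -- closed
type Label = Int
type FactBase f = M.Map Label f

-- The fact on each side of a node depends on whether that side
-- is open (a single fact) or closed (a fact base):
type family Fact sh f
type instance Fact O f = f
type instance Fact C f = FactBase f

-- One polymorphic function covers all three node shapes:
newtype FwdTransfer n f =
  FwdTransfer (forall e x . n e x -> Fact e f -> Fact x f)

-- A tiny illustrative node type and transfer:
data Node e x where
  Entry :: Label   -> Node C O   -- first node:  FactBase f -> f
  Incr  ::            Node O O   -- middle node: f -> f
  Exit  :: [Label] -> Node O C   -- last node:   f -> FactBase f

xfer :: FwdTransfer Node Int
xfer = FwdTransfer t
  where t :: Node e x -> Fact e Int -> Fact x Int
        t (Entry l) fb = M.findWithDefault 0 l fb
        t Incr      f  = f + 1
        t (Exit ls) f  = M.fromList [ (l, f) | l <- ls ]
```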
Simon was very enamored of this interface, but it's clear that it
imposes a heavy burden on clients:
1. For a typical first node such as
LabelNode l
You'll have to capture the fact using
fromJust $ factLookup factbase l
2. For a last node you may want something like
\f -> mkFactBase [(l, f) | l <- successors n]
Some last nodes may require more elaborate code.
3. Because the function is both GADT and polymorphic, you can't
default any cases---every constructor has to be written
explicitly. When you must write out a constructor but don't care
about its arguments, the record wildcard syntax can be useful:
xfer (ArraySet {}) = id
This syntax matches any fully saturated application of ArraySet,
no matter how many arguments ArraySet expects.
@@ -3,29 +3,31 @@ The `hoopl` Package
## Hoopl: A Higher-Order OPtimization Library
API documentation can be found on
[Hackage]( For detailed explanation
of the library design see the paper ["Hoopl: A Modular, Reusable Library for
Dataflow Analysis and Transformation"].
| Directory | Contents
| -------------- | ---------
| `paper/` | A paper about Hoopl
| `prototypes/` | A sampling of prototypes and early designs
| `src/` | The current official sources to the Cabal package
| `testing/` | Tests, including a sample client. See [`testing/README`](testing/README)
### Development Notes
To build the library run:

cabal configure
cabal build
cabal install --enable-documentation
To run the tests in the `testing/` folder run:
cabal configure --enable-tests
cabal test
To run the tests with the test coverage report run:
cabal configure --enable-tests -f testcoverage
cabal test
@@ -51,8 +51,6 @@ Library
-- Compiler.Hoopl.DataflowFold,
-- Compiler.Hoopl.OldDataflow,
-- The remaining modules are hidden *provisionally*
Other-modules: Compiler.Hoopl.Checkpoint,
File deleted
# General makefile for Latex stuff
.SUFFIXES: .tib .verb .tex .fig .dvi .ps
MAIN = dfopt
# Styles are in papers/styles
TEXINPUTS := .:../styles//:$(TEXINPUTS)
# Bibliographies are in papers/bibs
BIBINPUTS := .:../bibs//:$(BIBINPUTS)
default: comb1.tex iterf.tex pairf.tex dg.tex cprop.tex
[ -r "$(MAIN)du.tex" ] && chmod +w $(MAIN)du.tex
./defuse < $(MAIN).tex > $(MAIN)du.tex
chmod -w $(MAIN)du.tex
latex $(MAIN).tex
# bibtex $(MAIN)
latex $(MAIN).tex
latex $(MAIN).tex
dvips -f -P pdf < $(MAIN).dvi > $(MAIN).ps
ps2pdf $(MAIN).ps
latex escMeets.tex
bibtex escMeets
dvips -f < escMeets.dvi >
bibtex $(MAIN)
latex $(MAIN).tex
bibtex $(MAIN)
latex $(MAIN).tex
pdflatex $(MAIN).tex
latex $(MAIN).tex
bibtex $(MAIN)
latex $(MAIN).tex
latex $(MAIN).tex
dvips -t a4 $(MAIN).dvi -o $(MAIN).ps
clean-ps imp*.ps
comb1.tex iterf.tex pairf.tex: ./xsource $(HOOPL)/Combinators.hs
lua ./xsource $(HOOPL)/Combinators.hs
dg.tex: ./xsource $(HOOPL)/Dataflow.hs
lua ./xsource $(HOOPL)/Dataflow.hs
CPROPS=$(CLIENT)/ConstProp.hs $(CLIENT)/Simplify.hs $(CLIENT)/Test.hs
cprop.tex: ./xsource $(CPROPS)
lua ./xsource $(CPROPS)
Notes 18 March 2010
* John and Norman are not convinced that the ChangeFlag 'tfb_cha' is
being updated correctly when facts change. Because block IDs are
added to 'tfb_bids' incrementally, we are worried that a fact at
label L could be made to climb by a predecessor, but that this
change wouldn't be noted by updateFact unless the block labelled L
had already been visited. We spent a good 20 minutes on this
question, which right there is indicative of a problem. The
computation of the fixed point needs to be manifestly correct.
Perhaps tfb_bids should be initialized all at one go, and perhaps
it should be given type 'BlockId -> Bool'?
* The 'nodeGraph' function, together with the supporting case of
OCFlag and IsOC, are a bit eye-popping. A good explanation
should be a high priority. Also, it would be good to be able to
say in the paper whether GHC -O is able to eliminate the IsOC
type-class dictionaries. Finally, it is probably also worth
noting that if Haskell had Sorts as in the Omega language, this
issue might not arise.
* Similarly, the new type-class constraints in GFTR are worth
noting, and the same questions arise about dictionaries.
* Norman has a few ideas to tidy the implementation of gftGraph.
He will try them out, and if he likes them, will send them on.
* The >>> arrow has but one use. We suggest a local binding:
blocks_once :: Blocks n -> Trans (TxFactBase n f) (TxFactBase n f)
blocks_once = foldr ((>>>) . block_once) idTrans
where (t1 >>> t2) f = t1 f >>= t2
* Does the "it's a bit disgusting..." comment still apply?
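The suggested (>>>) is just Kleisli composition; here is a runnable toy version with Trans specialized to Maybe (all names are stand-ins for the note's types, not the real ones):

```haskell
-- Toy stand-in: the note's Trans, specialized to Maybe.
type Trans a b = a -> Maybe b

idTrans :: Trans a a
idTrans = Just

-- Left-to-right (Kleisli) composition, as in the note:
(>>>) :: Trans a b -> Trans b c -> Trans a c
(t1 >>> t2) f = t1 f >>= t2

-- Folding per-block transformers into one pass, shaped like blocks_once:
blocks_once :: [Trans f f] -> Trans f f
blocks_once = foldr (>>>) idTrans
```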
Notes March 2010
Norman's comments on the draft so far:
- Revisit introduction
- Still skeptical about blockId function
- GMany: does the list have to be non-empty?
Could we have GMany b1 [] b2?
- Distinction between invariants that hold on "finished graphs"
and ones that hold on graphs under construction.
- Consider (GUnit b) :: Graph C C, can successors(b) include b's
own BlockId? No.
- If you ask for successors(x) can you get any pointer into x?
Answer no.
Simon says: you can only get a loop in g_blocks. A singleton
block can't be a loop.
- Client knows about nodes. Our job is to lift to blocks and graphs.
* I would love to simplify the Graph type further, but I don't know
how. In particular, we seem to *need* a function of type
lift :: n e x -> Graph n e x
to use when the client's rewriting function says Nothing.
But that forces the slightly
* Talking to John, we agreed that a common use of Hoopl will be to
analyse full graphs, to get a full mapping (BlockId -> fact), for
the *internal* nodes of the graph, not just its out-edges. Indeed
for a full graph (e.g. a procedure body) there will *be* no out-edges.
So maybe we want
data Graph n e x where
GMany {
g_entry :: Block n e C,
g_blocks :: FullGraph n,
newtype FullGraph n = FG [Block n C C]
And the client might define a procedure thus:
data Procedure = Proc BlockId -- Entry point
(FullGraph CmmNode)
Now we may imagine
GFT_FullGraph n f = GFT (FullGraph n)
and the client interface might be exposed for FullGraph.
Meanwhile, the recursive invocations of the analysis still
work on Graphs.
So a FullGraph would be a fourth kind of entity (as well as
nodes, blocks, graphs), albeit one that is not an instance of
That would make the signature of analyseAndRewrite look like this:
:: forall n f. Edges n
=> RewritingDepth
-> DataflowLattice f
-> ForwardTransfers n f
-> ForwardRewrites n f
-> GFT_FullGraph n f
GFT_FullGraph n f = FullGraph n -> InFactC f ->
* Incidentally, elevating FullGraph in this way would let
us switch to BlockEnv or whatever else we wanted if that
proved convenient.
* Maybe FullGraph should be Graph, and Graph should be PGraph (for
partial graph), or SubGraph.
* I was thinking how to do dead-block elimination. Given a fact
(B17 -> Dead), how can I rewrite the block with label B17 to
an empty graph? I'd like to write
rewrite fact_fun (Label b)
| fact_fun b == Dead = Just (GUnit (BUnit b `BCat` unreachable))
| otherwise = Nothing
So what is "unreachable :: Block"? I suppose it's a new constructor
of the Block type, that eats all its downstream fellows:
data Block n e x where
BUnr :: Block n O x before...
It's a bit like the GNil constructor, which is there primarily
to allow us to rewrite a node to a no-op.
It's a bit tiresome that it has to be in Block, not Graph, but
we still need that Label.
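A hypothetical rendering of that Block type with the proposed constructor (the other constructors are guesses in the spirit of the notes, not the real definition):

```haskell
{-# LANGUAGE GADTs, EmptyDataDecls #-}
data O  -- open
data C  -- closed

data Block n e x where
  BUnit :: n e x -> Block n e x                       -- a single node
  BCat  :: Block n e O -> Block n O x -> Block n e x  -- sequencing
  BUnr  :: Block n O x                                -- 'unreachable': swallows
                                                      -- everything downstream

-- Count the real nodes in a block; BUnr contributes none:
blockLen :: Block n e x -> Int
blockLen (BUnit _)  = 1
blockLen (BCat a b) = blockLen a + blockLen b
blockLen BUnr       = 0

-- A toy node type to exercise it:
data N e x where
  Mid :: N O O
```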
"Optimization" encompasses:
- substitution informed by equational reasoning (about states)
- removal of redundant code, as justified by reasoning about
"Code motion" is achieved by inserting redundant code,
thereby making original code redundant, which in turn
can be removed.
No single technique; the secret sauce is how we combine things:
- Zipper CFG
- Distinct representations for construction and analysis of CFGs
- Maximum use of polymorphism
- Type classes to make notation resemble prior art
- Transfer equations coded in dragon-book style
- Fixed points courtesy Lerner, Grove, and Chambers (2002)
We make dataflow optimization easy to think about and easy to build:
* Ideas that reconcile the ad-hoc 'optimization zoo' found in the
dragon book with methods of reasoning long understood by functional
programmers
* Design and implementation that make it not just possible but *easy*
to use dataflow techniques in your compiler.
Working title: Dataflow Optimization Made Simple
Note: By decomposing 'optimizations' into smaller units, we simplify.
'induction-variable elimination' is *not* an atomic thing!
Vague Outline
1. Intro
2. Short example
3. Logical view of optimization
4. Clients (examples, including type class declarations)
5. Graphs
6. Fixed-point computation; the dataflow monad
7. Discussion
module Rew where
data Rew a = No
| Mk a
| Then (Rew a) (Rew a)
| Iter (Rew a)
rewrite_direct, rewrite
  :: Rew (node -> fact -> Maybe graph)
  -> node -> fact -> Maybe (graph, Rew (node -> fact -> Maybe graph))
rewrite_direct rs n f = rew rs
where rew No = Nothing
rew (Mk r) =
case r n f of Nothing -> Nothing
Just g -> Just (g, No)
rew (r1 `Then` r2) =
case rew r1 of
Nothing -> rew r2
Just (g, r1') -> Just (g, r1' `Then` r2)
rew (Iter r) =
case rew r of
Nothing -> Nothing
Just (g, r') -> Just (g, r' `Then` Iter r)
rewrite rs node f = rew rs Just Nothing
  where rew No j n = n
        rew (Mk r) j n =
          case r node f of Nothing -> n
                           Just g -> j (g, No)
        rew (r1 `Then` r2) j n = rew r1 (j . add r2) (rew r2 j n)
        rew (Iter r) j n = rew r (j . add (Iter r)) n
        add tail (g, r) = (g, r `Then` tail)
rewritem :: Monad m => Rew (node -> fact -> m (Maybe graph))
-> node -> fact -> m (Maybe (graph, Rew (node -> fact -> m (Maybe graph))))
rewritem rs node f = rew rs (return . Just) (return Nothing)
  where rew No j n = n
        rew (Mk r) j n = do mg <- r node f
                            case mg of Nothing -> n
                                       Just g -> j (g, No)
        rew (r1 `Then` r2) j n = rew r1 (j . add r2) (rew r2 j n)
        rew (Iter r) j n = rew r (j . add (Iter r)) n
        add tail (g, r) = (g, r `Then` tail)
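A self-contained sanity check of the CPS rewrite on toy types (node = Int, fact = (), graph = String; all choices illustrative):

```haskell
data Rew a = No
           | Mk a
           | Then (Rew a) (Rew a)
           | Iter (Rew a)

rewrite :: Rew (node -> fact -> Maybe graph)
        -> node -> fact
        -> Maybe (graph, Rew (node -> fact -> Maybe graph))
rewrite rs node f = rew rs Just Nothing
  where rew No             j n = n
        rew (Mk r)         j n = case r node f of
                                   Nothing -> n
                                   Just g  -> j (g, No)
        rew (r1 `Then` r2) j n = rew r1 (j . add r2) (rew r2 j n)
        rew (Iter r)       j n = rew r (j . add (Iter r)) n
        add tl (g, r) = (g, r `Then` tl)

-- A rewriter that fires only on even 'nodes':
evens :: Rew (Int -> () -> Maybe String)
evens = Mk (\n () -> if even n then Just ("rewrote " ++ show n) else Nothing)
```

On success the residual rewriter (here No) comes back alongside the new graph, ready for the next pass.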
- get Simon to review Lerner, Ullman & Kam, Kildall
- implementation section: why?
. really exists
. not insane
. dramatically simpler than original & other implementations
. can see some of the idea
- massive failure of parallel structure between sections 3 and 6
(three bullets)
- Hypothesis:
if {P} s --> s' by rewrite, and {P} s {Q} and {P} s' {Q'}, then
Q \sqsubseteq Q'
if s {Q} --> s' by rewrite, and {P} s {Q} and {P'} s' {Q}, then
P' \sqsubseteq P
Why do we care? Is this a theorem? A requirement? Do we want
implication (in the logic) and not a lattice op? How does
implication translate to lattice ops, anyway? (disjunction ~ join)
DFOPT: if any extra room, restore Davidson reference!
then: 'to reduce register pressure'
then: 2nd dataflow paper
%.bbl:Q: %.aux
echo "=== Pretend I ran nbibtex here (use the .bib from the repo to make a .bbl) ==="
%.bib: %.aux
echo "=== Pretend I ran nbibtex here (use the .bib from the repo to make a .bbl) ==="
%.bbl: %.aux
nbibtex $stem
%.bib: %.aux
nbibtex -o $target -bib $stem
%.bbl:Q: %.aux
echo "=== Pretend I ran nbibtex here (use the .bbl from the repo) ==="
# do nothing
\ No newline at end of file
%.bitly:V: $HOME/www/pubs/%.pdf
rsync -avP $prereq$stem.pdf
# do nothing
# do nothing
\ No newline at end of file
% I have enclosed code.sty, which achieves 99% of what you want without
% the need for a separate preprocessor. At the start of your document
% you write "\makeatactive". From then on, inline code is written as @\x
% -> x_1 & y@. The only difference with what you are used to, is that
% instead of
% @
% foo :: Int -> Int
% foo = \n -> n+1
% @
% you have to write
% \begin{code}
% foo :: Int -> Int
% foo = \n -> n+1
% \end{code}
% and that you cannot use @ in \section{} and \caption{}. For the paper
% that occurred twice, in which case I had to replace @...@ by \texttt{...}.
% code.sty --- nice verbatim mode for code
% To get '@' use \verb+@+
\let\do\@makeother \dospecials
%% next two lines provide more sensible spacing
\@tempdima = \fontdimen2\font\relax
\spaceskip = 0.65\@tempdima
% The \makeatactive command:
% makes @ active, in such a way that @...@ behaves as \icode@...@:
\catcode`@=\active \def@{\icode@}
% Since @ becomes active, it has to be taken care of in verbatim-modes:
\let\olddospecials\dospecials \def\dospecials{\do\@\olddospecials}}
% \gdef\makeatother{\g@remfrom@specials{\@}\@makeother\@}
\begingroup \catcode `|=0 \catcode `[= 1
\catcode`]=2 \catcode `\{=12 \catcode `\}=12
\catcode`\\=12 |gdef|@xcode#1\end{code}[#1|end[code]]