@@ -29,4 +29,4 @@ dist
/dist-install
/dist-boot
/ghc.mk
.hpc
language: c
sudo: required
env:
- GHCVER=7.0.1
- GHCVER=7.0.2
- GHCVER=7.0.3
- GHCVER=7.0.4
- GHCVER=7.2.2
- GHCVER=7.4.1
- GHCVER=7.4.2
- GHCVER=7.6.1
- GHCVER=7.6.2
- GHCVER=7.6.3
- GHCVER=head
- CABALVER=1.16 GHCVER=7.0.1
- CABALVER=1.16 GHCVER=7.0.4
- CABALVER=1.16 GHCVER=7.2.2
- CABALVER=1.16 GHCVER=7.4.2
- CABALVER=1.16 GHCVER=7.6.3
- CABALVER=1.18 GHCVER=7.8.4
- CABALVER=1.22 GHCVER=7.10.1
- CABALVER=head GHCVER=head
matrix:
allow_failures:
- env: GHCVER=head
- env: CABALVER=head GHCVER=head
before_install:
- sudo add-apt-repository -y ppa:hvr/ghc
- sudo apt-get update
- sudo apt-get install cabal-install-1.18 ghc-$GHCVER
- export PATH=/opt/ghc/$GHCVER/bin:$PATH
- travis_retry sudo add-apt-repository -y ppa:hvr/ghc
- travis_retry sudo apt-get update
- travis_retry sudo apt-get install cabal-install-$CABALVER ghc-$GHCVER
- export PATH=/opt/ghc/$GHCVER/bin:/opt/cabal/$CABALVER/bin:$PATH
install:
- cabal-1.18 update
- ghc --version
- cabal --version
- echo "$(ghc --version) [$(ghc --print-project-git-commit-id 2> /dev/null || echo '?')]"
- travis_retry cabal update
script:
- cabal-1.18 configure -v2
- cabal-1.18 build
- cabal-1.18 check
- cabal-1.18 sdist
- export SRC_TGZ=$(cabal-1.18 info . | awk '{print $2 ".tar.gz";exit}') ;
cd dist/;
if [ -f "$SRC_TGZ" ]; then
cabal-1.18 install "$SRC_TGZ";
else
echo "expected '$SRC_TGZ' not found";
exit 1;
fi
- cabal install --only-dependencies --enable-tests
- cabal configure -v2 --enable-tests
- cabal build
- cabal test
- cabal check
- cabal sdist
- SRC_TGZ=$(cabal info . | awk '{print $2;exit}').tar.gz &&
cd dist && cabal install --force-reinstalls "$SRC_TGZ"
This repository contains things related to
Hoopl: A Higher-Order OPtimization Library
** The closest thing we have to a SAMPLE CLIENT is in ./testing **
Directory Contents
paper/ A paper about Hoopl
prototypes/ A sampling of prototypes and early designs
src/ The current official sources to the Cabal package
testing/ Tests, including a sample client. See ./testing/README.
To build the library, change to the src directory and run
cabal configure --prefix=$HOME --user # we have no idea what this means
cabal build
cabal install --enable-documentation
You'll need a Haskell Platform, which should include appropriate
versions of Cabal and GHC.
To upload to Hackage,
cabal sdist
cabal upload dist/something.tar.gz
Developers, please see HOWTO-BRANCHES
The `hoopl` Package [![Hackage](https://img.shields.io/hackage/v/hoopl.svg)](https://hackage.haskell.org/package/hoopl) [![Build Status](https://travis-ci.org/haskell/hoopl.svg)](https://travis-ci.org/haskell/hoopl)
===================
## Hoopl: A Higher-Order OPtimization Library
API documentation can be found on [Hackage](https://hackage.haskell.org/package/hoopl).
| Directory | Contents
| -------------- | ---------
| `paper/` | A paper about Hoopl
| `prototypes/` | A sampling of prototypes and early designs
| `src/` | The current official sources to the Cabal package
| `testing/` | Tests, including a sample client. See [`testing/README`](testing/README)
### Development Notes
To build the library, change to the src directory and run
cabal configure --prefix=$HOME --user # we have no idea what this means
cabal build
cabal install --enable-documentation
To run the tests in the folder testing/, change to the src directory and run
cabal configure --enable-tests
cabal test
To run the tests with the test coverage report, change to the src directory and run
cabal configure --enable-tests -f testcoverage
cabal test
You'll need a Haskell Platform, which should include appropriate
versions of Cabal and GHC.
### Checklist for Making Releases
In order to facilitate GHC development's workflow, the version in [`hoopl.cabal`](hoopl.cabal) is to be bumped as soon as a change requires a corresponding version bump (according to the PVP) relative to the last released `hoopl` version.
1. Make sure `hoopl` passes Travis for all GHC versions in the build-matrix
2. Update Changelog (& `git commit`)
3. Generate source tarball via `cabal sdist` and upload a candidate to Hackage (see note below), and inspect the result.
4. If everything checks out, make an annotated and GPG-signed Git release tag: `git tag -a -s v${VER} -m "hoopl ${VER}"`
5. Publish the package candidate (there's a button for that on Hackage)
6. Work on next release
Note: To upload to Hackage,
cabal sdist
cabal upload dist/hoopl-*.tar.gz
However, it's recommended to use the Hackage feature for
[uploading a candidate](http://hackage.haskell.org/packages/candidates/upload).
# Changelog for [`hoopl` package](http://hackage.haskell.org/package/hoopl)
## ...
- Replace the `#if CABAL` macro by no CPP at all
## 3.10.1.1 *Aug 2015*
- Add `#if CABAL` macro to several hoopl source files such that the Cabal-generated macro is not included when building in ghci
- Change the test code (testing/*) to compare the converted graphs against the expected graphs in AST form
- Update the cabal file to run tests and generate a test coverage report
- Unhide gSplice of Hoopl.Graph
- Expose Showing of Hoopl.Show
- Some fixes to the testing code
## 3.10.1.0 *Apr 2015*
- Re-export runWithFuel from Compiler.Hoopl.
- Remove redundant constraints
## 3.10.0.2 *Dec 2014*
- Add support for `base-4.8.0.0` package version
Name: hoopl
Version: 3.10.0.2
Version: 3.10.2.0
-- NOTE: Don't forget to update ./changelog.md
Description:
Higher-order optimization library
@@ -10,20 +10,25 @@ Description:
License: BSD3
License-File: LICENSE
Author: Norman Ramsey, Joao Dias, Simon Marlow and Simon Peyton Jones
Maintainer: nr@cs.tufts.edu
Maintainer: nr@cs.tufts.edu, andreas.voellmy@gmail.com, email@ningwang.org
Homepage: http://ghc.cs.tufts.edu/hoopl/
Bug-Reports: http://ghc.haskell.org/trac/ghc/newticket?component=libraries/hoopl
Bug-Reports: https://github.com/haskell/hoopl/issues/
Build-Type: Simple
Cabal-Version: >=1.10
Synopsis: A library to support dataflow analysis and optimization
Category: Compilers/Interpreters
Tested-With: GHC>=7.0.1
Extra-Source-Files: README, hoopl.pdf, changelog.md
Extra-Source-Files: README.md, hoopl.pdf, changelog.md
Source-repository head
Type: git
Location: http://git.haskell.org/packages/hoopl.git
flag testcoverage
description: Enable test coverage report
default: False
Library
Default-Language: Haskell2010
Other-Extensions: CPP
@@ -40,7 +45,7 @@ Library
Other-Extensions: Safe Trustworthy
Hs-Source-Dirs: src
Build-Depends: base >= 4.3 && < 4.9, containers >= 0.4 && < 0.6
Build-Depends: base >= 4.3 && < 4.10, containers >= 0.4 && < 0.6
Exposed-Modules: Compiler.Hoopl,
Compiler.Hoopl.Internals,
Compiler.Hoopl.Wrappers,
@@ -68,3 +73,19 @@ Library
Ghc-Options: -Wall -fno-warn-name-shadowing
Test-Suite hoopl-test
Default-Language: Haskell2010
Type: exitcode-stdio-1.0
Main-Is: Main.hs
Hs-Source-Dirs: testing src
Build-Depends: base >= 4.3 && < 4.10,
containers >= 0.4 && < 0.6,
filepath,
mtl >= 2.1.3.1,
parsec >= 3.1.7,
test-framework < 0.9,
                 test-framework-hunit < 0.4
if flag(testcoverage)
Ghc-Options: -fhpc
We agree with the reviewers on almost all points. However, we draw
a more positive conclusion from these points than the reviewers imply:
* We agree with reviewer #1 that our paper is about solid
engineering, a subject that is often valued at ICFP.
As engineers, we take reviewer #2's comment 'appears
straightforward' as a high compliment. A simple design can be
surprisingly difficult to achieve; it is often obvious only in
retrospect. We believe that the interfaces described in the paper
are substantially simpler than the interfaces we described in
2005, and unlike the interfaces in the 2005 paper, they are
reusable.
* The impression that our library is conceived just to serve GHC is
  misleading and reflects a failure of our presentation. We
specifically made our library *polymorphic* so that it can be used
with *any* intermediate language, provided only that the language
distinguishes control transfers and provides a means to follow
control-flow edges. We hope soon to use the library with
representations of machine instructions.
* We hope Reviewer #3 may reconsider whether our library is fit for
general purposes. We make no claim to theoretical novelty and no
claim to enable analyses that cannot be implemented using other
methods. Our claim is that using our library, it is much *easier*
to implement dataflow analyses and transformations than it is
using other compiler infrastructures (e.g., SUIF or vpo, to name
two with which we are familiar). In substantiating this claim, we
chose examples of analyses and optimizations that are already
known, so that readers can compare our code with code written for
their favorite compilers.
To be sure the examples are right, we chose *only* examples that
have been implemented and tested in GHC. This choice may have
influenced the reviewer's impression that the library is not fit
for general purposes.
There are many faults in the paper. If the paper is accepted,
1. We will make sure all missing definitions and explanations are
accounted for, so that readers can understand the code easily.
2. We will rewrite Section 7 completely. (We have already done so,
but since it would be unfair to ask the reviewers to base a
decision on work done since the deadline, we say no more here.)
3. We will provide a suitable scholarly account of the relationship
with Schmidt's work and with abstract interpretation more generally.
To reviewer #1: if register pressure could be approximated by,
e.g., the number of live variables, then it would be a
straightforward matter to write a dataflow pass that removes
redundant reloads when the number of live variables is small.
It would require a forward pass to mark the redundant reloads and a
backward pass to remove them.
Dear Prof. Norman Ramsey:
Final versions of the reviews for your submission
Dataflow Optimization Made Simple
are at the bottom of this mail. I hope you will find them
useful.
Please note that in some cases these reviews have been
updated since the author response period. Also, additional
reviews may have been received after the response period
ended; if this was the case, the committee took note of the
fact that you did not have the opportunity to respond to
them.
Again, if you have any additional questions, please feel free
to contact me.
Sincerely,
Andrew Tolmach
PC Chair
============================================================================
ICFP 2009 Reviews for Submission #8
============================================================================
Title: Dataflow Optimization Made Simple
Authors: Norman Ramsey, Joao Dias and Simon Peyton Jones
============================================================================
REVIEWER #1
============================================================================
---------------------------------------------------------------------------
Comments
---------------------------------------------------------------------------
The authors present "a Haskell library that makes it easy for compiler writers
to implement program transformations based on dataflow analyses".
Although the main goal of the paper is interesting, I think that the authors'
proposal is not so general as claimed in the abstract (and
expected for a general Haskell library).
From a theoretical point of view, it does not introduce anything really new
(the theoretical basis comes from a POPL'02 paper by Lerner, Grove, and
Chambers).
From a practical point of view, I find the following drawbacks:
1.- Although a general-purpose library is claimed to be given,
the development focuses on a few (already known) program analyses
and optimizations which are defined and described for C--,
a particular compiler-target intermediate language
which (as far as I know) is not widely used.
Actually, at the end of the abstract it is said that
this library is the "workhorse of a new backend for GHC".
2.- Many types, functions and data structures are not defined in the paper.
For instance, the types VarSet or Middle (together with some data constructors
like MidAssign, CmmLocal, CmmLoad, etc., and defined functions like aTx, noTx,
etc.) in Figures 3 to 7 seem to be part of the types which are used
in the current implementation of GHC (this is explicit for Middle as one can
read in the last-but-one paragraph of the second column on page 5) or in a full
description of the library which is not available anywhere (at least I found
no link to any additional information about them). This makes the code in
Figures 3 to 7 more difficult to read and understand and, again, more difficult
to generalize to compilers of programming languages other than C--.
3.- It is unclear to me how the optimization of the target code is achieved
by using the proposed functions. The focus is on the dataflow graph but
nothing is said about how this graph is obtained from the unoptimized code
and how an optimized target code is obtained from the transformed/rewritten
graph.
Overall, the paper is very well-written. I also find interesting the
ideas dropped in the Conclusions, and have no doubt that FP can export
many ideas and techniques for optimizing compilers of more 'traditional'
programming languages. However, I failed to see how the proposed
library helps to achieve this since, in my opinion, it
seems to be conceived just to serve the GHC compiler.
============================================================================
REVIEWER #2
============================================================================
---------------------------------------------------------------------------
Comments
---------------------------------------------------------------------------
This paper presents a Haskell library for implementing a general
dataflow analysis and optimization through examples and describes
its implementation.
The introductory part of sections 1 - 6 contains a nice introduction
to dataflow optimization, and a convincing argument on the ease of
developing a variety of dataflow optimizers using the presented
library. However, the presented library appears to be a
straightforward port of Ramsey and Dias's dataflow infrastructure
based on an applicative representation of the control flow graph using
Huet's Zipper. Data representation and the structures of the
optimization engine appear to be quite similar, and it is unclear
what the new technical contributions are. In addition, the description
of the dataflow engine in Section 7 is not very clear due to a lack
of proper explanation. For example, readers would have some difficulty
in understanding the importance and relevance of the data types such
as "Block", "ZTail" and "Zhead" unless they were already familiar with
the underlying Zipper style representation of control flow graph with
the corresponding definitions (ie, "block", "head" and "tail") given
in Ramsey and Dias's original article.
I understand that sometimes republishing a workshop paper at ICFP
would be appropriate, especially when the workshop has limited visibility,
but this may not be the case here. The audience of the ML workshop
significantly overlaps with that of ICFP, and the proceedings of ML
2005 have been published in Elsevier's Electronic Notes in Theoretical
Computer Science, which is a widely available and archival publication.
As such, I doubt that this paper contains enough original contribution
to be appropriate for publication at ICFP 2009.
============================================================================
REVIEWER #3
============================================================================
---------------------------------------------------------------------------
Comments
---------------------------------------------------------------------------
This paper describes the design and Haskell implementation of a generic library
for dataflow analysis and code transformations. Examples of classic dataflow
analyses expressed using the library are shown and appear remarkably compact.
A distinguishing feature of this library is that it supports combined analysis
and transformation, whereby the result of the analysis anticipates the effect
of the code transformation. This follows the approach proposed by Lerner et al.
in 2002. It is pointed out that this enables combining several
analysis/transformation passes in one, but no example is given. The only
example given where analyze&transform gives better results than
analyze-then-transform is dead code elimination by liveness analysis, but as
noted in footnote 1 it was already known how to achieve the same result using a
standard analyze-only dataflow solver.
80% of the paper describes the library and its use. The remaining 20% reads more
like a lecture on dataflow analysis, presented as inference of (certain classes
of) logical assertions. I wasn't completely convinced by this presentation,
which I found less penetrating than D. Schmidt's famous "Data flow analysis is
model checking of abstract interpretations". More generally, it is surprising
that abstract interpretation is mentioned nowhere in this paper.
All in all, this is a well-written paper describing a very solid piece of
compiler engineering, but I didn't see a breakthrough.
Minor comments:
The example in section 4 is a bit mysterious. Once the set of variables live
across a function call has been computed (by a standard liveness analysis), it
is a simple syntactic transformation to insert a spill after each definition
(of one of those variables) and a reload before each use; no further dataflow
analyses are needed. Yes, this can give less precise results than the proposed
approach if there are multiple definitions of a variable, some live across a
call and some not (can this happen in GHC's intermediate code?). But, still,
the hard part about spilling and reloading is to eliminate multiple reloads in
areas of code where register pressure is low, e.g.
x = ...
spill(x)
...
reload(x)
y = ... x ...
// do not reload(x) again if register pressure is low
z = ... x ...
I didn't see whether / how the proposed approach could do this.
The referees' reports are clear, and we haven't identified any
significant misunderstandings. Several referees suggest that the
paper reads more like a pearl than a research contribution, and we are
happy to have it evaluated as such. Below we answer referees'
questions. (Having received such nice detailed reviews, we don't want
to leave referees' questions hanging unanswered, but it is probably
not necessary to read the answers below in order to make a decision
about the paper.)
Referee A asks if we have experimental results which show that the
quality of generated code can compete with state-of-the-art compilers.
Yes, we have experimental results with the Glasgow Haskell Compiler
which show that the new back end produces code at least as good as the
old back end. But although GHC's front end contains some very
sophisticated optimizations, by the time the code gets to the level
shown in the paper, the back-end optimizations are limited, and so
GHC's bar is actually set low.
Referee B, citing Knoop, Koschützki, and Steffen, points out that the
API might be simpler if we eliminated the static type distinction
between 'first', 'last', and 'middle' nodes. Ironically, we were very
inspired by the 'living dinosaur' paper and used it as the starting
point for our representation of control-flow graphs. But giving all
nodes the same type led to a great deal of run-time checking, and to
preserve our sanity we were forced to distinguish at compile time
between first, middle, and last nodes, which of course means that we
reinvented basic blocks. Perhaps one way to think about the design
issues here is that although the split into three static types makes
the API wider, client code is simpler because each of the three static
types of node obeys a stronger invariant (constraining the numbers of
predecessors or successors). In any case, we have experience with
both representations, and our experience is that the wider API leads
to a simpler compiler overall---although we don't know how to make
that case compellingly in a conference submission.
Referee C asks why we rewrite in two steps. The referee has the
answer exactly: during the first step of the analysis, speculative
rewriting produces intermediate results which are not guaranteed to be
sound until a fixed point is reached.
Referee C asks for more detail on how the optimization fuel is used
for debugging. Suppose we are regression-testing the compiler and a
test fails. We re-run the same test with no fuel. If the test then
succeeds, the optimizer is at fault. We ask the compiler how much
fuel was used on the original run, and we use that as the starting
point for a binary search on the fuel supply. This binary search
identifies a single graph-node rewrite which transforms a working test
case into a failed test case. At this point there's no more
automation; the compiler writer takes over and diagnoses whether the
transformation is unjustified or the underlying analysis is faulty.
To summarize, optimization fuel is used to find, in logarithmically
many runs of the compiler, the transformation, analysis, node, and
rewrite that cause a fault. We should add that although this process
is completely automated in the 'Quick C--' compiler written by the
first two authors, it is not yet automated in the Glasgow Haskell
Compiler.
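To make the bisection concrete, here is a minimal Haskell sketch; the
driver compileAndTest and its integer fuel argument are hypothetical
stand-ins for the compile-and-run-the-test step, not part of Hoopl's
actual interface:

    -- Assumes the test passes with zero fuel and fails with maxFuel,
    -- the amount of fuel consumed on the original run.
    findFaultyRewrite :: (Int -> IO Bool)  -- run compiler with given fuel; True = test passes
                      -> Int               -- fuel used on the original, failing run
                      -> IO Int            -- least amount of fuel at which the test fails
    findFaultyRewrite compileAndTest maxFuel = go 0 maxFuel
      where
        -- Invariant: the test passes with lo units of fuel and fails with hi.
        go lo hi
          | lo + 1 == hi = return hi      -- rewrite number hi introduced the fault
          | otherwise    = do { let mid = (lo + hi) `div` 2
                              ; ok <- compileAndTest mid
                              ; if ok then go mid hi else go lo mid }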
Referee E observes that CIL uses OCaml so it may have some nice static
type-checking guarantees. We wrote a predecessor of Hoopl in OCaml
and the static typing was not bad, but having the 'open' and 'closed'
graph properties checked statically is a significant upgrade---we
eliminated a number of dynamic checks, some of which had been sources
of bugs. It is possible that a creative encoding into OCaml would
make it possible to check the same properties statically using only
OCaml's Hindley-Milner system, but GHC's extension of generalized
algebraic data types makes it very easy to provide the extra static
checking.
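As a minimal sketch along the same lines as the prototype ZNode type
elsewhere in this repository (the names below are illustrative, not
Hoopl's actual definitions), GADTs let each node's shape on entry and
exit be part of its type, so that, for example, placing an instruction
after a branch, whose exit is closed, is rejected at compile time:

    {-# LANGUAGE GADTs, EmptyDataDecls #-}
    data O    -- open
    data C    -- closed

    data Node e x where              -- e = shape on entry, x = shape on exit
      Label  :: String -> Node C O   -- closed on entry, open on exit
      Assign :: String -> Int -> Node O O
      Branch :: String -> Node O C   -- open on entry, closed on exit

    -- Sequencing only type-checks when an open exit meets an open entry.
    data Seq e x where
      Unit :: Node e x -> Seq e x
      Cat  :: Seq e O -> Seq O x -> Seq e x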
Referee E also suggests we should compare Hoopl with other engines for
dataflow analysis. We are all wearing our stupid hats and whacking
ourselves in the head for not thinking of this. If it should happen
that the paper is accepted, we'll do a proper job.
===========================================================================
POPL 2010 Review #28A
Updated Saturday 25 Jul 2009 8:03:25am PDT
---------------------------------------------------------------------------
Paper #28: Hoopl: Dataflow Optimization Made Simple
---------------------------------------------------------------------------
Overall merit: B. OK paper, but I will not champion
it.
Reviewer expertise: Z. I am an informed outsider of the
area.
===== Paper summary =====
The paper presents an approach to specifying and combining data flow analyses. The authors do program analysis by solving equations, they relate assertions via weakest liberal preconditions and strongest postconditions, and they combine analyses and transformations using the POPL 2002 paper by Lerner, Grove, and Chambers. The entire framework is written in a functional, nonimperative style that uses zippers and a dataflow monad, and is polymorphic in the underlying representations.
===== Comments for author =====
The paper reads more like a pearl than a research paper. The paper combines "everything we know" into an elegant system for program analysis and optimization.
The examples of analyses and transformations are admirably short, and the paper gives several examples of how they apply to programs.
The main difficulty that has been overcome by the authors is doing the design of the system in a way that puts together many ideas in a neat and seamless way.
Question: do you have experimental results that show that the quality of the produced code can compete with that of state-of-the-art compilers?
===========================================================================
POPL 2010 Review #28B
Updated Saturday 1 Aug 2009 9:46:50am PDT
---------------------------------------------------------------------------
Paper #28: Hoopl: Dataflow Optimization Made Simple
---------------------------------------------------------------------------
Overall merit: C. Weak paper, though I will not fight
strongly against it.
Reviewer expertise: Y. I am knowledgeable in the area,
though not an expert.
===== Paper summary =====
The paper presents the interface of a generic Haskell library for dataflow analysis combined with code transformations, in the style of Lerner, Grove and Chambers (POPL 2002).
The approach is illustrated by very compact implementations of two classic analyses (available variables and liveness) and a less common (and harder to follow) analysis+transformation for the insertion of spill and reload instructions.
===== Comments for author =====
All in all, this is a very solid piece of compiler engineering, and the paper is well written. But there are essentially no new principles in this paper. The only really novel aspect of this work ("analyze and transform" instead of "analyze then transform") is taken from Lerner et al. The use of an applicative "zipper" to represent the CFG scores some points for originality but was already published by the first two authors in a workshop paper (ENTCS 148(2)).
The authors also claim as an achievement the simplicity of their API, but I'm not convinced: for dataflow analysis at least, simpler interfaces could be obtained by throwing away the distinction between "first", "middle" and "last" nodes and working on a CFG of individual instructions [1]. See for example the presentations of Kildall's dataflow equation solver by Klein and Nipkow [2] and by Coupet-Grimal and Delobel [3], both of which were also mechanically proved correct.
I was excited, at first, by the extended example on insertion of reload and spill instructions, because this is an issue that is not well treated in compiler textbooks. In the end, I was a bit disappointed: I had the feeling that the proposed approach doesn't work significantly better than the trivial approach of inserting a spill after every definition and a reload before every use for each variable that couldn't be allocated to a register. Isn't the proposed approach overengineered?
Minor remarks:
Page 3, col 2, "the analysis shows nothing about x, which we notate x = bottom". This explanation of bottom sounds wrong. Thinking in terms of abstract interpretation, k denotes the singleton set of values {k}, top the set of all values, and bot the empty set of values. Knowing x = bottom at the end of the analysis really means something very strong about x, namely that all its definitions are unreachable.
References:
[1] Jens Knoop, Dirk Koschützki and Bernhard Steffen.
"Basic-Block Graphs: Living Dinosaurs?".
Proc. Compiler Construction '98, LNCS 1383, 1998.
[2] Gerwin Klein and Tobias Nipkow.
"Verified bytecode verifiers".
Theor. Comp. Sci 298, 2003.
[3] Solange Coupet-Grimal and William Delobel.
"A Uniform and Certified Approach for Two Static Analyses".
Types for Proofs and Programs, International Workshop, TYPES 2004.
LNCS 3839, 2006.
===========================================================================
POPL 2010 Review #28C
Updated Friday 11 Sep 2009 6:15:44pm PDT
---------------------------------------------------------------------------
Paper #28: Hoopl: Dataflow Optimization Made Simple
---------------------------------------------------------------------------
Overall merit: C. Weak paper, though I will not fight
strongly against it.
Reviewer expertise: Y. I am knowledgeable in the area,
though not an expert.
===== Paper summary =====
The paper presents a data flow analysis and program transformation framework. The framework, Hoopl, is implemented as a Haskell library that compiler writers can use to implement optimizations. The paper presents examples of actual analyses and transformations in the context of the Glasgow Haskell compiler, and outlines the implementation of the dataflow engine, which is the main part of Hoopl.
===== Comments for author =====
It is hard to pinpoint exactly the technical contribution of this paper. On one hand, it appears to be a beautifully engineered implementation of a data flow analysis framework but there is little comparison with other similar frameworks and at this point little evidence that this is the "right" design with the right compromise between expressiveness and generality. The paper is also an improvement over Ramsey and Dias's work but the improvements are scattered here and there. Finally the paper, I feel, is hard to appreciate without some familiarity with the GHC backend. Some of the code has to be elided and some of the code presented uses the actual GHC datatypes (which is good in some sense but also adds some unneeded complexity to the examples).
A couple of specific comments/questions:
- can you explain in more detail the reasons for implementing the rewriting in two steps: first a speculative step and then a step that commits it. Is that because the intermediate results of the analysis are unsound and soundness is only achieved when the analysis reaches a fixed point?
- can you provide more detail on how the optimization fuel is used for debugging
===========================================================================
POPL 2010 Review #28D
Updated Tuesday 15 Sep 2009 5:36:29am PDT
---------------------------------------------------------------------------
Paper #28: Hoopl: Dataflow Optimization Made Simple
---------------------------------------------------------------------------
Overall merit: B. OK paper, but I will not champion
it.
Reviewer expertise: Y. I am knowledgeable in the area,
though not an expert.
===== Paper summary =====
This paper describes Hoopl, a dataflow optimization tool. The paper
first analyzes general properties and principles underlying various
dataflow analysis for low-level code languages through examples, and
identifies major components of a general dataflow optimizer: (1) a
dataflow fact lattice, (2) a transfer function that computes a dataflow
fact of a program point from the preceding (depending on the direction
of the analysis) dataflow facts, (3) a rewrite function that replaces a
node in a control flow graph based on dataflow facts. Based on this
analysis, the paper introduces Hoopl as a generic dataflow optimizer
through type signatures of Hoopl functions, and describes their
functionality. Hoopl takes a dataflow fact lattice (i.e. types and
associated operations), a transfer function, a rewriter, and performs
the iterative process of analyzing the graph using the transfer
function and transforming the graph using the rewrite function until
it obtains the least fixed point. The paper then describes some
aspects of its implementation, including its two-phase architecture
consisting of a speculative iterator and an actualizer, and describes
the implemented forward iterator and forward actualizer in some detail.
===== Comments for author =====
From the presentation, it seems that Hoopl is an easy-to-use and
generic tool that automates dataflow optimization for low-level code
languages. It is well engineered so that compiler writers can readily
use it for implementing various optimizations in their optimizing
compilers. The paper is also very well written. Hoopl's description
through examples can serve as a nice tutorial on a unified view of
dataflow optimization.
However, I am not completely sure that this paper makes a significant
new contribution to POPL 2010. Although it is a well-engineered tool,
Hoopl appears to be based on combinations of known results.
The overall structure of representation and implementation is based
on some of the authors' earlier work on zipper-style control-flow graph
representation and optimization. There are some improvements on
representations and interfaces: graphs are classified into "open" and
"closed" ones, and interfaces of graph splicing functions are
improved. The overall structure of interleaved analysis and
transformation is due to other existing work. Hoopl also includes a
debugging facility, which is based on existing work. The description
of its implementation is too sketchy to be useful in implementing new
optimization engines.
===========================================================================
POPL 2010 Review #28E
Updated Wednesday 16 Sep 2009 9:22:46am PDT
---------------------------------------------------------------------------
Paper #28: Hoopl: Dataflow Optimization Made Simple
---------------------------------------------------------------------------
Overall merit: C. Weak paper, though I will not fight
strongly against it.
Reviewer expertise: X. I am an expert in the subject area
of this paper.
===== Paper summary =====
This paper presents an analysis and transformation engine implemented
in Haskell. To use the engine, the programmer provides a description
of the lattice, transfer functions, and rewrite functions. The engine
then takes care of computing the dataflow fixed point and applying the
rewrites. The paper describes the interface to the engine, shows
examples of several client analyses and optimizations and describes an
implementation of the engine.
===== Comments for author =====
The interesting part of this paper is that it shows how to effectively
combine several previously known techniques/ideas into a single
engine. These techniques/ideas are: the fixed-point formulation of
dataflow analyses; the rewrite-rule formulation of transformation
rules; the composition technique of Lerner, Grove and Chambers; and the
fuel-based abstraction of Whalley for quickly narrowing down the
source of compiler bugs.
However, it's hard to tease out what exactly the contribution
is. Datafow analysis engines based on lattices, transfer functions,
and rewrite functions are very common (Weimer and Necula's CIL has
one, Hendren et al's Soot has one, Lattner and Adve's LLVM has one,
IBM's WALA engine has one). It would be interesting to better
understand how the proposed framework distinguishes itself from these
existing frameworks.
Presumably, one difference is that the proposed framework incorporates
additional techniques (eg: the composition framework and the
fuel-based abstraction). However, these two techniques were previously
published, and they also seem quite orthogonal to each other (which
means the integration of the two techniques would not pose too many
additional challenges -- if it does, the paper should focus on this).
The paper does point out how Haskell helps with many of the
implementation tasks, and the use of Haskell is indeed a difference
from other frameworks. However, the paper doesn't really develop this
point, and it's also not clear how much of this type checking also
exists in other frameworks (eg: CIL uses OCaml so it may have some
nice static type-checking guarantees). The paper would be stronger if
it had a direct comparison (maybe a table?) of what kinds of
properties are statically checked using types in the proposed
framework, vs CIL, Soot, LLVM, and other frameworks too.
The paper could also be improved by reporting on experience in using
the framework. For example: what was it used for? What are some
statistics about the framework (number of analyses implemented, how
many lines of code, bugs found using types, etc.)? How does experience
with the proposed framework compare with other frameworks such as LLVM
(e.g., for conciseness, ease of use, etc.)?
Finally, the paper doesn't seem to address interprocedural analyses
and optimizations (although that's understandable to some extent --
one has to nail down the intra-procedural case first, but it would be
nice to get an idea of how the authors see this framework panning out
in the interprocedural case)
===========================================================================
POPL 2010, Paper #28: Hoopl: Dataflow Optimization Made Simple
Submitted [PDF] 201kB, updated Wednesday 15 Jul 2009 3:35:28pm PDT | SHA-1 58af8027d3a1ff45917d786fde3ed90b2885d754
Abstract
We present Hoopl, a Haskell library that makes it easy for compiler writers to implement program transformations based on dataflow analyses. The compiler writer must identify (a) logical assertions on which the transformation will be based; (b) a representation of such assertions, which should form a lattice of finite height; (c) transfer functions that approximate weakest preconditions or strongest postconditions over the assertions; and (d) rewrite functions whose soundness is justified by the assertions. Hoopl uses the algorithm of Lerner, Grove, and Chambers (2002), which can compose very simple analyses and transformations in a way that achieves the same precision as complex, handwritten "superanalyses." Hoopl will be the workhorse of a new back end for the Glasgow Haskell Compiler (version 6.12, forthcoming). Because the major claim in the paper is that Hoopl makes it easy to implement program transformations, the paper is filled with examples, which are written in Haskell. The paper also sketches the implementation of Hoopl, including some excerpts from the implementation.
Authors
Norman Ramsey (Tufts University) <nr@cs.tufts.edu>
João Dias (Tufts University) <dias@cs.tufts.edu>
Simon Peyton Jones (Microsoft Research) <simonpj@microsoft.com>
Topics
Compilers; Static analysis
              Overall merit   Reviewer expertise
Review #28A   B               Z
Review #28B   C               Y
Review #28C   C               Y
Review #28D   B               Y
Review #28E   C               X
Overall merit choices are:
A. Good paper. I will champion it at the PC meeting.
B. OK paper, but I will not champion it.
C. Weak paper, though I will not fight strongly against it.
D. Serious problems. I will argue to reject this paper.
Reviewer expertise choices are:
X. I am an expert in the subject area of this paper.
Y. I am knowledgeable in the area, though not an expert.
Z. I am an informed outsider of the area.
{-# LANGUAGE RankNTypes, ScopedTypeVariables, GADTs, EmptyDataDecls, PatternGuards, TypeFamilies #-}
-- This version uses type families to express the functional dependency
-- between the open/closed-ness of the input graph and the type of the
-- input fact expected for a graph of that shape
module CunningTransfers( pureAnalysis, analyseAndRewrite ) where
import qualified Data.IntMap as M
import qualified Data.IntSet as S
-----------------------------------------------------------------------------
-- Graphs
-----------------------------------------------------------------------------
data ZOpen
data ZClosed
type O = ZOpen
type C = ZClosed
-- This data type is NOT MENTIONED in the rest of the code.
-- It's just an example to show how we can embed our existing
-- middle/last idea into the new story
data ZNode m l e x where
ZFirst :: BlockId -> ZNode m l C O
ZMid :: m -> ZNode m l O O
ZLast :: l -> ZNode m l O C
data ZBlock node e x where
ZBOne :: node e x -> ZBlock node e x
ZCat :: ZBlock node e O -> ZBlock node O x -> ZBlock node e x
type Block node = ZBlock node C C
data ZGraph node e x where
  ZGMany :: { zg_entry  :: ZBlock node e C
            , zg_blocks :: BlockEnv (Block node)
            , zg_exit   :: ZBlock node C x } -> ZGraph node e x
  ZGOne  :: { zg_mids   :: ZBlock node O O } -> ZGraph node O O
  ZGNil  :: ZGraph node O O
type Graph node = ZGraph node C C
forwardBlockList :: BlockEnv (Block node) -> [(BlockId, Block node)]
-- This produces a list of blocks in order suitable for forward analysis.
-- ToDo: Do a topological sort to improve convergence rate of fixpoint
-- This will require a (HavingSuccessors l) class constraint
forwardBlockList env = M.toList env
-----------------------------------------------------------------------------
-- DataflowLattice
-----------------------------------------------------------------------------
data DataflowLattice a = DataflowLattice {
fact_name :: String, -- Documentation
fact_bot :: a, -- Lattice bottom element
fact_add_to :: a -> a -> TxRes a, -- Lattice join plus change flag
fact_do_logging :: Bool -- log changes
}
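
-- Illustrative sketch (an addition, not part of the original prototype):
-- a hypothetical lattice of label sets.  It assumes the TxRes and
-- ChangeFlag types used elsewhere in this file (TxRes pairs a ChangeFlag
-- with the joined fact).  Join is set union; the flag reports whether the
-- old fact grew.
demoLattice :: DataflowLattice S.IntSet
demoLattice = DataflowLattice
  { fact_name       = "demo: reaching labels"
  , fact_bot        = S.empty
  , fact_add_to     = add
  , fact_do_logging = False }
  where
    -- Old fact first, new fact second, matching the call in updateFact below
    add old new
      | new `S.isSubsetOf` old = TxRes NoChange   old
      | otherwise              = TxRes SomeChange (old `S.union` new)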
-----------------------------------------------------------------------------
-- The main Hoopl API
-----------------------------------------------------------------------------
data ForwardTransfers node f
= ForwardTransfers
{ ft_trans :: forall e x. node e x -> InT e f -> OutT x f }
data ForwardRewrites node f
= ForwardRewrites
{ fr_rw :: forall e x. node e x -> InT e f -> Maybe (AGraph node e x) }
type family InT e f :: *
type instance InT C f = FactBase f
type instance InT O f = f
type family OutT x f :: *
type instance OutT C f = OutFacts f
type instance OutT O f = f
newtype OutFacts fact = OutFacts [(BlockId, fact)]
newtype FactBase fact = FactBase (BlockEnv fact)
data AGraph node e x = AGraph -- Stub for now
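
-- Illustrative sketch (an addition, not part of the original prototype):
-- a toy node type and forward transfer showing how the InT/OutT families
-- resolve by shape.  DemoNode and demoTransfers are hypothetical names;
-- lookupBEnv and orElse are the helpers defined later in this module.
data DemoNode e x where
  DemoEntry :: BlockId   -> DemoNode C O   -- block label (closed entry)
  DemoMid   :: Int       -> DemoNode O O   -- a middle "instruction"
  DemoExit  :: [BlockId] -> DemoNode O C   -- branch to listed successors

demoTransfers :: ForwardTransfers DemoNode S.IntSet
demoTransfers = ForwardTransfers { ft_trans = trans }
  where
    trans :: DemoNode e x -> InT e S.IntSet -> OutT x S.IntSet
    -- Closed entry consumes a FactBase: look up this block's incoming fact
    trans (DemoEntry bid) (FactBase env) = lookupBEnv env bid `orElse` S.empty
    -- Open/open threads a plain fact: record the instruction we saw
    trans (DemoMid n)     fact           = S.insert n fact
    -- Open exit produces OutFacts: the same fact flows to every successor
    trans (DemoExit bids) fact           = OutFacts [ (b, fact) | b <- bids ]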
-----------------------------------------------------------------------------
-- TxFactBase: a FactBase with ChangeFlag information
-----------------------------------------------------------------------------
-- A TxFactBase carries a ChangeFlag with it, and a set of BlockIds
-- to monitor. Updates to other BlockIds don't affect the ChangeFlag
data TxFactBase fact
= TxFB { tfb_fbase :: FactBase fact
, tfb_cha :: ChangeFlag
, tfb_bids :: BlockSet -- Update change flag iff these blocks change
}
updateFact :: DataflowLattice f -> (BlockId, f)
-> TxFactBase f -> TxFactBase f
-- Update a TxFactBase, setting the change flag iff
-- a) the new fact adds information...
-- b) for a block in the BlockSet in the TxFactBase
updateFact lat (bid, new_fact) tx_fb@(TxFB { tfb_fbase = FactBase fbase, tfb_bids = bids})
| NoChange <- cha2 = tx_fb
| bid `elemBlockSet` bids = tx_fb { tfb_fbase = new_fbase, tfb_cha = SomeChange }
| otherwise = tx_fb { tfb_fbase = new_fbase }
where
old_fact = lookupBEnv fbase bid `orElse` fact_bot lat
TxRes cha2 res_fact = fact_add_to lat old_fact new_fact
new_fbase = FactBase (extendBEnv fbase bid res_fact)
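
-- Hypothetical illustration (an addition): feeding one new fact for block 1
-- into an initially empty TxFactBase whose monitored set contains block 1,
-- using the demoLattice sketch above.  emptyBlockSet and extendBlockSet are
-- the helpers defined later in this module.  The result maps block 1 to
-- {42} and sets tfb_cha to SomeChange.
demoUpdate :: TxFactBase S.IntSet
demoUpdate = updateFact demoLattice (mkBlockId 1, S.singleton 42) start
  where
    start = TxFB { tfb_fbase = FactBase (mkBlockEnv [])
                 , tfb_cha   = NoChange
                 , tfb_bids  = extendBlockSet emptyBlockSet (mkBlockId 1) }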
updateFacts :: DataflowLattice f -> BlockId
-> Trans (FactBase f) (OutFacts f)
-> Trans (TxFactBase f) (TxFactBase f)
updateFacts lat bid thing_inside tx_fb@(TxFB { tfb_fbase = fbase, tfb_bids = bids })
= do { OutFacts out_facts <- thing_inside fbase
; let tx_fb' = tx_fb { tfb_bids = extendBlockSet bids bid }
       ; return (foldr (updateFact lat) tx_fb' out_facts) }
-----------------------------------------------------------------------------
-- The Trans arrow
-----------------------------------------------------------------------------
type Trans a b = a -> FuelMonad b
-- Transform a into b, with facts of type f
-- Deals with optimisation fuel and unique supply too
(>>>) :: Trans a b -> Trans b c -> Trans a c
-- Compose two dataflow transfers in sequence
(dft1 >>> dft2) f1 = do { f2 <- dft1 f1; dft2 f2 }
liftTrans :: (a->b) -> Trans a b
liftTrans f x = return (f x)
idTrans :: Trans a a
idTrans x = return x
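
-- Hypothetical sketch (an addition): composing two small Trans steps with
-- liftTrans and (>>>); the FuelMonad is threaded through both stages.
demoPipeline :: Trans Int String
demoPipeline = liftTrans (+ 1) >>> liftTrans show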
fixpointTrans :: forall f. Trans (TxFactBase f) (TxFactBase f)
-> Trans (OutFacts f) (FactBase f)
fixpointTrans thing_inside (OutFacts out_facts)
= loop (FactBase (mkBlockEnv out_facts))
where
loop :: Trans (FactBase f) (FactBase f)
loop fbase = do { tx_fb <- thing_inside (TxFB { tfb_fbase = fbase
, tfb_cha = NoChange
, tfb_bids = emptyBlockSet })
; case tfb_cha tx_fb of
NoChange -> return fbase
SomeChange -> loop (tfb_fbase tx_fb) }
-----------------------------------------------------------------------------
-- Transfer functions
-----------------------------------------------------------------------------
-- Keys to the castle: a generic transfer function for each shape
-- Here's the idea: we start with single-node transfer functions,
-- move to basic-block transfer functions (we have exactly four shapes),
-- then finally to graph transfer functions (which requires iteration).
data GFT thing fact
= GFT { gft_trans :: forall e x. thing e x -> Trans (InT e fact) (OutT x fact) }
type GFT_Node node = GFT node
type GFT_Block node = GFT (ZBlock node)
type GFT_Graph node = GFT (ZGraph node)
----------------------------------------------------------------------------------------------
gftNode :: forall node f . ForwardTransfers node f -> GFT_Node node f
-- Injection from the external interface into the internal representation
gftNode (ForwardTransfers { ft_trans = base_trans })
= GFT { gft_trans = node_trans }
where
node_trans :: node e x -> Trans (InT e f) (OutT x f)
node_trans node f = return (base_trans node f)
gftBlock :: forall node f. GFT_Node node f -> GFT_Block node f
-- Lift from nodes to blocks
gftBlock (GFT { gft_trans = node_trans })
= GFT { gft_trans = block_trans }
where
block_trans :: ZBlock node e x -> Trans (InT e f) (OutT x f)
block_trans (ZBOne node) = node_trans node
block_trans (ZCat head mids) = block_trans head >>> block_trans mids
gftGraph :: forall node f. DataflowLattice f -> GFT_Block node f -> GFT_Graph node f
-- Lift from blocks to graphs
gftGraph lattice (GFT { gft_trans = block_trans })
= GFT { gft_trans = graph_trans }
where
-- These functions are orgasmically beautiful
graph_trans :: ZGraph node e x -> Trans (InT e f) (OutT x f)
graph_trans ZGNil = idTrans
graph_trans (ZGOne mids) = block_trans mids
graph_trans (ZGMany { zg_entry = entry, zg_blocks = blocks, zg_exit = exit })
= block_trans entry >>> ft_blocks blocks >>> block_trans exit
ft_blocks :: BlockEnv (Block node) -> Trans (OutFacts f) (FactBase f)
ft_blocks blocks = fixpointTrans (ft_blocks_once (forwardBlockList blocks))
ft_blocks_once :: [(BlockId, Block node)] -> Trans (TxFactBase f) (TxFactBase f)
ft_blocks_once blks = foldr ((>>>) . ft_block_once) idTrans blks
ft_block_once :: (BlockId, Block node)
-> Trans (TxFactBase f) (TxFactBase f)
ft_block_once (blk_id, blk) = updateFacts lattice blk_id (block_trans blk)
----------------------------------------------------------------
-- The pièce de résistance: cunning transfer functions
----------------------------------------------------------------
pureAnalysis :: DataflowLattice f -> ForwardTransfers node f -> GFT_Graph node f
pureAnalysis lattice = gftGraph lattice . gftBlock . gftNode
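
-- Hypothetical usage (an addition), tying together the demoLattice and
-- demoTransfers sketches above: a pure forward analysis over graphs of
-- DemoNode nodes.
demoAnalysis :: GFT_Graph DemoNode S.IntSet
demoAnalysis = pureAnalysis demoLattice demoTransfers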
analyseAndRewrite
:: forall node f .
RewritingDepth
-> DataflowLattice f
-> ForwardTransfers node f
-> ForwardRewrites node f
-> GFT_Graph node f
data RewritingDepth = RewriteShallow | RewriteDeep
-- When a transformation proposes to rewrite a node,
-- you can either ask the system to
-- * "shallow": accept the new graph, analyse it without further rewriting
-- * "deep": recursively analyse-and-rewrite the new graph
analyseAndRewrite depth lattice transfers rewrites
= gft_graph_cunning
where
gft_graph_base, gft_graph_cunning, gft_graph_recurse :: GFT_Graph node f
gft_graph_base = gftGraph lattice (gftBlock gft_node_base)
gft_graph_cunning = gftGraph lattice (gftBlock gft_node_cunning)
gft_graph_recurse = case depth of
RewriteShallow -> gft_graph_base
RewriteDeep -> gft_graph_cunning
gft_node_base, gft_node_cunning :: GFT_Node node f
gft_node_base = gftNode transfers
gft_node_cunning = GFT { gft_trans = cunning_trans }
cunning_trans :: node e x -> Trans (InT e f) (OutT x f)
cunning_trans node = tryRewrite (fr_rw rewrites node)
(gft_trans gft_graph_recurse)
(gft_trans gft_node_base node)
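
-- Hypothetical usage (an addition): a rewrite that never fires, combined
-- with analyseAndRewrite at both rewriting depths.  With this trivial
-- ForwardRewrites the two depths behave identically; they differ only in
-- how a proposed replacement graph would be re-analysed.
demoRewrites :: ForwardRewrites DemoNode S.IntSet
demoRewrites = ForwardRewrites { fr_rw = \_ _ -> Nothing }

demoShallow, demoDeep :: GFT_Graph DemoNode S.IntSet
demoShallow = analyseAndRewrite RewriteShallow demoLattice demoTransfers demoRewrites
demoDeep    = analyseAndRewrite RewriteDeep    demoLattice demoTransfers demoRewrites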
-----------------------------------------------------------------------------
-- Rewriting
-----------------------------------------------------------------------------
{-
data GRT co oo oc cc fact
= GRT { grt_lat :: DataflowLattice fact
, grt_co :: co -> Trans (FactBase fact) (fact, Graph C O m l)
, grt_oo :: oo -> Trans fact (fact, Graph O O m l)
, grt_oc :: oc -> Trans fact (OutFacts fact)
       , grt_cc :: cc -> Trans (FactBase fact) (OutFacts fact) }
-}
-----------------------------------------------------------------------------
-- BlockId, BlockEnv, BlockSet
-----------------------------------------------------------------------------
type BlockId = Int
mkBlockId :: Int -> BlockId
mkBlockId uniq = uniq
type BlockEnv a = M.IntMap a
mkBlockEnv :: [(BlockId, a)] -> BlockEnv a
mkBlockEnv prs = M.fromList prs
lookupBEnv :: BlockEnv f -> BlockId -> Maybe f