Commit 36fb02b0, authored May 04, 2018 by Simon Peyton Jones
Update Simon-nofib-notes
parent 1364fe62
Showing 1 changed file with 31 additions and 19 deletions
Simon-nofib-notes
...
...
@@ -13,6 +13,15 @@ whereas it didn't before. So allocations go up a bit.
Imaginary suite
---------------------------------------
queens
~~~~~~
The comprehension
gen n = [ (q:b) | b <- gen (n-1), q <- [1..nq], safe q 1 b]
has, for each iteration of 'b', a new list [1..nq]. This can be floated
and hence shared, or fused. It's quite delicate which of the two
happens.
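The two outcomes described above can be sketched by hand. This is only an illustration, not the benchmark source: `genInline` and `genShared` are hypothetical names, `nq` is passed as a parameter here (an assumption made to keep the block self-contained), and `safe` is written in the usual n-queens style.

```haskell
-- Illustrative sketch only; names other than gen's comprehension are assumptions.

-- A new queen in column x is safe if no earlier queen shares its
-- column or either diagonal; d is the row distance to that queen.
safe :: Int -> Int -> [Int] -> Bool
safe _ _ []      = True
safe x d (q : l) = x /= q && x /= q + d && x /= q - d && safe x (d + 1) l

-- As in the notes: [1 .. nq] sits inside the loop over b, so the
-- optimiser may float it out (sharing it) or fuse it with the comprehension.
genInline :: Int -> Int -> [[Int]]
genInline _  0 = [[]]
genInline nq n =
  [ q : b | b <- genInline nq (n - 1), q <- [1 .. nq], safe q 1 b ]

-- Hand-floated variant: the candidate list qs is built once and shared
-- by every iteration, which is what float-out would produce.
genShared :: Int -> Int -> [[Int]]
genShared nq = go
  where
    qs = [1 .. nq]
    go :: Int -> [[Int]]
    go 0 = [[]]
    go n = [ q : b | b <- go (n - 1), q <- qs, safe q 1 b ]

main :: IO ()
main = do
  print (length (genInline 6 6))
  print (length (genShared 6 6))
```

Both functions compute the same boards; which of the two shapes GHC actually produces for the original comprehension depends on the order of inlining, rule firing, and floating.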
integrate
~~~~~~~~~
integrate1D is strict in its second argument 'u', but it also passes 'u' to
...
...
@@ -21,7 +30,7 @@ slightly.
gen_regexps
~~~~~~~~~~~
I found that there were some very bad loss-of-arity cases in PrelShow.
In particular, we had:
showl "" = showChar '"' s
...
...
@@ -46,7 +55,7 @@ I found that there were some very bad loss-of-arity cases in PrelShow.
So I've changed PrelShow.showLitChar to use explicit \s. Even then, showl
doesn't work, because GHC can't see that showl xs can be pushed inside the \s.
So I've put an explicit \s there too.
showl "" s = showChar '"' s
showl ('"':xs) s = showString "\\\"" (showl xs s)
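For comparison, here is a hedged sketch of the two styles side by side. `showlPF` and `showlEta` are hypothetical names, and the final catch-all equation in each is an assumption added so the block is complete (the library version handles ordinary characters differently).

```haskell
-- Illustrative sketch only; not the PrelShow source.

-- Point-free style: each equation takes one argument and returns a
-- ShowS closure, so the function has arity 1 as written.
showlPF :: String -> ShowS
showlPF ""         = showChar '"'
showlPF ('"' : xs) = showString "\\\"" . showlPF xs
showlPF (x : xs)   = showChar x . showlPF xs   -- assumed catch-all case

-- Explicit \s style: the continuation string is an ordinary second
-- argument, so the arity of 2 is visible without any analysis.
showlEta :: String -> ShowS
showlEta ""         s = showChar '"' s
showlEta ('"' : xs) s = showString "\\\"" (showlEta xs s)
showlEta (x : xs)   s = showChar x (showlEta xs s)   -- assumed catch-all case

main :: IO ()
main = do
  putStrLn (showlPF  "say \"hi\"" "")
  putStrLn (showlEta "say \"hi\"" "")
```

The two definitions are extensionally equal; the difference is only in how much work the compiler must do to discover that the recursive call can be pushed under the lambda.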
...
...
@@ -59,7 +68,7 @@ queens
If we do
a) some inlining before float-out
b) fold/build fusion before float-out
then queens gets 40% more allocation. Presumably the fusion
prevents sharing.
...
...
@@ -81,7 +90,7 @@ It's important to inline p_ident.
There's a very delicate CSE in p_expr
p_expr = seQ q_op [p_term1, p_op, p_term2] ## p_term3
(where p_term1, p_term2 and p_term3 are all really just p_term).
This expands into
p_expr s = case p_term1 s of
...
...
@@ -111,7 +120,7 @@ like this:
xs7_s1i8 :: GHC.Prim.Int# -> [GHC.Base.Char]
[Str: DmdType]
xs7_s1i8 = go_r1og ys_aGO
} in
\ (m_XWf :: GHC.Prim.Int#) ->
case GHC.Prim.<=# m_XWf 1 of wild1_aSI {
GHC.Base.False ->
...
...
@@ -144,7 +153,7 @@ up allocation.
expert
~~~~~~
In spectral/expert/Search.ask there's a statically visible CSE. Catching this
depends almost entirely on chance, which is a pity.
reptile
...
...
@@ -229,9 +238,9 @@ it was inlined regardless by the instance-decl stuff. So perf drops slightly.
integer
~~~~~~~
A good benchmark for beating on big-integer arithmetic.
-In this function:
+There is a delicate interaction of fusion and full laziness in the comprehension
integerbench :: (Integer -> Integer -> a)
-> Integer -> Integer -> Integer
-> Integer -> Integer -> Integer
...
...
@@ -242,12 +251,15 @@ In this function:
, b <- [ bstart,astart+bstep..blim ]])
return ()
-if you do a bit of inlining and rule firing before floating, we'll fuse
-the comprehension with the [bstart, astart+bstep..blim], whereas if you
-float first you'll share the [bstart...] list. The latter does 11% less
-allocation, but more case analysis etc.
+and the analogous one for Int.
+Since the inner loop (for b) doesn't depend on a, we could float the
+b-list out; but it may fuse first. In GHC 8 (and most previous
+versions) this fusion did happen at type Integer, but (accidentally) not for
+Int because an intervening eval got in the way. So the b-enumeration was floated
+out, which led to less allocation of Int values.
-knights
+Knights
~~~~~~~
* In knights/KnightHeuristic, we don't find that possibleMoves is strict
(with important knock-on effects) unless we apply rules before floating
...
...
@@ -261,7 +273,7 @@ knights
lambda
~~~~~~
This program shows the cost of the non-eta-expanded lambdas that arise from
a state monad.
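As a rough illustration of where such lambdas come from (this is not code from the lambda benchmark), the sketch below defines a minimal state monad and a counting loop. `State`, `tick`, `count`, and `countEta` are all hypothetical names introduced here.

```haskell
-- Illustrative sketch only; a hand-rolled state monad, not the benchmark's.

newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in  (h a, s2)

instance Monad (State s) where
  State g >>= k = State $ \s -> let (a, s') = g s in runState (k a) s'

-- Return the current counter and increment it.
tick :: State Int Int
tick = State $ \n -> (n, n + 1)

-- Monadic loop: `count k` returns a State value, i.e. a lambda still
-- waiting for the state, so as written it takes only one argument.
count :: Int -> State Int ()
count 0 = pure ()
count k = tick >> count (k - 1)

-- Hand eta-expanded equivalent: the state is an ordinary second
-- argument and no intermediate function closure is needed.
countEta :: Int -> Int -> ((), Int)
countEta 0 s = ((), s)
countEta k s = countEta (k - 1) (s + 1)

main :: IO ()
main = do
  print (snd (runState (count 1000) 0))
  print (snd (countEta 1000 0))
```

Whether the compiler turns the first form into the second depends on arity analysis and inlining of the monad operations, which is the cost this benchmark exposes.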
mandel2
~~~~~~~
...
...
@@ -281,7 +293,7 @@ in particular, it did not inline windowToViewport
multiplier
~~~~~~~~~~
In spectral/multiplier, we have
xor = lift21 forceBit f
where f :: Bit -> Bit -> Bit
f 0 0 = 0
...
...
@@ -310,11 +322,11 @@ in runtime after 4.08
puzzle
~~~~~~
The main function is 'transfer'. It has some complicated join points, and
a big issue is that full laziness can float out many small MFEs that then
make much bigger closures. It's quite delicate: small changes can make
big differences, and I spent far too long gazing at it.
I found that in my experimental proto 4.09 compiler I had
let ds = go xs in
let $j = .... ds ... in
...
...
@@ -332,7 +344,7 @@ Also, making concat into a good producer made a large gain.
My proto 4.09 still allocates more, partly because of more full laziness relative
to 4.08; I don't know why that happens.
Extra allocation is happening in 5.02 as well; perhaps for the same reasons. There is
at least one instance of floating that prevents fusion; namely the enumerated lists
in 'transfer'.
...
...
@@ -357,7 +369,7 @@ $wvecsub
case ww5 of wild1 { D# y ->
let { a3 = -## x y
} in $wD# a3
} }
} in (# a, a1, a2 #)
Currently it gets guidance: IF_ARGS 6 [2 2 2 2 2 2] 25 4
...
...