Commit 9454511b authored by Simon Peyton Jones's avatar Simon Peyton Jones Committed by Marge Bot
Browse files

Optimisation in Unique.Supply

This patch switches on -fno-state-hack in GHC.Types.Unique.Supply.

It turned out that my fixes for #18078 (coercion floating) changed the
optimisation pathway for mkSplitUniqSupply in such a way that we had
an extra allocation inside the inner loop.  Adding -fno-state-hack
fixed that -- and indeed the loop in mkSplitUniqSupply is a classic
example of the way in which -fno-state-hack can be bad; see #18238.

Moreover, the new code is better than the old.  They allocate
the same, but the old code ends up with a partial application.
The net effect is that the test
    perf/should_run/UniqLoop
runs 20% faster!   From 2.5s down to 2.0s.  The allocation numbers
are the same -- but elapsed time falls. Good!

The bad thing about this is that it's terribly delicate.  But
at least it's a good example of such delicacy in action.

There is a long Note [Optimising the unique supply] which now
explains all this.
parent 3b22b14a
......@@ -3,6 +3,10 @@
(c) The GRASP/AQUA Project, Glasgow University, 1992-1998
-}
{-# OPTIONS_GHC -fno-state-hack #-}
-- This -fno-state-hack is important
-- See Note [Optimising the unique supply]
{-# LANGUAGE CPP #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE PatternSynonyms #-}
......@@ -55,6 +59,96 @@ import Data.Char
************************************************************************
-}
{- Note [How the unique supply works]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The basic idea (due to Lennart Augustsson) is that a UniqSupply is
lazily-evaluated infinite tree.
* At each MkSplitUniqSupply node is a unique Int, and two
sub-trees (see data UniqSupply)
* takeUniqFromSupply :: UniqSupply -> (Unique, UniqSupply)
returns the unique Int and one of the sub-trees
* splitUniqSupply :: UniqSupply -> (UniqSupply, UniqSupply)
returns the two sub-trees
* When you poke on one of the thunks, it does a foreign call
to get a fresh Int from a thread-safe counter, and returns
a fresh MkSplitUniqSupply node. This has to be as efficient
as possible: it should allocate only
* The fresh node
* A thunk for each sub-tree
Note [Optimising the unique supply]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The inner loop of mkSplitUniqSupply is a function closure
mk_supply :: IO UniqSupply
mk_supply = unsafeInterleaveIO $
genSym >>= \ u ->
mk_supply >>= \ s1 ->
mk_supply >>= \ s2 ->
return (MkSplitUniqSupply (mask .|. u) s1 s2)
It's a classic example of an IO action that is captured
and the called repeatedly (see #18238 for some discussion).
It turns out that we can get something like
$wmkSplitUniqSupply c# s
= letrec
mk_supply
= \s -> unsafeDupableInterleaveIO1
(\s2 -> case noDuplicate# s2 of s3 ->
...
case mk_supply s4 of (# s5, t1 #) ->
...
(# s6, MkSplitUniqSupply ... #)
in mk_supply s
This is bad becuase we allocate that inner (\s2...) every time.
Why doesn't full laziness float out the (\s2...)? Because of
the state hack (#18238).
So for this module we switch the state hack off -- it's an example
of when it makes things worse rather than better. Now full laziness
can float that lambda out, and we get
$wmkSplitUniqSupply c# s
= letrec
lvl = \s2 -> case noDuplicate# s2 of s3 ->
...
case unsafeDupableInterleaveIO
lvl s4 of (# s5, t1 #) ->
...
(# s6, MkSplitUniqSupply ... #)
in unsafeDupableInterleaveIO1 lvl s
This is all terribly delicate. It just so happened that before I
fixed #18078, and even with the state-hack still enabled, we were
getting this:
$wmkSplitUniqSupply c# s
= letrec
mk_supply = \s2 -> case noDuplicate# s2 of s3 ->
...
case mks_help s3 of (# s5,t1 #) ->
...
(# s6, MkSplitUniqSupply ... #)
mks_help = unsafeDupableInterleaveIO mk_supply
-- mks_help marked as loop breaker
in mks_help s
The fact that we didn't need full laziness was somewhat fortuitious.
We got the right number of allocations. But the partial application of
the arity-2 unsafeDupableInterleaveIO in mks_help makes it quite a
bit slower. (Test perf/should_run/UniqLoop had a 20% perf change.)
Sigh. The test perf/should_run/UniqLoop keeps track of this loop.
Watch it carefully.
-}
-- | Unique Supply
--
-- A value of type 'UniqSupply' is unique, and it can
......@@ -71,6 +165,26 @@ mkSplitUniqSupply :: Char -> IO UniqSupply
-- be distinct from those of all calls to this function in the compiler
-- for the values generated to be truly unique.
-- See Note [How the unique supply works]
-- See Note [Optimising the unique supply]
mkSplitUniqSupply c
= mk_supply
where
!mask = ord c `shiftL` uNIQUE_BITS
-- Here comes THE MAGIC: see Note [How the unique supply works]
-- This is one of the most hammered bits in the whole compiler
-- See Note [Optimising the unique supply]
-- NB: Use unsafeInterleaveIO for thread-safety.
mk_supply = unsafeInterleaveIO $
genSym >>= \ u ->
mk_supply >>= \ s1 ->
mk_supply >>= \ s2 ->
return (MkSplitUniqSupply (mask .|. u) s1 s2)
foreign import ccall unsafe "genSym" genSym :: IO Int
foreign import ccall unsafe "initGenSym" initUniqSupply :: Int -> Int -> IO ()
splitUniqSupply :: UniqSupply -> (UniqSupply, UniqSupply)
-- ^ Build two 'UniqSupply' from a single one, each of which
-- can supply its own 'Unique'.
......@@ -83,31 +197,6 @@ uniqsFromSupply :: UniqSupply -> [Unique] -- Infinite
takeUniqFromSupply :: UniqSupply -> (Unique, UniqSupply)
-- ^ Obtain the 'Unique' from this particular 'UniqSupply', and a new supply
uniqFromMask :: Char -> IO Unique
uniqFromMask mask
= do { uqNum <- genSym
; return $! mkUnique mask uqNum }
mkSplitUniqSupply c
= case ord c `shiftL` uNIQUE_BITS of
!mask -> let
-- here comes THE MAGIC:
-- This is one of the most hammered bits in the whole compiler
mk_supply
-- NB: Use unsafeInterleaveIO for thread-safety.
= unsafeInterleaveIO (
genSym >>= \ u ->
mk_supply >>= \ s1 ->
mk_supply >>= \ s2 ->
return (MkSplitUniqSupply (mask .|. u) s1 s2)
)
in
mk_supply
foreign import ccall unsafe "genSym" genSym :: IO Int
foreign import ccall unsafe "initGenSym" initUniqSupply :: Int -> Int -> IO ()
splitUniqSupply (MkSplitUniqSupply _ s1 s2) = (s1, s2)
listSplitUniqSupply (MkSplitUniqSupply _ s1 s2) = s1 : listSplitUniqSupply s2
......@@ -115,6 +204,12 @@ uniqFromSupply (MkSplitUniqSupply n _ _) = mkUniqueGrimily n
uniqsFromSupply (MkSplitUniqSupply n _ s2) = mkUniqueGrimily n : uniqsFromSupply s2
takeUniqFromSupply (MkSplitUniqSupply n s1 _) = (mkUniqueGrimily n, s1)
uniqFromMask :: Char -> IO Unique
uniqFromMask mask
= do { uqNum <- genSym
; return $! mkUnique mask uqNum }
{-
************************************************************************
* *
......
Supports Markdown
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment