Commit b29633f5 authored by Ben Gamari's avatar Ben Gamari Committed by Ben Gamari

Bitmap: Fix thunk explosion

Previously we would build up another `map (-N)` thunk
for every word in the bitmap. Now we strictly accumulate the position
and carry out a single ``map (`subtract` accum)``.

`Bitmap.intsToBitmap` showed up in the profile while compiling a 
testcase of #7450 (namely a program containing a record type with large 
number of fields which derived `Read`). The culprit was 
`CmmBuildInfoTables.procpointSRT.bitmap`. On the testcase (with 4096 
fields), the profile previously looked like,

```
	total time  =      307.94 secs   (307943 ticks @ 1000 us, 1 
processor)
	total alloc = 336,797,868,056 bytes  (excludes profiling 
overheads)

COST CENTRE              MODULE              %time %alloc

lintAnnots               CoreLint             17.2   25.8
procpointSRT.bitmap      CmmBuildInfoTables   11.3   25.2
FloatOutwards            SimplCore             7.5    1.6
flatten.lookup           CmmBuildInfoTables    4.0    3.9
...
```

After this fix it looks like,
```
	total time  =      256.88 secs   (256876 ticks @ 1000 us, 1 
processor)
	total alloc = 255,033,667,448 bytes  (excludes profiling 
overheads)

COST CENTRE              MODULE              %time %alloc

lintAnnots               CoreLint             20.3   34.1
FloatOutwards            SimplCore             9.1    2.1
flatten.lookup           CmmBuildInfoTables    4.8    5.2
pprNativeCode            AsmCodeGen            3.7    4.3
simplLetUnfolding        Simplify              3.6    2.2
StgCmm                   HscMain               3.6    2.1
```
Signed-off-by: Ben Gamari's avatarBen Gamari <ben@smart-cactus.org>

Test Plan: Validate

Reviewers: austin, simonpj

Reviewed By: simonpj

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D1041

GHC Trac Issues: #7450
parent 4f9d6008
{-# LANGUAGE CPP #-}
{-# LANGUAGE CPP, BangPatterns #-}
--
-- (c) The University of Glasgow 2003-2006
......@@ -45,31 +45,75 @@ chunkToBitmap dflags chunk =
-- eg. @[0,1,3], size 4 ==> 0xb@.
--
-- The list of @Int@s /must/ be already sorted.
intsToBitmap :: DynFlags -> Int -> [Int] -> Bitmap
intsToBitmap dflags size slots{- must be sorted -}
| size <= 0 = []
| otherwise =
(foldr (.|.) (toStgWord dflags 0) (map (toStgWord dflags 1 `shiftL`) these)) :
intsToBitmap dflags (size - wORD_SIZE_IN_BITS dflags)
(map (\x -> x - wORD_SIZE_IN_BITS dflags) rest)
where (these,rest) = span (< wORD_SIZE_IN_BITS dflags) slots
intsToBitmap :: DynFlags
-> Int -- ^ size in bits
-> [Int] -- ^ sorted indices of ones
-> Bitmap
intsToBitmap dflags size = go 0
where
word_sz = wORD_SIZE_IN_BITS dflags
oneAt :: Int -> StgWord
oneAt i = toStgWord dflags 1 `shiftL` i
-- It is important that we maintain strictness here.
-- See Note [Strictness when building Bitmaps].
go :: Int -> [Int] -> Bitmap
go !pos slots
| size <= pos = []
| otherwise =
(foldr (.|.) (toStgWord dflags 0) (map (\i->oneAt (i - pos)) these)) :
go (pos + word_sz) rest
where
(these,rest) = span (< (pos + word_sz)) slots
-- | Make a bitmap where the slots specified are the /zeros/ in the bitmap.
-- eg. @[0,1,3], size 4 ==> 0x4@ (we leave any bits outside the size as zero,
-- just to make the bitmap easier to read).
--
-- The list of @Int@s /must/ be already sorted and duplicate-free.
intsToReverseBitmap :: DynFlags -> Int -> [Int] -> Bitmap
intsToReverseBitmap dflags size slots{- must be sorted -}
| size <= 0 = []
| otherwise =
(foldr xor (toStgWord dflags init) (map (toStgWord dflags 1 `shiftL`) these)) :
intsToReverseBitmap dflags (size - wORD_SIZE_IN_BITS dflags)
(map (\x -> x - wORD_SIZE_IN_BITS dflags) rest)
where (these,rest) = span (< wORD_SIZE_IN_BITS dflags) slots
init
| size >= wORD_SIZE_IN_BITS dflags = -1
| otherwise = (1 `shiftL` size) - 1
intsToReverseBitmap :: DynFlags
-> Int -- ^ size in bits
-> [Int] -- ^ sorted indices of zeros free of duplicates
-> Bitmap
intsToReverseBitmap dflags size = go 0
where
word_sz = wORD_SIZE_IN_BITS dflags
oneAt :: Int -> StgWord
oneAt i = toStgWord dflags 1 `shiftL` i
-- It is important that we maintain strictness here.
-- See Note [Strictness when building Bitmaps].
go :: Int -> [Int] -> Bitmap
go !pos slots
| size <= pos = []
| otherwise =
(foldr xor (toStgWord dflags init) (map (\i->oneAt (i - pos)) these)) :
go (pos + word_sz) rest
where
(these,rest) = span (< (pos + word_sz)) slots
remain = size - pos
init
| remain >= word_sz = -1
| otherwise = (1 `shiftL` remain) - 1
{-
Note [Strictness when building Bitmaps]
========================================
One of the places where @Bitmap@ is used is in in building Static Reference
Tables (SRTs) (in @CmmBuildInfoTables.procpointSRT@). In #7450 it was noticed
that some test cases (particularly those whose C-- have large numbers of CAFs)
produced large quantities of allocations from this function.
The source traced back to 'intsToBitmap', which was lazily subtracting the word
size from the elements of the tail of the @slots@ list and recursively invoking
itself with the result. This resulted in large numbers of subtraction thunks
being built up. Here we take care to avoid passing new thunks to the recursive
call. Instead we pass the unmodified tail along with an explicit position
accumulator, which get subtracted in the fold when we compute the Word.
-}
{- |
Magic number, must agree with @BITMAP_BITS_SHIFT@ in InfoTables.h.
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment