Importing a module prevents fold/build rule from firing
Summary
I expected fold/build fusion to eliminate most allocations, but that does not happen.
Steps to reproduce
-- T23913.hs
{-# OPTIONS_GHC -O #-}
import Work -- intentionally unused
import Data.List (foldl')
data Set a = None | One !a | Many deriving Show
insert :: Eq a => a -> Set a -> Set a
insert x (One y) | x == y = One x
insert x None = One x
insert x _ = Many
empty :: Set a
empty = None
fromList :: Eq a => [a] -> Set a
fromList xs = foldl' (\set x -> insert x set) empty xs
main = print (fromList (replicate 10_000_000 'a'))
-- Work.hs
module Work where
Compile with ghc -o run-it T23913
When running it still allocates over a billion bytes:
$ ./run-it +RTS -s
One 'a'
1,760,051,704 bytes allocated in the heap
630,208 bytes copied during GC
44,328 bytes maximum residency (2 sample(s))
29,400 bytes maximum slop
5 MiB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 419 colls, 0 par 0.002s 0.004s 0.0000s 0.0006s
Gen 1 2 colls, 0 par 0.000s 0.000s 0.0001s 0.0002s
INIT time 0.000s ( 0.004s elapsed)
MUT time 0.371s ( 0.373s elapsed)
GC time 0.002s ( 0.004s elapsed)
EXIT time 0.000s ( 0.009s elapsed)
Total time 0.373s ( 0.389s elapsed)
%GC time 0.0% (0.0% elapsed)
Alloc rate 4,749,733,385 bytes per MUT second
Productivity 99.3% of total user, 95.7% of total elapsed
In the verbose core2core
I do spot this intermediate snapshot of the Core:
-- RHS size: {terms: 49, types: 40, coercions: 0, joins: 0/0}
main :: IO ()
[LclIdX,
Unf=Unf{Src=<vanilla>, TopLvl=True, Value=False, ConLike=False,
WorkFree=False, Expandable=False, Guidance=IF_ARGS [] 440 0}]
main
= work
@(Set Char)
(Main.$fShowSet @Char GHC.Show.$fShowChar)
(GHC.Base.foldr
@Char
@(Set Char -> Set Char)
(\ (ds_a1rU :: Char)
(ds1_a1rV [OS=OneShot] :: Set Char -> Set Char)
(v_a1rW [OS=OneShot] :: Set Char) ->
case v_a1rW of z_a1rX { __DEFAULT ->
ds1_a1rV
(case z_a1rX of {
None -> Main.$WOne @Char ds_a1rU;
One y_a13F ->
case GHC.Classes.eqChar ds_a1rU y_a13F of {
False -> Main.Many @Char;
True -> Main.$WOne @Char ds_a1rU
};
Many -> Main.Many @Char
})
})
(id @(Set Char))
(GHC.Base.build
@Char
(\ (@b_a1tx)
(c_a1ty [OS=OneShot] :: Char -> b_a1tx -> b_a1tx)
(nil_a1tz [OS=OneShot] :: b_a1tx) ->
case GHC.Classes.ltInt (GHC.Types.I# 0#) n_s2aa of {
False -> nil_a1tz;
True ->
GHC.List.repeatFB
@Char
@(Int -> b_a1tx)
(GHC.List.takeFB @Char @b_a1tx c_a1ty nil_a1tz)
(GHC.Types.C# 'a'#)
n_s2aa
}))
(Main.None @Char))
But -drule-check fold/build
does not report any potential rule application sites.
Expected behavior
The example should run using a constant amount of memory w.r.t. the first argument of replicate.
I notice I do get the expected behaviour if I remove the Work
import:
{-# OPTIONS_GHC -O #-}
import Data.List (foldl')
data Set a = None | One !a | Many deriving Show
insert :: Eq a => a -> Set a -> Set a
insert x (One y) | x == y = One x
insert x None = One x
insert x _ = Many
empty :: Set a
empty = None
fromList :: Eq a => [a] -> Set a
fromList xs = foldl' (\set x -> insert x set) empty xs
main = print (fromList (replicate 10_000_000 'a'))
And compiling with ghc -O -o run-it T23913
also fixes the issue.
Environment
- GHC version used: 9.4.5 (seems to affect every version at least since 9.2.8)