INLINE Pragma prevents join point from being floated out
Summary
The Data.Text.Internal.Search.indices function can become significantly slower in some cases when compiled with -O2. Whether this reproduces depends on the calling context, the INLINE pragma, and strictness, which makes the underlying issue fairly hard to pin down. I have tried to create a minimal test case, but cannot claim to understand why this happens.
This comment mentions some (probably) relevant issues.
Steps to reproduce
Compile the following code with -O1 or -O2:
module Foo (wrapper) where

-- INLINE or variants like INLINE[1], but not INLINABLE
{-# INLINE loop #-}
-- inlined function contains a nested joinrec loop
loop :: Int -> Int
loop x0 = go x0 0
  where
    go x acc
      | x >= 1000000 = acc
      | otherwise = go (x+1) (acc+step)
      where
        -- joinrec loop has a nested joinrec loop that could be floated out
        step = buildTable 0 x0
        -- the buildTable type signature must be missing
        -- buildTable :: Int -> Int -> Int
        buildTable a i
          | a == i = i
          | otherwise = buildTable (a+1) i

-- Nontrivial call site
wrapper :: Int -> Int
wrapper i
  | i <= 10 = i
  | otherwise = loop i
Expected behavior
buildTable should be floated out of the go loop. This does happen if buildTable is given an explicit type signature, if wrapper is simplified, or if the INLINE pragma is removed. In those cases the Core looks like this:
Rec {
-- RHS size: {terms: 14, types: 3, coercions: 0, joins: 0/0}
$wbuildTable
  = \ ww ww1 ->
      case ==# ww ww1 of {
        __DEFAULT -> $wbuildTable (+# ww 1#) ww1;
        1# -> ww1
      }
end Rec }

-- RHS size: {terms: 44, types: 11, coercions: 0, joins: 1/1}
wrapper
  = \ x0 ->
      case x0 of ww { I# ww ->
      case >=# ww 1000000# of {
        __DEFAULT ->
          case $wbuildTable 0# ww of ww { __DEFAULT ->
          joinrec {
            $wgo ww ww
              = case >=# ww 1000000# of {
                  __DEFAULT -> jump $wgo (+# ww 1#) (*# (+# ww ww) 3#);
                  1# -> I# ww
                }; } in
          jump $wgo (+# ww 1#) (*# ww 3#)
          };
        1# -> I# 0#
      }
      }
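For comparison, here is a sketch of one of the variants mentioned above: the same reproducer with the buildTable type signature uncommented. Nothing else is changed. The dump flags in the leading comment are an assumption; the report does not say which flags produced the Core listings.

```haskell
-- Assumed way to inspect the Core (flags not taken from the report):
--   ghc -O2 -ddump-simpl -dsuppress-all -dsuppress-uniques Foo.hs
module Foo (wrapper) where

{-# INLINE loop #-}
loop :: Int -> Int
loop x0 = go x0 0
  where
    go x acc
      | x >= 1000000 = acc
      | otherwise = go (x+1) (acc+step)
      where
        step = buildTable 0 x0
        -- Only difference from the reproducer: the signature is present,
        -- so buildTable is monomorphic at Int.
        buildTable :: Int -> Int -> Int
        buildTable a i
          | a == i = i
          | otherwise = buildTable (a+1) i

wrapper :: Int -> Int
wrapper i
  | i <= 10 = i
  | otherwise = loop i
```

According to the observation above, this variant is one of those where buildTable gets floated out; the sketch only pins down which edit is meant and makes no claim about why it helps.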
Actual behavior
The buildTable loop remains nested inside the go loop, so it is re-run on every iteration of go:
$wwrapper
  = \ ww ->
      case <=# ww 10# of {
        __DEFAULT ->
          joinrec {
            $wgo ww ww
              = case >=# ww 1000000# of {
                  __DEFAULT ->
                    joinrec {
                      $wbuildTable ww ww
                        = case ==# ww ww of {
                            __DEFAULT -> jump $wbuildTable (+# ww 1#) ww;
                            1# -> jump $wgo (+# ww 1#) (*# (+# ww ww) 3#)
                          }; } in
                    jump $wbuildTable 0# ww;
                  1# -> ww
                }; } in
          jump $wgo ww 0#;
        1# -> ww
      }
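To make the missing transformation concrete at the source level, the following sketch hoists step out of go by hand, so that buildTable is called once per call to loop rather than once per iteration. This is only an illustration of what "floated out" corresponds to. It is not how GHC performs the transformation, and it has not been checked as a workaround for the Core shown above.

```haskell
module Foo (wrapper) where

{-# INLINE loop #-}
loop :: Int -> Int
loop x0 = go x0 0
  where
    -- Hoisted by hand: step depends only on x0, not on go's arguments,
    -- so it can be bound once outside the go loop.
    step = buildTable 0 x0

    go x acc
      | x >= 1000000 = acc
      | otherwise = go (x+1) (acc+step)

    -- Signature left out, as in the reproducer.
    buildTable a i
      | a == i = i
      | otherwise = buildTable (a+1) i

wrapper :: Int -> Int
wrapper i
  | i <= 10 = i
  | otherwise = loop i
```

The result of wrapper is the same as in the reproducer; only the place where step is bound differs.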
Environment
- GHC version used: 8.10.4
Optional:
- Operating System: Windows
- System Architecture: x86-64