Performance issues with replicateM_
When working on optimizing a program by minimizing allocations, I can into an issue with replicateM_
. Consider the following code
import Control.Monad (replicateM_)
import Foreign.C.String (withCString)
import Foreign.Storable (peek)
main :: IO ()
main = withCString "foo" $ replicateM_ 10000000 . peek
When I run this program, I get:
160,042,656 bytes allocated in the heap
The result is the same whether I compile with -O0
, -O
, or -O2
. And as expected, the total allocation increases or decreases based on the numbers of times I replicate the action. On the other hand, replacing replicateM_
with a hand-written version makes the total allocations for the program only 42KB, and does not increase with the numbers of replications.
replicateM_ :: Monad m => Int -> m a -> m ()
replicateM_ cnt0 f =
loop cnt0
where
loop cnt
| cnt <= 0 = return ()
| otherwise = f >> loop (cnt - 1)
By contrast, Control.Monad.replicateM_
looks like:
replicateM_ :: (Monad m) => Int -> m a -> m ()
{-# INLINEABLE replicateM_ #-}
{-# SPECIALISE replicateM_ :: Int -> IO a -> IO () #-}
{-# SPECIALISE replicateM_ :: Int -> Maybe a -> Maybe () #-}
replicateM_ n x = sequence_ (replicate n x)
I can't see an advantage to this implementation over the more direct implementation I've provided above. Unless there are objections, I'll send a patch to switch the implementation. (Since master already uses Applicative
, I'll make the relevant updates to generalize the function signature too.)
Trac metadata
Trac field | Value |
---|---|
Version | 7.10.3 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | libraries/base |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture |