Skip to content

Overwhelming the TimerManager

I was talking on IRC with davean about an issue that potentially could have been related to STM (I don't think it is at all after investigating) and I reduced the issue to the following program:

-- Main.hs
module Main where

import Control.Monad
import Control.Concurrent
import Control.Concurrent.STM
import System.Environment

main :: IO ()
main = do
    as <- getArgs
    case as of
        ["-f"]  -> replicateM_ 100000 . void . forkIO . void $ registerDelay 10
        _       -> replicateM_ 100000 .                 void $ registerDelay 10

    threadDelay 1000

This ends up registering a lot of events with the TimerManager. With the "-f" flag it does so from many threads and when run that way it appears to eventually overwhelm the TimerManager and causing over 350 MB total memory in use.

$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.9.20141115
$ ghc -O2 -threaded -debug Main.hs -o Main-head
...
$ ./Main-head -f +RTS -s
   3,566,966,936 bytes allocated in the heap
   4,200,021,784 bytes copied during GC
     118,273,720 bytes maximum residency (96 sample(s))
      12,649,480 bytes maximum slop
             354 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      6350 colls,     0 par    2.430s   2.434s     0.0004s    0.0073s
  Gen  1        96 colls,     0 par    7.438s   7.441s     0.0775s    0.2526s

  TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.002s  (  0.002s elapsed)
  MUT     time    0.461s  (  0.460s elapsed)
  GC      time    9.869s  (  9.875s elapsed)
  EXIT    time    0.003s  (  0.003s elapsed)
  Total   time   10.336s  ( 10.340s elapsed)

  Alloc rate    7,741,472,461 bytes per MUT second

  Productivity   4.5% of total user, 4.5% of total elapsed

gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 0

Running without forking many threads and the total memory in use stays low (3 MB).

     154,305,648 bytes allocated in the heap
      16,922,272 bytes copied during GC
       1,378,608 bytes maximum residency (3 sample(s))
          28,520 bytes maximum slop
               3 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0       298 colls,     0 par    0.056s   0.056s     0.0002s    0.0015s
  Gen  1         3 colls,     0 par    0.005s   0.005s     0.0017s    0.0047s

  TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.001s  (  0.001s elapsed)
  MUT     time    0.148s  (  0.148s elapsed)
  GC      time    0.061s  (  0.061s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    0.211s  (  0.210s elapsed)

  Alloc rate    1,042,557,595 bytes per MUT second

  Productivity  70.8% of total user, 71.2% of total elapsed

gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 0

Using 7.6.3, things don't get out of hand, also with 3 MB total memory use.

$ ./Main-7.6.3 -f +RTS -s
     213,519,392 bytes allocated in the heap
     116,111,712 bytes copied during GC
         505,080 bytes maximum residency (11 sample(s))
         113,032 bytes maximum slop
               3 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0       403 colls,     0 par    0.33s    0.33s     0.0008s    0.0028s
  Gen  1        11 colls,     0 par    0.01s    0.01s     0.0011s    0.0019s

  TASKS: 3 (1 bound, 2 peak workers (2 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    1.99s  (  1.50s elapsed)
  GC      time    0.34s  (  0.34s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    2.33s  (  1.84s elapsed)

  Alloc rate    107,426,859 bytes per MUT second

  Productivity  85.5% of total user, 108.1% of total elapsed

gc_alloc_block_sync: 0
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 0

I looked for causes and eliminated any STM interactions causing problems (STM shows up in the Unique values and in creating a TVar for the registered delay) but did discover that the emTimeouts queue gets very large at some point when executing with "-f". If I print the size of expired on this line:

https://github.com/ghc/ghc/blob/021b7978d14799bae779907faf7490cfd21b3f46/libraries/base/GHC/Event/TimerManager.hs#L226

I end up seeing somewhere from 4 to 20 events for a while then eventually it jumps up to 80000 or so. Perhaps davean can chime in about the particular use case that he has that I reduced to this simple program for a more real world use.

Trac metadata
Trac field Value
Version 7.9
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Runtime System
Test case
Differential revisions
BlockedBy
Related
Blocking
CC simonmar
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information