
idle time full GCs (idle cpu usage)

In an interactive program I noticed idle CPU usage of ~12%, caused by the full GCs that the RTS performs while idle under its default settings. For this specific use case that is a lot of wasted CPU cycles for no gain. Passing -I0 to the RTS "fixes" this; nonetheless, I think the default behaviour could and should be improved, even taking into account other use cases where idle full GCs are a good choice.

The program in question keeps ~50MB allocated on the heap but makes almost no short-lived allocations between the idle GCs. Some relevant data from the RTS summary:

     930,569,008 bytes allocated in the heap
   4,847,446,928 bytes copied during GC
      46,255,056 bytes maximum residency (111 sample(s))
         378,200 bytes maximum slop
              94 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      1784 colls,     0 par    0.123s   0.100s     0.0001s    0.0006s
  Gen  1       111 colls,     0 par    9.377s   9.418s     0.0848s    0.1099s

  INIT    time    0.000s  (  0.003s elapsed)
  MUT     time    2.250s  ( 95.029s elapsed)
  GC      time    9.500s  (  9.518s elapsed)
  RP      time    0.000s  (  0.000s elapsed)
  PROF    time    0.000s  (  0.000s elapsed)
  EXIT    time    0.007s  (  0.008s elapsed)
  Total   time   11.760s  (104.557s elapsed)

The detailed GC statistics of the idle GCs look like this:

    Alloc    Copied     Live     GC     GC      TOT      TOT  Page Flts
    bytes     bytes     bytes   user   elap     user     elap
    13304  45958728  46210536  0.070  0.068    0.920    1.550    0    0  (Gen:  1)
    11568  45958720  46210536  0.100  0.102    1.043    2.584    0    0  (Gen:  1)
    15432  45958720  46210536  0.103  0.105    1.167    3.588    0    0  (Gen:  1)
    11568  45958720  46210536  0.100  0.102    1.287    4.586    0    0  (Gen:  1)
    11568  45958720  46210536  0.107  0.107    1.413    5.592    0    0  (Gen:  1)
    11568  45958720  46210536  0.073  0.080    1.503    6.566    0    0  (Gen:  1)
    11568  45958720  46210536  0.073  0.073    1.593    7.560    0    0  (Gen:  1)
    11568  45958720  46210536  0.067  0.068    1.677    8.557    0    0  (Gen:  1)

I don't know exactly what "Alloc bytes" means, but my guess is that these numbers are indeed fairly low, so these GCs are mostly useless.
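To put the figures above in proportion, here is a small sketch that recomputes a few ratios from them. The constants are copied from the summary and from the repeated per-GC lines; the names (gcCpu, percent, and so on) are made up for illustration. It assumes that "Alloc bytes" counts the bytes allocated since the previous collection, which is my reading of the -S output rather than something confirmed here.

```haskell
-- Figures copied from the RTS summary above (seconds).
gcCpu, totalCpu, totalElapsed :: Double
gcCpu        = 9.500
totalCpu     = 11.760
totalElapsed = 104.557

-- Figures copied from the repeated per-GC lines above (bytes).
-- Assumption: "Alloc bytes" is the allocation since the previous GC.
copiedPerIdleGC, allocBetweenIdleGCs :: Double
copiedPerIdleGC     = 45958720
allocBetweenIdleGCs = 11568

-- Share of 'whole' taken up by 'part', as a percentage.
percent :: Double -> Double -> Double
percent part whole = 100 * part / whole

main :: IO ()
main = do
  putStrLn ("GC share of CPU time (%): " ++ show (percent gcCpu totalCpu))
  putStrLn ("CPU use over the whole run (%): " ++ show (percent totalCpu totalElapsed))
  putStrLn ("bytes copied per byte allocated: "
            ++ show (copiedPerIdleGC / allocBetweenIdleGCs))
```

Under that assumption, each idle GC copies several thousand bytes for every byte allocated since the previous one, which is why these collections look like wasted work.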

To clarify the intention of this request: tweak the default idle GC configuration in some way so that interactive programs with moderate heap usage consume a reasonable amount of CPU time while idle.

"Reasonable" is intentionally vague, as I don't know how to weigh other use cases.

Minimal example:

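import Control.Concurrent (threadDelay)
import Control.Monad (forM_)

main :: IO ()
main = do
  -- Keep a large list alive across the idle period below.
  let large = [1..1000000]
  print $ length large
  -- Sleep 20 times for 0.4 s each; the program is idle in between.
  [1..20] `forM_` \_ -> do
    threadDelay 400000
  print $ sum large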
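
A variant of the example that counts the major GCs during the idle loop, as a rough sketch rather than a proper test case. It uses getRTSStats from GHC.Stats, which as far as I know needs GHC 8.2 or later (this ticket was filed against 8.0.1) and only works when the program is run with +RTS -T. The helper majorGCsBetween is a made-up name. I am also assuming the program is linked with -threaded, since to my knowledge the idle GC is only enabled by default in the threaded RTS.

```haskell
import Control.Concurrent (threadDelay)
import Control.Monad (forM_)
import GHC.Stats (RTSStats (major_gcs), getRTSStats, getRTSStatsEnabled)

-- Number of major collections between two readings of the counter.
-- Converting to Integer first avoids unsigned wrap-around.
majorGCsBetween :: Integral a => a -> a -> Integer
majorGCsBetween before after = toInteger after - toInteger before

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS stats are disabled; rerun with +RTS -T"
    else do
      -- Keep a large list alive across the idle period.
      let large = [1 .. 1000000] :: [Int]
      print (length large)
      before <- major_gcs <$> getRTSStats
      -- Same idle loop as in the minimal example: 20 sleeps of 0.4 s.
      forM_ [1 .. 20 :: Int] $ \_ -> threadDelay 400000
      after <- major_gcs <$> getRTSStats
      putStrLn ("major GCs while idle: " ++ show (majorGCsBetween before after))
      print (sum large)
```

Comparing the printed count between a default run and a run with +RTS -T -I0 should show whether the idle timer is what triggers the collections.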
Trac metadata
Trac field Value
Version 8.0.1
Type FeatureRequest
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Runtime System
Test case
Differential revisions
BlockedBy
Related
Blocking
CC simonmar
Operating system
Architecture