
Closed
Open
Opened Apr 02, 2016 by luispedro@trac-luispedro

Very large slowdown when using parallel garbage collector

While debugging some performance issues in an application I am writing, I concluded that the problem lies in the parallel GC implemented in the GHC RTS. I extracted the attached code as a self-contained test case. On my system it runs in 16s with a single thread, in 18s with 6 threads but no parallel GC, and in over a minute with 6 threads and the parallel GC!

The slowdown in the full application is even worse and matters in practice (some steps take >1 hour instead of <1 minute!). Parts of the code do take full advantage of parallel processing; this is just one simple test case.

The problem seems worse on some machines than on others, and the input file (data.txt) needs to be quite large for it to really show up (the attached script generates a 16 million input file; this is still smaller than some of my real use cases, but I couldn't trigger the problem with only 1 million). Similarly, with 4 threads the slowdown is detectable, but not as large.

While the program is running, CPU usage is very high: I tested with 16 threads and it uses 16 CPUs continuously (top reports 1600% CPU).

Besides disabling the parallel GC with '+RTS -qg', using '+RTS -A64m' is another way around the issue, but for the full application it is still not as effective as '+RTS -qg', so there still seems to be a performance issue here.
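
For anyone trying to compare the RTS settings mentioned above, one option is to have the program report how much of its wall-clock time went to garbage collection. The following is only a minimal sketch using `GHC.Stats` from base (4.10 or later); `workload` and `percent` are hypothetical names, and the workload is a stand-in, not the code attached to this issue, so it is not expected to reproduce the slowdown by itself.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Control.Monad (unless)
import Data.Int (Int64)
import Data.List (foldl')
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)
import System.Exit (die)

-- Hypothetical stand-in workload (NOT the attached test case):
-- sums the lengths of many short strings to generate some allocation.
workload :: Int -> Int
workload n = foldl' (\acc x -> acc + length (show x)) 0 [1 .. n]

-- Share of 'part' in 'whole' as a percentage; 0 when 'whole' is not positive.
percent :: Int64 -> Int64 -> Double
percent part whole
  | whole <= 0 = 0
  | otherwise  = fromIntegral part / fromIntegral whole * 100

main :: IO ()
main = do
  -- Statistics are only collected when the program runs with +RTS -T (or -s).
  enabled <- getRTSStatsEnabled
  unless enabled $ die "Run with +RTS -T so that the RTS collects statistics"

  -- Force the result before sampling the statistics.
  let !result = workload 2000000
  stats <- getRTSStats

  putStrLn ("result:            " ++ show result)
  putStrLn ("collections:       " ++ show (gcs stats))
  putStrLn ("major collections: " ++ show (major_gcs stats))
  putStrLn ("GC elapsed (ns):   " ++ show (gc_elapsed_ns stats))
  putStrLn ("mutator (ns):      " ++ show (mutator_elapsed_ns stats))
  putStrLn ("GC share (%):      "
            ++ show (percent (gc_elapsed_ns stats) (elapsed_ns stats)))
```

Assuming the file is saved as GcShare.hs (a hypothetical name), it could be built with `ghc -O2 -threaded -rtsopts GcShare.hs` and run as `./GcShare +RTS -T -N6`, `./GcShare +RTS -T -N6 -qg`, and `./GcShare +RTS -T -N6 -A64m` to compare the GC share under each setting. Running with `+RTS -s` prints a similar summary at exit without any code changes.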

Edited Mar 10, 2019 by luispedro
Milestone
8.0.1 (Past due)
Reference: ghc/ghc#11783