Closed
Open
Opened Nov 02, 2020 by vdukhovni (@trac-vdukhovni, Developer)

GHC 7.8—8.10, putMVar segfault

Summary

Intermittent segfaults observed in a multi-threaded program (likely via BoundedChan) with GHC 8.10.2.

TL;DR: A missing dirty_MVAR call in putMVar left the MVar incorrectly marked clean even when it held the last reference to a TSO queue head in a younger generation. The GC subsequently moved the queue head, but the MVar's pointer was not updated. As a result, the MVar queue was corrupted with a dangling pointer to an unexpected object or to random content in memory.
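To make the mechanism in the summary concrete, here is a deliberately tiny toy model of a generational write barrier. It is not GHC's RTS code: the types `OldObj` and `Addr`, the functions `writeHead` and `minorGC`, and the fixed relocation offset are all hypothetical names invented for illustration. The only point it shows is that a minor GC which scans just the dirty old-generation objects will not fix up a pointer stored by a write that skipped the barrier.

```haskell
-- Toy model of a generational write barrier. NOT GHC's RTS code:
-- every name here is hypothetical and exists only for illustration.

type Addr = Int

-- An old-generation, MVar-like object: a pointer to the head of its
-- blocked-thread queue, plus the flag that tells a minor GC to scan it.
data OldObj = OldObj
  { headPtr :: Maybe Addr
  , dirty   :: Bool
  } deriving (Show, Eq)

-- Store a pointer to a young object into the old object.
-- The first argument says whether the write barrier runs
-- (the analogue of calling dirty_MVAR).
writeHead :: Bool -> Addr -> OldObj -> OldObj
writeHead barrier p o = o { headPtr = Just p, dirty = dirty o || barrier }

-- Minor GC: every young object moves from address a to a + offset.
-- Only dirty old objects are scanned and have their pointer forwarded;
-- clean ones are assumed to hold no young pointers and are skipped.
minorGC :: Int -> OldObj -> OldObj
minorGC offset o
  | dirty o   = o { headPtr = fmap (+ offset) (headPtr o), dirty = False }
  | otherwise = o

main :: IO ()
main = do
  let clean = OldObj { headPtr = Nothing, dirty = False }
  -- With the barrier: the pointer follows the moved object.
  print (minorGC 1000 (writeHead True 100 clean))
  -- Without the barrier: the pointer still names the vacated address.
  print (minorGC 1000 (writeHead False 100 clean))
```

In the first case the stored pointer becomes 1100, the object's new address; in the second it stays at 100, which after the collection no longer holds the queue head. That stale pointer is the toy analogue of the dangling queue pointer described above.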

Steps to reproduce

The crash occurs many tens of minutes into a run that processes tens of GB of data, so it is by no means immediate or easy to reproduce. It has happened a few times now, and it is very sensitive to scheduler timing and workload. Things that made it more likely were:

  • Limiting the depth of the BoundedChan to 1, thus increasing inter-thread contention
  • Limiting the heap size with -A128k, making GC more frequent.
  • Running on a bare-metal 16-core/32-thread machine, to get more effective concurrency.
  • A multi-layer pipeline of BoundedChans between source and sink:
    • HTTPS or stdin
    • gunzip
    • group into chunks of 1k lines
    • parallel JSON parser/filter
    • output
  • Large dataset from internet-wide IP survey, compressed to multiple GB.
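The shape of such a workload can be sketched with nothing but `base`. This is not the reporter's program and is not known to reproduce the crash; it only illustrates the kind of contention listed above. As an assumption, each depth-1 channel is modelled by a plain `MVar` rather than by the BoundedChan package, and the names `Chan1`, `stage`, `sumFrom` and `runPipeline` are hypothetical. Several workers share the middle stage, so threads regularly queue up blocked on the same `MVar`.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, forever, replicateM_)

-- Assumption: a depth-1 bounded channel is modelled by a plain MVar.
-- putMVar blocks while it is full, takeMVar blocks while it is empty.
type Chan1 a = MVar a

-- One pipeline stage: read an item, transform it, pass it on, forever.
stage :: (a -> b) -> Chan1 a -> Chan1 b -> IO ()
stage f inp out = forever (takeMVar inp >>= \x -> putMVar out $! f x)

-- Sink: take exactly n items from the channel and sum them strictly.
sumFrom :: Chan1 Int -> Int -> IO Int
sumFrom c = go 0
  where
    go acc 0 = return acc
    go acc k = do
      x <- takeMVar c
      let acc' = acc + x
      acc' `seq` go acc' (k - 1)

-- source -> (+1), run by several parallel workers -> (*2) -> sink.
-- Returns the sum of the n items that reach the sink.
runPipeline :: Int -> Int -> IO Int
runPipeline workers n = do
  c0 <- newEmptyMVar
  c1 <- newEmptyMVar
  c2 <- newEmptyMVar
  _ <- forkIO (forM_ [1 .. n] (putMVar c0))       -- source
  replicateM_ workers (forkIO (stage (+ 1) c0 c1)) -- parallel middle stage
  _ <- forkIO (stage (* 2) c1 c2)                  -- final transform
  sumFrom c2 n                                     -- sink

main :: IO ()
main = runPipeline 4 100000 >>= print
```

To approximate the conditions in the list, one would build with `ghc -threaded -rtsopts` and run the binary with `+RTS -N -A128k`, so that several capabilities are in use and the small allocation area forces frequent collections. Because addition is commutative, the sum does not depend on the order in which the workers deliver items, which gives a cheap consistency check on the result.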

Expected behavior

No segfault. (This appears to be resolved via !4457 (closed), !4458 (closed), !4459 and !4460 (closed). The issue was introduced in 5d9e686c.)

Environment

  • GHC version used: GHC 8.10.2 (but applies to all releases from 7.8 onward, MRs filed for 8.8, 8.10, 9.0 and master).

Optional:

  • Operating System: Fedora 31
  • System Architecture: x86_64
Edited Nov 30, 2020 by vdukhovni
Reference: ghc/ghc#18919