GHC issue #22264
Issue created Oct 06, 2022 by jberryman@trac-jberryman

Segfaults and "PAP object entered!" with --nonmoving-gc

Summary

On 8.10.7 and on 9.2.4 (but only with optimizations on, afaict) I get segfaults and an occasional internal error when running our app (https://github.com/hasura/graphql-engine/commits/master) with --nonmoving-gc. I've only started debugging with 9.2.4. There the behavior varies from run to run: sometimes the server throws an error on startup, sometimes only after I send a query, and sometimes it seems to function normally.

A couple of gdb sessions are attached; I don't really know how to get useful information from them. I've built with -g2 and I'm using the DWARF bindist.

gdb2 gdb1

The one "internal error" I managed to get was:

graphql-engine: internal error: PAP object (0x4205af6858) entered!
Stack trace:
            0x7fffdffa048f    set_initial_registers (rts/Libdw.c:294.5)
            0x7fffde9435a8    dwfl_thread_getframes (/usr/lib/x86_64-linux-gnu/libdw-0.187.so)
            0x7fffde94395c    dwfl_getthread_frames (/usr/lib/x86_64-linux-gnu/libdw-0.187.so)
            0x7fffdffa0ab7    libdwGetBacktrace (rts/Libdw.c:265.8)
            0x7fffdff9610d    rtsFatalInternalErrorFn (rts/RtsMessages.c:176.6)
            0x7fffdff962d0    barf (rts/RtsMessages.c:49.3)
            0x7fffdffc9093    stg_PAP_info (rts/Apply.cmm:215.1)
            0x7ffff6e730e8    graphqlzmenginezm1zi0zi0zminplace_HasuraziRQLziDDLziSchemaziCache_buildRebuildableSchemaCacheWithReason1847_info (src-lib/Hasura/RQL/DDL/Schema/Cache.hs:337.23)
            0x7fffdffc7230    stg_upd_frame_info (rts/Updates.cmm:31.1)
            0x7ffff6df5ed0    (null) (libraries/base/GHC/Base.hs:1491.22)
            0x7ffff6ea1b88    (null) (src-lib/Hasura/RQL/DDL/Schema/Cache.hs:156.10)
                  0x886ca8    (null) (src-lib/Hasura/App.hs:409.21)
            0x7fffdffc75f8    stg_maskAsyncExceptionszh_ret_info (rts/Exception.cmm:115.1)
            0x7fffdffc7b08    stg_catch_frame_info (rts/Exception.cmm:376.1)
                  0x87e2f0    (null) (libraries/base/GHC/Base.hs:1514.12)
            0x7fffdffc74b0    stg_unmaskAsyncExceptionszh_ret_info (rts/Exception.cmm:63.1)
                  0x87e750    Main_zdsforkManagedT_info (libraries/base/GHC/IO.hs:290.36)
            0x7fffdffc75f8    stg_maskAsyncExceptionszh_ret_info (rts/Exception.cmm:115.1)
            0x7fffdffc7b08    stg_catch_frame_info (rts/Exception.cmm:376.1)
                  0x3d9ab0    (null) (libraries/base/Control/Exception/Base.hs:239.10)
            0x7fffdffc74b0    stg_unmaskAsyncExceptionszh_ret_info (rts/Exception.cmm:63.1)
                  0x3d9f60    Main_zdszdwmkLoggerCtx_info (libraries/base/GHC/IO.hs:290.36)
            0x7fffdffc7b08    stg_catch_frame_info (rts/Exception.cmm:376.1)
            0x7fffdffc7b08    stg_catch_frame_info (rts/Exception.cmm:376.1)
            0x7fffdffc6fc8    stg_stop_thread_info (rts/StgStartup.cmm:42.1)
            0x7fffdffa0c5b    StgRunIsImplementedInAssembler (rts/StgCRun.c:379.5)
            0x7fffdffa4bc3    schedule (rts/Capability.h:226.12)
            0x7fffdffa64c1    scheduleWaitThread (rts/Schedule.c:2634.11)
            0x7fffdffa952c    hs_main (rts/RtsMain.c:73.18)
                 0x12d7d70    _init (/home/me/Work/hasura/graphql-engine-mono/dist-newstyle/build/x86_64-linux/ghc-9.2.4.20220919/graphql-engine-1.0.0/x/graphql-engine/opt/build/graphql-engine/graphql-engine)
            0x7fffde02920a    __libc_start_call_main (../sysdeps/nptl/libc_start_call_main.h:58.16)
            0x7fffde0292bc    __libc_start_main@@GLIBC_2.34 (../csu/libc-start.c:128.20)
                  0x36f561    _start (/home/me/Work/hasura/graphql-engine-mono/dist-newstyle/build/x86_64-linux/ghc-9.2.4.20220919/graphql-engine-1.0.0/x/graphql-engine/opt/build/graphql-engine/graphql-engine)

    (GHC version 9.2.4.20220919 for x86_64_unknown_linux)
    Please report this as a GHC bug:  https://www.haskell.org/ghc/reportabug
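
The long symbol names in the trace above (for example graphqlzmenginezm1zi0zi0zminplace_HasuraziRQLzi...) are Z-encoded, which is how GHC turns Haskell identifiers into linker-safe names. To make the trace easier to read, here is a small decoder sketch in Python. It covers only the common two-character escapes; tuple forms and numeric Unicode escapes are left untouched, and the name z_decode is made up here for illustration.

```python
# Partial table of GHC Z-encoding escapes (common cases only).
# Tuple forms (such as Z2T) and numeric Unicode escapes are NOT handled.
Z_ESCAPES = {
    "zz": "z", "ZZ": "Z",
    "zm": "-", "zi": ".", "zd": "$", "zu": "_", "zh": "#",
    "zp": "+", "zt": "*", "zs": "/", "ze": "=", "zg": ">",
    "zl": "<", "zn": "!", "za": "&", "zb": "|", "zc": "^",
    "zq": "'", "zv": "%", "zr": "\\",
    "ZC": ":", "ZL": "(", "ZR": ")", "ZM": "[", "ZN": "]",
}


def z_decode(symbol: str) -> str:
    """Decode the common Z-encoding escapes in a GHC symbol name.

    Scans left to right; any two-character sequence found in Z_ESCAPES
    is replaced, everything else is copied through unchanged.
    """
    out = []
    i = 0
    while i < len(symbol):
        pair = symbol[i:i + 2]
        if pair in Z_ESCAPES:
            out.append(Z_ESCAPES[pair])
            i += 2
        else:
            out.append(symbol[i])
            i += 1
    return "".join(out)
```

Applied to the trace, Main_zdszdwmkLoggerCtx_info decodes to Main_$s$wmkLoggerCtx_info, and the zm/zi runs in the long frame name become the package name graphql-engine-1.0.0-inplace followed by the module Hasura.RQL.DDL.Schema.Cache.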

Steps to reproduce

I will have to circle back with a small reproducer if I can find one, but you can do:

  • edit scripts/dev.sh and add --nonmoving-gc to RUN_INVOCATION
  • change line 28 of cabal/dev-sh.project.local to flags: +optimize-hasura, then run ln -s cabal/dev-sh.project.local cabal.project.local
  • $ scripts/dev.sh graphql-engine in one terminal
  • $ scripts/dev.sh postgres in another
  • if you haven’t segfaulted by this point: cd server/benchmarks/benchmark_sets/chinook && curl -X POST -H 'Content-Type: application/json' -d @replace_metadata.json http://127.0.0.1:8181/v1/query
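
Because the failure is intermittent, the last step can be scripted to repeat the request until the server stops answering. A minimal sketch in Python, assuming the server is listening on 127.0.0.1:8181 as in the curl command above and that the script is run from server/benchmarks/benchmark_sets/chinook. The function names and the attempt count are hypothetical, and a connection failure is only a hint that the process died, so check the server terminal as well.

```python
import http.client
import urllib.error
import urllib.request

# Endpoint and payload file are taken from the curl command above.
URL = "http://127.0.0.1:8181/v1/query"
PAYLOAD_FILE = "replace_metadata.json"
ATTEMPTS = 50  # arbitrary choice for illustration


def build_request(body: bytes, url: str = URL) -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def hammer(body: bytes, attempts: int = ATTEMPTS, url: str = URL) -> int:
    """Send the request repeatedly.

    Returns the number of requests that got an HTTP response before the
    first connection-level failure, or `attempts` if none failed.
    """
    for i in range(attempts):
        try:
            with urllib.request.urlopen(build_request(body, url), timeout=60) as resp:
                resp.read()
        except urllib.error.HTTPError:
            # The server answered with an error status, so it is still alive.
            continue
        except (OSError, http.client.HTTPException):
            # Refused, reset, or truncated connection: the server may have crashed.
            return i
    return attempts


def main() -> None:
    with open(PAYLOAD_FILE, "rb") as f:
        payload = f.read()
    answered = hammer(payload)
    print(f"{answered} of {ATTEMPTS} request(s) answered")
```

Calling main() from the chinook directory prints how many requests were answered; a count below ATTEMPTS means a request failed at the connection level, which is when to look for a segfault or an "internal error" message in the terminal running graphql-engine.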

Expected behavior

The server runs normally, with no segfaults or internal errors.

Environment

  • GHC version used: 9.2.4 (fork with backported fix: https://gitlab.haskell.org/ghc/ghc/-/pipelines/56889 ; I'll need to circle back about whether I see this on stock 9.2.4 as well)

Optional:

  • Operating System: Linux (Debian bullseye)
  • System Architecture: x86_64
Edited Oct 06, 2022 by jberryman