Segfaults and "PAP object entered!" with --nonmoving-gc
Summary
On 8.10.7 and on 9.2.4 (but only with optimizations on, as far as I can tell) I get segfaults and an occasional internal error when running our app (https://github.com/hasura/graphql-engine/commits/master) with --nonmoving-gc. I've only started debugging with 9.2.4. There I see: sometimes the server throws an error on startup, sometimes only after sending a query, and sometimes it seems to function normally.
A couple of gdb sessions are attached; I don't really know how to get useful information from them. I've built with -g2 and I'm using the DWARF bindist.
The one "internal error" I managed to get was:
graphql-engine: internal error: PAP object (0x4205af6858) entered!
Stack trace:
0x7fffdffa048f set_initial_registers (rts/Libdw.c:294.5)
0x7fffde9435a8 dwfl_thread_getframes (/usr/lib/x86_64-linux-gnu/libdw-0.187.so)
0x7fffde94395c dwfl_getthread_frames (/usr/lib/x86_64-linux-gnu/libdw-0.187.so)
0x7fffdffa0ab7 libdwGetBacktrace (rts/Libdw.c:265.8)
0x7fffdff9610d rtsFatalInternalErrorFn (rts/RtsMessages.c:176.6)
0x7fffdff962d0 barf (rts/RtsMessages.c:49.3)
0x7fffdffc9093 stg_PAP_info (rts/Apply.cmm:215.1)
0x7ffff6e730e8 graphqlzmenginezm1zi0zi0zminplace_HasuraziRQLziDDLziSchemaziCache_buildRebuildableSchemaCacheWithReason1847_info (src-lib/Hasura/RQL/DDL/Schema/Cache.hs:337.23)
0x7fffdffc7230 stg_upd_frame_info (rts/Updates.cmm:31.1)
0x7ffff6df5ed0 (null) (libraries/base/GHC/Base.hs:1491.22)
0x7ffff6ea1b88 (null) (src-lib/Hasura/RQL/DDL/Schema/Cache.hs:156.10)
0x886ca8 (null) (src-lib/Hasura/App.hs:409.21)
0x7fffdffc75f8 stg_maskAsyncExceptionszh_ret_info (rts/Exception.cmm:115.1)
0x7fffdffc7b08 stg_catch_frame_info (rts/Exception.cmm:376.1)
0x87e2f0 (null) (libraries/base/GHC/Base.hs:1514.12)
0x7fffdffc74b0 stg_unmaskAsyncExceptionszh_ret_info (rts/Exception.cmm:63.1)
0x87e750 Main_zdsforkManagedT_info (libraries/base/GHC/IO.hs:290.36)
0x7fffdffc75f8 stg_maskAsyncExceptionszh_ret_info (rts/Exception.cmm:115.1)
0x7fffdffc7b08 stg_catch_frame_info (rts/Exception.cmm:376.1)
0x3d9ab0 (null) (libraries/base/Control/Exception/Base.hs:239.10)
0x7fffdffc74b0 stg_unmaskAsyncExceptionszh_ret_info (rts/Exception.cmm:63.1)
0x3d9f60 Main_zdszdwmkLoggerCtx_info (libraries/base/GHC/IO.hs:290.36)
0x7fffdffc7b08 stg_catch_frame_info (rts/Exception.cmm:376.1)
0x7fffdffc7b08 stg_catch_frame_info (rts/Exception.cmm:376.1)
0x7fffdffc6fc8 stg_stop_thread_info (rts/StgStartup.cmm:42.1)
0x7fffdffa0c5b StgRunIsImplementedInAssembler (rts/StgCRun.c:379.5)
0x7fffdffa4bc3 schedule (rts/Capability.h:226.12)
0x7fffdffa64c1 scheduleWaitThread (rts/Schedule.c:2634.11)
0x7fffdffa952c hs_main (rts/RtsMain.c:73.18)
0x12d7d70 _init (/home/me/Work/hasura/graphql-engine-mono/dist-newstyle/build/x86_64-linux/ghc-9.2.4.20220919/graphql-engine-1.0.0/x/graphql-engine/opt/build/graphql-engine/graphql-engine)
0x7fffde02920a __libc_start_call_main (../sysdeps/nptl/libc_start_call_main.h:58.16)
0x7fffde0292bc __libc_start_main@@GLIBC_2.34 (../csu/libc-start.c:128.20)
0x36f561 _start (/home/me/Work/hasura/graphql-engine-mono/dist-newstyle/build/x86_64-linux/ghc-9.2.4.20220919/graphql-engine-1.0.0/x/graphql-engine/opt/build/graphql-engine/graphql-engine)
(GHC version 9.2.4.20220919 for x86_64_unknown_linux)
Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
Steps to reproduce
I will have to circle back with a small reproducer if I can find one, but you can do:
- edit scripts/dev.sh and add --nonmoving-gc to RUN_INVOCATION
- change line 28 of cabal/dev-sh.project.local to flags: +optimize-hasura ; then ln -s cabal/dev-sh.project.local cabal.project.local
- $ scripts/dev.sh graphql-engine in one terminal
- $ scripts/dev.sh postgres in another
- if you haven't segfaulted by this point:
cd server/benchmarks/benchmark_sets/chinook && curl -X POST -H 'Content-Type: application/json' -d @replace_metadata.json http://127.0.0.1:8181/v1/query
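Since the failure is intermittent, it may help to repeat a step and record how each run ended. The following is only a minimal sketch: classify_status and repeat_and_classify are hypothetical helpers, not part of the graphql-engine repo, and the sketch relies on the usual shell convention that an exit status above 128 means the process was killed by signal (status - 128), so 139 corresponds to SIGSEGV.

```shell
# Hypothetical helpers (not from the graphql-engine repo) for tallying
# how repeated runs of a command end.

# Map an exit status to a coarse label. By shell convention, a status
# above 128 means the process was killed by signal (status - 128).
classify_status() {
  if [ "$1" -eq 0 ]; then
    echo "ok"
  elif [ "$1" -gt 128 ]; then
    echo "signal $(($1 - 128))"
  else
    echo "exit $1"
  fi
}

# Run the given command N times, discarding its output, and print one
# label per run. Usage: repeat_and_classify N command [args...]
repeat_and_classify() {
  rac_runs=$1
  shift
  rac_i=1
  while [ "$rac_i" -le "$rac_runs" ]; do
    rac_status=0
    "$@" >/dev/null 2>&1 || rac_status=$?
    echo "run $rac_i: $(classify_status "$rac_status")"
    rac_i=$((rac_i + 1))
  done
}
```

For example, repeat_and_classify 20 curl -X POST -H 'Content-Type: application/json' -d @replace_metadata.json http://127.0.0.1:8181/v1/query repeats the request from the last step. With curl, "exit 7" (failed to connect) would suggest the server has already gone down. Whether scripts/dev.sh passes the server's own exit status through is not something I have checked.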
Expected behavior
normal functioning
Environment
- GHC version used: 9.2.4 (fork with backported fix: https://gitlab.haskell.org/ghc/ghc/-/pipelines/56889 ; I'll need to circle back about whether I see this on stock 9.2.4 as well)
Optional:
- Operating System: Linux (Debian bullseye)
- System Architecture: x86_64