ppc64le hslua segfaulting for ghc 9.2+
Summary
We discovered in Fedora that on ppc64le, pandoc built with ghc-9.2.6 (and also 9.4.4) segfaults for pandoc --version.
There are also details in the downstream bug: https://bugzilla.redhat.com/show_bug.cgi?id=2172771
The crash does not happen when built with ghc-9.0 and earlier.
This is problematic because many R packages use pandoc and, for example, check its version.
Steps to reproduce
Build pandoc-2.19.2 on ppc64le with ghc-9.2 or later and run pandoc -v.
Expected behavior
pandoc -v should not crash.
Environment
- GHC version used: 9.2.6
Optional:
- Operating System: Linux
- System Architecture: ppc64le
Activity
- Jens Petersen added PowerPC needs triage labels
- Jens Petersen changed the description
- Author Developer
Also it is probably worth saying explicitly that pandoc does not segfault, only pandoc -v does. (edit: note this is no longer true with pandoc-cli - it crashes without -v too)
I tried pandoc-cli-0.1 (pandoc-3.1) too and it also segfaults (with ghc-9.2.6).
cc @trommler
I will try to reduce it to a more minimal reproducer.
Edited by Jens Petersen
- Jens Petersen changed the description
- Jens Petersen changed title from ppc64le pandoc -v segfaults for ghc 9.2 and 9.4 to ppc64le hslua segfaulting for ghc 9.2 and 9.4
- Author Developer
So actually it seems to be segfaulting in hslua. Still a bit suspicious that this only happens for ghc-9.2+
Edited by Jens Petersen
- Peter Trommler assigned to @trommler
- Developer
Do you have a list of versions for the many dependencies of pandoc? My build failed in yaml.
Do you also happen to know which is the oldest version of ghc where pandoc/hslua segfault?
- Author Developer
I haven't tried different minor versions if that is what you mean.
I see the segfault with 9.2 but not 9.0.
Are you building pandoc-3.1? - I think 2.19 is sufficient but see below. Basically I used Stackage LTS 20 versions for Fedora, to keep my sanity. :-)
- Developer
Yes, I meant minor version of ghc for a stare-at-the-commit-history debug strategy.
I could also try bisect but with submodules and ghc core library versions my experience has been less than encouraging in the past.
- Author Developer
Perhaps I can try ghc9.2-9.2.5 also, though I would expect the same result.
- Author Developer
The smallest reproducer I have so far:
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TypeApplications #-}
import HsLua (Exception, getglobal, openlibs, peek, run, top)

main :: IO ()
main = do
  luaVersion <- HsLua.run @HsLua.Exception $ do
    openlibs
    _ <- getglobal "_VERSION"
    peek top
  putStrLn luaVersion
It just needs hslua.
(This code is extracted from pandoc -v.)
Edited by Jens Petersen
- Author Developer
There is also a shallow gdb backtrace now in the downstream bug - though dunno if it helps much.
- Developer
I copy the backtrace here so I don't have to refer back to the downstream ticket:
#0 0x0000000017e436e8 in lua_type ()
#1 0x0000000017e3b428 in ghczuwrapperZC1ZCluazm2zi2zi1zmKRmEomuYCwNASKOuLK7fDIZCLuaziPrimaryZCluazutype ()
#2 0x0000000017df7ca8 in hsluazmmarshallingzm2zi2zi1zmCyWh4OZZgxoeJYv9YK7F33ZZ_HsLuaziMarshallingziPeekers_zdwtoByteString_entry ()
#3 0x000000001bf29710 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
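The symbol names in frames #1 and #2 use GHC's Z-encoding, which makes them hard to read. Here is a minimal sketch of a decoder for the common escapes; `zDecode` is a hypothetical helper written for this note, not part of GHC or hslua, and it deliberately ignores the rarer tuple and Unicode escapes. It should only be applied to GHC-generated symbols, not to plain C names such as lua_type.

```haskell
module Main (main) where

-- | Decode the common subset of GHC's Z-encoding found in symbol names.
-- Escapes not listed below are passed through unchanged.
zDecode :: String -> String
zDecode ('Z' : c : rest)
  | Just d <- lookup c upperEscapes = d : zDecode rest
zDecode ('z' : c : rest)
  | Just d <- lookup c lowerEscapes = d : zDecode rest
zDecode (c : rest) = c : zDecode rest
zDecode [] = []

-- Escapes introduced by an upper-case 'Z'.
upperEscapes :: [(Char, Char)]
upperEscapes =
  [ ('Z', 'Z'), ('C', ':'), ('L', '('), ('R', ')'), ('M', '['), ('N', ']') ]

-- Escapes introduced by a lower-case 'z'.
lowerEscapes :: [(Char, Char)]
lowerEscapes =
  [ ('z', 'z'), ('u', '_'), ('m', '-'), ('i', '.'), ('d', '$')
  , ('a', '&'), ('b', '|'), ('c', '^'), ('e', '='), ('g', '>')
  , ('h', '#'), ('l', '<'), ('n', '!'), ('p', '+'), ('q', '\'')
  , ('r', '\\'), ('s', '/'), ('t', '*'), ('v', '%')
  ]

main :: IO ()
main = mapM_ (putStrLn . zDecode)
  [ "ghczuwrapperZC1ZCluazm2zi2zi1zmKRmEomuYCwNASKOuLK7fDIZCLuaziPrimaryZCluazutype"
  , "hsluazmmarshallingzm2zi2zi1zmCyWh4OZZgxoeJYv9YK7F33ZZ_HsLuaziMarshallingziPeekers_zdwtoByteString_entry"
  ]
```

Decoded this way, frame #1 is the generated FFI wrapper for lua_type in Lua.Primary (package lua-2.2.1), and frame #2 is the worker $wtoByteString in HsLua.Marshalling.Peekers (package hslua-marshalling-2.2.1).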
- Author Developer
I put my testcase here: https://github.com/juhp/ppc64le-hslua-version if it helps :-)
- Author Developer
Also segfault with ghc-9.2.5 on Fedora 37 ppc64le: https://copr.fedorainfracloud.org/coprs/petersen/pandoc-test-ppc64le/build/5563255/
and also for EPEL 9 (RHEL9): https://copr.fedorainfracloud.org/coprs/petersen/pandoc-test-ppc64le/build/5563428/
Edited by Jens Petersen
- Developer
Thanks. I will try to bisect from there.
- Developer
Bisecting ended in Cabal hell right from the start :-( I will have to reduce the testcase further so it does not depend on hslua anymore.
- Author Developer
Do you have a smaller reproducer already?
Edited by Jens Petersen
- Developer
No success so far.
- Developer
I verified the bug is already present in GHC 9.2.2
- Developer
... and also 9.2.1
- Ben Gamari added Phigh Tbug labels and removed needs triage label
- Author Developer
Seems worse with hslua-cli used by pandoc 3.x (pandoc-cli) and ghc-9.4(.5).
Previously I could work around the --version crash by disabling the Lua version output, but that no longer appears to be the case ;-( Help!
Edited by Jens Petersen
- Author Developer
A bit of progress: I reduced my testcase from hslua to hslua-core/lua (I am now testing with runghc instead of building with cabal, for faster turnaround).
It seems to be crashing on ppc64le here in:
foreign import ccall SAFTY "lua.h lua_tolstring" lua_tolstring
from Lua.Primary in the lua package.
The Lua C code is https://www.lua.org/source/5.4/lapi.c.html#lua_tolstring
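For a testcase that does not depend on hslua at all, the same call sequence can be written with hand-rolled FFI imports against the Lua 5.4 C API. This is an untested sketch and it is not known whether it reproduces the crash: the `LuaState` type and the binding names are made up here, it assumes a system liblua 5.4 is available to link against (the library name for the linker varies by distribution), and the lua package builds its own bundled copy of Lua with its own flags and wrappers, which may well matter.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Foreign.C.String (CString, peekCStringLen, withCString)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr, nullPtr)
import Foreign.Storable (peek)

-- Opaque stand-in for the C type lua_State (hypothetical name).
data LuaState

-- Direct "safe" imports of plain C functions from the Lua 5.4 API.
foreign import ccall safe "lauxlib.h luaL_newstate"
  luaL_newstate :: IO (Ptr LuaState)
foreign import ccall safe "lualib.h luaL_openlibs"
  luaL_openlibs :: Ptr LuaState -> IO ()
foreign import ccall safe "lua.h lua_getglobal"
  lua_getglobal :: Ptr LuaState -> CString -> IO CInt
foreign import ccall safe "lua.h lua_tolstring"
  lua_tolstring :: Ptr LuaState -> CInt -> Ptr CSize -> IO CString
foreign import ccall safe "lua.h lua_close"
  lua_close :: Ptr LuaState -> IO ()

main :: IO ()
main = do
  l <- luaL_newstate
  if l == nullPtr
    then putStrLn "luaL_newstate failed"
    else do
      luaL_openlibs l
      -- Push the global _VERSION onto the Lua stack.
      _ <- withCString "_VERSION" (lua_getglobal l)
      -- Read the value at the top of the stack (index -1) as a string.
      version <- alloca $ \lenPtr -> do
        cstr <- lua_tolstring l (-1) lenPtr
        if cstr == nullPtr
          then pure "<not a string>"
          else do
            len <- peek lenPtr
            peekCStringLen (cstr, fromIntegral len)
      putStrLn version
      lua_close l
```

If this runs cleanly on ppc64le while the hslua version crashes, that would point at the wrappers or build options in the lua package rather than at plain safe foreign calls.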
(Terse notes: peek became hslua-marshalling "peekByteString", which I further reduced to tostring (definition) and expanded (inlined).)
Edited by Jens Petersen
- Reporter
We are facing the same issue in Debian, now that we migrated to GHC 9.4. The testsuite for the lua package segfaults on ppc64le (see https://buildd.debian.org/status/fetch.php?pkg=haskell-lua&arch=ppc64el&ver=2.3.1%2Bds1-1&stamp=1698828205&raw=0).
@juhp @trommler Did you manage to find a solution to this?
- Developer
Sorry, I have no solution to offer and I cannot work on it right now. I am unassigning myself for now.
- Peter Trommler unassigned @trommler