GHC issue #17445
Issue created Nov 07, 2019 by Julian Stecklina (@blitz)

Segfaults on Intel Goldmont Plus Atom CPU

Summary

GHC segfaults when compiling non-trivial code on an Intel Celeron J4005 (Goldmont Plus Atom). The crash does not reproduce on every run, but it happens often enough to make GHC unusable on this system. The segfaults occur at similar places. For example, GHC 8.6.3 has hit this exact segfault three times now:

Nov 06 20:36:35 kernel: ghc[8021]: segfault at 1 ip 00007ffff099d0df sp 00007fffffff64d8 error 6 in libHSrts_thr-ghc8.6.3.so[7ffff094c000+71000]
Nov 06 20:36:35  kernel: Code: 00 00 00 19 00 00 00 00 00 00 00 48 83 ec 08 48 8d 3d 41 d9 00 00 31 c0 e8 be 57 fd ff 48 83 c4 08 eb e8 48 63 43 0c 48 89 c1 <48> c1 e1 03 48 83 c1 02 48 89 ea 48 29 ca 4c 39 fa 0f 82 8a 00 00
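
The fields of the kernel's segfault line can be decoded mechanically. The following is a minimal sketch, not part of the original report: `decode_segfault` is a hypothetical helper, and it assumes the standard x86 page-fault error-code bits (bit 0: page was present, bit 1: write access, bit 2: user mode) and that the bracketed `[base+size]` pair describes the mapping containing the instruction pointer.

```python
import re

# The segfault line from the log above (timestamp prefix removed).
LINE = ("ghc[8021]: segfault at 1 ip 00007ffff099d0df sp 00007fffffff64d8 "
        "error 6 in libHSrts_thr-ghc8.6.3.so[7ffff094c000+71000]")


def decode_segfault(line):
    """Split a kernel segfault line into numeric fields (hypothetical helper)."""
    m = re.search(
        r"segfault at ([0-9a-f]+) ip ([0-9a-f]+) sp ([0-9a-f]+) "
        r"error ([0-9a-f]+) in (\S+)\[([0-9a-f]+)\+([0-9a-f]+)\]",
        line,
    )
    if m is None:
        raise ValueError("not a segfault line")
    addr, ip, _sp, err, obj, base, _size = m.groups()
    addr, ip, err, base = (int(x, 16) for x in (addr, ip, err, base))
    return {
        "fault_address": addr,
        "object": obj,
        # Offset of the faulting instruction relative to the start of the mapping.
        "offset_in_mapping": ip - base,
        # Assumed x86 page-fault error-code bits.
        "page_present": bool(err & 1),
        "write_access": bool(err & 2),
        "user_mode": bool(err & 4),
    }


info = decode_segfault(LINE)
print(hex(info["offset_in_mapping"]), info)
```

Under these assumptions, `error 6` describes a user-mode write to a non-present page at address 1, and the instruction pointer lies 0x510df bytes into the `libHSrts_thr` mapping.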

This is extremely weird, because the offending instruction is a register-only shift, which does not access memory:

0000000000000018 4863430c         movsxd rax, dword [rbx+0xc]
000000000000001c 4889c1           mov rcx, rax
000000000000001f 48c1e103         shl rcx, 0x3   <--- Segfault happens here?
0000000000000023 4883c102         add rcx, 0x2
0000000000000027 4889ea           mov rdx, rbp
000000000000002a 4829ca           sub rdx, rcx
000000000000002d 4c39fa           cmp rdx, r15
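
The link between the kernel's `Code:` line and the listing above can be checked with a few lines of Python. This is a sketch under the assumption that the byte wrapped in `<...>` is the one at the faulting instruction pointer; `faulting_bytes` is a hypothetical helper name, and `CODE` is copied from the log above.

```python
# Byte dump from the kernel's "Code:" line; <48> marks the faulting instruction pointer.
CODE = ("00 00 00 19 00 00 00 00 00 00 00 48 83 ec 08 48 8d 3d 41 d9 00 00 "
        "31 c0 e8 be 57 fd ff 48 83 c4 08 eb e8 48 63 43 0c 48 89 c1 <48> "
        "c1 e1 03 48 83 c1 02 48 89 ea 48 29 ca 4c 39 fa 0f 82 8a 00 00")


def faulting_bytes(code_line):
    """Return (index of the marked byte, bytes from that index onwards)."""
    tokens = code_line.split()
    index = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    raw = bytes(int(tok.strip("<>"), 16) for tok in tokens)
    return index, raw[index:]


index, tail = faulting_bytes(CODE)
print(index, tail[:4].hex())
```

The marked byte sits at index 42 of the dump and starts the sequence `48 c1 e1 03`, the encoding shown for `shl rcx, 0x3` in the listing. Since the listing places that instruction at offset 0x1f, the disassembly begins 11 bytes into the dump.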

I initially thought this system has bad RAM, but:

  • I see no other software crashing on this system,
  • I ran memtest86 overnight without any errors.

I also see no segfaults with the same software stack on the three Intel Core systems I have available (Broadwell, Skylake, Whiskey Lake). The only important difference seems to be the CPU.

I would appreciate it if someone with a modern Atom CPU could try to reproduce this issue.

Steps to reproduce

The easiest way to hit the problem is to compile GHC itself. On NixOS, this can be achieved by:

% git clone https://github.com/NixOS/nixpkgs.git
% cd nixpkgs
# We have to avoid fetching a cached copy, but don't want to rebuild all the dependencies
# as well. So the easiest way is to change the derivation in a meaningless way. Add a newline in preConfigure:
% $EDITOR pkgs/development/compilers/ghc/8.8.1.nix
% nix-build -A haskell.compiler.ghc881
[...]
"/nix/store/vkz7mjh3frlc3rl8wlyqddcn0356dicv-ghc-8.6.3-binary/bin/ghc" [...] -no-user-package-db -rtsopts     
   -outputdir compiler/stage1/build    -c compiler/main/Ar.hs -o compiler/stage1/build/Ar.o
make[1]: *** [compiler/ghc.mk:444: compiler/stage1/build/Ar.o] Segmentation fault (core dumped)
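
Because the crash is intermittent, it may help to re-run a single failing compiler invocation many times and count how often it dies with SIGSEGV. The sketch below is not from the original report: `count_segfaults` is a hypothetical helper, and it assumes a POSIX system and that the command is the crashing process itself rather than `make` or a wrapper that hides the signal.

```python
import signal
import subprocess


def count_segfaults(cmd, runs):
    """Run cmd `runs` times and return how many runs were killed by SIGSEGV."""
    crashes = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        # On POSIX, a negative return code -N means "terminated by signal N".
        if result.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes


# Usage (placeholder command; substitute the full failing ghc invocation):
# print(count_segfaults(["ghc", "-fforce-recomp", "-c", "Example.hs"], 20))
```

Comparing the crash count for the same command on the Atom system and on one of the Core systems would give a rough measure of how reproducible the failure is.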

Expected behavior

GHC does not crash.

Environment

  • GHC version used: 8.6.3 (but also 8.4.4)

Optional:

  • Operating System: Linux (NixOS 19.09), both on Linux 4.19.81 and 5.3.7
  • System Architecture: x86_64