More aarch64 memory barrier problems.
While attempting to build GHC 8.8.3 on aarch64, I'm hit with loads of these errors when the build system uses the stage 2 compiler to build haddock:
ghc-stage2: internal error: evacuate: strange closure type -1860476400ghc-stage2: internal error:
(GHC version 8.8.3 for aarch64_unknown_linux)
evacuate: strange closure type -1860476400 Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
ghc-stage2: internal error: (GHC version 8.8.3 for aarch64_unknown_linux)
evacuate: strange closure type -1860476400 Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
(GHC version 8.8.3 for aarch64_unknown_linux)
Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
ghc-stage2: internal error: evacuate: strange closure type -1860476400
ghc-stage2: internal error: (GHC version 8.8.3 for aarch64_unknown_linux)
Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
ghc-stage2: internal error: evacuate: strange closure type -1860476400
(GHC version 8.8.3 for aarch64_unknown_linux)
Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
ghc-stage2: internal error: evacuate: strange closure type -1860476400
(GHC version 8.8.3 for aarch64_unknown_linux)
Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
ghc-stage2: internal error: evacuate: strange closure type -1860476400ghc-stage2: internal error:
(GHC version 8.8.3 for aarch64_unknown_linux)
evacuate: strange closure type -1860476400 Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
(GHC version 8.8.3 for aarch64_unknown_linux)
evacuate: strange closure type -1860476400 Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
(GHC version 8.8.3 for aarch64_unknown_linux)
Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
ghc-stage2: internal error: evacuate: strange closure type -1860476400
(GHC version 8.8.3 for aarch64_unknown_linux)
Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
I'm fairly certain this failure is nondeterministic, but building haddock produces it reliably. I retried the build several times and was never able to get it to build haddock without crashing.
I'm pretty sure that this is more of the behavior originally identified in #15449 (closed). To attempt to fix this, I originally opened !734 (closed). However, my understanding of how STG and the RTS work wasn't up to snuff and some of the added barriers were identified as superfluous. !1128 (merged) ultimately merged in a subset of these barriers. I think the issue is some combination of:
- !1128 (merged) omitted some barriers from !734 (closed) that were actually necessary.
- There are regressions between !1128 (merged) and 8.8.3. Sadly, I suspect very few people are using a native aarch64 GHC at all, and I'm not sure what (if any) aarch64 CI testing is done, so this doesn't seem too unlikely.
Unlike the first time I encountered these issues, all of the crashes are about "strange closure type" during evacuation, so that narrows things down just a bit.
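For readers less familiar with this class of bug, here is a minimal C11 sketch of the kind of publication ordering that a missing barrier breaks on a weakly ordered architecture such as aarch64. It is not RTS code and makes no claim about where the actual bug is: `Closure`, `shared_slot`, `publish`, and `try_read` are hypothetical names made up for illustration. The point is only that if the release/acquire pair below were weakened to relaxed accesses, a reader on another core could see the pointer before it sees the header write, and would then read a garbage value where it expects a valid closure type.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a heap object: a header word plus one payload
 * word. This is NOT the real GHC closure layout. */
typedef struct {
    uintptr_t info;     /* stand-in for the header / closure type */
    uintptr_t payload;  /* stand-in for the object's fields */
} Closure;

/* Hypothetical shared slot through which one thread hands an object to
 * another. */
static _Atomic(Closure *) shared_slot = NULL;

/* Writer side: fully initialise the object, then publish the pointer.
 * The release store keeps the two plain writes above it from becoming
 * visible after the pointer itself. */
static void publish(Closure *c, uintptr_t info, uintptr_t payload) {
    c->info = info;
    c->payload = payload;
    atomic_store_explicit(&shared_slot, c, memory_order_release);
}

/* Reader side: the acquire load pairs with the release store, so a
 * non-NULL result implies the header and payload writes are visible. */
static Closure *try_read(void) {
    return atomic_load_explicit(&shared_slot, memory_order_acquire);
}
```

On x86 the stronger hardware memory model hides many mistakes of this kind, which would be consistent with these crashes showing up only on aarch64.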
A hopefully useful anecdote: my organization has been using GHC 8.6.4 with !734 (closed) backported for about a year and a half. During that time, thousands of native aarch64 builds were done and lots of different Haskell programs built with that GHC were deployed to hundreds of aarch64 machines, and not a single crash attributable to aarch64 memory barrier issues occurred. This gives me confidence that the correct set of barriers is a subset of what's in !734 (closed).
It'll likely be a week or two before I'm able to commit any time to looking into this...