Austin suggests that this started happening around the time I pushed my loopification patch (this bug was reported on 30 Aug, my patch was pushed a day earlier). The panic happens in the splitAtProcPoints function, and I recall that my previous attempt at loopification as a Cmm pass didn't work with LLVM because it broke the invariant that a block may be reachable only from a single procpoint.
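For anyone following along, the invariant mentioned above can be written down as a small check. This is only a sketch over a toy graph representation, not GHC's actual proc-point code: Label, Graph, reachableFrom and violations are names made up for illustration, and I am assuming that "reachable from a procpoint" means reachable without passing through another procpoint.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical stand-ins for block labels and the control-flow graph.
type Label = String
type Graph = Map.Map Label [Label]  -- block -> its successors

-- Blocks reachable from one proc-point without passing through
-- any other proc-point. The starting proc-point itself is included.
reachableFrom :: Set.Set Label -> Graph -> Label -> Set.Set Label
reachableFrom procPoints graph start = go Set.empty [start]
  where
    go seen [] = seen
    go seen (l : rest)
      | l `Set.member` seen = go seen rest
      | l /= start && l `Set.member` procPoints = go seen rest
      | otherwise =
          go (Set.insert l seen) (Map.findWithDefault [] l graph ++ rest)

-- Blocks that break the invariant, mapped to every proc-point
-- that reaches them. An empty map means the invariant holds.
violations :: Set.Set Label -> Graph -> Map.Map Label [Label]
violations procPoints graph = Map.filter ((> 1) . length) owners
  where
    owners =
      Map.fromListWith (++)
        [ (b, [p])
        | p <- Set.toList procPoints
        , b <- Set.toList (reachableFrom procPoints graph p)
        ]
```

In a graph where two procpoints A and B both jump to an ordinary block C, violations reports C. Making C a procpoint as well clears the violation.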
@jstolarek : On powerpc64-linux, if I revert commit d61c3ac1 the stage1 build completes and then fails during the stage2 build. This suggests that something in that commit is causing the expectJust failure.
Thanks for your build.mk. Unfortunately I can't reproduce this on my Linux machine - it seems that the problem only happens on Macs.
if I revert commit d61c3ac1 the stage1 build completes and then fails during the stage2 build.
This is most strange. If stage2 fails after reverting that commit this would mean that you are experiencing some other bug. Did you clean the build tree after reverting the commit? Also, how does stage2 fail? What error do you get?
One thing you could do to help us in debugging this is trying to build HEAD and when you get a build failure, you could re-run the last command with -ddump-cmm -dcmm-lint added to the command line. So this would be something like this:
This is actually the first command run using the second stage compiler. It builds the non-dynamic timeUtils.o object successfully, and when it builds timeUtils.dyn_o I get this segfault.
I tried dumping the cmm and lint files but it seems the compiler segfaulted before it got to that stage.
OK, I'm really puzzled about this segfault in the stage2 compiler, but perhaps I can help with the panic in expectJust. To do that we need to reproduce it. This means you need to revert the reverting commit :) or, in other words, attempt to build unmodified HEAD and allow the stage1 build to fail with the panic that you originally reported. After that happens, run the command that causes the segfault with -ddump-cmm -dcmm-lint added.
Edward: No, unfortunately I don't have access to one. I asked Richard if he is able to reproduce the problem on his Mac but everything builds fine on his machine.
erikd: Thanks. Are you sure this is the right dump? If the compiler panicked during compilation the dump should be incomplete, whereas yours is complete.
But that's not that important - Kazu provided a dump which allows me to figure out what's going on. Below is an explanation of what is going on (no solution yet).
Here is how Cmm looks before stack layout (cFXJ and cFXS are important here):
Notice that the cFXS block was eliminated during stack layout and we got a new uFY0 block. Now comes the time for CAF analysis followed by proc-point analysis:
==================== CAFEnv ====================
[(cFXI, {}), (cFXJ, {}), (cFXN, {}), (cFXO, {}), (cFXP, {}), (cFXR, {}), (cFXT, {}), (cFXU, {}), (uFY0, {})]

==================== procpoint map ====================
[(cFXI, reached by cFXP), (cFXJ, reached by cFXR), (cFXN, reached by cFXT), (cFXO, reached by cFXT), (cFXP, <procpt>), (cFXR, <procpt>), (cFXS, <procpt>), (cFXT, <procpt>), (cFXU, reached by cFXT), (uFY0, reached by cFXR)]
Notice that the procpoint map refers to the deleted cFXS block. The problem is that we determine proc-points before stack layout but run proc-point analysis after stack layout. Clearly, stack layout can remove some of the proc-points that we previously computed and thus invalidate our analysis. I don't have a good idea for a solution yet. We can't compute proc-points after stack layout, because stack layout needs that information. One idea that comes to mind is modifying stack layout so that it returns a new, possibly modified, list of proc-points.
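To make the mismatch concrete, here is a rough sketch of the check using the labels from the dump above. It is not GHC code: Label, staleProcPoints and pruneProcPoints are hypothetical names, and I am assuming that the set of blocks surviving stack layout is exactly the set of keys shown in the CAFEnv dump.

```haskell
import qualified Data.Set as Set

-- Hypothetical stand-in for a Cmm block label.
type Label = String

-- Proc-points computed before stack layout that no longer name
-- any block once stack layout has run.
staleProcPoints :: Set.Set Label -> Set.Set Label -> Set.Set Label
staleProcPoints procPoints blocksAfterLayout =
  procPoints `Set.difference` blocksAfterLayout

-- What a modified stack layout could hand back instead: only the
-- proc-points whose blocks still exist.
pruneProcPoints :: Set.Set Label -> Set.Set Label -> Set.Set Label
pruneProcPoints procPoints blocksAfterLayout =
  procPoints `Set.intersection` blocksAfterLayout

-- Proc-points as listed in the procpoint map dump.
procPointsBefore :: Set.Set Label
procPointsBefore = Set.fromList ["cFXP", "cFXR", "cFXS", "cFXT"]

-- Blocks present after stack layout (assumed: the CAFEnv keys).
blocksAfterLayout :: Set.Set Label
blocksAfterLayout =
  Set.fromList
    ["cFXI", "cFXJ", "cFXN", "cFXO", "cFXP", "cFXR", "cFXT", "cFXU", "uFY0"]

-- staleProcPoints procPointsBefore blocksAfterLayout
--   == Set.fromList ["cFXS"]
```

Pruning alone may not be enough if stack layout also introduces blocks that need to become proc-points, so treat this as an illustration of the stale entry rather than a proposed fix.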
I wonder why this only happens on MacOS, and why only on some machines. I think this should be deterministic and always happen, regardless of operating system.