Obscure linker bug leads to crash in GHCi
I have a build of GHC (with DYNAMIC_GHC_PROGRAMS=NO) that exhibits the following crash:
$ ghc -e 'System.Environment.getEnvironment'
<segfault>
I tracked it down, eventually, to a bad reference to the symbol environ from __hscore_environ in libraries/base/includes/HsBase.h. Somehow, environ had got linked to the wrong address.
A lot more investigation led me to discover this: internal_dlsym() in Linker.c tries to look up a symbol in each of the shared libraries we have loaded so far, one by one (see be497c20). Unfortunately, this seems to break things in my case. Here's a simple test program that demonstrates the problem on Ubuntu 12.04:
#include <dlfcn.h>
#include <stdio.h>

/* Paths are specific to the test machine (Ubuntu 12.04, x86_64). */
const char *so  = "/usr/lib/x86_64-linux-gnu/libgmp.so";
const char *so2 = "/usr/lib/x86_64-linux-gnu/libpthread.so";
extern char **environ;

int main(int argc, char *argv[])
{
    void *deflt, *hdl;

    /* Handle for the main program and its global symbol scope. */
    deflt = dlopen(NULL, RTLD_LAZY | RTLD_GLOBAL);
    printf("environ = %p\n", (void *)&environ);
    printf("dlsym(deflt, \"environ\") = %p\n", dlsym(deflt, "environ"));

    /* Look up the same symbol through each library's own handle. */
    hdl = dlopen(so, RTLD_LAZY);
    printf("dlsym(\"libgmp\", \"environ\") = %p\n", dlsym(hdl, "environ"));
    hdl = dlopen(so2, RTLD_LAZY);
    printf("dlsym(\"libpthread\", \"environ\") = %p\n", dlsym(hdl, "environ"));
    return 0;
}
And the output:
environ = 0x601040
dlsym(deflt, "environ") = 0x601040
dlsym("libgmp", "environ") = 0x2aaaab290568
dlsym("libpthread", "environ") = 0x601040
Note that the value we get from looking up environ in libgmp is different to the others. The correct value is 0x601040. gdb thinks that 0x2aaaab290568 is also environ:
(gdb) p4 0x2aaaab290568
0x2aaaab290580 <buflen.9817>: 0x0
0x2aaaab290578: 0x0
0x2aaaab290570 <miss_F_GETOWN_EX>: 0x0
0x2aaaab290568 <environ>: 0x0
but note that it contains zero. The real one is:
(gdb) p4 0x601040
0x601058: 0x0
0x601050 <dtor_idx.6533>: 0x0
0x601048 <completed.6531>: 0x0
0x601040 <environ@@GLIBC_2.2.5>: 0x7fffffffe268
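One way to cross-check what gdb reports is to ask the dynamic linker which loaded object contains each address. The sketch below uses glibc's dladdr() for that. It assumes a glibc system, and owning_object and report are hypothetical helper names, not part of GHC or of the test program above.

```c
#define _GNU_SOURCE   /* dladdr() and Dl_info are GNU extensions in glibc */
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: return the path of the loaded object that contains
 * addr, or NULL if the dynamic linker cannot attribute the address. */
const char *owning_object(const void *addr)
{
    Dl_info info;

    if (dladdr(addr, &info) == 0 || info.dli_fname == NULL)
        return NULL;
    return info.dli_fname;
}

/* Hypothetical helper: print an address together with the object it is in. */
void report(const char *label, const void *addr)
{
    const char *obj = owning_object(addr);

    printf("%s = %p (in %s)\n", label, addr, obj ? obj : "unknown object");
}
```

Calling report() on &environ and on the value returned by dlsym(hdl, "environ") in the test program should show whether the two addresses belong to different objects. On older glibc versions the program may need to be linked with -ldl.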
In GHC we're loading libgmp when we load the integer-gmp package, and this causes future references to environ to go wrong.
I've locally fixed this by changing internal_dlsym to dlsym, but since there was a reason to introduce internal_dlsym in the first place, I haven't pushed it to master. Suggestions welcome.
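As one possible direction, a lookup could consult the program's global scope first and only fall back to the individually opened library handles when that fails. The following is a minimal sketch of that ordering, not GHC's actual code: lookup_symbol and its parameters are hypothetical, and it may not satisfy whatever requirement motivated be497c20.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical lookup: try the main program's global scope first, then
 * each explicitly opened library handle in order. Returns NULL if the
 * symbol is not found anywhere. A symbol whose value is genuinely NULL
 * is treated as not found, which is a simplification. */
void *lookup_symbol(const char *name, void *const handles[], size_t n_handles)
{
    /* dlopen(NULL, ...) yields a handle for the main program. */
    void *self = dlopen(NULL, RTLD_LAZY);

    if (self != NULL) {
        void *addr = dlsym(self, name);
        if (addr != NULL)
            return addr;
    }

    /* Fall back to the per-library handles, e.g. for libraries that
     * were opened without RTLD_GLOBAL. */
    for (size_t i = 0; i < n_handles; i++) {
        if (handles[i] == NULL)
            continue;
        void *addr = dlsym(handles[i], name);
        if (addr != NULL)
            return addr;
    }
    return NULL;
}
```

With this ordering, a symbol such as environ that is visible in the global scope would be resolved there before any per-library handle is tried.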
Trac metadata
Trac field | Value |
---|---|
Version | 7.6.3 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | high |
Resolution | Unresolved |
Component | Runtime System |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | igloo, simonmar |
Operating system | |
Architecture | |