GHC issueshttps://gitlab.haskell.org/ghc/ghc/-/issues2020-07-08T17:18:09Zhttps://gitlab.haskell.org/ghc/ghc/-/issues/16992compactFixupPointers# fails if compact region blocks are more than 2 GiB apart2020-07-08T17:18:09ZFabian ThorandcompactFixupPointers# fails if compact region blocks are more than 2 GiB apart## Summary
Importing a compact region using `importCompact` from the `ghc-compact` package fails if the addresses of the compact region blocks span across more than 2 GiB.
## Steps to reproduce
The smallest Haskell example for reprodu...## Summary
Importing a compact region using `importCompact` from the `ghc-compact` package fails if the addresses of the compact region blocks span across more than 2 GiB.
## Steps to reproduce
The smallest Haskell example for reproducing the bug I could make is the following:
```haskell
import qualified Data.Compact as Compact
import qualified Data.Compact.Serialize as CompactSerialize
-- | Minimal test case for reproducing compactFixupPointers# bug for large compact regions.
main :: IO ()
main = do
let
large = 1024 * 1024 * 128
largeString = replicate large 'A'
region <- Compact.compact largeString
Compact.compactSize region >>= print
CompactSerialize.writeCompact "/tmp/bug.compact" region
deserialized <- CompactSerialize.unsafeReadCompact "/tmp/bug.compact"
case deserialized of
Left err -> putStrLn err
Right largeString' -> print (Compact.getCompact largeString' == largeString)
```
It uses the `compact` library, but I could also reproduce it with a hand-rolled deserializer calling `GHC.Compact.Serialized.importCompact` directly.
Compiling and running that file results in `Failed to import compact` being printed to stdout.
I traced that error message (which comes from the `compact` library) back to a call to `compactFixupPointers#` in the `ghc-compact` library.
I had a quick look at the corresponding source code of the RTS (in `rts/sm/CNF.c`) and I suspect a faulty order predicate passed to `qsort` in `build_fixup_table` is the root cause of the issue. It is implemented as
```c
static int
cmp_fixup_table_item (const void *e1, const void *e2)
{
const StgWord *w1 = e1;
const StgWord *w2 = e2;
return *w1 - *w2;
}
```
However, on a 64 bit system, `sizeof(int)` is still just 4 bytes while the difference of two `StgWord`s is 8 bytes in size.
If the two pointers being compared are too far apart, an overflow causes the returned order to be the reverse of the actual order.
The following self-contained C program demonstrates this behavior (if compiled on a 64 bit architecture):
```c
#include <stdio.h>
#include <stdint.h>
typedef uint64_t StgWord;
static int
cmp_fixup_table_item (const void *e1, const void *e2)
{
const StgWord *w1 = e1;
const StgWord *w2 = e2;
return *w1 - *w2;
}
static void
check (const StgWord *e1, const StgWord *e2)
{
printf("*e1 = %ld *e2=%ld\n", *e1, *e2);
printf("*e1 < *e2 == %d\n", (int)(*e1 < *e2));
printf("cmp_fixup_table_item(e1,e2) == %d\n\n", cmp_fixup_table_item(e1, e2));
}
int main() {
StgWord examplePointer1 = 1234567;
StgWord examplePointer2 = examplePointer1 + 1234;
StgWord bigOffset = (1L << 31) + 1234L;
StgWord examplePointer3 = examplePointer1 + bigOffset;
check(&examplePointer1, &examplePointer2);
check(&examplePointer1, &examplePointer3);
return 0;
}
```
The output is
```
*e1 = 1234567 *e2=1235801
*e1 < *e2 == 1
cmp_fixup_table_item(e1,e2) == -1234
*e1 = 1234567 *e2=2148719449
*e1 < *e2 == 1
cmp_fixup_table_item(e1,e2) == 2147482414
```
Note how for pointers that are close to each other, it returns a negative number (as it should),
but if the pointers are too far apart, the sign switches.
## Expected behavior
The Haskell program above should print `True`, not an error message.
## Environment
* GHC version used: 8.6.3
Optional:
* Operating System: Ubuntu 18.04
* System Architecture: x86-648.10.2Ömer Sinan AğacanÖmer Sinan Ağacan