integer-gmp passes arguments to __gmpn_gcd_1() in contravention of the docs

I have encountered this issue on GHC HEAD, but it looks like it's been the case for a long time: The implementation of gcdInt#, which is used by gcd for Int#, some other integer types, and Integer when both parameters use the S# "small" constructor, is a wrapper around __gmpn_gcd_1(). The GMP documentation for mpn_gcd() states some input preconditions for its inputs, and the documentation for mpn_gcd_1() comes right after it and is very terse, implying that the same preconditions are assumed. We do in fact guarantee most of those preconditions, in particular that the first input is the larger one, but it specifies that the second input must be odd, and we don't check that. (I'm not quite clear on what we're supposed to do if it isn't, but presumably there's some simple algorithm ...)

This is not just a theoretical problem. GMP has multiple implementations of its algorithms, for multiple architectures, and no doubt each one has slightly different characteristics; I have tested on an x86-64 system, which uses the implementation of mpn_gcd_1() in the file libraries/integer-gmp/gmp/gmpbuild/mpn/x86_64/gcd_1.asm. I don't really want to make sense of the assembly language and figure out why it misbehaves, but we're violating the documentation and the entire mpn_* suite of functions is considered "internal and difficult to use correctly", so we are the ones who need to change, not them.

The problem can be seen as follows:

GHCi, version 7.3.20111124: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> import GHC.Integer.GMP.Internals
Prelude GHC.Integer.GMP.Internals> import GHC.Types
Prelude GHC.Integer.GMP.Internals GHC.Types GHC.Int> import GHC.Integer.GMP.Prim 
Prelude GHC.Integer.GMP.Internals GHC.Types GHC.Int GHC.Integer.GMP.Prim> I# (gcdInt# 0x2B992DDFA23249D6# 0xDE0B6B3A7640000#)
0

This should reproduce on any x86-64 system, although I don't have access to another one to test that. In my setup, I'm using the in-tree GMP, which is GMP 5.0.2 as of this writing.

To elicit this misbehavior without using an internal function is a little harder. It comes up on my system when attempting to build the GHC haddocks, trying to compute a Rational approximation of pi (that process is where I discovered the above magic constants). But it will not reproduce easily for other people. I am still investigating why not, but I suspect that cross-module dictionary-function unfolding, which is implemented with built-in rewrite rules, one for each class, is broken and has not worked on any platform for at least a couple months. This in turn prevents other rewrite rules from firing; constant folding of arithmetic operations can't happen because those rules are written against primops, not against the methods of Num. On investigation, I have found that some compilation settings (profiling, dynamic linking) correctly include the definition, and not just the type, of dictionary functions in interface files, but vanilla compilation does not. I note that a comment in compiler/main/TidyPgm.lhs claims that DFun unfoldings are communicated to the code in compiler/iface/MkIface.lhs by placing them in the binding; but the bindings are part of the CgGuts, which are not passed to mkIface at all (only the ModGuts are). So there needs to be a new mechanism to pass this information, and I'm not sure what it should be, or why it does get included with some settings. I'll file a separate ticket on this when I have more facts, and not just speculation; I mention it here only because it interacts with the gcd bug, both masking it and evoking it in different situations.

The upshot: Anyone seeking to reproduce the behavior will find it does not arise if they invoke "gcd 3141592653589793238 1000000000000000000 :: Integer", because this gets folded at compile-time by a rule that DOES still work. Invoking "gcdInteger 3141592653589793238 1000000000000000000" directly will trigger the bug from the interpreter, but not when compiled, because there is a non-builtin rewrite rule on gcdInteger (not, you will observe, a dictionary function, and hence not affected by the other problem) which folds it. However, because of the aforementioned brokenness of constant folding on arithmetic, I expect that the behavior can be reproduced on all x86-64 platforms, both interpreted and compiled, by invoking "gcdInteger (3141592653589793238 * 1) (1 * 1000000000000000000)".

The code that needs changes for the gcd issue can be found in libraries/integer-gmp/cbits/gmp-wrappers.cmm.

Edited Mar 09, 2019 by Ian Lynagh -

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information