IntegerGmp2
(#9281 (closed))
Design of This wiki page is meant as a scratchpad to describe the plans/ideas behind the integergmp2
rewrite
Motivations

allow to avoid the custom GMP allocator hack, and thus

avoid issues when linking against other C libraries using GMP,

simplify code, as we would perform all heap allocations in Haskell code (and never inside Cmm/C code as its done now),

maybe even remove a few more superfluous temporary heap allocations.

linktime selectable integerlibrary backend (think
ghc make O fintegerlibrary=bsdnt Hello.hs
) simplifies handling on platforms where GMP is not a systemlibrary (mostly OSX and Windows)
 user code can still link with GMP, even though the primitive operation use
bsdnt

Allow to implement (big) Natural number type w/o redundant sign handling

somewhat more robust outofmemory handling (since allocations occur in Haskellland)
Roadmap
For GHC 7.10.1

(done) Switch to
integergmp2
as defaultINTEGER_LIBRARY
 but leave
INTEGER_LIBRARY=integergmp
in place as buildoption
 but leave

(done) Leave the package name as
integergmp
, i.e. for
INTEGER_LIBRARY=integergmp
we end up withintegergmp0.5.1.0
, while for 
INTEGER_LIBRARY=integergmp2
we end up withintegergmp1.0.0.0
.


(maybe) update intree GMP version
(by adding a
.patch
file for the existing GMP 5.0.3 tarball, or by retrieving a new GMP tarball from outsideghc.git
, c.f. mingwconfigure
time autodownloading) 
(maybe) implement
configure
time selectablebsdnt
backend(could be easily backported as outoftree patch from GHC 7.11 branch if there's demand)
For GHC 7.12 (or later)
 all "(maybe)"items from above if they didn't make it into 7.10
 Implement linktime backend selection as described below
Rough Design
Haskellside API Types
  Type representing /raw/ arbitraryprecision Naturals

 This is common type used by 'Natural' and 'Integer'. As this type
 consists of a single constructor wrapping a 'ByteArray#' it can be
 unpacked.

 Essential invariants:

  'ByteArray#' size is an exact multiple of 'Word#' size
  limbs are stored in leastsignificantlimbfirst order,
  the mostsignificant limb must be nonzero, except for
  @0@ which is represented as a 1limb.
data BigNat = BN# ByteArray#
  Invariant: 'Jn#' and 'Jp#' are used iff value doesn't fit in 'SI#'

 Useful properties resulting from the invariants:

  @abs ('S#' _) <= abs ('Jp#' _)@
  @abs ('S#' _) < abs ('Jn#' _)@

data Integer = S# Int#
 ^ iff value in @[minBound::'Int', maxBound::'Int']@ range
 Jp# {# UNPACK #} !BigNat
 ^ iff value in @]maxBound::'Int', +inf[@ range
 Jn# {# UNPACK #} !BigNat
 ^ iff value in @]inf, minBound::'Int'[@ range
deriving (Eq)
  Type representing arbitraryprecision Naturals

 Invariant: 'NatJ#' is used iff when value doesn't fit in 'NatS#'
data Natural = NatS# Word#  ^ @[0, maxBound::Word]@
 NatJ# {# UNPACK #} !BigNat  ^ @]maxBound::GmpLimb, +inf[@
deriving (Eq,Ord)

BigNat
is an internal common type toInteger
andNatural
and not exposed throughbase

Natural
(#9818 (closed)) finally fills the gap of the missing unsignedInteger
counterpart inbase
 requires fallback implementation (on top of
Integer
) inbase
ifinteger{simple,gmp}
are still to be supported
 requires fallback implementation (on top of

Most operations can avoid calling into the FFI backend integerlibrary, if the values don't exceed the
SI#
/NatS#
capacity. This also means that the performance difference between using e.g.bsdnt
orgmp
is only noticeable for large operands.
Linktime integerlibrary selection

The basic requirements to be able to write an adapter for a FFI integerlibrary for
integergmp2
: no memory allocation of result BigNums (but rather writes into callerallocated buffers)
 BigNums are represented as machinewordsized arrays + length information

integergmp2
needs only about a dozen of rather generic lowlevel primitive arithmetic BigNum operations, such as e.g. mp_limb_t mpn_add_1 (mp_limb_t *rp, const mp_limb_t *s1p, mp_size_t n,  mp_limb_t s2limb) foreign import ccall unsafe "gmp.h __gmpn_add_1" c_mpn_add_1 :: MutableByteArray# s > ByteArray# > GmpSize# > GmpLimb# > IO GmpLimb  mp_limb_t mpn_add (mp_limb_t *rp, const mp_limb_t *s1p, mp_size_t s1n,  const mp_limb_t *s2p, mp_size_t s2n) foreign import ccall unsafe "gmp.h __gmpn_add" c_mpn_add :: MutableByteArray# s > ByteArray# > GmpSize# > ByteArray# > GmpSize# > IO GmpLimb

The plan is to write a small shimlike C library that implements a uniform adapter for
integergmp2
to call into, that wraps libraries such as GMP orbsdnt
. This small adapter is then selected at linktime byghc make
similiar to howghc
selects one of its various runtimes (threaded, debugging, nonthreaded, ...)
Simplifying 3rd party GMPaddon packages

integergmp1.0.0
provides an installed<ghcgmp.h>
header which is supposed to either include the systemwide<gmp.h>
header, or correspond to the intree's GMP headers content. 
integergmp1.0.0
persists metainformation about the buildtime GMP library used inHsIntegerGmp.h
.
Here's an example for the definitions created for the intree GMP:
#define GHC_GMP_INTREE 1 #define GHC_GMP_VERSION_MJ 5 #define GHC_GMP_VERSION_MI 0 #define GHC_GMP_VERSION_PL 4 #define GHC_GMP_VERSION (5 * 10000 + 0 * 100 + 4)
And here's an example for a systeminstalled GMP:
#define GHC_GMP_INTREE 0 #define GHC_GMP_VERSION_MJ 6 #define GHC_GMP_VERSION_MI 0 #define GHC_GMP_VERSION_PL 0 #define GHC_GMP_VERSION (6 * 10000 + 0 * 100 + 0)