Skip to content
Snippets Groups Projects
  1. Jan 24, 2000
  2. Jan 23, 2000
    • AndyGill's avatar
      [project @ 2000-01-23 09:55:17 by andy] · ee152a67
      AndyGill authored
      GHC now uses the "Hugs" split function, which is believed to have
      better behaviour: it is not unsafe, is deterministic, and works
      better with QuickCheck, a major client.
      
      Rational for the Record, quoted from Mark Jones mail on the Hugs list:
      
      [Mark Jones]
      A couple of months ago, John Hughes sent me mail about a problem that
      he had uncovered with the implementation of the Random library in Hugs.
      He had been using the "split" function in an attempt to generate a
      stream of random number generators, each of which he hoped would be
      different from the others.  But instead he found that he actually
      ended with many different copies of the *same* random number generator.
      A disappointing, and frustratingly non-random result.
      
      If you don't happen to recall, split is a member of the RandomGen class,
      with type RandomGen g => g -> (g,g); it takes a single random number
      generator as its argument, and returns a pair of two new generators as
      its result.  The only thing that the specification requires is that the
      two generators returned are (a) distinct and (b) `independently robust'
      from a statistical point of view.  To the best of my knowledge, the
      implementation in Hugs meets this modest specification.  Sadly, assuming
      only this specification, you cannot write the function that John was
      looking for and be sure that it will generate more than two different
      generators.
      
      For example, the specification allows even the following trivial
      implementation for split:  split _ = (g1, g2), where g1 and g2 are some
      arbitrary but constant pair of distinct, and independently robust
      generators.  With this implementation, you can split as often as you
      want and you'll never get more that two generators.
      
      Hugs and GHC (as far as I can tell) both use definitions of the form:
      
         split g = (g, f g)
      
      for some function f.  (My understanding of the code in GHC is that it
      uses an unsafe function for f, breaking referential transparency; I hope
      the optimizer knows about this.)  Note that this definition returns the
      argument as a result; the specification doesn't prohibit that; all it
      requires is that the two results returned be distinct.  But if you try
      to generate a list of n different generators using:
      
         take n (iterate (fst . split) g)
      
      then you will be sorely disappointed; you might as well have written
      replicate n g.  (On the other hand, if you were lucky enough to have
      used (snd . split), instead of (fst . split), then you wouldn't have
      noticed the problem ...)
      
      I know very little about the mathematics or pragmatics of random
      number generators, so I'm not sure that I know how to fix this
      problem.  However, starting from this position of ignorance, I have
      hacked up a new version of "split" for the standard "StdGen" that
      will appear in the next release of Hugs (real soon now!).  Judging
      from the tests that I've tried so far, it seems to work much
      better than the old version.  That said:
      
       - Take care if you use Random.split in your programs, because it
         may not do what you expect.
      
       - There should probably be an errata about this for the Haskell 98
         library report ... if somebody can figure out what it should say.
      
       - If you use Hugs, be aware that the implementation of Random.split
         was hacked up by someone who has no way of justifying that
         implementation, beyond some simple experiments.
      
       - If you know something about the mechanics of random number
         generators, here's an area where Haskell specifications and
         implementations could benefit from your knowledge!
      
      All the best,
      Mark
      [end quote]
      ee152a67
  3. Jan 22, 2000
  4. Jan 20, 2000
  5. Jan 19, 2000
  6. Jan 18, 2000
Loading