JSON Sector Data Service

tjoneslo · Dec 16, 2014

robject said:
Does the code use 32 bit signed integers? 0xffffffff makes me think they're forced to be so, but... well, let's try it and see.

Python uses a big integer internally for integer values. The documentation warns if you are left shifting values that you want constrained to a specific size (e.g. 32 bits), you need to mask them as I've done.

I would suggest you aim to duplicate Joshua Bell's experiment here, which does it with JavaScript: http://calormen.com/tmp/prng/burtle.html

The "nice" thing about using that as a target is that we get to use 64-bit IEEE754 floating point numbers i.e. C doubles. I should ask Joshua to open up his implementation to using those without forcing them to 32 bit values.

I was following the original from the source page which uses 32bit values and two shifts. I'm not sure of the advantage of using 64 bits.

Also as Gist for later reference: https://gist.github.com/tjoneslo/2c3b472f641bab4069b7

inexorabletash · Dec 16, 2014

robject said:
JavaScript (thanks to Joshua Bell)

Code:

function rot(x, k) { return (x << k) | (x >> (32 - k)); }

There's a bug. Sorry about that. That should be:

Code:

  function rot(x, k) { return (x << k) | (x >>> (32 - k)); }

In JS, >> sign extends, >>> shifts in zeros.

tjoneslo said:
Here is the Python version, but I'm not convinced that it returns the same set of random numbers your implementation do given the same seed.

Code:

class BurtleRandom(object): def __init__(self, seed): self.random_seed = [0xf1ea5eed, int(seed), int(seed), int(seed)]

The other implementations generate and discard 20 values before considering the generator initialized. If I tweak your implementation to be:

Code:

class BurtleRandom(object):

    def __init__(self, seed):
        self.random_seed = [0xf1ea5eed, int(seed), int(seed), int(seed)]
        for _ in range(20):
            self.value()

... then I get the same output from the C, (corrected) JS and Python.

inexorabletash · Dec 16, 2014

robject said:
The "nice" thing about using that as a target is that we get to use 64-bit IEEE754 floating point numbers i.e. C doubles. I should ask Joshua to open up his implementation to using those without forcing them to 32 bit values.

I'm not sure what you mean here, either. The PRNG at http://www.burtleburtle.net/bob/rand/smallprng.html is 32-bit - the "unsigned long int" may be bogus on 64-bit depending on your compiler; in modern terms it should be uint32_t.

A 64-bit IEEE754 float gives you 2^53 bits of precision, so that's the most you can reason about without running into floating point weirdness. And there's only one NaN value in JS so you can't even store arbitrary 64-bit patterns in a JS number. The standard way of dealing with 64-bit values in JS is to use a type that uses two numbers as the hi/lo 32 halves. But... why are we even talking about this?

robject · Dec 17, 2014

I also noticed that Burtle is 32 bit only. I'll be casting around for a 'bignum' version, or a 64 bit long version, etc etc.

Oh, there's the 64 bit version further down on his webpage.

If you want to use 8-byte terms instead of 4-byte terms, the proper rotate amounts are (39,11) for the two-rotate version (yielding at least 13.3 bits of avalanche after 5 rounds) or (7,13,37) for the three-rotate version (yielding 18.4 bits of avalanche after 5 rounds). I think I'd got with the three-rotate version, because the ideal is 32 bits of avalanche, and 13.3 isn't even half of that.

...Quite likely 64-bit deserves a whole different approach, not just different constants.

robject · Dec 17, 2014

inexorabletash said:
But... why are we even talking about this?

I think I was worried about the input seed being only 32 bits wide.

inexorabletash · Dec 17, 2014

robject said:
I think I was worried about the input seed being only 32 bits wide.

Ah! Right, wanting more than 2^32 sequences, or assuming we suck at hashing coordinates.

Well... There's nothing in the algorithm that forces you to init the state vector with 3 copies of the seed. It should (he says foolishly) be fine to init with more unique bits. Someone may want to verify that it doesn't impact the randomness.

robject · Dec 17, 2014

inexorabletash said:
There's nothing in the algorithm that forces you to init the state vector with 3 copies of the seed. It should (he says foolishly) be fine to init with more unique bits.

THAT would be very interesting indeed.

I was considering going back to simply using a hash algorithm, e.g. SHA1 or something. I didn't want to do that because it created a need for a rule on when and how to hash again for more bits. Plus, hash algorithms tend to be slower. Sounds like trouble, but maybe it's less trouble than Burtle.

Just to say what we're all sort-of thinking:

Code:

sub rand1d(): long whatever
{
   my width = 3;
   return bit_bucket.getInt(width) % 6;
}

...

object BitBucket bit_bucket = 
{
   ...

   sub getInt( width: int ): long whatever
   {
      my[] bits = self.getBits( width );
      return self.generateIntFrom( bits );
   }

   sub getBits( width: int ): []
   {
      get_more_bits() if self.isLow();
      my[] bits = self.bits.pop(width);
      return bits;
   }

   sub get_more_bits()
   {
      self.seed.increment();
      my[] newbits = self.hash_on( self.seed );
      self.bits.push( newbits );
   }
};

JSON Sector Data Service

tjoneslo

SOC-14 1K

inexorabletash

SOC-14 1K

inexorabletash

SOC-14 1K

robject

SOC-14 10K

robject

SOC-14 10K

inexorabletash

SOC-14 1K

robject

SOC-14 10K

Similar threads