Random/LFSR on P2

13468914

Comments

  • What summing carry bit? There is none in Chip's scheme.
  • cgracey wrote: »
    As you pointed out, though, the carry output of the final adder would seem to be of good quality, but it must not be if they didn't mention it, because it would have been a great way to get a full 64 bits out of the PRNG.
    The defence I have here is that the documentation never discusses this in a hardware context. In fact they only discuss high level languages without direct access to a carry result.
    As long as websites don't require javascript in the browser then I'm happy.
  • evanh wrote: »
    Hmm, I guess that means we have no clear answer for the summing carry bit either but I'd imagine it is a function of the lower bits.

    Do you know who can resolve this question? Ahle2! He already ran the implementation we're using through the Dieharder test battery.

    Ahle2, could you please run the Dieharder test on sum[64:01] of the 's0+s1' value. That would include the carry bit that is current being ignored. We need to keep ignoring sum[0]. Maybe we COULD get 64 good bits out of the PRNG.
  • Heater. wrote: »
    What summing carry bit? There is none in Chip's scheme.
    But it could be added with ease. The joys of designing the hardware the way you want it. :)
    As long as websites don't require javascript in the browser then I'm happy.
  • Heater. wrote: »
    What summing carry bit? There is none in Chip's scheme.

    It's just the carry output of the 64-bit add of s0 and s1.
  • Ahle2Ahle2 Posts: 906
    edited March 5 Vote Up0Vote Down
    cgracey wrote: »
    evanh wrote: »
    Hmm, I guess that means we have no clear answer for the summing carry bit either but I'd imagine it is a function of the lower bits.

    Do you know who can resolve this question? Ahle2! He already ran the implementation we're using through the Dieharder test battery.

    Ahle2, could you please run the Dieharder test on sum[64:01] of the 's0+s1' value. That would include the carry bit that is current being ignored. We need to keep ignoring sum[0]. Maybe we COULD get 64 good bits out of the PRNG.
    Of course I can! :)

    I have successfully compiled Practrand on my Linux machine now. It is supposed to be the ultimate PRNG test battery according to the "PRNG community". I will run every variant of PRNG's that I have previously run with Dieharder + the 'Xoroshiro128+ sum of s0+s1' using it! (I will run the sum with Dieharder as well)

    Just to be clear, do you mean "s0[64:01] + s1[64:01]"?

    SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
    AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
    SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
    Propeller chiptune player: Thread
  • Hehe, I've just solved a lib dependency problem where the package manager thought one of the packages was broken and absolutely insisted on removing it. After going though the rigmarole of reinstating it manually a couple of times I found instructions on how to remove dependency entries in either the package itself or the subsequent forced install. I chose the latter as it didn't need repackaged.

    Now I can simple install dieharder from repository ... and do all the belated updates too ...
    As long as websites don't require javascript in the browser then I'm happy.
  • cgraceycgracey Posts: 7,090
    edited March 5 Vote Up0Vote Down
    Ahle2 wrote: »
    cgracey wrote: »
    evanh wrote: »
    Hmm, I guess that means we have no clear answer for the summing carry bit either but I'd imagine it is a function of the lower bits.

    Do you know who can resolve this question? Ahle2! He already ran the implementation we're using through the Dieharder test battery.

    Ahle2, could you please run the Dieharder test on sum[64:01] of the 's0+s1' value. That would include the carry bit that is current being ignored. We need to keep ignoring sum[0]. Maybe we COULD get 64 good bits out of the PRNG.
    Of course I can! :)

    I have successfully compiled Practrand on my Linux machine now. It is supposed to be the ultimate PRNG test battery according to the "PRNG community". I will run every variant of PRNG's that I have previously run with Dieharder + the 'Xoroshiro128+ sum of s0+s1' using it! (I will run the sum with Dieharder as well)

    Just to be clear, do you mean "s0[64:01] + s1[64:01]"?

    No. I mean that when you compute s0+s1, you actually get a 65-bit result, with the bits ranging in position from 64 down to 0. We want to know what the integrity of bits 64 down to 1 are.

    If you think in Verilog, it looks like this:

    wire [63:0] s0, s1;
    wire [64:0] ss = s0 + s1;
    wire [63:0] rndbits = ss[64:1];

    It's rndbits that we're after.

    If you have a 65-bit adder, we are interested in: (s0 + s1) >> 1
  • evanhevanh Posts: 3,608
    edited March 5 Vote Up0Vote Down
    Ahle2 wrote: »
    Just to be clear, do you mean "s0[64:01] + s1[64:01]"?
    No, it's: ss[64:00] = s0[63:00] + s1[63:00] with carry. Carry bit goes in ss[64]. It's addition's equivalent of the extended multiply, 64-bit = 32-bit x 32-bit.
    As long as websites don't require javascript in the browser then I'm happy.
  • Ahle2Ahle2 Posts: 906
    edited March 5 Vote Up0Vote Down
    Okey...
    GCC has built in support for 128 bits types (the C++ standard has not) so it should be easy!
    SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
    AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
    SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
    Propeller chiptune player: Thread
  • Heater.Heater. Posts: 18,615
    edited March 5 Vote Up0Vote Down
    Ah, OK, that sum.

    There is a bias in that carry bit.

    Consider that if S0 and S1 were only 8 bits wide adding them would produce a carry only 49.8% of the time. If they were 16 bits it would only be 49.999237% of the time.

    There must be a way to calculate the bias of the carry bit when adding random numbers of a given width. I have no idea what it is but a bit of brute force experimental mathematics produces the following:

    Width: 1 Carries: 25% Bias: 25%
    Width: 2 Carries: 37.5% Bias: 12.5%
    Width: 3 Carries: 43.75% Bias: 6.25%
    Width: 4 Carries: 46.875% Bias: 3.125%
    Width: 5 Carries: 48.4375% Bias: 1.5625%
    Width: 6 Carries: 49.21875% Bias: 0.78125%
    Width: 7 Carries: 49.609375% Bias: 0.390625%
    Width: 8 Carries: 49.8046875% Bias: 0.1953125%
    Width: 9 Carries: 49.90234375% Bias: 0.09765625%
    Width: 10 Carries: 49.951171875% Bias: 0.048828125%
    Width: 11 Carries: 49.9755859375% Bias: 0.0244140625%
    Width: 12 Carries: 49.98779296875% Bias: 0.01220703125%
    Width: 13 Carries: 49.993896484375% Bias: 0.006103515625%
    Width: 14 Carries: 49.9969482421875% Bias: 0.0030517578125%
    Width: 15 Carries: 49.99847412109375% Bias: 0.00152587890625%
    Width: 16 Carries: 49.999237060546875% Bias: 0.000762939453125%

    We see the bias in the carry bit is halving with every extra bit of width. So we can speculate that the bias for a 64 bit carry is about 25 / Math.pow(2, 63), or 10E-18 percent.

    Which is pretty huge considering we are working with a high quality PRNG with a period of 10E128 or whatever it is.

    Would that bias ever show up in Dieharder? No idea.
  • Ahle2Ahle2 Posts: 906
    edited March 5 Vote Up0Vote Down
    I tried clocking the original P2 LFSR 32 (as proposed by Phil) steps for each random number output to Dieharder. Amazingly most tests passes. Doing the same with Practrand fails everything while xoroshiro passes! It really seems like Practrand is superior for finding out which PRNG among good ones that stands out.
    SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
    AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
    SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
    Propeller chiptune player: Thread
  • Heater. wrote: »
    Ah, OK, that sum.

    There is a bias in that carry bit.

    Consider that if S0 and S1 were only 8 bits wide adding them would produce a carry only 49.8% of the time. If they were 16 bits it would only be 49.999237% of the time.

    There must be a way to calculate the bias of the carry bit when adding random numbers of a given width. I have no idea what it is but a bit of brute force experimental mathematics produces the following:

    Width: 1 Carries: 25% Bias: 25%
    Width: 2 Carries: 37.5% Bias: 12.5%
    Width: 3 Carries: 43.75% Bias: 6.25%
    Width: 4 Carries: 46.875% Bias: 3.125%
    Width: 5 Carries: 48.4375% Bias: 1.5625%
    Width: 6 Carries: 49.21875% Bias: 0.78125%
    Width: 7 Carries: 49.609375% Bias: 0.390625%
    Width: 8 Carries: 49.8046875% Bias: 0.1953125%
    Width: 9 Carries: 49.90234375% Bias: 0.09765625%
    Width: 10 Carries: 49.951171875% Bias: 0.048828125%
    Width: 11 Carries: 49.9755859375% Bias: 0.0244140625%
    Width: 12 Carries: 49.98779296875% Bias: 0.01220703125%
    Width: 13 Carries: 49.993896484375% Bias: 0.006103515625%
    Width: 14 Carries: 49.9969482421875% Bias: 0.0030517578125%
    Width: 15 Carries: 49.99847412109375% Bias: 0.00152587890625%
    Width: 16 Carries: 49.999237060546875% Bias: 0.000762939453125%

    We see the bias in the carry bit is halving with every extra bit of width. So we can speculate that the bias for a 64 bit carry is about 25 / Math.pow(2, 63), or 1 in 10E-18.

    Which is pretty huge considering we are working with a high quality PRNG with a period of 10E128 or whatever it is.

    Would that bias ever show up in Dieharder? No idea.

    Unless Dieharder could run the test for 10e29 years at 160MHz sample rate, the bias wouldn't show up.
  • Ahle2 wrote: »
    I tried clocking the original P2 LFSR 32 (as proposed by Phil) steps for each random number output to Dieharder. Amazingly most tests passes. Doing the same with Practrand fails everything while xoroshiro passes! It really seems like Practrand is superior for finding out which PRNG among good ones that stands out.

    With 128-bit types, just do: (so + s1) >> 1
  • Heater.Heater. Posts: 18,615
    edited March 5 Vote Up0Vote Down
    Edit: Deleted brain fart.
  • Yeah, OK, 10e29 years should be good enough for anybody :)


  • Heater. wrote: »
    Width: 1 Carries: 25% Bias: 25%
    Width: 2 Carries: 37.5% Bias: 12.5%
    Width: 3 Carries: 43.75% Bias: 6.25%
    Width: 4 Carries: 46.875% Bias: 3.125%
    Width: 5 Carries: 48.4375% Bias: 1.5625%
    Width: 6 Carries: 49.21875% Bias: 0.78125%
    Width: 7 Carries: 49.609375% Bias: 0.390625%
    Width: 8 Carries: 49.8046875% Bias: 0.1953125%
    Width: 9 Carries: 49.90234375% Bias: 0.09765625%
    ...
    Cool! I've wondered about that sort of thing before but didn't have a clue of significance. Mulling over a 1-bit addition in my head made that list very clear. Thanks Heater.
    As long as websites don't require javascript in the browser then I'm happy.
  • Heater.Heater. Posts: 18,615
    edited March 5 Vote Up0Vote Down
    The only remaining question then is: Is that carry bit, bit 64, any more or less "random" than the bit 0 ?
  • Heater.Heater. Posts: 18,615
    edited March 6 Vote Up0Vote Down
    Could not sleep so I thought I'd have a play with dieharder.

    I changed my xoroshiro128+ to do 128 bit addition S0 + S1 and shift down by 1. Like so:
    uint64_t next(void) {
    	const uint64_t s0 = s[0];
    	uint64_t s1 = s[1];
    	const uint64_t result = ((unsigned __int128)s0 + (unsigned __int128)s1) >> 1;
    
    	s1 ^= s0;
    	s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14); // a, b
    	s[1] = rotl(s1, 36); // c
    
    	return result;
    }
    
    The test harness just writes the binary results of next() to standard out. Like so:
    #include "xoroshiro128plus.c"
    
    // Seed MUST not be all zero.
    #define SEED_0 ((uint64_t)0x1)
    #define SEED_1 ((uint64_t)0x0) 
    
    int main(int argc, char* argv[])
    {
        uint64_t random64;
        int i;
    
        // Seed the state array
        s[0] = SEED_0;  
        s[1] = SEED_1;
    
        // Print some randomness 
        while(1)
        {
            random64 = next();
            // printf("%016jx\n", (uintmax_t)(random64 >> 1));
            for (i = 0; i < 8; i++) {
                putchar(random64 >> i * 8);
            }
        }
        return 0;
    }
    

    I pipe the output into dieharder like so:
    ./xoroshiro128plus-test | dieharder -g200 -a
    

    Sadly it fails miserably:
    $ ./xoroshiro128plus-test | dieharder -g200 -a
    #=============================================================================#
    #            dieharder version 3.31.1 Copyright 2003 Robert G. Brown          #
    #=============================================================================#
       rng_name    |rands/second|   Seed   |
    stdin_input_raw|  2.46e+07  |2926663896|
    #=============================================================================#
            test_name   |ntup| tsamples |psamples|  p-value |Assessment
    #=============================================================================#
       diehard_birthdays|   0|       100|     100|0.90721666|  PASSED
          diehard_operm5|   0|   1000000|     100|0.00000000|  FAILED
      diehard_rank_32x32|   0|     40000|     100|0.81803211|  PASSED
        diehard_rank_6x8|   0|    100000|     100|0.67677153|  PASSED
       diehard_bitstream|   0|   2097152|     100|0.00000000|  FAILED
            diehard_opso|   0|   2097152|     100|0.91666900|  PASSED
            diehard_oqso|   0|   2097152|     100|0.21512226|  PASSED
             diehard_dna|   0|   2097152|     100|0.00000000|  FAILED
    diehard_count_1s_str|   0|    256000|     100|0.00159028|   WEAK
    diehard_count_1s_byt|   0|    256000|     100|0.82748065|  PASSED
     diehard_parking_lot|   0|     12000|     100|0.00000000|  FAILED
        diehard_2dsphere|   2|      8000|     100|0.08446717|  PASSED
        diehard_3dsphere|   3|      4000|     100|0.17783008|  PASSED
         diehard_squeeze|   0|    100000|     100|0.00000000|  FAILED
            diehard_sums|   0|       100|     100|0.02361403|  PASSED
            diehard_runs|   0|    100000|     100|0.00000000|  FAILED
            diehard_runs|   0|    100000|     100|0.00000000|  FAILED
           diehard_craps|   0|    200000|     100|0.00000000|  FAILED
           diehard_craps|   0|    200000|     100|0.00000000|  FAILED
     marsaglia_tsang_gcd|   0|  10000000|     100|0.00000000|  FAILED
     marsaglia_tsang_gcd|   0|  10000000|     100|0.33508574|  PASSED
             sts_monobit|   1|    100000|     100|0.29833481|  PASSED
                sts_runs|   2|    100000|     100|0.00000000|  FAILED
              sts_serial|   1|    100000|     100|0.75364224|  PASSED
              sts_serial|   2|    100000|     100|0.00000000|  FAILED
              sts_serial|   3|    100000|     100|0.00000000|  FAILED
              sts_serial|   3|    100000|     100|0.00000000|  FAILED
              sts_serial|   4|    100000|     100|0.00000000|  FAILED
              sts_serial|   4|    100000|     100|0.00000000|  FAILED
              sts_serial|   5|    100000|     100|0.00000000|  FAILED
              sts_serial|   5|    100000|     100|0.00001544|   WEAK
              sts_serial|   6|    100000|     100|0.00000000|  FAILED
              sts_serial|   6|    100000|     100|0.08702530|  PASSED
              sts_serial|   7|    100000|     100|0.00000000|  FAILED
              sts_serial|   7|    100000|     100|0.76972737|  PASSED
              sts_serial|   8|    100000|     100|0.00000000|  FAILED
              sts_serial|   8|    100000|     100|0.90452649|  PASSED
              sts_serial|   9|    100000|     100|0.00000000|  FAILED
              sts_serial|   9|    100000|     100|0.29567132|  PASSED
              sts_serial|  10|    100000|     100|0.00000000|  FAILED
    
  • If I remove that ">> 1", keeping the 128 bit addition and using bits [63:0] then everything in dieharder starts passing again. So I guess the problem is not elsewhere.

    Basically that carry cannot be used.

    Unless I have slipped up again...
  • evanhevanh Posts: 3,608
    edited March 6 Vote Up0Vote Down
    Ouch! Do it again with post-summing bit 63 removal so as to prove it's not a flaw in the 128-bit conversions.
    As long as websites don't require javascript in the browser then I'm happy.
  • Heater.Heater. Posts: 18,615
    edited March 6 Vote Up0Vote Down
    evanh,

    What do you mean by "post-summing bit 63 removal" ?
  • Sorry for the delay, I'd written that while at work. I mean just knock out the msb of the returned result, eg:

    return result & 0x7fffffffffffffff;

    As long as websites don't require javascript in the browser then I'm happy.
  • But, but, the point of the exercise is to get a 64 bit result. Using the carry rather than bit 0. Which is said to be not so statistically "random" as the other 63 bits.

    There is no way dieharde will like an input of 64 bit integers each with it's top bit set zero.

    As it stands it looks like we may as well use that bit zero as it performs very well.
  • That's what the existing solution is, only 63 bits. I'm just trying to sort out where the problem is with the this experimental 64-bit variant.
    As long as websites don't require javascript in the browser then I'm happy.
  • Yes, the existing solution uses only the high 63 bits. Why? Because the bit zero is said to be not so statistically random looking as the rest.

    In posts above Chip wondered if that carry bit could be used instead. Thus making a 64 bit result. And was asking for that solution to be tested with Dieharder and whatever statistical tests. I think his idea is that when distributing bits to COGs with his "rats nest" of connections it would be nice to have a multiple of 8 bits and have all the bits used the same number of times. Or something like that.

    I pointed out that there is a tiny bias in the carry bit. But we reckoned on it being so small as not to show up in a test. That left the question mark over it's statistical "randomness".

    Then, being unable to sleep, I could not resist trying carry bit experiment myself.

    Unless I have made a mistake I think we may as well use the original 64 bit result.
  • Ahle2Ahle2 Posts: 906
    edited March 6 Vote Up0Vote Down
    Heater, It passes with flying colors for me!!

    Here's my PRNG code
    uint64_t PRNG::getXoroshiro()
    {
        const uint64_t s0 = s[0];
        uint64_t s1 = s[1];
        const unsigned __int128 result = static_cast<unsigned __int128>(s0) + static_cast<unsigned __int128>(s1);
    
        s1 ^= s0;
        s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14); // a, b
        s[1] = rotl(s1, 36); // c
    
        return static_cast<uint64_t>(result >> 1);
    }
    

    The main code
                while(1)
                {
                    auto random = prng.getXoroshiro();
                    std::cout << static_cast<char>(random >> 24);
                    std::cout << static_cast<char>(random >> 16);
                    std::cout << static_cast<char>(random >> 8);
                    std::cout << static_cast<char>(random);
                }
    
    SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
    AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
    SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
    Propeller chiptune player: Thread
  • Ahle2Ahle2 Posts: 906
    edited March 6 Vote Up0Vote Down
    Okay... I made a mistake as seen in my code above. Only passing 32 bits of the 64 to stdout. When I ran it again with all bits, it began failing misserable!

    Here's the output from Practrand
    PE 1 | ./RNG_test stdin
    RNG_test using PractRand version 0.93
    RNG = RNG_stdin, seed = 0xa4eaaac4
    test set = normal, folding = standard(unknown format)
    
    rng=RNG_stdin, seed=0xa4eaaac4
    length= 128 megabytes (2^27 bytes), time= 3.8 seconds
      Test Name                         Raw       Processed     Evaluation
      BCFN(2+0,13-3,T)                  R= +74.9  p =  3.0e-35    FAIL !!!       
      BCFN(2+1,13-3,T)                  R= +18.9  p =  9.6e-9    VERY SUSPICIOUS 
      DC6-9x1Bytes-1                    R= +1413  p =  6.3e-868   FAIL !!!!!!!   
      Gap-16:A                          R=+149.1  p =  1.3e-126   FAIL !!!!!     
      Gap-16:B                          R=+367.1  p =  6.1e-294   FAIL !!!!!!    
      FPF-14+6/16:(0,14-0)              R= +5478  p =  8e-5069    FAIL !!!!!!!!  
      FPF-14+6/16:(1,14-0)              R= +2754  p =  5e-2548    FAIL !!!!!!!!  
      FPF-14+6/16:(2,14-0)              R= +1357  p =  2e-1255    FAIL !!!!!!!!  
      FPF-14+6/16:(3,14-1)              R=+967.8  p =  1.7e-857   FAIL !!!!!!!   
      FPF-14+6/16:(4,14-2)              R=+655.1  p =  1.0e-572   FAIL !!!!!!!   
      FPF-14+6/16:(5,14-2)              R=+254.4  p =  3.0e-222   FAIL !!!!!!    
      FPF-14+6/16:all                   R= +5306  p =  2e-4979    FAIL !!!!!!!!  
      FPF-14+6/16:all2                  R=+9738850 p = 0           FAIL !!!!!!!!  
      FPF-14+6/16:cross                 R= +5198  p =  2e-4373    FAIL !!!!!!!!  
      [Low1/8]Gap-16:B                  R=  +4.4  p =  1.0e-3   unusual          
      ...and 136 test result(s) without anomalies
    
    
    SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
    AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
    SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
    Propeller chiptune player: Thread
  • Heater,
    const uint64_t result = ((unsigned __int128)s0 + (unsigned __int128)s1) >> 1;
    

    I first thought that your code failed because of the implicit cast to the 64 bit result variable (according to the standard you can never know what the compiler decides to do under the hood), but obviously it fails in both Practrand and Dieharder for me as well using explicit casting in C++ 11;


    SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
    AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
    SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
    Propeller chiptune player: Thread
  • Yes, but you are not using all 64 bits of the output, only the low 32.

    Here is my main code:
        while(1)
        {
            random64 = next();
            for (i = 0; i < 8; i++) {
                putchar(random64 >> (i * 8));
            }
        }
    
Sign In or Register to comment.