Random/LFSR on P2

Comments

  • Heater. wrote: »
    True enough.

    It's just that when I played with the KISS32 PRNG it would take forever to start looking "random" when seeded with zero or a low entropy seed. Much better to have at least an equal number of zero and one bits.

    So I had to try the 1, 0 seed with xoroshiro128+. It gets into something looking "random" much quicker. Like so:

    0000000000000001
    0080001000004001
    0008402018000121
    8080563010444122
    09d0240b1809c401
    01d82012758940e2
    69b05d703207c544
    df59215fd8d25ee6
    9a652772eb590ca2
    4ba7fb3a655fc1b1
    a44648618c9bfe2c
    d89157b50d9ced27
    431bed2f5777656a

    Does the verilog/PASM version generate the same sequence?

    Yes!!! I'm able to output the 16 LSBs and they match your sequence. Here's what I saw:

    0001
    4001
    0121
    4122
    C401
    40E2
    C544
    5EE6
    0CA2
    C1B1
    FE2C
    ED27
    656A
    7508
    D26F
    44F2
    80C3
    4734
    4DCE
    1CB8
    364B
    3248
    E5DB
    A465
    40D1
    A65F
    CE36
    0C5E
    2CBD
    188D
    66F4
    8A00
    2225
    04EE
    3834
    9344
    602A
    7277
    C0EA
    CAD8
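
The check above pairs each of Heater's 64-bit values with the 16 LSBs read from the hardware. A minimal C sketch of that comparison, assuming the xoroshiro128+ step with the 55/14/36 constants quoted later in the thread and the seed s[0] = 1, s[1] = 0. The names `xoro_state`, `rotl64`, `xoro_next` and `low16_matches` are made up for illustration and do not come from the thread:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the 128-bit generator state. */
typedef struct { uint64_t s0, s1; } xoro_state;

/* 64-bit left rotate, valid for 0 < k < 64. */
uint64_t rotl64(uint64_t x, int k)
{
    return (x << k) | (x >> (64 - k));
}

/* One xoroshiro128+ step (constants 55, 14, 36). */
uint64_t xoro_next(xoro_state *st)
{
    const uint64_t s0 = st->s0;
    uint64_t s1 = st->s1;
    const uint64_t result = s0 + s1;

    s1 ^= s0;
    st->s0 = rotl64(s0, 55) ^ s1 ^ (s1 << 14);
    st->s1 = rotl64(s1, 36);
    return result;
}

/* Returns 1 if the low 16 bits of the first n outputs equal
   expected[0..n-1], otherwise 0. The state is passed by value,
   so the caller's copy is not advanced. */
int low16_matches(xoro_state st, const uint16_t *expected, size_t n)
{
    assert(st.s0 != 0 || st.s1 != 0);   /* all-zero state never changes */
    for (size_t i = 0; i < n; i++) {
        if ((uint16_t)xoro_next(&st) != expected[i])
            return 0;
    }
    return 1;
}
```

Called with the seed {1, 0} and the first few hardware values (0x0001, 0x4001, 0x0121, 0x4122), `low16_matches` returns 1, matching the low four hex digits of the first four 64-bit values listed above.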
  • Ahle2 Posts: 906
    edited March 3
    Main code
    int main(int argc, char *argv[])
    {
        PRNG prng;
        uint64_t result;
    
        for(uint32_t i = 0; i < 12345; ++i)
        {
            result = prng.getXoroshiro() >> 1;
        }
    
        std::cout << std::hex << result;
    
        return 0;
    }
    

    In the PRNG class constructor
        s[0] = 1;
        s[1] = 0;
    

    PRNG get method
    uint64_t PRNG::getXoroshiro()
    {
        const uint64_t s0 = s[0];
        uint64_t s1 = s[1];
        const uint64_t result = s0 + s1;
    
        s1 ^= s0;
        s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14); // a, b
        s[1] = rotl(s1, 36); // c
    
        return result;
    }
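
The get method calls `rotl`, which the post does not show. A minimal sketch, assuming it is the usual 64-bit left rotate; the name `rotl64` is illustrative:

```c
#include <assert.h>
#include <stdint.h>

/* Rotate x left by k bits. Restricted to 0 < k < 64 because shifting
   a 64-bit value by 64 is undefined behaviour in C. */
uint64_t rotl64(uint64_t x, int k)
{
    assert(k > 0 && k < 64);
    return (x << k) | (x >> (64 - k));
}
```

With this helper, `rotl64(1, 55)` moves bit 0 up to bit 55, which is why the second output for the 1, 0 seed already has a high bit set.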
    
    SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
    AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
    SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
    Propeller chiptune player: Thread
  • evanh Posts: 3,960
    edited March 3
    It could just be a discrepancy in which sample number is being compared. An off-by-one would yield no match.
    $50,000 buys you a discrediting of a journalist
  • Ahle2 Posts: 906
    edited March 3
    Chip,

    Casting to 16 bits and printing from the beginning gives the exact same sequence as you!

    I lied; It does not! I got the same 32 bit result as Heater though... Something funky is going on!

  • Ahle, Heater's and my results match. We seem to have implemented it the same way. There must be some difference in your code.
  • Ahle2 wrote: »
    Chip,

    Casting to 16 bits and printing from the beginning gives the exact same sequence as you!

    I lied; It does not! I got the same 32 bit result as Heater though... Something funky is going on!

    Your result was throwing away the LSB. After I shifted my result by 1 bit, I got your same values from the 12345th iteration, as well as the bigger one.
  • I thought we agreed on shifting one step to the right?! Now it gives the same result!
  • Ahle2 wrote: »
    I thought we agreed on shifting one step to the right?! Now it gives the same result!

    In the normal Verilog, I don't use the LSB, but for single-stepping, I made it output result[31:0].

    Okay, we have it working then!
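
The mismatch came down to two conventions: which iteration is being compared, and whether the LSB is dropped before printing. A C sketch that makes both explicit, assuming a 1-based iteration count; `nth_output` and `rotl64` are hypothetical helpers, not code from the thread:

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit left rotate, valid for 0 < k < 64. */
uint64_t rotl64(uint64_t x, int k)
{
    return (x << k) | (x >> (64 - k));
}

/* Returns the n-th xoroshiro128+ output (n >= 1) for the given seed.
   If drop_lsb is non-zero the result is shifted right by one bit,
   which is the convention Ahle2's main loop uses. */
uint64_t nth_output(uint64_t seed0, uint64_t seed1, uint32_t n, int drop_lsb)
{
    uint64_t s0 = seed0, s1 = seed1, result = 0;

    assert(n >= 1);
    for (uint32_t i = 0; i < n; ++i) {
        result = s0 + s1;
        s1 ^= s0;
        s0 = rotl64(s0, 55) ^ s1 ^ (s1 << 14);
        s1 = rotl64(s1, 36);
    }
    return drop_lsb ? result >> 1 : result;
}
```

Under these assumptions `nth_output(1, 0, 12345, 1)` corresponds to Ahle2's loop, and passing 0 as the last argument gives the unshifted value. Because the two conventions differ by a one-bit shift, comparing the low 32 bits of one against the other will not match.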
  • Ahle2 Posts: 906
    edited March 3
    Now that we know that your P2 implementation matches the original C code I could just use that to feed the test suites with data! :)
  • I fixed the error in my xoroshiro128+ PASM code on the first page. Everything agrees between my PASM, my Verilog, Heater's C, and Ahle2's C. Super!
  • Ahle2 wrote: »
    Now that we know that your P2 implementation matches the original C code I could just use that to feed the test suites with data! :)

    Right. There should be no surprises, though, if that xoroshiro128+ page was accurate in its claims.
  • Heater. Posts: 19,257
    edited March 3
    I'm not getting the same output from Verilog as from the C version. I modified my C test harness to output the high 63 bits.
    It starts out well, and many values along the way are correct, with the odd error thrown in:

    C output:
    0000000000000000
    0040000800002000
    000420100c000090
    c0402b1808222091
    04e812058c04e200
    00ec10093ac4a071
    34d82eb81903e2a2
    efac90afec692f73
    cd3293b975ac8651
    25d3fd9d32afe0d8
    d2232430c64dff16
    ec48abda86ce7693
    218df697abbbb2b5
    1e994ae9f17d3a84
    d34d862e033fe937
    05f696ca131fa279
    3d5db0e43348c061
    263a9c552d81a39a
    25dab82a39faa6e7
    f55244eb04720e5c


    Verilog output:
    xxxxxxxxxxxxxxxx
    0000000000000000
    0040000800002000
    000420100c000090
    40402b1808222091 Err
    04e812058c04e200
    00ec10093ac4a071
    34d82eb81903e2a2
    6fac90afec692f73 Err
    4d3293b975ac8651 Err
    25d3fd9d32afe0d8
    52232430c64dff16 Err
    6c48abda86ce7693 Err
    218df697abbbb2b5
    1e994ae9f17d3a84
    534d862e033fe937 Err
    05f696ca131fa279
    3d5db0e43348c061
    263a9c552d81a39a
    25dab82a39faa6e7
    755244eb04720e5c Err

    Something seems to be going wrong with the top bits.

    Could be my test harness of course...
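
One quick way to pin down "the top bits" is to XOR each C value with its Verilog counterpart and OR the results together: any bit set in the mask differs in at least one pair. A minimal C sketch, with `diff_mask` as an illustrative name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* OR together the XOR of each pair of values. A set bit in the result
   marks a bit position that differs in at least one pair. */
uint64_t diff_mask(const uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t mask = 0;

    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        mask |= a[i] ^ b[i];
    return mask;
}
```

Applied to the rows marked Err (for example c0402b1808222091 against 40402b1808222091), the mask comes out as 8000000000000000, so only bit 63 ever differs.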

    I made a verilog test harness:
    module test;
    
      /* Make a reset that pulses low once. */
      reg reset = 1;
      initial begin
         # 1 reset = 0;
         # 1 reset = 1;
         # 200 $stop;
      end
    
      /* Make a regular pulsing clock. */
      reg clk = 0;
      always #5 clk = !clk;
    
      wire [62:0] value;
      rnd r1 (reset, clk, value); 
    
      initial
         $monitor("%h", value);
    endmodule // test
    

    I use the plain verilog version as I don't have the regscan macro available.

    I run it under Icarus Verilog:
    $ iverilog -o xoroshiro128plus.vpp xoroshiro128plus_tb.v xoroshiro128plus.v
    $ ./xoroshiro128plus.vpp
    

    Chip, what have you done? I'm writing verilog!

  • Heater, you've only got 63 bits in the verilog version. Your C version is outputting higher values than that.
  • Tor Posts: 1,720
    @Heater,

    On AIX (Power 7) it also fails with gcc (4.8.3) in 32-bit mode, although differently than with xlc (the IBM compiler).
    All's well with xlc when compiled with -q64.
    '//' comes from C++, so I call it C++ comments, even if it's been adopted by C99 or whatever. Never liked the look of those comments either. I liked the Ada comments though:
    -- this is a comment
    
  • evanh wrote: »
    Heater, you've only got 63 bits in the verilog version. Your C version is outputting higher values than that.

    Yes, Heater needs to change this line:

    wire [63:0] value;
  • Heater. wrote: »
    Chip, what have you done? I'm writing verilog!

    And I'm reading C code.
  • Ahle2 Posts: 906
    edited March 3
    I'm running Dieharder as we speak. On my i7-6700K @ 4GHz it will run for 30 minutes or so, using the raw input mode through the pipe mechanism in Linux!
    The data rate is 9.51e+06 random character / sec.

    /Johannes
  • Ahle2 wrote: »
    I'm running Dieharder as we speak. On my i7-6700K @ 4GHz it will run for 30 minutes or so, using the raw input mode through the pipe mechanism in Linux!
    The data rate is 9.51e+06 random character / sec.

    /Johannes

    That's the Chr(0..255) byte rate?
  • Yep!
  • Ahle2 wrote: »
    Yep!

    Super! I'm going to sleep now, but I'll be checking this thread for the results when I get up. If I could work all day and night, I would.
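
Feeding a generator to Dieharder through a pipe only needs a program that writes raw bytes to stdout. A rough C sketch of the byte-packing half, assuming the full 64-bit outputs are emitted least significant byte first; the thread does not say which bits or byte order Ahle2 actually fed in. `xoro_state`, `xoro_next` and `fill_bytes` are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the 128-bit generator state. */
typedef struct { uint64_t s0, s1; } xoro_state;

/* 64-bit left rotate, valid for 0 < k < 64. */
uint64_t rotl64(uint64_t x, int k)
{
    return (x << k) | (x >> (64 - k));
}

/* One xoroshiro128+ step (constants 55, 14, 36). */
uint64_t xoro_next(xoro_state *st)
{
    const uint64_t s0 = st->s0;
    uint64_t s1 = st->s1;
    const uint64_t result = s0 + s1;

    s1 ^= s0;
    st->s0 = rotl64(s0, 55) ^ s1 ^ (s1 << 14);
    st->s1 = rotl64(s1, 36);
    return result;
}

/* Fill buf with len raw bytes, eight per generator output, least
   significant byte first, so the stream does not depend on the
   host's endianness. len must be a multiple of 8. */
void fill_bytes(xoro_state *st, unsigned char *buf, size_t len)
{
    assert(len % 8 == 0);
    for (size_t i = 0; i < len; i += 8) {
        uint64_t r = xoro_next(st);
        for (int b = 0; b < 8; b++)
            buf[i + b] = (unsigned char)(r >> (8 * b));
    }
}
```

A driver would loop over `fill_bytes` followed by `fwrite(buf, 1, len, stdout)` and be run as something like `./prng | dieharder -a -g 200`, where generator 200 is Dieharder's `stdin_input_raw` (the executable name `prng` is a placeholder).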
  • Heater. Posts: 19,257
    edited March 3
    No worries, my fault.

    My wire definition was fine. Changing it as suggested causes Icarus Verilog to give a warning:

    $ iverilog -o xoroshiro128plus.vpp xoroshiro128plus_tb.v xoroshiro128plus.v
    xoroshiro128plus_tb.v:17: warning: Port 3 (out) of rnd expects 63 bits, got 64.
    xoroshiro128plus_tb.v:17: : Padding 1 high bits of the expression.

    My C code test harness went wrong because I had a cast to intmax_t rather than uintmax_t so there was a signed shift right going on!

    C and Verilog results match up fine now.

    The C test harness:
    //
    // Exercise the xoroshiro128+ PRNG 
    //
    // For 64 bit output Compile with:
    //    $ gcc -Wall -std=c99 -o xoroshiro128plus-test xoroshiro128plus-test.c
    //
    // For 32 bit output Compile with:
    //    $ gcc -Wall -std=c99 -DOUTPUT_32 -o xoroshiro128plus-test xoroshiro128plus-test.c
    //
    // Also works on 32 bit machines (-m32)
    //
    // Note: the 128 bit state is never all zero, but the 64 bit output (the sum s[0] + s[1]) can be. 32 bit output can be any 32 bit value.
    //
    #include <stdio.h>
    #include "xoroshiro128plus.c"
    
    // Seed MUST not be all zero.
    #define SEED_0 ((uint64_t)0x1)
    #define SEED_1 ((uint64_t)0x0) 
    
    #define SAMPLE_SIZE 1000000
    #define HEAD 100 
    #define TAIL 8
    
    int main(int argc, char* argv[])
    {
        uint64_t random64;
        uint64_t i;
        char ellipsis = 1;
    
        // Seed the state array
        s[0] = SEED_0;  
        s[1] = SEED_1;
    
        // Print some randomness 
        for (i = 0; i < SAMPLE_SIZE; i++)
        {
            random64 = next();
            if ((i < HEAD) || (i > (SAMPLE_SIZE - TAIL - 1)))
            {
                 printf("%016jx\n", (uintmax_t)(random64 >> 1));
            }
            else
            {
                if (ellipsis)
                {
                    printf("...\n");
                    ellipsis--;
                }
            }    
        }
        return 0;
    }
    
    
    Results:
    C output:
    0000000000000000
    0040000800002000
    000420100c000090
    40402b1808222091
    04e812058c04e200
    00ec10093ac4a071
    34d82eb81903e2a2
    6fac90afec692f73
    4d3293b975ac8651
    25d3fd9d32afe0d8
    52232430c64dff16
    6c48abda86ce7693
    218df697abbbb2b5
    
    Verilog output:
    0000000000000000
    0040000800002000
    000420100c000090
    40402b1808222091
    04e812058c04e200
    00ec10093ac4a071
    34d82eb81903e2a2
    6fac90afec692f73
    4d3293b975ac8651
    25d3fd9d32afe0d8
    52232430c64dff16
    6c48abda86ce7693
    218df697abbbb2b5
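
The cast bug is easy to reproduce in isolation. In C, right-shifting a negative signed value is implementation-defined, and common compilers such as gcc copy the sign bit in from the top, so any output with bit 63 set keeps it set after `>> 1`. A small sketch; `drop_lsb` and `drop_lsb_signed` are illustrative names and not taken from Heater's harness:

```c
#include <assert.h>
#include <stdint.h>

/* Intended behaviour: a logical shift, so bit 63 of the result is 0. */
uint64_t drop_lsb(uint64_t x)
{
    uint64_t r = x >> 1;
    assert((r >> 63) == 0);
    return r;
}

/* The buggy variant: the shift happens on a signed type. For inputs
   with bit 63 set, both the conversion and the shift are
   implementation-defined; with an arithmetic shift the sign bit is
   replicated into bit 62 and bit 63 stays set. */
uint64_t drop_lsb_signed(uint64_t x)
{
    return (uint64_t)((int64_t)x >> 1);
}
```

For the fourth output, 8080563010444122, `drop_lsb` gives 40402b1808222091. On a compiler with arithmetic right shift, `drop_lsb_signed` gives c0402b1808222091, which is the value in the earlier, mismatching C listing.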
    
  • Ahle2 Posts: 906
    edited March 3
    The results are in:
    #=============================================================================#
    #            dieharder version 3.31.1 Copyright 2003 Robert G. Brown          #
    #=============================================================================#
       rng_name    |rands/second|   Seed   |
    stdin_input_raw|  9.51e+06  |1595572412|
    #=============================================================================#
            test_name   |ntup| tsamples |psamples|  p-value |Assessment
    #=============================================================================#
       diehard_birthdays|   0|       100|     100|0.83379396|  PASSED  
          diehard_operm5|   0|   1000000|     100|0.25558220|  PASSED  
      diehard_rank_32x32|   0|     40000|     100|0.52897077|  PASSED  
        diehard_rank_6x8|   0|    100000|     100|0.38678624|  PASSED  
       diehard_bitstream|   0|   2097152|     100|0.29019586|  PASSED  
            diehard_opso|   0|   2097152|     100|0.69787334|  PASSED  
            diehard_oqso|   0|   2097152|     100|0.84063566|  PASSED  
             diehard_dna|   0|   2097152|     100|0.62916350|  PASSED  
    diehard_count_1s_str|   0|    256000|     100|0.92236112|  PASSED  
    diehard_count_1s_byt|   0|    256000|     100|0.59253912|  PASSED  
     diehard_parking_lot|   0|     12000|     100|0.83699181|  PASSED  
        diehard_2dsphere|   2|      8000|     100|0.72079614|  PASSED  
        diehard_3dsphere|   3|      4000|     100|0.92494671|  PASSED  
         diehard_squeeze|   0|    100000|     100|0.76276892|  PASSED  
            diehard_sums|   0|       100|     100|0.35705999|  PASSED  
            diehard_runs|   0|    100000|     100|0.95657635|  PASSED  
            diehard_runs|   0|    100000|     100|0.65493134|  PASSED  
           diehard_craps|   0|    200000|     100|0.12569810|  PASSED  
           diehard_craps|   0|    200000|     100|0.33025898|  PASSED  
     marsaglia_tsang_gcd|   0|  10000000|     100|0.41365816|  PASSED  
     marsaglia_tsang_gcd|   0|  10000000|     100|0.64628534|  PASSED  
             sts_monobit|   1|    100000|     100|0.16772087|  PASSED  
                sts_runs|   2|    100000|     100|0.91443072|  PASSED  
              sts_serial|   1|    100000|     100|0.95551409|  PASSED  
              sts_serial|   2|    100000|     100|0.11502741|  PASSED  
              sts_serial|   3|    100000|     100|0.24329000|  PASSED  
              sts_serial|   3|    100000|     100|0.95246396|  PASSED  
              sts_serial|   4|    100000|     100|0.86703713|  PASSED  
              sts_serial|   4|    100000|     100|0.94258614|  PASSED  
              sts_serial|   5|    100000|     100|0.04708983|  PASSED  
              sts_serial|   5|    100000|     100|0.15211892|  PASSED  
              sts_serial|   6|    100000|     100|0.30394607|  PASSED  
              sts_serial|   6|    100000|     100|0.16023796|  PASSED  
              sts_serial|   7|    100000|     100|0.70988015|  PASSED  
              sts_serial|   7|    100000|     100|0.75742550|  PASSED  
              sts_serial|   8|    100000|     100|0.87835733|  PASSED  
              sts_serial|   8|    100000|     100|0.74750865|  PASSED  
              sts_serial|   9|    100000|     100|0.06143689|  PASSED  
              sts_serial|   9|    100000|     100|0.11206147|  PASSED  
              sts_serial|  10|    100000|     100|0.68131394|  PASSED  
              sts_serial|  10|    100000|     100|0.16241324|  PASSED  
              sts_serial|  11|    100000|     100|0.85577452|  PASSED  
              sts_serial|  11|    100000|     100|0.84633148|  PASSED  
              sts_serial|  12|    100000|     100|0.25783932|  PASSED  
              sts_serial|  12|    100000|     100|0.96519459|  PASSED  
              sts_serial|  13|    100000|     100|0.87453165|  PASSED  
              sts_serial|  13|    100000|     100|0.80473399|  PASSED  
              sts_serial|  14|    100000|     100|0.90976972|  PASSED  
              sts_serial|  14|    100000|     100|0.46481592|  PASSED  
              sts_serial|  15|    100000|     100|0.33824103|  PASSED  
              sts_serial|  15|    100000|     100|0.35972982|  PASSED  
              sts_serial|  16|    100000|     100|0.24205718|  PASSED  
              sts_serial|  16|    100000|     100|0.58009166|  PASSED  
             rgb_bitdist|   1|    100000|     100|0.72455794|  PASSED  
             rgb_bitdist|   2|    100000|     100|0.08839988|  PASSED  
             rgb_bitdist|   3|    100000|     100|0.72988534|  PASSED  
             rgb_bitdist|   4|    100000|     100|0.02537498|  PASSED  
             rgb_bitdist|   5|    100000|     100|0.99235347|  PASSED  
             rgb_bitdist|   6|    100000|     100|0.92528294|  PASSED  
             rgb_bitdist|   7|    100000|     100|0.66209713|  PASSED  
             rgb_bitdist|   8|    100000|     100|0.39928028|  PASSED  
             rgb_bitdist|   9|    100000|     100|0.09853240|  PASSED  
             rgb_bitdist|  10|    100000|     100|0.33874593|  PASSED  
             rgb_bitdist|  11|    100000|     100|0.19805425|  PASSED  
             rgb_bitdist|  12|    100000|     100|0.89791848|  PASSED  
    rgb_minimum_distance|   2|     10000|    1000|0.11514702|  PASSED  
    rgb_minimum_distance|   3|     10000|    1000|0.09949151|  PASSED  
    rgb_minimum_distance|   4|     10000|    1000|0.71371410|  PASSED  
    rgb_minimum_distance|   5|     10000|    1000|0.42679082|  PASSED  
        rgb_permutations|   2|    100000|     100|0.28438129|  PASSED  
        rgb_permutations|   3|    100000|     100|0.13019670|  PASSED  
        rgb_permutations|   4|    100000|     100|0.93910058|  PASSED  
        rgb_permutations|   5|    100000|     100|0.34992170|  PASSED  
          rgb_lagged_sum|   0|   1000000|     100|0.91389520|  PASSED  
          rgb_lagged_sum|   1|   1000000|     100|0.45613194|  PASSED  
          rgb_lagged_sum|   2|   1000000|     100|0.49300600|  PASSED  
          rgb_lagged_sum|   3|   1000000|     100|0.71010206|  PASSED  
          rgb_lagged_sum|   4|   1000000|     100|0.94808677|  PASSED  
          rgb_lagged_sum|   5|   1000000|     100|0.98009360|  PASSED  
          rgb_lagged_sum|   6|   1000000|     100|0.93957434|  PASSED  
          rgb_lagged_sum|   7|   1000000|     100|0.12415484|  PASSED  
          rgb_lagged_sum|   8|   1000000|     100|0.67596394|  PASSED  
          rgb_lagged_sum|   9|   1000000|     100|0.00508887|  PASSED  
          rgb_lagged_sum|  10|   1000000|     100|0.94249200|  PASSED  
          rgb_lagged_sum|  11|   1000000|     100|0.60613939|  PASSED  
          rgb_lagged_sum|  12|   1000000|     100|0.26155684|  PASSED  
          rgb_lagged_sum|  13|   1000000|     100|0.08331932|  PASSED  
          rgb_lagged_sum|  14|   1000000|     100|0.99955094|   WEAK   
          rgb_lagged_sum|  15|   1000000|     100|0.85131082|  PASSED  
          rgb_lagged_sum|  16|   1000000|     100|0.80457554|  PASSED  
          rgb_lagged_sum|  17|   1000000|     100|0.36633132|  PASSED  
          rgb_lagged_sum|  18|   1000000|     100|0.95989992|  PASSED  
          rgb_lagged_sum|  19|   1000000|     100|0.09248094|  PASSED  
          rgb_lagged_sum|  20|   1000000|     100|0.79549433|  PASSED  
          rgb_lagged_sum|  21|   1000000|     100|0.52583117|  PASSED  
          rgb_lagged_sum|  22|   1000000|     100|0.40921376|  PASSED  
          rgb_lagged_sum|  23|   1000000|     100|0.58494999|  PASSED  
          rgb_lagged_sum|  24|   1000000|     100|0.01392463|  PASSED  
          rgb_lagged_sum|  25|   1000000|     100|0.85694357|  PASSED  
          rgb_lagged_sum|  26|   1000000|     100|0.63171725|  PASSED  
          rgb_lagged_sum|  27|   1000000|     100|0.42951286|  PASSED  
          rgb_lagged_sum|  28|   1000000|     100|0.37432100|  PASSED  
          rgb_lagged_sum|  29|   1000000|     100|0.97003672|  PASSED  
          rgb_lagged_sum|  30|   1000000|     100|0.63896568|  PASSED  
          rgb_lagged_sum|  31|   1000000|     100|0.30440336|  PASSED  
          rgb_lagged_sum|  32|   1000000|     100|0.93002810|  PASSED  
         rgb_kstest_test|   0|     10000|    1000|0.71159167|  PASSED  
         dab_bytedistrib|   0|  51200000|       1|0.92653216|  PASSED  
                 dab_dct| 256|     50000|       1|0.56538469|  PASSED  
    Preparing to run test 207.  ntuple = 0
            dab_filltree|  32|  15000000|       1|0.72218965|  PASSED  
            dab_filltree|  32|  15000000|       1|0.70765494|  PASSED  
    Preparing to run test 208.  ntuple = 0
           dab_filltree2|   0|   5000000|       1|0.25990946|  PASSED  
           dab_filltree2|   1|   5000000|       1|0.51931629|  PASSED  
    Preparing to run test 209.  ntuple = 0
            dab_monobit2|  12|  65000000|       1|0.60109807|  PASSED  
    
    
  • Brilliant!

    I think we have a winner.

  • Ahle2 Posts: 906
    edited March 3
    I'm running Dieharder on the original P2 LFSR now, but due to the bit reordering it is slow, just 1.82e+06 rands/second.
    It will take 5 hours to complete! :)

    The first few tests look like this:
    #=============================================================================#
    #            dieharder version 3.31.1 Copyright 2003 Robert G. Brown          #
    #=============================================================================#
       rng_name    |rands/second|   Seed   |
    stdin_input_raw|  1.82e+06  |1385337568|
    #=============================================================================#
            test_name   |ntup| tsamples |psamples|  p-value |Assessment
    #=============================================================================#
       diehard_birthdays|   0|       100|     100|0.00000000|  FAILED  
          diehard_operm5|   0|   1000000|     100|0.00000000|  FAILED  
      diehard_rank_32x32|   0|     40000|     100|0.00000000|  FAILED  
        diehard_rank_6x8|   0|    100000|     100|0.00000000|  FAILED  
       diehard_bitstream|   0|   2097152|     100|0.00000000|  FAILED  
            diehard_opso|   0|   2097152|     100|0.00000000|  FAILED  
            diehard_oqso|   0|   2097152|     100|0.00000000|  FAILED  
             diehard_dna|   0|   2097152|     100|0.00000000|  FAILED  
    diehard_count_1s_str|   0|    256000|     100|0.00000000|  FAILED  
    diehard_count_1s_byt|   0|    256000|     100|0.00000000|  FAILED  
     diehard_parking_lot|   0|     12000|     100|0.00000000|  FAILED  
        diehard_2dsphere|   2|      8000|     100|0.00000000|  FAILED  
        diehard_3dsphere|   3|      4000|     100|0.00000000|  FAILED  
         diehard_squeeze|   0|    100000|     100|0.00000000|  FAILED  
            diehard_sums|   0|       100|     100|0.31392291|  PASSED  
            diehard_runs|   0|    100000|     100|0.01074318|  PASSED  
            diehard_runs|   0|    100000|     100|0.00803469|  PASSED  
           diehard_craps|   0|    200000|     100|0.00000000|  FAILED  
           diehard_craps|   0|    200000|     100|0.00000000|  FAILED 
    

    Without bit reordering I guess every test would fail!
  • Super, Ahle2!

    We've got a much better random number generator now.
  • cgracey Posts: 7,737
    edited March 3
    Anyone out there want a programming challenge?

    I think it would be best to take the good 63 bits out of the xoroshiro128+ and come up with sixteen randomly-chosen static 32-bit patterns which each use 32 of those 63 bits. Most of those 63 bits should only be used 8 times across all 16 patterns, while eight will need to be used 9 times, since 16 × 32 = 512 slots have to be filled from only 63 bits, not 64 (63 × 8 = 504). Each cog will get one of these 16 patterns for its own RND value.

    Here is what the result needs to look like, except those "--" need to become "00".."62" values:
    wire [62:0] x = xoroshiro128plus_output;
    
    assign rnd = {
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'h428A2F98,
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'h71374491,
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'hB5C0FBCF,
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'hE9B5DBA5,
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'h3956C25B,
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'h59F111F1,
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'h923F82A4,
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'hAB1C5ED5,
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'hD807AA98,
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'h12835B01,
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'h243185BE,
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'h550C7DC3,
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'h72BE5D74,
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'h80DEB1FE,
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'h9BDC06A7,
    {x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--],x[--]} ^ 32'hC19BF174 };
    

    Those 32'hxxxxxxxx values are the fractional parts of the cube roots of the first 16 prime numbers (the same constants that open the SHA-256 round-constant table). They are there to further distinguish patterns from each other. I know, they have no cryptographic benefit.
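
One possible way to attack the challenge, sketched in C: build the patterns one at a time, always picking the 32 least-used bits with random tie-breaking, then shuffle their positions. This is only one balanced-greedy approach under stated assumptions, not a solution from the thread, and it does nothing to control correlation between patterns. All names (`make_patterns`, `rand_below`, `shuffle` and so on) are hypothetical, and the shuffles are driven by a software xoroshiro128+ whose seed is up to the caller.

```c
#include <assert.h>
#include <stdint.h>

#define NBITS     63   /* usable xoroshiro128+ output bits */
#define NPATTERNS 16   /* one pattern per cog */
#define PATLEN    32   /* bits per pattern */

typedef struct { uint64_t s0, s1; } xoro_state;

uint64_t rotl64(uint64_t x, int k)
{
    return (x << k) | (x >> (64 - k));
}

/* One xoroshiro128+ step (constants 55, 14, 36). */
uint64_t xoro_next(xoro_state *st)
{
    const uint64_t s0 = st->s0;
    uint64_t s1 = st->s1;
    const uint64_t result = s0 + s1;

    s1 ^= s0;
    st->s0 = rotl64(s0, 55) ^ s1 ^ (s1 << 14);
    st->s1 = rotl64(s1, 36);
    return result;
}

/* Index in [0, n) taken from the high 32 bits. The slight modulo
   bias is ignored in this sketch. */
unsigned rand_below(xoro_state *st, unsigned n)
{
    return (unsigned)((xoro_next(st) >> 32) % n);
}

/* Fisher-Yates shuffle of a[0..n-1], n >= 1. */
void shuffle(xoro_state *st, uint8_t *a, unsigned n)
{
    for (unsigned i = n - 1; i > 0; i--) {
        unsigned j = rand_below(st, i + 1);
        uint8_t t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}

/* Fill pat[p][0..31] with distinct bit indices in 0..62. On return,
   use[b] holds how often bit b appears across all 16 patterns. */
void make_patterns(xoro_state *st, uint8_t pat[NPATTERNS][PATLEN],
                   uint8_t use[NBITS])
{
    uint8_t idx[NBITS];

    assert(st->s0 != 0 || st->s1 != 0);
    for (unsigned b = 0; b < NBITS; b++)
        use[b] = 0;

    for (unsigned p = 0; p < NPATTERNS; p++) {
        for (unsigned b = 0; b < NBITS; b++)
            idx[b] = (uint8_t)b;
        shuffle(st, idx, NBITS);        /* random tie-break order */

        /* Stable insertion sort by use count: least-used bits first. */
        for (unsigned i = 1; i < NBITS; i++) {
            uint8_t v = idx[i];
            unsigned j = i;
            while (j > 0 && use[idx[j - 1]] > use[v]) {
                idx[j] = idx[j - 1];
                j--;
            }
            idx[j] = v;
        }

        shuffle(st, idx, PATLEN);       /* randomise bit positions */
        for (unsigned k = 0; k < PATLEN; k++) {
            pat[p][k] = idx[k];
            use[idx[k]]++;
        }
    }
}
```

Because each round takes the least-used bits, the use counts never differ by more than one, so after 16 rounds 55 bits are used 8 times and 8 bits are used 9 times. Each `pat[p][k]` would then be written into one `x[--]` slot of row p in the Verilog template above.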
  • jmg Posts: 10,208
    cgracey wrote: »
    ... come up with sixteen randomly-chosen static 32-bit patterns which each use 32 of those 63 bits.

    Since the bits are random, do you need random pickoff ?
    eg you could allocate 4 groups of 32 to 4 COGS and get different results, then you could repeat that byte shifted 4 x, and everyone has a different snapshot of the 128 bits.

    If you really do want random taps, run this RNG to generate some ;)


  • cgracey Posts: 7,737
    edited March 3
    jmg wrote: »
    cgracey wrote: »
    ... come up with sixteen randomly-chosen static 32-bit patterns which each use 32 of those 63 bits.

    Since the bits are random, do you need random pickoff ?
    eg you could allocate 4 groups of 32 to 4 COGS and get different results, then you could repeat that byte shifted 4 x, and everyone has a different snapshot of the 128 bits.

    If you really do want random taps, run this RNG to generate some ;)

    Random pickoff matters, relatively, between sets of 32 bits.

    It's kind of a two-dimensional problem. Some divide-and-conquer is needed.
  • I assume the new FPGA images with this new random number generator will have the same instruction set and encoding as the current v16. Is that correct?
  • David Betz wrote: »
    I assume the new FPGA images with this new random number generator will have the same instruction set and encoding as the current v16. Is that correct?

    Yes. This is a very subtle change that almost nobody would recognize, in practice.