Random/LFSR on P2

2456721

Comments

  • jmgjmg Posts: 10,243
    Ahle2 wrote: »
    I will implement the old P2 LFSR, Jkiss , PCG (by Melissa O'Neill) and xoroshiro 128+. Then run the them through Testu01 and Dieharder suites. Stay tuned! (Melissa claims that her PRNG is the best available to this date. Her paper is under review as we speak)

    .. and Chips P2ASM too ?
    There is some mention above that was not quite xoroshiro 64b ?

    cgracey wrote: »
    If we can be sure this is correct, I'll update the PRNG in the hub to use this, instead.
    Do you mean here update in verilog ?

  • Heater.Heater. Posts: 19,352
    edited March 2 Vote Up0Vote Down
    Ahle2,

    What is the easiest way to run some decent tests on a generator?

    Ages ago I used diehard occasionally, it always seemed a pain to work with. I just took a look at the testu01 and, well, the user manual is a monster.

    All I want is to feed a giga byte of random in and get a "good, bad, ugly" result out !

    I have a transistor avalanche random generator here that will need testing when I get it's output into a computer some how.
  • jmgjmg Posts: 10,243
    Ahle2 wrote: »
    @Evanh
    Have you tried your zener TRNG with some RNG test suites? It would be cool to see how well it performs. I guess It would pass with Flying colors!
    A challenge here would be getting rid of fixed frequency ripple in the analog domain, like Mains Hum, SMPS switching frequencies etc. There will always be some spectral sneak thru...

  • jmg,

    I wondered about that.

    One could run a zenner or transistor noise generator off a battery, have the whole thing in a metal box with an optical output.

    All getting a bit over the top.

    Not what you want for an on chip random generator !
  • jmg wrote:
    A challenge here would be getting rid of fixed frequency ripple in the analog domain, like Mains Hum, SMPS switching frequencies etc. There will always be some spectral sneak thru...
    I think the Von Neumann algo that Chip uses in RealRandom should take care of any bias or spectral pass-thru.

    -Phil
    “Perfection is achieved not when there is nothing more to add, but when there is nothing left to take away. -Antoine de Saint-Exupery
  • Ahle2 wrote: »
    @Chip
    Do you think the smart pin LSB will act as a good TRNG? Maybe smart pins in combination with a zener diode and a...

    I think the ADC feedback will have lots of thermal noise represented in it.
  • Ahle2Ahle2 Posts: 906
    edited March 2 Vote Up0Vote Down
    @Heater
    "What is the easiest way to run some decent tests on a generator?"
    My way of doing it is with a tool that outputs random characters to stdout. Then you could pipe the output to a file or directly to Dieharder in raw input mode! :) I'm currently tryging to figure out how to run TestU01 with my tool. All "experts" says that TestU01 is a superior test battery for deciding between a "great PRNG" and a "superior PRNG". My initial testings indicates that all mentioned PRNG in this threads passes the Dieharder tests. But I have not been able to test Chips LFSR yet. Btw, I'm talking about DiehardER not Diehard as that is considered "crap" by todays standard.

    @all
    I have included the C++ source code for my testing tool. It would be nice if someone with knowledge in both C++ and Verilog could help me implement Chips LFSR, so we could use that as a reference for further testings.

    /Johannes
    SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
    AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
    SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
    Propeller chiptune player: Thread
  • The command line I'm running is as follows:
    ./PP2PE 1 | dieharder -g200 -a
    

    1 means Xoroshiro in this case.

    /Johannes
    SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
    AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
    SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
    Propeller chiptune player: Thread
  • Ahle2Ahle2 Posts: 906
    edited March 2 Vote Up0Vote Down
    An output from Dieharder using PCG looks like this!
    ./PP2PE 2 | dieharder -g 200 -a
    #=============================================================================#
    #            dieharder version 3.31.1 Copyright 2003 Robert G. Brown          #
    #=============================================================================#
       rng_name    |rands/second|   Seed   |
    stdin_input_raw|  1.42e+07  | 885928609|
    #=============================================================================#
            test_name   |ntup| tsamples |psamples|  p-value |Assessment
    #=============================================================================#
       diehard_birthdays|   0|       100|     100|0.29960471|  PASSED  
          diehard_operm5|   0|   1000000|     100|0.26079820|  PASSED  
      diehard_rank_32x32|   0|     40000|     100|0.05263385|  PASSED  
        diehard_rank_6x8|   0|    100000|     100|0.48317317|  PASSED  
       diehard_bitstream|   0|   2097152|     100|0.57603361|  PASSED  
            diehard_opso|   0|   2097152|     100|0.71719125|  PASSED  
            diehard_oqso|   0|   2097152|     100|0.20879796|  PASSED  
             diehard_dna|   0|   2097152|     100|0.37774651|  PASSED  
    diehard_count_1s_str|   0|    256000|     100|0.99932855|   WEAK   
    diehard_count_1s_byt|   0|    256000|     100|0.85655375|  PASSED  
     diehard_parking_lot|   0|     12000|     100|0.99430846|  PASSED  
        diehard_2dsphere|   2|      8000|     100|0.46170135|  PASSED  
        diehard_3dsphere|   3|      4000|     100|0.88822056|  PASSED  
         diehard_squeeze|   0|    100000|     100|0.39573244|  PASSED  
            diehard_sums|   0|       100|     100|0.04406544|  PASSED  
            diehard_runs|   0|    100000|     100|0.84283422|  PASSED  
            diehard_runs|   0|    100000|     100|0.99944572|   WEAK   
           diehard_craps|   0|    200000|     100|0.50201529|  PASSED  
           diehard_craps|   0|    200000|     100|0.67240240|  PASSED  
     marsaglia_tsang_gcd|   0|  10000000|     100|0.61052204|  PASSED  
     marsaglia_tsang_gcd|   0|  10000000|     100|0.66039542|  PASSED  
    

    As you can see a few weak test results comes up. if I haven't done anything wrong when implementing Mellissas algorithm, it does not seem to be the best PRNG available today as she claims.

    /Johannes
    SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
    AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
    SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
    Propeller chiptune player: Thread
  • Ahle2,
    DiehardER
    Ah yes. Heard of that. I don't think it existed when I last played with diehard.

    Funny aside:

    In the days of CD's George Marsaglia, creator of diehard and many PRNGs, published a CD with diehard on it. The CD also contained 100's megabytes of random bits.

    Problem was that when George created the CD content he was using MSDOS and copied the files without using the "binary" option to the copy command. (/b was it?). MSDOS copied the file as text and fixed up all that it thought was line ending in the binary data. Replacing 0x0d with 0x0d, 0x0a or whatever it does.

    Result: The random files on George's CD did not do very well in the diehard test!

    I was nerdy enough to copy the files off the CD and fix the damage. Then the files passed diehard very well :)

    Slippery things, random numbers.


  • Heater.Heater. Posts: 19,352
    edited March 2 Vote Up0Vote Down
    Ahle2,

    Thanks for the heads up on dieharder.

    I don't recall diehard classifying test results, PASSED, WEAK, etc so nicely. One had to look at the numbers and make a decision.
  • Heater.Heater. Posts: 19,352
    edited March 2 Vote Up0Vote Down
    Chip,

    I wrapped xoroshiro128+ PRNG into a simple test which generates a million numbers and prints the first and last 4. You could check if your PASM version does the same. Be sure to use the same seed of course.

    First the results:
    First 4 and last 4 of 1000000 xoroshiro128plus PRNG output (hex):
    4aa4921626d12792
    da319ec1da35800a
    6a392789cef68c3c
    eb50047a5d895c50
    ...
    26fa38e131ff753b
    f4cf7cfb892c43a3
    56575790aa078000
    438948e55200ef1a
    From seed: s[0] = 730358127afd03f7, s[1] = d7a13a03abd4239b
    

    And test code:
    (The xoroshiro128 source file comes from here: http://xoroshiro.di.unimi.it)
    //
    // Exercise the xoroshiro128+ PRNG 
    //
    // Compile with:
    //    gcc -Wall -std=c99 -o xoroshiro128plus-test xoroshiro128plus-test.c
    //
    #include <stdio.h>
    #include "xoroshiro128plus.c"
    
    // Seed MUST not be all zero.
    #define SEED_0 0x730358127afd03f7
    #define SEED_1 0xd7a13a03abd4239b 
    #define SAMPLE_SIZE 1000000
    #define HEAD 4
    #define TAIL 4
    
    int main(int argc, char* argv[])
    {
        uint64_t random;
        uint64_t i;
        char ellipsis = 1;
    
        // Seed the state array
        s[0] = SEED_0;  
        s[1] = SEED_1;
    
        printf ("First %d and last %d of %d xoroshiro128plus PRNG output:\n", HEAD, TAIL, SAMPLE_SIZE);
    
        // Print some randomness 
        for (i = 0; i < SAMPLE_SIZE; i++)
        {
            random = next();
            if ((i < HEAD) || (i > (SAMPLE_SIZE - TAIL - 1)))
            {
                printf("%016lx\n", random);
            }
            else
            {
                if (ellipsis)
                {
                    printf("...\n");
                    ellipsis--;
                }
            }    
        }
    
        printf("From seed: s[0] = %016lx, s[1] = %016lx\n", SEED_0, SEED_1); 
        return 0;
    }
    
  • Ahle2Ahle2 Posts: 906
    edited March 2 Vote Up0Vote Down
    @Chip
    I don't want to add any more time to the P2 project, so don't wait for us to decide which one of the multitudes of PRNG's that is the best. Just go with Xoroshiro 128+ and it will endlessly much better than a simple LFSR!
    The internal state is 128 bits, but all bits but the LSB is good as a tap for the final 32 bit value. That will fix the problem of the generator never returning a zero!

    And to be honest everything points in favour of Xoroshiro 128+ according to all comparisons I have come across. Fastest (according to many implementations/tests, but that is only interesting from a software perspective though), least trouble with the more advanced test suites, just 128 bits of state memory and maybe most importantly it is completely free (in every sense of the word) to implement.

    /Johannes
    SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
    AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
    SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
    Propeller chiptune player: Thread
  • BTW, if clocked at 160 Mhz, it will take 1.3984206859764595e+29 years before it wraps. I think it's good enough! ;)
    SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
    AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
    SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
    Propeller chiptune player: Thread
  • Heater.Heater. Posts: 19,352
    edited March 3 Vote Up0Vote Down
    Ahle2,

    Is this what you had in mind? A thirty two bit output version of the xoroshiro128+ test.

    This shifts the 64 bit result right by one and outputs the low 32 bits.

    Sadly the authors of xoroshiro don't provide any test data. We have to trust our compiler is correct.

    The results;
    First 8 and last 8 of 1000000 xoroshiro128plus PRNG output (hex):
    00000000
    00002000
    0c000090
    08222091
    8c04e200
    3ac4a071
    1903e2a2
    ec692f73
    ...
    88940a17
    4c8836ca
    808622f1
    bb2ae479
    c96b14cb
    28208a41
    aef9334d
    d2ef9822
    From seed: s[0] = 0x0000000000000001, s[1] = 0x0000000000000000
    
    The code:
    //
    // Exercise the xoroshiro128+ PRNG 
    //
    // For 64 bit output Compile with:
    //    $ gcc -Wall -std=c99 -o xoroshiro128plus-test xoroshiro128plus-test.c
    //
    // For 32 bit output Compile with:
    //    $ gcc -Wall -std=c99 -DOUTPUT_32 -o xoroshiro128plus-test xoroshiro128plus-test.c
    //
    // Also works on 32 bit machines (-m32)
    //
    // Note: 64 bit output will never contain zero. 32 bit output can be any 32 bit value.
    //
    #include <stdio.h>
    #include "xoroshiro128plus.c"
    
    // Seed MUST not be all zero.
    #define SEED_0 ((uint64_t)0x1)
    #define SEED_1 ((uint64_t)0x0) 
    
    #define SAMPLE_SIZE 1000000
    #define HEAD 8
    #define TAIL 8
    
    int main(int argc, char* argv[])
    {
        uint64_t random64;
        uint64_t i;
        char ellipsis = 1;
    
        // Seed the state array
        s[0] = SEED_0;  
        s[1] = SEED_1;
    
        printf ("First %d and last %d of %d xoroshiro128plus PRNG output (hex):\n", HEAD, TAIL, SAMPLE_SIZE);
    
        // Print some randomness 
        for (i = 0; i < SAMPLE_SIZE; i++)
        {
            random64 = next();
            if ((i < HEAD) || (i > (SAMPLE_SIZE - TAIL - 1)))
            {
    #ifdef OUTPUT_32
                uint32_t random32 = random64 >> 1;
                printf("%08x\n", random32);
    #else
                 printf("%016jx\n", (intmax_t)random64);
    #endif
            }
            else
            {
                if (ellipsis)
                {
                    printf("...\n");
                    ellipsis--;
                }
            }    
        }
    
        printf("From seed: s[0] = 0x%016jx, s[1] = 0x%016jx\n", (intmax_t)SEED_0, (intmax_t)SEED_1); 
        return 0;
    }
    

    Edit: Changed the seed to that used by Chip.
  • Ahle2Ahle2 Posts: 906
    edited March 2 Vote Up0Vote Down
    Running on average 2^32 iterations to see if it ever returns zero should just take some minutes seconds. Hopefully the PRNG is better than this!

    dilbert.jpg
    SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
    AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
    SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
    Propeller chiptune player: Thread
  • Heater.Heater. Posts: 19,352
    edited March 2 Vote Up0Vote Down
    EDIT: Ignore the following it's gibberish.

    Not sure I follow. Surely to check by brute force that the output is never zero we have to run through all the outputs of the 64 b generator:

    I read that does cycle through all possible 64 bit output values except zero.

    That implies that whatever 32 bits one selects from the 64 bit result a 32 bit zero is very possible.

    I think if one only wants a 32 bit output then one has to check those 32 bits for zero and discard them if zero and try again.

    Love that old Dilbert.
  • Heater. wrote: »
    I think if one only wants a 32 bit output then one has to check those 32 bits for zero and discard them if zero and try again.

    Is zero not an acceptable random value? Or do you mean "reject zero in your code when you don't want zero as a random value"?

  • Oops...you are right. Sorry. All of a sudden my brain flipped into reverse for some odd reason.
  • Ahle2Ahle2 Posts: 906
    edited March 2 Vote Up0Vote Down
    @Chip
    I don't understand Verilog. Is there a spin version I could use as a base to convert to C?
    SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
    AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
    SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
    Propeller chiptune player: Thread
  • AribaAriba Posts: 2,079
    edited March 2 Vote Up0Vote Down
    Ahle2

    Here is a translation to Spin as I understand the Verilog. There is a lot of bit manipulation, so I use the trick with the unimplemented PortB registers to access single bits in a long:
    VAR
      LONG  rndx,rnd
    
    PUB Main
      rndx := -1
    
      repeat
        DIRB := rndx
        rndx := rndx<<1 + (DIRB[31] ^ DIRB[10] ^ DIRB[4] ^ DIRB[1])
    
        'Bit reorder
        DIRB := rndx
        repeat i from 31 to 0
          OUTB[i] := DIRB[lookupz(31-i : 23,7,25,8,4,9,11,3,15,31,10,6,5,2,26,18,20,28,16,17,0,27,22,13,14,30,1,29,12,19,21,24)]
    
        'cog pattern
        rnd := OUTB ^ $428A2F98   'for cog0
    
        'output rnd here
    
    
    It's untested, I have not even tried to compile it!
    Maybe the quality of the random generator is not affected by the bit reordering, and it's enough to test only the rndx calculation.
    Ahle2 wrote: »
    BTW, if clocked at 160 Mhz, it will take 1.3984206859764595e+29 years before it wraps. I think it's good enough! ;)
    I think we should verify this before the P2 goes into synthesis. Will give us enough time to discuss every aspect of SPIN2 ... :lol:

    Andy
  • Roy Eltham wrote: »
    const mostly does nothing for modern C++ compilers. It's a hint at best.
    It can be cast away anywhere in the codebase, so it's not a reliable thing for the optimizing compiler anyway.

    I think you were thinking of "register" -- that's ignored by modern C++ compilers. "const" on the other hand certainly does have an effect; the compiler will give an error if you try to modify a const object, for example, and the optimizer is free to assume that consts do not change. It is true, as you pointed out, that const is easily cast away, but that's dangerous. If you tell the compiler something is const it's free to optimize based on that assumption (and GCC, at least, will do so).

    Eric
  • As I often say "There is no single person alive that understands C++"
  • Practically, if Chip want's to use xoroshiro128+ in PASM then we need to know the PASM works correctly.

    I have taken the original xoroshiro128+ C source and used it to generate test data that any PASM version can be checked against.

    The only remaining doubt is that the authors of xoroshiro128+ do not provide any sample output. So we have to trust that GCC, say, does the right thing with the same source.

    Perhaps someone else would like to reproduce what I have done on whatever machine/compiler they have?

    So that we can cross-check.


  • ersmith wrote: »
    Roy Eltham wrote: »
    const mostly does nothing for modern C++ compilers. It's a hint at best.
    It can be cast away anywhere in the codebase, so it's not a reliable thing for the optimizing compiler anyway.

    I think you were thinking of "register" -- that's ignored by modern C++ compilers. "const" on the other hand certainly does have an effect; the compiler will give an error if you try to modify a const object, for example, and the optimizer is free to assume that consts do not change. It is true, as you pointed out, that const is easily cast away, but that's dangerous. If you tell the compiler something is const it's free to optimize based on that assumption (and GCC, at least, will do so).

    Eric

    No, I meant const, and I meant as it relates to the code generation and optimization. Yes it does introduce compile errors and protections at the source/compile time levels.

  • cgraceycgracey Posts: 7,743
    edited March 3 Vote Up0Vote Down
    I was implementing the xoroshiro and doing some tests running the PASM implementation when I realized there was an error in my code: In those REPeat blocks which make the 64-bit rotators, the first RCL instructions needed a WC. The carry was not propagating into the top long. I fixed the code in my post, so you should copy the new version if you want to run it.
  • cgraceycgracey Posts: 7,743
    edited March 3 Vote Up0Vote Down
    Just noticed another problem: I had implemented the shift-left-14 as a rotate-left-14. I updated the code in the post again to fix it.
  • cgraceycgracey Posts: 7,743
    edited March 3 Vote Up0Vote Down
    Here is xoroshiro128+ in Verilog:
    // RND
    
    module		rnd
    (
    input		resn,
    input		clk,
    
    output	[62:0]	out
    );
    
    
    // xoroshiro128+
    
    reg  [63:0] s0, s1, ss;
    
    wire [63:0] sx = s0 ^ s1;
    
    always @(posedge clk or negedge resn)
    if (!resn)
    begin
      s0 <= 64'b1;    // seed s0
      s1 <= 64'b0;
      ss <= 64'b0;
    end
    else
    begin
      s0 <= {s0[08:00], s0[63:09]} ^ sx ^ {sx[49:00], 14'b0};
      s1 <= {sx[27:00], sx[63:28]};
      ss <= s0 + s1;
    end
    
    assign out = ss[63:01];    // output high-quality bits (LSB=LFSR)
    
    endmodule
    

    With the scan-register macros, it's easier to look at:
    // RND
    
    module		rnd
    (
    input		resn,
    input		clk,
    
    output	[62:0]	out
    );
    
    
    // xoroshiro-128
    
    reg  [63:0] s0, s1, ss;
    
    wire [63:0] sx = s0 ^ s1;
    
    `regscan (s0, 64'b1, 1'b1, {s0[08:00], s0[63:09]} ^ sx ^ {sx[49:00], 14'b0})
    `regscan (s1, 64'b0, 1'b1, {sx[27:00], sx[63:28]})
    `regscan (ss, 64'b0, 1'b1, s0 + s1)
    
    assign out = ss[63:01];    // output high-quality bits (LSB=LFSR)
    
    endmodule
    
  • Ahle2 wrote: »
    @Chip
    I don't want to add any more time to the P2 project, so don't wait for us to decide which one of the multitudes of PRNG's that is the best. Just go with Xoroshiro 128+ and it will endlessly much better than a simple LFSR!
    The internal state is 128 bits, but all bits but the LSB is good as a tap for the final 32 bit value. That will fix the problem of the generator never returning a zero!

    And to be honest everything points in favour of Xoroshiro 128+ according to all comparisons I have come across. Fastest (according to many implementations/tests, but that is only interesting from a software perspective though), least trouble with the more advanced test suites, just 128 bits of state memory and maybe most importantly it is completely free (in every sense of the word) to implement.

    /Johannes

    The Verilog is done and now I'm plugging it in.

    Ahle2, thanks a lot for this xoroshiro128+ pointer. This is a huge upgrade to what we had.
  • Ahle2 wrote: »
    Evanh
    Have you tried your zener TRNG with some RNG test suites? It would be cool to see how well it performs. I guess It would pass with Flying colors!
    Nothing more than general knowledge on my behalf, sorry.
    $50,000 buys you a discrediting of a journalist
Sign In or Register to comment.