Shop OBEX P1 Docs P2 Docs Learn Events
Random/LFSR on P2 - Page 59 — Parallax Forums

Random/LFSR on P2

1565759616292

Comments

  • TonyB_TonyB_ Posts: 2,099
    edited 2018-10-05 02:31
    evanh wrote: »
    So you want double iterated 32-bit output from the Xoroshiro generator. And remove the pairing from the distribution test code so that it still only has to deal with 32-bit indexing. Right?

    Do you have the code handy, Evan? It's quite late here. In a nutshell, don't save the previous output and do two new iterations to get the next 32-bit pair. Everything else is the same. No point testing more than one candidate.
  • Actually, you could do this for the six candidates you cut off last night! :)
  • evanhevanh Posts: 15,091
    Here's the double iterate line from the gridding/culling test code:
    #ifdef dITER
    		random64 = rotr( (drnword_t)nextxoro() | ((drnword_t)nextxoro() << ACCUM_SIZE), SAMPLE_POSITION );
    #else
    
    The functional detail is it concatenates two consecutive 16-bit outputs into one 32-bit output that when sampled as 32>>0 still provides all the random data. And the second 2 Gwords has the the two halves of the output exchanged because of the 2^32-1 state space.

    This looks functionally very similar to the pairing function in the distribution test code:
    		next_prn = rotr( nextxoro(), SAMPLE_POSITION ) & SAMPLE_MASK;
    		pairs[ prev_prn | (next_prn << SAMPLE_SIZE) ]++;  // 8-bit limited tallies
    		prev_prn = next_prn;
    
    There is a difference if the sample size and accumulator size are different but for the distribution testing, so far, they are the same.

    I think the pairing might be doing what you want already. Was that maybe the original intent of the pairing? I thought it was a feature of the distribution test though.
  • TonyB_TonyB_ Posts: 2,099
    edited 2018-10-05 12:11
    evanh wrote: »
    Here's the double iterate line from the gridding/culling test code:
    #ifdef dITER
    		random64 = rotr( (drnword_t)nextxoro() | ((drnword_t)nextxoro() << ACCUM_SIZE), SAMPLE_POSITION );
    #else
    
    The functional detail is it concatenates two consecutive 16-bit outputs into one 32-bit output that when sampled as 32>>0 still provides all the random data. And the second 2 Gwords has the the two halves of the output exchanged because of the 2^32-1 state space.

    This looks functionally very similar to the pairing function in the distribution test code:
    		next_prn = rotr( nextxoro(), SAMPLE_POSITION ) & SAMPLE_MASK;
    		pairs[ prev_prn | (next_prn << SAMPLE_SIZE) ]++;  // 8-bit limited tallies
    		prev_prn = next_prn;
    
    There is a difference if the sample size and accumulator size are different but for the distribution testing, so far, they are the same.

    I think the pairing might be doing what you want already. Was that maybe the original intent of the pairing? I thought it was a feature of the distribution test though.

    I found my original post:
    TonyB_ wrote: »
    Some thoughts about pair distribution. The C code iterates xoroshiro32++, concatenates successive 16-bit outputs and records how often each possible 32-bit value occurs. If the first 16-bit output is 1, the second 2, etc., then the C code produces the following 32-bit data:
    {2,1}
    {3,2}
    {4,3}
    ...
    {2^32-2,2^32-3}
    {2^32-1,2^32-2}
    {1,2^32-1}
    

    XORO32 does a double iteration and outputs 32-bit data in the following order:
    {2,1}
    {4,3}
    ...
    {2^32-2,2^32-3}
    {1,2^32-1}
    {3,2}
    ...
    {2^32-1,2^23-2}
    

    Although the order is different the data are the same, i.e. the pair distribution is the XORO32 output distribution and therefore a good test of the XORO32 randomness. The xoroshiro32++ 16-bit output is equidistributed so that every non-zero values occurs 2^16 times and zero 2^16-1 times, but the XORO32 32-bit output is not equidistributed because the ++ scrambler is only 1-dimensionally equidistributed.

    It would have been very easy to make the C code work exactly like XORO32 but it is quicker the way it is now. If one candidate has the same distributions for either method then they all will and it seems that is the case, so thanks for checking. Every 16-bit output occurs twice and the C code interleaves the two ~2GB blocks that are separate with XORO32. Although the byte value at any particular address is of no consequence as it's the overall frequency distribution we want, the byte addresses differ by a one bit rotation, more or less, in the two methods.

  • TonyB_TonyB_ Posts: 2,099
    edited 2018-10-10 22:33
    First 64 XORO32 outputs for [13,5,10,9] when seed = 1.
    62690201
    12A2AE16
    D7194AE8
    984B0C52
    743C1DF1
    BCC6DBA0
    746C34C9
    07FF3643
    C642BBC0
    594EAA85
    701AD05A
    F3AEA328
    695A67EE
    93C6C140
    4964F5E1
    FC575E24
    A638D7AD
    181AE233
    7BE766B5
    5CBD8445
    49A45F17
    C687C751
    175E7CA8
    8BBA386F
    46776DFD
    62BFF63A
    4F64FD4F
    5E5A15D9
    B4E1379D
    2E1E01C0
    6A806734
    B2D8BDB8
    C4645DBB
    99D47BC4
    650F3E11
    C0E82B43
    82C2B0A0
    9D53DC65
    E0C12D02
    92249C67
    84638C3F
    2A36E209
    640EF1B9
    D236A707
    184A9FE3
    8A183209
    947E9886
    FC1232D4
    F6A2D1CA
    FE8A985B
    B93481C2
    22425E3A
    8EE02987
    E12D494E
    E72C03B4
    EE576CA0
    5C0B072E
    6FFB0598
    4D699199
    16A97FAE
    DC090AAC
    689A45D1
    E280AF76
    436BCA2D
    
  • evanhevanh Posts: 15,091
    What was the pairing for, originally?
  • evanhevanh Posts: 15,091
    Here's the alternate source added for filling out what was the pairs count array, it's verbatim from the culling/gridding source:
    	#ifdef dITER
    		next_prn = rotr( (drnword_t)nextxoro() | ((drnword_t)nextxoro() << ACCUM_SIZE), SAMPLE_POSITION ) & SAMPLE_MASK;
    		pairs[next_prn]++;
    	#else
    		next_prn = rotr( nextxoro(), SAMPLE_POSITION ) & SAMPLE_MASK;
    		pairs[ prev_prn | (next_prn << SAMPLE_SIZE) ]++;  // 8-bit limited tallies
    		prev_prn = next_prn;
    	#endif
    
  • evanh wrote: »
    What was the pairing for, originally?

    Originally, I looked at pairs of outputs to see whether they were duplicates. I knew the distribution would be binomial from one of Melissa's posts. The max frequency is less than 2^16 so it was easy to do on my PC. I soon realised a generalised pair distribution would be much more accurate, as every output would count instead of only one in 2^16, but my PC had insufficient RAM. A little while later it dawned on me that this pair distribution is identical to the XORO32 output distribution.

    The distributions test XORO32 as a 32-bit generator. PractRand and the Crush suite test xoroshiro32++ as a 16-bit generator. A good score for one size does not imply a good score in the other. We need the overall best score in both and I think [13,5,10,9] gives us that.


  • evanhevanh Posts: 15,091
    Looks like the new code is running about 5% slower. Hmm, it's still doing 4 G loops of the pairs array filling, which makes it 8 G calls to nextxoro(). I think that might be over doing it.

    Wow, twice the number of random numbers generated but it's still the same number of writes to main memory! DRAM speed is hitting hard!
  • evanhevanh Posts: 15,091
    edited 2018-10-05 13:55
    Actually, the DRAM performance limit is worse than a simple write because it has to be a read-modify-write to do each increment.
  • evanh wrote: »
    Actually, the DRAM performance limit is worse than a simple write because it has to be a read-modify-write to do each increment.

    Yes and it's only changing a byte at a time, but I don't see any way to make it faster.

    XORO32 [13,5,10,9] outputs agree in BASIC and x86. Should be verified independently, though. Next stop is the modified Verilog.
  • evanhevanh Posts: 15,091
    edited 2018-10-05 14:10
    Tony,
    You realise, because of the 8 G calls, it's doubling up on every occurrence. All the pair[] tallies will be even values as a result.

  • TonyB_TonyB_ Posts: 2,099
    edited 2018-10-05 14:57
    evanh wrote: »
    Tony,
    You realise, because of the 8 G calls, it's doubling up on every occurrence. All the pair[] tallies will be even values as a result.

    You're doing 8G iterations to get 4G pairs now. Before it was 4G iterations as previous one was saved. Nothing else has changed. Remember period is 2^32-1, therefore a particular output that provides low word of address first time around provides high word the second time around and vice-versa.
  • evanhevanh Posts: 15,091
    Oh, right. I figured the total engine iterations shouldn't change. Doh!

  • evanhevanh Posts: 15,091
    Here's a quick selection using the dITER compile switch:
  • evanh wrote: »
    Here's a quick selection using the dITER compile switch:

    Thanks Evan, the two different ways to generate the distributions agree.
  • TonyB_TonyB_ Posts: 2,099
    edited 2018-10-06 02:10
    I think this post shows Verilog for XORO32 instruction, repeated immediately below.

    [14,2,7,5] current:
    wire [15:0] xoro32z	= d[31:16] ^ d[15:0];			// first iteration
    wire [31:0] xoro32y	= { xoro32z[8:0], xoro32z[15:9],
    			    {d[1:0], d[15:2]} ^
    			    {xoro32z[13:0], 2'b0} ^ xoro32z };
    
    wire [15:0] xoro32x	= xoro32y[31:16] ^ xoro32y[15:0];	// second iteration
    wire [31:0] xoro32	= { xoro32x[8:0], xoro32x[15:9],	// xoro32 = d result (state store)
    			    {xoro32y[1:0], xoro32y[15:2]} ^
    			    {xoro32x[13:0], 2'b0} ^ xoro32x };
    
    wire [15:0] xoro32a	= d[31:16] + d[15:0];			// hashing sum of input state
    wire [15:0] xoro32b	= xoro32y[31:16] + xoro32y[15:0];	// hashing sum of first iterator state output
    
    assign xoro32r	= { {xoro32b[10:0], xoro32b[15:11]} + xoro32y[15:0],
    		    {xoro32a[10:0], xoro32a[15:11]} + d[15:0] }; // xoro32r = prng result, next instruction's s value
    

    [13,5,10,9] proposed:
    wire [15:0] xoro32z	= d[31:16] ^ d[15:0];			// first iteration
    wire [31:0] xoro32y	= { xoro32z[5:0], xoro32z[15:6],
    			    {d[2:0], d[15:3]} ^
    			    {xoro32z[10:0], 5'b0} ^ xoro32z };
    
    wire [15:0] xoro32x	= xoro32y[31:16] ^ xoro32y[15:0];	// second iteration
    wire [31:0] xoro32	= { xoro32x[5:0], xoro32x[15:6],	// xoro32 = d result (state store)
    			    {xoro32y[2:0], xoro32y[15:3]} ^
    			    {xoro32x[10:0], 5'b0} ^ xoro32x };
    
    wire [15:0] xoro32a	= d[31:16] + d[15:0];			// hashing sum of input state
    wire [15:0] xoro32b	= xoro32y[31:16] + xoro32y[15:0];	// hashing sum of first iterator state output
    
    assign xoro32r	= { {xoro32b[6:0], xoro32b[15:7]} + xoro32y[15:0],
    		    {xoro32a[6:0], xoro32a[15:7]} + d[15:0] };	// xoro32r = prng result, next instruction's s value
    

    xoro32y, xoro32 and xoro32r to be changed.
    If the above is correct, it should produce the output here.

    P.S. Good to see the old code blocks back as the recent change was affecting productivity.
  • evanhevanh Posts: 15,091
    Isn't this the change now?
  • Does the proposed [13,5,10,9] Verilog look right to you, Evan?

    I'm part way through creating some tables that show how good it is. I'll finish later today.
  • evanhevanh Posts: 15,091
    Looks right to me. I've commented the constants for quicker reading:
    //  Verilog HDL for Xoroshiro32++ candidate [13 5 10 9]
    ///////////////////////////////////////////////////////////
    
    wire [15:0] xoro32z	= d[31:16] ^ d[15:0];			// first iteration
    wire [31:0] xoro32y	= { xoro32z[5:0], xoro32z[15:6],        // rol #10
    			    {d[2:0], d[15:3]} ^                 // rol #13
    			    {xoro32z[10:0], 5'b0} ^ xoro32z };  // shl #5
    
    wire [15:0] xoro32x	= xoro32y[31:16] ^ xoro32y[15:0];	// second iteration
    wire [31:0] xoro32	= { xoro32x[5:0], xoro32x[15:6],	// rol #10, xoro32 = d result (state store)
    			    {xoro32y[2:0], xoro32y[15:3]} ^     // rol #13
    			    {xoro32x[10:0], 5'b0} ^ xoro32x };  // shl #5
    
    wire [15:0] xoro32a	= d[31:16] + d[15:0];			// hashing sum of input state
    wire [15:0] xoro32b	= xoro32y[31:16] + xoro32y[15:0];	// hashing sum of first iterator state output
    
    assign xoro32r	= { {xoro32b[6:0], xoro32b[15:7]} + xoro32y[15:0],
    		    {xoro32a[6:0], xoro32a[15:7]} + d[15:0] };	// rol #9, xoro32r = prng result, next instruction's s value
    
  • Thanks, Evan. It's a good idea to add the comments as a reminder of how it works.
  • TonyB_TonyB_ Posts: 2,099
    edited 2018-10-07 00:01
    deleted
  • TonyB_TonyB_ Posts: 2,099
    edited 2018-10-07 00:01
    deleted
  • TonyB_TonyB_ Posts: 2,099
    edited 2018-10-07 00:02
    deleted
  • TonyB_TonyB_ Posts: 2,099
    edited 2018-10-07 00:43
    The following three posts show distribution results for selected candidates. |Observed-Expected| is the absolute error.
  • TonyB_TonyB_ Posts: 2,099
    edited 2018-10-06 23:55
    Pair frequency |Observed-Expected|
    # |Observed-Expected|
    # a,  b,  c,  d,     pfreq0,     pfreq1,     pfreq2,     pfreq3,     pfreq4,     pfreq5,     pfreq6,     pfreq7,     pfreq8,     pfreq9,    pfreq10,    pfreq11,    pfreq12
    # scro   , , , ,      12447,      14664,       6423,       9195,       2453,       1491,        434,        120,        173,         34,         13,          4,          0
     13,  5, 10,  9,      78899,      89687,      41504,      42541,       5417,       2691,       1688,         89,         26,         49,         28,          6,          1
     13,  5,  8, 10,     119611,     135729,      56511,      50169,      14837,       6873,       1113,        677,        287,         14,         37,          5,          1
     14,  2,  7,  6,     693240,     705595,     333559,     116940,     150145,      56973,      17844,       2842,       1044,        126,         11,          8,          2
      6,  2,  3,  9,     700291,     715125,     325707,     114279,     137714,      64907,      18365,       4083,       1164,         58,         16,         13,          2
      3,  2,  6,  9,    1852203,    1862261,     896846,     283035,     392180,     168742,      48919,      11809,       1791,        368,         53,          3,          3
     14,  2,  7,  5,    3369116,    3393152,    1658965,     561575,     713103,     302470,      83700,      18360,       3320,        409,         63,          1,          0
    
  • TonyB_TonyB_ Posts: 2,099
    edited 2018-10-06 23:54
    Zero run frequency |Observed-Expected|
    # |Observed-Expected|
    # a,  b,  c,  d,     zfreq1,     zfreq2,     zfreq3,     zfreq4,     zfreq5,     zfreq6,     zfreq7,     zfreq8,     zfreq9,    zfreq10,    zfreq11,    zfreq12,    zfreq13,    zfreq14,    zfreq15,    zfreq16,    zfreq17,    zfreq18,    zfreq19,    zfreq20,    zfreq21
    # scro   , , , ,      17271,      11651,       8070,       2037,       2957,        444,        775,        515,        427,        230,        275,        148,         72,         24,         15,         19,          8,          4,         11,          3,          0
     13,  5, 10,  9,     274316,      12986,      14125,      38593,      19943,      10341,       7191,       4027,        574,        852,        317,        255,         74,        107,          7,          9,          2,          0,          4,          0,          2
     13,  5,  8, 10,     399287,      37005,      16381,      29568,      11097,       8837,       2939,       1086,        453,        238,         53,        138,        106,         15,         19,         16,          9,          4,          1,          1,          0
     14,  2,  7,  6,      62101,      44563,      23048,      37978,      31150,      23066,      11996,       5767,       4211,       1980,        830,        406,        222,         30,          9,          9,          1,          3,          3,          1,          3
      6,  2,  3,  9,     262462,     352036,     114239,      57912,      10014,      11589,      11818,       5940,       3476,       1584,        765,        527,        137,         56,         32,         36,         10,          7,          3,          1,          1
      3,  2,  6,  9,    1118583,      79497,     214756,     182201,     137546,      77988,      39744,      19549,       9537,       4176,       1966,        772,        315,        189,         33,         33,         18,         10,          2,          2,          1
     14,  2,  7,  5,    1866306,     236796,     410816,     336427,     246302,     139743,      72213,      34671,      16523,       7142,       2820,       1207,        475,        232,        107,         51,         21,          9,          1,          1,          0
    
  • TonyB_TonyB_ Posts: 2,099
    edited 2018-10-06 23:52
    Non-zero run frequency |Observed-Expected|
    # |Observed-Expected|
    # a,  b,  c,  d,    nzfreq1,    nzfreq2,    nzfreq3,    nzfreq4,    nzfreq5,    nzfreq6,    nzfreq7,    nzfreq8,    nzfreq9,   nzfreq10,   nzfreq11,   nzfreq12,   nzfreq13,   nzfreq14,   nzfreq15,   nzfreq16,   nzfreq17,   nzfreq18,   nzfreq19,   nzfreq20,   nzfreq21,   nzfreq22,   nzfreq23,   nzfreq24,   nzfreq25,   nzfreq26,   nzfreq27,   nzfreq28,   nzfreq29,   nzfreq30,   nzfreq31,   nzfreq32,   nzfreq33,   nzfreq34,   nzfreq35,   nzfreq36,   nzfreq37,   nzfreq38,   nzfreq39,   nzfreq40,   nzfreq41,   nzfreq42,   nzfreq43,   nzfreq44,   nzfreq45
    # scro   , , , ,      17548,       8203,       8451,      18153,      10472,       8301,       7820,       5494,        259,       3692,         14,       1199,        188,        227,        320,        756,        744,        283,        272,        184,        180,         61,         78,         21,         61,         16,         17,         31,         26,          5,         38,         12,          1,          2,          2,          0,          6,          2,          3,          4,          2,          0,          2,          1,          1
     13,  5, 10,  9,     123916,      28906,      96761,      13003,      13403,      24422,         86,       9758,      20511,       9668,       5791,       4952,       1724,       3477,       1213,        975,        212,        691,        201,         20,        140,         20,        114,        188,          3,         30,         10,         94,         14,         55,          4,         20,          6,          1,          1,          6,          9,          1,          0,          2,          1,          2,          0,          0,          1
     13,  5,  8, 10,     301733,      12768,     111630,      26604,       7345,      34203,      18037,      16628,       9519,       9368,      13916,       4820,       4130,       2393,       1137,         61,        892,        513,        299,        107,        222,        151,         51,         98,        210,         37,         44,         10,         36,         18,         41,          5,         12,         16,         18,          5,         10,          2,          9,          5,          2,          1,          0,          2,          0
     14,  2,  7,  6,     461338,      61444,      84874,     129163,      87316,      32725,      40235,      27583,      13636,       3551,       3536,       8258,       8282,       5669,       4864,       3407,       3451,       2262,       2161,       1686,       1247,        752,        572,        396,        178,        189,        194,         36,        107,         47,         33,         29,         23,         15,          9,          6,          8,          7,          5,          0,          1,          2,          2,          0,          0
      6,  2,  3,  9,     164117,     201875,      89806,     204802,      64224,      64097,      22963,      24718,      40474,      35969,      32163,      21653,      17188,      14458,       9364,       8544,       5381,       3922,       2496,       1889,       1204,        724,        661,        269,        281,        170,        153,        109,         88,         15,         42,          5,          6,         18,         18,         10,          4,          1,          0,          1,          0,          2,          1,          0,          1
      3,  2,  6,  9,     356549,     212743,     188614,     299169,     147752,      84527,      32412,      21264,       3829,      13929,      20684,      18030,      16223,      11768,      10150,       7530,       5548,       4239,       2753,       1775,       1316,        898,        565,        427,        420,        169,        101,        110,         18,         59,         53,         24,         23,          2,         13,          1,          3,          7,          3,          0,          0,          1,          1,          1,          1
     14,  2,  7,  5,     796603,     449094,     371827,     495986,     298229,     174131,      28618,      22581,      19650,      23117,      35575,      29984,      22917,      20059,      17747,      11176,       8308,       5917,       4938,       3120,       2176,       1456,       1062,        829,        579,        265,        301,        104,         89,         59,         20,         20,         24,         20,          8,          4,          8,          0,          1,          1,          0,          3,          1,          2,          1
    
  • TonyB_TonyB_ Posts: 2,099
    edited 2018-10-07 00:46
    The next three posts show distribution results using the Chi-Square goodness of fit test. These show most clearly the difference between the current [14,2,7,5] and proposed [13,5,10,9] constants in the XORO32 instruction when considering the output as a single 32-bit value.

    Scroll to the extreme right to see the total for all frequencies.
  • TonyB_TonyB_ Posts: 2,099
    edited 2018-10-07 00:23
    Pair frequency Chi-Square = (Observed-Expected)^2/Expected
    # (Observed-Expected)^2/Expected
    # a,  b,  c,  d,     pfreq0,     pfreq1,     pfreq2,     pfreq3,     pfreq4,     pfreq5,     pfreq6,     pfreq7,     pfreq8,     pfreq9,    pfreq10,    pfreq11,    pfreq12,      total
    # scro   , , , ,          0,          0,          0,          0,          0,          0,          0,          0,          1,          0,          0,          0,          0,          3
     13,  5, 10,  9,          4,          5,          2,          7,          0,          1,          1,          0,          0,          1,          2,          1,          0,         24
     13,  5,  8, 10,          9,         12,          4,         10,          3,          4,          1,          1,          2,          0,          3,          1,          0,         50
     14,  2,  7,  6,        304,        315,        141,         52,        342,        247,        145,         26,         28,          4,          0,          2,          1,       1606
      6,  2,  3,  9,        310,        324,        134,         50,        288,        320,        154,         53,         35,          1,          1,          4,          1,       1674
      3,  2,  6,  9,       2171,       2195,       1018,        304,       2336,       2163,       1090,        445,         82,         31,          6,          0,          3,      11845
     14,  2,  7,  5,       7184,       7287,       3484,       1198,       7724,       6948,       3192,       1075,        281,         38,          9,          0,          0,      38421
    
Sign In or Register to comment.