Random/LFSR on P2

cgracey · 2017-11-06 18:21

Cluso, the timing diagram is at the top of the instructions_v26.txt file.

You'll need a wide display to see this properly:

clk
_________------------____________------------____________------------____________------------____________------------____________------------____________-
    
         |                       |                       |                       |                       |                       |                       |
rdRAM Ib |-------+               |              rdRAM Ic |-------+               |              rdRAM Id |-------+               |              rdRAM Ie |
         |       |               |                       |       |               |                       |       |               |                       |
latch Da |---+   +----> rdRAM Db |------------> latch Db |---+   +----> rdRAM Dc |------------> latch Dc |---+   +----> rdRAM Dd |------------> latch Dd |
latch Sa |---+   +----> rdRAM Sb |------------> latch Sb |---+   +----> rdRAM Sc |------------> latch Sc |---+   +----> rdRAM Sd |------------> latch Sd |
latch Ia |---+   +----> latch Ib |------------> latch Ib |---+   +----> latch Ic |------------> latch Ic |---+   +----> latch Id |------------> latch Id |
         |   |                   |                       |   |                   |                       |   |                   |                       |
         |   +------------------ALU-----------> wrRAM Ra |   +------------------ALU-----------> wrRAM Rb |   +------------------ALU-----------> wrRAM Rc |
         |                       |                       |                       |                       |                       |                       |
         |                       |  stall/done = 'gox'   |                       |  stall/done = 'gox'   |                       |  stall/done = 'gox'   |
         |         'get'         |        done = 'go'    |         'get'         |        done = 'go'    |         'get'         |        done = 'go'    |

Cluso99 · 2017-11-06 18:44

Thanks Chip

evanh · 2017-11-06 21:23

Cluso99 wrote: »
Thanks Chip.
A long time ago you posted the basic clock processing for the P2. Not sure where that was and if it is even current. How does it work now?
Here's the P1 for example
I d S D e R
        I d S D e R

Drawn that way, Prop2 is:

I  SD e  e  R
      I  SD e  e  R
            I  SD e  e  R

EDIT: The S and D ALU ports of the second instruction can be fed straight from the first instruction completion - effectively a forced immediate addressing mode on the second instruction. Or the second instruction's SD operand substitution can be done after the first's first execution stage.

I'm assuming Chip prefers operand substitution where possible but what use does a random number have as anything other than an immediate number?

EDIT2: Chip did say the second execution stage is basically all mux'ing. I'm not sure of the implications there. Does that mean that none of the prefixing instructions, e.g. ALTS, can do operand substitution? I've long forgotten how those instructions work ... (Now I'm late for work!)

evanh · 2017-11-06 23:24

And the AVR would be this:

I SDe R
   I SDe R
      I SDe R

TonyB_ · 2017-11-07 00:51

TonyB_ wrote: »

evanh wrote: »

You might be amused to know, when only using 16 bits or less, the quality scores for our XORO32 doing the even-odd looping trick are generally a little better than straight incremental. There are a few 1 GB scores in the word0 column, and even some 4 GB and 8 GB scores in other columns too. It's also more of a mixed bag; [14 2 7] no longer looks so great, for example.

For the testing, I modified the source to double iterate on every call to next(). Nothing else changed.

Remember when you swapped bits 1 and 9? [14,2,7] got worse but [15,3,6] got better. I wonder what the double-iteration score would be with that swapping?

Evan, could you please try a double-iteration test with bits 1 and 9 swapped? Also it would be good to add an lsBit test to this to see how bit 0 performs.

I think this will be the last test, I hope so anyway. PractRand has been an excellent tool for comparing triplets, and [14,2,7] has always been the one in the Verilog. However, XORO32 has changed and I think [15,3,6] is the better choice now. That's the only modification XORO32 really needs.

cgracey · 2017-11-07 02:50

If [15,3,6] is better, we should switch over to it.

evanh · 2017-11-07 04:12

In that case, I'd recommend also moving the parity bit to the most significant bit position.

My impression is, without subsequent summing, a final bit shuffle is only hiding things from PractRand. I doubt it's a quality improvement.

To change the subject a little, that last odd-even scoring also had a lowly 32MB score for [15 3 6] as well as [14 2 7]. Tony is picking the score columns that will likely be used in production code, but there are many more possible sampling variants that can show differing quality.

I don't know how relevant such aligned scores are to the true quality of an algorithm. Maybe what Tony is picking out is the right answer.

Scro, is there a rigid answer to this question?


evanh · 2017-11-07 04:37

The question isn't all that rigidly defined, is it?

The question is one of over-all quality. Doing lots of sampling variations like I've been doing, is there ever going to be a flat set of scores for a given algorithm? Or should we only expect to settle for finding a combination and config that makes the common uses produce good scores?

evanh · 2017-11-07 10:51

TonyB_ wrote: »

Evan, could you please try a double-iteration test with bits 1 and 9 swapped? Also it would be good to add an lsBit test to this to see how bit 0 performs.

Ouch, the whole set of candidates is taking up to an hour to run now! That increase will be solely down to the average increase in score sizes. Those old times, ~12 minutes, must have been just before we'd found the full-width parity improvement.

Here's the earlier scoring with parity at msBit position - http://forums.parallax.com/discussion/comment/1424022/#Comment_1424022

Scoring with double-iterating, parity at msBit and exchanged bit8 with bit0:

  Xoroshiro32+p PractRand Score Table - Run 2017-11-07 23:48:10 +1300

Combination    Word0   Word1   Word2  msByte  Byte04   Byte2   Byte1   Byte0   msBit
=====================================================================================
 [ 1  2  8]     64M    128M    256M    128M    256M     16M      4M    128M      1G
 [ 2  1 11]     64M    128M     64M    128M    128M     32M    256M    512M      1G
 [ 2  1 15]     32M     64M     32M      1M      1M    512K    512K      1M      1G
 [ 2  2  7]     64M    256M    128M    128M    128M     32M      8M     32M      1G
 [ 2  6 15]    128M    512M    512M    128M    256M      2M      1M      2M      1G
 [ 2  8  7]    256M     16G      8G    128M      1G      1G    512M    512M      1G
 [ 2  9  9]    128M      2G    512M     16M      8M     32M    128M     32M      1G
 [ 2 11  3]     64M      1G      1G    128M      8M      4M      4M      2M      1G
 [ 3  2  6]    512M      8G      8G    128M      1G      1G      1G    512M      1G
 [ 3  3 10]     64M      2G      1G     64M    128M     16M    128M     64M      1G
 [ 3 11  2]     64M    256M     64M    128M      2M      1M      2M    512K      1G
 [ 3 11 14]    512M     16G      8G    128M    128M    512M    512M      1G      1G
 [ 4  1  9]    512K    512K      1M    512K     64M     32M    512K    512K      1G
 [ 4  7 15]     64M    512M    512M    128M     32M    512M    512M    256M      1G
 [ 4  8  5]    512M     16G      8G    128M      1G      1G    512M    512M      1G
 [ 5  2  6]    512M     16G      4G    128M      2G      1G    128M      1G      1G
 [ 5  8  4]    128M    128M    128M      8M      1G    512M    512M    512M      1G
 [ 5 14 12]    512M      8G      8G    128M      1G      1G    512M    512M      1G
 [ 6  2  3]      1G     16G     16G    128M      1G      1G      1G      1G      1G
 [ 6  2  5]    256M      4G      4G    128M     64M    512M     32M     64M      1G
 [ 6  2 11]    512M      8G      8G    128M      2G      1G    512M      1G      1G
 [ 6  3 15]      1G     16G      8G    128M      1G     64M    128M     64M      1G
 [ 6 14 11]    512M      2G      4G     64M     64M      1G    128M    128M      1G
 [ 7  1  8]     32M    128M    256M     32M     64M     16M      8M     16M      1G
 [ 7  2  2]     64M    256M    512M     64M    256M     64M      8M     32M      1G
 [ 7  2 10]    128M    512M      2G    128M    512M     64M     16M     64M      1G
 [ 7  2 14]    512M     16G      8G    128M      1G      1G     32M    128M      1G
 [ 7  8  2]    512M      8G      8G    128M      1G    256M    256M    512M      1G
 [ 7 10 10]    256M      4G      4G    128M    128M    128M    128M    256M      1G
 [ 7 15  8]    512K      1M      1M    256K      2M    256K     64K    256K      1G
 [ 8  1  7]     16M     64M     64M     16M    128M     16M      4M     16M      1G
 [ 8  2  1]     64M    128M    256M     16M    256M     16M      8M      8M      1G
 [ 8  5 13]     64M    128M    128M    128M      8M     32M     64M    512M      1G
 [ 8  7 15]      4M     16M     16M      8M     32M      8M      2M      4M      1G
 [ 8  9 13]    512M      8G      4G    128M    256M    512M    128M    256M      1G
 [ 8 15  7]    512K      1M      1M    256K      1M    256K    128K    256K      1G
 [ 9  1  4]      8M      8M     32M      4M     16M     64M      2M      8M    128M
 [ 9  2 14]    512M      8G      4G    128M    512M    128M     64M    128M      1G
 [ 9  8 14]    512M     16G      8G    128M    256M    128M     32M    256M      1G
 [ 9  9  2]    512M      4G      2G    128M    256M    256M    256M    512M      1G
 [10  2  7]    128M    512M    512M    128M      1G     64M     16M     32M      1G
 [10  3  3]     64M    512M    512M    128M     64M    256M    256M    128M      1G
 [10  3 11]    512M      8G      8G    128M      2G      1G    128M    512M      1G
 [10  5 13]    512M     16G      8G    128M    256M     32M      1G     32M      1G
 [10  7 11]      1G     16G      8G    128M    512M      1G    512M    512M      1G
 [10 10  7]    256M      4G      4G    128M     16M    128M    128M    256M      1G
 [11  1  2]    512M     16G      4G    128M      1G    512M    512M    512M      1G
 [11  1 14]    512M     16G      8G    128M    256M    512M    512M    512M      1G
 [11  2  6]    512M     16G      8G    128M      1G    512M    512M    256M      1G
 [11  3 10]    128M      2G      2G    128M    128M    256M    512M     32M      1G
 [11  7 10]    512M     16G      8G    128M    128M    256M     64M    128M      1G
 [11  8 12]    512M     16G     16G    128M      1G      1G    512M    512M      1G
 [11 10 12]    128M      8G      8G    128M    512M    256M    512M    512M      1G
 [11 11 14]    256M     16G      8G    128M      1G    512M    256M    512M      1G
 [11 14  6]    512M      8G      8G    128M    128M    512M    512M    256M      1G
 [12  4 15]    128M      1G      1G    128M      1G    512M    512M    512M      1G
 [12  8 11]     64M    256M    512M    128M     64M     32M     32M      8M      1G
 [12 10 11]    512M      1G      2G    128M     32M    512M     64M     64M      1G
 [12 10 13]    128M      8G      8G    128M      1G    512M    512M    512M      1G
 [12 14  5]    128M      4G      4G    128M      1G      1G    512M    512M      1G
 [13  3 14]    512M      4G      1G    128M    512M    512M    512M    512M      1G
 [13  5  8]    128M      1G      4G    128M    256M    512M    512M      1G      1G
 [13  5 10]    512M     16G      8G    128M      1G      1G    512M      1G      1G
 [13  9  8]    512M      2G      8G    128M    256M      1G    512M      1G      1G
 [13 10 12]    256M    512M    128M    128M    512M    512M    512M    512M      1G
 [13 11 14]     64M      4G      1G    128M    512M    512M    512M    512M      1G
 [13 12 14]     16M      1G    256M    128M    128M    512M    512M    512M      1G
 [13 13 14]    512M      4G      1G    128M    512M    512M    512M    256M      1G
 [14  1 11]    512M      8G     16G    128M      1G      1G      1G    256M      1G
 [14  2  7]    256M     16G      8G    128M     64M      1G     32M     64M      1G
 [14  2  9]    512M     16G     16G    128M     64M    128M     64M    128M      1G
 [14  3 13]    128M     16M     32M    128M      2M      8M    512K    512K      1G
 [14  8  9]    512M     16G      8G    128M      2G      1G    512M    512M      1G
 [14 11  3]    512M     16G      8G    128M    128M    512M    512M    512M      1G
 [14 11 11]    512M      2G      4G    128M    128M    256M    256M      1G      1G
 [14 11 13]    128M     32M     16M    128M      1M    512K      1M    512K      1G
 [14 12 13]    128M     16M      8M    128M    512K    512K      2M    512K      1G
 [14 13 13]     32M     16M     16M    128M      1M      1M      2M    512K      1G
 [15  1  2]    512M    512M      4G    128M     32M     64M     64M    256M      1G
 [15  3  6]    512M     16G     16G    128M    128M    128M    512M     64M      1G
 [15  4 12]    256M    512M      1G    128M    512M    512M    512M    512M      1G
 [15  6  2]    512M     16G      8G    128M    512M    128M    256M    512M      1G
 [15  7  4]      1G      8G      2G    128M    256M    512M    512M    512M      1G
 [15  7  8]     32M    128M    128M    128M    512M    512M    128M    256M      1G

evanh · 2017-11-07 10:51

And the scoring you've asked for - double-iterating, parity at bit0 and exchanged bit9 with bit1:

  Xoroshiro32+p PractRand Score Table - Run 2017-11-07 22:30:28 +1300

Combination    Word0   Word1   Word2  msByte  Byte04   Byte2   Byte1   Byte0   msBit
=====================================================================================
 [ 1  2  8]     32M     64M    128M     64M    128M      4M    128M     64M      1G
 [ 2  1 11]     64M    128M     64M    128M     32M    256M    512M      2G      1G
 [ 2  1 15]    128M     32M     16M      2M    512K    512K      1M      4M      1G
 [ 2  2  7]    128M    128M    128M    128M    128M      8M     32M    128M      1G
 [ 2  6 15]    256M    512M    256M      2G     32M      1M      2M      4M      1G
 [ 2  8  7]    256M      8G      4G      1G    512M    512M    512M      1G      1G
 [ 2  9  9]    256M      1G    512M      4M     16M    128M     32M     32M      1G
 [ 2 11  3]    128M    256M    256M     64M      4M      4M      2M      2M      1G
 [ 3  2  6]    512M      8G      4G      1G      1G      1G    512M      2G      1G
 [ 3  3 10]    256M    128M      1G     64M     32M    128M     64M     64M      1G
 [ 3 11  2]    128M    128M    128M      4M      1M      2M    512K    512K      1G
 [ 3 11 14]    512M     16G     16G      2G    128M    512M      1G      4G      1G
 [ 4  1  9]    512K    512K    512K    512K    256M    512K    512K    512K    512M
 [ 4  7 15]     64M    512M    512M     32M    128M    512M    256M      1G      1G
 [ 4  8  5]      1G      8G      8G      1G      1G    512M    512M      4G      1G
 [ 5  2  6]    512M      8G      4G      1G      1G    128M      1G      2G      1G
 [ 5  8  4]    128M    128M    256M      8M      1G    512M    512M      2G      1G
 [ 5 14 12]    512M      8G      8G    512M      1G    512M    512M      2G    512M
 [ 6  2  3]      1G     16G      4G      1G      1G      1G      1G      2G    512M
 [ 6  2  5]    512M      4G      2G      2G    128M     32M     64M      2G      1G
 [ 6  2 11]    512M      8G      2G      1G    512M    512M      1G      2G      1G
 [ 6  3 15]      1G      8G      4G      1G     32M    128M     64M    256M      1G
 [ 6 14 11]      1G      4G      2G    128M     64M    128M    128M    256M    256M
 [ 7  1  8]     64M    128M    128M     16M     64M      8M     16M     32M      1G
 [ 7  2  2]    256M    512M    512M     64M    128M      8M     32M    128M      1G
 [ 7  2 10]    256M    512M    512M    128M    256M     16M     64M    512M      1G
 [ 7  2 14]    512M      4G      1G      1G      1G     32M    128M    512M    128M
 [ 7  8  2]    512M      8G      4G      1G    512M    256M    512M      2G      1G
 [ 7 10 10]    512M      2G      4G    256M    128M    128M    256M    512M      1G
 [ 7 15  8]    512K    512K    512K    256K      2M     64K    256K    256K      1G
 [ 8  1  7]     32M    128M     64M      8M     64M      4M     16M     32M      1G
 [ 8  2  1]     32M     64M     64M     32M    128M      8M      8M     16M    128M
 [ 8  5 13]     64M    128M    128M     64M     32M     64M    512M    512M      1G
 [ 8  7 15]      4M     16M     16M      4M     32M      2M      4M      4M      1G
 [ 8  9 13]    512M     16G      4G      1G     64M    128M    256M      2G      1G
 [ 8 15  7]    256K      1M    512K    128K      1M    128K    256K    128K    512M
 [ 9  1  4]      8M      2M     16M      2M     16M      2M      8M      8M    512M
 [ 9  2 14]    512M     16G      4G    512M    512M     64M    128M    256M      1G
 [ 9  8 14]    512M     16G      8G    512M      1G     32M    256M    512M    512M
 [ 9  9  2]      1G      4G    512M    128M    512M    256M    512M      2G      1G
 [10  2  7]    256M    256M    256M    128M    128M     16M     32M    128M      1G
 [10  3  3]    256M    128M    256M     16M     32M    256M    128M      1G      1G
 [10  3 11]      1G      4G      4G      2G      1G    128M    512M      2G      1G
 [10  5 13]    512M     16G      4G    512M     64M      1G     32M     64M      1G
 [10  7 11]      1G      8G      8G      1G    512M    512M    512M      1G      1G
 [10 10  7]    512M      2G      2G     32M    128M    128M    256M    256M      1G
 [11  1  2]    512M      2G      4G      1G    512M    512M    512M      2G      1G
 [11  1 14]    512M      8G      4G      1G    256M    512M    512M      2G      1G
 [11  2  6]      1G      8G      4G      1G      1G    512M    256M      2G      1G
 [11  3 10]    512M    128M      2G      1G    256M    512M     32M    256M      1G
 [11  7 10]    512M      8G      8G    256M    128M     64M    128M    256M      1G
 [11  8 12]    512M      8G      4G      1G      1G    512M    512M      4G      1G
 [11 10 12]    256M     16G      4G    512M    512M    512M    512M      1G      1G
 [11 11 14]    512M     16G      8G      1G      1G    256M    512M      1G      1G
 [11 14  6]    512M      8G      4G    256M    256M    512M    256M    512M      1G
 [12  4 15]    128M    512M    512M      1G      1G    512M    512M      2G      1G
 [12  8 11]     64M     64M    128M    512M     64M     32M      8M     32M      1G
 [12 10 11]    512M      1G    256M    512M     64M     64M     64M      2G      1G
 [12 10 13]    256M     16G      4G     64M      1G    512M    512M      2G      1G
 [12 14  5]    256M      4G      2G      1G      1G    512M    512M      2G      1G
 [13  3 14]    512M      4G      1G      1G    512M    512M    512M      4G      1G
 [13  5  8]    256M    512M      1G    512M     64M    512M      1G      2G      1G
 [13  5 10]    512M      8G      4G      1G     64M    512M      1G      2G      1G
 [13  9  8]    512M      2G      2G      1G    128M    512M      1G      2G      1G
 [13 10 12]    512M    512M    128M      1G    512M    512M    512M      1G      1G
 [13 11 14]    128M      2G    256M      1G    512M    512M    512M    128M      1G
 [13 12 14]     32M    512M    256M     32M    512M    512M    512M      4G      1G
 [13 13 14]    512M      2G    512M    512M    512M    512M    256M      1G      1G
 [14  1 11]      1G      8G      4G      1G    512M      1G    256M      1G      1G
 [14  2  7]    512M      8G      8G    512M      1G     32M     64M    512M      1G
 [14  2  9]      1G      8G      8G    128M    256M     64M    128M    256M      1G
 [14  3 13]    256M     16M      8M    256M      4M    512K    512K     16M      1G
 [14  8  9]    512M     16G      8G      1G      1G    512M    512M      1G      1G
 [14 11  3]    512M     16G      8G    512M    256M    512M    512M      1G      1G
 [14 11 11]    512M      2G      2G    512M    512M    256M      1G      1G      1G
 [14 11 13]    256M     32M     16M    256M      1M      1M    512K      2M      1G
 [14 12 13]    256M     16M      8M      8M    512K      2M    512K      2M      1G
 [14 13 13]     64M     16M      8M      8M    512K      2M    512K      2M      1G
 [15  1  2]    512M    512M      2G    512M    256M     64M    256M    512M      1G
 [15  3  6]      1G     16G      8G      2G      1G    512M     64M    256M      1G
 [15  4 12]    512M      1G    256M      2G    512M    512M    512M      1G      1G
 [15  6  2]    512M     16G      4G      1G    512M    256M    512M      1G      1G
 [15  7  4]    512M      4G      1G      1G    512M    512M    512M      2G      1G
 [15  7  8]     32M    128M    128M      1G    512M    128M    256M      2G      1G

evanh · 2017-11-07 11:54

Scoring with double-iterating and parity at msBit:

  Xoroshiro32+p PractRand Score Table - Run 2017-11-08 00:50:09 +1300

Combination    Word0   Word1   Word2  msByte  Byte04   Byte2   Byte1   Byte0   msBit
=====================================================================================
 [ 1  2  8]     32M     64M    128M     16M     32M    256M    128M      4M      1G
 [ 2  1 11]     64M    128M     64M    128M     64M    512M    512M    128M      1G
 [ 2  1 15]     32M     16M     16M    512K    512K    512K      1M      1M      1G
 [ 2  2  7]     32M     64M     64M     32M    256M     64M     32M      8M      1G
 [ 2  6 15]     64M    256M    256M      1G     64M      4M      1M      1M      1G
 [ 2  8  7]    256M     16G      8G      2G      1G    512M      1G    128M      1G
 [ 2  9  9]     64M    512M    512M    256M      4M     16M     32M    128M      1G
 [ 2 11  3]    128M      1G    512M    256M      8M      8M      8M      8M      1G
 [ 3  2  6]     64M      8G      4G      2G    512M    512M      1G    128M      1G
 [ 3  3 10]    128M    256M    256M    512M     64M     64M     64M    128M      1G
 [ 3 11  2]     64M    256M     64M     64M      4M      4M      4M      4M      1G
 [ 3 11 14]    512M      8G      4G      2G    128M    128M    512M    128M      1G
 [ 4  1  9]    512K    512K      1M    256K     32M     32M    512K    512K      1G
 [ 4  7 15]     64M    512M    512M      1G     32M    256M    256M    128M      1G
 [ 4  8  5]    512M      8G     16G      1G    512M    512M    512M    128M      1G
 [ 5  2  6]    512M      8G      4G      2G      1G      1G    512M     64M      1G
 [ 5  8  4]    128M    256M    128M      8M    512M    512M    512M    128M      1G
 [ 5 14 12]    256M      2G      2G      1G    256M      1G    512M    128M      1G
 [ 6  2  3]    128M      8G      8G      2G      2G      1G      1G    128M      1G
 [ 6  2  5]    512M      4G      8G    128M     64M     32M     32M     32M      1G
 [ 6  2 11]    512M    128M      8G      1G    512M    128M      1G    128M      1G
 [ 6  3 15]    512M     16G      8G      2G      1G     64M     32M    128M      1G
 [ 6 14 11]    512M     16G      8G      1G    128M    512M    512M    128M      1G
 [ 7  1  8]     16M     64M    128M     16M     32M     32M     16M      8M      1G
 [ 7  2  2]     32M    128M    512M     64M    512M     64M     32M      8M      1G
 [ 7  2 10]     64M    512M      1G    128M    256M    128M     64M     16M      1G
 [ 7  2 14]    512M     16G      8G      1G      1G    256M     64M     64M      1G
 [ 7  8  2]    512M     16G      8G      1G    128M    512M    512M    128M      1G
 [ 7 10 10]    512M      2G      2G    128M     64M     64M    128M    128M      1G
 [ 7 15  8]    256K      1M    512K    128K    512K    256K    256K     64K      1G
 [ 8  1  7]     16M     64M     64M     16M     32M     32M     16M      4M      1G
 [ 8  2  1]     16M     64M     64M     32M     32M     64M     32M      8M      1G
 [ 8  5 13]     64M    128M    128M     64M      8M     16M    512M     32M      1G
 [ 8  7 15]      4M      8M      8M      2M      2M      4M      4M      2M      1G
 [ 8  9 13]    512M      8G      4G      1G    512M    512M    128M    128M      1G
 [ 8 15  7]    256K    512K    512K    128K    256K    256K    256K    128K      1G
 [ 9  1  4]      8M      8M     32M      2M      8M    256M      8M      2M    128M
 [ 9  2 14]    512M     16G      8G      2G    256M    128M     64M     64M      1G
 [ 9  8 14]    512M     16G      8G    128M    256M    128M    128M     64M      1G
 [ 9  9  2]    512M      2G      2G     64M      1G    512M    512M    128M      1G
 [10  2  7]     32M    256M    512M    128M    256M     64M     32M     16M      1G
 [10  3  3]     64M    256M    256M    256M    256M    256M     64M    128M      1G
 [10  3 11]    512M      2G      4G      1G      1G    512M    128M    128M      1G
 [10  5 13]    512M      8G      8G    512M      1G     32M     32M    128M      1G
 [10  7 11]    512M     16G     16G      1G    256M    512M    512M    128M      1G
 [10 10  7]    256M      1G      2G    256M     16M     64M    256M    128M      1G
 [11  1  2]    512M     16G      8G      2G      1G    512M    512M    128M      1G
 [11  1 14]    512M      8G      4G      1G    512M    512M    512M    128M      1G
 [11  2  6]    256M    256M      8G      1G    512M    128M    512M    128M      1G
 [11  3 10]    256M    512M    512M      1G    128M    512M      1G    128M      1G
 [11  7 10]    512M     16G      8G      1G      1G    512M    256M     64M      1G
 [11  8 12]    512M     16G      8G      1G    512M    512M    512M    128M      1G
 [11 10 12]    512M      8G      4G      2G    256M    256M    256M    128M      1G
 [11 11 14]    512M      8G      1G      1G    512M    512M    512M    128M      1G
 [11 14  6]    512M      8G      4G      1G    512M    512M    512M    128M      1G
 [12  4 15]    128M    512M      1G      1G    512M    512M    512M    128M      1G
 [12  8 11]     64M     64M    128M      1G    256M      8M      8M     32M      1G
 [12 10 11]    256M      8G      4G      2G    256M     32M     32M     32M      1G
 [12 10 13]    512M     16G      8G      1G    512M    512M    512M    128M      1G
 [12 14  5]    128M      4G      4G      2G      1G    512M    512M    128M      1G
 [13  3 14]    512M      4G      1G      2G    512M    512M    512M    128M      1G
 [13  5  8]    128M      1G      2G      2G      1G    128M      1G    128M      1G
 [13  5 10]    512M      8G      8G    512M    512M      1G    512M    128M      1G
 [13  9  8]    512M      2G      8G      1G    128M    256M    512M    128M      1G
 [13 10 12]    128M    128M    128M      2G    512M    128M      1G    128M      1G
 [13 11 14]    512M      2G    512M      1G    512M    512M    256M    128M      1G
 [13 12 14]    512M      1G    512M      1G    512M    512M    512M    128M      1G
 [13 13 14]    512M      1G    512M      1G    512M    256M    256M    128M      1G
 [14  1 11]    512M     16G      8G      2G    512M    512M      1G    128M      1G
 [14  2  7]    512M     16G      8G     64M      1G    128M     32M     32M      1G
 [14  2  9]    512M     16G      8G      1G    256M    128M     64M     64M      1G
 [14  3 13]    128M     64M     64M      2G      8M      1M    512K    256K      1G
 [14  8  9]    512M     16G      8G      1G    512M      1G    512M    128M      1G
 [14 11  3]    512M     16G      8G      1G    512M    512M    512M    128M      1G
 [14 11 11]    512M      2G      2G    256M    128M    512M    512M    128M      1G
 [14 11 13]    128M     16M      8M      1G    256K    256K    256K    256K      1G
 [14 12 13]    128M      8M      8M      1G    256K    256K    256K    512K      1G
 [14 13 13]     32M      8M      4M      4M    256K    256K    256K    256K      1G
 [15  1  2]    256M    512M    512M    256M     32M    256M    256M     64M      1G
 [15  3  6]    512M     16G      8G      1G      1G     64M     32M    128M      1G
 [15  4 12]    128M      2G      2G      1G    512M    512M    512M    128M      1G
 [15  6  2]    512M     16G      8G      1G    512M    512M    256M    128M      1G
 [15  7  4]    512M      4G      2G      1G    128M    512M    512M    128M      1G
 [15  7  8]     32M     64M    128M      2G    512M    512M    256M    128M      1G

evanh · 2017-11-07 11:56

I'll do some more configurations tomorrow if Scro doesn't tell me I'm wasting my time.

scro · 2017-11-07 18:24

evanh wrote: »

I'll do some more configurations tomorrow if Scro doesn't tell me I'm wasting my time.

I don't know what you'd be doing to waste your time or not waste your time.

But if you're trying xoroshiro32plus with a parity bit replacing one bit of output, could you instead have the parity bit act as the carry-in on the addition that produces the output? That might be better, maybe.

TonyB_ · 2017-11-07 18:51

scro wrote: »

evanh wrote: »

I'll do some more configurations tomorrow if Scro doesn't tell me I'm wasting my time.

I don't know what you'd be doing to waste your time or not waste your time.

But if you're trying xoroshiro32plus with a parity bit replacing one bit of output, could you instead have the parity bit act as the carry-in on the addition that produces the output? That might be better, maybe.

The idea of using the parity is a bit of a novelty and it's probably not easy to see in all our previous discussions where the parity comes from. It's the parity of the full 17-bit result after the addition of the two 16-bit halves of the state.
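A minimal sketch of that parity calculation, in Python purely for illustration (the P2 does this in hardware). The function name is hypothetical, and placing the parity in bit 0 follows the `sum[15:1],parity` layout that appears in the summary tables in this thread; it is not taken from the Verilog.

```python
def sum_with_parity(s0: int, s1: int) -> int:
    """Return a 16-bit value: sum[15:1] with bit 0 replaced by parity.

    s0 and s1 are the two 16-bit halves of the generator state.
    """
    total = (s0 & 0xFFFF) + (s1 & 0xFFFF)  # 17-bit result, bit 16 is the carry
    parity = bin(total).count("1") & 1     # XOR of all 17 bits of the sum
    return (total & 0xFFFE) | parity       # keep sum[15:1], parity goes in bit 0
```

Note that the carry out of the 16-bit addition (bit 16) takes part in the parity even though it never appears in the output directly.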

TonyB_ · 2017-11-07 18:58

evanh wrote: »

TonyB_ wrote: »

Evan, could you please try a double-iteration test with bits 1 and 9 swapped? Also it would be good to add an lsBit test to this to see how bit 0 performs.

Ouch, the whole set of candidates is taking up to an hour to run now! That increase will be solely down to the average increase in score sizes. Those old times, ~12 minutes, must have been just before we'd found the full-width parity improvement.

Many thanks for today's results, Evan. I think you have done enough tests.

Here is a summary of the most important scores for the top two triplets with three combinations of sum and parity:

                   Top    Bot    Top    Bot
Triplet    Word   Byte   Byte    Bit    Bit   Iter.    PRN[15:0] =
------------------------------------------------------------------
[14,2,7]   512M     1G   512M     1G     1G   single
[15,3,6]   512M   512M     1G     1G     1G     "
                                                       sum[15:1],parity
[14,2,7]   512M   256M     1G     1G     1G   double
[15,3,6]     1G     1G     2G     1G     1G     "
------------------------------------------------------------------
[14,2,7]   512M   256M   512M     1G     1G   single	 
[15,3,6]   512M   512M     1G     1G     1G     "
                                                       sum[15:10],sum[1],sum[8:2],sum[9],parity
[14,2,7]   512M   512M   512M     1G     1G   double	             
[15,3,6]     1G     2G   256M     1G     1G     "
------------------------------------------------------------------
[14,2,7]   512M   256M   128M     1G     ?    single
[15,3,6]   512M   512M   128M     1G     ?      "
                                                       parity,sum[15:1]
[14,2,7]   512M    64M    32M     1G     ?    double	
[15,3,6]   512M     1G   128M     1G     ?      "
------------------------------------------------------------------
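The three `PRN[15:0]` layouts named in the table can be written out as bit manipulations. This is only a sketch in Python: the function names are hypothetical, and `total` is assumed to be the 17-bit sum of the two 16-bit state halves, with the parity taken over all 17 bits as described earlier in the thread.

```python
def parity17(total: int) -> int:
    """Parity (XOR of all bits) of a 17-bit sum."""
    return bin(total & 0x1FFFF).count("1") & 1

def layout_parity_lsb(total: int) -> int:
    """PRN = sum[15:1],parity"""
    return (total & 0xFFFE) | parity17(total)

def layout_swap_1_9(total: int) -> int:
    """PRN = sum[15:10],sum[1],sum[8:2],sum[9],parity"""
    out = total & 0xFDFC                 # keep sum[15:10] and sum[8:2] in place
    out |= ((total >> 1) & 1) << 9       # sum[1] moves up to PRN[9]
    out |= ((total >> 9) & 1) << 1       # sum[9] moves down to PRN[1]
    return out | parity17(total)         # parity in PRN[0]

def layout_parity_msb(total: int) -> int:
    """PRN = parity,sum[15:1]"""
    return (parity17(total) << 15) | ((total >> 1) & 0x7FFF)
```

All three carry the same information (sum[15:1] plus one parity bit); they differ only in which output position each bit lands in, which is why the byte and bit columns move around between the three sets of scores.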

Word size is 16 bits. Parity as msb has the worst scores, but it was a convenient way of doing a parity-only test, and its top-bit score has been copied to the bottom-bit column of the other two sets. Swapping bits 1 and 9 scores worse overall than not swapping, which leaves parity as lsb as the best option. Here it is again on its own:

                   Top    Bot    Top    Bot
Triplet    Word   Byte   Byte    Bit    Bit   Iter.    PRN[15:0] =
------------------------------------------------------------------
[14,2,7]   512M     1G   512M     1G     1G   single
[15,3,6]   512M   512M     1G     1G     1G     "
                                                       sum[15:1],parity
[14,2,7]   512M   256M     1G     1G     1G   double
[15,3,6]     1G     1G     2G     1G     1G     "
------------------------------------------------------------------

XORO32 does a double iteration and concatenates two 16-bit sum & parity results. Treating those as one long PRN, [14,2,7] is better in the top bytes and [15,3,6] better in the bottom bytes but otherwise they score the same.

If the best possible word or byte PRN is wanted from each XORO32 instruction, then [15,3,6] is clearly superior. Its word score of 1G is the best since we started using all 16 bits, and its bottom-byte score of 2G is the highest we have had.

We have always relied on the PractRand scores and they are saying choose [15,3,6].

evanh · 2017-11-07 21:11

scro wrote: »

evanh wrote: »

I'll do some more configurations tomorrow if Scro doesn't tell me I'm wasting my time.

I don't know what you'd be doing, so I can't say whether you'd be wasting your time or not.

The main attempts right now are shuffling the summing output bit order. This is the question I was trying to articulate earlier - Is there any real quality advantage to a fixed post-summing shuffle? Or are we just tricking PractRand?

scro · 2017-11-07 21:26

evanh wrote: »

scro wrote: »

evanh wrote: »

I'll do some more configurations tomorrow if Scro doesn't tell me I'm wasting my time.

I don't know what you'd be doing, so I can't say whether you'd be wasting your time or not.

The main attempts right now are shuffling the summing output bit order. This is the question I was trying to articulate earlier - Is there any real quality advantage to a fixed post-summing shuffle? Or are we just tricking PractRand?

Shuffling bit order in the final output? Yeah, that sounds pretty suspicious. Exactly how useless that is depends upon what test failures you're managing to avoid that way, but in general it sounds like a bad idea.

evanh · 2017-11-07 21:33

TonyB_ wrote: »

Many thanks for today's results, Evan. I think you have done enough tests.

The thing is, the shuffling is pretty arbitrary. The original bit9<->bit1 swap, intentionally 8 bits apart, was just to test if PractRand can be fooled - which seems easily done.

If we're going to start down this path we should do a decent amount of it to see what else comes up.

cgracey · 2017-11-07 21:56

So, it looks like we should switch from [14,2,7] to [15,3,6], right?

And our output is still {sum[15:1], ^sum[16:0]}, right?

cgracey · 2017-11-07 22:15

Does this look correct for [15,3,6]?

wire [15:0] xoro32z	= d[31:16] ^ d[15:0];			// first iteration
wire [31:0] xoro32y	= { xoro32z[9:0], xoro32z[15:10],
			    {d[0], d[15:1]} ^
			    {xoro32z[12:0], 3'b0} ^ xoro32z };

wire [15:0] xoro32x	= xoro32y[31:16] ^ xoro32y[15:0];	// second iteration
wire [31:0] xoro32	= { xoro32x[9:0], xoro32x[15:10],	// xoro32 = d result
			    {xoro32y[0], xoro32y[15:1]} ^
			    {xoro32x[12:0], 3'b0} ^ xoro32x };

wire [16:0] xoro32a	= xoro32y[31:16] + xoro32y[15:0];	// first sum
wire [16:0] xoro32b	= xoro32[31:16] + xoro32[15:0];		// second sum

assign xoro32r		= { xoro32b[15:1], ^xoro32b,		// xoro32r = prng result, next instruction's s value
			    xoro32a[15:1], ^xoro32a };
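
A software cross-check of the Verilog above, as a minimal Python sketch. The function names are made up for illustration. The state is assumed to be packed the same way as d, with the high word in [31:16] and the low word in [15:0], and rotating left by 15 is the same as the {d[0], d[15:1]} rotate right by 1.

```python
MASK16 = 0xFFFF

def rotl16(x, k):
    """Rotate a 16-bit value left by k bits."""
    return ((x << k) | (x >> (16 - k))) & MASK16

def step(state, a=15, b=3, c=6):
    """One xoroshiro32 iteration on a packed 32-bit state."""
    s0 = state & MASK16                # low word, d[15:0]
    s1 = (state >> 16) ^ s0            # xoro32z = d[31:16] ^ d[15:0]
    new_s0 = rotl16(s0, a) ^ ((s1 << b) & MASK16) ^ s1
    new_s1 = rotl16(s1, c)
    return (new_s1 << 16) | new_s0

def out16(state):
    """{sum[15:1], ^sum[16:0]} for one state."""
    s = (state >> 16) + (state & MASK16)   # 17-bit sum
    parity = bin(s).count("1") & 1
    return (s & 0xFFFE) | parity

def xoro32(state):
    """Double iteration: returns (new state, 32-bit PRNG result)."""
    y = step(state)                    # first iteration
    z = step(y)                        # second iteration
    return z, (out16(z) << 16) | out16(y)

state = 0x00000001
for _ in range(4):
    state, prn = xoro32(state)
    print(f"{prn:08X}")
```

Worked through by hand from a seed of $00000001, the first two results are 54658048 and 51BE8BA2. The second one only comes out right when the carry bit is included in the parity.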

TonyB_ · 2017-11-07 22:17

evanh wrote: »

TonyB_ wrote: »

Many thanks for today's results, Evan. I think you have done enough tests.

The thing is, the shuffling is pretty arbitrary. The original bit9<->bit1 swap, intentionally 8 bits apart, was just to test if PractRand can be fooled - which seems easily done.

If we're going to start down this path we should do a decent amount of it to see what else comes up.

I think you have done enough tests for the decision about switching to [15,3,6]. Someone should double check my summary, though.

You could try swapping bits 1 and 8/10/11/12/13/14 but swapping bits 1 and 9 made things worse overall.

TonyB_ · 2017-11-07 23:26

cgracey wrote: »

So, it looks like we should switch from [14,2,7] to [15,3,6], right?

And our output is still {sum[15:1], ^sum[16:0]}, right?

cgracey wrote: »

Does this look correct for [15,3,6]?

wire [15:0] xoro32z	= d[31:16] ^ d[15:0];			// first iteration
wire [31:0] xoro32y	= { xoro32z[9:0], xoro32z[15:10],
			    {d[0], d[15:1]} ^
			    {xoro32z[12:0], 3'b0} ^ xoro32z };

wire [15:0] xoro32x	= xoro32y[31:16] ^ xoro32y[15:0];	// second iteration
wire [31:0] xoro32	= { xoro32x[9:0], xoro32x[15:10],	// xoro32 = d result
			    {xoro32y[0], xoro32y[15:1]} ^
			    {xoro32x[12:0], 3'b0} ^ xoro32x };

wire [16:0] xoro32a	= xoro32y[31:16] + xoro32y[15:0];	// first sum
wire [16:0] xoro32b	= xoro32[31:16] + xoro32[15:0];		// second sum

assign xoro32r		= { xoro32b[15:1], ^xoro32b,		// xoro32r = prng result, next instruction's s value
			    xoro32a[15:1], ^xoro32a };

Yes, yes and yes, I reckon.

I'll run my prog to get some states and sums when seed is $0000_0001.

cgracey · 2017-11-07 23:40

Here are my first 16 PRNG results using [15,3,6] with a starting seed of $00000001:

54658048
51BE8BA2
35E3784F
AFD08E7D
629FE601
946714EB
302C36C5
5AF40E3A
71FD53EE
1D5B5B24
87F15D3B
B5F7674D
DFED8C27
F9C387AA
8977BE03
777DEC19

Do these look correct?

evanh · 2017-11-07 23:49

Tony,
You are somewhat cherry picking. There are other triplets that also look better than either of those two when the bits are shuffled appropriately.

TonyB_ · 2017-11-07 23:57

cgracey wrote: »
Here are my first 16 PRNG results using [15,3,6] with a starting seed of $00000001:
54658048
51BE8BA2
35E3784F
AFD08E7D
629FE601
946714EB
302C36C5
5AF40E3A
71FD53EE
1D5B5B24
87F15D3B
B5F7674D
DFED8C27
F9C387AA
8977BE03
777DEC19
Do these look correct?

Yes, my first 32 results:

54658048
51BE8BA2
35E3784F
AFD08E7D
629FE601
946714EB
302C36C5
5AF40E3A
71FD53EE
1D5B5B24
87F15D3B
B5F7674D
DFED8C27
F9C387AA
8977BE03
777DEC19
FF7837B4
A62D7B9D
E33DBEC7
52C6AD4E
34ED75B5
14D188AA
A63DD965
C193B6BB
AFF7466C
06037367
8FD6A4C4
48EB868F
3DE8A410
B342C19D
97C72985
394CD5BC

cgracey · 2017-11-08 00:00

Mine look the same. Thanks!

TonyB_ · 2017-11-08 01:32

evanh wrote: »

Tony,
You are somewhat cherry picking. There are other triplets that also look better than either of those two when the bits are shuffled appropriately.

I chose the triplets that performed best in the three sets of tests that you did as listed in my summary post above. However, in my view the most important set is the first one and we need the best triplet for both single and double iterations when bit 0 is parity, i.e. just a single bit change from the original algorithm. I don't like the idea of shuffling bits about after the addition unless that gives better scores, in which case there is probably something wrong with the algorithm. What I am saying needs checking but I'm confident that [15,3,6] scores the best.

Links to test results:

Bit 0 is parity, single iteration
http://forums.parallax.com/discussion/comment/1423900/#Comment_1423900
http://forums.parallax.com/discussion/comment/1423918/#Comment_1423918 (bottom byte added)
http://forums.parallax.com/discussion/comment/1424022/#Comment_1424022 (bit 0 test)
Bit 0 is parity, double iteration
http://forums.parallax.com/discussion/comment/1424903/#Comment_1424903

Bit 0 is parity, bits 1 and 9 swapped, single iteration
http://forums.parallax.com/discussion/comment/1423918/#Comment_1423918
Bit 0 is parity, bits 1 and 9 swapped, double iteration
http://forums.parallax.com/discussion/comment/1425052/#Comment_1425052

Bit 15 is parity, single iteration
http://forums.parallax.com/discussion/comment/1424022/#Comment_1424022
Bit 15 is parity, double iteration
http://forums.parallax.com/discussion/comment/1425053/#Comment_1425053

evanh · 2017-11-08 02:14

[15 3 6] has always been one of the better candidates. I won't argue any more.

cgracey · 2017-11-08 02:56

It is done.

TonyB_ · 2017-11-11 02:04

Jump functions work with xoroshiro32. It is possible to jump by 64K or 128K states with only 32 single iterations plus some XORs according to the jump bit patterns:

JUMP_64K  = 1A57604D
JUMP_128K = 2FD3E1B9

Bit 31 is the first bit to test.

The jump function needs single-stepping of states which XORO32 can no longer do, unless it is modified. The easiest way, avoiding use of CZ opcode bits, would be to move XORO32 to an empty D,S slot with S selecting single or double iterations.

Here are the first 32 * 64K jumps from state 00000001:

64K*n jumps for xoroshiro32+ [15,3,6]

  n,	state

  0,	00000001
  1,	90B3FECF
  2,	86FD9DC1
  3,	90C5FFCF
  4,	EA7B0DC1
  5,	432C08B9
  6,	01186857
  7,	AFD000FD
  8,	9F5ABCD0
  9,	BAC6246B
 10,	0AAAD5CF
 11,	1DDB13D4
 12,	E603101A
 13,	A61BF445
 14,	AF3D09E0
 15,	E4C59E46
 16,	8785CE09
 17,	B80FDDE9
 18,	38C4CEC4
 19,	4BF8A52C
 20,	8EC2FF96
 21,	F0BCA253
 22,	B0DFADBA
 23,	0099D500
 24,	43282641
 25,	9F272A87
 26,	BB131234
 27,	96833C01
 28,	89D2AE8F
 29,	A726D929
 30,	D641DBF1
 31,	F71F077F
 32,	98569497
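
A minimal Python sketch of the jump as described above, for anyone wanting to reproduce the n = 1 row. This is not the original program. The names are made up, and the state is assumed to be packed with the high word in [31:16] as in the XORO32 Verilog earlier in the thread. One caveat from checking it by hand: the n = 1 row (90B3FECF) comes out when bits 31 down to 1 of JUMP_64K select which states to XOR, and it does not come out if bit 0 is used as a 32nd term. The post does not say how bit 0 is treated, so skipping it is an assumption of this sketch.

```python
MASK16 = 0xFFFF
JUMP_64K = 0x1A57604D   # bit pattern from the post

def rotl16(x, k):
    """Rotate a 16-bit value left by k bits."""
    return ((x << k) | (x >> (16 - k))) & MASK16

def step(state):
    """One single xoroshiro32 [15,3,6] iteration on a packed 32-bit state."""
    s0 = state & MASK16
    s1 = (state >> 16) ^ s0
    new_s0 = rotl16(s0, 15) ^ ((s1 << 3) & MASK16) ^ s1
    new_s1 = rotl16(s1, 6)
    return (new_s1 << 16) | new_s0

def jump(state, pattern=JUMP_64K):
    """XOR together the single-stepped states selected by the pattern."""
    acc = 0
    # Bit 31 is tested first, against the starting state.
    # Assumption: bit 0 is not used as a term (see the note above).
    for bit in range(31, 0, -1):
        if (pattern >> bit) & 1:
            acc ^= state
        state = step(state)            # single iteration, not XORO32's double
    return acc

print(f"{jump(0x00000001):08X}")       # 90B3FECF, the n = 1 row
```

Since the whole thing is XORs of linearly stepped states, jump(a ^ b) equals jump(a) ^ jump(b), which is a cheap sanity check on any implementation.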

EDIT:
Added 64K jumps.
