Random/LFSR on P2

cgracey · 2017-10-31 14:45

Evanh, thanks for trying that out.

TonyB_ · 2017-10-31 14:45

cgracey wrote: »

I was thinking we'd hit the jackpot.

I thought of saying that - good job I didn't!

Adding scrambles the bits to a certain extent and this latest idea unscrambles them, it seems.

evanh · 2017-10-31 14:45

cgracey wrote: »

ACCUM_SIZE was 16, right? The code looks good to me.

Yep, 16, I've been referring to that as s16 as well.

evanh · 2017-10-31 14:50

Here's the scores: (took only 8 minutes to run.. O_o)

  Xoroshiro32+p PractRand Score Table - Run 2017-11-01 03:39:40 +1300

Combination    Word0   Word1   Word2  msByte  Byte04   Byte2   Byte1   Byte0   msBit
=====================================================================================
 [ 1  2  8]      8K      4K      4K      2K     16M     64M    512M      2M      1K
 [ 2  1 11]     16K      4K      4K      2K    256K    256K    512K      2M      1K
 [ 2  1 15]     16K      4K      4K      4K    128K    128K    128K    128K      1K
 [ 2  2  7]      8K      4K      4K      4K     16M     64M    128M      2M      1K
 [ 2  6 15]      8K      2K      4K      4K    256K    256K    256K    128K      1K
 [ 2  8  7]     16K      4K      4K      4K     16M     64M    128M      2M      1K
 [ 2  9  9]     16K      4K      2K      4K    512K    256M      2G      2M      1K
 [ 2 11  3]      8K      4K      2K      2K    256K    256K    256K    256K      1K
 [ 3  2  6]      8K      2K      2K      4K     16M      8M      4M      1M      1K
 [ 3  3 10]      8K      4K      4K      2K      1M    512M      1G      2M      1K
 [ 3 11  2]      8K      4K      4K      4K    128K    128K    128K    128K      1K
 [ 3 11 14]      4K      4K      2K      2K    512K    512K      1M      1M      1K
 [ 4  1  9]      8K      4K      4K      4K    256M    256M     64M      2M      1K
 [ 4  7 15]     16K      4K      4K      2K      1M      1M      1M      1M      1K
 [ 4  8  5]     16K      4K      2K      2K     16M     16M      8M      2M      1K
 [ 5  2  6]      8K      2K      4K      4K     16M     16M     16M      2M      1K
 [ 5  8  4]     16K      4K      2K      4K      2M    256K    128K    256K      1K
 [ 5 14 12]     16K      4K      4K      4K      8M    256M      2G      2M      1K
 [ 6  2  3]      8K      2K      2K      4K      2M      2M      4M      2M      1K
 [ 6  2  5]      8K      2K      2K      2K    512K      1M      1M      1M      1K
 [ 6  2 11]      8K      4K      4K      2K    256M    256M    512M      2M      1K
 [ 6  3 15]      8K      2K      2K      1K      2M      4M      8M      2M      1K
 [ 6 14 11]     16K      4K      4K      4K     64M    256M    512M      2M      1K
 [ 7  1  8]     16K      4K      4K      4K     64M    256M    512M      2M      1K
 [ 7  2  2]      8K      1K      1K      4K      4M      8M      8M      2M      1K
 [ 7  2 10]     16K      4K      4K      4K    256M    128M      1G      2M      1K
 [ 7  2 14]      8K      4K      4K      4K     16M      1G      1G      2M      1K
 [ 7  8  2]     16K      2K      2K      2K      8M      8M      8M      2M      1K
 [ 7 10 10]      8K      4K      2K      4K      2M     64M    256M      2M      1K
 [ 7 15  8]      8K      4K      4K      4K     16M    256M    128M      2M      1K
 [ 8  1  7]     16K      4K      4K      2K     64M    512M    256M      2M      1K
 [ 8  2  1]      8K      4K      4K      2K     16M     32M    512M      2M      1K
 [ 8  5 13]      8K      4K      4K      4K    512M      2G      1G      2M      1K
 [ 8  7 15]     16K      4K      4K      4K     64M    512M    512M      2M      1K
 [ 8  9 13]      8K      4K      4K      4K    512M    256M      1G      2M      1K
 [ 8 15  7]     16K      4K      4K      4K      4M    128M    128M      2M      1K
 [ 9  1  4]     16K      4K      4K      2K     32M      1G    512M      2M      1K
 [ 9  2 14]      4K      1K      1K      1K      1G    128M    512M      2M      1K
 [ 9  8 14]     16K      4K      4K      4K     64M    128M    128M      2M      1K
 [ 9  9  2]     16K      4K      4K      4K     64M    512M    512M      2M      1K
 [10  2  7]      8K      4K      4K      4K     64M    256M    256M      2M      1K
 [10  3  3]     16K      4K      4K      4K     64M      1G    512M      2M      1K
 [10  3 11]     16K      2K      4K      2K     32M      1G    512M      2M      1K
 [10  5 13]     16K      4K      4K      4K    256M      1G    512M      2M      1K
 [10  7 11]      8K      4K      4K      2K    256M    512M    512M      2M      1K
 [10 10  7]      8K      4K      4K      2K     16M     64M    256M      2M      1K
 [11  1  2]      8K      4K      2K      2K     64M     64M     64M      2M      1K
 [11  1 14]     16K      4K      4K      4K     16M    512M    512M      2M      1K
 [11  2  6]     16K      4K      4K      4K     64M      1G      1G      2M      1K
 [11  3 10]     16K      4K      2K      2K      8M    256K    128K     64K      1K
 [11  7 10]     16K      4K      4K      2K      4M    256K     64K     64K      1K
 [11  8 12]     16K      4K      4K      4K    128M      1G    512M      2M      1K
 [11 10 12]      8K      4K      4K      4K    128M      1G    512M      2M      1K
 [11 11 14]      8K      4K      4K      4K     16M    512M    512M      2M      1K
 [11 14  6]      8K      2K      2K      2K     64M      1G      1G      2M      1K
 [12  4 15]      8K      4K      2K      4K    128M     32M     32M      2M      1K
 [12  8 11]      8K      4K      4K      4K    256K    256K    128K    128K      1K
 [12 10 11]      8K      4K      2K      2K    256K    256K    128K    128K      1K
 [12 10 13]      8K      4K      2K      4K    512M    512M    512M      2M      1K
 [12 14  5]     16K      4K      4K      4K     32M    128M    128M      2M      1K
 [13  3 14]      8K      2K      4K      2K     64M     32M     16M      2M      1K
 [13  5  8]      8K      4K      4K      4K    256M      1G    256M      2M      1K
 [13  5 10]     16K      4K      4K      4K     16M      2M      1M    128K      1K
 [13  9  8]      4K      2K      4K      4K    128M    128M    128M      2M      1K
 [13 10 12]      8K      2K      2K      2K    256K    256K    128K    128K      1K
 [13 11 14]      8K      4K      4K      2K     32M     32M     16M      2M      1K
 [13 12 14]     16K      4K      4K      2K     32M     32M     16M      2M      1K
 [13 13 14]      8K      4K      4K      2K     32M     32M     16M      2M      1K
 [14  1 11]      8K      4K      4K      2K    512K    512K    512K    256K      1K
 [14  2  7]     16K      4K      4K      2K     64M      2G      1G      2M      1K
 [14  2  9]     16K      4K      4K      4K    512M    256M    256M      2M      1K
 [14  3 13]     16K      4K      4K      2K    512K    128K    128K     64K      1K
 [14  8  9]     16K      4K      4K      4K    128M     64M    128M      2M      1K
 [14 11  3]      8K      4K      4K      2K    256M    256M    256M      2M      1K
 [14 11 11]      8K      2K      4K      4K    512K    512K    512K    128K      1K
 [14 11 13]     16K      4K      4K      2K    128K    128K    128K     64K      1K
 [14 12 13]     16K      8K      8K      4K    128K    128K    128K     64K      1K
 [14 13 13]     16K      4K      4K      1K    128K    128K    128K     64K      1K
 [15  1  2]      8K      4K      4K      2K     64M     64M     64M      2M      1K
 [15  3  6]     16K      4K      4K      4K     64M      1G      1G      2M      1K
 [15  4 12]      8K      4K      4K      4K     16M    512K    512K    256K      1K
 [15  6  2]      8K      2K      4K      4K    256M    128M     64M      2M      1K
 [15  7  4]     16K      4K      4K      4K      1G    512M    512M      2M      1K
 [15  7  8]      8K      4K      4K      2K    128M      1G      1G      2M      1K

cgracey · 2017-10-31 14:53

Hey, what if we made another sum from the two 16-bit halves, but did so using their bits in reverse, then we took those two sums and combined them in some way?

Or, we could just concatenate the upper 8 bits of each sum.

cgracey · 2017-10-31 14:54

TonyB_ wrote: »

cgracey wrote: »

I was thinking we'd hit the jackpot.

I thought of saying that - good job I didn't!

Adding scrambles the bits to a certain extent and this latest idea unscrambles them, it seems.

I think you're right about the unscrambling.

cgracey · 2017-10-31 14:57

Evanh, get some sleep. In several more hours, some new ideas might percolate in each of our heads.

cgracey · 2017-10-31 15:55

When Evanh gets up, maybe he could try this:

sum_a[15:0] = state[31:16] + state[15:0]
sum_b[15:0] = state[16:31] + state[0:15]
result[15:0] = {sum_a[15:8], sum_b[15:8]}

This way, all result bits are the sum of at least two whole bytes (no thinly-veiled LFSR bits showing through). Also, the two bytes that make up the result word are sums of different sets of whole bits, with the fractional bits being each other's whole bits.

TonyB_ · 2017-10-31 16:48

Chip, xoroshiro+ can run backwards using same amount of logic as forwards: two rotates, one shift and one add. Would that be useful? One of the spare opcode bits could select the direction.

TonyB_ · 2017-10-31 16:56

My last post was not a reply to the previous one.

cgracey · 2017-10-31 16:59

TonyB_ wrote: »

Chip, xoroshiro+ can run backwards using same amount of logic as forwards: two rotates, one shift and one add. Would that be useful? One of the spare opcode bits could select the direction.

Yes, maybe. And we'd only need to run the iterator backwards. The sum can be computed the same.

Wait, there is some shifting, not just rotating in the iteration step. Shifting is not reversible. Are you sure?

TonyB_ · 2017-10-31 17:22

cgracey wrote: »

TonyB_ wrote: »

Chip, xoroshiro+ can run backwards using same amount of logic as forwards: two rotates, one shift and one add. Would that be useful? One of the spare opcode bits could select the direction.

Yes, maybe. And we'd only need to run the iterator backwards. The sum can be computed the same.

Wait, there is some shifting, not just rotating in the iteration step. Shifting is not reversible. Are you sure?

Yes.

You XOR the same shift going backwards to counteract the XOR of that shift going forwards.

cgracey · 2017-10-31 17:32

TonyB_,

What does the code look like that does that?

TonyB_ · 2017-10-31 18:39

cgracey wrote: »

TonyB_,

What does the code look like that does that?

I'm not so good with C-style operators. I hope this pseudo-code is right.

;Reverse xoroshiro algorithm for triplet [a,b,c]

x   =  s1 ROR c				;x = s0 XOR s1
s0 := (s0 XOR (x SHL b) XOR x) ROR a
s1 :=  s0 XOR x

;Forward xoroshiro algorithm for triplet [a,b,c]

x   =  s0 XOR s1
s0 := (s0 ROL a) XOR (x SHL b) XOR x
s1 :=   x ROL c

Added forward code as a check.

TonyB_ · 2017-10-31 18:52

Mistake in x for reverse code corrected.

TonyB_ · 2017-10-31 20:54

Here is another summary for xoroshiro32+ [14,2,7], ignoring today's and earlier failures:

Xoroshiro32+ [14,2,7] sum bits
[15:0]  [15:8]  [7:0]    [15] 
====================================================================================
 512K     1G     	  1G	sum[0] = sum[0]    original algorithm
  16M     1G     	  1G	sum[0] = sum[15:0] parity
 512M     1G    512M	  1G	sum[0] = sum[16:0] parity
   ?	  1G	  ?	  1G	sum[0] = sum[16:0] parity, sum[1] = sum[16:1] parity

I think the first test to do next should be aimed at improving bit 1 as shown in the last line above. We know that 16-bit parity improves bit 0 and it might do the same for bit 1. We can't keep going to the parity well after that because 15-bit parity or less won't be random.

If [15:0] and [7:0] scores are the same, bit 1 will need testing on its own, before and after, to detect any difference.

cgracey · 2017-10-31 22:02

The significant thing seems to be that there are at least a few stages of carry underneath the high-quality bits, or the bits are affected by what amounts to a few stages of carry - the more the better.

I'm anxious to see if Evanh can run the test on the normal and reverse-field adds, where every bit of the final result has at least 8 stages of carry under it.

TonyB_ · 2017-10-31 22:05

TonyB_ wrote: »

;Reverse xoroshiro algorithm for triplet [a,b,c]

x   =  s1 ROR c				;x = s0 XOR s1
s0 := (s0 XOR (x SHL b) XOR x) ROR a
s1 :=  s0 XOR x

I think reverse code needs changing to this:

;Reverse xoroshiro algorithm for triplet [a,b,c]

x   =   s1 ROR c			;x = s0 XOR s1
s0 :=  (s0 XOR (x SHL b) XOR x) ROR a
s1 := ((s0 XOR (x SHL b) XOR x) ROR a) XOR x

TonyB_ · 2017-10-31 22:16

cgracey wrote: »

The significant thing seems to be that there are at least a few stages of carry underneath the high-quality bits, or the bits are affected by what amounts to a few stages of carry - the more the better.

I'm anxious to see if Evanh can run the test on the normal and reverse-field adds, where every bit of the final result has at least 8 stages of carry under it.

It's not a case of either/or, just the order of the tests.

LFSR artifacts have gone from bit 0 now. Try to do the same for bit 1 where they are not so bad, so if any improvement is only a small one that's all we need, probably.

cgracey · 2017-10-31 22:24

TonyB_,

Reversing something like this seems complicated. And that shift still bothers me.

If you can go forward 100 iterations and then backward 100 iterations, and wind up where you started, I guess you've got it figured out.

TonyB_ · 2017-10-31 22:49

cgracey wrote: »

TonyB_,

Reversing something like this seems complicated. And that shift still bothers me.

If you can go forward 100 iterations and then backward 100 iterations, and wind up where you started, I guess you've got it figured out.

Chip,

64K iterations backwards from state $0000_0001 gives state $0A5A_9287 and you could try that forwards.

cgracey · 2017-10-31 23:15

TonyB_ wrote: »

cgracey wrote: »

TonyB_,

Reversing something like this seems complicated. And that shift still bothers me.

If you can go forward 100 iterations and then backward 100 iterations, and wind up where you started, I guess you've got it figured out.

Chip,

64K iterations backwards from state $0000_0001 gives state $0A5A_9287 and you could try that forwards.

Somehow, I'm getting $0001_0009 after 64k iterations, which is 32k XORO32 instructions, right?

TonyB_ · 2017-10-31 23:40

32768 XORO instructions if $0A5A_9287 is correct.

TonyB_ · 2017-10-31 23:53

Sanity check:

Chip, what state do you get after 1024 XORO32's from state $0000_0001?

cgracey · 2017-11-01 00:34

TonyB_ wrote: »

Sanity check:

Chip, what state do you get after 1024 XORO32's from state $0000_0001?

I'm not at my computer, anymore, for the next few hours. When I get home I'll check.

TonyB_ · 2017-11-01 02:45

Attached file has states and sums for 2048 xoroshiro32+p iterations backwards and forwards from state $0000_0001.

As xoroshiro32+ can run in reverse on the Z80, I'm sure it would be possible on the P2!

Another thought:
CZ opcode bits in XORO32 could select forward/reverse iterations and bits in each word normal/reversed.

TonyB_ · 2017-11-01 03:03

cgracey wrote: »

TonyB_ wrote: »

cgracey wrote: »

TonyB_,

Reversing something like this seems complicated. And that shift still bothers me.

If you can go forward 100 iterations and then backward 100 iterations, and wind up where you started, I guess you've got it figured out.

Chip,

64K iterations backwards from state $0000_0001 gives state $0A5A_9287 and you could try that forwards.

Somehow, I'm getting $0001_0009 after 64k iterations, which is 32k XORO32 instructions, right?

I've just noticed that $0001_0009 is the 32-bit sum, not the state which should be $0000_0001. Nothing was wrong at my end.

cgracey · 2017-11-01 03:07

TonyB_ wrote: »

cgracey wrote: »

TonyB_ wrote: »

cgracey wrote: »

TonyB_,

Reversing something like this seems complicated. And that shift still bothers me.

If you can go forward 100 iterations and then backward 100 iterations, and wind up where you started, I guess you've got it figured out.

Chip,

64K iterations backwards from state $0000_0001 gives state $0A5A_9287 and you could try that forwards.

Somehow, I'm getting $0001_0009 after 64k iterations, which is 32k XORO32 instructions, right?

I've just noticed that $0001_0009 is the 32-bit sum, not the state which should be $0000_0001. Nothing was wrong.

Yes! I was thinking that was probably the case on the way back home. I was going to try it out.

So, it looks like it IS possible to reverse our PRNG pretty easily.

Now, I'm wondering if it's useful to do so. Any ideas?

TonyB_ · 2017-11-01 03:20

Two PRNGs could start off with the same seed in different directions. They wouldn't meet again for 2^31 or 2^31-1 iterations, the maximum gap possible. If they use all 32-bits as one PRN, the high word in one would be the low word in the other due to the odd period.

TonyB_ · 2017-11-01 03:25

deleted

Random/LFSR on P2

Comments