Well, here's the source as compiled ... PractRand pukes on it at 1kB.
[snip]
Wait, it does? Did I screw something up, maybe in the minimal adaption I did when copying it into this thread? The code looks right... and examining your test.out, the data even looks right; well, the first 16 bytes match my first 16 bytes anyway. I haven't checked further, but that's probably enough. A 1 KB failure on PractRand implies something is very, very wrong in a way that should be obvious from looking at the data, yet I can't see anything. Hmm... you didn't, by any chance, test the ASCII hexadecimal representation of the output, like in test.out, instead of the binary representation, did you? That's the only thing I can think of at the moment that would produce that big a discrepancy in our results.
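If it helps to rule out the harness, here's a minimal sketch of the two output styles; next16() is only a stand-in name for the generator step, and the raw-binary branch is the one PractRand should be fed over stdin:

#include <stdint.h>
#include <stdio.h>

/* Placeholder only - swap in the real generator step here. */
static uint16_t next16(void)
{
    static uint16_t counter = 0;
    return counter++;
}

int main(void)
{
    for (;;) {
        uint16_t r = next16();

        /* Right: raw binary, two bytes per sample, piped straight into
           "./PractRand stdin -a -tlmin 1KB". */
        fwrite(&r, sizeof r, 1, stdout);

        /* Wrong: ASCII hex like test.out - every sample becomes printable
           text and PractRand fails almost immediately. */
        /* fprintf(stdout, "%04x", r); */
    }
}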
I think the first test to do next should be aimed at improving bit 1 as shown in the last line above. We know that 16-bit parity improves bit 0 and it might do the same for bit 1. We can't keep going to the parity well after that because 15-bit parity or less won't be random.
If [15:0] and [7:0] scores are the same, bit 1 will need testing on its own, before and after, to detect any difference.
Evan, please try the above. Change is sum[1] = sum[16:1] parity.
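Roughly, in C (a sketch of one reading of that change, not tested; the function name is just for the sketch and it assumes the unmasked 17-bit sum is passed in):

#include <stdint.h>

/* Apply the parity of sum bits [16:1] to bit 1 in the same way it is
   already applied to bit 0. */
static uint32_t fix_bits01(uint32_t result)      /* result = raw 17-bit sum */
{
    uint32_t parity = 0;
    int shift;

    for( shift = 1; shift <= 16; shift++ )
        parity ^= (result >> shift) & 1;         /* parity of bits [16:1] */

    result ^= parity;                            /* bit 0, as already tested */
    result ^= parity << 1;                       /* bit 1, the proposed extra */
    return result & 0xFFFF;                      /* keep the 16 output bits */
}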
Scro says he intentionally biased it. What that implies I'm not sure. Maybe it's a sort of cheat to beat up on PractRand and it ain't quite as good as the score suggests.
Okay, I see what that is doing. It uses twice as many bit adders and maybe twice as many XORs as our current XORO32 implementation. But it does score 32 times better, right?
Scro, thank you for showing us that. It's very interesting. What constitutes the bias within it, though?
What we have now is much more resource efficient. I wonder how much better we could do if we improved bit 1's quality?
There must be more to it than that. Sampling the msByte should be better than 1GB but it isn't. I think all we'll ever achieve is to bring bit 1 up to this level. That was all we were wanting before of course.
If we could get 1G for word, top byte, bottom byte and top bit we'd have consistency.
The scoring is taken from the PractRand failure size, not the capacity of the RNG. PractRand only detects the failure after the poor-quality data has been generated.
So halve every score to get a better idea of the quality working range of each RNG tested.
XORing bit 0 of the sum with higher parity could, in effect, add a carry input into the bit 0 sum, which it would not have otherwise. Bit 1 already has a carry from bit 0, so higher parity might cancel that out and results could be worse.
100% coverage parity is the only one that is a contender. Anything else is at least 32x sub-par. I don't think there is any way to make up a second bonus bit with this approach.
The fundamental point about using parity is that, provided the number as a whole is random, its parity will be random. Once we start ignoring bits, what's left is not random.
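For what it's worth, the full 16-bit parity doesn't need a loop either; a log2 XOR fold does the same job (parity16 is just my name for it, and the usage line assumes the unmasked 17-bit sum is in result):

#include <stdint.h>

/* XOR-fold: after the folds, bit 0 holds the parity of all 16 input bits.
   In hardware this is simply a tree of XOR gates, one level per fold. */
static inline uint32_t parity16(uint32_t x)
{
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 1;
}

/* usage: parity of sum bits [16:1] */
/* uint32_t p = parity16(result >> 1); */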
evanh@control:~/hoard/rng_testing$ test | ./PractRand stdin -a -tlmin 1KB
should have been
evanh@control:~/hoard/rng_testing$ ./test | ./PractRand stdin -a -tlmin 1KB
Maybe I should stop using "test" as a name for my quick hacks; without the "./" it wasn't my program being piped into PractRand at all.
Anyway, it scores 32GB now. On a 32-bit state! Very nice.
Xoroshiro32+p PractRand Score Table - Run 2017-11-02 00:01:54 +1300

Combination Word0 Word1 Word2 msByte Byte04 Byte2 Byte1 Byte0 msBit
=====================================================================================
[ 1 2 8] 8M 64M 32M 16M 512K 1M 16M 16M 1G
[ 2 1 11] 8M 32M 16M 256M 1M 128K 128K 128K 1G
[ 2 1 15] 256K 1M 2M 512K 128K 128K 64K 32K 1G
[ 2 2 7] 8M 32M 32M 16M 128K 1M 8M 32M 1G
[ 2 6 15] 16M 64M 16M 512K 512K 512K 4M 16M 1G
[ 2 8 7] 512K 64M 32M 32M 128K 1M 8M 128M 1G
[ 2 9 9] 128K 512K 1M 32M 256K 64K 64K 16K 1G
[ 2 11 3] 1M 4M 2M 128K 2M 256K 256K 512K 1G
[ 3 2 6] 32M 64M 32M 256K 512K 512K 2M 2M 1G
[ 3 3 10] 16M 64M 32M 64M 1M 32M 512K 256K 1G
[ 3 11 2] 256K 2M 4M 8M 512K 256K 128K 128K 512M
[ 3 11 14] 32M 64M 64M 64M 128M 512K 256K 256K 1G
[ 4 1 9] 1M 1M 1M 512K 1M 512K 512K 512K 512M
[ 4 7 15] 32M 128M 32M 512K 256K 1M 8M 128M 1G
[ 4 8 5] 8M 16M 16M 128K 512K 1M 2M 32M 1G
[ 5 2 6] 16M 64M 16M 64K 1M 512K 1M 8M 1G
[ 5 8 4] 4M 16M 16M 64M 4M 1M 2M 4M 128M
[ 5 14 12] 32M 256M 128M 32M 64M 1M 1M 2M 1G
[ 6 2 3] 32M 128M 64M 128M 1M 512K 1M 1M 1G
[ 6 2 5] 16M 64M 64M 128M 1M 512K 512K 256K 1G
[ 6 2 11] 32M 512M 256M 16M 32M 16M 16M 8M 256M
[ 6 3 15] 32M 256M 128M 16M 4M 1M 1M 1M 256M
[ 6 14 11] 32M 512M 256M 16M 64M 16M 16M 32M 256M
[ 7 1 8] 4M 16M 16M 8M 128K 512K 2M 16M 256M
[ 7 2 2] 8M 32M 64M 16M 2M 1M 1M 2M 1G
[ 7 2 10] 32M 256M 128M 8M 2M 128M 32M 128M 128M
[ 7 2 14] 64M 256M 256M 16M 16M 2M 2M 4M 64M
[ 7 8 2] 16M 64M 64M 32M 256K 512K 2M 64M 1G
[ 7 10 10] 32M 256M 128M 8M 2M 4M 4M 32M 128M
[ 7 15 8] 512K 1M 2M 256K 128K 512K 256K 256K 128M
[ 8 1 7] 4M 16M 16M 16M 256K 4M 8M 1M 1G
[ 8 2 1] 8M 64M 128M 8M 4M 2M 4M 8M 1G
[ 8 5 13] 32M 128M 128M 4M 32M 32M 16M 16M 16M
[ 8 7 15] 1M 32M 32M 8M 2M 4M 2M 512K 16M
[ 8 9 13] 16M 16M 8M 4M 16M 8M 16M 512M 16M
[ 8 15 7] 512K 1M 2M 256K 256K 512K 512K 256K 1G
[ 9 1 4] 4M 8M 8M 1M 32M 4M 4M 8M 512M
[ 9 2 14] 32M 512M 256M 4M 16M 32M 32M 128M 4M
[ 9 8 14] 64M 128M 128M 2M 8M 16M 32M 256M 4M
[ 9 9 2] 8M 64M 128M 32M 4M 4M 8M 16M 1G
[10 2 7] 64M 128M 128M 128M 8M 32M 32M 16M 1G
[10 3 3] 32M 128M 256M 4M 8M 8M 16M 32M 1G
[10 3 11] 2M 32M 16M 256K 2M 16M 16M 256M 4M
[10 5 13] 32M 64M 64M 1M 16M 64M 32M 1G 2M
[10 7 11] 4M 32M 16M 128K 8M 16M 16M 512M 2M
[10 10 7] 32M 128M 128M 64M 4M 32M 64M 16M 1G
[11 1 2] 64M 128M 64M 2M 8M 16M 2M 32M 1G
[11 1 14] 32M 64M 32M 512K 8M 16M 32M 32M 512K
[11 2 6] 64M 128M 256M 32M 64M 128M 32M 32M 1G
[11 3 10] 32M 128M 512M 16M 4M 32M 32M 64M 1G
[11 7 10] 8M 32M 64M 16M 2M 2M 32M 32M 1G
[11 8 12] 1M 8M 8M 128K 4M 8M 16M 512M 1M
[11 10 12] 1M 8M 8M 128K 4M 8M 16M 1G 512K
[11 11 14] 4M 8M 8M 512K 4M 16M 32M 16M 512K
[11 14 6] 64M 64M 128M 16M 32M 256M 128M 256M 1G
[12 4 15] 2M 8M 8M 256K 512M 1M 256K 128M 64K
[12 8 11] 16M 64M 64M 4M 4M 1M 16M 64M 1G
[12 10 11] 8M 64M 64M 4M 4M 1M 512K 128K 1G
[12 10 13] 512K 2M 2M 64K 1M 8M 8M 16M 64K
[12 14 5] 64M 128M 64M 1M 32M 32M 128M 128M 1G
[13 3 14] 2M 4M 4M 128K 4M 256K 128K 64M 32K
[13 5 8] 32M 128M 128M 128M 16M 32M 32M 512M 1G
[13 5 10] 128M 256M 128M 32M 128M 32M 32M 64M 1G
[13 9 8] 32M 128M 256M 1G 4M 16M 64M 512M 1G
[13 10 12] 8M 32M 32M 512K 8M 1M 128K 128K 1G
[13 11 14] 256K 512K 512K 64K 256K 2M 4M 8M 16K
[13 12 14] 256K 512K 512K 32K 256K 2M 4M 8M 16K
[13 13 14] 128K 512K 512K 32K 256K 2M 4M 8M 16K
[14 1 11] 32M 256M 128M 1M 4M 4M 256K 256K 1G
[14 2 7] 128M 64M 64M 256K 8M 16M 64M 1G 1G
[14 2 9] 128M 512M 256M 64M 2M 64M 64M 128M 1G
[14 3 13] 8M 64M 32M 256K 4M 64K 128K 1M 1G
[14 8 9] 16M 64M 128M 128M 2M 4M 16M 1G 1G
[14 11 3] 4M 16M 8M 128K 2M 4M 256K 256M 1G
[14 11 11] 16M 32M 32M 1M 2M 128K 256K 256K 1G
[14 11 13] 2M 8M 8M 256K 4M 128K 128K 128K 1G
[14 12 13] 1M 4M 4M 256K 2M 128K 128K 128K 1G
[14 13 13] 512K 2M 2M 256K 128K 64K 128K 64K 1G
[15 1 2] 64K 1M 1M 64K 1M 512K 256K 1M 1G
[15 3 6] 2M 16M 16M 64K 8M 2M 2M 512M 1G
[15 4 12] 1M 16M 16M 128K 16M 256K 512K 128M 1G
[15 6 2] 1M 2M 2M 64K 8M 1M 2M 128M 1G
[15 7 4] 1M 16M 8M 128K 16M 2M 4M 512M 1G
[15 7 8] 128K 8M 16M 2M 256K 512K 4M 32M 1G
#define ACCUM_MASK (((1ULL<<(ACCUM_SIZE-1))-1)<<1|1)   // Bit of a shuffle to work on all 64 bits :)
#define HALF_MASK  (ACCUM_MASK>>(ACCUM_SIZE/2))

uint64_t z0 = 0, s0 = s[0];
uint64_t z1 = 0, s1 = s[1];
int shift;
uint64_t rezult, result = s0 + s1;               // plain sum of the two state halves

for( shift = 0; shift < ACCUM_SIZE; shift++ )
{
    // bit-reverse each state half into z0/z1
    z0 |= ((s0 >> shift) & 1) << (ACCUM_SIZE - 1 - shift);
    z1 |= ((s1 >> shift) & 1) << (ACCUM_SIZE - 1 - shift);
}
rezult = z0 + z1;                                // sum of the bit-reversed halves

// output: top half taken from the reversed sum, bottom half taken from the
// top half of the plain sum
result = (rezult & (ACCUM_MASK ^ HALF_MASK)) | ((result >> (ACCUM_SIZE/2)) & HALF_MASK);
Evan, please try the above. Change is sum[1] = sum[16:1] parity.
Thanks for trying this, Evanh. All your code looks correct. So, that idea did not work.
What was this other algorithm that got to 32GB-quality in 32 bits of state? That is the best yet!
Here's the original as supplied by Scro - http://forums.parallax.com/discussion/comment/1424224/#Comment_1424224
The showstopper is that we don't have enough time in a clock cycle to do two 32-bit additions, one after the other.
Bits [15:8] and [15] should have the same scores as before. If not, something is wrong.
But there are only 4G states in 32 bits, right?
If each state outputs 4 bytes, it seems that only a 16GB score should be possible.
... unless maybe it was taking in hexadecimal, so each byte became two ASCII bytes, winding the score up to 32GB?
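Spelling out that arithmetic (using the power-of-two sizes PractRand reports):

$2^{32}\ \text{states} \times 4\ \tfrac{\text{bytes}}{\text{state}} = 16\ \text{GB}, \qquad 16\ \text{GB} \times 2\ \tfrac{\text{hex chars}}{\text{byte}} = 32\ \text{GB}$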
Okay. Now I understand. Thanks.
Here's the source that I'm not sure about and gives messy scores:
for( shift = 1; shift <= ACCUM_SIZE; shift++ )
{
    parity = parity ^ (result >> shift);
}
// parity bit 0 is the XOR of sum bits [16:1]; with a 17-bit raw sum, parity
// bit 1 works out as the XOR of bits [16:2] only - 15-bit coverage - which
// may be why the scores go messy
result = (result ^ (parity & 3)) & ACCUM_MASK;
Here's the source that works well and only has the weak bit 1:
for( shift = 1; shift <= ACCUM_SIZE; shift++ )
{
    parity = parity ^ (result >> shift);
}
// only parity bit 0 is used: sum bit 0 gets XORed with the parity of bits [16:1]
result = (result ^ (parity & 1)) & ACCUM_MASK;
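For anyone who wants to reproduce these runs end to end, here's a rough, untested sketch putting that working variant on top of a plain xoroshiro32+ engine; the triplet [14 2 7] is just one row picked from the score table, not a recommendation, and ACCUM_SIZE = 16 is my assumption:

#include <stdint.h>

#define ACCUM_SIZE 16
#define ACCUM_MASK (((1ULL<<(ACCUM_SIZE-1))-1)<<1|1)

static uint64_t s[2] = { 1, 0 };                 // any non-zero seed will do

static uint64_t rotl(uint64_t x, int k)          // rotate within ACCUM_SIZE bits
{
    return ((x << k) | (x >> (ACCUM_SIZE - k))) & ACCUM_MASK;
}

uint64_t next(void)
{
    uint64_t s0 = s[0], s1 = s[1];
    uint64_t result = s0 + s1;                   // raw 17-bit sum
    uint64_t parity = 0;
    int shift;

    // standard xoroshiro+ state update, example triplet [14 2 7]
    s1 ^= s0;
    s[0] = rotl(s0, 14) ^ s1 ^ ((s1 << 2) & ACCUM_MASK);
    s[1] = rotl(s1, 7);

    // bit-0 parity fix, exactly as in the working source above
    for( shift = 1; shift <= ACCUM_SIZE; shift++ )
        parity ^= result >> shift;
    return (result ^ (parity & 1)) & ACCUM_MASK;
}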
PS: I'm off to bed.