Well, here's the source as compiled ... PractRand pukes on it at 1kB.
[snip]
Wait, it does? Did I screw something up, maybe in the minimal adaption I did when copying it into this thread? The code looks right... and examining your test.out, the data even looks right; well, the first 16 bytes match my first 16 bytes anyway. I haven't checked further, but that's probably enough. A 1 KB failure on PractRand implies something is very, very wrong in a way that should be obvious from looking at the data, yet I can't see anything. Hmm... you didn't, by any chance, test the ASCII hexadecimal representation of the output, like in test.out, instead of the binary representation, did you? That's the only thing I can think of at the moment that would produce that big a discrepancy in our results.
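If it helps to rule out the harness, here's a minimal sketch of the two output styles; next16() is only a stand-in name for the generator step, and the raw-binary branch is the one PractRand should be fed over stdin:

#include <stdint.h>
#include <stdio.h>

/* Placeholder only - swap in the real generator step here. */
static uint16_t next16(void)
{
    static uint16_t counter = 0;
    return counter++;
}

int main(void)
{
    for (;;) {
        uint16_t r = next16();

        /* Right: raw binary, two bytes per sample, piped straight into
           "./PractRand stdin -a -tlmin 1KB". */
        fwrite(&r, sizeof r, 1, stdout);

        /* Wrong: ASCII hex like test.out - every sample becomes printable
           text and PractRand fails almost immediately. */
        /* fprintf(stdout, "%04x", r); */
    }
}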
I think the first test to do next should be aimed at improving bit 1 as shown in the last line above. We know that 16-bit parity improves bit 0 and it might do the same for bit 1. We can't keep going to the parity well after that because 15-bit parity or less won't be random.
If [15:0] and [7:0] scores are the same, bit 1 will need testing on its own, before and after, to detect any difference.
Evan, please try the above. Change is sum[1] = sum[16:1] parity.
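Roughly, in C (a sketch of one reading of that change, not tested; the function name is just for the sketch and it assumes the unmasked 17-bit sum is passed in):

#include <stdint.h>

/* Apply the parity of sum bits [16:1] to bit 1 in the same way it is
   already applied to bit 0. */
static uint32_t fix_bits01(uint32_t result)      /* result = raw 17-bit sum */
{
    uint32_t parity = 0;
    int shift;

    for( shift = 1; shift <= 16; shift++ )
        parity ^= (result >> shift) & 1;         /* parity of bits [16:1] */

    result ^= parity;                            /* bit 0, as already tested */
    result ^= parity << 1;                       /* bit 1, the proposed extra */
    return result & 0xFFFF;                      /* keep the 16 output bits */
}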
Scro says he intentionally biased it. What that implies I'm not sure. Maybe it's a sort of cheat to beat up on PractRand and it ain't quite as good as the score suggests.
Okay, I see what that is doing. It uses twice as many bit adders and maybe twice as many XORs as our current XORO32 implementation. But it does score 32 times better, right?
Scro, thank you for showing us that. It's very interesting. What constitutes the bias within it, though?
What we have now is much more resource efficient. I wonder how much better we could do if we improved bit 1's quality?
There must be more to it than that. Sampling the msByte should be better than 1GB but it isn't. I think all we'll ever achieve is to bring bit 1 up to this level. That was all we were wanting before of course.
If we could get 1G for word, top byte, bottom byte and top bit we'd have consistency.
The scoring is taken from the PractRand failure size, not the capacity of the RNG. PractRand only detects the failure after the poor-quality data has been generated.
So halve every score to get a better idea of the quality working range of each RNG tested.
XORing bit 0 of the sum with higher parity could, in effect, add a carry input into the bit 0 sum, which it would not have otherwise. Bit 1 already has a carry from bit 0, so higher parity might cancel that out and results could be worse.
100% coverage parity is the only one that is a contender. Anything else is at least 32x sub-par. I don't think there is any way to make up a second bonus bit with this approach.
The fundamental point about using parity is that, provided the number as a whole is random, its parity will be random. Once we start ignoring bits, what's left is not random.
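For what it's worth, the full 16-bit parity doesn't need a loop either; a log2 XOR fold does the same job (parity16 is just my name for it, and the usage line assumes the unmasked 17-bit sum is in result):

#include <stdint.h>

/* XOR-fold: after the folds, bit 0 holds the parity of all 16 input bits.
   In hardware this is simply a tree of XOR gates, one level per fold. */
static inline uint32_t parity16(uint32_t x)
{
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 1;
}

/* usage: parity of sum bits [16:1] */
/* uint32_t p = parity16(result >> 1); */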
evanh@control:~/hoard/rng_testing$ test | ./PractRand stdin -a -tlmin 1KB
should have been
evanh@control:~/hoard/rng_testing$ ./test | ./PractRand stdin -a -tlmin 1KB
Maybe I should stop using "test" as a name for my quick hacks; without the "./" it wasn't my program being piped into PractRand at all.
Anyway, it scores 32GB now. On a 32-bit state! Very nice.
Xoroshiro32+p PractRand Score Table - Run 2017-11-02 00:01:54 +1300

Combination Word0 Word1 Word2 msByte Byte04 Byte2 Byte1 Byte0 msBit
=====================================================================================
[ 1 2 8] 8M 64M 32M 16M 512K 1M 16M 16M 1G
[ 2 1 11] 8M 32M 16M 256M 1M 128K 128K 128K 1G
[ 2 1 15] 256K 1M 2M 512K 128K 128K 64K 32K 1G
[ 2 2 7] 8M 32M 32M 16M 128K 1M 8M 32M 1G
[ 2 6 15] 16M 64M 16M 512K 512K 512K 4M 16M 1G
[ 2 8 7] 512K 64M 32M 32M 128K 1M 8M 128M 1G
[ 2 9 9] 128K 512K 1M 32M 256K 64K 64K 16K 1G
[ 2 11 3] 1M 4M 2M 128K 2M 256K 256K 512K 1G
[ 3 2 6] 32M 64M 32M 256K 512K 512K 2M 2M 1G
[ 3 3 10] 16M 64M 32M 64M 1M 32M 512K 256K 1G
[ 3 11 2] 256K 2M 4M 8M 512K 256K 128K 128K 512M
[ 3 11 14] 32M 64M 64M 64M 128M 512K 256K 256K 1G
[ 4 1 9] 1M 1M 1M 512K 1M 512K 512K 512K 512M
[ 4 7 15] 32M 128M 32M 512K 256K 1M 8M 128M 1G
[ 4 8 5] 8M 16M 16M 128K 512K 1M 2M 32M 1G
[ 5 2 6] 16M 64M 16M 64K 1M 512K 1M 8M 1G
[ 5 8 4] 4M 16M 16M 64M 4M 1M 2M 4M 128M
[ 5 14 12] 32M 256M 128M 32M 64M 1M 1M 2M 1G
[ 6 2 3] 32M 128M 64M 128M 1M 512K 1M 1M 1G
[ 6 2 5] 16M 64M 64M 128M 1M 512K 512K 256K 1G
[ 6 2 11] 32M 512M 256M 16M 32M 16M 16M 8M 256M
[ 6 3 15] 32M 256M 128M 16M 4M 1M 1M 1M 256M
[ 6 14 11] 32M 512M 256M 16M 64M 16M 16M 32M 256M
[ 7 1 8] 4M 16M 16M 8M 128K 512K 2M 16M 256M
[ 7 2 2] 8M 32M 64M 16M 2M 1M 1M 2M 1G
[ 7 2 10] 32M 256M 128M 8M 2M 128M 32M 128M 128M
[ 7 2 14] 64M 256M 256M 16M 16M 2M 2M 4M 64M
[ 7 8 2] 16M 64M 64M 32M 256K 512K 2M 64M 1G
[ 7 10 10] 32M 256M 128M 8M 2M 4M 4M 32M 128M
[ 7 15 8] 512K 1M 2M 256K 128K 512K 256K 256K 128M
[ 8 1 7] 4M 16M 16M 16M 256K 4M 8M 1M 1G
[ 8 2 1] 8M 64M 128M 8M 4M 2M 4M 8M 1G
[ 8 5 13] 32M 128M 128M 4M 32M 32M 16M 16M 16M
[ 8 7 15] 1M 32M 32M 8M 2M 4M 2M 512K 16M
[ 8 9 13] 16M 16M 8M 4M 16M 8M 16M 512M 16M
[ 8 15 7] 512K 1M 2M 256K 256K 512K 512K 256K 1G
[ 9 1 4] 4M 8M 8M 1M 32M 4M 4M 8M 512M
[ 9 2 14] 32M 512M 256M 4M 16M 32M 32M 128M 4M
[ 9 8 14] 64M 128M 128M 2M 8M 16M 32M 256M 4M
[ 9 9 2] 8M 64M 128M 32M 4M 4M 8M 16M 1G
[10 2 7] 64M 128M 128M 128M 8M 32M 32M 16M 1G
[10 3 3] 32M 128M 256M 4M 8M 8M 16M 32M 1G
[10 3 11] 2M 32M 16M 256K 2M 16M 16M 256M 4M
[10 5 13] 32M 64M 64M 1M 16M 64M 32M 1G 2M
[10 7 11] 4M 32M 16M 128K 8M 16M 16M 512M 2M
[10 10 7] 32M 128M 128M 64M 4M 32M 64M 16M 1G
[11 1 2] 64M 128M 64M 2M 8M 16M 2M 32M 1G
[11 1 14] 32M 64M 32M 512K 8M 16M 32M 32M 512K
[11 2 6] 64M 128M 256M 32M 64M 128M 32M 32M 1G
[11 3 10] 32M 128M 512M 16M 4M 32M 32M 64M 1G
[11 7 10] 8M 32M 64M 16M 2M 2M 32M 32M 1G
[11 8 12] 1M 8M 8M 128K 4M 8M 16M 512M 1M
[11 10 12] 1M 8M 8M 128K 4M 8M 16M 1G 512K
[11 11 14] 4M 8M 8M 512K 4M 16M 32M 16M 512K
[11 14 6] 64M 64M 128M 16M 32M 256M 128M 256M 1G
[12 4 15] 2M 8M 8M 256K 512M 1M 256K 128M 64K
[12 8 11] 16M 64M 64M 4M 4M 1M 16M 64M 1G
[12 10 11] 8M 64M 64M 4M 4M 1M 512K 128K 1G
[12 10 13] 512K 2M 2M 64K 1M 8M 8M 16M 64K
[12 14 5] 64M 128M 64M 1M 32M 32M 128M 128M 1G
[13 3 14] 2M 4M 4M 128K 4M 256K 128K 64M 32K
[13 5 8] 32M 128M 128M 128M 16M 32M 32M 512M 1G
[13 5 10] 128M 256M 128M 32M 128M 32M 32M 64M 1G
[13 9 8] 32M 128M 256M 1G 4M 16M 64M 512M 1G
[13 10 12] 8M 32M 32M 512K 8M 1M 128K 128K 1G
[13 11 14] 256K 512K 512K 64K 256K 2M 4M 8M 16K
[13 12 14] 256K 512K 512K 32K 256K 2M 4M 8M 16K
[13 13 14] 128K 512K 512K 32K 256K 2M 4M 8M 16K
[14 1 11] 32M 256M 128M 1M 4M 4M 256K 256K 1G
[14 2 7] 128M 64M 64M 256K 8M 16M 64M 1G 1G
[14 2 9] 128M 512M 256M 64M 2M 64M 64M 128M 1G
[14 3 13] 8M 64M 32M 256K 4M 64K 128K 1M 1G
[14 8 9] 16M 64M 128M 128M 2M 4M 16M 1G 1G
[14 11 3] 4M 16M 8M 128K 2M 4M 256K 256M 1G
[14 11 11] 16M 32M 32M 1M 2M 128K 256K 256K 1G
[14 11 13] 2M 8M 8M 256K 4M 128K 128K 128K 1G
[14 12 13] 1M 4M 4M 256K 2M 128K 128K 128K 1G
[14 13 13] 512K 2M 2M 256K 128K 64K 128K 64K 1G
[15 1 2] 64K 1M 1M 64K 1M 512K 256K 1M 1G
[15 3 6] 2M 16M 16M 64K 8M 2M 2M 512M 1G
[15 4 12] 1M 16M 16M 128K 16M 256K 512K 128M 1G
[15 6 2] 1M 2M 2M 64K 8M 1M 2M 128M 1G
[15 7 4] 1M 16M 8M 128K 16M 2M 4M 512M 1G
[15 7 8] 128K 8M 16M 2M 256K 512K 4M 32M 1G
#define ACCUM_MASK (((1ULL<<(ACCUM_SIZE-1))-1)<<1|1)   // Bit of a shuffle to work on all 64 bits :)
#define HALF_MASK  (ACCUM_MASK>>(ACCUM_SIZE/2))

uint64_t z0 = 0, s0 = s[0];
uint64_t z1 = 0, s1 = s[1];
int shift;
uint64_t rezult, result = s0 + s1;               // plain sum of the two state halves

for( shift = 0; shift < ACCUM_SIZE; shift++ )
{
    // bit-reverse each state half into z0/z1
    z0 |= ((s0 >> shift) & 1) << (ACCUM_SIZE - 1 - shift);
    z1 |= ((s1 >> shift) & 1) << (ACCUM_SIZE - 1 - shift);
}
rezult = z0 + z1;                                // sum of the bit-reversed halves

// output: top half taken from the reversed sum, bottom half taken from the
// top half of the plain sum
result = (rezult & (ACCUM_MASK ^ HALF_MASK)) | ((result >> (ACCUM_SIZE/2)) & HALF_MASK);
Evan, please try the above. Change is sum[1] = sum[16:1] parity.
Thanks for trying this, Evanh. All your code looks correct. So, that idea did not work.
What was this other algorithm that got to 32GB-quality in 32 bits of state? That is the best yet!
Here's the original as supplied by Scro - http://forums.parallax.com/discussion/comment/1424224/#Comment_1424224
The showstopper is that we don't have enough time in a clock cycle to do two 32-bit additions, one after the other.
Bits [15:8] and [15] should have the same scores as before. If not, something is wrong.
But there are only 4G states in 32 bits, right?
If each state outputs 4 bytes, it seems that only a 16GB score should be possible.
... unless maybe it was taking in hexadecimal, so each byte became two ASCII bytes, winding the score up to 32GB?
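Spelling out that arithmetic (using the power-of-two sizes PractRand reports):

$2^{32}\ \text{states} \times 4\ \tfrac{\text{bytes}}{\text{state}} = 16\ \text{GB}, \qquad 16\ \text{GB} \times 2\ \tfrac{\text{hex chars}}{\text{byte}} = 32\ \text{GB}$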
Okay. Now I understand. Thanks.
Here's the source that I'm not sure about and gives messy scores:
for( shift = 1; shift <= ACCUM_SIZE; shift++ )
{
    parity = parity ^ (result >> shift);
}
// parity bit 0 is the XOR of sum bits [16:1]; with a 17-bit raw sum, parity
// bit 1 works out as the XOR of bits [16:2] only - 15-bit coverage - which
// may be why the scores go messy
result = (result ^ (parity & 3)) & ACCUM_MASK;
Here's the source that works well and only has the weak bit 1:
for( shift = 1; shift <= ACCUM_SIZE; shift++ )
{
    parity = parity ^ (result >> shift);
}
// only parity bit 0 is used: sum bit 0 gets XORed with the parity of bits [16:1]
result = (result ^ (parity & 1)) & ACCUM_MASK;
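For anyone who wants to reproduce these runs end to end, here's a rough, untested sketch putting that working variant on top of a plain xoroshiro32+ engine; the triplet [14 2 7] is just one row picked from the score table, not a recommendation, and ACCUM_SIZE = 16 is my assumption:

#include <stdint.h>

#define ACCUM_SIZE 16
#define ACCUM_MASK (((1ULL<<(ACCUM_SIZE-1))-1)<<1|1)

static uint64_t s[2] = { 1, 0 };                 // any non-zero seed will do

static uint64_t rotl(uint64_t x, int k)          // rotate within ACCUM_SIZE bits
{
    return ((x << k) | (x >> (ACCUM_SIZE - k))) & ACCUM_MASK;
}

uint64_t next(void)
{
    uint64_t s0 = s[0], s1 = s[1];
    uint64_t result = s0 + s1;                   // raw 17-bit sum
    uint64_t parity = 0;
    int shift;

    // standard xoroshiro+ state update, example triplet [14 2 7]
    s1 ^= s0;
    s[0] = rotl(s0, 14) ^ s1 ^ ((s1 << 2) & ACCUM_MASK);
    s[1] = rotl(s1, 7);

    // bit-0 parity fix, exactly as in the working source above
    for( shift = 1; shift <= ACCUM_SIZE; shift++ )
        parity ^= result >> shift;
    return (result ^ (parity & 1)) & ACCUM_MASK;
}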
PS: I'm off to bed.