... My PC isn't up to the job as 4 Gbytes of RAM would be needed ...
If DRAM prices hadn't gone through the roof in the last year or so I'd be telling you it's time to get more.
I've tested xoroshiro16++ and the frequency distribution of 16-bit pairs is close to the binomial for the best quadruples, which confirms that xoroshiro++ is not 2-dimensionally equidistributed.
What code can you show me. I'm struggling to grok your descriptions.
... My PC isn't up to the job as 4 Gbytes of RAM would be needed ...
If DRAM prices hadn't gone through the roof in the last year or so I'd be telling you it's time to get more.
I've tested xoroshiro16++ and the frequency distribution of 16-bit pairs is close to the binomial for the best quadruples, which confirms that xoroshiro++ is not 2-dimensionally equidistributed.
What code can you show me. I'm struggling to grok your descriptions.
My PC has all the RAM it can take and it's much less than 4GB!
Program sent by PM (and you'll know why when you see it).
Comments
[14,2,7,8] is the worst performer, as predicted by theory when d is exactly half the maximum possible rotation. The third table illustrates best how close most of the [a,b,c,d] quadruples are to the expected binomial, e.g. f0-f3 for [14,2,7,6] are each a better than 99.5% match and f1-f3 make up 92% of the total non-zero pair frequencies. Doing well in this test does not imply a quadruple will do the same in other tests.
I'm not convince her concerns are entirely valid for what we are wanting though:
1: The progressiveness in state changes are hidden by the scrambler. That's the scrambler's job.
2: Algorithmically predictable is a given. Not a concern to us. This isn't for cryptography.
3: Invertible I don't understand. She seems focused on particular multipliers there. Maybe it's similar to predictability in that it can be hacked. I'm unsure how important this is, but it just sounds like it needs new constants. No shortage of those to choose from.
EDIT: I guess when she read "all-purpose" she interpreted that to mean for cryptographic uses too.
I'm pretty sure Melisa does not include cryptographic quality when she says "all purpose".
As far as I understand these things, which is very little, she is pointing out a serious problem as in:
1) You have this PRNG whose output passses all maner of statistical tests of randomness. Which is sweet.
2) But if you happen to multiply it's output by the wrong numbers in your application, which is something one is likely to do, what you up end up with fails those statistical tests pretty quickly. Not so sweet.
That's just a simple change to the multiplier value then isn't it.
That sounds like a question. I have no idea.
There's no shortage of options for what constants to choose from. No biggie there.
Might take a long while to try them all and see which works best.
I thought the idea was use a multiplier that could be implemented as a couple of shifts and adds instead of an actual multiply. For the sake of speed. That would imply a multiplier with few 1 bits. In that case the search space for a better multiplier is much smaller.
It's fascinating to watch these PRNG "wars". Even if I don't understand the details:
Somebody dreams up a new PRNG.
Somebody else dreams up a new statistical test or other technique to show how crappy it is.
Rinse, repeat...
Next report for Tony - There's two versions of the numbers in the zdist_report, I think the first one is the correct one but that's not how the sudo-code was written. The second one is as per the sudo-code.
Next report for Tony - There's two versions of the numbers in the zdist_report, I think the first one is the correct one but that's not how the sudo-code was written. The second one is as per the sudo-code.
Thanks, Evan. I must have made a mistake with the zero distribution code. I didn't refer back to my working x86 version. We need zdist for two quads, ideally [14,2,7,5] and [14,2,7,6], to see if there is any real difference.
In case others are interested, the sixteen columns beginning with 6:15 are PractRand scores for every possible contiguous 8-bit sample of the 16-bit output. The first number is the msb and the second the lsb, with wrapping around from bit 0 to bit 15 where necessary. No quadruples have all 16 scores of 1G or more and overall quadruple [14,2,7,5] chosen for the XORO32 instruction does well.
Next report for Tony - There's two versions of the numbers in the zdist_report, I think the first one is the correct one but that's not how the pseudo-code was written. The second one is as per the pseudo-code.
Evan, a small correction is are needed in your C code from
so that all 2^32 pairs are read. However, I doubt there is much need for further distribution tests.
Looking at the zero run frequency distributions or zfreq for short, I'm not convinced that zfreq(0) is calculated correctly. Having said that, zfreq(n)/zfreq(n+1) ~ 2.71828 = e for most values of n, including n = 0. In other words, the zero run distribution is an exponential decay function.
I'm not convince her concerns are entirely valid for what we are wanting though:
1: The progressiveness in state changes are hidden by the scrambler. That's the scrambler's job.
2: Algorithmically predictable is a given. Not a concern to us. This isn't for cryptography.
3: Invertible I don't understand. She seems focused on particular multipliers there. Maybe it's similar to predictability in that it can be hacked. I'm unsure how important this is, but it just sounds like it needs new constants. No shortage of those to choose from.
The paper explains the theoretical basis for the multipliers in the * and ** scramblers.
I'm wondering how often people will multiply millions of outputs by the same value of 57.
Evan, a small correction is are needed in your C code from ...
It'd need written, in the prior loop, first.
I think it's correct already though. With the initial call to next_xoro(), there is a total of 2^32 random numbers generated. This should correctly count the wraparound pair.
Doh! I see now. I wasn't really trying to understand the code. The first loop is effectively building indexes into the array - which can be any 32-bit index.
Evan, a small correction is are needed in your C code from ...
It'd need written, in the prior loop, first.
I think it's correct already though. With the initial call to next_xoro(), there is a total of 2^32 random numbers generated. This should correctly count the wraparound pair.
next_xoro() loop is correct as period is 2^32-1 and an initial iteration is needed to get the first "half pair" but the loop that increments freq values is definitely one short as 2^32 pairs must be read.
That's interesting, looks like this is where C's do-while loop comes into its own. The other loop constructs, for() and while(), don't allow all values of an integer to be used because there is an implicit loop completion test at both ends of the loop.
Pity the do-while doesn't ascetically conform to common source formatting styles. That's probably the main reason I don't normally use it.
I've coded the generalised smaller than 8-bit sampling aperture now, and run the 1-bit sampling on the our select candidates. Everything went as expected until I hit [ 6 2 3 9]. It strangely has one seriously bad score at position 13 and there is no sign of this at 8-bit sampling.
Looking at that particular Practrand report file, it hasn't gone far over the threshold. In fact I'm a little surprised by how narrow that one seems to be. I've force a rerun with no termination and it went right to the expected 1 GB without further fails. So I'll put that down to PractRand being a little too sensitive on a particular test.
BEGIN ... at 2018-05-08 13:15:26
PractRand scoring candidate [6 2 3 9] of Xoroshiro32(16)++ random generator.
Without parity replacement bit.
PractRand v0.93 options: stdin -multithreaded -te 1 -tf 2 -tlmin 1KB -tlmaxonly
Sampling width 1, position 13
RNG_test using PractRand version 0.93
RNG = RNG_stdin, seed = 0x9d8a22e4
test set = expanded, folding = extra
rng=RNG_stdin, seed=0x9d8a22e4
length= 1 kilobyte (2^10 bytes), time= 0.4 seconds
no anomalies in 14 test result(s)
rng=RNG_stdin, seed=0x9d8a22e4
length= 2 kilobytes (2^11 bytes), time= 0.7 seconds
no anomalies in 19 test result(s)
rng=RNG_stdin, seed=0x9d8a22e4
length= 4 kilobytes (2^12 bytes), time= 1.0 seconds
no anomalies in 56 test result(s)
rng=RNG_stdin, seed=0x9d8a22e4
length= 8 kilobytes (2^13 bytes), time= 1.7 seconds
Test Name Raw Processed Evaluation
[Low4/16]BCFN_FF(2+0):freq R= +7.1 p~= 5e-6 mildly suspicious
...and 113 test result(s) without anomalies
rng=RNG_stdin, seed=0x9d8a22e4
length= 16 kilobytes (2^14 bytes), time= 2.9 seconds
no anomalies in 179 test result(s)
rng=RNG_stdin, seed=0x9d8a22e4
length= 32 kilobytes (2^15 bytes), time= 4.5 seconds
no anomalies in 246 test result(s)
rng=RNG_stdin, seed=0x9d8a22e4
length= 64 kilobytes (2^16 bytes), time= 6.4 seconds
no anomalies in 316 test result(s)
rng=RNG_stdin, seed=0x9d8a22e4
length= 128 kilobytes (2^17 bytes), time= 8.7 seconds
no anomalies in 367 test result(s)
rng=RNG_stdin, seed=0x9d8a22e4
length= 256 kilobytes (2^18 bytes), time= 10.9 seconds
Test Name Raw Processed Evaluation
[Low1/32]BCFN_FF(2+2):freq R= +6.6 p~= 2e-5 unusual
...and 421 test result(s) without anomalies
rng=RNG_stdin, seed=0x9d8a22e4
length= 512 kilobytes (2^19 bytes), time= 13.3 seconds
no anomalies in 476 test result(s)
rng=RNG_stdin, seed=0x9d8a22e4
length= 1 megabyte (2^20 bytes), time= 15.7 seconds
no anomalies in 531 test result(s)
rng=RNG_stdin, seed=0x9d8a22e4
length= 2 megabytes (2^21 bytes), time= 18.2 seconds
Test Name Raw Processed Evaluation
[Low1/8]FPF-14+6/4:cross R= +5.1 p = 2.3e-4 unusual
[Low1/16]BCFN_FF(2+8):freq R= +12.4 p~= 5e-15 FAIL
[Low1/32]BCFN_FF(2+7):freq R= +9.1 p~= 5e-9 suspicious
...and 582 test result(s) without anomalies
rng=RNG_stdin, seed=0x9d8a22e4
length= 4 megabytes (2^22 bytes), time= 20.8 seconds
Test Name Raw Processed Evaluation
[Low1/16]DC6-9x1Bytes-1 R= +7.4 p = 4.3e-4 unusual
[Low1/32]BCFN_FF(2+8):freq R= +8.0 p~= 2e-7 unusual
...and 639 test result(s) without anomalies
rng=RNG_stdin, seed=0x9d8a22e4
length= 8 megabytes (2^23 bytes), time= 23.6 seconds
Test Name Raw Processed Evaluation
[Low8/32]BCFN_FF(2+0):freq R= +6.6 p~= 1e-5 unusual
...and 693 test result(s) without anomalies
rng=RNG_stdin, seed=0x9d8a22e4
length= 16 megabytes (2^24 bytes), time= 26.9 seconds
no anomalies in 747 test result(s)
rng=RNG_stdin, seed=0x9d8a22e4
length= 32 megabytes (2^25 bytes), time= 30.5 seconds
no anomalies in 796 test result(s)
rng=RNG_stdin, seed=0x9d8a22e4
length= 64 megabytes (2^26 bytes), time= 34.6 seconds
no anomalies in 843 test result(s)
rng=RNG_stdin, seed=0x9d8a22e4
length= 128 megabytes (2^27 bytes), time= 39.4 seconds
no anomalies in 891 test result(s)
rng=RNG_stdin, seed=0x9d8a22e4
length= 256 megabytes (2^28 bytes), time= 45.6 seconds
no anomalies in 938 test result(s)
rng=RNG_stdin, seed=0x9d8a22e4
length= 512 megabytes (2^29 bytes), time= 53.5 seconds
no anomalies in 985 test result(s)
rng=RNG_stdin, seed=0x9d8a22e4
length= 1 gigabyte (2^30 bytes), time= 66.3 seconds
Test Name Raw Processed Evaluation
BCFN_FF(2+0,13-1,T) R= +26.5 p = 1.0e-13 FAIL
BCFN_FF(2+1,13-1,T) R= +31.1 p = 3.4e-16 FAIL !
BCFN_FF(2+2,13-1,T) R= +35.8 p = 9.7e-19 FAIL !
BCFN_FF(2+3,13-1,T) R= +43.1 p = 1.1e-22 FAIL !!
BCFN_FF(2+4,13-2,T) R= +26.8 p = 2.6e-13 FAIL
BCFN_FF(2+5,13-3,T) R= +26.1 p = 3.9e-12 VERY SUSPICIOUS
BCFN_FF(2+6,13-3,T) R= +26.1 p = 3.9e-12 VERY SUSPICIOUS
BCFN_FF(2+7,13-4,T) R= +17.5 p = 1.3e-7 suspicious
BCFN_FF(2+6):freq R= +8.3 p~= 1e-7 mildly suspicious
BCFN_FF(2+7):freq R= +8.1 p~= 2e-7 unusual
BCFN_FF(2+8):freq R= +11.5 p~= 3e-13 VERY SUSPICIOUS
BCFN_FF(2+9):freq R= +12.1 p~= 1e-14 FAIL
DC6-9x1Bytes-1 R= +20.9 p = 5.4e-11 FAIL
DC6-6x2Bytes-1 R= +37.5 p = 6.8e-18 FAIL !
DC6-5x4Bytes-1 R= +57.3 p = 4.0e-35 FAIL !!!
Gap-16:A R= +67.3 p = 1.1e-58 FAIL !!!!
Gap-16:B R= +47.1 p = 3.3e-40 FAIL !!!
[Low4/16]Gap-16:A R= +5.5 p = 3.8e-4 unusual
[Low4/32]Gap-16:A R= +6.2 p = 9.5e-5 unusual
[Low8/32]BCFN_FF(2+1,13-2,T) R= +11.0 p = 3.1e-5 mildly suspicious
[Low8/32]DC6-9x1Bytes-1 R= +19.7 p = 2.3e-10 VERY SUSPICIOUS
[Low8/32]DC6-6x2Bytes-1 R= +9.6 p = 5.4e-5 mildly suspicious
[Low8/32]Gap-16:A R= +26.0 p = 1.8e-20 FAIL !!
[Low8/32]Gap-16:B R= +15.0 p = 3.4e-12 FAIL
[Low8/64]BCFN_FF(2+0,13-3,T) R= +9.6 p = 2.4e-4 unusual
[Low8/64]DC6-9x1Bytes-1 R= +19.5 p = 1.4e-11 FAIL
[Low8/64]DC6-6x2Bytes-1 R= +11.0 p = 2.7e-6 suspicious
[Low8/64]Gap-16:A R= +23.5 p = 1.6e-19 FAIL !
[Low8/64]Gap-16:B R= +10.1 p = 4.2e-8 very suspicious
...and 1010 test result(s) without anomalies
The above exercise is me thinking that 8-bit sampling at each bit position is a good way to check each bit position of the generator's output.
I'm happy to say that not only is 8-bit sampling a good way to do it, but the best way. Practrand internally works on bytes and bigger. But it is stated in the documentation that it is also most sensitive to the patterns in bit0 of the incoming bytes. This means that by shifting an 8-bit sampling window, on a per run basis, to align Practrand's bit0 to each bit of the generator we can test each output bit one by one.
BTW: I can now sample the generator output at any width from 1-bit up to the width of the generator itself. And the generator can be anything from 8-bit to 64-bit. With 64-bit being a Xoroshrio128.
Here's a solid confirmation of how useless the 1-bit apertures are. I've chosen a normally poor scoring candidate, [ 4 7 15 11], and run all sampling cases on it. The 1-bit apertures didn't show anything unusual when they clearly should have:
The above exercise is me thinking that 8-bit sampling at each bit position is a good way to check each bit position of the generator's output.
I'm happy to say that not only is 8-bit sampling a good way to do it, but the best way. Practrand internally works on bytes and bigger. But it is stated in the documentation that it is also most sensitive to the patterns in bit0 of the incoming bytes. This means that by shifting an 8-bit sampling window, on a per run basis, to align Practrand's bit0 to each bit of the generator we can test each output bit one by one.
Each bit can be tested as the lsb of a byte but the other seven bits are also being tested.
BTW: I can now sample the generator output at any width from 1-bit up to the width of the generator itself. And the generator can be anything from 8-bit to 64-bit. With 64-bit being a Xoroshiro128.
Here's a solid confirmation of how useless the 1-bit apertures are. I've chosen a normally poor scoring candidate, [ 4 7 15 11], and run all sampling cases on it. The 1-bit apertures didn't show anything unusual when they clearly should have:
Comments
What code can you show me. I'm struggling to grok your descriptions.
My PC has all the RAM it can take and it's much less than 4GB!
Program sent by PM (and you'll know why when you see it).
Many thanks for running the pair frequency tests, Evan. I think there is a tiny error in the code and this
misses out the FFFF_FFFF pair and it should be
I've run a program to find the frequencies of the FFFF_FFFF pairs and add them to your data, so there is no need for you to run the tests again.
Pair frequencies for xoroshiro32++ [14,2,7,d]
Actual values and Expected binomial
All other frequencies not shown = 0
f0*0 + f1*1 + f2*2 + f3*3 + ... = 2^32-1 = full period
f0 + f1 + f2 + f3 + ... = 2^32-1 or 2^32
|Actual-Expected| values
|Actual-Expected|/Expected values
Truncated to f0-f12 to avoid division by zero
Comments
[14,2,7,8] is the worst performer, as predicted by theory when d is exactly half the maximum possible rotation. The third table illustrates best how close most of the [a,b,c,d] quadruples are to the expected binomial, e.g. f0-f3 for [14,2,7,6] are each a better than 99.5% match and f1-f3 make up 92% of the total non-zero pair frequencies. Doing well in this test does not imply a quadruple will do the same in other tests.
http://www.pcg-random.org/posts/a-quick-look-at-xoshiro256.html
I'm not convince her concerns are entirely valid for what we are wanting though:
1: The progressiveness in state changes are hidden by the scrambler. That's the scrambler's job.
2: Algorithmically predictable is a given. Not a concern to us. This isn't for cryptography.
3: Invertible I don't understand. She seems focused on particular multipliers there. Maybe it's similar to predictability in that it can be hacked. I'm unsure how important this is, but it just sounds like it needs new constants. No shortage of those to choose from.
EDIT: I guess when she read "all-purpose" she interpreted that to mean for cryptographic uses too.
As far as I understand these things, which is very little, she is pointing out a serious problem as in:
1) You have this PRNG whose output passses all maner of statistical tests of randomness. Which is sweet.
2) But if you happen to multiply it's output by the wrong numbers in your application, which is something one is likely to do, what you up end up with fails those statistical tests pretty quickly. Not so sweet.
I think I'll stick to JKISS32
I thought the idea was use a multiplier that could be implemented as a couple of shifts and adds instead of an actual multiply. For the sake of speed. That would imply a multiplier with few 1 bits. In that case the search space for a better multiplier is much smaller.
It's fascinating to watch these PRNG "wars". Even if I don't understand the details:
Somebody dreams up a new PRNG.
Somebody else dreams up a new statistical test or other technique to show how crappy it is.
Rinse, repeat...
Thanks, Evan. I must have made a mistake with the zero distribution code. I didn't refer back to my working x86 version. We need zdist for two quads, ideally [14,2,7,5] and [14,2,7,6], to see if there is any real difference.
Err, make that 4 + 4:
In case others are interested, the sixteen columns beginning with 6:15 are PractRand scores for every possible contiguous 8-bit sample of the 16-bit output. The first number is the msb and the second the lsb, with wrapping around from bit 0 to bit 15 where necessary. No quadruples have all 16 scores of 1G or more and overall quadruple [14,2,7,5] chosen for the XORO32 instruction does well.
Evan, a small correction is are needed in your C code from
to
so that all 2^32 pairs are read. However, I doubt there is much need for further distribution tests.
Looking at the zero run frequency distributions or zfreq for short, I'm not convinced that zfreq(0) is calculated correctly. Having said that, zfreq(n)/zfreq(n+1) ~ 2.71828 = e for most values of n, including n = 0. In other words, the zero run distribution is an exponential decay function.
But Seba is unfazed.
The paper explains the theoretical basis for the multipliers in the * and ** scramblers.
I'm wondering how often people will multiply millions of outputs by the same value of 57.
It'd need written, in the prior loop, first.
I think it's correct already though. With the initial call to next_xoro(), there is a total of 2^32 random numbers generated. This should correctly count the wraparound pair.
Doh! I see now. I wasn't really trying to understand the code. The first loop is effectively building indexes into the array - which can be any 32-bit index.
next_xoro() loop is correct as period is 2^32-1 and an initial iteration is needed to get the first "half pair" but the loop that increments freq values is definitely one short as 2^32 pairs must be read.
Pity the do-while doesn't ascetically conform to common source formatting styles. That's probably the main reason I don't normally use it.
Looking at that particular Practrand report file, it hasn't gone far over the threshold. In fact I'm a little surprised by how narrow that one seems to be. I've force a rerun with no termination and it went right to the expected 1 GB without further fails. So I'll put that down to PractRand being a little too sensitive on a particular test.
I'm happy to say that not only is 8-bit sampling a good way to do it, but the best way. Practrand internally works on bytes and bigger. But it is stated in the documentation that it is also most sensitive to the patterns in bit0 of the incoming bytes. This means that by shifting an 8-bit sampling window, on a per run basis, to align Practrand's bit0 to each bit of the generator we can test each output bit one by one.
Each bit can be tested as the lsb of a byte but the other seven bits are also being tested.
Excellent!
Thanks Evan, it was worthwhile testing.
According to your byte theory, 4-bit samples should have better scores than 8-bit samples.