The Z80 code for [14,8,9,9] is only 13% bigger and slower than for [2,8,7,8] and it would be interesting to see the former's grid scores, when they are ready.
[2,8,7,7] or [2,8,7,9] or [7,8,2,7] or [7,8,2,9] are also +13% compared to [2,8,7,8].
Heh, my focus is entirely on the low scores spoiling the brew. I've pondered a little on how to calculate a single value that strongly represents the few poor scores while still giving some weight to the bulk of the grid ... but my maths knowledge is pretty limited really.
A simple thing to try first is to add all 256 grid scores in K, so that 16G becomes 16M and the total fits into 32 bits (it could only overflow if every one of the 256 scores were 16G). Do the best candidates have the highest total score? The minimum of the 256 scores could be recorded separately.
Because of the exponential scoring, that makes the low scores nearly irrelevant and gives the freak 16Gs an overly strong weight. That's the inverse of what I think is needed. I'm guessing a log base 2 will be part of the answer.
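To make the weighting problem concrete, here is a minimal shell sketch of the sum-in-K idea with the minimum tracked separately. The helper names `score_to_k` and `sum_and_min` and the four sample scores are made up for illustration; they are not taken from the culling scripts or from any real grid.

```shell
#!/bin/sh
# score_to_k SCORE -> prints SCORE ("512M", "16G", ...) in units of K.
score_to_k() {
    num=${1%?}                      # strip the unit suffix
    case $1 in
        *K) echo "$num" ;;
        *M) echo $(( num * 1024 )) ;;
        *G) echo $(( num * 1024 * 1024 )) ;;
        *)  echo "unknown unit in '$1'" >&2; return 1 ;;
    esac
}

# sum_and_min SCORE... -> prints "total_k min_k".
sum_and_min() {
    total=0
    min=
    for s in "$@"; do
        k=$(score_to_k "$s")
        total=$(( total + k ))
        if [ -z "$min" ] || [ "$k" -lt "$min" ]; then
            min=$k
        fi
    done
    echo "$total $min"
}

# Hypothetical mini-grid: one freak 16G among three 512M scores.
sum_and_min 16G 512M 512M 512M
```

With these sample values the single 16G supplies roughly 91% of the total, so the three low scores barely register.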
Zero frequency provides nothing extra really. For good candidates the results are very similar to pair frequency and for poor candidates the results are all over the shop.
As the scores are all in the form of 2^N we could just use N:
34 for 16G, 33 for 8G, 32 for 4G, 31 for 2G, 30 for 1G, etc.
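A minimal sketch of that conversion in shell, assuming every score is a power of two followed by K, M, G or T, as in the PractRand grids. `score_to_log2` and `total_log2` are hypothetical helper names, not part of the existing scripts.

```shell
#!/bin/sh
# score_to_log2 SCORE -> prints N where SCORE (e.g. "512M", "16G") is 2^N bytes.
score_to_log2() {
    case $1 in
        *K) exp=10 ;;
        *M) exp=20 ;;
        *G) exp=30 ;;
        *T) exp=40 ;;
        *)  echo "unknown unit in '$1'" >&2; return 1 ;;
    esac
    num=${1%?}                      # strip the unit suffix
    while [ "$num" -gt 1 ]; do      # count the powers of two in the number
        num=$(( num / 2 ))
        exp=$(( exp + 1 ))
    done
    echo "$exp"
}

# total_log2 SCORE... -> prints the sum of the exponents.
total_log2() {
    total=0
    for s in "$@"; do
        total=$(( total + $(score_to_log2 "$s") ))
    done
    echo "$total"
}

total_log2 16G 8G 4G 2G 1G          # 34 + 33 + 32 + 31 + 30
```

Dividing the total for a full grid by 256 gives the average N used to rank candidates.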
Righty-ho, here's a graph of everything gridded so far. I've highlighted XORO32's [14 2 7 5] and the top scoring [3 2 6 9].
PS: Tony, the [6 2 3 9] average of 30.902 differs from your calculation because the grid you have for that candidate has a number of doubled scores in it.
Thanks, Evan! I was slightly out with my total for [6,2,3,9] by 3. I think there is space to show the odd values on the vertical axis of the graph. Is it worth displaying the highest scores as well?
Although xoshiro is not used in the P2, I'd like to know the ** scrambler for the new xoshiro32** (8-bit output) and also why xoshiro40** does not exist.
I was rereading Melissa's initial review of Xoshiro** to see if I could understand the problem with "Invertible Output Functions". What I realised is that Xoroshiro++ probably doesn't suffer from invertibility, or at least not via a simple constant multiplier. The only constant involved in the output scrambler, "D", gets applied to one of two dynamic components prior to summing. This can't be a simple invertible situation.
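For reference, a minimal shell sketch of the generator under discussion, assuming the usual xoroshiro engine with the ++ output `rol(s0 + s1, d) + s0` and the XORO32 constants [14 2 7 5]. The function and variable names are illustrative only. With s0 = 1 and s1 = 0, the first two outputs pack to 50AD0021, which agrees with the first XORO32 word listed for seed=00000001 at the end of this thread.

```shell
#!/bin/sh
MASK=65535                          # 16-bit words

# rol16 VALUE COUNT -> 16-bit rotate left, result left in $ROL.
rol16() {
    ROL=$(( (($1 << $2) | ($1 >> (16 - $2))) & MASK ))
}

# xoro32pp_next -> leaves the next 16-bit output in $OUT and advances s0/s1.
xoro32pp_next() {
    # ++ scrambler: D rotates the sum, then s0 is added to the rotated sum
    rol16 $(( (s0 + s1) & MASK )) "$d"
    OUT=$(( (ROL + s0) & MASK ))
    # xoroshiro engine step with constants a, b, c
    s1=$(( s1 ^ s0 ))
    rol16 "$s0" "$a"
    s0=$(( ROL ^ s1 ^ ((s1 << b) & MASK) ))
    rol16 "$s1" "$c"
    s1=$ROL
}

a=14; b=2; c=7; d=5                 # XORO32 candidate [14 2 7 5]
s0=1; s1=0                          # seed=00000001
xoro32pp_next; first=$OUT
xoro32pp_next; second=$OUT
printf '%04X%04X\n' "$second" "$first"   # second output in the high word
```

The output depends on both state words through an addition after the rotate, so there is no single constant multiplier to divide back out, which is the point made above.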
For old 8-bit CPUs that shift/rotate only one bit at a time, four xoroshiro32++ candidates in order of increasing code size, execution time and quality could be:
[2,8,7,8] or [13,9,8,8]
[2,8,7,7] or [7,8,2,7]
[3,2,6,9]
EDIT:
[3,2,6,9] is same size and speed as [14,2,7,5] on the Z80.
Oh, wow! I just made a small change to the culling code so that it uses any pre-existing PractRand report files that a previous run had generated. Which also includes the PR reports from grid scoring. Boy, what a speed up!
Of course, it was a long time before I'd stopped meddling with the basics like what PractRand options to settle on, so maybe this has only recently become a useful feature anyway.
Tony,
Here's the matching complete set of grids and frequency distribution files for the above chart:
I'd be very keen to see what Melissa makes of what we have here. Firstly, putting the Xoroshiro++ algorithm through her expert hands. It seems to me it's a nice upgrade to the original algorithm.
But, secondly, also explaining the much less* severe but persistent problem of reduced scores on the 16-bit sampling aperture that PractRand seems to be highlighting. Sometimes the effect is notable on 4/8/12-bit apertures too, but the 16-bit problem just never goes away, no exceptions. And to repeat, this did not show up with Chris's PRNG algorithm, for example.
*Much less severe compared to original Xoroshiro+ algorithm. As in 1000:1 better 16-bit scores.
EDIT: I suppose, to be fair, the effect is always a little bit visible in the 4/8/12-bit apertures. I should forget it and move on.
Yes, I do intend to get in touch with Melissa again. Could you do the pair frequency distribution for xoroshiro32+ [3,2,6] first?
There's no doubt that xoroshiro++ is much better than xoroshiro+ and I'd like to send her our distributions for [3,2,6,9] and [3,2,6]. She doesn't like xoshiro very much, nor the simple ** scrambler which she thinks is easy to unscramble. It would be good if she could look at the ++ scrambler.
Is it worthwhile doing the grid tests on xoroshiro32+p [3,2,6] or even xoroshiro32+ [3,2,6]?
... Could you do the pair frequency distribution for xoroshiro32+ [3,2,6] first?
...
Is it worthwhile doing the grid tests on xoroshiro32+p [3,2,6] or even xoroshiro32+ [3,2,6]?
You're messing with my perfectly tuned scripts again! ...
[3,2,6] was nowhere in the earlier xoroshiro+ testing. If the extended test scores are equally bad perhaps +p[14,2,7] or +[14,2,7] could also be done?
Found a newly added bug, in the last few days, that allowed a rare case of quadruple score. ... after some 17k report files scanned I can say luckily the bug hasn't affected any of the done grids. I picked it out on the latest culling run today.
Have to say it's been a tad hairy trying to suss the logic to manage PractRand's BCFN glitches. Here's the latest decision logic for it:
if [ ! -f "$PRreport" ]; then
    # run and score the candidate
    "$RNGbin" | stdbuf -o L $PRcommand >"$PRreport"
    if [ $? -ne 0 ]; then
        printf "Aborted ${RNGbin} - PractRand error\n"
        exit 2
    fi
fi

# Output the score to console/logfile. This is not score extraction for score table.
extract_score
if [ $bcfncnt -lt $failcnt ]; then
    # a valid score. Only the BCFN tests trip up
    printf "[${a} ${b} ${c} ${d}] sampling aperture ${sampsize}>>${samppos} - PractRand score: ${scoresize}B\n"
else
    # probably incorrect score due to too sensitive BCFN testing. Rerun testing from +1 power
    if [ $bcfncnt -gt 2 ]; then
        printf "Passing because "
    else
        printf "BCFNs: $bcfncnt, [${a} ${b} ${c} ${d}] sampling aperture ${sampsize}>>${samppos}, PractRand score: ${scoresize}B - Trying larger ...\n"
        mv "$PRreport" "${PRreport}.tmp"
        scoresizedn=$scoresize
        sizeup=$(( $sizekb * 2 ))
        "$RNGbin" | stdbuf -o L $PRcommand -tlmin "${sizeup}KB" >"$PRreport"
        extract_score
        if [ $sizekb -eq $sizeup ]; then
            # recurring fails
            if [ $bcfncnt -lt $failcnt ] || [ $bcfncnt -gt 2 ]; then
                # initial score was valid, revert
                rm "$PRreport"
                mv "${PRreport}.tmp" "$PRreport"
                scoresize=$scoresizedn
                printf "BCFNs: $bcfncnt, Fails: $failcnt, Reverted - "
                bcfncnt=0
            else
                # two BCFN glitches in a row! raise another power, rerun again
                printf "BCFNs: $bcfncnt, Fails: $failcnt, ${RNGbin} - Larger again ...\n"
                sizeup=$(( $sizekb * 2 ))
                "$RNGbin" | stdbuf -o L $PRcommand -tlmin "${sizeup}KB" >"$PRreport"
                extract_score
                if [ $sizekb -eq $sizeup ]; then
                    # initial score was valid, revert
                    rm "$PRreport"
                    mv "${PRreport}.tmp" "$PRreport"
                    scoresize=$scoresizedn
                    printf "BCFNs: $bcfncnt, Fails: $failcnt, Reverted - "
                    bcfncnt=0
                else
                    # got through it. Hopefully no other BCFN glitch!
                    rm "${PRreport}.tmp"
                fi
            fi
        else
            # new report is normal, BCFN glitch cleared
            rm "${PRreport}.tmp"
        fi
    fi
    if [ $bcfncnt -eq $failcnt ]; then
        printf "BCFN & Fails: $failcnt, "
    fi
    printf "[${a} ${b} ${c} ${d}] sampling aperture ${sampsize}>>${samppos} - PractRand score: ${scoresize}B\n"
fi
So dimensionality, and therefore equidistribution, is just a perspective thing then.
The consecutive generator output data is identical between my code and XORO32. In fact Chip even made a late change back to iterating the engine, post scrambling. The same as I was always sequencing things. Did we double check the output data after that change? I don't remember.
Yes, we did check the data. We calculate the PRN using the state before it is changed, as that is the quickest way in hardware and in software if parallel processing is available.
It is faster for your C code to save the previous output and use it again rather than go through the whole period twice as in XORO32 but otherwise the results are the same.
Continuing from my last but one post, I think we need extra grid tests tailored to the 32-bit XORO32 output. In the table below the first figure is the sample size, followed by the range of numbers of the lsb's in the samples:
Comments
__________________________________________________________________________________________________________

  PractRand scoring of Xoroshiro32(16)++ candidate [4 8 5 7].
  Byte Sampled Double Full Period = 8 GB
  PractRand v0.93 options:  stdin -multithreaded -te 1 -tf 2 -tlmin 1KB
    |===00====01====02====03====04====05====06====07====08====09====10====11====12====13====14====15=
 16 | 512M    1G    1G    1G    1G    1G  512M  512M  512M  512M  512M  512M  512M  512M  512M  512M
 15 |   4G    4G    4G    4G    2G    4G    4G    4G    4G    2G    2G    2G    4G    4G    4G    4G
 14 |   2G    2G    8G    4G    4G    2G    4G    2G    2G    2G    2G    1G    1G    1G    1G    2G
 13 |   8G    8G    4G    2G    8G    4G    4G    4G    2G    2G    2G    2G    4G    2G    2G    4G
 12 | 512M    1G    1G  512M  512M  512M  512M  512M    1G  512M  512M  512M  512M  512M  512M  512M
 11 |   2G    4G    4G    2G    2G    2G    4G    4G    2G    2G    2G    2G    2G    2G    2G    2G
 10 |   1G    2G    1G    1G    1G    2G    4G    2G    2G    2G    2G    4G    2G    1G    1G    1G
 09 |   2G    4G    4G    4G    2G    4G    8G    8G    4G    4G    4G    4G    4G    8G    4G    4G
 08 |   1G  512M    1G    1G    1G  512M  512M  512M  512M  512M  512M  512M  512M    1G    1G    1G
 07 |   2G    8G    8G    8G    4G    4G    8G    4G    4G    4G    8G    8G    8G    8G    8G    8G
 06 |   4G    4G    2G    2G    2G    2G    2G    2G    2G    2G    2G    4G    2G    4G    2G    2G
 05 |   4G    4G    4G    4G    2G    2G    2G    4G    4G    4G    4G    2G    4G    4G    4G    4G
 04 | 512M  256M    2G  512M  512M  512M  512M  512M  512M  512M  512M  512M  512M  512M    1G    2G
 03 |   2G    4G    4G    2G    4G    2G    2G    4G    4G    2G    2G    2G    2G    2G    2G    4G
 02 | 512M    2G    2G  256M    2G    2G    2G    1G    2G    2G    1G    1G  512M  256M    1G    1G
 01 |   1G    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G
__________________________________________________________________________________________________________

  PractRand scoring of Xoroshiro32(16)++ candidate [14 8 9 9].
  Byte Sampled Double Full Period = 8 GB
  PractRand v0.93 options:  stdin -multithreaded -te 1 -tf 2 -tlmin 1KB
    |===00====01====02====03====04====05====06====07====08====09====10====11====12====13====14====15=
 16 | 512M  512M  512M  512M  512M  512M  512M  512M  512M  512M  512M  512M  512M  512M  512M  512M
 15 |   1G    1G    1G    1G    1G    1G    1G    1G    2G    1G    2G    1G    2G    1G    1G    2G
 14 | 512M  256M  256M  256M  512M  512M  512M  256M  512M  512M  512M  512M  512M  512M  512M  512M
 13 | 256M  128M  128M  128M  128M  256M  256M  256M  128M  128M  128M  128M  256M  256M  256M  256M
 12 |  32M   64M   64M   64M   64M   64M   64M   64M   64M   32M   64M   64M   64M   64M   64M   64M
 11 |  32M   64M   32M   32M   32M   32M   64M   64M   64M   32M   32M   32M   32M   32M   32M   64M
 10 |  16M   64M   32M   32M   32M   32M   32M   64M   32M   32M   16M   32M   32M   32M   16M   64M
 09 |   1G    2G    2G    2G    2G    2G    2G    2G    2G    2G    2G    2G    2G    2G    2G    2G
 08 | 512M  512M  512M  512M  512M  512M  512M  512M  512M  512M  512M  512M  512M  512M  512M  512M
 07 |   2G    4G    4G    2G  128M    2G    4G    4G    2G    2G    2G    2G    2G    2G    2G    4G
 06 |   4G    4G    4G    2G    1G  512M    4G    2G    4G    2G    1G    1G    2G    4G    4G    2G
 05 |   2G    2G    2G    4G    2G    4G    4G    2G    2G    2G    4G    4G    4G    2G    2G    2G
 04 | 256M  512M  512M    1G    1G    1G  512M  512M  512M    1G    2G    1G    1G    1G  512M    4G
 03 |   4G    4G    2G    4G    2G    4G    2G    4G    4G    4G    4G    4G    2G    2G    4G    2G
 02 |   1G    1G    2G    2G    1G    2G    1G    2G    2G    2G    2G    1G    2G    2G    1G    1G
 01 | 512M    1G  512M    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G
[6 2 5 5] is another second place contender.
__________________________________________________________________________________________________________

  PractRand scoring of Xoroshiro32(16)++ candidate [6 2 5 5].
  Byte Sampled Double Full Period = 8 GB
  PractRand v0.93 options:  stdin -multithreaded -te 1 -tf 2 -tlmin 1KB
    |===00====01====02====03====04====05====06====07====08====09====10====11====12====13====14====15=
 16 | 512M  512M  512M  512M  512M    1G    1G  512M    1G  512M  512M  512M  512M  512M  512M  512M
 15 |   4G    4G    8G    4G    8G    8G    8G   16G   16G    8G   16G    8G    8G    8G   16G    8G
 14 |   4G    2G    2G    4G    4G    4G    4G    4G    4G    4G    2G    8G    4G    2G    1G    4G
 13 |   4G    4G    4G    4G    8G    4G    4G    8G    4G   16G    8G    4G    8G    8G    2G    8G
 12 |   1G    2G    1G  512M    1G    1G    1G  512M    1G    1G    1G    1G    1G    1G    1G    1G
 11 |   8G    8G    2G    4G    2G    2G    2G    4G    4G    4G    4G    4G    4G    4G    4G    4G
 10 |   2G    4G    2G    2G    2G    2G    2G    2G    2G    4G    2G    2G    2G    2G    2G    2G
 09 |   8G    4G    8G    4G    4G    4G    8G    8G    8G    4G    4G    4G    4G    8G    8G    8G
 08 |   1G    1G  512M    1G    1G  512M    1G    1G    1G    1G    1G    1G  512M  512M    1G    1G
 07 |   4G    4G    2G    4G    2G    4G    8G    8G    4G    4G    4G    4G    4G    4G    4G    4G
 06 |   1G    1G    1G    2G    2G    2G    2G    2G    2G    2G    2G    2G    2G    1G    1G    1G
 05 |   2G    2G    4G    2G    4G    4G    4G    4G    4G    4G    4G    4G    4G    4G    2G    4G
 04 | 512M  512M    1G  512M    1G    1G    2G    2G    1G    1G    1G    1G    1G  512M    2G    1G
 03 |   2G    4G    4G    4G    4G    2G    4G    2G    4G    2G    4G    2G    4G    2G  512M    4G
 02 |   1G    1G    2G    2G    1G    2G    2G    2G    2G    2G  256M    2G    2G    2G    2G    2G
 01 |   1G    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G    1G
http://forums.parallax.com/discussion/comment/1441593/#Comment_1441593
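With each score expressed as its exponent N (34 for 16G, 30 for 1G), the total of 256 scores fits easily in 16 bits: the maximum possible is 256 × 34 = 8704.
Selected results using these log2 scores: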
#  a,  b,  c,  d,  Total,  Average
   3,  2,  6,  9,   7981,   31.176
   6,  2,  5,  5,   7967,   31.121
  14,  2,  7,  5,   7925,   30.957
   6,  2,  3,  9,   7914,   30.914
  14,  2,  7,  6,   7903,   30.871
   4,  8,  5,  7,   7884,   30.797
   3,  2,  6,  5,   7880,   30.781
Calculated manually. I hope Evan will modify his program from now on!
EDIT: Updated chart here - https://forums.parallax.com/discussion/download/123206/scores_xo++s16.png
http://www.pcg-random.org/posts/xoshiro-repeat-flaws.html
[7,10,10,7]
[5,2,6,8]
[6,2,3,8]
Zero frequency can be dropped.
Evan, many thanks for the data.
Some thoughts about pair distribution. The C code iterates xoroshiro32++, concatenates successive 16-bit outputs and records how often each possible 32-bit value occurs. If the first 16-bit output is 1, the second 2, etc., then the C code produces the following 32-bit data:
{2,1} {3,2} {4,3} ... {2^32-2,2^32-3} {2^32-1,2^32-2} {1,2^32-1}
XORO32 does a double iteration and outputs 32-bit data in the following order:
{2,1} {4,3} ... {2^32-2,2^32-3} {1,2^32-1} {3,2} ... {2^32-1,2^32-2}
Although the order is different the data are the same, i.e. the pair distribution is the XORO32 output distribution and therefore a good test of the XORO32 randomness. The xoroshiro32++ 16-bit output is equidistributed so that every non-zero value occurs 2^16 times and zero occurs 2^16-1 times, but the XORO32 32-bit output is not equidistributed because the ++ scrambler is only 1-dimensionally equidistributed.
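The claim that the two orderings contain the same pairs can be checked on a toy scale. The sketch below assumes a made-up period of 7 outputs (odd, like the real 2^32-1) and only tracks output numbers, not real generator values; `sliding_pairs` and `double_step_pairs` are illustrative names.

```shell
#!/bin/sh
P=7                                 # toy period; odd, like 2^32-1

# next N -> number of the output that follows output N, wrapping at P.
next() { echo $(( $1 % P + 1 )); }

# Pairs as the C code records them: advance one output at a time.
sliding_pairs() {
    i=1; n=0
    while [ "$n" -lt "$P" ]; do
        j=$(next "$i")
        echo "{$j,$i}"
        i=$j
        n=$(( n + 1 ))
    done
}

# Pairs as XORO32 delivers them: two outputs per call, so it takes two
# trips round the period before every pair has been seen.
double_step_pairs() {
    i=1; n=0
    while [ "$n" -lt "$P" ]; do
        j=$(next "$i")
        echo "{$j,$i}"
        i=$(next "$j")
        n=$(( n + 1 ))
    done
}

sliding_pairs | tr '\n' ' '; echo
double_step_pairs | tr '\n' ' '; echo
if [ "$(sliding_pairs | sort)" = "$(double_step_pairs | sort)" ]; then
    echo "same pairs, different order"
fi
```

The argument relies on the period being odd: stepping by two then visits every starting position exactly once over two passes.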
# size, lsb
  32,   0
  31,   0-1
  30,   0-2
  29,   0-3
  28,   0-4
  27,   0-5
  26,   0-6
  25,   0-7
  24,   0-8
  23,   0-9
  22,   0-10
  21,   0-11
  20,   0-12
  19,   0-13
  18,   0-14
  17,   0-15
  16,   0-16
  15,   0-17
  14,   0-18
  13,   0-19
  12,   0-20
  11,   0-21
  10,   0-22
   9,   0-23
   8,   0-24
   7,   0-25
   6,   0-26
   5,   0-27
   4,   0-28
   3,   0-29
   2,   0-30
   1,   0-31
The total is 528 and 527 new ones as [32,0] will be identical to the [16,0] already done. These new tests would use data either from successive outputs or every other output. For example, a 16-bit sample could be the high byte of one output and the low byte of the next and the PractRand scores should be higher than one complete output as the equidistribution will be disturbed/destroyed.
EDIT:
The results could be included in the documentation so that users know which is the best subset for each sample size. A shift or rotate could align the data left or right, but this step might not be needed if the data are sent directly to pins.
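A short shell sketch of the bookkeeping behind those tests: it counts every (size, lsb) combination in a 32-bit word and pulls one sub-sample out of a word. The `extract` helper is a hypothetical name, and the sample word is the first XORO32 output listed for seed=00000001 in this thread.

```shell
#!/bin/sh
# Count every sub-sample: a sample of SIZE bits can start at lsb 0 .. 32-SIZE.
count=0
size=32
while [ "$size" -ge 1 ]; do
    count=$(( count + (32 - size + 1) ))
    size=$(( size - 1 ))
done
echo "$count"

# extract WORD SIZE LSB -> prints SIZE bits of WORD starting at bit LSB.
extract() {
    echo $(( ($1 >> $3) & ((1 << $2) - 1) ))
}

word=1353515041                     # 0x50AD0021
# 16-bit sample at lsb 8: high byte of the first output, low byte of the second
printf '%04X\n' "$(extract "$word" 16 8)"
```

A right shift followed by a mask is the "align right" case; sending the selected bits straight to pins would skip the shift.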
 Xoroshiro128+      Xoroshiro128+      Xoroshiro128++     Xoroshiro128**
 [55 14 36]         [24 16 37]         [24 16 37, 29]     [24 16 37, 5 7 9]
=========================================================================
0000000000000001   0000000000000001   0000000000008280   0000000000001680
0080001000004001   0000002001010001   0000008302808280   0000001696801680
0008402018000121   2041004100000401   d290d28000008289   e682e68000001680
8080563010444122   0100802505212521   8000850d155d02fe   800016f099c09692
09d0240b1809c401   24452841411129c1   935ae353156d54c9   db9ce96699c368c0
01d82012758940e2   4748505981940125   ea6b050aa8850d6a   acd4a897e816f3bc
69b05d703207c544   5490a62565962735   6a060e1804df1055   736225bd8540f37d
df59215fd8d25ee6   e3f63b395995dc40   a59f1df35ee54491   d5ef4bc8db65564a
9a652772eb590ca2   5939cda7f152e916   05cc33ecd8cc11de   e6844f92c4467f20
4ba7fb3a655fc1b1   081d29d20aea9481   29f268efebc9139f   ecbfdd2089c19276
a44648618c9bfe2c   5f1fa965e4aacd70   35c5a4efb1d988ea   3e3c8f32277dc824
d89157b50d9ced27   861f506b0cbeabb1   9a35ab1adf567cff   6135641f1dadd2ca
431bed2f5777656a   debdf4440da5e450   bfc6ffb683b9c1c0   0695119b0de23cfa
3d3295d3e2fa7508   59c623f9812de7b1   7c773f8f9058cb2a   8837ded221b7094d
a69b0c5c067fd26f   4eeb9061804f6b3e   49c5f3032816e63d   99f5fdc2c04a92d6
0bed2d94263f44f2   fd5a67d5f3eb3e99   c643c3c4f29c5bd0   57262a94b712228c
7abb61c8669180c3   cbf60640f2627164   6e4666bc70fba338   efb3dcbf62ed996d
4c7538aa5b034734   e2c19f18434c0a0a   f64f0f76d78bf49d   2a778fe85a20f7aa
4bb5705473f54dce   b8e8e961e7c838e3   2e0f599b1dcccf83   dbcdae58898d40be
eaa489d608e41cb8   dbfdaa161e0172e8   f448857fb889f8bd   18766f4afc81ba49
seed=00000001
Output sequence is:
50AD0021 {1,0}
A3D9B89A {3,2}
9BD90C87 {5,4}
35023CAA {7,6}
3D248840 {9,8}
71F57287
...
A81890AC
2CE3B4C1
B31DCC58
B7A4DF3D
3B2EA2E3
BFFF2C20
5B9AAFE9
35F43BB8
6A2D921C
A217CD9A