zfreq(x) only has a meaning when x > 0 and zfreq(0) tells us nothing new as the code makes zfreq(0) = 2^32 - pfreq(0). The average non-zero run length = e ~ 2.718 and the average zero run length = e / (e-1) ~ 1.582. The former is greater than the latter because there are more non-zero pair values than zero ones. The number of runs is the same for both by definition (to within one at worst). I think no more distribution tests are needed and I'll put a summary of the scores together soon.
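A quick way to sanity-check those two averages is to treat every pair as "zero" with probability 1/e, independently of its neighbours. That independence is an assumption (a real generator only approximates it), and the names below are illustrative rather than taken from the test code:

```python
import math
import random

P_ZERO = 1.0 / math.e  # assumed probability that a given pair never occurs

def mean_run_lengths(p_zero):
    """Mean lengths of zero runs and non-zero runs for independent positions."""
    mean_zero = 1.0 / (1.0 - p_zero)  # a zero run ends when a non-zero appears
    mean_nonzero = 1.0 / p_zero       # a non-zero run ends when a zero appears
    return mean_zero, mean_nonzero

def simulate(n, p_zero, seed=1):
    """Empirical mean run lengths over n simulated positions."""
    rng = random.Random(seed)
    runs = {True: [], False: []}      # True = zero runs, False = non-zero runs
    current, length = None, 0
    for _ in range(n):
        is_zero = rng.random() < p_zero
        if is_zero == current:
            length += 1
        else:
            if current is not None:
                runs[current].append(length)
            current, length = is_zero, 1
    runs[current].append(length)      # flush the final run
    return (sum(runs[True]) / len(runs[True]),
            sum(runs[False]) / len(runs[False]))

mean_zero, mean_nonzero = mean_run_lengths(P_ZERO)
sim_zero, sim_nonzero = simulate(200_000, P_ZERO)
print(f"zero runs:     expected {mean_zero:.3f}, simulated {sim_zero:.3f}")
print(f"non-zero runs: expected {mean_nonzero:.3f}, simulated {sim_nonzero:.3f}")
```

The simulated means should land close to e/(e-1) and e respectively, with the gap shrinking as n grows.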

Thanks a lot, Evan. zfreq(0) is correct now. Looking just at zfreq, [6,2,3,9] and [6,2,11,6] do very well. Not checked others.

[6 2 11 6] comes up clean. There are three very poor scores, but they all check out as borderline BCFN_FF cases. So, really only a single 256 MB and the rest are 512 MB or better.

Here's all nine of the every-aperture score tables, and also the extended PR reports for the three borderline cases of [6 2 11 6].

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

PS: [6 2 3 9] has a correctly poor score of 16 MB at aperture 2+1.

RNG_test using PractRand version 0.93
RNG = RNG_stdin, seed = 0xf3bf6941
test set = expanded, folding = extra
rng=RNG_stdin, seed=0xf3bf6941
length= 1 kilobyte (2^10 bytes), time= 0.4 seconds
no anomalies in 14 test result(s)
rng=RNG_stdin, seed=0xf3bf6941
length= 2 kilobytes (2^11 bytes), time= 0.7 seconds
Test Name Raw Processed Evaluation
FPF-14+6/4:all R= -6.3 p =1-4.2e-3 unusual
...and 18 test result(s) without anomalies
rng=RNG_stdin, seed=0xf3bf6941
length= 4 kilobytes (2^12 bytes), time= 1.1 seconds
no anomalies in 56 test result(s)
rng=RNG_stdin, seed=0xf3bf6941
length= 8 kilobytes (2^13 bytes), time= 1.7 seconds
no anomalies in 114 test result(s)
rng=RNG_stdin, seed=0xf3bf6941
length= 16 kilobytes (2^14 bytes), time= 2.9 seconds
no anomalies in 179 test result(s)
rng=RNG_stdin, seed=0xf3bf6941
length= 32 kilobytes (2^15 bytes), time= 4.5 seconds
no anomalies in 246 test result(s)
rng=RNG_stdin, seed=0xf3bf6941
length= 64 kilobytes (2^16 bytes), time= 6.4 seconds
no anomalies in 316 test result(s)
rng=RNG_stdin, seed=0xf3bf6941
length= 128 kilobytes (2^17 bytes), time= 8.7 seconds
no anomalies in 367 test result(s)
rng=RNG_stdin, seed=0xf3bf6941
length= 256 kilobytes (2^18 bytes), time= 10.9 seconds
no anomalies in 422 test result(s)
rng=RNG_stdin, seed=0xf3bf6941
length= 512 kilobytes (2^19 bytes), time= 13.3 seconds
no anomalies in 476 test result(s)
rng=RNG_stdin, seed=0xf3bf6941
length= 1 megabyte (2^20 bytes), time= 15.6 seconds
no anomalies in 531 test result(s)
rng=RNG_stdin, seed=0xf3bf6941
length= 2 megabytes (2^21 bytes), time= 18.1 seconds
no anomalies in 585 test result(s)
rng=RNG_stdin, seed=0xf3bf6941
length= 4 megabytes (2^22 bytes), time= 20.7 seconds
Test Name Raw Processed Evaluation
FPF-14+6/4:(0,14-1) R= +9.1 p = 8.9e-8 mildly suspicious
FPF-14+6/4:all R= +6.8 p = 8.0e-6 mildly suspicious
FPF-14+6/4:all2 R= +12.8 p = 3.8e-6 mildly suspicious
...and 638 test result(s) without anomalies
rng=RNG_stdin, seed=0xf3bf6941
length= 8 megabytes (2^23 bytes), time= 23.6 seconds
Test Name Raw Processed Evaluation
FPF-14+6/4:(0,14-0) R= +10.9 p = 1.2e-9 very suspicious
FPF-14+6/4:all R= +7.2 p = 3.0e-6 mildly suspicious
FPF-14+6/4:all2 R= +20.8 p = 4.7e-9 very suspicious
...and 691 test result(s) without anomalies
rng=RNG_stdin, seed=0xf3bf6941
length= 16 megabytes (2^24 bytes), time= 26.8 seconds
Test Name Raw Processed Evaluation
FPF-14+6/4:(0,14-0) R= +21.1 p = 3.8e-19 FAIL !
FPF-14+6/4:all R= +16.1 p = 1.2e-14 FAIL
FPF-14+6/4:all2 R=+101.7 p = 1.6e-38 FAIL !!!
...and 744 test result(s) without anomalies
rng=RNG_stdin, seed=0xf3bf6941
length= 32 megabytes (2^25 bytes), time= 30.4 seconds
Test Name Raw Processed Evaluation
FPF-14+6/16:(0,14-0) R= +9.4 p = 2.8e-8 suspicious
FPF-14+6/16:all R= +6.9 p = 5.6e-6 mildly suspicious
FPF-14+6/16:all2 R= +14.2 p = 9.5e-7 suspicious
FPF-14+6/4:(0,14-0) R= +38.1 p = 6.9e-35 FAIL !!!
FPF-14+6/4:(1,14-0) R= +13.9 p = 1.9e-12 VERY SUSPICIOUS
FPF-14+6/4:(2,14-0) R= +9.9 p = 8.9e-9 suspicious
FPF-14+6/4:all R= +31.6 p = 4.1e-29 FAIL !!!
FPF-14+6/4:all2 R=+372.9 p = 7.6e-141 FAIL !!!!!
...and 788 test result(s) without anomalies
rng=RNG_stdin, seed=0xf3bf6941
length= 64 megabytes (2^26 bytes), time= 34.3 seconds
Test Name Raw Processed Evaluation
FPF-14+6/32:all R= +6.3 p = 2.0e-5 mildly suspicious
FPF-14+6/16:(0,14-0) R= +18.0 p = 2.8e-16 FAIL
FPF-14+6/16:(1,14-0) R= +8.5 p = 1.6e-7 mildly suspicious
FPF-14+6/16:(2,14-1) R= +8.7 p = 1.7e-7 mildly suspicious
FPF-14+6/16:all R= +19.1 p = 2.3e-17 FAIL !
FPF-14+6/16:all2 R= +90.7 p = 1.5e-34 FAIL !!!
FPF-14+6/4:(0,14-0) R= +71.8 p = 4.4e-66 FAIL !!!!
FPF-14+6/4:(1,14-0) R= +27.1 p = 1.0e-24 FAIL !!
FPF-14+6/4:(2,14-0) R= +18.3 p = 1.4e-16 FAIL
FPF-14+6/4:(3,14-0) R= +7.8 p = 7.1e-7 mildly suspicious
FPF-14+6/4:all R= +57.5 p = 1.9e-53 FAIL !!!!
FPF-14+6/4:all2 R= +1383 p = 4.8e-541 FAIL !!!!!!!
...and 831 test result(s) without anomalies

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

BTW: I currently have over 6300 report files from Practrand sloshing around the s16 testing directory. I might try to break that up a little ...

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

I haven't posted the sources in that topic. The testing source code is still evolving in some ways. And Tony has some of his own.

There are a number of archives in various posts of this topic though.

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

Thanks a lot, Evan. zfreq(0) is correct now. Looking just at zfreq, [6,2,3,9] and [6,2,11,6] do very well. Not checked others.

[6 2 11 6] comes up clean. There is three very poor scores but they all check out as borderline BCFN_FF cases. So, really only a single 256 MB and the rest are 512 MB or better.

Here's all nine of the every-aperture score tables, and also the extended PR reports for the three borderline cases of [6 2 11 6].

Thanks, Evan. Have you dropped a couple of candidates recently? Is it the case that the nine consist of the seven with the best scores, plus [14,2,7,5] and [14,2,7,6]?

The key question: based on PractRand tests which do you think is the best?

It would be useful to have an every-aperture table with all the corrections for borderline cases.

Ha! That won't be happening unless someone finds a way to tell PractRand to ease up on the sensitivity of the BCFN_FF tests.

You're welcome to hand edit the tables yourself.

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

Here's all nine of the every-aperture score tables, and also the extended PR reports for the three borderline cases of [6 2 11 6].

Thanks, Evan. Have you dropped a couple of candidates recently? Is it the case that the nine consist of the seven with the best scores, plus [14,2,7,5] and [14,2,7,6]?

There were plenty more where those came from. The only other candidate I'd been working on lately was the poorly scoring one for comparison.

The key question: based on PractRand tests which do you think is the best?

I've been spending a lot more time coding than pondering the results.

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

It would be useful to have an every-aperture table with all the corrections for borderline cases.

Ha! That won't be happening unless someone fines a way to tell Practrand to ease up on the sensitivity of the BCFN_FF tests.

You're welcome to hand edit the tables yourself.

It would be good if Chris (the PractRand author) could tell us how to modify the BCFN_FF tests. I could edit the tables, but the amended borderline scores are scattered about in different files and I think not all of them have been posted.

It's interesting that for eight of the nine [a,b,c,d] candidates b = 2, which is a left shift, whereas a, c & d are all left rotates. The exception is [13,5,10,10].

The nine were hand picked from the following table (which I don't seem to have posted before). It is the auto-culled output of the most recent complete run of the full set of candidates:

EDIT: I note it took only 9 hours to run. That suggests the culling threshold was set very high, i.e. any score below 512 MB (except for the direct 16+0 aperture) would have terminated the testing of that candidate.

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

We need some simplified scoring method to compare the every-aperture results, maybe mean and standard deviation for each sample size. Another way could be visually, using 16 by 16 "randomgrams".

I wonder how long it would take to run an every-aperture test for all possible [a,b,c,d] candidates (84 * 16 = 1344), then stop each test if a 16-bit sample score is less than 512M? That should weed out most of the duds. Ideally, the BCFN_FF test would be amended first.

2000 hours (84 days) without any culling.

Ideally, the BCFN_FF test would be amended first.

Yeah, I'll have a nosy at the Practrand sources. I'm not likely to solve it myself though.

This issue has probably already made that culled list above smaller than it should be. It's been present all along but I hadn't gone looking at the report files that closely until the oddball scores really stuck out.

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

Notes
pfreq(x) = output pairs that occur x times
zfreq(x) = zero runs of length x

If N-1 = full period (2^32-1 for the above) then:

(a) Expected pfreq(x) = 1/x! * N/e
(b) Sum of pfreq(x) = N
(c) Sum of x * pfreq(x) = N-1
(d) Expected zero runs = (1-1/e) * N/e
(e) Expected mean zero run = 1/(1-1/e)
(f) Expected zfreq(x) = (1-1/e)^2 * N/e^x
(g) Sum of zfreq(x) = zero runs
(h) Sum of x * zfreq(x) = pfreq0
(i) Sum of x * zfreq(x) / Sum of zfreq(x) = mean run

x >= 0 for pfreq(x), x > 0 for zfreq(x)
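For anyone wanting to reproduce the Expected columns, formulas (a) and (d)-(f) can be evaluated directly and checked against identities (b), (c), (g), (h) and (i). This is only a sketch: the function names are mine, and the sums are truncated where the remaining terms are negligible. Note that the expected values sum to N for (c), whereas the actual data sums to N-1.

```python
import math

N = 2 ** 32  # number of possible output pairs; full period is N - 1

def expected_pfreq(x, n=N):
    """Formula (a): expected number of pairs that occur exactly x times."""
    return n / (math.factorial(x) * math.e)

def expected_zfreq(x, n=N):
    """Formula (f): expected number of zero runs of length x (x > 0)."""
    return (1 - 1 / math.e) ** 2 * n / math.e ** x

expected_zero_runs = (1 - 1 / math.e) * N / math.e  # formula (d)
expected_mean_run = 1 / (1 - 1 / math.e)            # formula (e)

# Truncated sums; terms beyond these limits contribute far less than 1.
sum_pfreq = sum(expected_pfreq(x) for x in range(40))
sum_x_pfreq = sum(x * expected_pfreq(x) for x in range(40))
sum_zfreq = sum(expected_zfreq(x) for x in range(1, 200))
sum_x_zfreq = sum(x * expected_zfreq(x) for x in range(1, 200))

print(f"(b) sum of pfreq(x)     = {sum_pfreq:.1f}")
print(f"(c) sum of x * pfreq(x) = {sum_x_pfreq:.1f}")
print(f"(g) sum of zfreq(x)     = {sum_zfreq:.1f} vs (d) {expected_zero_runs:.1f}")
print(f"(h) sum of x * zfreq(x) = {sum_x_zfreq:.1f} vs pfreq0 {expected_pfreq(0):.1f}")
print(f"(i) mean run            = {sum_x_zfreq / sum_zfreq:.6f} vs (e) {expected_mean_run:.6f}")
```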

Comments
These frequency distribution tests are intended as an independent "sanity check" but doing well does not imply the same will happen in other, more rigorous randomness tests. However, these frequency tests can detect certain bad candidates and also act as a "tie-breaker" for good candidates, as is the case here. For example, assuming all other scores are equal, [6,2,5,5] could be eliminated on pair frequency (which is more important in my opinion than zero run frequency).

I've made a multi-case launching script too. It uses all RAM and creates a lot of smaller files now. I mainly chose this approach because I was wary of many programs concurrently trying to append to the same files.

There is a small error in the above code. If the last byte of the 4GB pair array is zero, then the last zero run will not be written to the zfreq array. Also, it would be better if zfreq[0] held the number of zero runs, i.e. it is incremented whenever zfreq[tally] is incremented. I think the amended code could be:
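To illustrate the two changes in isolation, here is a minimal Python sketch of a zero-run tally. It is not the test code from this thread: the function name, the `max_run` clamp and the list-based input are assumptions made for the example. It flushes a run that reaches the end of the array and keeps the run count in `zfreq[0]`.

```python
def tally_zero_runs(counts, max_run=64):
    """Return zfreq, where zfreq[x] is the number of zero runs of length x
    (x > 0) and zfreq[0] is the total number of zero runs.

    counts  -- per-pair occurrence counts (a small list here, 4GB in the real test)
    max_run -- hypothetical cap; longer runs are lumped into the last bin
    """
    zfreq = [0] * (max_run + 1)
    tally = 0

    def record(length):
        zfreq[min(length, max_run)] += 1
        zfreq[0] += 1  # incremented whenever zfreq[tally] is incremented

    for c in counts:
        if c == 0:
            tally += 1
        elif tally:
            record(tally)
            tally = 0
    if tally:  # the array ended on a zero, so write out the last run too
        record(tally)
    return zfreq

example = tally_zero_runs([1, 0, 0, 2, 0, 1, 0, 0, 0])
print(example[:4])  # [3, 1, 1, 1]
```

The example array ends in a run of three zeros, which is exactly the case the original loop would have dropped.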

The results two posts up are remarkably good. |Actual-Expected|/Expected shows this best: the smaller the better, and some of the errors are tiny. Note the consistency across pfreq0-pfreq3. I deduced the exponential distribution of zero runs from the results; I can't prove it!

I'd like to see how bad candidates perform, e.g. xoroshiro32++ [14,2,7,8] or xoroshiro32+p [14,2,7] or xoroshiro32+ [14,2,7], getting progressively worse. Repeat frequency tests suggest that xoroshiro32++ [14,2,7,7] and [14,2,7,9] will be adversely affected by their proximity to [14,2,7,8].

We could run the frequency tests and use pfreq0 to cull candidates before any PractRand culling. For example, |Actual-Expected|/Expected less than 0.001 is achievable, thus the |Actual-Expected| threshold could be set to Expected/1000 = 1580030. In other words, we look for better than 0.999 accuracy (> 99.9%). Harsh, but fair. Record the candidates that pass, then try these with 16-bit sample PractRand tests with minimum threshold 512M. This process should be a lot faster than using PractRand only.
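The pfreq0 cull could be as simple as the following sketch. The function name and the strict less-than comparison are my assumptions; only the 0.001 tolerance and the Expected/1000 figure come from the post above.

```python
import math

N = 2 ** 32
EXPECTED_PFREQ0 = N / math.e  # formula (a) with x = 0

def passes_pfreq0(actual_pfreq0, tolerance=0.001):
    """True if |Actual-Expected|/Expected is below the tolerance."""
    return abs(actual_pfreq0 - EXPECTED_PFREQ0) / EXPECTED_PFREQ0 < tolerance

threshold = int(EXPECTED_PFREQ0 / 1000)
print(threshold)  # 1580030
```

Candidates that pass would then go on to the 16-bit sample PractRand tests with the 512M minimum.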

1. A pfreq13+ value that is the sum of all output pair frequencies of 13 and higher.
2. A zfreq22+ value that is the sum of all zero run lengths of 22 and higher.
3. A mean run value for average zero run length, which is the ratio of two sums calculated using test data and therefore arguably the most meaningful zero run result. The relative rankings between different candidates for mean run are very similar to pfreq0-pfreq3, which suggests the latter could be used solely as "tie-breakers".

Both 1 and 2 are expected to be zero for a good candidate, but could be significantly more than zero for a bad candidate (conjecture in the absence of an actual test).
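Given the two tally arrays, the three extra values could be computed along these lines. This is a sketch with hypothetical names; it assumes `pfreq[x]` and `zfreq[x]` are indexed by x and ignores `zfreq[0]`, whatever it happens to hold. The sample arrays are made up purely to exercise the function.

```python
def summarize(pfreq, zfreq):
    """Return (pfreq13+, zfreq22+, mean run) from the two tally arrays."""
    pfreq13_plus = sum(pfreq[13:])  # pairs occurring 13 or more times
    zfreq22_plus = sum(zfreq[22:])  # zero runs of length 22 or more
    runs = sum(zfreq[1:])           # note (g): total number of zero runs
    zeros = sum(x * zfreq[x] for x in range(1, len(zfreq)))  # note (h)
    mean_run = zeros / runs if runs else 0.0                 # note (i)
    return pfreq13_plus, zfreq22_plus, mean_run

# Tiny made-up arrays, not real test data.
sample_pfreq = [5, 4, 2] + [0] * 10 + [1, 2]  # indices 0..14
sample_zfreq = [0, 6, 3, 1] + [0] * 18 + [2]  # indices 0..22
result = summarize(sample_pfreq, sample_zfreq)
print(result)
```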

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

## Comments

Evan, this full table with 256 PractRand scores is excellent. Do you have similar tables for all the 11 candidates you've been looking at recently?

It's very good work. Should be published and accessible.

Maybe one more, if possible. [14,2,7,8] should do poorly and would be a good comparison to the others.

I think we should keep using this thread for test results and leave the other topic for general info about the algorithm plus recommended constants.

Here are the processed results for the selected xoroshiro32++ [a,b,c,d] in the reports:

[Tables: Pair frequency, Actual and Expected; Pair frequency, |Actual-Expected|/Expected; Zero run frequency, Actual and Expected; Zero run frequency, |Actual-Expected|/Expected]

And a response:

http://www.pcg-random.org/posts/on-vignas-pcg-critique.html

Do you have the full-period constants for any of xoroshiro36/38/40?

Many thanks, Evan. xoroshiro40 has relatively few full-period constants.

Here's the same run for Xorshift "xs-s20" but including the log file. I don't seem to have kept very many logs.

EDIT: Looking at the s15 candidate list, it's only 73 lines, which is shorter than the s16 list. That makes the s16 list oddly long at 84 lines.
