I've been looking at xoroshiro32 scramblers using subtraction instead of addition and the following four permutations output non-zero values 2^16 times and zero 2^16-1 times.

I don't know whether the pair distributions (pfreq) are identical because I don't have 4GB of RAM to test them. Also I haven't looked at (s0 - s1) yet.

The following six permutations have identical equidistribution for single outputs:

If pfreq are not identical then maybe something other than ++ will have better quality.

I have studied all of those variations previously with PractRand (with only 0-15 aperture) and did not see any significant difference.
Definitely a valid path of investigation for tuning of freqs, but not sure otherwise.
I did implement a subtraction in my original xoroshiroNOT code because I could demonstrate a measurable improvement, but it is in a slightly different context:

//Fully one-dimensionally equidistributed (over full period, no short values) version, D must be odd
static inline uintx xoroshiroNOT1d(void) {
uintx tmp = s[0] ^ s[2];
s[2] = rotl(s[0] + s[1], D) + tmp - s[1] - const_odd;
uintx s1 = s[0] ^ s[1];
s[0] = rotl(s[0], A) ^ s1 ^ (s1 >> B ); //note reversed B shift direction required for inverted constants
s[1] = rotl(s1, C);
return s[2]; }

Evan and Chris, I think it would be useful if you were able to calculate Chi-Square sums so that you could rank the pair/zero/non-zero frequency distributions yourselves, without needing me.

I get most of what you posted, but have a few fundamental questions about the frequency distribution calculations (and since C is not my language of choice... I have to convert to/from either ASM or VB, which is tedious, at best):

My distribution programs are in QuickBASIC, which I could clean up and post here. Evan uses C to create the raw binary data that my programs read.

1. What are the raw count values based on for each (e.g. for pairs, is it word pairs, byte pairs, bit pairs)?

4GB of RAM are needed for a byte array that records how often each possible output pair from 0000_0000 to FFFF_FFFF occurs during two full periods of 2 x 2^32-1 = 2^33-2 single outputs. The array is zeroed, then each output pair provides the 32-bit address of a byte value that is incremented. Each state is iterated twice (except all zeroes) and its corresponding output is the low word of the address during the first period and the high word during the second period, or vice-versa.

There are three other arrays, each 1KB (256 x 32-bit) in size for pfreq, zfreq and nzfreq, also zeroed beforehand. The 4GB array is read sequentially from 0000_0000 to FFFF_FFFF and each byte value is used as an index to increment a four-byte pfreq count. Zero runs and non-zero runs can be calculated at the same time and when they end the run length is used as an index to increment a four-byte zfreq or nzfreq count.

2. Why terminating z / nz runs at end of 4GB of data (as opposed to the 8GB of a single period or 16GB of a double period)?

The FFFF_FFFF byte value will be the end of a zero/non-zero run, but there won't be a non-zero/zero value to terminate that run because all the data have been read. It's a minor point for strict accuracy, but without checking for end of data one of the runs won't be recorded.

If pfreq are not identical then maybe something other than ++ will have better quality.

I have studied all of those variations previously with PractRand (with only 0-15 aperture) and did not see any significant difference.
Definitely a valid path of investigation for tuning of freqs, but not sure otherwise.
I did implement a subtraction in my original xoroshiroNOT code because I could demonstrate a measurable improvement, but it is in a slightly different context:

//Fully one-dimensionally equidistributed (over full period, no short values) version, D must be odd
static inline uintx xoroshiroNOT1d(void) {
uintx tmp = s[0] ^ s[2];
s[2] = rotl(s[0] + s[1], D) + tmp - s[1] - const_odd;
uintx s1 = s[0] ^ s[1];
s[0] = rotl(s[0], A) ^ s1 ^ (s1 >> B ); //note reversed B shift direction required for inverted constants
s[1] = rotl(s1, C);
return s[2]; }

I don't expect any appreciable differences with PractRand scores, because PractRand can detect equidistribution which is identical for all +/- scramblers. I'm keen to know whether there are differences with pfreq/zfreq/nzfreq. Testing only [13,5,10,9] with +- and the two possible -+ and -- pairs should be enough.

I'm keen to know whether there are differences with pfreq/zfreq/nzfreq. Testing only [13,5,10,9] with +- and the two possible -+ and -- pairs should be enough.

I'm keen to know whether there are differences with pfreq/zfreq/nzfreq. Testing only [13,5,10,9] with +- and the two possible -+ and -- pairs should be enough.

Evan, do you have some spare time to try this?

The expectation would be that superior pfreq/zfreq/nzfreq would be abundant in at least some of the 5 untested variants...
If that were the case, it would be good news, because the same would very likely be true for the s0 * 5 class, which has already shown some excellent potential.
Use of '-' successfully would potentially eliminate any need to use M/E/'Bad-Seed', which were only originally only intended to entirely eliminate escape-from-zero issues (and as a side-effect, provide de-correlated streams for my own code), but I had co-opted those additional constants to attempt to manipulate pfreq/zfreq/nzfreq (thus overly-complicating implementation).
Tony's idea is much cleaner, if it works as I expect. I would guess that the highest several original PractRand scoring s0 and current high-scoring s0 * 5 candidates would be the top-runners.

I'm keen to know whether there are differences with pfreq/zfreq/nzfreq. Testing only [13,5,10,9] with +- and the two possible -+ and -- pairs should be enough.

Evan, do you have some spare time to try this?

The expectation would be that superior pfreq/zfreq/nzfreq would be abundant in at least some of the 5 untested variants...
If that were the case, it would be good news, because the same would very likely be true for the s0 * 5 class, which has already shown some excellent potential.
Use of '-' successfully would potentially eliminate any need to use M/E/'Bad-Seed', which were only originally only intended to entirely eliminate escape-from-zero issues (and as a side-effect, provide de-correlated streams for my own code), but I had co-opted those additional constants to attempt to manipulate pfreq/zfreq/nzfreq (thus overly-complicating implementation).
Tony's idea is much cleaner, if it works as I expect. I would guess that the highest several original PractRand scoring s0 and current high-scoring s0 * 5 candidates would be the top-runners.

I've now looked at pair outputs 0000_0000 to 0000_FFFF and pfreq for this subset is different for each of the first four scramblers listed above. The overall pfreq distribution for 0000_0000 to FFFF_FFFF might be identical, however, e.g. if there is a simple translation or mirroring at work here. Anyway, as a small subset is not the same the whole lot seems to be worth testing, which I can't do due to insufficient RAM. It might turn out that ++ produces the best distributions, though.

I am not sure if this will help: Approximate Counting Algorithm
If so (since it is only an estimate, and the memory savings is only modest, requiring 3-bits instead of 8 )... I suggest using the best PRNG that you can reasonably implement it with for use with QB.
You would then have to take the fully processed results and re-do multiple times (with different seeds) to get a more formal average estimate.

DDR4 DIMMs have finally come back down to normal pricing again. Now's a good time to refit an old PC box.

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

Ugh! That won't work for the initial tally in the 4 GB map, since it needs to be able to do single increments all the way.

I think Melisa did a distribution test on a much bigger state size some time back. She mentioned renting a server with 512 GB of RAM to do it but also mentioned doing some trickery to make do with the still very limited space compared to what was needed.

I pondered what she might have done and the only think that has come to mind is dividing the tally map up into logical segments and run the full period over each segment and generate the stats for each segment one by one.

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

Ugh! That won't work for the initial tally in the 4 GB map, since it needs to be able to do single increments all the way.

The single increments would only occur if the PRNG coin-flips say they should occur... which creates lots of overhead AND requires a 3-bit bucket calculation to index into a standard bit size array.
Very ugly (and inaccurate), yes. Normally only used when tallying much larger bit sizes (e.g. 6 bits for tallying 64 bit values).

Regarding tallying states in a probabilistic way, Melissa was finally able to get the memory requirements down to 6-12GB using a Bloom Filter (which I have not studied): Here

Hopefully there will be something useful to come from this soon.

Is what we have for P2 going to work? Or is it still and unknown theoretical experiment?

The P2 works fine, as is... the current discussion (which I restarted as an outsider interested in the P2 and random numbers) is indeed on theoretical improvements to the smallest (32-bit state) P2 pseudo-random number generator (PRNG).
Many of the properties of a 32-bit state PRNGs can be exhaustively tested and proven, and small improvements to the statistical randomness might be desirable if they hold up to the strictest scrutiny.
All pseudo-random number generators are unproven, in the sense that for most a statistical test can very easily be created to find that they are not truly random.

This thread, which I haven’t studied, makes me extremely nervous that all the silicon used for this in P2 is only based on unproved theory.

Part of this discussion has been concerned with ensuring that silicon real-estate is not wasted and the current functional and mathematically sound P2 32-bit PRNG is untouched unless there is sufficient evidence to implement improvements.

For my part (and for others too, I hope), I find the exploration of creating pseudo-random numbers enjoyable. It is one of the few approachable frontiers left in mathematics.

Regarding tallying states in a probabilistic way, Melissa was finally able to get the memory requirements down to 6-12GB using a Bloom Filter (which I have not studied): Here

Lol, I didn't read that far down, and that was the very article I remember too. Maybe I should read it some more ...

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

Ah, and having now examined even the first sources, and shell output "198893 outputs to scan for duplicates", she's provided I can see there is no attempt to run the full period. It's a case of run a small amount, and do the stats on that. Not what I was thinking of before.

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

DDR4 DIMMs have finally come back down to normal pricing again. Now's a good time to refit an old PC box.

Thanks, Evan. My RAM cannot be increased and is a long way short of 4GB.

With six +/- scramblers to test, I think it would be better if you could calculate the three Chi-Square sums for pfreq/zfreq/nzfreq yourself and post those rather than six lots of (partial) raw data in .bin files. Full details here: http://forums.parallax.com/discussion/comment/1469463/#Comment_1469463

[13,5,8,10] would be a good one to test along with [13,5,10,9].

You can usually pick up second hand gear for almost nothing. Just have to be willing to put the new mobo/cpu/ram in is all. And of course being familiar with doing a fresh OS install to clean out the crap on the existing HDD also helps.

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

Heh, I had to patch a script to do a selective distribution run. Then I discovered I'd only half finished inserting the E-Correction feature. I started trying to tidy that up but in the end removed it again.

I'm distribution testing a set of 12 ++ candidates from last years testing. On the third algorithm ...

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

I discovered I'd only half finished inserting the E-Correction feature. I started trying to tidy that up but in the end removed it again.

I searched for variants with M that that would not require E/'bad-seed' and did not find any, which makes sense.

My testing on xoroshiro128psp (which uses M for stream selection, E is not used/required and 'bad-seed' is tested explicitly by state comparison after one pump) is painfully slow.
Using Seba's current ABC constants, I have eliminated D=31 and 32 based on BigCrush meta-analysis. D=33 shows initial signs of bias in 1 of the 254 BigCrush p-values*.

Seba's Hamming Weight (0.5PB) and PractRand (2TB) are unaware of any issues (that I can easily see with BigCrush meta-analysis) on any of the three D values I've tested.
I'll have to defer full tests on those till BigCrush is statistically sound after 500+ runs.

It is unfortunate that at any much greater than 32-bit state size that pp/psp becomes too mathematically hard (yet promises superior randomness), and exhaustive testing is so difficult and time-consuming.

*Edit: The bias has dropped from +3.5 standard deviations to ~1 SD (but must keep testing), which can happen, as skew on partially correlated data can extend well beyond the normal minimum 30 data points used for published research (which can lead to false positives and negatives, which is unfortunate). For each P-value I use a minimum of 88 data points for the smallest (most strongly statistically correlated) sets, and at least 528 points for the larger (possibly cross-correlated) sets. All P-values are then tested as a superset using my SD alternative to chi-squared.

Tony,
I've implemented the Chi Squared thingy. I'm wanting to label the four columns but this is mostly over my head so over to you. Eg: what does "x" represent? Here's example output, currently:

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

Tony,
I've implemented the Chi Squared thingy. I'm wanting to label the four columns but this is mostly over my head so over to you. Eg: what does "x" represent? Here's example output, currently:

pfreq(x) is the number of different output pairs that occur x times. In this example, pfreq(0) = 1580769164, therefore 1580769164 different 32-bit outputs never occur, 1579279603 occur once, etc.

zfreq(x) or nzfreq(x) is the number of times there is a zero or non-zero run of length x (x successive 32-bit outputs are zero or non-zero).

A couple of points:

1. I think you might need to add 0.5 before truncation as expected pfreq(0) = expected pfreq(1) = 1580030169 to nearest integer. Floating-point could be used for the expected values for total accuracy, but I doubt rankings will be different with integer arithmetic.

2. The sum of Chi-Square(x) is the important result for each of pfreq/zfreq/nzfreq. It is not very good for pfreq in this instance.

Worked through all that thanks. Identical expected's now except short by one nzfreq. Attached is auto-generated CSV file for a selection of 12 high scoring candidates from back last year. I think they're from single iterated scoring.

PS: First column of CSV file is the total of ChiSquare of that candidate's frequency distribution.

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

The two stand-outs of that selection is, with candidate [6 2 3 6]: rol(s0+s1,d)-s0 and rol(-s0-s1,d)+s0.

EDIT: Here's another auto-generated CSV that can be cleanly sorted in a spreadsheet and it shows which candidate and algorithm go with the best ChiSquare scores.

EDIT2: I've just improved the rounding for the auto-generation script as well. So some of the attached scores will be slightly revised in next posting. I'm off to bed now though.

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

The two stand-outs of that selection is, with candidate [6 2 3 6]: rol(s0+s1,d)-s0 and rol(-s0-s1,d)+s0.

EDIT: Here's another auto-generated CSV that can be cleanly sorted in a spreadsheet and it shows which candidate and algorithm go with the best ChiSquare scores.

EDIT2: I've just improved the rounding for the auto-generation script as well. So some of the attached scores will be slightly revised in next posting. I'm off to bed now though.

Thanks for the results, Evan. The 12 candidates aren't the best ones with [3,2,6,5] top of the 12 at 32nd overall. It wil be interesting to see the results for better candidates. Using its best alternative scrambler [6,2,3,6] moves up from 178th to 6th overall, assuming everything else unchanged.

As I suspected there are four positive/negative scrambler pairs with very similar (but not quite identical) results:

I'm beginning a fresh full distribution run of "xo" (Xoroshiro32++), which will have the ChiSquare totals included in the report data. That'll net a full list of easy sortable ChiSquare scores. We can then compare against what you've got already, Tony.

PS: I think it's quicker to do a full rerun than make another program to analyse the old report data.

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

The Chi-Square sum names could be shortened to pchi, zchi and nzchi. Adding these to give a composite sum is nonsense mathematically but it is a quick way to spot and eliminate the duds if > 9999.

The scramblers with positive s1 do not need the initial + sign and therefore the positive/negative pairs can be written as

I have a feeling the best [a,b,c,d] with ++ might be worse with +-/-+/-- or their negatives, but might be improved a little with ---. The only way to be sure is to create all the binaries. I suggest +-, -+ and -- first and the negatives later.

I've just done a compare against your most recent selection of ++ candidates - https://forums.parallax.com/discussion/comment/1468718/#Comment_1468718 - and note that, although the rankings differ, the pfreq/zfreq/nzfreq ChiSquare totals all match. Tony, I assume you've been ranking pfreq with higher priority.

"We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
By doing that, we can more accurately measure their mass, and determine whether
scientists have systematically been underestimating how much matter they contain."

I've just done a compare against your most recent selection of ++ candidates - https://forums.parallax.com/discussion/comment/1468718/#Comment_1468718 - and note that, although the rankings differ, the pfreq/zfreq/nzfreq ChiSquare totals all match. Tony, I assume you've been ranking pfreq with higher priority.

Evan, I've changed the Chi-Square titles in that post to pchi/zchi/nzchi. I give equal weight to prank, zrank and nzrank. I sum the ranks (i.e. prank+zrank+nzrank) to get an overall ranking and the lowest one is given a rank of 1. I do not sum pchi+zchi+nzchi. If prank+zrank+nzrank is the same integer for two different [a,b,c,d] then I use prank as the tie-breaker.

## Comments

1,323The following six permutations have identical equidistribution for single outputs:

If pfreq are not identical then maybe something other than ++ will have better quality.

158Definitely a valid path of investigation for tuning of freqs, but not sure otherwise.

I did implement a subtraction in my original xoroshiroNOT code because I could demonstrate a measurable improvement, but it is in a slightly different context:

1,323My distribution programs are in QuickBASIC, which I could clean up and post here. Evan uses C to create the raw binary data that my programs read.

4GB of RAM are needed for a byte array that records how often each possible output pair from 0000_0000 to FFFF_FFFF occurs during two full periods of 2 x 2^32-1 = 2^33-2 single outputs. The array is zeroed, then each output pair provides the 32-bit address of a byte value that is incremented. Each state is iterated twice (except all zeroes) and its corresponding output is the low word of the address during the first period and the high word during the second period, or vice-versa.

There are three other arrays, each 1KB (256 x 32-bit) in size for pfreq, zfreq and nzfreq, also zeroed beforehand. The 4GB array is read sequentially from 0000_0000 to FFFF_FFFF and each byte value is used as an index to increment a four-byte pfreq count. Zero runs and non-zero runs can be calculated at the same time and when they end the run length is used as an index to increment a four-byte zfreq or nzfreq count.

The FFFF_FFFF byte value will be the end of a zero/non-zero run, but there won't be a non-zero/zero value to terminate that run because all the data have been read. It's a minor point for strict accuracy, but without checking for end of data one of the runs won't be recorded.

1,323I don't expect any appreciable differences with PractRand scores, because PractRand can detect equidistribution which is identical for all +/- scramblers. I'm keen to know whether there are differences with pfreq/zfreq/nzfreq. Testing only [13,5,10,9] with +- and the two possible -+ and -- pairs should be enough.

1,323Evan, do you have some spare time to try this?

158If that were the case, it would be good news, because the same would very likely be true for the s0 * 5 class, which has already shown some excellent potential.

Use of '-' successfully would potentially eliminate any need to use M/E/'Bad-Seed', which were only originally only intended to entirely eliminate escape-from-zero issues (and as a side-effect, provide de-correlated streams for my own code), but I had co-opted those additional constants to attempt to manipulate pfreq/zfreq/nzfreq (thus overly-complicating implementation).

Tony's idea is much cleaner, if it works as I expect. I would guess that the highest several original PractRand scoring s0 and current high-scoring s0 * 5 candidates would be the top-runners.

1,323I've now looked at pair outputs 0000_0000 to 0000_FFFF and pfreq for this subset is different for each of the first four scramblers listed above. The overall pfreq distribution for 0000_0000 to FFFF_FFFF might be identical, however, e.g. if there is a simple translation or mirroring at work here. Anyway, as a small subset is not the same the whole lot seems to be worth testing, which I can't do due to insufficient RAM. It might turn out that ++ produces the best distributions, though.

158If so (since it is only an estimate, and the memory savings is only modest, requiring 3-bits instead of 8 )... I suggest using the best PRNG that you can reasonably implement it with for use with QB.

You would then have to take the fully processed results and re-do multiple times (with different seeds) to get a more formal average estimate.

8,043DDR4 DIMMs have finally come back down to normal pricing again. Now's a good time to refit an old PC box.

By doing that, we can more accurately measure their mass, and determine whether

scientists have systematically been underestimating how much matter they contain."

8,043Ugh! That won't work for the initial tally in the 4 GB map, since it needs to be able to do single increments all the way.

I think Melisa did a distribution test on a much bigger state size some time back. She mentioned renting a server with 512 GB of RAM to do it but also mentioned doing some trickery to make do with the still very limited space compared to what was needed.

I pondered what she might have done and the only think that has come to mind is dividing the tally map up into logical segments and run the full period over each segment and generate the stats for each segment one by one.

By doing that, we can more accurately measure their mass, and determine whether

scientists have systematically been underestimating how much matter they contain."

158Very ugly (and inaccurate), yes. Normally only used when tallying much larger bit sizes (e.g. 6 bits for tallying 64 bit values).

Regarding tallying states in a probabilistic way, Melissa was finally able to get the memory requirements down to 6-12GB using a Bloom Filter (which I have not studied): Here

15,456Is what we have for P2 going to work? Or is it still and unknown theoretical experiment?

This thread, which I haven’t studied, makes me extremely nervous that all the silicon used for this in P2 is only based on unproved theory.

My Prop boards: P8XBlade2, RamBlade, CpuBlade, TriBladeProp OS(also see Sphinx, PropDos, PropCmd, Spinix)Website: www.clusos.comProp Tools (Index) , Emulators (Index) , ZiCog (Z80)158Many of the properties of a 32-bit state PRNGs can be exhaustively tested and proven, and small improvements to the statistical randomness might be desirable if they hold up to the strictest scrutiny.

All pseudo-random number generators are unproven, in the sense that for most a statistical test can very easily be created to find that they are not truly random. Part of this discussion has been concerned with ensuring that silicon real-estate is not wasted and the current functional and mathematically sound P2 32-bit PRNG is untouched unless there is sufficient evidence to implement improvements.

For my part (and for others too, I hope), I find the exploration of creating pseudo-random numbers enjoyable. It is one of the few approachable frontiers left in mathematics.

8,043By doing that, we can more accurately measure their mass, and determine whether

scientists have systematically been underestimating how much matter they contain."

8,043By doing that, we can more accurately measure their mass, and determine whether

scientists have systematically been underestimating how much matter they contain."

1,323Thanks, Evan. My RAM cannot be increased and is a long way short of 4GB.

With six +/- scramblers to test, I think it would be better if you could calculate the three Chi-Square sums for pfreq/zfreq/nzfreq yourself and post those rather than six lots of (partial) raw data in .bin files. Full details here:

http://forums.parallax.com/discussion/comment/1469463/#Comment_1469463

[13,5,8,10] would be a good one to test along with [13,5,10,9].

8,043By doing that, we can more accurately measure their mass, and determine whether

scientists have systematically been underestimating how much matter they contain."

1,323At first sight four appear to be negative versions of the other four, but might not be in reality because of the rotation.

8,043I'm distribution testing a set of 12 ++ candidates from last years testing. On the third algorithm ...

By doing that, we can more accurately measure their mass, and determine whether

scientists have systematically been underestimating how much matter they contain."

158My testing on xoroshiro128psp (which uses M for stream selection, E is not used/required and 'bad-seed' is tested explicitly by state comparison after one pump) is painfully slow.

Using Seba's current ABC constants, I have eliminated D=31 and 32 based on BigCrush meta-analysis. D=33 shows initial signs of bias in 1 of the 254 BigCrush p-values*.

Seba's Hamming Weight (0.5PB) and PractRand (2TB) are unaware of any issues (that I can easily see with BigCrush meta-analysis) on any of the three D values I've tested.

I'll have to defer full tests on those till BigCrush is statistically sound after 500+ runs.

It is unfortunate that at any much greater than 32-bit state size that pp/psp becomes too mathematically hard (yet promises superior randomness), and exhaustive testing is so difficult and time-consuming.

*Edit: The bias has dropped from +3.5 standard deviations to ~1 SD (but must keep testing), which can happen, as skew on partially correlated data can extend well beyond the normal minimum 30 data points used for published research (which can lead to false positives and negatives, which is unfortunate). For each P-value I use a minimum of 88 data points for the smallest (most strongly statistically correlated) sets, and at least 528 points for the larger (possibly cross-correlated) sets. All P-values are then tested as a superset using my SD alternative to chi-squared.

8,043I've implemented the Chi Squared thingy. I'm wanting to label the four columns but this is mostly over my head so over to you. Eg: what does "x" represent? Here's example output, currently:

And here's the relevant source:

By doing that, we can more accurately measure their mass, and determine whether

scientists have systematically been underestimating how much matter they contain."

1,323pfreq(x) is the number of different output pairs that occur x times. In this example, pfreq(0) = 1580769164, therefore 1580769164 different 32-bit outputs never occur, 1579279603 occur once, etc.

zfreq(x) or nzfreq(x) is the number of times there is a zero or non-zero run of length x (x successive 32-bit outputs are zero or non-zero).

A couple of points:

1. I think you might need to add 0.5 before truncation as expected pfreq(0) = expected pfreq(1) = 1580030169 to nearest integer. Floating-point could be used for the expected values for total accuracy, but I doubt rankings will be different with integer arithmetic.

2. The sum of Chi-Square(x) is the important result for each of pfreq/zfreq/nzfreq. It is not very good for pfreq in this instance.

8,043PS: First column of CSV file is the total of ChiSquare of that candidate's frequency distribution.

New source for pfreq

And zfreq

And nzfreq

By doing that, we can more accurately measure their mass, and determine whether

scientists have systematically been underestimating how much matter they contain."

8,043By doing that, we can more accurately measure their mass, and determine whether

scientists have systematically been underestimating how much matter they contain."

8,043EDIT: Here's another auto-generated CSV that can be cleanly sorted in a spreadsheet and it shows which candidate and algorithm go with the best ChiSquare scores.

EDIT2: I've just improved the rounding for the auto-generation script as well. So some of the attached scores will be slightly revised in next posting. I'm off to bed now though.

By doing that, we can more accurately measure their mass, and determine whether

scientists have systematically been underestimating how much matter they contain."

1,323Thanks for the results, Evan. The 12 candidates aren't the best ones with [3,2,6,5] top of the 12 at 32nd overall. It wil be interesting to see the results for better candidates. Using its best alternative scrambler [6,2,3,6] moves up from 178th to 6th overall, assuming everything else unchanged.

As I suspected there are four positive/negative scrambler pairs with very similar (but not quite identical) results:

I prefer to put s1 first always because it makes more sense to me to have s0 second.

8,043I'm beginning a fresh full distribution run of "xo" (Xoroshiro32++), which will have the ChiSquare totals included in the report data. That'll net a full list of easy sortable ChiSquare scores. We can then compare against what you've got already, Tony.

PS: I think it's quicker to do a full rerun than make another program to analyse the old report data.

By doing that, we can more accurately measure their mass, and determine whether

scientists have systematically been underestimating how much matter they contain."

1,323The scramblers with positive s1 do not need the initial + sign and therefore the positive/negative pairs can be written as

which can be shortened to

where ++ is the familiar xoroshiro++.

I have a feeling the best [a,b,c,d] with ++ might be worse with +-/-+/-- or their negatives, but might be improved a little with ---. The only way to be sure is to create all the binaries. I suggest +-, -+ and -- first and the negatives later.

8,043By doing that, we can more accurately measure their mass, and determine whether

scientists have systematically been underestimating how much matter they contain."

1,323Evan, I've changed the Chi-Square titles in that post to pchi/zchi/nzchi. I give equal weight to prank, zrank and nzrank. I sum the

ranks(i.e. prank+zrank+nzrank) to get an overall ranking and the lowest one is given a rank of 1. I donotsum pchi+zchi+nzchi. If prank+zrank+nzrank is the same integer for two different [a,b,c,d] then I use prank as the tie-breaker.