BTW: That makes the Crush reports for aperture 16>>0 of candidates [13 5 8 10] and [7 2 14 10] clean reports.
Aperture 16>>0 for [14 2 7 5] seems to be good in most of my tests, but there is one arrangement producing e-12 that I'd class as VERY SUSPICIOUS (Practrand naming). It was my experimental candidate so, until I get back to retesting it, I'm not completely sure what the various arrangements were.
Aperture 16>>0 for [14 2 7 6] produces an eps in all the arrangements so it's out.
Long holiday away from PRNGs now over.
Evan, which of [13,5,10,9] and [13,5,8,10] are better overall in PractRand and Crush tests?
[13 5 8 10] seems the better, it was already visible in the good scoring Practrand scores before we used the distribution scores. I haven't done any exhaustive comparing with the Crushes. You've got the reports.
Evan, I do have the Crush reports you emailed me thanks and I also have Crush and BigCrush reports for Chris's 32-bit generator. Did you ever run BigCrush tests for [13,5,10,9] and [13,5,8,10]? If not, I think they would be good to have.
A respin might be the opportunity to improve XORO32 and make it the best it can be by replacing [14,2,7,5] with [13,5,10,9] or [13,5,8,10]. No other candidates really matter at this time. The XORO32 logic would be the same, just the signal routing would change due to different rotate and shift constants.
I think there is a third distribution we should look at. We have the xoroshiro32++ pair (XORO32 output) and corresponding zero run frequencies but we don't have the non-zero runs. There are longer non-zero runs than zero runs, which means there are more non-zero run frequencies > 0.
Now, this might not affect the ranking of candidates at all, but we won't know until we have the non-zero results. The C code could be modified to record non-zero runs as well as zero runs quite easily. As I don't have 4GB of RAM, I tested only the first and last 64K blocks, i.e. high word of 32-bit output was 0000 or FFFF, in x86 assembly. I could deduce the equation for expected nzfreq(x) and testing Chris's generator first should confirm it.
Expected distribution values to the nearest integer are listed below. Evan, could you please modify your C program to also record nzfreq for Chris's generator? If results agree with expected values then we'll need nzfreq distributions for some of the xoroshiro32++ candidates. I suggest testing the top 10 for zfreq to see if their ranking is the same for nzfreq and if so we could probably stop there.
A respin might be the opportunity to improve XORO32 and make it the best it can be by replacing [14,2,7,5] with [13,5,10,9] or [13,5,8,10]. No other candidates really matter at this time.
They are all much the same to me. I don't have any complaint with the existing [14 2 7 5], [3 2 6 9] also looks particularly good in Crush testing and was a top candidate with Practrand.
I can't see any particular judgement as better. I also have no clue as to the value of any specific test, including the distributions. I won't complain if you ask Chip to make a change.
A respin might be the opportunity to improve XORO32 and make it the best it can be by replacing [14,2,7,5] with [13,5,10,9] or [13,5,8,10]. No other candidates really matter at this time.
They are all much the same to me. I don't have any complaint with the existing [14 2 7 5], [3 2 6 9] also looks particularly good in Crush testing and was a top candidate with Practrand.
I can't see any particular judgement as better. I also have no clue as to the value of any specific test, including the distributions. I won't complain if you ask Chip to make a change.
I agree that switching to [3,2,6,9] would be fine if the XORO32 output were 16-bit, but as it's 32-bit using two successive xoroshiro32++ outputs we need to also consider how it compares to an ideal 32-bit generator, which Chris's is close to being. This is where the distributions come in: pfreq, zfreq and the new nzfreq, which might be a better test than zfreq as there are more non-zero frequencies > 0.
If nzfreq ranking is no different to zfreq, then it's matter of selecting either [13,5,10,9] or [13,5,8,10], based on Crush and BigCrush tests. However, latter not yet run, I think. We should use a candidate that is very good as 16-bit, based on PractRand and TestUI results, and very good as 32-bit, based on the distributions.
I don't understand what can be achieved by looking at the scores of Chris's tuned algorithm other than to say they are better than any of the Xoroshiro algorithms. Which is no surprise.
Oops, it was [6 2 3 9], not [3 2 6 9], that did well in Crush. [6 2 3 9] wasn't terrible in the Practrand gridding but did have a low minimum of 16 MB (32 MB double iterated).
They've all got some weakness. Melissa has it right. Xoroshiro++ might be the best of the Xoroshiro group but it still isn't ideal for quality.
Just eyeballed the two Practrand grids for Xoroshiro32++ [6 2 3 9]. Nothing below a 512 MB score for 32, 16 and 8-bit grid lines. So looks better than [14 2 7 5] if ignoring all the other apertures.
Double iterated [13 5 10 9] has one 256 MB at aperture 8>>14.
Double iterated [13 5 8 10] has a couple of 128 MB and several 256 MB scores on the 8-bit line.
I don't understand what can be achieved by looking at the scores of Chris's tuned algorithm other than to say they are better than any of the Xoroshiro algorithms. Which is no surprise.
Oops, it was [6 2 3 9], not [3 2 6 9], that did well in Crush. [6 2 3 9] wasn't terrible in the Practrand gridding but did have a low minimum of 16 MB (32 MB double iterated).
They've all got some weakness. Melissa has it right. Xoroshiro++ might be the best of the Xoroshiro group but it still isn't ideal for quality.
The pfreq and zfreq results for Chris's generator tell me that (a) his generator is excellent and the output is very close to the ideal distributions, (b) as a consequence of (a) the distributions are a good test of randomness and should be applied to XORO32 and (c) the very close match to the expected zfreq proves that the equation I deduced but cannot prove mathematically is correct. A nzfreq test of Chris's generator will prove or disprove that equation.
I don't understand what can be achieved by looking at the scores of Chris's tuned algorithm other than to say they are better than any of the Xoroshiro algorithms. Which is no surprise.
Oops, it was [6 2 3 9], not [3 2 6 9], that did well in Crush. [6 2 3 9] wasn't terrible in the Practrand gridding but did have a low minimum of 16 MB (32 MB double iterated).
They've all got some weakness. Melissa has it right. Xoroshiro++ might be the best of the Xoroshiro group but it still isn't ideal for quality.
The pfreq and zfreq results for Chris's generator tell me that (a) his generator is excellent and the output is very close to the ideal distributions, (b) as a consequence of (a) the distributions are a good test of randomness and should be applied to XORO32 and (c) the very close match to the expected zfreq proves that the equation I deduced but cannot prove mathematically is correct. A nzfreq test of Chris's generator will prove or disprove that equation.
I see all that as only a correlation, not proof. But then I'm not trying to fully understand it.
P.S. I'm not sure Melissa has given xoroshiro++ a thorough test.
True, but the differences aren't huge. The biggest hole in the + scrambler is plugged but I'm sure she'll still have the same gripes about the engine itself. Luckily, it doesn't have the static flow of the ** scrambler. Melissa was most unimpressed with that move.
I don't understand what can be achieved by looking at the scores of Chris's tuned algorithm other than to say they are better than any of the Xoroshiro algorithms. Which is no surprise.
Oops, it was [6 2 3 9], not [3 2 6 9], that did well in Crush. [6 2 3 9] wasn't terrible in the Practrand gridding but did have a low minimum of 16 MB (32 MB double iterated).
They've all got some weakness. Melissa has it right. Xoroshiro++ might be the best of the Xoroshiro group but it still isn't ideal for quality.
The pfreq and zfreq results for Chris's generator tell me that (a) his generator is excellent and the output is very close to the ideal distributions, (b) as a consequence of (a) the distributions are a good test of randomness and should be applied to XORO32 and (c) the very close match to the expected zfreq proves that the equation I deduced but cannot prove mathematically is correct. A nzfreq test of Chris's generator will prove or disprove that equation.
I see all that as only a correlation, not proof. But then I'm not trying to fully understand it.
Had I been wrong with the equation, the nzfreq test would have disproved it. A better than 99.9% match for Chris's pfreq and zfreq is proof enough for me! And the pfreq equation was known and proved already.
Tony,
You're looking at the wrong end of the scores for proof. While distribution candidates #1 and #4 do have good Practrand gridding scores, candidates #2 and #5 have rather bad scores. And there is worse still in the top 100 candidates.
Tony,
You're looking at the wrong end of the scores for proof. While distribution candidates #1 and #4 do have good Practrand gridding scores, candidates #2 and #5 have rather bad scores. And there is worse still in the top 100 candidates.
Evan, I've always said good scores in distribution tests are no guarantee of good scores in other tests. That's why we need to select the candidate that does well in several tests: distribution, PractRand and TestUI. It has to be said that [14,2,7,5] does badly on distribution: 487 out of 1260 on combined prank+zrank ranking.
The nzfreq Chi-square score for Chris's generator is 41 for nzfreq(1-45), or 18 for nzfreq(1-16) which make up over 99.9% of the total non-zero runs. This doesn't mean much without other results but I'm sure no xoroshiro32++ will match it. How do you feel about running the full set of nzfreq tests?
The engine for Chris's generator is xorshift, which produces equidistributed output and, as well as improving the output quality, his scrambler destroys this equidistribution by ensuring that certain outputs occur more than once and others never. I think this is what he means by "biasing".
Also a good pfreq score does not imply a good zfreq score and vice-versa.
Here are some combined scores for frequency distributions. The table belows show the top 10 for prank only and zrank only when the two rankings are added and sorted from lowest to highest. The lowest prank is the tiebreaker for equal prank+zrank. Candidates that are not in either top 10 have been omitted.
You're trying to use Chris's single data point as representative for proof that distribution is a good predictor. Yet, the much wider selection of best distribution Xoroshiro candidates demonstrates it's just random.
I've started the paired distribution run, with nzfreq, for a full sweep of the 1260 candidates again. Five minutes per candidate, seven candidates in parallel -> 15 hours.
EDIT: It's reproducing all the distribution data. Even with all three data sets, pfreq, zfreq and nzfreq, being produced, that part of the program is still only 5% of execution time. Filling the initial 4 GB of pair tallies takes the other 95%.
EDIT2: I can feel the sluggishness of the computer while it's doing this. Parctrand doesn't have this much impact even when hogging much more CPU. It'll be cache/RAM thrashing because of the large memory allocations, causing a lot of blocked waiting. Of note, when running a single case the run time is four minutes instead of five.
I've started the paired distribution run, with nzfreq, for a full sweep of the 1260 candidates again. Five minutes per candidate, seven candidates in parallel -> 15 hours.
EDIT: It's reproducing all the distribution data. Even with all three data sets, pfreq, zfreq and nzfreq, being produced, that part of the program is still only 5% of execution time. Filling the initial 4 GB of pair tallies takes the other 95%.
EDIT2: I can feel the sluggishness of the computer while it's doing this. Parctrand doesn't have this much impact even when hogging much more CPU. It'll be cache/RAM thrashing because of the large memory allocations, causing a lot of blocked waiting. Of note, when running a single case the run time is four minutes instead of five.
Thanks for starting the tests, Evan. Best to leave the computer alone to get on with it!
Tony and I are still searching for better constants to use here even.
The search is almost over.
I think the search is now over and [13,5,10,9] is the one.
Something that we should prove is that the distribution results are identical when they are generated in the same way as XORO32. To save time, the C code saves the previous 16-bit output and uses it again. What was the high word in one 32-bit output becomes the low word in the next. XORO32 repeats each of the 2^32-1 iterations but not until 2^32-2 others have been done first. The ordering of the data is different but the distributions should be the same and checking a single candidate would suffice.
So you want double iterated 32-bit output from the Xoroshiro generator. And remove the pairing from the distribution test code so that it still only has to deal with 32-bit indexing. Right?
Comments
Evan, I do have the Crush reports you emailed me thanks and I also have Crush and BigCrush reports for Chris's 32-bit generator. Did you ever run BigCrush tests for [13,5,10,9] and [13,5,8,10]? If not, I think they would be good to have.
A respin might be the opportunity to improve XORO32 and make it the best it can be by replacing [14,2,7,5] with [13,5,10,9] or [13,5,8,10]. No other candidates really matter at this time. The XORO32 logic would be the same, just the signal routing would change due to different rotate and shift constants.
Now, this might not affect the ranking of candidates at all, but we won't know until we have the non-zero results. The C code could be modified to record non-zero runs as well as zero runs quite easily. As I don't have 4GB of RAM, I tested only the first and last 64K blocks, i.e. high word of 32-bit output was 0000 or FFFF, in x86 assembly. I could deduce the equation for expected nzfreq(x) and testing Chris's generator first should confirm it.
Expected zfreq(x) = (1-1/e)^2 * N * (1/e)^x ... proven
Expected nzfreq(x) = (1/e)^2 * N * (1-1/e)^x ... unproven EDIT proven
N = 2^32. Very nice symmetry in the zfreq and nzfreq equations.
zfreq(x) > nzfreq(x) when x = 1
zfreq(x) = nzfreq(x) when x = 2
zfreq(x) < nzfreq(x) when x > 2
Rounding to the nearest integer, the highest frequencies > 0 are pfreq(12), zfreq(21) and nzfreq(45).
pfreq(x) proven
zfreq(x) proven
nzfreq(x) unproven EDIT proven
I can't see any particular judgement as better. I also have no clue as to the value of any specific test, including the distributions. I won't complain if you ask Chip to make a change.
EDIT: Here's the existing code:
It might be better to have separate code for non-zero runs as zero runs already done in their entirety. Changes:
ztally -> nztally
zfreq -> nzfreq
freqi == 0 -> freqi > 0
zero-run -> non-zero-run
pfreq could be commented out as already done.
I agree that switching to [3,2,6,9] would be fine if the XORO32 output were 16-bit, but as it's 32-bit using two successive xoroshiro32++ outputs we need to also consider how it compares to an ideal 32-bit generator, which Chris's is close to being. This is where the distributions come in: pfreq, zfreq and the new nzfreq, which might be a better test than zfreq as there are more non-zero frequencies > 0.
If nzfreq ranking is no different to zfreq, then it's matter of selecting either [13,5,10,9] or [13,5,8,10], based on Crush and BigCrush tests. However, latter not yet run, I think. We should use a candidate that is very good as 16-bit, based on PractRand and TestUI results, and very good as 32-bit, based on the distributions.
Oops, it was [6 2 3 9], not [3 2 6 9], that did well in Crush. [6 2 3 9] wasn't terrible in the Practrand gridding but did have a low minimum of 16 MB (32 MB double iterated).
They've all got some weakness. Melissa has it right. Xoroshiro++ might be the best of the Xoroshiro group but it still isn't ideal for quality.
Double iterated [13 5 10 9] has one 256 MB at aperture 8>>14.
Double iterated [13 5 8 10] has a couple of 128 MB and several 256 MB scores on the 8-bit line.
Of these two, I'd now choose [13 5 10 9].
I'll get to work on the nzfreq distribution ...
The pfreq and zfreq results for Chris's generator tell me that (a) his generator is excellent and the output is very close to the ideal distributions, (b) as a consequence of (a) the distributions are a good test of randomness and should be applied to XORO32 and (c) the very close match to the expected zfreq proves that the equation I deduced but cannot prove mathematically is correct. A nzfreq test of Chris's generator will prove or disprove that equation.
A quick reminder of something you wrote on May 6:
The search is almost over. If I could do the tests that I'm asking you to do myself I would, but I can't.
P.S. I'm not sure Melissa has given xoroshiro++ a thorough test.
I'll create a Chi-square score from the binary later today.
True, but the differences aren't huge. The biggest hole in the + scrambler is plugged but I'm sure she'll still have the same gripes about the engine itself. Luckily, it doesn't have the static flow of the ** scrambler. Melissa was most unimpressed with that move.
Had I been wrong with the equation, the nzfreq test would have disproved it. A better than 99.9% match for Chris's pfreq and zfreq is proof enough for me! And the pfreq equation was known and proved already.
You're looking at the wrong end of the scores for proof. While distribution candidates #1 and #4 do have good Practrand gridding scores, candidates #2 and #5 have rather bad scores. And there is worse still in the top 100 candidates.
Does Chris's PractRand (or PracRand) forum still exist on sourceforge.net? I can't view it now on this PC, seeing as it's browser-challenged.
Evan, I've always said good scores in distribution tests are no guarantee of good scores in other tests. That's why we need to select the candidate that does well in several tests: distribution, PractRand and TestUI. It has to be said that [14,2,7,5] does badly on distribution: 487 out of 1260 on combined prank+zrank ranking.
The nzfreq Chi-square score for Chris's generator is 41 for nzfreq(1-45), or 18 for nzfreq(1-16) which make up over 99.9% of the total non-zero runs. This doesn't mean much without other results but I'm sure no xoroshiro32++ will match it. How do you feel about running the full set of nzfreq tests?
Also a good pfreq score does not imply a good zfreq score and vice-versa.
EDIT: It's reproducing all the distribution data. Even with all three data sets, pfreq, zfreq and nzfreq, being produced, that part of the program is still only 5% of execution time. Filling the initial 4 GB of pair tallies takes the other 95%.
EDIT2: I can feel the sluggishness of the computer while it's doing this. Parctrand doesn't have this much impact even when hogging much more CPU. It'll be cache/RAM thrashing because of the large memory allocations, causing a lot of blocked waiting. Of note, when running a single case the run time is four minutes instead of five.
Thanks for starting the tests, Evan. Best to leave the computer alone to get on with it!
Thanks, Evan! I've had a quick look and [13,5,10,9] ranking is 1 for nzfreq.
I think the search is now over and [13,5,10,9] is the one.
Something that we should prove is that the distribution results are identical when they are generated in the same way as XORO32. To save time, the C code saves the previous 16-bit output and uses it again. What was the high word in one 32-bit output becomes the low word in the next. XORO32 repeats each of the 2^32-1 iterations but not until 2^32-2 others have been done first. The ordering of the data is different but the distributions should be the same and checking a single candidate would suffice.