As you pointed out, though, the carry output of the final adder would seem to be of good quality, but it must not be if they didn't mention it, because it would have been a great way to get a full 64 bits out of the PRNG.

The defence I have here is that the documentation never discusses this in a hardware context. In fact they only discuss high level languages without direct access to a carry result.

Hmm, I guess that means we have no clear answer for the summing carry bit either but I'd imagine it is a function of the lower bits.

Do you know who can resolve this question? Ahle2! He already ran the implementation we're using through the Dieharder test battery.

Ahle2, could you please run the Dieharder test on sum[64:01] of the 's0+s1' value. That would include the carry bit that is current being ignored. We need to keep ignoring sum[0]. Maybe we COULD get 64 good bits out of the PRNG.

Hmm, I guess that means we have no clear answer for the summing carry bit either but I'd imagine it is a function of the lower bits.

Do you know who can resolve this question? Ahle2! He already ran the implementation we're using through the Dieharder test battery.

Ahle2, could you please run the Dieharder test on sum[64:01] of the 's0+s1' value. That would include the carry bit that is current being ignored. We need to keep ignoring sum[0]. Maybe we COULD get 64 good bits out of the PRNG.

Of course I can!

I have successfully compiled Practrand on my Linux machine now. It is supposed to be the ultimate PRNG test battery according to the "PRNG community". I will run every variant of PRNG's that I have previously run with Dieharder + the 'Xoroshiro128+ sum of s0+s1' using it! (I will run the sum with Dieharder as well)

Just to be clear, do you mean "s0[64:01] + s1[64:01]"?

Hehe, I've just solved a lib dependency problem where the package manager thought one of the packages was broken and absolutely insisted on removing it. After going though the rigmarole of reinstating it manually a couple of times I found instructions on how to remove dependency entries in either the package itself or the subsequent forced install. I chose the latter as it didn't need repackaged.

Now I can simple install dieharder from repository ... and do all the belated updates too ...

Hmm, I guess that means we have no clear answer for the summing carry bit either but I'd imagine it is a function of the lower bits.

Do you know who can resolve this question? Ahle2! He already ran the implementation we're using through the Dieharder test battery.

Ahle2, could you please run the Dieharder test on sum[64:01] of the 's0+s1' value. That would include the carry bit that is current being ignored. We need to keep ignoring sum[0]. Maybe we COULD get 64 good bits out of the PRNG.

Of course I can!

I have successfully compiled Practrand on my Linux machine now. It is supposed to be the ultimate PRNG test battery according to the "PRNG community". I will run every variant of PRNG's that I have previously run with Dieharder + the 'Xoroshiro128+ sum of s0+s1' using it! (I will run the sum with Dieharder as well)

Just to be clear, do you mean "s0[64:01] + s1[64:01]"?

No. I mean that when you compute s0+s1, you actually get a 65-bit result, with the bits ranging in position from 64 down to 0. We want to know what the integrity of bits 64 down to 1 are.

Just to be clear, do you mean "s0[64:01] + s1[64:01]"?

No, it's: ss[64:00] = s0[63:00] + s1[63:00] with carry. Carry bit goes in ss[64]. It's addition's equivalent of the extended multiply, 64-bit = 32-bit x 32-bit.

Consider that if S0 and S1 were only 8 bits wide adding them would produce a carry only 49.8% of the time. If they were 16 bits it would only be 49.999237% of the time.

There must be a way to calculate the bias of the carry bit when adding random numbers of a given width. I have no idea what it is but a bit of brute force experimental mathematics produces the following:

We see the bias in the carry bit is halving with every extra bit of width. So we can speculate that the bias for a 64 bit carry is about 25 / Math.pow(2, 63), or 10E-18 percent.

Which is pretty huge considering we are working with a high quality PRNG with a period of 10E128 or whatever it is.

Would that bias ever show up in Dieharder? No idea.

I tried clocking the original P2 LFSR 32 (as proposed by Phil) steps for each random number output to Dieharder. Amazingly most tests passes. Doing the same with Practrand fails everything while xoroshiro passes! It really seems like Practrand is superior for finding out which PRNG among good ones that stands out.

Consider that if S0 and S1 were only 8 bits wide adding them would produce a carry only 49.8% of the time. If they were 16 bits it would only be 49.999237% of the time.

There must be a way to calculate the bias of the carry bit when adding random numbers of a given width. I have no idea what it is but a bit of brute force experimental mathematics produces the following:

We see the bias in the carry bit is halving with every extra bit of width. So we can speculate that the bias for a 64 bit carry is about 25 / Math.pow(2, 63), or 1 in 10E-18.

Which is pretty huge considering we are working with a high quality PRNG with a period of 10E128 or whatever it is.

Would that bias ever show up in Dieharder? No idea.

Unless Dieharder could run the test for 10e29 years at 160MHz sample rate, the bias wouldn't show up.

I tried clocking the original P2 LFSR 32 (as proposed by Phil) steps for each random number output to Dieharder. Amazingly most tests passes. Doing the same with Practrand fails everything while xoroshiro passes! It really seems like Practrand is superior for finding out which PRNG among good ones that stands out.

Cool! I've wondered about that sort of thing before but didn't have a clue of significance. Mulling over a 1-bit addition in my head made that list very clear. Thanks Heater.

If I remove that ">> 1", keeping the 128 bit addition and using bits [63:0] then everything in dieharder starts passing again. So I guess the problem is not elsewhere.

But, but, the point of the exercise is to get a 64 bit result. Using the carry rather than bit 0. Which is said to be not so statistically "random" as the other 63 bits.

There is no way dieharde will like an input of 64 bit integers each with it's top bit set zero.

As it stands it looks like we may as well use that bit zero as it performs very well.

Yes, the existing solution uses only the high 63 bits. Why? Because the bit zero is said to be not so statistically random looking as the rest.

In posts above Chip wondered if that carry bit could be used instead. Thus making a 64 bit result. And was asking for that solution to be tested with Dieharder and whatever statistical tests. I think his idea is that when distributing bits to COGs with his "rats nest" of connections it would be nice to have a multiple of 8 bits and have all the bits used the same number of times. Or something like that.

I pointed out that there is a tiny bias in the carry bit. But we reckoned on it being so small as not to show up in a test. That left the question mark over it's statistical "randomness".

Then, being unable to sleep, I could not resist trying carry bit experiment myself.

Unless I have made a mistake I think we may as well use the original 64 bit result.

Okay... I made a mistake as seen in my code above. Only passing 32 bits of the 64 to stdout. When I ran it again with all bits, it began failing misserable!

Here's the output from Practrand

PE 1 | ./RNG_test stdin
RNG_test using PractRand version 0.93
RNG = RNG_stdin, seed = 0xa4eaaac4
test set = normal, folding = standard(unknown format)
rng=RNG_stdin, seed=0xa4eaaac4
length= 128 megabytes (2^27 bytes), time= 3.8 seconds
Test Name Raw Processed Evaluation
BCFN(2+0,13-3,T) R= +74.9 p = 3.0e-35 FAIL !!!
BCFN(2+1,13-3,T) R= +18.9 p = 9.6e-9 VERY SUSPICIOUS
DC6-9x1Bytes-1 R= +1413 p = 6.3e-868 FAIL !!!!!!!
Gap-16:A R=+149.1 p = 1.3e-126 FAIL !!!!!
Gap-16:B R=+367.1 p = 6.1e-294 FAIL !!!!!!
FPF-14+6/16:(0,14-0) R= +5478 p = 8e-5069 FAIL !!!!!!!!
FPF-14+6/16:(1,14-0) R= +2754 p = 5e-2548 FAIL !!!!!!!!
FPF-14+6/16:(2,14-0) R= +1357 p = 2e-1255 FAIL !!!!!!!!
FPF-14+6/16:(3,14-1) R=+967.8 p = 1.7e-857 FAIL !!!!!!!
FPF-14+6/16:(4,14-2) R=+655.1 p = 1.0e-572 FAIL !!!!!!!
FPF-14+6/16:(5,14-2) R=+254.4 p = 3.0e-222 FAIL !!!!!!
FPF-14+6/16:all R= +5306 p = 2e-4979 FAIL !!!!!!!!
FPF-14+6/16:all2 R=+9738850 p = 0 FAIL !!!!!!!!
FPF-14+6/16:cross R= +5198 p = 2e-4373 FAIL !!!!!!!!
[Low1/8]Gap-16:B R= +4.4 p = 1.0e-3 unusual
...and 136 test result(s) without anomalies

I first thought that your code failed because of the implicit cast to the 64 bit result variable (according to the standard you can never know what the compiler decides to do under the hood), but obviously it fails in both Practrand and Dieharder for me as well using explicit casting in C++ 11;

## Comments

21,23310,91213,574Do you know who can resolve this question? Ahle2! He already ran the implementation we're using through the Dieharder test battery.

Ahle2, could you please run the Dieharder test on sum[64:01] of the 's0+s1' value. That would include the carry bit that is current being ignored. We need to keep ignoring sum[0]. Maybe we COULD get 64 good bits out of the PRNG.

10,91213,574It's just the carry output of the 64-bit add of s0 and s1.

1,158I have successfully compiled Practrand on my Linux machine now. It is supposed to be the ultimate PRNG test battery according to the "PRNG community". I will run every variant of PRNG's that I have previously run with Dieharder + the 'Xoroshiro128+ sum of s0+s1' using it! (I will run the sum with Dieharder as well)

Just to be clear, do you mean "s0[64:01] + s1[64:01]"?

10,912Now I can simple install dieharder from repository ... and do all the belated updates too ...

13,574No. I mean that when you compute s0+s1, you actually get a 65-bit result, with the bits ranging in position from 64 down to 0. We want to know what the integrity of bits 64 down to 1 are.

If you think in Verilog, it looks like this:

wire [63:0] s0, s1;

wire [64:0] ss = s0 + s1;

wire [63:0] rndbits = ss[64:1];

It's rndbits that we're after.

If you have a 65-bit adder, we are interested in: (s0 + s1) >> 1

10,9121,158GCC has built in support for 128 bits types (the C++ standard has not) so it should be easy!

21,233There is a bias in that carry bit.

Consider that if S0 and S1 were only 8 bits wide adding them would produce a carry only 49.8% of the time. If they were 16 bits it would only be 49.999237% of the time.

There must be a way to calculate the bias of the carry bit when adding random numbers of a given width. I have no idea what it is but a bit of brute force experimental mathematics produces the following:

Width: 1 Carries: 25% Bias: 25%

Width: 2 Carries: 37.5% Bias: 12.5%

Width: 3 Carries: 43.75% Bias: 6.25%

Width: 4 Carries: 46.875% Bias: 3.125%

Width: 5 Carries: 48.4375% Bias: 1.5625%

Width: 6 Carries: 49.21875% Bias: 0.78125%

Width: 7 Carries: 49.609375% Bias: 0.390625%

Width: 8 Carries: 49.8046875% Bias: 0.1953125%

Width: 9 Carries: 49.90234375% Bias: 0.09765625%

Width: 10 Carries: 49.951171875% Bias: 0.048828125%

Width: 11 Carries: 49.9755859375% Bias: 0.0244140625%

Width: 12 Carries: 49.98779296875% Bias: 0.01220703125%

Width: 13 Carries: 49.993896484375% Bias: 0.006103515625%

Width: 14 Carries: 49.9969482421875% Bias: 0.0030517578125%

Width: 15 Carries: 49.99847412109375% Bias: 0.00152587890625%

Width: 16 Carries: 49.999237060546875% Bias: 0.000762939453125%

We see the bias in the carry bit is halving with every extra bit of width. So we can speculate that the bias for a 64 bit carry is about 25 / Math.pow(2, 63), or 10E-18 percent.

Which is pretty huge considering we are working with a high quality PRNG with a period of 10E128 or whatever it is.

Would that bias ever show up in Dieharder? No idea.

1,15813,574Unless Dieharder could run the test for 10e29 years at 160MHz sample rate, the bias wouldn't show up.

13,574With 128-bit types, just do: (so + s1) >> 1

21,23321,23310,91221,23321,233I changed my xoroshiro128+ to do 128 bit addition S0 + S1 and shift down by 1. Like so: The test harness just writes the binary results of next() to standard out. Like so:

I pipe the output into dieharder like so:

Sadly it fails miserably:

21,233Basically that carry cannot be used.

Unless I have slipped up again...

10,91221,233What do you mean by "post-summing bit 63 removal" ?

10,912return result & 0x7fffffffffffffff;

21,233There is no way dieharde will like an input of 64 bit integers each with it's top bit set zero.

As it stands it looks like we may as well use that bit zero as it performs very well.

10,91221,233In posts above Chip wondered if that carry bit could be used instead. Thus making a 64 bit result. And was asking for that solution to be tested with Dieharder and whatever statistical tests. I think his idea is that when distributing bits to COGs with his "rats nest" of connections it would be nice to have a multiple of 8 bits and have all the bits used the same number of times. Or something like that.

I pointed out that there is a tiny bias in the carry bit. But we reckoned on it being so small as not to show up in a test. That left the question mark over it's statistical "randomness".

Then, being unable to sleep, I could not resist trying carry bit experiment myself.

Unless I have made a mistake I think we may as well use the original 64 bit result.

1,158Here's my PRNG code

The main code

1,158Here's the output from Practrand

1,158I first thought that your code failed because of the implicit cast to the 64 bit result variable (according to the standard you can never know what the compiler decides to do under the hood), but obviously it fails in both Practrand and Dieharder for me as well using explicit casting in C++ 11;

21,233Here is my main code: