As you pointed out, though, the carry output of the final adder would seem to be of good quality, but it must not be if they didn't mention it, because it would have been a great way to get a full 64 bits out of the PRNG.

The defence I have here is that the documentation never discusses this in a hardware context. In fact they only discuss high level languages without direct access to a carry result.

The Prisoner's Dilemma, in english - "Selfishness beats altruism within groups. Altruistic groups beat selfish groups." - Quoted part from 2007, D.S Wilson/E.O Wilson.

Hmm, I guess that means we have no clear answer for the summing carry bit either but I'd imagine it is a function of the lower bits.

Do you know who can resolve this question? Ahle2! He already ran the implementation we're using through the Dieharder test battery.

Ahle2, could you please run the Dieharder test on sum[64:01] of the 's0+s1' value. That would include the carry bit that is current being ignored. We need to keep ignoring sum[0]. Maybe we COULD get 64 good bits out of the PRNG.

What summing carry bit? There is none in Chip's scheme.

But it could be added with ease. The joys of designing the hardware the way you want it. :)

The Prisoner's Dilemma, in english - "Selfishness beats altruism within groups. Altruistic groups beat selfish groups." - Quoted part from 2007, D.S Wilson/E.O Wilson.

Hmm, I guess that means we have no clear answer for the summing carry bit either but I'd imagine it is a function of the lower bits.

Do you know who can resolve this question? Ahle2! He already ran the implementation we're using through the Dieharder test battery.

Ahle2, could you please run the Dieharder test on sum[64:01] of the 's0+s1' value. That would include the carry bit that is current being ignored. We need to keep ignoring sum[0]. Maybe we COULD get 64 good bits out of the PRNG.

Of course I can! :)

I have successfully compiled Practrand on my Linux machine now. It is supposed to be the ultimate PRNG test battery according to the "PRNG community". I will run every variant of PRNG's that I have previously run with Dieharder + the 'Xoroshiro128+ sum of s0+s1' using it! (I will run the sum with Dieharder as well)

Just to be clear, do you mean "s0[64:01] + s1[64:01]"?

SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
Propeller chiptune player: Thread

Hehe, I've just solved a lib dependency problem where the package manager thought one of the packages was broken and absolutely insisted on removing it. After going though the rigmarole of reinstating it manually a couple of times I found instructions on how to remove dependency entries in either the package itself or the subsequent forced install. I chose the latter as it didn't need repackaged.

Now I can simple install dieharder from repository ... and do all the belated updates too ...

The Prisoner's Dilemma, in english - "Selfishness beats altruism within groups. Altruistic groups beat selfish groups." - Quoted part from 2007, D.S Wilson/E.O Wilson.

Hmm, I guess that means we have no clear answer for the summing carry bit either but I'd imagine it is a function of the lower bits.

Do you know who can resolve this question? Ahle2! He already ran the implementation we're using through the Dieharder test battery.

Ahle2, could you please run the Dieharder test on sum[64:01] of the 's0+s1' value. That would include the carry bit that is current being ignored. We need to keep ignoring sum[0]. Maybe we COULD get 64 good bits out of the PRNG.

Of course I can! :)

I have successfully compiled Practrand on my Linux machine now. It is supposed to be the ultimate PRNG test battery according to the "PRNG community". I will run every variant of PRNG's that I have previously run with Dieharder + the 'Xoroshiro128+ sum of s0+s1' using it! (I will run the sum with Dieharder as well)

Just to be clear, do you mean "s0[64:01] + s1[64:01]"?

No. I mean that when you compute s0+s1, you actually get a 65-bit result, with the bits ranging in position from 64 down to 0. We want to know what the integrity of bits 64 down to 1 are.

Just to be clear, do you mean "s0[64:01] + s1[64:01]"?

No, it's: ss[64:00] = s0[63:00] + s1[63:00] with carry. Carry bit goes in ss[64]. It's addition's equivalent of the extended multiply, 64-bit = 32-bit x 32-bit.

The Prisoner's Dilemma, in english - "Selfishness beats altruism within groups. Altruistic groups beat selfish groups." - Quoted part from 2007, D.S Wilson/E.O Wilson.

Okey...
GCC has built in support for 128 bits types (the C++ standard has not) so it should be easy!

SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
Propeller chiptune player: Thread

Consider that if S0 and S1 were only 8 bits wide adding them would produce a carry only 49.8% of the time. If they were 16 bits it would only be 49.999237% of the time.

There must be a way to calculate the bias of the carry bit when adding random numbers of a given width. I have no idea what it is but a bit of brute force experimental mathematics produces the following:

We see the bias in the carry bit is halving with every extra bit of width. So we can speculate that the bias for a 64 bit carry is about 25 / Math.pow(2, 63), or 10E-18 percent.

Which is pretty huge considering we are working with a high quality PRNG with a period of 10E128 or whatever it is.

Would that bias ever show up in Dieharder? No idea.

I tried clocking the original P2 LFSR 32 (as proposed by Phil) steps for each random number output to Dieharder. Amazingly most tests passes. Doing the same with Practrand fails everything while xoroshiro passes! It really seems like Practrand is superior for finding out which PRNG among good ones that stands out.

SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
Propeller chiptune player: Thread

Consider that if S0 and S1 were only 8 bits wide adding them would produce a carry only 49.8% of the time. If they were 16 bits it would only be 49.999237% of the time.

There must be a way to calculate the bias of the carry bit when adding random numbers of a given width. I have no idea what it is but a bit of brute force experimental mathematics produces the following:

We see the bias in the carry bit is halving with every extra bit of width. So we can speculate that the bias for a 64 bit carry is about 25 / Math.pow(2, 63), or 1 in 10E-18.

Which is pretty huge considering we are working with a high quality PRNG with a period of 10E128 or whatever it is.

Would that bias ever show up in Dieharder? No idea.

Unless Dieharder could run the test for 10e29 years at 160MHz sample rate, the bias wouldn't show up.

I tried clocking the original P2 LFSR 32 (as proposed by Phil) steps for each random number output to Dieharder. Amazingly most tests passes. Doing the same with Practrand fails everything while xoroshiro passes! It really seems like Practrand is superior for finding out which PRNG among good ones that stands out.

Cool! I've wondered about that sort of thing before but didn't have a clue of significance. Mulling over a 1-bit addition in my head made that list very clear. Thanks Heater.

The Prisoner's Dilemma, in english - "Selfishness beats altruism within groups. Altruistic groups beat selfish groups." - Quoted part from 2007, D.S Wilson/E.O Wilson.

If I remove that ">> 1", keeping the 128 bit addition and using bits [63:0] then everything in dieharder starts passing again. So I guess the problem is not elsewhere.

Ouch! Do it again with post-summing bit 63 removal so as to prove it's not a flaw in the 128-bit conversions.

The Prisoner's Dilemma, in english - "Selfishness beats altruism within groups. Altruistic groups beat selfish groups." - Quoted part from 2007, D.S Wilson/E.O Wilson.

Sorry for the delay, I'd written that while at work. I mean just knock out the msb of the returned result, eg:

return result & 0x7fffffffffffffff;

The Prisoner's Dilemma, in english - "Selfishness beats altruism within groups. Altruistic groups beat selfish groups." - Quoted part from 2007, D.S Wilson/E.O Wilson.

But, but, the point of the exercise is to get a 64 bit result. Using the carry rather than bit 0. Which is said to be not so statistically "random" as the other 63 bits.

There is no way dieharde will like an input of 64 bit integers each with it's top bit set zero.

As it stands it looks like we may as well use that bit zero as it performs very well.

That's what the existing solution is, only 63 bits. I'm just trying to sort out where the problem is with the this experimental 64-bit variant.

The Prisoner's Dilemma, in english - "Selfishness beats altruism within groups. Altruistic groups beat selfish groups." - Quoted part from 2007, D.S Wilson/E.O Wilson.

Yes, the existing solution uses only the high 63 bits. Why? Because the bit zero is said to be not so statistically random looking as the rest.

In posts above Chip wondered if that carry bit could be used instead. Thus making a 64 bit result. And was asking for that solution to be tested with Dieharder and whatever statistical tests. I think his idea is that when distributing bits to COGs with his "rats nest" of connections it would be nice to have a multiple of 8 bits and have all the bits used the same number of times. Or something like that.

I pointed out that there is a tiny bias in the carry bit. But we reckoned on it being so small as not to show up in a test. That left the question mark over it's statistical "randomness".

Then, being unable to sleep, I could not resist trying carry bit experiment myself.

Unless I have made a mistake I think we may as well use the original 64 bit result.

SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
Propeller chiptune player: Thread

Okay... I made a mistake as seen in my code above. Only passing 32 bits of the 64 to stdout. When I ran it again with all bits, it began failing misserable!

Here's the output from Practrand

PE 1 | ./RNG_test stdin
RNG_test using PractRand version 0.93
RNG = RNG_stdin, seed = 0xa4eaaac4
test set = normal, folding = standard(unknown format)
rng=RNG_stdin, seed=0xa4eaaac4
length= 128 megabytes (2^27 bytes), time= 3.8 seconds
Test Name Raw Processed Evaluation
BCFN(2+0,13-3,T) R= +74.9 p = 3.0e-35 FAIL !!!
BCFN(2+1,13-3,T) R= +18.9 p = 9.6e-9 VERY SUSPICIOUS
DC6-9x1Bytes-1 R= +1413 p = 6.3e-868 FAIL !!!!!!!
Gap-16:A R=+149.1 p = 1.3e-126 FAIL !!!!!
Gap-16:B R=+367.1 p = 6.1e-294 FAIL !!!!!!
FPF-14+6/16:(0,14-0) R= +5478 p = 8e-5069 FAIL !!!!!!!!
FPF-14+6/16:(1,14-0) R= +2754 p = 5e-2548 FAIL !!!!!!!!
FPF-14+6/16:(2,14-0) R= +1357 p = 2e-1255 FAIL !!!!!!!!
FPF-14+6/16:(3,14-1) R=+967.8 p = 1.7e-857 FAIL !!!!!!!
FPF-14+6/16:(4,14-2) R=+655.1 p = 1.0e-572 FAIL !!!!!!!
FPF-14+6/16:(5,14-2) R=+254.4 p = 3.0e-222 FAIL !!!!!!
FPF-14+6/16:all R= +5306 p = 2e-4979 FAIL !!!!!!!!
FPF-14+6/16:all2 R=+9738850 p = 0 FAIL !!!!!!!!
FPF-14+6/16:cross R= +5198 p = 2e-4373 FAIL !!!!!!!!
[Low1/8]Gap-16:B R= +4.4 p = 1.0e-3 unusual
...and 136 test result(s) without anomalies

SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
Propeller chiptune player: Thread

I first thought that your code failed because of the implicit cast to the 64 bit result variable (according to the standard you can never know what the compiler decides to do under the hood), but obviously it fails in both Practrand and Dieharder for me as well using explicit casting in C++ 11;

SIDcog - The sound of the Commodore 64 in a single cog: Thread, OBEX, SIDcogMedlay.mp3
AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX
SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX
Propeller chiptune player: Thread

## Comments

20,2160Vote UpVote Down4,5880Vote UpVote Down8,5970Vote UpVote DownDo you know who can resolve this question? Ahle2! He already ran the implementation we're using through the Dieharder test battery.

Ahle2, could you please run the Dieharder test on sum[64:01] of the 's0+s1' value. That would include the carry bit that is current being ignored. We need to keep ignoring sum[0]. Maybe we COULD get 64 good bits out of the PRNG.

4,5880Vote UpVote Down8,5970Vote UpVote DownIt's just the carry output of the 64-bit add of s0 and s1.

9060Vote UpVote DownI have successfully compiled Practrand on my Linux machine now. It is supposed to be the ultimate PRNG test battery according to the "PRNG community". I will run every variant of PRNG's that I have previously run with Dieharder + the 'Xoroshiro128+ sum of s0+s1' using it! (I will run the sum with Dieharder as well)

Just to be clear, do you mean "s0[64:01] + s1[64:01]"?

AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX

SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX

Propeller chiptune player: Thread

4,5880Vote UpVote DownNow I can simple install dieharder from repository ... and do all the belated updates too ...

8,5970Vote UpVote DownNo. I mean that when you compute s0+s1, you actually get a 65-bit result, with the bits ranging in position from 64 down to 0. We want to know what the integrity of bits 64 down to 1 are.

If you think in Verilog, it looks like this:

wire [63:0] s0, s1;

wire [64:0] ss = s0 + s1;

wire [63:0] rndbits = ss[64:1];

It's rndbits that we're after.

If you have a 65-bit adder, we are interested in: (s0 + s1) >> 1

4,5880Vote UpVote Down9060Vote UpVote DownGCC has built in support for 128 bits types (the C++ standard has not) so it should be easy!

AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX

SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX

Propeller chiptune player: Thread

20,2160Vote UpVote DownThere is a bias in that carry bit.

Consider that if S0 and S1 were only 8 bits wide adding them would produce a carry only 49.8% of the time. If they were 16 bits it would only be 49.999237% of the time.

There must be a way to calculate the bias of the carry bit when adding random numbers of a given width. I have no idea what it is but a bit of brute force experimental mathematics produces the following:

Width: 1 Carries: 25% Bias: 25%

Width: 2 Carries: 37.5% Bias: 12.5%

Width: 3 Carries: 43.75% Bias: 6.25%

Width: 4 Carries: 46.875% Bias: 3.125%

Width: 5 Carries: 48.4375% Bias: 1.5625%

Width: 6 Carries: 49.21875% Bias: 0.78125%

Width: 7 Carries: 49.609375% Bias: 0.390625%

Width: 8 Carries: 49.8046875% Bias: 0.1953125%

Width: 9 Carries: 49.90234375% Bias: 0.09765625%

Width: 10 Carries: 49.951171875% Bias: 0.048828125%

Width: 11 Carries: 49.9755859375% Bias: 0.0244140625%

Width: 12 Carries: 49.98779296875% Bias: 0.01220703125%

Width: 13 Carries: 49.993896484375% Bias: 0.006103515625%

Width: 14 Carries: 49.9969482421875% Bias: 0.0030517578125%

Width: 15 Carries: 49.99847412109375% Bias: 0.00152587890625%

Width: 16 Carries: 49.999237060546875% Bias: 0.000762939453125%

We see the bias in the carry bit is halving with every extra bit of width. So we can speculate that the bias for a 64 bit carry is about 25 / Math.pow(2, 63), or 10E-18 percent.

Which is pretty huge considering we are working with a high quality PRNG with a period of 10E128 or whatever it is.

Would that bias ever show up in Dieharder? No idea.

9060Vote UpVote DownAYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX

SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX

Propeller chiptune player: Thread

8,5970Vote UpVote DownUnless Dieharder could run the test for 10e29 years at 160MHz sample rate, the bias wouldn't show up.

8,5970Vote UpVote DownWith 128-bit types, just do: (so + s1) >> 1

20,2160Vote UpVote Down20,2160Vote UpVote Down4,5880Vote UpVote Down20,2160Vote UpVote Down20,2160Vote UpVote DownI changed my xoroshiro128+ to do 128 bit addition S0 + S1 and shift down by 1. Like so: The test harness just writes the binary results of next() to standard out. Like so:

`#include "xoroshiro128plus.c" // Seed MUST not be all zero. #define SEED_0 ((uint64_t)0x1) #define SEED_1 ((uint64_t)0x0) int main(int argc, char* argv[]) { uint64_t random64; int i; // Seed the state array s[0] = SEED_0; s[1] = SEED_1; // Print some randomness while(1) { random64 = next(); // printf("%016jx\n", (uintmax_t)(random64 >> 1)); for (i = 0; i < 8; i++) { putchar(random64 >> i * 8); } } return 0; }`

I pipe the output into dieharder like so:

Sadly it fails miserably:

20,2160Vote UpVote DownBasically that carry cannot be used.

Unless I have slipped up again...

4,5880Vote UpVote Down20,2160Vote UpVote DownWhat do you mean by "post-summing bit 63 removal" ?

4,5880Vote UpVote Downreturn result & 0x7fffffffffffffff;

20,2160Vote UpVote DownThere is no way dieharde will like an input of 64 bit integers each with it's top bit set zero.

As it stands it looks like we may as well use that bit zero as it performs very well.

4,5880Vote UpVote Down20,2160Vote UpVote DownIn posts above Chip wondered if that carry bit could be used instead. Thus making a 64 bit result. And was asking for that solution to be tested with Dieharder and whatever statistical tests. I think his idea is that when distributing bits to COGs with his "rats nest" of connections it would be nice to have a multiple of 8 bits and have all the bits used the same number of times. Or something like that.

I pointed out that there is a tiny bias in the carry bit. But we reckoned on it being so small as not to show up in a test. That left the question mark over it's statistical "randomness".

Then, being unable to sleep, I could not resist trying carry bit experiment myself.

Unless I have made a mistake I think we may as well use the original 64 bit result.

9060Vote UpVote DownHere's my PRNG code

The main code

`while(1) { auto random = prng.getXoroshiro(); std::cout << static_cast<char>(random >> 24); std::cout << static_cast<char>(random >> 16); std::cout << static_cast<char>(random >> 8); std::cout << static_cast<char>(random); }`

AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX

SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX

Propeller chiptune player: Thread

9060Vote UpVote DownHere's the output from Practrand

AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX

SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX

Propeller chiptune player: Thread

9060Vote UpVote DownI first thought that your code failed because of the implicit cast to the 64 bit result variable (according to the standard you can never know what the compiler decides to do under the hood), but obviously it fails in both Practrand and Dieharder for me as well using explicit casting in C++ 11;

AYcog - An emulation of the AY3-8910 / YM2149F PSG: Thread, OBEX

SNEcog - An emulation of the SN76489 PSG(and variants): Thread, OBEX

Propeller chiptune player: Thread

20,2160Vote UpVote DownHere is my main code:

`while(1) { random64 = next(); for (i = 0; i < 8; i++) { putchar(random64 >> (i * 8)); } }`