First twenty 17-bit (Xoroshiro34) sums for triplet [7,8,12] is:
00001 01181 1608d 0a75d 0bf68 1c71d 158e8 028f2 1c4a8 1c24f 14e6c 1f938 1f135 07a8e 068cd 15404 1d908 02648 1e45c 13161

First twenty 20-bit (Xoroshiro40) sums for triplet [7,9,12] is:
00001 01281 c6013 5a2f1 2f5ac 7590c fb594 b7ac1 2a644 33649 ebe23 66f73 18a54 fd4a2 4674a 590d3 37bc4 2cce9 3a9db 0ee4e

Evan, do you have a list of the best triplets in your opinion for different widths, say Xoroshiro24 to Xoroshiro40? If not, don't worry and if so, score tables not needed.

Yep, I've got a set of score tables for all those. Size 20 is the largest I've done though. Anything bigger took forever to brute force search for full period candidates.

Here's my selection. Some of them were a real toss up and only chosen by least number of lowest scores. A method that's somewhat dependent on the sampling variants I've scored with.

EDIT: The "Bit" column exhibits a perfect scaling of scores! That blows me away. I don't have a clue why it's so rigidly alone like that. Pity it's also the scoring variant that takes many times longer to run.

As MUL takes the same time as ADD on the P2 there would be no performance penalty. It's not clear to me whether the product could be truncated to 16 bits as well as 8 bits. If so, would it perform better in randomness tests than the 16-bit sum? If not, two iterations would be needed to get a high-quality 16-bit result, but that's just the same as with adding.

I can certainly replace the summing with multiplying. Don't even have to re-run the full period searches.

Tony,
With a quick look at that webpage, I'm not making much sense of how he's done the Xorshift algorithm. I'm suspicious he's just looking at the state data directly, because otherwise PractRand would be throwing a wobbly on our testing quick smart I'd think.

... It's not clear to me whether the product could be truncated to 16 bits as well as 8 bits. If so, would it perform better in randomness tests than the 16-bit sum? If not, two iterations would be needed to get a high-quality 16-bit result, but that's just the same as with adding.

Assuming XORO32 based, why would multiplying two 16-bit values only produce 8 bits? Or is that not what you were talking about?

Well, simply replacing the addition with a multiply produces rubbish PractRand scores. Amusingly, the "bit" variant is still the same scores but most everything else barely makes it passed the starting blocks.

I've added the author's name to my previous post. It's hard to be sure, but I think her "xoroshiro16star8" uses two 8-bit states that are multiplied by a 16-bit magic number ("MCG multiply") and only the high byte of the product kept.

All of them, but let's put this to one side for a while. Try doing a hexdump of the first few Xoroshiro32+ sums (eight is plenty) when seed is $32F3_1EDB :)

Dang, it just dawned on me, far too late, that DRAM prices have been going through the roof all this year! And even with those prices, supply appears to have gone to virtually nil.

One of my new 16GB DIMMs failed and there is no exact replacements in stock. I've got the full credit for them (Returned the pair) and can claim it as a refund, but the options for replacement are slim pickings.

I'm back using me old PC for the moment, and the remainder of the year looks bleak. :(

Well, things have progressed. I got the full refund for the faulty RAM and went elsewhere to get something faster. I figured if I was going to be paying inflated prices to get a decent set of DIMMs I may as well get something faster than I had. The alternative was something slower but still more expensive than original.

I've since discovered, and verified, that at least one program was having issues with the SMT flaw in the early Ryzens. I would have been happy to just disable SMT in the BIOS but the silly BIOS has a problem with that - It insists I had an unexpected power down every time it powers up - have to press F1 and ESC and ENTER every time I power it up with SMT disabled!

So now I'm sending the CPU back for a free exchange tomorrow - never done anything international like this before - AMD have directly emailed me a FedEx number to book it to. Conveniently there is a depot only ten minutes drive away ...

First twenty 17-bit (Xoroshiro34) sums for triplet [7,8,12] is:
00001 01181 1608d 0a75d 0bf68 1c71d 158e8 028f2 1c4a8 1c24f 14e6c 1f938 1f135 07a8e 068cd 15404 1d908 02648 1e45c 13161

First twenty 20-bit (Xoroshiro40) sums for triplet [7,9,12] is:
00001 01281 c6013 5a2f1 2f5ac 7590c fb594 b7ac1 2a644 33649 ebe23 66f73 18a54 fd4a2 4674a 590d3 37bc4 2cce9 3a9db 0ee4e

Evan, do you have a list of the best triplets in your opinion for different widths, say Xoroshiro24 to Xoroshiro40? If not, don't worry and if so, score tables not needed.

Yep, I've got a set of score tables for all those. Size 20 is the largest I've done though. Anything bigger took forever to brute force search for full period candidates.

I'll zip them up when home ...

I've done Xoroshiro40+ on the Z80 and the triplet numbers were quite friendly. A final one I'd like to try is Xoroshiro48+ as every bit in two sets of three 8-bit registers would be used. There is a trick for finding full-period triplets without brute-force as that would take forever with Xoroshiro128+, if only we knew what the trick is.

Yes, Xoroshiro128+ and 1024+ both have "jump" functions supplied in the sources from the authors.

Starting from bit 0 of both word halves of the PRNG state, it does an overlaying XOR of a working state with the current state based on what looks to be a specially crafted key, then calls the iterator, stepping the current state by one. This repeats for each bit of the word size (ie: half the state length).

And at the end it copies the working state over the current state.

Here's the actual code from original Xoroshiro128+ jump() function:

/* This is the jump function for the generator. It is equivalent
to 2^64 calls to next(); it can be used to generate 2^64
non-overlapping subsequences for parallel computations. */
void jump(void) {
static const uint64_t JUMP[] = { 0xbeac0467eba5facb, 0xd86b048b86aa9922 };
uint64_t s0 = 0;
uint64_t s1 = 0;
for(int i = 0; i < sizeof JUMP / sizeof *JUMP; i++)
for(int b = 0; b < 64; b++) {
if (JUMP[i] & 1ULL << b) {
s0 ^= s[0];
s1 ^= s[1];
}
next();
}
s[0] = s0;
s[1] = s1;
}

C is mostly gibberish to me. I don't see how a jump function proves a certain triplet is full-period. I've just looked through the prng Google group and it wasn't any help really.

Somewhere, some how, some mathematician can prove that some pseudo random number generator algorithm with some parameters will run though all, or most, of its combinations of states before returning to where it started.

I have no idea how.

Of course, running through a huge sequence does not imply a random looking sequence. A simple binary counter will run through such a sequence, step by step in order. We need something better to defeat the tests of "randomness".

I think understanding C is the least of the problems in understanding how these things work.

Certainly brute force running and checking them does not help. The periods can be longer to work through than the age of the universe!

What is certainly true is that tweaking the number of bits in the state or messing with the parameters can ruin a good PRNG.

I start to believe that the search for a better pseudo random number generator drives people insane. As this thread shows :) I felt it myself before I got help from "PRNG Anonymous". For example this from Stanford:

The top three lines of the source is a comment, so that's plain english at least. :) Basically, this particular jump() config is not useful for finding any full period triplets because even it still has to be called 2^64 times to hop all the way along.

So, presumably it's possible to built jump functions that steps further than a square-root into the period. But they're not inclined to explain, or maybe it's just really hard to work out. I have no idea.

What is certainly true is that tweaking the number of bits in the state or messing with the parameters can ruin a good PRNG.

Full period may be solved at a higher analytical level but the quality certainly isn't. Tony's link to Melissa's visualisation work is a good demo of how quality is accessed. And that's also why PractRand and BigCrush are heavily relied on.

## Comments

830Vote UpVote DownEvan, do you have a list of the best triplets in your opinion for different widths, say Xoroshiro24 to Xoroshiro40? If not, don't worry and if so, score tables not needed.

4,1800Vote UpVote DownI'll zip them up when home ...

4,1800Vote UpVote Downhttp://forums.parallax.com/discussion/comment/1410683/#Comment_1410683

4,1800Vote UpVote Down4,1800Vote UpVote DownEDIT: The "Bit" column exhibits a perfect scaling of scores! That blows me away. I don't have a clue why it's so rigidly alone like that. Pity it's also the scoring variant that takes many times longer to run.

830Vote UpVote DownI've just been reading this post in which Melissa O'Neill argues (right at the end) that it would be better to multiply the xoroshiro states rather than add them:

http://www.pcg-random.org/posts/visualizing-the-heart-of-some-prngs.html

As MUL takes the same time as ADD on the P2 there would be no performance penalty. It's not clear to me whether the product could be truncated to 16 bits as well as 8 bits. If so, would it perform better in randomness tests than the 16-bit sum? If not, two iterations would be needed to get a high-quality 16-bit result, but that's just the same as with adding.

4,1800Vote UpVote DownTony,

With a quick look at that webpage, I'm not making much sense of how he's done the Xorshift algorithm. I'm suspicious he's just looking at the state data directly, because otherwise PractRand would be throwing a wobbly on our testing quick smart I'd think.

4,1800Vote UpVote Down4,1800Vote UpVote Down4,1800Vote UpVote DownHere's the s12 multiplying scores for comparison:

830Vote UpVote Downmagic number("MCG multiply") and only the high byte of the product kept.4,1800Vote UpVote Down830Vote UpVote Down4,1800Vote UpVote Down4,1800Vote UpVote DownI've since discovered, and verified, that at least one program was having issues with the SMT flaw in the early Ryzens. I would have been happy to just disable SMT in the BIOS but the silly BIOS has a problem with that - It insists I had an unexpected power down every time it powers up - have to press F1 and ESC and ENTER every time I power it up with SMT disabled!

So now I'm sending the CPU back for a free exchange tomorrow - never done anything international like this before - AMD have directly emailed me a FedEx number to book it to. Conveniently there is a depot only ten minutes drive away ...

830Vote UpVote DownI've done Xoroshiro40+ on the Z80 and the triplet numbers were quite friendly. A final one I'd like to try is Xoroshiro48+ as every bit in two sets of three 8-bit registers would be used. There is a trick for finding full-period triplets without brute-force as that would take forever with Xoroshiro128+, if only we knew what the trick is.

4,1800Vote UpVote DownYes, Xoroshiro128+ and 1024+ both have "jump" functions supplied in the sources from the authors.

Starting from bit 0 of both word halves of the PRNG state, it does an overlaying XOR of a working state with the current state based on what looks to be a specially crafted key, then calls the iterator, stepping the current state by one. This repeats for each bit of the word size (ie: half the state length).

And at the end it copies the working state over the current state.

Here's the actual code from original Xoroshiro128+ jump() function:

4,1800Vote UpVote Down830Vote UpVote Down19,7250Vote UpVote DownI have no idea how.

Of course, running through a huge sequence does not imply a random looking sequence. A simple binary counter will run through such a sequence, step by step in order. We need something better to defeat the tests of "randomness".

I think understanding C is the least of the problems in understanding how these things work.

Certainly brute force running and checking them does not help. The periods can be longer to work through than the age of the universe!

What is certainly true is that tweaking the number of bits in the state or messing with the parameters can ruin a good PRNG.

I start to believe that the search for a better pseudo random number generator drives people insane. As this thread shows :) I felt it myself before I got help from "PRNG Anonymous". For example this from Stanford:

PCG: A Family of Better Random Number Generators:

4,1800Vote UpVote DownSo, presumably it's possible to built jump functions that steps further than a square-root into the period. But they're not inclined to explain, or maybe it's just really hard to work out. I have no idea.

4,1800Vote UpVote Down19,7250Vote UpVote DownCertainly jump functions can make huge strides into the state space. I have no idea about the square-root thing.

Me too. I have no idea.