I happened to do a quick run of that last night. Attached is the full set of tests. {8,2,3} is the only test that makes it to 64 MB.
There are only 16M iterations in 24 bits, so the random tester must have clued into the pattern after the 4th full cycle. What else could explain that?
Ha, I hadn't thought about it. Well, there are signs, three uncommonly suspicious cases, of an impending failure at 32 MB. It's common to see the DC6-9x1Bytes-1 case showing up early, but not those other three.
Now I really don't get it. With a Schnider 32-bit LFSR, or similar, you get a 2^32 - 1 sequence and surely the "quality" is plenty good enough.
I was going to say, "Actually, no. You get a 2**27 -1 sequence, since you have to iterate 32 times to get a random 32-bit value. Otherwise, patterns show up. As a consequence, given a starting seed, 31/32 of the potential values are never seen in the sequence."
But that's not true, due to the "-1" (zero not being in the sequence). So, yeah, the non-repeating sequence is 2**32 - 1 long.
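For reference, a "one new bit per clock" maximal-length LFSR of the sort being discussed could be sketched in C as below. The tap set {32,22,2,1} is just one known maximal-length choice and is an assumption here, not necessarily the particular LFSR Phil means.

#include <stdint.h>
#include <stdio.h>

/* 32-bit Galois LFSR, right-shifting, one pseudo-random bit per clock.
   Mask 0x80200003 corresponds to x^32 + x^22 + x^2 + x + 1, a known
   maximal-length polynomial (assumed here).  The sequence length is
   2^32 - 1; the all-zero state never occurs. */
static uint32_t lfsr = 1;   /* any non-zero seed */

static uint32_t lfsr_bit(void) {
    uint32_t out = lfsr & 1u;
    lfsr >>= 1;
    if (out)
        lfsr ^= 0x80200003u;
    return out;             /* one new bit per clock */
}

/* "Iterate 32 times to get a random 32-bit value": one bit per clock. */
static uint32_t lfsr_word(void) {
    uint32_t w = 0;
    for (int i = 0; i < 32; i++)
        w = (w << 1) | lfsr_bit();
    return w;
}

int main(void) {
    for (int i = 0; i < 4; i++)
        printf("%08lX\n", (unsigned long)lfsr_word());
    return 0;
}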
I would like to see this tested just using the MSB of the sum as input to the test suite. Evanh, would you mind doing that? Just testing the MSB of the sum? It would take 8 iterations for each byte of output. I wonder what the failure point would be, then. I suspect it might fully iterate through all of its 2^32-1 states before failing, in that case.
Oh, ah, the B in MSB is Byte, right?
So you want to truncate each 12-bit sum to the top 8 bits and spit those individually at PractRand rather than my existing concatenation, correct?
I was assuming you did that on the 64MB-pass case. Did you do something else? I assumed you did the xoroshiro24+ iteration and passed out the top 8 bits of the 12+12 bit sum.
In my question above, I was wondering how far the best xoroshiro32+ would pass by only outputting the top bit (not byte) of the sum. You'd have to do eight iterations to get each byte of output.
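To make that concrete, here's a minimal C sketch of the bit-truncated arrangement: a xoroshiro32+ iteration (16-bit state halves) with only the MSB of each sum kept, so eight iterations per output byte. The triple {14,2,7}, the seed, and the output plumbing are illustrative assumptions, not evanh's actual harness.

#include <stdint.h>
#include <stdio.h>

static uint16_t rotl16(uint16_t x, int k) {
    return (uint16_t)((x << k) | (x >> (16 - k)));
}

/* State and shift triple {a,b,c}; {14,2,7} is one of the candidates
   discussed later in this thread, used here only as an example. */
static uint16_t s0 = 1, s1 = 0;
enum { A = 14, B = 2, C = 7 };

/* One xoroshiro32+ iteration: returns the 16-bit sum s0 + s1. */
static uint16_t xoro32p_next(void) {
    uint16_t sum = (uint16_t)(s0 + s1);
    s1 ^= s0;
    s0 = (uint16_t)(rotl16(s0, A) ^ s1 ^ (uint16_t)(s1 << B));
    s1 = rotl16(s1, C);
    return sum;
}

/* Bit-truncated output: keep only the MSB of each sum, so eight
   iterations are needed for every byte fed to the tester. */
static uint8_t next_msb_byte(void) {
    uint8_t out = 0;
    for (int i = 0; i < 8; i++)
        out = (uint8_t)((out << 1) | (xoro32p_next() >> 15));
    return out;
}

int main(void) {
    /* Stream raw bytes to stdout for piping into PractRand; the exact
       RNG_test invocation isn't shown here. */
    for (;;) {
        uint8_t b = next_msb_byte();
        if (fwrite(&b, 1, 1, stdout) != 1)
            return 0;
    }
}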
Here's the byte-truncated version - throwing away the bottom 4 bits of each generated number. You can see all test scores are smaller, with {8,2,3} now failing at 8 MB instead of 64 MB. That's smaller than I expected but it's still one of the best scores of that run. Oops, just a moment ... I was throwing out the top 3 bits and the bottom bit. Corrected now, and {11,2,4} stands out as better than {8,2,3}, with HQ up to 8 MB and failure at 16 MB.
I'm off to a meal so will be a bit longer with the bit truncated version ...
Evanh, thanks for running all those tests. It's interesting that you are not seeing failure up until 16MB. That kind of suggests that among known max-length shift settings, there may be one to make it all the way to 16MB. Extracting 1 byte per iteration means that its entire iteration range is almost perfect at 24 bits. This probably means that among those max-length xoroshiro32+ settings I found, one of them can go all the way to 4GB when extracting the top byte of the sum (as opposed to the top 15 bits), which is the iteration range for 32 bits. It would be ideal to get one of those xoroshiro32+ settings nailed down. I had 86 candidates there. Uh, what do you say?
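For reference, the arithmetic behind those figures, assuming one output byte per iteration and ignoring the missing all-zero state, is roughly: 2^24 iterations x 1 byte = 16 MB of output for the 24-bit generator, and 2^32 iterations x 1 byte = 4 GB for the 32-bit one.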
I will say I'm not seeing any fixed pattern to the scores of the truncations. In fact, sometimes the quality of the bit-sized truncation seems on par with byte-sized in terms of megabytes processed. As far as I can see, this implies that repeats are not being detected by the PractRand test suite.
I guess it's all statistical. It's not like the whole data stream is kept and compared over its entire length all the time.
With the LSB truncation shifted out of the generator function, I've decided to change my labelling of the filenames. Where I used to label them as size15 and size11, they're now s16 and s12 respectively.
I've rerun a set of each variant of each size and packaged them all together (I had to change to 7zip to fit the whole lot within the forum's 2 MB limit; 7zip halved the archive size!).
It would be ideal to get one of those xoroshiro32+ settings nailed down. I had 86 candidates there. Uh, what do you say?
Since there wasn't much of a match between quality and repeat length at the full length of generated numbers, it might be worth comparing your 86 candidates with the best of the bit truncated set. It might be possible to spot repeat length as the failure mode there. Admittedly I've not provided every possible combination though.
EDIT: Doh! I've found your posted list of 86. I hadn't realised what that was before ...
This is what I get after filtering for A>C>B (I'm assuming that's a minimum requirement):
6,2,3
6,2,5
8,1,7
9,1,4
10,2,7
11,1,2
11,2,6
11,3,10
11,7,10
12,8,11
12,10,11
13,5,8
13,5,10
13,10,12
14,1,11
14,2,7
14,2,9
14,3,13
14,8,9
14,11,13
14,12,13
15,1,2
15,3,6
15,4,12
15,7,8
I'm running the full s16 combination sweep of 3375 tests for each variant now. Might take a while ...
Of note, the bit-truncated variant is munching through the combinations at half the speed of the other two. It has a lot of large scores so far. I'm guessing the odd combinations of a-b-c don't upset the results for this variant the same way.
Byte-truncated seems to have the most small scores. It is currently about 20-25% faster than the full word 15-bit variant.
PS: Also started the same full combination sweep for the three s12 variants. This is 1331 combinations each.
PPS: With one still finishing up an old size25 run, there are now seven Bash scripts concurrently compiling and testing. I can feel the warm air from me Ryzen rig! I've never done anything like this before.
I've installed Mumble now. It seems easy enough, there's lots of open servers automatically registered.
Last time I did any VoIP stuff was with Teamspeak in 2006 when I was playing Eve.
I installed some mumble thing, but it doesn't seem to find any servers.
PS: Mumble version is 1.2.18
Hmm, the server I'm currently sitting on is called Mumble.co.nz (has low ping time for me, and user creatable channels). Port 64738. Maybe you can connect to it manually. I've created a Prop2 channel.
I'm running the full s16 combination sweep of 3375 tests for each variant now...
Of the 86, four had zeros, so really only 82.
The quality difference between the three variants isn't significant, imho.
The full-sweep bit-truncated variant produced an impressive 236 (of 3375) scores that failed at 1 GB. 63 of these 236 map onto the 82. Not perfect, but way closer than the other two variants.
The byte-truncated variant produced 2 scores that failed at 2 GB, 11 at 1 GB, and 47 at 512 MB. 11 of these 60 map onto the 82.
The full 15-bit variant produced 19 scores that failed at 2 GB and 65 at 1 GB. A mere 7 of these 84 map onto the 82.
I've edited {10,5,13} to show the 512 MB entry for the bit-truncated column. It doesn't conform to the original 128+ configuration but does look good on the scores. Another contender would be {14,2,7}; it conforms and is nicely balanced.
Evanh, thanks so much for doing all these tests! I've only got my phone today, so I can't get into any zip files, but your last big list is very helpful.
The reason those max-run-length settings are critical to stick with is that they WILL cycle through all non-0 states, never getting hung up in some short loop due to some unfortunate seed value. I did cut my test somewhat short, though, so it's possible that some of your higher-scoring settings are also max-run-length.
The way I discovered those max-run-length settings was to use a seed of $0000_0001, then iterate until $0000_0001 reappears, counting iterations as I go. If $0000_0001 first reappears after exactly $FFFF_FFFF iterations, you've got a max-run-length setting that cycles through every non-0 value before repeating. If you are not back to $0000_0001 after $FFFF_FFFF iterations, the PRNG got hung up in a loop and those shift settings are not ideal.
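A minimal C sketch of that check might look like the following; the 1..15 shift range and the plain-text output are assumptions about the harness, not Chip's exact code, and a single-process sweep like this is slow (the thread later splits the work across 15 processes).

#include <stdint.h>
#include <stdio.h>

static uint16_t rotl16(uint16_t x, int k) {
    return (uint16_t)((x << k) | (x >> (16 - k)));
}

/* Count iterations until state {s0=1, s1=0} reappears.  A max-run-length
   triple returns exactly 0xFFFFFFFF (every non-0 state visited).  The
   count also stops at 0xFFFFFFFF as a safety cutoff; as noted further
   down the thread, that roll-over case never actually fires. */
static uint32_t period_of(int a, int b, int c) {
    uint16_t s0 = 1, s1 = 0;
    uint32_t n = 0;
    do {
        uint16_t t = (uint16_t)(s1 ^ s0);
        s0 = (uint16_t)(rotl16(s0, a) ^ t ^ (uint16_t)(t << b));
        s1 = rotl16(t, c);
        n++;
    } while (!(s0 == 1 && s1 == 0) && n != 0xFFFFFFFFu);
    return n;
}

int main(void) {
    /* Sweep the 15^3 candidate triples and list only the full-period ones. */
    for (int a = 1; a <= 15; a++)
        for (int b = 1; b <= 15; b++)
            for (int c = 1; c <= 15; c++)
                if (period_of(a, b, c) == 0xFFFFFFFFu)
                    printf("%d,%d,%d\n", a, b, c);
    return 0;
}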
It's interesting that regardless of how you extract the random values, you get nearly the same quality, even though the numbers of iterations are up to 15x more for single-bit vs. 15-bit.
It looks like 14,2,7 is the best, overall, doesn't it?
This is what I get after filtering for A>C>B (I'm assuming that's a minimum requirement): ...
I don't know that that's a requirement, but maybe it is.
Evanh, thanks so much for doing all these tests! I've only got my phone today, so I can't get into any zip files, but your last big list is very helpful.
Compliments do feel good.
As for those zip files, the dump of scores probably doesn't have much value. I was still meandering at that stage.
... I did cut my test somewhat short, though, so it's possible that some of your higher-scoring settings are also max-run-length.
I'll take a shot at that too ...
It looks like 14,2,7 is the best, overall, doesn't it?
Yep, I like that one.
Isn't it funny how all this thought and effort go into coming up with a sequence of three 4-bit numbers?
If anyone ever looks at xoroshiro128+ and wonders about a 32-bit version, they will search for xoroshiro32 and wind up reading this thread, where they will find an optimal implementation.
We are somewhat reliant on PractRand giving meaningful answers without understanding how it does so.
I do have the source code, but it's rather big and complicated and full of C++. I find C++ has this natural obfuscation; there is so much squishiness to the data types and their operators. The levels of indirection and hidden data copying just swamp me.
Not that I've ever really tried that hard. I'm comfortable with C.
EDIT: And I don't code for a job these days. PLCs and ladder logic are as far as I've ever needed to go there. That said, learning C was an on-the-job, self-taught crash course when the opportunity arose ... so it often seems to be necessity driven.
EDIT2: That opportunity actually sprang from prior success with PLCs. We wanted a big colour touchscreen in the 1990s, when they were all really expensive in the industrial market space. So, using a raw laptop LCD (quite hard to source in small numbers back then) plugged into a PC on an ISA card, we cobbled together something close to what is now called a Panel-PC. Everything was folded and painted sheet steel. That's when I really learned C properly.
... I did cut my test somewhat short, though, so it's possible that some of your higher-scoring settings are also max-run-length.
I'll take a shot at that too ...
Took me maybe three or four hours of tinkering and three or four minutes to run. I was running it as a single task but decided to make a script that launched 15 processes in rapid fire to make use of the hardware. Attached is a screenshot, taken just after it finished, showing the ramp-down of processor utilisation (bottom right of the display) due to differing run times.
As for the result, I get 84 hits without any zeros. That's two extra: {7,2,14} and {11,11,14}. All others are a perfect 1:1 match.
The iterator loop can terminate on two criteria: when (s0 == 1 AND s1 == 0), and also on the, supposedly impossible, roll-over at 2^32 iterations. I thought I'd see if there are any cases where the impossible does occur.
Modifying the print criteria, for what gets listed in the report, gives empty reports! And I double-checked by tacking on a single end-of-line upon program completion, which shows up as a file size of one. Yay, no impossibles occurred.
Range was 1-15, I didn't do the zeros. So 15^3 = 3375 sets total, but 15^2 = 225 per task.