Random/LFSR on P2 - Page 56 — Parallax Forums

Random/LFSR on P2


Comments

  • Heater. Posts: 21,230
    "...the geometric mean of 15 runs..." is clear enough.

    "...normalized to make the classic biased-mod method have unit time."

    Hmm... so the axis is time, but not measured in seconds or whatever but in units of the execution time of the biased-mod method.

    The axis is relative not absolute.

    So, yeah, lower is better.

  • evanh Posts: 16,032
    edited 2018-08-06 07:39
    Oh, so it's just execution speed. Is that all that's being compared in the whole article?

    Xoroshiro++/Xoshiro++ win hands down there. No multiplier and minimal multilayered rotate-sum-rotate-sum.

  • Heater. Posts: 21,230
    Yep, execution speed. Or rather, relative execution speeds.

    The Xo.... generators are fast for sure. The point of the blog though was that the speed of your PRNG may not be so relevant if you are wrapping it in some much slower processing - in this case, selecting random numbers from a range, which is quite a common thing to do, or perhaps selecting random numbers with some desired distribution. Differences in speed between PRNGs can get lost in the noise compared to the execution time of everything else.

    Ha! I can understand being allergic to XP.

  • TonyB_ Posts: 2,195
    edited 2018-08-06 13:02
    evanh wrote: »
    TonyB_ wrote: »
    Her last blog post is not as interesting as some others, but I mention it because I can't see the various Benchmark Results, only the titles. Can you?
    Ha, those are all Scalable Vector Graphics. Wow, I think the only place I've seen an SVG before is Wikipedia, but Wikipedia has automatic conversion to bitmap for browsers that can't handle an SVG.

    I've emailed you the SVGs converted to PNGs.

    Thanks Evan, I should have looked at the page source. My browser can view SVG files but not when placed inline as follows:
    <img alt="Large-Shuffle Benchmark, Mersenne Twister" src="../figures/bounded-rand-mt1.svg">
    

    If there were a link to the SVG's URL that I could click on, then I could see the image.

    I have an XP PC that was given to me but it is probably the slowest XP PC in the world.
  • evanh Posts: 16,032
    TonyB_ wrote: »
    I have an XP PC that was given to me but it is probably the slowest XP PC in the world.
    That's where the name Windoze comes from.

    Nothing's changed, btw. People still get confused thinking that their slow Win10 is due to the internet. The marketers just love it. Upgrades and new faster internet every few years. Of course, having been talked into the big upgrade - fibre connections - and seeing nothing improve, they're now getting annoyed.

  • TonyB_ Posts: 2,195
    edited 2018-08-07 00:13
    Evan, do you have a copy of TestU01? I'm wondering how [13,5,10,9] and [13,5,8,10] double-iterated and Chris's generator do in the SmallCrush, Crush and BigCrush tests. I think TestU01 requires 32-bit samples, which is exactly what we're using here.
  • evanh Posts: 16,032
    edited 2018-08-07 00:52
    I used Dieharder before Practrand. The Crush testing just looked difficult and incomprehensible. So, no, never had it.

    EDIT: Lol, that said, it is in the Ubuntu pre-built repository.

    EDIT2: Or not. There is some other program with the same name; here's the man page description: "Program testu01-tcode makes compilable code from a [La]TeX document." Yet the package manager says "TestU01 is a software library, implemented in the ANSI C language, and offering a collection of utilities for the empirical statistical testing of uniform random number generators." And the package contents include /usr/bin/testu01-tcode

  • evanh Posts: 16,032
    From what little I know of the Crush tests, they only seem to be pass/fail. There is no length given for how far the test had to go to discover the failure. That seems pretty worthless because all 32-bit state engines fail BigCrush.

  • BigCrush does more than 100 separate tests. The pass/fail makes things simpler. The first generator to test would be Chris's as his is the best (for 32-bit output). A few tests will fail but how many exactly? This would give us a baseline for xoroshiro32++ double-iterated. Do more tests fail or the same number? We know there are small differences with Crush and BigCrush for [14,2,7,5] and [14,2,7,6].

    TestU01 documentation says: "To install and use TestU01, one needs the GNU C compiler. This compiler is bundled with most Linux distributions."
    http://simul.iro.umontreal.ca/testu01/tu01.html

    BigCrush takes hours to run, so it's not meant for all candidates. If from our recent work we recommend switching a future version of XORO32 to [13,5,10,9] say, then this should be checked with the Crush tests. Seba has helped us a lot but I don't want to bother him again.
  • evanh Posts: 16,032
    Yeah, I got that and built it. Lots of .o files resulted. There's no executable other than tcode, which is the same doc converter I mentioned earlier. There is a "bin" directory but it's empty. The command line method I've been using doesn't seem to be an option.

    It's all a huge collection of sources with over 80 support headers that I somehow have to read up on to find all the functions to call to make use of it.

    I'm not in any rush to learn it all to be honest. I'll have another poke around ...

  • evanh Posts: 16,032
    TonyB_ wrote: »
    BigCrush takes hours to run, so it's not meant for all candidates. If from our recent work we recommend switching a future version of XORO32 to [13,5,10,9] say, then this should be checked with the Crush tests.
    Problem is how do we tell if period is the fail condition or quality? If we can't tell the difference then that test has no value.

    I'll do some reading. Hopefully there is a way to make the distinction.

  • TonyB_ Posts: 2,195
    edited 2018-08-07 22:32
    evanh wrote: »
    TonyB_ wrote: »
    BigCrush takes hours to run, so it's not meant for all candidates. If from our recent work we recommend switching a future version of XORO32 to [13,5,10,9] say, then this should be checked with the Crush tests.
    Problem is how do we tell if period is the fail condition or quality? If we can't tell the difference then that test has no value.

    I'll do some reading. Hopefully there is a way to make the distinction.

    It would be better to do some compiling! :) Don't worry about how the tests work, just let 'em run. If all the (small number of) candidates fail the same tests, then we'll have learnt something: that TestU01 has no value as a comparison tool.
  • evanh Posts: 16,032
    I gather you've not discerned anything then. At this stage I wouldn't have a clue as to how to put together a Crush compile.

    There is a new compile occurring every few minutes for the last week or so, if that counts. Double iterate gridding of the distribution list is progressing.

  • What we need compiled are the battery tests, bbattery.c in the source files. Info in bbattery.tex or from p.147 onwards in the User's Guide. I don't know about the various *.h files.
  • evanh Posts: 16,032
    I'm not very focused at the moment, sorry. The primary issue is how to put it all together. I think I've found something useful on page 19 (24) of the short guide.

  • evanh Posts: 16,032
    edited 2018-08-10 01:38
    I can safely say I'm caffeine dependent.

    EDIT: Typo again!
  • evanh Posts: 16,032
    edited 2018-08-10 01:51
    Here's all four charts together.

    Note: Single iterated is scored as a 16x16 grid. Double iterated is scored as a 32x32 grid but with every second line missing - making it a 16x32 grid. Because of the score difference in odd versus even aperture sizes, this creates maybe roughly a -0.67 comparative bias on the double iteration averages.

    Nothing stands out as best candidate. All the usual good ones are still there. #1 distribution candidate is okay, as is #4.

    EDIT: Oops, I see the graph text is wrong ...
  • evanh Posts: 16,032
    edited 2018-08-11 13:44
    Well, after writing two full implementations of Crush testing with an extensively configurable Xoroshiro generator built in - one using what I found in the short guide PDF, the other following Melissa's example once I realised someone else had probably done this before - it dawned on me that I could very easily do my own piped "stdin" version of just the Crush tests, without any generator, and have it fit the way I've been using Practrand in my scripts.

    PS: Melissa's excellent blog on the matter - http://www.pcg-random.org/posts/how-to-test-with-testu01.html

    Here's the very much simplified result:
    #include <stdint.h>
    #include <stdio.h>
    
    #include "unif01.h"
    #include "bbattery.h"
    
    
    
    static inline uint8_t  revbits( uint8_t  v )
    {
        // https://graphics.stanford.edu/~seander/bithacks.html
        // swap odd and even bits
        v = ((v >> 1) & 0x55) | ((v & 0x55) << 1);
        // swap consecutive pairs
        v = ((v >> 2) & 0x33) | ((v & 0x33) << 2);
        // swap nibbles ...
        v = ((v >> 4) & 0x0f) | ((v & 0x0f) << 4);
    
        return v;
    }
    
    
    uint32_t  chunk_bits( void )
    {
    	uint32_t  output;
    
    	// Note: getchar() returns EOF (-1) at end of stream, which the
    	// byte casts truncate to 0xFF, so keep the input pipe flowing.
    #ifdef REVBITS
    	// Per-byte bit reversal plus reversed byte order gives a full
    	// 32-bit bit-order reversal relative to the branch below.
    	output = (uint32_t)revbits( getchar() ) << 24;
    	output |= (uint32_t)revbits( getchar() ) << 16;
    	output |= (uint32_t)revbits( getchar() ) << 8;
    	output |= revbits( getchar() );
    #else
    	output = (uint8_t)getchar();
    	output |= (uint32_t)(uint8_t)getchar() << 8;
    	output |= (uint32_t)(uint8_t)getchar() << 16;
    	output |= (uint32_t)(uint8_t)getchar() << 24;
    #endif
    
    	return output;
    }
    
    
    
    int  main( void )
    {
    	// Create TestU01 PRNG object for our generator
    #ifdef REVBITS
    	unif01_Gen  *gen = unif01_CreateExternGenBits( "stdin32, bit-order-reversed", chunk_bits );
    #else
    	unif01_Gen  *gen = unif01_CreateExternGenBits( "stdin32", chunk_bits );
    #endif
    
    	// Run the tests.
    #if CRUSH == 1
    	bbattery_SmallCrush( gen );
    #elif CRUSH == 2
    	bbattery_Crush( gen );
    #elif CRUSH == 3
    	bbattery_BigCrush( gen );
    #else
    	printf( "No test-battery selected!\n" );
    #endif
    
    	// Clean up.
    	unif01_DeleteExternGenBits( gen );
    
    	return 0;
    }
    
  • evanh wrote: »
    Well, after writing two full implementations of Crush testing with an extensively configurable Xoroshiro generator built in - one using what I found in the short guide PDF, the other following Melissa's example once I realised someone else had probably done this before - it dawned on me that I could very easily do my own piped "stdin" version of just the Crush tests, without any generator, and have it fit the way I've been using Practrand in my scripts.

    PS: Melissa's excellent blog on the matter - http://www.pcg-random.org/posts/how-to-test-with-testu01.html

    That's good news, Evan, so you have the Crush tests working? In a moment of madness I considered getting hold of a C compiler. Is the REVBITS code for extra testing as recommended with the bit order reversed?
  • evanh Posts: 16,032
    edited 2018-08-11 14:14
    TonyB_ wrote: »
    That's good news, Evan, so you have the Crush tests working?
    Things seem to be behaving. Not a lot of testing just yet though. Still have to fit the scripts.
    In a moment of madness I considered getting hold of a C compiler.
    Come to the dark side ... you'll like it here. :D
    Is the REVBITS code for extra testing as recommended with the bit order reversed?
    Yes. That's what's known as a preprocessor, or compile-time, condition. Only the piece of code for the true case gets compiled. The rest just stays as text in the source.

    Here's the most recent compile command with those controls:
    gcc -o crushstdin crushstdin.c -ltestu01 -lprobdist -lmylib -lm -O3 -DCRUSH=1 -DREVBITS
    
  • evanh Posts: 16,032
    Oh, it's not extra testing, it's either testing with or without bit reversal. You probably worked that out from the rest of what I said though.

  • evanh Posts: 16,032
    evanh wrote: »
    Wow, I didn't know each Epyc socket fields eight DIMM channels. That's insane pin count! I had assumed the Threadripper parts were same packaging as Epyc but that's now clearly not the case.
    Huh, I just discovered, as I'd first thought, they are the same socket. If I'd bothered to check I would've known earlier.

    Threadrippers/TR4 must just have half the DIMM channels unrouted.

  • TonyB_ Posts: 2,195
    edited 2018-08-13 00:54
    TonyB_ wrote: »
    Tests failed:
    		Crush	BigCrush
    [14,2,7,5]	  5	   10
    [14,2,7,6]	  3	   11
    

    It will be interesting to see how Chris's generator and [13,5,10,9] & [13,5,8,10] perform.

  • evanh Posts: 16,032
    edited 2018-08-13 04:35
    Uh-oh, those don't match what I'm getting. Thanks for linking back to those earlier Crush results. Gives me something to reference. I've tried out both the piped version and the all-in-one version.

    Here's Seba's Crush result for Xoroshiro32++ [14 2 7 5]:
    ========= Summary results of Crush =========
    
     Version:          TestU01 1.2.3
     Generator:        xoroshiro[16]32++-14-2-7
     Number of statistics:  144
     Total CPU time:   00:29:09.44
     The following tests gave p-values outside [0.001, 0.9990]:
     (eps  means a value < 1.0e-300):
     (eps1 means a value < 1.0e-15):
    
           Test                          p-value
     ----------------------------------------------
      1  SerialOver, t = 2              1 -  5.1e-5
      2  SerialOver, t = 4               1.3e-8
     42  MaxOft, t = 10                 1 -  1.1e-6
     43  MaxOft, t = 20                 1 -  2.7e-5
     44  MaxOft, t = 30                 1 -  5.7e-9
     ----------------------------------------------
     All other tests were passed
    

    Here's mine for same [14 2 7 5] generator:
    ========= Summary results of Crush =========
    
     Version:          TestU01 1.2.3
     Generator:        Xorshiro32(16)++, candidate [14 2 7 5]
      - single iterated, bit-order straight, sampling aperture 16>>0
     Number of statistics:  144
     Total CPU time:   00:24:40.69
     The following tests gave p-values outside [0.001, 0.9990]:
     (eps  means a value < 1.0e-300):
     (eps1 means a value < 1.0e-15):
    
           Test                          p-value
     ----------------------------------------------
      2  SerialOver, t = 4               3.5e-4
     41  MaxOft, t = 5                   0.9996 
     42  MaxOft, t = 10                 1 -  5.7e-5
     43  MaxOft, t = 20                  0.9997 
     44  MaxOft, t = 30                 1 -  9.8e-6
     ----------------------------------------------
     All other tests were passed
    
    ========= Summary results of Crush =========
    
     Version:          TestU01 1.2.3
     Generator:        stdin32
     Number of statistics:  144
     Total CPU time:   00:28:47.11
     The following tests gave p-values outside [0.001, 0.9990]:
     (eps  means a value < 1.0e-300):
     (eps1 means a value < 1.0e-15):
    
           Test                          p-value
     ----------------------------------------------
      2  SerialOver, t = 4               3.5e-4
     41  MaxOft, t = 5                   0.9996 
     42  MaxOft, t = 10                 1 -  5.7e-5
     43  MaxOft, t = 20                  0.9997 
     44  MaxOft, t = 30                 1 -  9.8e-6
     ----------------------------------------------
     All other tests were passed
    

    And I've got two variants of bit-reversal now too, neither of which match Seba's results:
    ========= Summary results of Crush =========
    
     Version:          TestU01 1.2.3
     Generator:        stdin32, bit-order-reversed
     Number of statistics:  144
     Total CPU time:   00:29:42.00
     The following tests gave p-values outside [0.001, 0.9990]:
     (eps  means a value < 1.0e-300):
     (eps1 means a value < 1.0e-15):
    
           Test                          p-value
     ----------------------------------------------
      1  SerialOver, t = 2              1 - 1.3e-12
     41  MaxOft, t = 5                   0.9998 
     71  LinearComp, r = 0               0.9991 
     ----------------------------------------------
     All other tests were passed
    
    ========= Summary results of Crush =========
    
     Version:          TestU01 1.2.3
     Generator:        Xorshiro32(16)++, candidate [14 2 7 5]
      - single iterated, bit-order reversed, sampling aperture 16>>0
     Number of statistics:  144
     Total CPU time:   00:25:24.90
     The following tests gave p-values outside [0.001, 0.9990]:
     (eps  means a value < 1.0e-300):
     (eps1 means a value < 1.0e-15):
    
           Test                          p-value
     ----------------------------------------------
      1  SerialOver, t = 2              1 -  2.8e-5
     41  MaxOft, t = 5                  1 -  1.1e-9
     ----------------------------------------------
     All other tests were passed
    
  • evanh Posts: 16,032
    edited 2018-08-13 06:45
    The first of those two - "Generator: stdin32, bit-order-reversed" - is the more appropriate one. The bit-reversal in the second one reorders the generator's 16-bit output rather than Crush's 32-bit input.

    At this stage, I can't say why Seba's result is not matching anything I've done.

  • evanh Posts: 16,032
    edited 2018-08-13 10:29
    To improve the execution speed of the piped test suite, I've moved the bit-order reversal code out of the Crush testing and tacked it onto the end of the sampling code in the generator compile. So the summary report now looks completely generic in terms of generator configuration. Eg: Here's a new run from Xoroshiro32++ [14 2 7 5], single iterated, sampling aperture 16>>0, bit-order reversed.
    ========= Summary results of Crush =========
    
     Version:          TestU01 1.2.3
     Generator:        stdin32
     Number of statistics:  144
     Total CPU time:   00:22:38.14
     The following tests gave p-values outside [0.001, 0.9990]:
     (eps  means a value < 1.0e-300):
     (eps1 means a value < 1.0e-15):
    
           Test                          p-value
     ----------------------------------------------
      1  SerialOver, t = 2              1 - 1.3e-12
     41  MaxOft, t = 5                   0.9998 
     71  LinearComp, r = 0               0.9991 
     ----------------------------------------------
     All other tests were passed
    
  • evanh Posts: 16,032
    Regarding a possible reason for not matching the report from Seba: Seba's implementation may have had the ability for the test suite to set the generator seed at will. My implementation doesn't provide that option; each new test in the battery can only continue the generator sequence from where the previous test left off.
  • TonyB_ Posts: 2,195
    edited 2018-08-13 12:02
    Thanks, Evan. Are the results repeated exactly if you do a second lot of battery tests, continuing from where the first one left off?
  • Reversing the bit order produces fewer failures but in more categories (LinearComp added to SerialOver and MaxOft).