"...normalized to make the classic biased-mod method have unit time."
Hmm... so the axis is time, not measured in seconds or whatever, but in units of the execution time of the biased-mod method.
The axis is relative, not absolute.
So, yeah, lower is better.
Xoroshiro++/Xoshiro++ win hands down there. No multiplier and only a minimal layering of rotate-sum-rotate-sum.
Yep, execution speed. Or rather, relative execution speeds.
The Xo.... generators are fast for sure. The point of the blog, though, was that the speed of your PRNG may not be so relevant if you are wrapping it in some much slower processing - in this case, selecting random numbers from a range, which is quite a common thing to do, or perhaps selecting random numbers with some desired distribution. Differences in the speeds of different PRNGs can get lost in the noise compared to the execution time of everything else.
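For a feel of that overhead, here's the classic biased-mod reduction next to a simple unbiased rejection loop - a sketch only, with next() standing in for whatever raw 32-bit generator sits underneath:

    #include <stdint.h>

    extern uint32_t next (void);   /* the raw 32-bit PRNG, assumed elsewhere */

    /* classic biased mod: one modulo, but low results come up slightly
       more often than they should */
    uint32_t bounded_biased (uint32_t bound)
    {
       return next () % bound;
    }

    /* unbiased: reject the 2^32-mod-bound stragglers, then reduce; the
       loop plus the extra modulo can easily cost more than next() itself */
    uint32_t bounded (uint32_t bound)
    {
       uint32_t t = (0u - bound) % bound;   /* == 2^32 mod bound */
       uint32_t r;
       do {
          r = next ();
       } while (r < t);
       return r % bound;
    }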
Her last blog post is not as interesting as some others, but I mention it because I can't see the various Benchmark Results, only the titles. Can you?
Ha, those are all Scalable Vector Graphics. Wow, I think the only place I've seen an SVG before is Wikipedia, but Wikipedia has automatic conversion to bitmap for browsers that can't handle an SVG.
I've emailed you the SVGs converted to PNGs.
Ha! I can understand being allergic to XP.
Thanks Evan, I should have looked at the page source. My browser can view SVG files, but not when they're embedded inline in the page like that. If there were a link to the SVG's URL that I could click on, then I could see the image.
I have an XP PC that was given to me but it is probably the slowest XP PC in the world.
That's where the name Windoze comes from.
Nothing's changed, btw. People still get confused, thinking that their slow Win10 is due to the internet. The marketers just love it: upgrades and a new, faster internet plan every few years. Of course, having been talked into the big upgrade - fibre connections - and seeing that nothing has improved, they're now getting annoyed.
Evan, do you have a copy of TestU01? I'm wondering how [13,5,10,9] and [13,5,8,10] double-iterated and Chris's generator do in the SmallCrush, Crush and BigCrush tests. I think TestU01 requires 32-bit samples, which is exactly what we're using here.
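For reference, the bracketed numbers are the [a,b,c,d] constants: the engine's two rotates and shift, plus the output rotate of the ++ scrambler. A sketch of the single- and double-iterated forms, assuming the usual xoroshiro32++ arrangement ([13,5,10,9] shown; which half goes first in next32 is a guess):

    #include <stdint.h>

    static uint16_t s0, s1;   /* 32 bits of state; must not start all zero */

    static inline uint16_t rotl16 (uint16_t x, int k)
    {
       return (uint16_t)((x << k) | (x >> (16 - k)));
    }

    /* xoroshiro32++ with [a,b,c,d] = [13,5,10,9] */
    static uint16_t next16 (void)
    {
       uint16_t r = (uint16_t)(rotl16 ((uint16_t)(s0 + s1), 9) + s0);   /* d = 9 */
       s1 ^= s0;
       s0 = (uint16_t)(rotl16 (s0, 13) ^ s1 ^ (uint16_t)(s1 << 5));     /* a = 13, b = 5 */
       s1 = rotl16 (s1, 10);                                            /* c = 10 */
       return r;
    }

    /* "double iterated": two 16-bit outputs packed into one 32-bit sample */
    static uint32_t next32 (void)
    {
       uint32_t hi = next16 ();
       uint32_t lo = next16 ();
       return (hi << 16) | lo;
    }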
I used Dieharder before PractRand. The Crush testing just looked difficult and incomprehensible. So, no, I've never had it.
EDIT: Lol, that said, it is in the Ubuntu pre-built repository.
EDIT2: Or not. There is some other program with the same name; here's the man page description: "Program testu01-tcode makes compilable code from a [La]TeX document." Yet the package manager says "TestU01 is a software library, implemented in the ANSI C language, and offering a collection of utilities for the empirical statistical testing of uniform random number generators." And the package contents include /usr/bin/testu01-tcode.
From what little I know of the Crush tests, they only seem to give a pass/fail. There is no length given for how far the test had to go to discover the failure. That seems pretty worthless, because all 32-bit state engines fail BigCrush.
BigCrush does more than 100 separate tests. The pass/fail makes things simpler. The first generator to test would be Chris's, as his is the best (for 32-bit output). A few tests will fail, but how many exactly? This would give us a baseline for xoroshiro32++ double-iterated. Do more tests fail, or the same number? We know there are small differences in the Crush and BigCrush results for [14,2,7,5] and [14,2,7,6].
The TestU01 documentation says: "To install and use TestU01, one needs the GNU C compiler. This compiler is bundled with most Linux distributions." http://simul.iro.umontreal.ca/testu01/tu01.html
BigCrush takes hours to run, so it's not meant for all candidates. If from our recent work we recommend switching a future version of XORO32 to [13,5,10,9] say, then this should be checked with the Crush tests. Seba has helped us a lot but I don't want to bother him again.
Yeah, I got that and built it. Lots of .o files resulted. There's no executable other than tcode, which is the same doc converter I mentioned earlier. There is a "bin" directory but it's empty. The command line method I've been using doesn't seem to be an option.
It's all a huge collection of sources with over 80 support headers that I somehow have to read up on to find all the functions to call to make use of it.
I'm not in any rush to learn it all to be honest. I'll have another poke around ...
BigCrush takes hours to run, so it's not meant for all candidates. If from our recent work we recommend switching a future version of XORO32 to [13,5,10,9] say, then this should be checked with the Crush tests.
Problem is, how do we tell whether the fail condition is period or quality? If we can't tell the difference then that test has no value.
I'll do some reading. Hopefully there is a way to make the distinction.
It would be better to do some compiling! Don't worry about how the tests work, just let 'em run. If all the (small number of) candidates fail the same tests, then we'll have learnt something: that TestU01 has no value as a comparison tool.
I gather you've not discerned anything then. At this stage I wouldn't have a clue as to how to put together a Crush compile.
A new compile has been occurring every few minutes for the last week or so, if that counts. Double-iterate gridding of the distribution list is progressing.
EDIT: Typo again!
What we need compiled are the battery tests, bbattery.c in the source files. Info in bbattery.tex or from p.147 onwards in the User's Guide. I don't know about the various *.h files.
I'm not very focused at the moment, sorry. The primary issue is how to put it all together. I think I've found something useful on page 19 (24) of the short guide.
Note: Single iterated is scored as a 16x16 grid. Double iterated is scored as a 32x32 grid but with every second line missing, making it effectively a 16x32 grid. Because of the score difference between odd and even aperture sizes, this creates maybe roughly a -0.67 comparative bias on the double-iteration averages.
Nothing stands out as best candidate. All the usual good ones are still there. #1 distribution candidate is okay, as is #4.
EDIT: Oops, I see the graph text is wrong ...
Well, after writing two full implementations of Crush testing with an extensively configurable Xoroshiro generator built in - one using what I found in the short-guide PDF, the other following Melissa's example once I realised someone else had probably done this before - it dawned on me that I could very easily do my own piped "stdin" version of just the Crush tests, without any built-in generator, and have it fit the way I've been using PractRand in my scripts.
PS: Melissa's excellent blog on the matter - http://www.pcg-random.org/posts/how-to-test-with-testu01.html
Here's the very much simplified result:
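Something along these lines, anyway - a minimal sketch after Melissa's example, where the unif01/bbattery calls are TestU01's real API and the rest is mine:

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include "TestU01.h"

    /* hand 32-bit samples from stdin straight to the Crush battery */
    static unsigned int stdin32 (void)
    {
       uint32_t x;
       if (fread (&x, sizeof x, 1, stdin) != 1) {
          fprintf (stderr, "stdin ran dry\n");
          exit (EXIT_FAILURE);
       }
       return x;
    }

    int main (void)
    {
       unif01_Gen *gen = unif01_CreateExternGenBits ("stdin32", stdin32);
       bbattery_Crush (gen);
       unif01_DeleteExternGenBits (gen);
       return 0;
    }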
That's good news, Evan, so you have the Crush tests working?
Things seem to be behaving. Not a lot of testing just yet though. Still have to fit the scripts.
In a moment of madness I considered getting hold of a C compiler.
Come to the dark side ... you'll like it here.
Is the REVBITS code for extra testing as recommended with the bit order reversed?
Yes. That's what's known as a preprocessor, or compile-time, condition. Only the piece of code for the true case gets compiled. The rest just stays as text in the source.
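Roughly like this - a sketch with invented names, not the actual source; only the branch selected by the define ends up in the binary:

    #include <stdint.h>

    /* REVBITS chosen on the compile command line, e.g. -DREVBITS=1 */
    #if REVBITS
    static uint32_t post_process (uint32_t x)
    {
       /* reverse all 32 bits with the classic swap ladder */
       x = ((x & 0x55555555u) << 1) | ((x >> 1) & 0x55555555u);
       x = ((x & 0x33333333u) << 2) | ((x >> 2) & 0x33333333u);
       x = ((x & 0x0F0F0F0Fu) << 4) | ((x >> 4) & 0x0F0F0F0Fu);
       x = ((x & 0x00FF00FFu) << 8) | ((x >> 8) & 0x00FF00FFu);
       return (x << 16) | (x >> 16);
    }
    #else
    static uint32_t post_process (uint32_t x)
    {
       return x;   /* straight bit order */
    }
    #endif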
Here's the most recent compile command with those controls:
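Shape-wise it's along these lines - file names and the extra define are placeholders here, not the actual command, though the three TestU01 link libraries are the real ones:

    gcc -O2 -DREVBITS=1 -o xoro_crush xoro_crush.c -ltestu01 -lprobdist -lmylib -lm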
Wow, I didn't know each Epyc socket fields eight DIMM channels. That's an insane pin count! I had assumed the Threadripper parts were the same packaging as Epyc, but that's now clearly not the case.
Huh, I just discovered they are the same socket after all, as I'd first thought. If I'd bothered to check I would've known earlier.
Threadrippers/TR4 must just have half the DIMM channels unrouted.
It will be interesting to see how Chris's generator and [13,5,10,9] & [13,5,8,10] perform.
Uh-oh, those don't match what I'm getting. Thanks for linking back to those earlier Crush results - gives me something to reference. I've tried out both the piped version and the all-in-one version.
Here's Seba's Crush result for Xoroshiro32++ [14 2 7 5]:
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: xoroshiro[16]32++-14-2-7
Number of statistics: 144
Total CPU time: 00:29:09.44
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
Test p-value
----------------------------------------------
1 SerialOver, t = 2 1 - 5.1e-5
2 SerialOver, t = 4 1.3e-8
42 MaxOft, t = 10 1 - 1.1e-6
43 MaxOft, t = 20 1 - 2.7e-5
44 MaxOft, t = 30 1 - 5.7e-9
----------------------------------------------
All other tests were passed
Here's mine for the same [14 2 7 5] generator:
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: Xorshiro32(16)++, candidate [14 2 7 5]
- single iterated, bit-order straight, sampling aperture 16>>0
Number of statistics: 144
Total CPU time: 00:24:40.69
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
Test p-value
----------------------------------------------
2 SerialOver, t = 4 3.5e-4
41 MaxOft, t = 5 0.9996
42 MaxOft, t = 10 1 - 5.7e-5
43 MaxOft, t = 20 0.9997
44 MaxOft, t = 30 1 - 9.8e-6
----------------------------------------------
All other tests were passed
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: stdin32
Number of statistics: 144
Total CPU time: 00:28:47.11
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
Test p-value
----------------------------------------------
2 SerialOver, t = 4 3.5e-4
41 MaxOft, t = 5 0.9996
42 MaxOft, t = 10 1 - 5.7e-5
43 MaxOft, t = 20 0.9997
44 MaxOft, t = 30 1 - 9.8e-6
----------------------------------------------
All other tests were passed
And I've got two variants of bit-reversal now too, neither of which match Seba's results:
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: stdin32, bit-order-reversed
Number of statistics: 144
Total CPU time: 00:29:42.00
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
Test p-value
----------------------------------------------
1 SerialOver, t = 2 1 - 1.3e-12
41 MaxOft, t = 5 0.9998
71 LinearComp, r = 0 0.9991
----------------------------------------------
All other tests were passed
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: Xorshiro32(16)++, candidate [14 2 7 5]
- single iterated, bit-order reversed, sampling aperture 16>>0
Number of statistics: 144
Total CPU time: 00:25:24.90
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
Test p-value
----------------------------------------------
1 SerialOver, t = 2 1 - 2.8e-5
41 MaxOft, t = 5 1 - 1.1e-9
----------------------------------------------
All other tests were passed
The first of those two - "Generator: stdin32, bit-order-reversed" - is the more appropriate one. The bit-reversal in the second one reorders the generator's 16-bit output rather than Crush's 32-bit input.
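In code terms the difference is just where the reversal sits - a sketch, with next16() the generator from earlier and rev16()/rev32() as assumed 16- and 32-bit bit-reversal helpers:

    #include <stdint.h>

    extern uint16_t next16 (void);        /* 16-bit generator output */
    extern uint16_t rev16 (uint16_t x);   /* assumed 16-bit reversal helper */
    extern uint32_t rev32 (uint32_t x);   /* assumed 32-bit reversal helper */

    /* variant 2: reverse each 16-bit output before packing - this
       reorders the generator's output, not Crush's 32-bit input */
    uint32_t sample_rev16 (void)
    {
       uint32_t hi = rev16 (next16 ());
       uint32_t lo = rev16 (next16 ());
       return (hi << 16) | lo;
    }

    /* variant 1, the more appropriate one: pack first, then reverse the
       whole 32-bit sample that Crush sees */
    uint32_t sample_rev32 (void)
    {
       uint32_t hi = next16 ();
       uint32_t lo = next16 ();
       return rev32 ((hi << 16) | lo);
    }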
At this stage, I can't say why Seba's result is not matching anything I've done.
To improve the execution speed of the piped test suite, I've moved the bit-order reversal code out of the Crush testing and tacked it onto the end of the sampling code in the generator compile. So the summary report now looks completely generic in terms of generator configuration. E.g. here's a new run of Xoroshiro32++ [14 2 7 5], single iterated, sampling aperture 16>>0, bit-order reversed.
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: stdin32
Number of statistics: 144
Total CPU time: 00:22:38.14
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
Test p-value
----------------------------------------------
1 SerialOver, t = 2 1 - 1.3e-12
41 MaxOft, t = 5 0.9998
71 LinearComp, r = 0 0.9991
----------------------------------------------
All other tests were passed
Regarding a possible reason for not matching the report from Seba: his implementation may have had the ability for the test suite to set the generator seed at will. My implementation doesn't provide that option; each new test in the battery can only continue the generator sequence from where the previous test left off.
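The wrapper shape suggests why, as far as I can tell - the extern-bits interface takes just the one output callback, with nothing for the battery to call to re-seed between tests:

    /* one name, one "next 32 bits" callback; no seed hook, so each test
       picks up the stream exactly where the previous one stopped */
    unif01_Gen *gen = unif01_CreateExternGenBits ("stdin32", stdin32);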