The C language, for example, abstracts only the basic common ideas of number crunching, flow control, and memory into the language. All else is farmed out to functions. Which turns out to be great if you want an easily portable language.
If portability is not the goal, then anything goes. Just keep inventing keywords to do whatever.
Shouldn't that conversation be moved to the "New Spin" thread?
Spin1 has a lot of what would be considered library stuff built in, like BASIC usually does. So far, Chip is following that model for Spin2.
I like the idea of separating library stuff from the language, but we need a way that isn't cumbersome.
If library stuff is external to the language in the current model, then we need objects to contain them and calling them is object.whatever() instead of just whatever(). Then you need to reference the library object(s) in all your objects. It's kind of a slippery slope going down that route.
I'm not sure if it's a good idea to force the added complexity in this case?
Have you tested both for 2^128 iterations to check?
Of course, done it in two seconds. 2^128 in 0.1 seconds, 2^130 in 0.4 seconds, the remaining time was just me fluffing about.
Using PractRand 0.93, the randomness for smaller word sizes shows a massive drop off the moment the data stream exceeds the specified s0/s1 word size, which is RESULT_SIZE + 1. I've tested RESULT_SIZEs from 21 to 31 so far.
Extrapolating on this, I predict the original xoroshiro128+ will start failing above 2^64 bytes. This raises the question: is this the repeat point? If so, then where did the 2^128 figure come from?
Ah, just found an important statement from Heater's link - http://xoroshiro.di.unimi.it/ - It says "A long period does not imply high quality". So, the obvious conclusion to draw from this is the repeat period cannot be derived from testing failures.
Ah, just worked out how to make PractRand go below 26 bits usefully. There's a -tlmin option for starting its analysis at a smaller size ... made a calibration tweak to the above source.
Bugger, size 32 failed above 2^30 instead of the expected 2^33. That's not good.
Sizes 33, 34 and 35 aren't any better. Of note, the particular test PractRand is failing them on is called "DC6-9x1Bytes-1". Wonder what that is telling me ... other than that I don't know what I'm doing, that is ...
Hmm, everything over 32 bits is unhealthy, maybe a bug in my source code ...
Wow, this forum software is a bit screwy. I've only just now seen your post Heater even though it was long before I finished making my updates to the forum. That reference I made to your link was actually from way back on page 2 of this topic.
Part of the problem is there is no numbering of posts now. Another problem is the fuzzy datestamping of posts. One very quickly loses perspective on the spacing of posts.
Learnt a little bit of bash and made meself an auto looping script for dumping scores of every size:
#!/bin/bash
# Build the test program for every RESULT_SIZE from 8 to 63 and save each PractRand score.
for i in $(seq 8 63); do
    gcc -o xoroshiro128plus-test xoroshiro128plus-test.c -O3 -DRESULT_SIZE=${i}
    ./xoroshiro128plus-test | stdbuf -o L ./RNG_test stdin -a -tlmin 1KB >PRscore-size${i}.out
done
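The pipe in that script only works if the test program writes raw binary words to stdout for RNG_test to read on stdin. The C test program itself isn't shown here, so the following is just a rough Python sketch of that streaming side. The 64-bit word size, the little-endian byte order, and the names `stream_words` and `make_counter` are all assumptions made up for illustration, and the counter is a placeholder, not a random source.

```python
import struct
import sys

WORD_MASK = (1 << 64) - 1  # assumed 64-bit output words


def stream_words(next_word, count, out):
    """Write `count` words from next_word() to `out` as raw little-endian bytes."""
    for _ in range(count):
        out.write(struct.pack("<Q", next_word() & WORD_MASK))


def make_counter():
    """Placeholder generator (NOT random); stands in for the real PRNG step."""
    state = 0

    def next_word():
        nonlocal state
        state += 1
        return state

    return next_word


if __name__ == "__main__":
    # Binary stdout, so no text encoding or newline translation gets in the way.
    stream_words(make_counter(), 1024, sys.stdout.buffer)
```

Swapping `make_counter()` for a real generator step is all it would take to feed a statistical test suite the same way.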
Every now and then I get a little obsessed with random number generation and PRNGs. It's that little mystery over how do you know they are random enough? It all started a few decades ago when we needed to get some random seed for a secure, portable, military communications box. Turned out to be harder than I would have guessed.
The P2 will have awesome random numbers. Not up to cryptographic strength but very good.
And now Chip has got me tinkering with Verilog. I just want to run the actual verilog output through the statistical tests.
Then I'm going to start designing my own CPU....:)
I've been a little distracted ... For the above single threaded xoroshiro testing I'm getting double the speed - over my ageing Athlon64 - on half the total system power.
The Ryzen is very comfy, I'm liking it ... which is just as well because I've dropped somewhat more moola on a single chip than ever before. The basic overclocking multiplier is pretty cool now; it doesn't impact the dynamic clocking feature, so there is no noticeable increase in idle power. It'll make a cheap gaming option when the quad cores arrive.
Ha, I'm having a hard enough time making the xoroshiro work nicely in Verilog. A CPU is far away. My CPU would probably be a subleq machine.
Hmm...maybe I should actually try and get xoroshiro ticking on a real FPGA. The Icarus Verilog simulator is very slow. Now where is that DE-0 Nano?....
Yeah, Verilog started out as a simulation language and was later used for describing hardware to be synthesized. My understanding, anyway. I don't know the simulation side of Verilog, only using it for hardware description. An FPGA will run 1000's of times faster, if not millions of times faster.
That is how I understand Verilog's history as well. Same for VHDL, I think.
I've discovered things aren't stable on the Ryzen setup. It seemed okay until I was hammering it ... after a while the display would just go black. Needing a hard reset to reboot.
I did a search and found out Linux has a known issue with the Zen architecture. Supposedly Kernel 4.10.1 onwards fixes it.
I discovered I also can't even reliably download a newer iso using the Ryzen - md5sum failed its check. Now I'm wondering how much file corruption has occurred on my main install ... back to the old box for a while maybe ...
Newest Kubuntu daily release worked. Been hammering away for the past hour or so with no problems. Now the mouse wheel reverse direction setting is broken! Damn it!!!! I'd just got used to the Mac way too, it solved all the backwards zooming in newer 3D stuff.
I should try the earlier January beta release I guess. Actually, good test to do there - downloading another iso and checking its md5 ...
Comments
People here have been worrying about the correlation between COGs. My Poker game thing was just a way to put that into perspective. Who says?
I think we need this:
An implied library which is an object, but does not need the object.method() syntax, just the method() syntax. If any method uses an unknown keyword, the implied object's methods get checked for a name match. Any object that uses any of those methods will have that implied object included, which is just a 2-long cost. The top-level file can specify this implied object. That way, Spin can get extended without any tool changes.
-Phil
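To make the lookup order in that proposal concrete, here is a rough model of it in Python. This is not how any Spin tool actually works. The class `SpinObject`, the `implied` link and the `uses_implied` flag are hypothetical names made up for illustration. It only shows the idea: an object's own methods win, unknown names fall through to the implied library, and the library only gets pulled in for objects that actually use it.

```python
class SpinObject:
    """Toy model of an object with an optional implied library (hypothetical)."""

    def __init__(self, name, methods, implied=None):
        self.name = name
        self.methods = dict(methods)   # the object's own methods
        self.implied = implied         # implied library chosen by the top-level file
        self.uses_implied = False      # set once a library method is resolved

    def resolve(self, method_name):
        # 1. The object's own methods take priority.
        if method_name in self.methods:
            return self.methods[method_name]
        # 2. Unknown name: check the implied library for a match.
        if self.implied is not None and method_name in self.implied.methods:
            self.uses_implied = True   # only now does the library get included
            return self.implied.methods[method_name]
        raise NameError(f"{self.name}: unknown method '{method_name}'")


if __name__ == "__main__":
    library = SpinObject("implied_lib", {"sqrt": lambda x: x ** 0.5})
    top = SpinObject("top", {"start": lambda: "started"}, implied=library)
    print(top.resolve("start")())   # own method, called as plain start()
    print(top.resolve("sqrt")(9))   # found in the implied library, no object. prefix
    print(top.uses_implied)
```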
I'm sorry, but I find this hard to believe.
If you could do an iteration per clock (upper bound for one cog), with a 128MHz clock, you can do about 2^27 iterations per second, so it should run for 2^101 seconds, not 0.1 seconds. For any reasonable multiplier of parallelism and available clock speeds, it should take way too long to prove the sequence length that way.
-Phil
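For anyone wanting to check that estimate, here is a quick back-of-the-envelope sketch in Python. It uses only the figures from the post above (a 128 MHz clock, one iteration per clock, 2^128 iterations). The function name `brute_force_seconds` is made up for illustration.

```python
import math

CLOCK_HZ = 128_000_000                   # 128 MHz, one iteration per clock (upper bound)
PERIOD = 2 ** 128                        # claimed sequence length
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def brute_force_seconds(iterations, rate_hz):
    """Time in seconds to step through `iterations` states at `rate_hz` per second."""
    return iterations / rate_hz


if __name__ == "__main__":
    secs = brute_force_seconds(PERIOD, CLOCK_HZ)
    print(f"log2(iterations per second) = {math.log2(CLOCK_HZ):.2f}")
    print(f"log2(seconds needed)        = {math.log2(secs):.2f}")
    print(f"years needed                = {secs / SECONDS_PER_YEAR:.3e}")
```

Even a million-fold speed-up from parallelism would only take about 20 off that exponent, which is the point being made above.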
So it was iterated 130 times in 0.1 seconds and 132 times in 0.4 seconds.
Perhaps running on an old CP/M machine.
Right, here's where I've got to:
http://xoroshiro.di.unimi.it/
Also it's implied in the source code, in the comments on the jump() function:
http://xoroshiro.di.unimi.it/xoroshiro128plus.c
And from wikipedia (Which is always correct, right?) :
https://en.wikipedia.org/wiki/Xoroshiro128+
Correction: The period seems to be 2^128 - 1
But who's counting?
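The "- 1" comes from the all-zero state. The state update uses only XORs, shifts and rotates, so zero maps to zero and can never be part of the main cycle. Here is a minimal Python sketch of one step to show that. The constants 55, 14 and 36 are what I recall from the 2016 reference code and should be treated as an assumption; check them against the linked xoroshiro128plus.c. The name `next_state` is made up for illustration.

```python
MASK64 = (1 << 64) - 1


def rotl(x, k):
    """Rotate a 64-bit value left by k bits."""
    return ((x << k) | (x >> (64 - k))) & MASK64


def next_state(s0, s1):
    """One xoroshiro128+ step: returns (result, new_s0, new_s1)."""
    result = (s0 + s1) & MASK64                          # the "+" output function
    s1 ^= s0
    new_s0 = rotl(s0, 55) ^ s1 ^ ((s1 << 14) & MASK64)   # assumed constants a=55, b=14
    new_s1 = rotl(s1, 36)                                # assumed constant c=36
    return result, new_s0, new_s1


if __name__ == "__main__":
    # The all-zero state is a fixed point, so it has to be excluded from the period.
    print(next_state(0, 0))
    # Any non-zero seed moves on to a different state.
    print(next_state(1, 2))
```

That leaves at most 2^128 - 1 states for the cycle, which is also why the generator must never be seeded with all zeros.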
I'm running Chip's verilog version of xoroshiro128plus under the icarus verilog simulator. From here: http://forums.parallax.com/discussion/comment/1402326/#Comment_1402326
I added Chip's 63 bit to 16 times 32 bit shuffling code. From here: http://forums.parallax.com/discussion/comment/1402448/#Comment_1402448 Wrapped in a little module of its own.
All of this is wrapped in a little test bench which outputs the first and second 32 bit batches of bits from the shuffled rnd output. As may be delivered to COGs 0 and 1, for example. Like so: This is my first Verilog, so if anyone can say whether those results are correct or not, that would be great. The bytes may be in reverse order, but that is no concern.
This is my shuffle module: This is my test bench:
If this is crappy verilog please do say!
Personally, I never gave all this too much thought. There are a lot of great uses for good random numbers. IMHO, your efforts are worth it.
Don't bother with anything else but bytes for strings.
The great thing about a simulator like Icarus is that it is less than a second between hitting save on my newly edited module and having the thing running. The edit/debug cycle is very quick. This makes great training wheels for a Verilog newbie.
I have not tried for some years, but I recall that doing this with the Altera tools is thousands of times more complicated and takes forever to compile, synthesize and run.
Downside is of course speed. I have been pumping the output of my xoroshiro into the dieharder tests. It's been running all night and only the first test step has passed!
Next up is using Verilator. That compiles Verilog into C++ which then runs hundreds of times faster.
Or perhaps go straight to FPGA. I need to see some LEDs flashing
Can't get it though. Altera's web site is under maintenance...
Can't sign up or log in as their back end server is down.
Is it really so that Intel can't keep a web site on line?
Edit: Ha! I can't even complain on Altera's "How are we doing" link from their download page:
"We are sorry. We are unable to accept your feedback at this time."
Spoke too soon.