Random/LFSR on P2

Ahle2 · 2017-03-01 07:09

As I understood it, the P2 Will have a built in LFSR random generator. Is that right?
Which bits are tapped? How long is the cycle?

I would like to run it trough the Dieharder and Testu01 randomness suites to see how well it performs.

IMHO, it must be good enough to be "indistinguishable" (according to these tests) from a TRNG for it to be usable for anything other than fun and games. (a pure LFSR never succeeds these tests, there needs to be a little bit more logic than that)

Even better would be if it the internal states of the PRNG never propagates outside the HW in combination with some way of seeding. Just wishful thinking because I know that a CSPRNG is a complex matter. ☺

/Johannes

Ahle2 · 2017-03-01 07:22

Btw... Have a look at the fast and simple "xoroshiro" family of PRNG. It's simpler, faster and gives better result than the famous Mersenne Twister algorithm. The cycle is of course not anything near 2^19937−1 and it's not cryptogically safe. Other than that it runs with ease trough all randomness tests thrown at it and it can be used for anything other than fun and games.

Heater. · 2017-03-01 08:07

I brought up this issue a couple of days ago. The consensus seems to be that even a bad PRNG is good enough for simple games and robots so no improvement is required.

Ahle2 · 2017-03-01 08:52

@Heater
Such a coincidence! I have not been active in the P2 community for a long time and just came to think of the P2 and PRNG when I had the needs for a better PRNG than whats available in the std C++ library. To be honest I don't think a simple LFSR is good enough even for robot application. My experience with pure LFSR is that they are much less good than what most people think. A simple random image generated this way will reveal obvious patterns. Even a PRNG that is considered "useless" by the "PRNG community" still produce images that the eye can't distinguish from a TRNG. Xoroshiro is dead simple and would make the randomness soooo endlessly much better.

Have a look at this link for a VHDL implementation.
http://jorisvr.nl/article/random-numbers-VHDL

If the LFSR in the P2 is available somewhere I would be happy to produce random images and run it trough some test suites.

Ahle2 · 2017-03-01 09:27

Just to higlight the simplicity of the xoroshiro algorithm. Here is the implementation in C!

uint64_t next(void) {
	const uint64_t s0 = s[0];
	uint64_t s1 = s[1];
	const uint64_t result = s0 + s1;

	s1 ^= s0;
	s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14); // a, b
	s[1] = rotl(s1, 36); // c

	return result;
}

That's all it takes to fool even the best randomness tests!

Ahle2 · 2017-03-01 09:52

Another tought... would it be possible somehow to get entropy from heat/electrical noise etc generated internally in the P2.

Heater. · 2017-03-01 10:16

Ahle2,

Another tought... would it be possible somehow to get entropy from heat/electrical noise etc generated internally in the P2.

The P1 is already capable of generating real random numbers from the noise obtained from it's phase locked loops.
Have a search for the "real random" object. I'm sure it's in OBEX somewhere. http://forums.parallax.com/discussion/93061/real-random-number-generator-object

Downside is that it uses a whole COG to generate real random bits.

Late time I asked the P2 was also going to be capable of such real random generation.

evanh · 2017-03-01 10:29

From an analogue electronics angle a simple zenor diode can be used as a small random signal generator. This presumably is the typical hardware starting point of digital designs too. Has to be amplified before throwing the signal at the logic input but the humble diode does the job.

Heater. · 2017-03-01 10:55

Over Christmas 2011 I was getting a bit obsessive about random numbers and did some testing of the P1 PRNG. It did not perform well.

So I created a version of the JKISS32 PRNG in Spin.

http://forums.parallax.com/discussion/136975/jkiss32-high-quality-prng-with-realrandom-seed-save-a-cog

JKISS32 is a high quality PRNG with a period of 2^121, or 2.6*10^36, by David Jones. It's a version of George Marsaglia's KISS32 PRNG that needs no multiply, divide or mod. Like so.

/* Implementation of a 32-bit KISS generator which uses no multiply instructions */
static unsigned int x=123456789,y=234567891,z=345678912,w=456789123,c=0;
unsigned int JKISS32()
{
 int t;
 y ^= (y<<5); y ^= (y>>7); y ^= (y<<22);
 t = z+w+c; z = w; c = t < 0; w = t&2147483647;
 x += 1411392427;
 return x + y + w;
}

David Jones original paper here: http://www0.cs.ucl.ac.uk/staff/d.jones/GoodPracticeRNG.pdf

Note that the Diehard test suite was developed by G. Marsaglia. His orginal description of KISS is here: https://groups.google.com/forum/#!msg/comp.lang.fortran/5Bi8cFoYwPE/pSFU7NaK224J

These simple PRNG are wonderful because they are very small simple and fast, compared to the famous Mersenne Twister. They have massively long periods and do well in tests of randomness. They are not cryptographically strong though and should not be used for crypto.

Heater. · 2017-03-01 11:14

evanh,

From an analogue electronics angle a simple zenor diode can be used as a small random signal generator.

Funny you should say that. I have just been building an analogue noise generator form a reversed biased NPN transistor producing avalanche noise.

Like so: http://www.cryogenius.com/hardware/rng/

Works a treat. My home made discrete schmit trigger following it does not though, seems to be too slow to keep up.

Ahle2 · 2017-03-01 11:24

@Heater
I was not aware of the TRNG object from Chip. Sooo cool!

If we could dedicate one of the 16 cogs in the P2 to be a TRNG, that solves everything! I still think a random instruction in hardware should be better than this!

It's better to leave it out entirely if it's just a LFSR!

/Johannes

Heater. · 2017-03-01 11:45

What is that? Is that actually the output of the Propeller PRNG or some other?

ozpropdev · 2017-03-01 12:12

@Heater
That KISS algorithm is neat!
Just ran it on my P2 and it works great.

Ahle2 · 2017-03-01 12:14

Some other... But it shows why a LFSR sucks!

You can clearly see the shift operation in action vertically.

/Johannes

Heater. · 2017-03-01 12:23

That really hit me a couple of weeks ago when watching a Youtube video where someone had built an LFSR with LEDs attached to each bit and clocked it at 1Hz. Watching the LEDs scrolling around you immediately see there is no randomness there!

lonesock · 2017-03-01 16:26

I posted a RealRandom / JKISS32 hybrid at the bottom of this post: forums.parallax.com/discussion/153840/random-number-generator-no-extra-cog

There is a Spin-only version of realRandom, and there is a PASM engine that uses the RealRandom technique to generate random seed bits slowly, but continuously runs JKISS32 in the background, periodically updating the JKISS32 seed bits with the real random ones generated. All of that updates a long in Hub RAM as fast as possible. It's fast enough that you could do back-to-back assignments in Spin and get different random numbers, no wait needed.

Jonathan

cgracey · 2017-03-01 18:13

Here is the current Prop2 PRNG:

// sys rnd

reg [31:0] rndx;

`regscan (rndx, 32'b1,
1'b1,
  {rndx[30:00], rndx[31] ^ rndx[10] ^ rndx[04] ^ rndx[01]})

wire[63:0] rndy	= {2{	rndx[23], rndx[07], rndx[25], rndx[08], rndx[04], rndx[09], rndx[11], rndx[03],
			rndx[15], rndx[31], rndx[10], rndx[06], rndx[05], rndx[02], rndx[26], rndx[18],
			rndx[20], rndx[28], rndx[16], rndx[17], rndx[00], rndx[27], rndx[22], rndx[13],
			rndx[14], rndx[30], rndx[01], rndx[29], rndx[12], rndx[19], rndx[21], rndx[24]	}};

assign rnd	= {	rndy[31:00]	^ 32'h428A2F98,
			rndy[32:01]	^ 32'h71374491,
			rndy[33:02]	^ 32'hB5C0FBCF,
			rndy[34:03]	^ 32'hE9B5DBA5,
			rndy[35:04]	^ 32'h3956C25B,
			rndy[36:05]	^ 32'h59F111F1,
			rndy[37:06]	^ 32'h923F82A4,
			rndy[38:07]	^ 32'hAB1C5ED5,
			rndy[39:08]	^ 32'hD807AA98,
			rndy[40:09]	^ 32'h12835B01,
			rndy[41:10]	^ 32'h243185BE,
			rndy[42:11]	^ 32'h550C7DC3,
			rndy[43:12]	^ 32'h72BE5D74,
			rndy[44:13]	^ 32'h80DEB1FE,
			rndy[45:14]	^ 32'h9BDC06A7,
			rndy[46:15]	^ 32'hC19BF174 };

assign pin_rnd	=	rnd;

The 'regscan' part (rndx) is the 32-bit LFSR which gets seeded with 32'b1 on reset.

'rndy' reorders those bits for eventual output. This gets rid of the obvious shift pattern.

'rnd' takes offsets of those 32 reordered bits and xor's them with different patterns for final output to the 16 cogs.

'pin_rnd' is the same data as 'rnd', but gets used as 64 bytes, one per smart pin.

This all means that each cog gets a different view of the LFSR, as well as each smart pin. We could change to the "xoroshiro" method, no problem.

cgracey · 2017-03-01 18:18

Ahle2 wrote: »

@Heater
I was not aware of the TRNG object from Chip. Sooo cool! If we could dedicate one of the 16 cogs in the P2 to be a TRNG, that solves everything! I still think a random instruction in hardware should be better than this!

It's better to leave it out entirely if it's just a LFSR!

/Johannes

In the Prop2, a smart pin's ADC could serve as a source for a TRNG. Just put the pin into ADC calibration mode and observe the LSB of output.

cgracey · 2017-03-01 18:26

Ahle2 wrote: »
Just to higlight the simplicity of the xoroshiro algorithm. Here is the implementation in C!
uint64_t next(void) {
	const uint64_t s0 =0];
	uint64_t s1 = s[1];
	const uint64_t result = s0 + s1;

	s1 ^= s0;
	s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14); // a, b
	s[1] = rotl(s1, 36); // c

	return result;
}
That's all it takes to fool even the best randomness tests!

I don't get C. Is there some kind of initialization going on near the top?

David Betz · 2017-03-01 18:28

cgracey wrote: »
Ahle2 wrote: »
Just to higlight the simplicity of the xoroshiro algorithm. Here is the implementation in C!
uint64_t next(void) {
	const uint64_t s0 =0];
	uint64_t s1 = s[1];
	const uint64_t result = s0 + s1;

	s1 ^= s0;
	s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14); // a, b
	s[1] = rotl(s1, 36); // c

	return result;
}
That's all it takes to fool even the best randomness tests!
I don't get C. Is there some kind of initialization going on near the top?

Looks like you're missing the declaration of the "s" array and there is also a typo on the line defining "s0". I think it should read "s0 = s[0]".

cgracey · 2017-03-01 18:39

David Betz wrote: »
cgracey wrote: »
Ahle2 wrote: »
Just to higlight the simplicity of the xoroshiro algorithm. Here is the implementation in C!
uint64_t next(void) {
	const uint64_t s0 =0];
	uint64_t s1 = s[1];
	const uint64_t result = s0 + s1;

	s1 ^= s0;
	s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14); // a, b
	s[1] = rotl(s1, 36); // c

	return result;
}
That's all it takes to fool even the best randomness tests!
I don't get C. Is there some kind of initialization going on near the top?
Looks like you're missing the declaration of the "s" array and there is also a typo on the line defining "s0". I think it should read "s0 = s[0]".

That would make sense, but why is s0 assigned as 'const uint64_t' and s1 isn't?

Maybe it should look like this?

uint64_t next(void) {
	const uint64_t s0 = s[0];
	const uint64_t s1 = s[1];
	const uint64_t result = s0 + s1;

	s1 ^= s0;
	s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14); // a, b
	s[1] = rotl(s1, 36); // c

	return result;
}

Heater. · 2017-03-01 18:54

That code does not look any xoroshiro I can find through google. The ones I did find were a lot more complex.

xoroshiro is designed for 64 bit machines. Where as JKISS is designed for 32 bitters. If that is of any consequence.

David Betz · 2017-03-01 19:45

cgracey wrote: »
David Betz wrote: »
cgracey wrote: »
Ahle2 wrote: »
Just to higlight the simplicity of the xoroshiro algorithm. Here is the implementation in C!
uint64_t next(void) {
	const uint64_t s0 =0];
	uint64_t s1 = s[1];
	const uint64_t result = s0 + s1;

	s1 ^= s0;
	s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14); // a, b
	s[1] = rotl(s1, 36); // c

	return result;
}
That's all it takes to fool even the best randomness tests!
I don't get C. Is there some kind of initialization going on near the top?
Looks like you're missing the declaration of the "s" array and there is also a typo on the line defining "s0". I think it should read "s0 = s[0]".
That would make sense, but why is s0 assigned as 'const uint64_t' and s1 isn't?

Maybe it should look like this?
uint64_t next(void) {
	const uint64_t s0 = s[0];
	const uint64_t s1 = s[1];
	const uint64_t result = s0 + s1;

	s1 ^= s0;
	s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14); // a, b
	s[1] = rotl(s1, 36); // c

	return result;
}

I guess the const in the declaration of result is technically correct. The value doesn't change after it is first assigned. Maybe it results in better code generation?

evanh · 2017-03-01 21:46

I'm not sure how. Auto locals are easy for the optimiser I thought.

And the definition of s is missing. Presumably it's uint64_t s[2];

Roy Eltham · 2017-03-01 22:14

const mostly does nothing for modern C++ compilers. It's a hint at best.
It can be cast away anywhere in the codebase, so it's not a reliable thing for the optimizing compiler anyway.

Phil Pilgrim (PhiPi) · 2017-03-01 22:23

Ahle2 wrote:

[re:LFSR]You can clearly see the shift operation in action vertically.

That's really not a problem. You don't generate words one shift at a time. You have to repeat the process by at least the bit size of the word before extracting another one.

-Phil

Heater. · 2017-03-01 23:23

Is that what actually happens in the implementations "random" in Spin and in common language libraries?

They are notorious for being terrible PRNGs.

The graphic above may be exceptionally bad but I have seen the outputs of various common random functions plotted in various ways and patterns show up very clearly.

Sadly I can't find any examples just now.

cgracey · 2017-03-02 03:44

Here is that xoroshiro algorithm in PASM for the Prop123 board:

DAT	org

	bmask	dirb,#15	'make p32..p47 outputs

loop	call	#stepx		'step xoroshiro
	mov	outb,out+0	'output 16 LSBs to 16 LEDs
	waitx	##80_000_000/1	'wait 1/10 second
	jmp	#loop		'repeat


stepx	mov	out+0,s0+0	'out = s0 + s1
	mov	out+1,s0+1
	add	out+0,s1+0 wc
	addx	out+1,s1+1

	mov	s0_+0,s0+0	's0_ = s0
	mov	s0_+1,s0+1

	mov	s1_+0,s1+0	's1_ = s1
	mov	s1_+1,s1+1

	xor	s1_+0,s0_+0	's1_ ^= s0_
	xor	s1_+1,s0_+1

	rep	#3,#55		'rol s0,#55
	testb	s0+1,#31   wc
	rcl	s0+0,#1    wc
	rcl	s0+1,#1
	
	xor	s0+0,s1_+0	's0 ^= s1_
	xor	s0+1,s1_+1

	mov	t0_+0,s1_+0	't0_ = s1_
	mov	t0_+1,s1_+1

	rep	#2,#14		'shl t0_,#14
	shl	t0_+0,#1    wc
	rcl	t0_+1,#1
	
	xor	s0+0,t0_+0	's0 ^= t0_
	xor	s0+1,t0_+1

	rep	#3,#36		'rol s1_,#36
	testb	s1_+1,#31   wc
	rcl	s1_+0,#1    wc
	rcl	s1_+1,#1
	
	mov	s1+0,s1_+0	's1 = s1_
	mov	s1+1,s1_+1

	ret



s0	long	1,0	'seed s0 with 1
s1	long	0,0

s0_	res	2	'workspace
s1_	res	2
t0_	res	2

out	res	2

The rotates could be optimized, but this was simplest to think about.

It looks pretty random, to me.

If we can be sure this is correct, I'll update the PRNG in the hub to use this, instead.

I made a video of it:

Ahle2 · 2017-03-02 05:11

I will implement the old P2 LFSR, Jkiss , PCG (by Melissa O'Neill) and xoroshiro 128+. Then run the them through Testu01 and Dieharder suites. Stay tuned! (Melissa claims that her PRNG is the best available to this date. Her paper is under review as we speak)

Heater. · 2017-03-02 05:34

I think the JKISS32 PRNG would be a lot faster and smaller code than the xoroshiro algorithm.

JKISS32 is especially optimized for 32 bit machines.

Admittedly it's "quality" may be a bit less but it's pretty damn good anyway.

Original source code for xorooshiro128+ is here:
http://xoroshiro.di.unimi.it/
http://xoroshiro.di.unimi.it/xoroshiro128plus.c

NOTE: I notice in the source it says:

"The state must be seeded so that it is not everywhere zero."

xoroshiro can never generate a zero output.

Ahle2 · 2017-03-02 05:56

@Chip
Do you think the smart pin LSB will act as a good TRNG? Maybe smart pins in combination with a zener diode and a amplifier!?

@Evanh
Have you tried your zener TRNG with some RNG test suites? It would be cool to see how well it performs. I guess It would pass with Flying colors!

Random/LFSR on P2

Comments