Where? How? I have not seen it demonstrated anywhere yet. Gut says it should be so but it's only for sure true if you are permuting real random bits.

I know the result is 63 bits currently. Dropping the LSB. That's 63 "good bits" to distribute among 16 users of 32 bits at a time.

I'm impressed by your extended width PRNG. It passed tests well here.

I don't think it should be considered for use until you can demonstrate that its period is 2^128. Are you sure there is not some quirk in there that massively shortens the period ?

In short it's better to stick with known good solutions that others have analysed well.

I'm impressed by your extended width PRNG. It passed tests well here.

I don't think it should be considered for use until you can demonstrate that its period is 2^128. Are you sure there is not some quirk in there that massively shortens the period ?

It should be 2^130. The algorithm isn't any different, just has two more bits. To be honest, I'm not that convinced the original really has the full 2^128 period either.

@Heater, @evanh
Yes, it's true that my initial testings were just the 32 LSB's. Since then I have used all 64 bits if possible or different 32 bit taps from 63 bits (it's extremely slow due to all the bitwise operations). In all my testing so far I can't see any anomalies in PractRand or Dieharder.

I guess, that it's almost impossible to actually know whether it really is a cycle of 2^128 or not. I mean we are dealing with "randomness" here. So if the 128 bit state happens to be equal to something it already has been, it will loop. This might happen on cycle 2^128 or this might happen on cycle 2^127 or even 2^7. One could say that a PRNG with a 128 bit state have the potential for a 2^128 cycle.
For 32 bit LFSR's there are comprehensive lists for all taps that produce the full cycle of 2^32, but for a 2^128 we need to invent quantum computers to see if the cycle is anything near 2^128.

@all There seems to be a lot of confusion among people here, so I will try to straighten things up:

* The PRNG is a shared resource that all cogs can sample
* It generate one new 63 bit random number each clock cycle (or maybe 64 bits if evanh's code passes all tests)
* There is no mechanism to get truly independant random numbers BETWEEN cogs for each clocking of the PRNG
* This is NOT a problem for algorithms/code that depends on a single cog for its computations
* This is likely a big problem if you would do any type of parallel algorithms/code (Monte Carlo simulations for an example)
* Chips solution is to randomize different taps from the original 63 bits to each cog to increase "randomness" BETWEEN cogs
* If the PRNG is anything near good, all permutations of these taps would give good randomness
* One could get good indepedence if your parallel simulation code could be synched in a way that they don't sample the PRNG on the same clock

Since then I have used all 64 bits if possible or different 32 bit taps from 63 bits (it's extremely slow due to all the bitwise operations). In all my testing so far I can't see any anomalies in PractRand or Dieharder.

I too have done my PractRand and Diearder testing on 63 bits (Packed tightly into 64 bit chunks) and on 32 bit subsets. All pass. What I did not do yet is to select bits as Chip's shuffle does. If you have tried those and they pass the tests then we are good to go.

I guess, that it's almost impossible to actually know whether it really is a cycle of 2^128 or not.

I also wonder about that. But the guy who devised it claims it to be so. Further it is stated that it can generate 2^64 non-overlapping sequences of length 2^64.

I presume that there is a way of logically analyzing such a generator and deducing it's sequence length. It's not just hand waving and guessing. I'm banking on the fact that nobody has stepped up and pointed out why the sequence might not be as long as claimed.

I guess somewhere there is a paper that mathematically proves the sequence length.

This is why I'm loath to adopt any PRNG anyone makes up that has not been subject to analysis.

There seems to be a lot of confusion among people here, so I will try to straighten things up:

* The PRNG is a hub resource that all cogs share

Yes, there is.

As far as I can make out the PRNG is not a HUB resource. That implies to me that the GETRND instruction would be a HUBOP and have a longer execution time. A variable execution time as it contends with the other COGS.

I got the idea that the PRNG was shared like pins are shared with INA/OUTA, the instruction is not a HUBOP and it executes in the same number of cycles as other "normal" instructions.

Or did all that change in the P2 and I missed the memo?

The testing has done a bang on job of telling us when we messed up. To me, the 130-bit and 128-bit are the same algorithm. Just like we can increase and decrease the size of the summing to non-matching sizes without issue, we can also increase and decrease the accumulator size.

The only restriction seems to be there can't be overlap of bits used from the accumulator into the summing or result.

Doesn't the good test results count for anything? It would seem to me they point to a very non-repeating sequence. And they are the same algorithms after all.

What I did not do you yet is to select bits as Chip's shuffle does. If you have tried those and they pass the tests then we are good to go.

We have proven it's makes no difference the selection size nor bit order. The only requirement is no duplication of source accumulator bits in each result.

Doesn't the good test results count for anything? It would seem to me they point to a very non-repeating sequence. And they are the same algorithms after all.

The tests count for a lot. They may point to something however they do not say anything about the length of the sequence. We cannot possibly run the the thing for over 2^12 iterations to find out!

I look at it like searching for prime numbers. We can start testing integers from 2 upwards and find more and more primes. Popping up apparently at random as it happens. We might speculate that there is an infinite number of primes. We could not be sure though, not by just iterating our search forever. No, it took Euclid to provide a logical argument that proves there is an infinite number of primes.

The papers linked from the xorshiro page http://xoroshiro.di.unimi.it/ discuss the proofs of the sequence lengths of these kind of generators. I don't understand much of it but get the impression that any small change can seriously mess them up!

We have proven it's makes no difference the selection size nor bit order. The only requirement is no duplication of source accumulator bits in each result.

I see no proof. Someone has tested it and it worked OK. I'm happy to accept the notion though.

16 different subsets of 32 of the 63 bits get read by 16 COGS. Some bits inverted, some not.

The COGS know their COG ID and the wiring of the shuffler. They know who gets which bits.

So, 15 of them could probably conspire, swap their view of the PRNG among each other. And with all that information they would know what random number the 16th COG must have got.

BUSTED!

Except of course as the PRNG is free running 16 COGs are not likely to get a view of the same number. Or are they....

I guess this is of no concern unless someone is writing poker simulation where 15 COGs "cheat" and can hence determine the hand of the other poor COG

Downside of this fascinating thread is that now I find myself trying to get to grips with Verilog all day long. I'm sure the boss would rather I did something else...

I just realized that with good random numbers being part of the Prop2, we could have standard functions that do things like shuffle bytes, words, and longs.

Having good randoms available does open some new doors.

You were thinking of putting that in a library or OBEX object, right? There's nothing about those functions that seems to require that they be built into a language.

Having good randoms available does open some new doors.

You were thinking of putting that in a library or OBEX object, right? There's nothing about those functions that seems to require that they be built into a language.

With a good RNG, one pass through and everything is thoroughly shuffled. I like the idea of putting it into Spin. It's kind of like a string function.

Spin1 has a lot of what would be consider library stuff built in, like BASIC usually does. So far, Chip is following that model for Spin2.

I like the idea of separating library stuff from the language, but we need a way that isn't cumbersome.

If library stuff is external to the language in the current model, then we need objects to contain them and calling them is object.whatever() instead of just whatever(). Then you need to reference the library object(s) in all your objects. It's kind of a slippery slope going down that route.

I'm not sure if it's a good idea to force the added complexity in this case?

So, 15 of them could probably conspire, swap their view of the PRNG among each other. And with all that information they would know what random number the 16th COG must have got.

BUSTED!

Cogs have no need to be secured from each other. Propeller isn't targetting general computing.

We have proven it's makes no difference the selection size nor bit order. The only requirement is no duplication of source accumulator bits in each result.

I see no proof. Someone has tested it and it worked OK. I'm happy to accept the notion though.

Works as good as the original. That's proof enough for me.

The C language, for example, abstracts only the basic common ideas of number crunching, flow control, and memory into the language. All else is farmed out to functions. Which turns out to be great if you want an easily portable language.

If portability is not the goal, then anything goes. Just keep inventing keywords to do whatever.

## Comments

21,233I know the result is 63 bits currently. Dropping the LSB. That's 63 "good bits" to distribute among 16 users of 32 bits at a time.

I'm impressed by your extended width PRNG. It passed tests well here.

I don't think it should be considered for use until you can demonstrate that its period is 2^128. Are you sure there is not some quirk in there that massively shortens the period ?

In short it's better to stick with known good solutions that others have analysed well.

11,635Since then, I've regularly been running truncated results without issue.

11,6351,165Yes, it's true that my initial testings were just the 32 LSB's. Since then I have used all 64 bits if possible or different 32 bit taps from 63 bits (it's extremely slow due to all the bitwise operations). In all my testing so far I can't see any anomalies in PractRand or Dieharder.

I guess, that it's almost impossible to actually know whether it really is a cycle of 2^128 or not. I mean we are dealing with "randomness" here. So if the 128 bit state happens to be equal to something it already has been, it will loop. This might happen on cycle 2^128 or this might happen on cycle 2^127 or even 2^7. One could say that a PRNG with a 128 bit state have the potential for a 2^128 cycle.

For 32 bit LFSR's there are comprehensive lists for all taps that produce the full cycle of 2^32, but for a 2^128 we need to invent quantum computers to see if the cycle is anything near 2^128.

@all

There seems to be a lot of confusion among people here, so I will try to straighten things up:

* The PRNG is a shared resource that all cogs can sample

* It generate one new 63 bit random number each clock cycle (or maybe 64 bits if evanh's code passes all tests)

* There is no mechanism to get truly independant random numbers

BETWEENcogs for each clocking of the PRNG* This is

NOTa problem for algorithms/code that depends on a single cog for its computations* This is likely a big problem if you would do any type of parallel algorithms/code (Monte Carlo simulations for an example)

* Chips solution is to randomize different taps from the original 63 bits to each cog to increase "randomness"

BETWEENcogs* If the PRNG is anything near good, all permutations of these taps would give good randomness

* One could get good indepedence if your parallel simulation code could be synched in a way that they don't sample the PRNG on the same clock

/Johannes

11,63521,233I presume that there is a way of logically analyzing such a generator and deducing it's sequence length. It's not just hand waving and guessing. I'm banking on the fact that nobody has stepped up and pointed out why the sequence might not be as long as claimed.

I guess somewhere there is a paper that mathematically proves the sequence length.

This is why I'm loath to adopt any PRNG anyone makes up that has not been subject to analysis. Yes, there is.

As far as I can make out the PRNG is not a HUB resource. That implies to me that the GETRND instruction would be a HUBOP and have a longer execution time. A variable execution time as it contends with the other COGS.

I got the idea that the PRNG was shared like pins are shared with INA/OUTA, the instruction is not a HUBOP and it executes in the same number of cycles as other "normal" instructions.

Or did all that change in the P2 and I missed the memo?

1,165The right way is to call it a shared resource then, not a hub resource!?

21,233But then I have somewhat lost touch with the P2 architecture and the terminology people use for it.

Anyway, the main point is that there is only one PRNG. As you say.

11,635The only restriction seems to be there can't be overlap of bits used from the accumulator into the summing or result.

21,233It's up to you to demonstrate, logically, mathematically, that the sequence length is what you suspect it is.

Sadly I have no idea how people do that.

11,63511,63521,233I look at it like searching for prime numbers. We can start testing integers from 2 upwards and find more and more primes. Popping up apparently at random as it happens. We might speculate that there is an infinite number of primes. We could not be sure though, not by just iterating our search forever. No, it took Euclid to provide a logical argument that proves there is an infinite number of primes.

The papers linked from the xorshiro page http://xoroshiro.di.unimi.it/ discuss the proofs of the sequence lengths of these kind of generators. I don't understand much of it but get the impression that any small change can seriously mess them up!

21,23310,195Thanks for helping me understand this better.

21,233The PRNG generates a 63 bit number.

16 different subsets of 32 of the 63 bits get read by 16 COGS. Some bits inverted, some not.

The COGS know their COG ID and the wiring of the shuffler. They know who gets which bits.

So, 15 of them could probably conspire, swap their view of the PRNG among each other. And with all that information they would know what random number the 16th COG must have got.

BUSTED!

Except of course as the PRNG is free running 16 COGs are not likely to get a view of the same number. Or are they....

I guess this is of no concern unless someone is writing poker simulation where 15 COGs "cheat" and can hence determine the hand of the other poor COG

21,23310,19513,61721,233I hope you mean in software.

We want silicon. Now!

13,617I was thinking...

SHUFBYTES(Address, NumberOfBytes)

SHUFWORDS(Address, NumberOfWords)

SHUFLONGS(Address, NumberOfLongs)

...in software, of course.

Having good randoms available does open some new doors.

21,233Often times people don't want a random 32 bit integer. They want a random real between zero and one.

Reading around it seems that converting a random LONG to a uniform real is fraught with bias issues.

Not that I'm sure what the CORDIC thing actually does...sorry to say.

5,021You were thinking of putting that in a library or OBEX object, right? There's nothing about those functions that seems to require that they be built into a language.

13,617With a good RNG, one pass through and everything is thoroughly shuffled. I like the idea of putting it into Spin. It's kind of like a string function.

21,233I think a classic example of this is in Python. In the original Python the "print" statement was a keyword of the language. You could write: Like we did in good old BASIC.

In Python 3 print is now not part of the language. It's a function like any other: I'm with Eric on these things. They are not language features, they should be library features.

Kind of like string handling features is most languages. Handled by functions. Not the syntax and semantics of the language itself.

2,994I like the idea of separating library stuff from the language, but we need a way that isn't cumbersome.

If library stuff is external to the language in the current model, then we need objects to contain them and calling them is object.whatever() instead of just whatever(). Then you need to reference the library object(s) in all your objects. It's kind of a slippery slope going down that route.

I'm not sure if it's a good idea to force the added complexity in this case?

11,63511,63521,233The C language, for example, abstracts only the basic common ideas of number crunching, flow control, and memory into the language. All else is farmed out to functions. Which turns out to be great if you want an easily portable language.

If portability is not the goal, then anything goes. Just keep inventing keywords to do whatever.

11,635