Shop OBEX P1 Docs P2 Docs Learn Events
Random/LFSR on P2 - Page 11 — Parallax Forums

Random/LFSR on P2

18911131492

Comments

  • Heater.Heater. Posts: 21,230
    evanh,
    Thoroughly proven, yes.
    Where? How? I have not seen it demonstrated anywhere yet. Gut says it should be so but it's only for sure true if you are permuting real random bits.

    I know the result is 63 bits currently. Dropping the LSB. That's 63 "good bits" to distribute among 16 users of 32 bits at a time.

    I'm impressed by your extended width PRNG. It passed tests well here.

    I don't think it should be considered for use until you can demonstrate that its period is 2^128. Are you sure there is not some quirk in there that massively shortens the period ?

    In short it's better to stick with known good solutions that others have analysed well.
  • evanhevanh Posts: 15,915
    Heater. wrote: »
    Where? How? I have not seen it demonstrated anywhere yet.
    All of Johannes's early testing was truncated to 32 bits. That's why he was getting clean passes before we'd got our heads around testing the 63-bit results. - http://forums.parallax.com/discussion/comment/1402711/#Comment_1402711

    Since then, I've regularly been running truncated results without issue.
  • evanhevanh Posts: 15,915
    edited 2017-03-09 11:00
    Heater. wrote: »
    I'm impressed by your extended width PRNG. It passed tests well here.

    I don't think it should be considered for use until you can demonstrate that its period is 2^128. Are you sure there is not some quirk in there that massively shortens the period ?
    It should be 2^130. The algorithm isn't any different, just has two more bits. To be honest, I'm not that convinced the original really has the full 2^128 period either.
  • Ahle2Ahle2 Posts: 1,179
    edited 2017-03-09 12:57
    @Heater, @evanh
    Yes, it's true that my initial testings were just the 32 LSB's. Since then I have used all 64 bits if possible or different 32 bit taps from 63 bits (it's extremely slow due to all the bitwise operations). In all my testing so far I can't see any anomalies in PractRand or Dieharder.

    I guess, that it's almost impossible to actually know whether it really is a cycle of 2^128 or not. I mean we are dealing with "randomness" here. So if the 128 bit state happens to be equal to something it already has been, it will loop. This might happen on cycle 2^128 or this might happen on cycle 2^127 or even 2^7. One could say that a PRNG with a 128 bit state have the potential for a 2^128 cycle.
    For 32 bit LFSR's there are comprehensive lists for all taps that produce the full cycle of 2^32, but for a 2^128 we need to invent quantum computers to see if the cycle is anything near 2^128.

    @all
    There seems to be a lot of confusion among people here, so I will try to straighten things up:

    * The PRNG is a shared resource that all cogs can sample
    * It generate one new 63 bit random number each clock cycle (or maybe 64 bits if evanh's code passes all tests)
    * There is no mechanism to get truly independant random numbers BETWEEN cogs for each clocking of the PRNG
    * This is NOT a problem for algorithms/code that depends on a single cog for its computations
    * This is likely a big problem if you would do any type of parallel algorithms/code (Monte Carlo simulations for an example)
    * Chips solution is to randomize different taps from the original 63 bits to each cog to increase "randomness" BETWEEN cogs
    * If the PRNG is anything near good, all permutations of these taps would give good randomness
    * One could get good indepedence if your parallel simulation code could be synched in a way that they don't sample the PRNG on the same clock

    /Johannes

  • evanhevanh Posts: 15,915
    Ahle2 wrote: »
    * One could get good indepedence if your parallel simulation code could be synched in a way that they don't sample the PRNG on the same clock
    I suppose that is a point. Personally, I'd not want to sacrifice the fast instruction for it though.
  • Heater.Heater. Posts: 21,230
    edited 2017-03-09 14:29
    Ahle2,
    Since then I have used all 64 bits if possible or different 32 bit taps from 63 bits (it's extremely slow due to all the bitwise operations). In all my testing so far I can't see any anomalies in PractRand or Dieharder.
    I too have done my PractRand and Diearder testing on 63 bits (Packed tightly into 64 bit chunks) and on 32 bit subsets. All pass. What I did not do yet is to select bits as Chip's shuffle does. If you have tried those and they pass the tests then we are good to go.
    I guess, that it's almost impossible to actually know whether it really is a cycle of 2^128 or not.
    I also wonder about that. But the guy who devised it claims it to be so. Further it is stated that it can generate 2^64 non-overlapping sequences of length 2^64.

    I presume that there is a way of logically analyzing such a generator and deducing it's sequence length. It's not just hand waving and guessing. I'm banking on the fact that nobody has stepped up and pointed out why the sequence might not be as long as claimed.

    I guess somewhere there is a paper that mathematically proves the sequence length.

    This is why I'm loath to adopt any PRNG anyone makes up that has not been subject to analysis.
    There seems to be a lot of confusion among people here, so I will try to straighten things up:

    * The PRNG is a hub resource that all cogs share
    Yes, there is.

    As far as I can make out the PRNG is not a HUB resource. That implies to me that the GETRND instruction would be a HUBOP and have a longer execution time. A variable execution time as it contends with the other COGS.

    I got the idea that the PRNG was shared like pins are shared with INA/OUTA, the instruction is not a HUBOP and it executes in the same number of cycles as other "normal" instructions.

    Or did all that change in the P2 and I missed the memo?
  • Ahle2Ahle2 Posts: 1,179
    edited 2017-03-09 12:53
    So, it seems like I'm a little bit confused myself! ;)
    The right way is to call it a shared resource then, not a hub resource!?
  • Heater.Heater. Posts: 21,230
    edited 2017-03-09 13:04
    Ahle2,
    The right way is to call it a shared resource then, not a hub resource!?
    That would be my guess.

    But then I have somewhat lost touch with the P2 architecture and the terminology people use for it.

    Anyway, the main point is that there is only one PRNG. As you say.
  • evanhevanh Posts: 15,915
    The testing has done a bang on job of telling us when we messed up. To me, the 130-bit and 128-bit are the same algorithm. Just like we can increase and decrease the size of the summing to non-matching sizes without issue, we can also increase and decrease the accumulator size.

    The only restriction seems to be there can't be overlap of bits used from the accumulator into the summing or result.
  • Heater.Heater. Posts: 21,230
    evanh,
    To me, the 130-bit and 128-bit are the same algorithm.
    Clearly they are not equivalent algorithms. If they were they would produce the same output. Which I presume they do not.

    It's up to you to demonstrate, logically, mathematically, that the sequence length is what you suspect it is.

    Sadly I have no idea how people do that.

  • evanhevanh Posts: 15,915
    Doesn't the good test results count for anything? It would seem to me they point to a very non-repeating sequence. And they are the same algorithms after all.
  • evanhevanh Posts: 15,915
    Heater. wrote: »
    What I did not do you yet is to select bits as Chip's shuffle does. If you have tried those and they pass the tests then we are good to go.
    We have proven it's makes no difference the selection size nor bit order. The only requirement is no duplication of source accumulator bits in each result.
  • Heater.Heater. Posts: 21,230
    evanh,
    Doesn't the good test results count for anything? It would seem to me they point to a very non-repeating sequence. And they are the same algorithms after all.
    The tests count for a lot. They may point to something however they do not say anything about the length of the sequence. We cannot possibly run the the thing for over 2^12 iterations to find out!

    I look at it like searching for prime numbers. We can start testing integers from 2 upwards and find more and more primes. Popping up apparently at random as it happens. We might speculate that there is an infinite number of primes. We could not be sure though, not by just iterating our search forever. No, it took Euclid to provide a logical argument that proves there is an infinite number of primes.

    The papers linked from the xorshiro page http://xoroshiro.di.unimi.it/ discuss the proofs of the sequence lengths of these kind of generators. I don't understand much of it but get the impression that any small change can seriously mess them up!

  • Heater.Heater. Posts: 21,230
    evanh,
    We have proven it's makes no difference the selection size nor bit order. The only requirement is no duplication of source accumulator bits in each result.
    I see no proof. Someone has tested it and it worked OK. I'm happy to accept the notion though.
  • No worries. Saw it. I'm good. The tradeoffs and use cases are clear and make good sense.

    Thanks for helping me understand this better.
  • Heater.Heater. Posts: 21,230
    Here is an interesting thing....

    The PRNG generates a 63 bit number.

    16 different subsets of 32 of the 63 bits get read by 16 COGS. Some bits inverted, some not.

    The COGS know their COG ID and the wiring of the shuffler. They know who gets which bits.

    So, 15 of them could probably conspire, swap their view of the PRNG among each other. And with all that information they would know what random number the 16th COG must have got.

    BUSTED!

    Except of course as the PRNG is free running 16 COGs are not likely to get a view of the same number. Or are they....

    I guess this is of no concern unless someone is writing poker simulation where 15 COGs "cheat" and can hence determine the hand of the other poor COG :(
  • Heater.Heater. Posts: 21,230
    Downside of this fascinating thread is that now I find myself trying to get to grips with Verilog all day long. I'm sure the boss would rather I did something else...
  • Ongoing professional development. Works in a pinch.
  • cgraceycgracey Posts: 14,152
    I just realized that with good random numbers being part of the Prop2, we could have standard functions that do things like shuffle bytes, words, and longs.
  • Heater.Heater. Posts: 21,230
    We could.

    I hope you mean in software.

    We want silicon. Now!

  • cgraceycgracey Posts: 14,152
    Heater. wrote: »
    We could.

    I hope you mean in software.

    We want silicon. Now!

    I was thinking...

    SHUFBYTES(Address, NumberOfBytes)
    SHUFWORDS(Address, NumberOfWords)
    SHUFLONGS(Address, NumberOfLongs)

    ...in software, of course.

    Having good randoms available does open some new doors.
  • Heater.Heater. Posts: 21,230
    Actually I was wondering how this RND thing plays with CORDIC.

    Often times people don't want a random 32 bit integer. They want a random real between zero and one.

    Reading around it seems that converting a random LONG to a uniform real is fraught with bias issues.

    Not that I'm sure what the CORDIC thing actually does...sorry to say.
  • cgracey wrote: »
    I was thinking...

    SHUFBYTES(Address, NumberOfBytes)
    SHUFWORDS(Address, NumberOfWords)
    SHUFLONGS(Address, NumberOfLongs)

    ...in software, of course.

    Having good randoms available does open some new doors.

    You were thinking of putting that in a library or OBEX object, right? There's nothing about those functions that seems to require that they be built into a language.
  • cgraceycgracey Posts: 14,152
    ersmith wrote: »
    cgracey wrote: »
    I was thinking...

    SHUFBYTES(Address, NumberOfBytes)
    SHUFWORDS(Address, NumberOfWords)
    SHUFLONGS(Address, NumberOfLongs)

    ...in software, of course.

    Having good randoms available does open some new doors.

    You were thinking of putting that in a library or OBEX object, right? There's nothing about those functions that seems to require that they be built into a language.

    With a good RNG, one pass through and everything is thoroughly shuffled. I like the idea of putting it into Spin. It's kind of like a string function.
  • Heater.Heater. Posts: 21,230
    edited 2017-03-10 02:15
    That there is some, perhaps philosophical, difference of opinion.

    I think a classic example of this is in Python. In the original Python the "print" statement was a keyword of the language. You could write:
    print "Hello World"
    
    Like we did in good old BASIC.

    In Python 3 print is now not part of the language. It's a function like any other:
    print ("Hello World")
    
    I'm with Eric on these things. They are not language features, they should be library features.

    Kind of like string handling features is most languages. Handled by functions. Not the syntax and semantics of the language itself.
  • Spin1 has a lot of what would be consider library stuff built in, like BASIC usually does. So far, Chip is following that model for Spin2.

    I like the idea of separating library stuff from the language, but we need a way that isn't cumbersome.

    If library stuff is external to the language in the current model, then we need objects to contain them and calling them is object.whatever() instead of just whatever(). Then you need to reference the library object(s) in all your objects. It's kind of a slippery slope going down that route.

    I'm not sure if it's a good idea to force the added complexity in this case?
  • evanhevanh Posts: 15,915
    Heater. wrote: »
    So, 15 of them could probably conspire, swap their view of the PRNG among each other. And with all that information they would know what random number the 16th COG must have got.

    BUSTED!
    Cogs have no need to be secured from each other. Propeller isn't targetting general computing.

  • evanhevanh Posts: 15,915
    Heater. wrote: »
    evanh,
    We have proven it's makes no difference the selection size nor bit order. The only requirement is no duplication of source accumulator bits in each result.
    I see no proof. Someone has tested it and it worked OK. I'm happy to accept the notion though.
    Works as good as the original. That's proof enough for me.
  • Heater.Heater. Posts: 21,230
    Yeah, that's why I said "philosophical".

    The C language, for example, abstracts only the basic common ideas of number crunching, flow control, and memory into the language. All else is farmed out to functions. Which turns out to be great if you want an easily portable language.

    If portability is not the goal, then anything goes. Just keep inventing keywords to do whatever.


  • evanhevanh Posts: 15,915
    I'll do a little more testing this weekend ... see if I can prove anything on small accumulator size that won't take so long to repeat.
Sign In or Register to comment.