Random/LFSR on P2 - Page 27 — Parallax Forums

Random/LFSR on P2


Comments

  • TonyB_ Posts: 2,178
    edited 2017-10-27 16:07
    Thanks, Evan.
  • cgracey Posts: 14,152
    edited 2017-10-27 15:56
    What if you took the parity of the sum[16:1] and XOR'd it into the sum LSB, then used sum[15:0]?
  • evanh Posts: 15,915
    edited 2017-10-27 16:09
    cgracey wrote: »
    What if you took the parity of the sum[16:1] and XOR'd it into the sum LSB, then used sum[15:0]?

    Wow! That's perfect. I'm watching the log file at the moment and there are already 512M scores for word0 ...

    EDIT: A detail, I've actually parity'd all bits, sum[16:0]
    EDIT2: Doh! I see that is the same as yours. :)
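Why the two descriptions coincide: XORing the parity of sum[16:1] into sum[0] leaves bit 0 equal to the XOR of all 17 sum bits, which is exactly the parity of sum[16:0]. A minimal C sketch of both wordings follows, assuming 16-bit state halves; the function names and the sparse sweep are illustrative assumptions, not code from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* XOR of the low `nbits` bits of x (1 if an odd number of them are set). */
static unsigned parity_bits(uint32_t x, int nbits)
{
    assert(nbits >= 0 && nbits <= 32);
    unsigned p = 0;
    for (int i = 0; i < nbits; i++)
        p ^= (x >> i) & 1u;
    return p;
}

/* First wording: XOR the parity of sum[16:1] into sum[0], keep sum[15:0]. */
static uint16_t lsb_xor_in(uint16_t s0, uint16_t s1)
{
    uint32_t sum = (uint32_t)s0 + s1;        /* 17-bit sum, bit 16 is the carry */
    unsigned p = parity_bits(sum >> 1, 16);  /* parity of sum[16:1] */
    return (uint16_t)((sum ^ p) & 0xFFFFu);
}

/* Second wording: replace sum[0] with the parity of sum[16:0]. */
static uint16_t lsb_replace(uint16_t s0, uint16_t s1)
{
    uint32_t sum = (uint32_t)s0 + s1;
    unsigned p = parity_bits(sum, 17);       /* parity of sum[16:0] */
    return (uint16_t)(((sum & ~1u) | p) & 0xFFFFu);
}

/* Count disagreements between the two wordings over a sparse sweep of inputs. */
static unsigned long count_mismatches(void)
{
    unsigned long bad = 0;
    for (uint32_t s0 = 0; s0 < 0x10000u; s0 += 257)
        for (uint32_t s1 = 0; s1 < 0x10000u; s1 += 251)
            if (lsb_xor_in((uint16_t)s0, (uint16_t)s1) !=
                lsb_replace((uint16_t)s0, (uint16_t)s1))
                bad++;
    return bad;
}
```

Calling count_mismatches() should return 0, since the two functions compute the same bit by construction.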
  • cgracey Posts: 14,152
    We just need to maximize the randomosity.
  • So using the sum carry is useful after all?
  • evanh Posts: 15,915
    cgracey wrote: »
    We just need to maximize the randomosity.

    Totally, I was picking up on that when Melissa was describing using the high quality bits to dynamically do rotates.
  • cgracey Posts: 14,152
    edited 2017-10-27 16:13
    evanh wrote: »
    cgracey wrote: »
    What if you took the parity of the sum[16:1] and XOR'd it into the sum LSB, then used sum[15:0]?

    Wow! That's perfect. I'm watching the log file at the moment and there are already 512M scores for word0 ...

    EDIT: A detail, I've actually parity'd all bits, sum[16:0]
    EDIT2: Doh! I see that is the same as yours. :)

    If it winds up looking good, then do two iterations and feed the result words, concatenated together as a long, into your analyzer. Actually, that's effectively what is happening, right?
  • evanh Posts: 15,915
    I'm salivating now ...
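    	int  shift;
    	uint64_t  result = s0 + s1;    // full sum, carry kept above the output bits
    	uint64_t  parity = result;
    
    	// Fold result bits 1..ACCUM_SIZE into bit 0 of parity, so that
    	// bit 0 ends up as the XOR of sum[ACCUM_SIZE:0] (sum[16:0] here).
    	for( shift=1; shift <= ACCUM_SIZE; shift++ )  {
    		parity = parity ^ (result >> shift);
    	}
    
    	// Replace sum bit 0 with that parity bit, then mask to the output width.
    	result = ((result & -2LL) | (parity & 1LL)) & ACCUM_MASK;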
    

    And [14 2 7] is still out front :)
      Xoroshiro32p+ PractRand Score Table - Run 2017-10-28 05:11:16 +1300
    
    Combination    Word0   Word1   Word2  msByte  Byte04   Byte2   Byte1   msBit
    =============================================================================
     [ 1  2  8]      4M     32M     64M     16M     64M     64M     32M      1G
     [ 2  1 11]      8M      8M      8M    128K    128K    128K    256K      1G
     [ 2  1 15]     64K    128K    128K     32K     32K     64K     32K      1G
     [ 2  2  7]      4M      8M      8M     32M     32M     32M      4M      1G
     [ 2  6 15]      1M      2M      4M     16M    128K    128K    128K      1G
     [ 2  8  7]      4M      8M     16M    128M     32M     32M      4M      1G
     [ 2  9  9]    256K    256K    256K     16K    128K    512K     64M      1G
     [ 2 11  3]      2M      2M      4M    512K    256K    256K    256K      1G
     [ 3  2  6]     32M     64M     32M      2M      2M      2M      1M      1G
     [ 3  3 10]     16M     16M     16M    256K    256K    512K     64M      1G
     [ 3 11  2]    256K    256K    512K    128K     64K     64K     64K      1G
     [ 3 11 14]      8M      4M      4M    256K    256K    512K    512K      1G
     [ 4  1  9]      1M    512K      1M    512K      8M    256K    512K    512M
     [ 4  7 15]     16M     64M     64M    128M    512K    512K    512K      1G
     [ 4  8  5]     64M    512M    128M     32M      8M      4M      2M      1G
     [ 5  2  6]     64M    128M    128M      8M      4M      4M      4M      1G
     [ 5  8  4]      1M      4M     16M      4M      1M    256K    128K     64M
     [ 5 14 12]    256M    128M     64M      2M      4M     16M    128M      1G
     [ 6  2  3]     16M     32M     32M      1M      1M      1M      2M      2M
     [ 6  2  5]     16M     16M     16M    256K    512K    512K    512K      4M
     [ 6  2 11]    256M    512M    256M      8M      8M    256M    128M      1G
     [ 6  3 15]     32M     32M     32M      1M      1M      2M      4M      1G
     [ 6 14 11]    128M    128M    128M     32M     64M    128M     16M      1G
     [ 7  1  8]      1M     64M    128M     16M    128M     16M     16M      1G
     [ 7  2  2]      8M     16M     16M      2M      2M      4M      4M      4M
     [ 7  2 10]    512M      1G    512M    128M    256M     64M     32M      1G
     [ 7  2 14]    512M    512M    256M      4M      8M     32M     32M    256M
     [ 7  8  2]    128M    512M    512M     64M      8M      4M      4M    512M
     [ 7 10 10]    128M    128M    256M     32M     32M    256M    128M      1G
     [ 7 15  8]      1M      2M      2M    256K      1M    512K    256K      1G
     [ 8  1  7]      1M      1M      4M      1M      1M      1M      1M     64M
     [ 8  2  1]      4M     32M     64M      8M      8M     16M     16M     16M
     [ 8  5 13]     32M     32M     64M     16M      8M    128M    128M      1G
     [ 8  7 15]      1M      2M      2M    512K      2M      2M      1M      1G
     [ 8  9 13]    256M    512M    128M    512M      1G    512M    128M      1G
     [ 8 15  7]    512K    512K      1M    256K    512K    512K    256K      1G
     [ 9  1  4]     16M     32M     64M      8M     16M     16M     16M     64M
     [ 9  2 14]    512M      2G      1G    128M    256M     64M    128M      1G
     [ 9  8 14]      8M     16M     32M    256M     32M     32M     64M      1G
     [ 9  9  2]     32M     32M     64M     16M     16M     16M     32M     64M
     [10  2  7]    128M    256M    128M     16M     16M     64M     32M      1G
     [10  3  3]     32M     64M     64M     32M     32M     32M     64M    256M
     [10  3 11]    512M      2G      1G    256M    256M    512M    128M      1G
     [10  5 13]    512M      2G    512M      1G    512M    256M    128M    512M
     [10  7 11]    512M    512M    128M    512M    128M    512M    128M      1G
     [10 10  7]     16M     16M     16M     16M     16M    512M    128M    512M
     [11  1  2]    128M    256M    256M     32M     32M     32M     32M      1G
     [11  1 14]    512M      1G      1G     32M     16M     16M     16M      1G
     [11  2  6]    512M    512M    512M     32M     64M    512M    128M    512M
     [11  3 10]     16M     32M     32M     64M      1M    128K     64K    128M
     [11  7 10]      2M      4M      8M     32M    128K    128K     64K      1G
     [11  8 12]    128M    256M    256M    512M    512M    512M    128M      1G
     [11 10 12]     32M    128M    128M      1G    512M    512M    128M      1G
     [11 11 14]     32M     64M    128M     16M     16M     16M      8M      1G
     [11 14  6]    128M    256M    128M    256M     64M    512M    128M      1G
     [12  4 15]    256M    512M      1G    128M     64M     16M     16M      1G
     [12  8 11]      2M      4M      4M     64M    128K    128K     64K      1G
     [12 10 11]      2M      2M      2M    128K    128K    128K     64K      1G
     [12 10 13]     64M     32M     64M     16M     16M     32M      8M      1G
     [12 14  5]    128M     32M     32M    128M    128M     64M     64M      1G
     [13  3 14]    128M    512M    512M     64M     32M     16M      8M      1G
     [13  5  8]    128M    256M    256M    512M    128M    256M    128M      1G
     [13  5 10]    256M     64M     32M     64M     16M    256K    256K      1G
     [13  9  8]    128M    128M     64M    512M    512M    512M    128M      1G
     [13 10 12]      2M      2M      2M    128K    128K    128K     64K    512M
     [13 11 14]     16M     16M     16M      8M      8M      8M      8M      1G
     [13 12 14]     16M     16M     16M      8M      8M      4M      4M      1G
     [13 13 14]      8M     16M     16M      8M      8M      8M      8M      1G
     [14  1 11]     32M     64M     32M    256K    256K    256K    256K      1G
     [14  2  7]    512M      1G    512M      1G    512M    512M    128M      1G
     [14  2  9]    512M    512M    256M    128M    128M     64M    128M      1G
     [14  3 13]      4M     16M     16M      1M    256K     64K     64K      1G
     [14  8  9]    512M    128M     64M      1G     64M     64M     64M      1G
     [14 11  3]    512M    512M    512M    256M    128M    128M     64M      1G
     [14 11 11]     16M     16M      8M    256K    256K    128K    256K      1G
     [14 11 13]    512K    512K    512K    128K     64K     64K     64K      1G
     [14 12 13]    256K    512K    256K    128K     64K     64K     64K      1G
     [14 13 13]    128K    128K    256K     64K     64K    128K     64K    256M
     [15  1  2]      2M      2M      2M      1M    512K    512K    256K      1G
     [15  3  6]    512M      1G    512M    512M    512M    256M    128M      1G
     [15  4 12]     64M    128M     64M    128M      8M    256K    256K      1G
     [15  6  2]     64M    128M    256M    128M    128M     32M     16M      1G
     [15  7  4]    256M    512M    128M    512M    512M    256M    128M      1G
     [15  7  8]     16M     64M     64M     32M     32M      4M      2M    512M
    
  • evanh Posts: 15,915
    cgracey wrote: »
    If it winds up looking good, then do two iterations and feed the result words, concatenated together as a long, into your analyzer. Actually, that's effectively what is happening, right?

    Yep, everything is one long unending string to Practrand.
  • cgracey Posts: 14,152
    edited 2017-10-27 16:20
    Cool! So, double iterations should give us two high-quality 16-bit results.
  • Here is a result summary as I see it:
                    Xoroshiro32+ [14,2,7] sum bits
    [15:0]  [15:1]  [15:2]  [15:8]  [11:4]   [9:2]   [8:1]    [15]
    =========================================================================================
     512K      1G    512M      1G    512M    512M    128M      1G   sum[0] = s0[0] + s1[0]
     512K      1G    512M      1G    512M    512M    128M      1G   sum[0] = [s0,s1] parity
      16M      1G    512M      1G    512M    512M    128M      1G   sum[0] = sum[15:0] parity
      16K      1G    512M      1G    512M    512M    128M      1G   sum[0] = sum[15:1] parity
     512M      1G    512M      1G    512M    512M    128M      1G   sum[0] = sum[16:0] parity
    
  • I wonder whether sum[0] is more random than sum[1] now. Something for the weekend, perhaps?
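The five rows of the summary above differ only in where output bit 0 comes from. A minimal C sketch of those variants follows, assuming 16-bit state halves s0 and s1 and reading "[s0,s1] parity" as the parity of all 32 state bits. That reading, the enum, and the function names are assumptions made for illustration, not code from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Where output bit 0 is taken from; one entry per row of the summary. */
enum lsb_rule {
    LSB_PLAIN,         /* sum[0] = s0[0] + s1[0] (the ordinary sum bit) */
    LSB_STATE_PARITY,  /* sum[0] = parity of all 32 state bits (assumed reading) */
    LSB_SUM15_0,       /* sum[0] = parity of sum[15:0] */
    LSB_SUM15_1,       /* sum[0] = parity of sum[15:1] */
    LSB_SUM16_0        /* sum[0] = parity of sum[16:0], carry included */
};

/* Parity of a 32-bit value by repeated folding; returns 0 or 1. */
static unsigned parity32(uint32_t x)
{
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 1u;
}

/* 16-bit output: sum[15:1] unchanged, bit 0 chosen by `rule`. */
static uint16_t output16(uint16_t s0, uint16_t s1, enum lsb_rule rule)
{
    uint32_t sum = (uint32_t)s0 + s1;   /* 17-bit sum, bit 16 is the carry */
    unsigned lsb = 0;

    switch (rule) {
    case LSB_PLAIN:        lsb = sum & 1u;                             break;
    case LSB_STATE_PARITY: lsb = parity32(((uint32_t)s1 << 16) | s0);  break;
    case LSB_SUM15_0:      lsb = parity32(sum & 0xFFFFu);              break;
    case LSB_SUM15_1:      lsb = parity32(sum & 0xFFFEu);              break;
    case LSB_SUM16_0:      lsb = parity32(sum & 0x1FFFFu);             break;
    default:               assert(0 && "unknown lsb_rule");            break;
    }
    return (uint16_t)((sum & 0xFFFEu) | lsb);
}
```

Only LSB_SUM16_0 mixes the carry bit into the result, and only LSB_SUM15_1 leaves the original sum[0] out of the new bit 0 entirely.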
  • evanh Posts: 15,915
    TonyB_ wrote: »
    Here is a result summary as I see it:
    I've forgotten how the first two lines go there but, yep, the bottom three lines register as correct to me.
  • cgracey Posts: 14,152
    Okay. I will see about enhancing the XORO32 instruction.

    This would address TonyB_'s longstanding desire to get good long results.
  • evanh Posts: 15,915
    TonyB_ wrote: »
    Something for the weekend, perhaps?
    I can see blue sky for Saturday morning already here.
  • evanh Posts: 15,915
    Question: Any chance that this 17 bit parity equation is the same as the corrected sign (16-bit version obviously) from the Z/C flags?
  • TonyB_ Posts: 2,178
    edited 2017-10-28 20:08
    We need an extra test now: sum[7:0].

    I think sum[0] = sum[15:1] parity is so bad because sum[0] is no longer a function of itself!
  • evanh wrote: »
    Question: Any chance that this 17 bit parity equation is the same as the corrected sign (16-bit version obviously) from the Z/C flags?

    That's either the best question ever or you really do need to get some sleep!

  • cgracey wrote: »
    If XORO32 can be used to produce a high-quality 16-bit result per iteration, then I could enhance the instruction to do TWO iterations, get 16-bit sums from each, and substitute the high-quality 32-bit PRNG result into the next instruction's S value. Meanwhile, the double iteration would be written back to the original D in the XORO32 instruction.

    So, do a XORO32 iteration and store the 32-bit state in S. Do another XORO32 iteration, then sum the two states as two parallel 16-bit adds, with the parity jiggery-pokery, to give a 32-bit PRN in D. Is that correct and is there time to do all this?

    I'm all for XORO32 doing the summing itself, if possible. We'd need a XORO32 write for the seed and a XORO32 read for the PRN.
  • evanh Posts: 15,915
    TonyB_ wrote: »
    We need an extra test now, sum[7:0].

    I think sum[0] = sum[15:1] parity is so bad because sum[0] is no longer a function of itself!
    Okay, I'm struggling to see how those two sentences relate to one another but I've run that and not too surprisingly the byte0 scores are a pretty close match to the word0 scores.

    However, that table, like many other score tables, shows confusing column differences that one would not expect, given that the columns share the same poor-quality bits.

    So, either the idea of bit positions being of a given quality for a given calibration is wrong - I doubt it - or PractRand has some glaring deficiencies in detecting problems with certain bit positions.

    One thought was maybe it was byte oriented in some fashion given that's how I feed it the data stream. So I swapped Xoroshiro32+p result bit 9 with result bit 1. Both bits are moved by a whole byte. That shouldn't have any impact on PractRand, right? Guess what, it has quite a decent impact ...
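The bit exchange described above can be sketched in C as follows. The helper name and its use on the 16-bit result are assumptions for illustration; the thread does not show the code actually used for the sum[9] <-> sum[1] run.

```c
#include <assert.h>
#include <stdint.h>

/* Exchange bit positions i and j of a 16-bit word; all other bits are untouched. */
static uint16_t swap_bits16(uint16_t x, unsigned i, unsigned j)
{
    assert(i < 16 && j < 16);
    unsigned diff = ((unsigned)(x >> i) ^ (unsigned)(x >> j)) & 1u; /* 1 if the bits differ */
    return (uint16_t)(x ^ (diff << i) ^ (diff << j));               /* flip both only then */
}
```

For the experiment above, each 16-bit result would go through swap_bits16(result, 9, 1) before being handed to PractRand. Applying the swap twice restores the original word, so no information is lost; only the bit positions seen by the tests change.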
  • evanh Posts: 15,915
    Here's what you asked for sans the msBit scores:
      Xoroshiro32+p PractRand Score Table - Run 2017-10-28 06:10:51 +1300
    
    Combination    Word0   Word1   Word2  msByte  Byte04   Byte2   Byte1   Byte0
    =============================================================================
     [ 1  2  8]      4M     32M     64M     16M     64M     64M     32M     32M
     [ 2  1 11]      8M      8M      8M    128K    128K    128K    256K    512K
     [ 2  1 15]     64K    128K    128K     32K     32K     64K     32K     64K
     [ 2  2  7]      4M      8M      8M     32M     32M     32M      4M     32M
     [ 2  6 15]      1M      2M      4M     16M    128K    128K    128K    256K
     [ 2  8  7]      4M      8M     16M    128M     32M     32M      4M    128M
     [ 2  9  9]    256K    256K    256K     16K    128K    512K     64M    256M
     [ 2 11  3]      2M      2M      4M    512K    256K    256K    256K    256K
     [ 3  2  6]     32M     64M     32M      2M      2M      2M      1M      4M
     [ 3  3 10]     16M     16M     16M    256K    256K    512K     64M    256M
     [ 3 11  2]    256K    256K    512K    128K     64K     64K     64K     64K
     [ 3 11 14]      8M      4M      4M    256K    256K    512K    512K      1M
     [ 4  1  9]      1M    512K      1M    512K      8M    256K    512K    512K
     [ 4  7 15]     16M     64M     64M    128M    512K    512K    512K      1M
     [ 4  8  5]     64M    512M    128M     32M      8M      4M      2M      8M
     [ 5  2  6]     64M    128M    128M      8M      4M      4M      4M     16M
     [ 5  8  4]      1M      4M     16M      4M      1M    256K    128K    128K
     [ 5 14 12]    256M    128M     64M      2M      4M     16M    128M      1G
     [ 6  2  3]     16M     32M     32M      1M      1M      1M      2M      4M
     [ 6  2  5]     16M     16M     16M    256K    512K    512K    512K      1M
     [ 6  2 11]    256M    512M    256M      8M      8M    256M    128M    512M
     [ 6  3 15]     32M     32M     32M      1M      1M      2M      4M      8M
     [ 6 14 11]    128M    128M    128M     32M     64M    128M     16M    128M
     [ 7  1  8]      1M     64M    128M     16M    128M     16M     16M     32M
     [ 7  2  2]      8M     16M     16M      2M      2M      4M      4M      8M
     [ 7  2 10]    512M      1G    512M    128M    256M     64M     32M    128M
     [ 7  2 14]    512M    512M    256M      4M      8M     32M     32M      1G
     [ 7  8  2]    128M    512M    512M     64M      8M      4M      4M     16M
     [ 7 10 10]    128M    128M    256M     32M     32M    256M    128M    512M
     [ 7 15  8]      1M      2M      2M    256K      1M    512K    256K    256K
     [ 8  1  7]      1M      1M      4M      1M      1M      1M      1M     32M
     [ 8  2  1]      4M     32M     64M      8M      8M     16M     16M     16M
     [ 8  5 13]     32M     32M     64M     16M      8M    128M    128M    128M
     [ 8  7 15]      1M      2M      2M    512K      2M      2M      1M      1M
     [ 8  9 13]    256M    512M    128M    512M      1G    512M    128M      1G
     [ 8 15  7]    512K    512K      1M    256K    512K    512K    256K    256K
     [ 9  1  4]     16M     32M     64M      8M     16M     16M     16M     16M
     [ 9  2 14]    512M      2G      1G    128M    256M     64M    128M    512M
     [ 9  8 14]      8M     16M     32M    256M     32M     32M     64M    128M
     [ 9  9  2]     32M     32M     64M     16M     16M     16M     32M    128M
     [10  2  7]    128M    256M    128M     16M     16M     64M     32M     64M
     [10  3  3]     32M     64M     64M     32M     32M     32M     64M    512M
     [10  3 11]    512M      2G      1G    256M    256M    512M    128M    512M
     [10  5 13]    512M      2G    512M      1G    512M    256M    128M    512M
     [10  7 11]    512M    512M    128M    512M    128M    512M    128M      1G
     [10 10  7]     16M     16M     16M     16M     16M    512M    128M    512M
     [11  1  2]    128M    256M    256M     32M     32M     32M     32M    128M
     [11  1 14]    512M      1G      1G     32M     16M     16M     16M    512M
     [11  2  6]    512M    512M    512M     32M     64M    512M    128M      1G
     [11  3 10]     16M     32M     32M     64M      1M    128K     64K     64K
     [11  7 10]      2M      4M      8M     32M    128K    128K     64K     64K
     [11  8 12]    128M    256M    256M    512M    512M    512M    128M    512M
     [11 10 12]     32M    128M    128M      1G    512M    512M    128M    512M
     [11 11 14]     32M     64M    128M     16M     16M     16M      8M    512M
     [11 14  6]    128M    256M    128M    256M     64M    512M    128M    512M
     [12  4 15]    256M    512M      1G    128M     64M     16M     16M     32M
     [12  8 11]      2M      4M      4M     64M    128K    128K     64K    128K
     [12 10 11]      2M      2M      2M    128K    128K    128K     64K    128K
     [12 10 13]     64M     32M     64M     16M     16M     32M      8M    512M
     [12 14  5]    128M     32M     32M    128M    128M     64M     64M    128M
     [13  3 14]    128M    512M    512M     64M     32M     16M      8M     16M
     [13  5  8]    128M    256M    256M    512M    128M    256M    128M    128M
     [13  5 10]    256M     64M     32M     64M     16M    256K    256K      2M
     [13  9  8]    128M    128M     64M    512M    512M    512M    128M      1G
     [13 10 12]      2M      2M      2M    128K    128K    128K     64K    128K
     [13 11 14]     16M     16M     16M      8M      8M      8M      8M     16M
     [13 12 14]     16M     16M     16M      8M      8M      4M      4M     16M
     [13 13 14]      8M     16M     16M      8M      8M      8M      8M     16M
     [14  1 11]     32M     64M     32M    256K    256K    256K    256K    512K
     [14  2  7]    512M      1G    512M      1G    512M    512M    128M    512M
     [14  2  9]    512M    512M    256M    128M    128M     64M    128M    256M
     [14  3 13]      4M     16M     16M      1M    256K     64K     64K    128K
     [14  8  9]    512M    128M     64M      1G     64M     64M     64M      1G
     [14 11  3]    512M    512M    512M    256M    128M    128M     64M    256M
     [14 11 11]     16M     16M      8M    256K    256K    128K    256K    512K
     [14 11 13]    512K    512K    512K    128K     64K     64K     64K    128K
     [14 12 13]    256K    512K    256K    128K     64K     64K     64K    128K
     [14 13 13]    128K    128K    256K     64K     64K    128K     64K    128K
     [15  1  2]      2M      2M      2M      1M    512K    512K    256K      1M
     [15  3  6]    512M      1G    512M    512M    512M    256M    128M      1G
     [15  4 12]     64M    128M     64M    128M      8M    256K    256K    512K
     [15  6  2]     64M    128M    256M    128M    128M     32M     16M     16M
     [15  7  4]    256M    512M    128M    512M    512M    256M    128M     64M
     [15  7  8]     16M     64M     64M     32M     32M      4M      2M      2M
    

    And here's what we get with sum[9] <-> sum[1]:
      Xoroshiro32+p PractRand Score Table - Run 2017-10-28 06:47:33 +1300
    
    Combination    Word0   Word1   Word2  msByte  Byte04   Byte2   Byte1   Byte0   msBit
    =====================================================================================
     [ 1  2  8]     64M     16M     32M     64M     64M     32M     64M    128M      1G
     [ 2  1 11]      8M     16M      8M    128K    256K    128K    256K      1M      1G
     [ 2  1 15]    128K    128K    256K    128K    128K     32K     64K    128K      1G
     [ 2  2  7]      8M      8M      8M    256K      4M      4M    512K    512K      1G
     [ 2  6 15]      2M      4M      2M     32M    128K    128K    256K    256K      1G
     [ 2  8  7]      8M     16M     16M    256K      4M      4M    256K    512K      1G
     [ 2  9  9]    256K    512K    512K     32K    256K     64M    256K    256K      1G
     [ 2 11  3]      4M      8M      4M    512K    512K    256K    128K    128K      1G
     [ 3  2  6]     32M     64M     32M     16M    512K      1M    128K    512K      1G
     [ 3  3 10]     16M     32M     32M      1M    512K     64M      1M      4M      1G
     [ 3 11  2]    512K    512K    512K    128K    128K     64K     64K    128K      1G
     [ 3 11 14]      8M     16M      8M    512K    256K    256K    512K      2M      1G
     [ 4  1  9]      1M    512K      1M    512K     16M    512K    256K    256K    512M
     [ 4  7 15]     32M     32M     64M    128M    256K    512K      1M      2M      1G
     [ 4  8  5]    128M    512M     64M     32M      1M      2M      4M     16M      1G
     [ 5  2  6]    128M    256M     64M      8M      2M      4M    512K      2M      1G
     [ 5  8  4]      2M      8M      8M    128K    128K    128K    256K    256K     64M
     [ 5 14 12]    256M    512M    256M      8M     16M    512M     32M    128M      1G
     [ 6  2  3]     32M     32M     32M      1M      1M      1M    128K    512K      2M
     [ 6  2  5]      8M     16M     16M    128K    512K    512K    128K    256K      4M
     [ 6  2 11]    256M      1G    256M      8M     16M    256M    128M    512M      1G
     [ 6  3 15]     64M     64M     64M      2M      1M      2M      4M      8M      1G
     [ 6 14 11]    128M    256M    128M     32M    128M     16M    128M      1G      1G
     [ 7  1  8]    128M    256M    128M     32M    256M     16M     16M     64M      1G
     [ 7  2  2]     16M     32M     32M      4M      2M      4M    512K      4M      4M
     [ 7  2 10]    256M      1G      1G    128M     64M     32M     64M    512M      1G
     [ 7  2 14]    256M      1G    256M      8M     32M    512M     32M    256M    256M
     [ 7  8  2]     64M      1G    256M     64M      4M      4M      8M     16M    512M
     [ 7 10 10]    128M    256M    256M     64M     64M    256M    256M    512M      1G
     [ 7 15  8]      1M      2M      2M    512K      2M    256K    512K    512K      1G
     [ 8  1  7]      1M      2M      4M    128K    256K      1M     64K    256K     64M
     [ 8  2  1]     64M     32M    128M     16M     16M     16M     16M     32M     16M
     [ 8  5 13]     32M    128M     64M     32M     16M    128M    128M      1G      1G
     [ 8  7 15]      2M      4M      2M      2M      8M      1M      2M      2M      1G
     [ 8  9 13]    256M    512M    128M    512M    256M    128M    512M      1G      1G
     [ 8 15  7]      1M      1M      1M    128K    256K    256K     64K    256K      1G
     [ 9  1  4]     16M     32M     64M      8M     16M     16M      8M     32M     64M
     [ 9  2 14]    256M      2G    512M    128M    256M     64M    128M    256M      1G
     [ 9  8 14]     16M     16M     32M    256M     64M     32M     64M    256M      1G
     [ 9  9  2]     32M     64M     64M     32M     32M     32M     32M    128M     64M
     [10  2  7]     64M    256M    128M     16M      8M     32M     64M    256M      1G
     [10  3  3]     32M    128M    128M     16M     32M     64M     32M    128M    256M
     [10  3 11]    256M      2G    256M    128M    128M    512M    512M      2G      1G
     [10  5 13]    512M      2G    512M      1G    256M    256M    512M      1G    512M
     [10  7 11]     64M    512M    256M    128M     64M    512M    512M      2G      1G
     [10 10  7]     32M     32M     16M     16M      8M    256M     64M    512M    512M
     [11  1  2]    256M    512M    512M     16M     64M     16M     64M    128M      1G
     [11  1 14]    512M      2G      2G     64M    256M    512M     32M    128M      1G
     [11  2  6]    512M      2G    512M     64M     64M    512M    256M      2G    512M
     [11  3 10]     16M     16M     16M      2M    256K     64K    256K      1M    128M
     [11  7 10]      4M      8M      8M    256K    128K     64K    256K      1M      1G
     [11  8 12]     64M    512M    512M    256M    512M    512M    512M    512M      1G
     [11 10 12]     64M    128M    128M    512K     64M    512M    512M      1G      1G
     [11 11 14]     64M    128M    256M     16M    256M    128M     32M     64M      1G
     [11 14  6]    256M    512M    128M      1G     32M    128M    256M      1G      1G
     [12  4 15]    512M      1G      1G    256M    128M     16M     32M     64M      1G
     [12  8 11]      1M      4M      8M      8M    128K     64K    256K    512K      1G
     [12 10 11]      4M      4M      4M      1M    128K     64K    256K    512K      1G
     [12 10 13]     64M    128M    256M     16M     16M    128M     64M    128M      1G
     [12 14  5]    256M    128M     32M    128M     64M     64M     64M    256M      1G
     [13  3 14]    256M    512M    512M    256M     32M      8M     32M     64M      1G
     [13  5  8]    256M    512M    256M    512M    512M    512M    256M    512M      1G
     [13  5 10]     32M    128M     32M     16M      8M    128K    512K      1M      1G
     [13  9  8]    128M    256M    128M      1G    512M    512M    512M      1G      1G
     [13 10 12]      4M      4M      4M    512K    128K     64K    128K    256K    512M
     [13 11 14]     16M     32M     32M     16M     16M      8M      8M     32M      1G
     [13 12 14]     16M     32M     32M    512K     16M      4M      4M     64M      1G
     [13 13 14]     16M     32M     32M     16M     16M      8M      8M     32M      1G
     [14  1 11]     32M     64M     32M      1M    256K    256K    512K      2M      1G
     [14  2  7]    512M      2G    256M    256M     16M    512M    256M    512M      1G
     [14  2  9]    128M    512M    256M     64M    128M    256M    128M    128M      1G
     [14  3 13]     16M     32M     16M      1M    256K     64K    128K    256K      1G
     [14  8  9]    128M    256M    128M      1G     16M    256M     64M     64M      1G
     [14 11  3]    512M      1G      2G    256M    128M     64M    128M    256M      1G
     [14 11 11]     16M     32M     16M    512K    256K    128K    512K      1M      1G
     [14 11 13]      1M      1M      1M    256K     64K     64K    128K    128K      1G
     [14 12 13]    512K    512K    512K    256K     64K     64K    128K    128K      1G
     [14 13 13]    512K    256K    512K    128K     64K     64K    128K    128K    256M
     [15  1  2]      4M      4M      4M      2M      8M    256K    512K      4M      1G
     [15  3  6]    512M      2G    512M    512M    512M    512M    512M      1G      1G
     [15  4 12]     64M    128M     32M    256M      8M    256K    512K      1M      1G
     [15  6  2]    128M    256M    128M      1G     64M     16M     32M     32M      1G
     [15  7  4]    512M      2G    128M    512M    256M    128M    512M    512M      1G
     [15  7  8]     32M    128M     64M    128M     32M      2M      4M      4M    512M
    
  • evanh Posts: 15,915
    Note triplet [14 2 7] sucks now. However, triplet [15 3 6] looks fabulous.
  • cgracey Posts: 14,152
    edited 2017-10-27 18:16
    TonyB_ wrote: »
    cgracey wrote: »
    If XORO32 can be used to produce a high-quality 16-bit result per iteration, then I could enhance the instruction to do TWO iterations, get 16-bit sums from each, and substitute the high-quality 32-bit PRNG result into the next instruction's S value. Meanwhile, the double iteration would be written back to the original D in the XORO32 instruction.

    So, do a XORO32 iteration and store the 32-bit state in S. Do another XORO32 iteration, then sum the two states as two parallel 16-bit adds, with the parity jiggery-pokery, to give a 32-bit PRN in D. Is that correct and is there time to do all this?

    I'm all for XORO32 doing the summing itself, if possible. We'd need a XORO32 write for the seed and a XORO32 read for the PRN.

    It's all in one!

    Because the iteration is just a bunch of XOR's, we could do a double iteration at once, grabbing results from each to perform the separate 16-bit adds and parity computations, in order to compute the compound PRNG output for the next instruction's S value. The 'XORO32 D' will write the double-iterated value (which is separate from the PRNG output) back into D, which is just a register. To seed the PRNG, just put a non-0 value into the register being used.

    XORO32 D 'iterate D and put 32-bit PRNG result into next instruction's S

    Example:
            XORO32  rnd             'iterate rnd
            MOV     outb,0          'write XORO32 PRNG result to outb
    
  • Amazing! So the 0 in MOV could be anything, and interrupts are disabled between XORO32 and MOV?
  • cgracey Posts: 14,152
    edited 2017-10-27 18:22
    TonyB_ wrote: »
    Amazing! So the 0 in MOV could be anything, and interrupts are disabled between XORO32 and MOV?

    Yes, XORO32 would stave off an interrupt just like SETQ (SETQ+RDLONG) does. The next instruction would wind up with the PRNG result for its S value. This is what you've been working towards, but didn't realize how simple it would turn out. Me neither. This just uses any cog register. Easy peasy.
  • cgracey Posts: 14,152
    edited 2017-10-27 20:32
    Here is the Verilog to do this:
    wire [15:0] xoro32z	= d[31:16] ^ d[15:0];			// first iteration
    wire [31:0] xoro32y	= {xoro32z[8:0], xoro32z[15:9],
    			  {d[1:0], d[15:2]} ^
    			  {xoro32z[13:0], 2'b0} ^ xoro32z};
    
    wire [15:0] xoro32x	= xoro32y[31:16] ^ xoro32y[15:0];	// second iteration
    wire [31:0] xoro32	= {xoro32x[8:0], xoro32x[15:9],		// xoro32 = d result
    			  {xoro32y[1:0], xoro32y[15:2]} ^
    			  {xoro32x[13:0], 2'b0} ^ xoro32x};
    
    wire [16:0] xoro32a	= xoro32y[31:16] + xoro32y[15:0];	// first iteration sum
    wire [16:0] xoro32b	= xoro32[31:16] + xoro32[15:0];		// second iteration sum
    
    wire [31:0] xoro32r	= {xoro32b[15:1], ^xoro32b,		// xoro32r = prng result
    			   xoro32a[15:1], ^xoro32a};
    

    EDIT: I changed the xoro32r output so that the 1st iteration result is in the lower word and the 2nd iteration result is in the upper word.
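For checking software results against the hardware, the Verilog above can be mirrored in C. This is a sketch under assumptions: the struct and function names are made up for illustration, the state is taken as {s1, s0} with s0 in the low word, and the constants [14,2,7] are read off the wiring.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 16-bit word left by k bits (0 < k < 16). */
static uint16_t rotl16(uint16_t x, unsigned k)
{
    assert(k > 0 && k < 16);
    return (uint16_t)(((uint32_t)x << k) | ((uint32_t)x >> (16 - k)));
}

/* One state iteration with [14,2,7]; mirrors xoro32z -> xoro32y in the Verilog. */
static uint32_t xoro32_step(uint32_t d)
{
    uint16_t s0 = (uint16_t)d;                 /* d[15:0]  */
    uint16_t s1 = (uint16_t)(d >> 16);         /* d[31:16] */
    uint16_t z  = (uint16_t)(s1 ^ s0);
    uint16_t n0 = (uint16_t)(rotl16(s0, 14) ^ z ^ (uint16_t)(z << 2));
    uint16_t n1 = rotl16(z, 7);
    return ((uint32_t)n1 << 16) | n0;
}

/* 16-bit sum of the two state halves, with bit 0 replaced by the parity
   of the 17-bit sum; mirrors {sum[15:1], ^sum}. */
static uint16_t sum_with_parity(uint32_t state)
{
    uint32_t sum = (state >> 16) + (state & 0xFFFFu);
    uint32_t p = sum;
    p ^= p >> 16; p ^= p >> 8; p ^= p >> 4; p ^= p >> 2; p ^= p >> 1;
    return (uint16_t)((sum & 0xFFFEu) | (p & 1u));
}

typedef struct {
    uint32_t state;   /* double-iterated value written back to D */
    uint32_t result;  /* PRNG output handed to the next instruction's S */
} xoro32_out;

/* Two iterations at once: first-iteration sum in the low word,
   second-iteration sum in the high word. */
static xoro32_out xoro32_double(uint32_t d)
{
    uint32_t y = xoro32_step(d);   /* first iteration  */
    uint32_t x = xoro32_step(y);   /* second iteration */
    xoro32_out o;
    o.state  = x;
    o.result = ((uint32_t)sum_with_parity(x) << 16) | sum_with_parity(y);
    return o;
}
```

An all-zero state maps to itself and produces a zero result, which is why the seed has to be non-zero.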
  • TonyB_ Posts: 2,178
    edited 2017-10-27 20:06
    Thanks, Chip. Don't we want the first sum in the low word?
    wire [31:0] xoro32r	= {xoro32b[15:1], ^xoro32b,		// xoro32r = prng result
    			   xoro32a[15:1], ^xoro32a};
    

  • cgracey Posts: 14,152
    TonyB_ wrote: »
    Thanks, Chip. Don't we want the first sum in the low word?
    wire [31:0] xoro32r	= {xoro32b[15:1], ^xoro32b,		// xoro32r = prng result
    			   xoro32a[15:1], ^xoro32a};
    

    I think so. I'll change it.
  • Best to keep everything little-endian.