Shop OBEX P1 Docs P2 Docs Learn Events
Random/LFSR on P2 - Page 21 — Parallax Forums

Random/LFSR on P2

1181921232492

Comments

  • evanhevanh Posts: 15,854
    edited 2017-05-08 23:11
    I've done a little code tidy up and it now sorts the score table by combination order. Here's the s16 scores again, I've also run the "Byte1" scoring now too:
    
    
        Xorshift32+ PractRand Score Table
    
    Combination   Word  Byte3  Byte2  Byte1   Bit
    ===============================================
     [ 1  1  7]     2M   128K     4M     8M     4K
     [ 1  1 12]     1M    64K   256K   512K     4K
     [ 1  1 13]   512K    64K   256K   256K     4K
     [ 2  5  8]     8M   256K    32M   128M   256K
     [ 2  5 13]     8M   256K    16M     8M   256K
     [ 2 13 15]     1M   128K   256K   256K   256K
     [ 2 15 13]     1M   128K   256K   256K   256K
     [ 3  7  6]     8M   256K     4M    16M    32M
     [ 5  3  1]   512M     8M   128M    16M     1G
     [ 5  3  8]   128M     2M   512M   128M     1G
     [ 5  3 13]   128M     2M   128M    64M     1G
     [ 5  7  4]    64M     2M    64M    16M    32M
     [ 6  3  8]   256M     8M   512M   128M     1G
     [ 7  1  6]   128M    16M    32M     8M    64M
     [ 7  1 15]     2M     8M     1M    32K   128M
     [ 7  2  1]    64M    32M    32M     1M     1G
     [ 8  3  9]   128M     1G   128M   128M   256M
     [ 9 14  5]    16M    16M     8M     1M   256M
     [11  8  5]    32M     4M   256M   128M     1G
     [13 12  3]     4M   512K   512K     4M   512M
     [14  1 15]   256K    64K    32K    32K     4K
     [15 10  1]   256K    32K    32K     1M   512M
    
    
        Xoroshiro32+ PractRand Score Table
    
    Combination   Word  Byte3  Byte2  Byte1   Bit
    ===============================================
     [ 1  2  8]    32M    16M    64M    32M     1G
     [ 2  1 11]     8M   128K   128K   256K     1G
     [ 2  1 15]   128K    32K    32K    32K     1G
     [ 2  2  7]     8M    32M    32M     4M     1G
     [ 2  6 15]     2M    16M   128K   128K     1G
     [ 2  8  7]     8M   128M    32M     4M     1G
     [ 2  9  9]   256K    16K   128K    64M     1G
     [ 2 11  3]     2M   512K   256K   256K     1G
     [ 3  2  6]    64M     2M     2M     1M     1G
     [ 3  3 10]    16M   256K   256K    64M     1G
     [ 3 11  2]   256K   128K    64K    64K     1G
     [ 3 11 14]     4M   256K   256K   512K     1G
     [ 4  1  9]   512K   512K     8M   512K   512M
     [ 4  7 15]    64M   128M   512K   512K     1G
     [ 4  8  5]   512M    32M     8M     2M     1G
     [ 5  2  6]   128M     8M     4M     4M     1G
     [ 5  8  4]     4M     4M     1M   128K    64M
     [ 5 14 12]   128M     2M     4M   128M     1G
     [ 6  2  3]    32M     1M     1M     2M     2M
     [ 6  2  5]    16M   256K   512K   512K     4M
     [ 6  2 11]   512M     8M     8M   128M     1G
     [ 6  3 15]    32M     1M     1M     4M     1G
     [ 6 14 11]   128M    32M    64M    16M     1G
     [ 7  1  8]    64M    16M   128M    16M     1G
     [ 7  2  2]    16M     2M     2M     4M     4M
     [ 7  2 10]     1G   128M   256M    32M     1G
     [ 7  2 14]   512M     4M     8M    32M   256M
     [ 7  8  2]   512M    64M     8M     4M   512M
     [ 7 10 10]   128M    32M    32M   128M     1G
     [ 7 15  8]     2M   256K     1M   256K     1G
     [ 8  1  7]     1M     1M     1M     1M    64M
     [ 8  2  1]    32M     8M     8M    16M    16M
     [ 8  5 13]    32M    16M     8M   128M     1G
     [ 8  7 15]     2M   512K     2M     1M     1G
     [ 8  9 13]   512M   512M     1G   128M     1G
     [ 8 15  7]   512K   256K   512K   256K     1G
     [ 9  1  4]    32M     8M    16M    16M    64M
     [ 9  2 14]     2G   128M   256M   128M     1G
     [ 9  8 14]    16M   256M    32M    64M     1G
     [ 9  9  2]    32M    16M    16M    32M    64M
     [10  2  7]   256M    16M    16M    32M     1G
     [10  3  3]    64M    32M    32M    64M   256M
     [10  3 11]     2G   256M   256M   128M     1G
     [10  5 13]     2G     1G   512M   128M   512M
     [10  7 11]   512M   512M   128M   128M     1G
     [10 10  7]    16M    16M    16M   128M   512M
     [11  1  2]   256M    32M    32M    32M     1G
     [11  1 14]     1G    32M    16M    16M     1G
     [11  2  6]   512M    32M    64M   128M   512M
     [11  3 10]    32M    64M     1M    64K   128M
     [11  7 10]     4M    32M   128K    64K     1G
     [11  8 12]   256M   512M   512M   128M     1G
     [11 10 12]   128M     1G   512M   128M     1G
     [11 11 14]    64M    16M    16M     8M     1G
     [11 14  6]   256M   256M    64M   128M     1G
     [12  4 15]   512M   128M    64M    16M     1G
     [12  8 11]     4M    64M   128K    64K     1G
     [12 10 11]     2M   128K   128K    64K     1G
     [12 10 13]    32M    16M    16M     8M     1G
     [12 14  5]    32M   128M   128M    64M     1G
     [13  3 14]   512M    64M    32M     8M     1G
     [13  5  8]   256M   512M   128M   128M     1G
     [13  5 10]    64M    64M    16M   256K     1G
     [13  9  8]   128M   512M   512M   128M     1G
     [13 10 12]     2M   128K   128K    64K   512M
     [13 11 14]    16M     8M     8M     8M     1G
     [13 12 14]    16M     8M     8M     4M     1G
     [13 13 14]    16M     8M     8M     8M     1G
     [14  1 11]    64M   256K   256K   256K     1G
     [14  2  7]     1G     1G   512M   128M     1G
     [14  2  9]   512M   128M   128M   128M     1G
     [14  3 13]    16M     1M   256K    64K     1G
     [14  8  9]   128M     1G    64M    64M     1G
     [14 11  3]   512M   256M   128M    64M     1G
     [14 11 11]    16M   256K   256K   256K     1G
     [14 11 13]   512K   128K    64K    64K     1G
     [14 12 13]   512K   128K    64K    64K     1G
     [14 13 13]   128K    64K    64K    64K   256M
     [15  1  2]     2M     1M   512K   256K     1G
     [15  3  6]     1G   512M   512M   128M     1G
     [15  4 12]   128M   128M     8M   256K     1G
     [15  6  2]   128M   128M   128M    16M     1G
     [15  7  4]   512M   512M   512M   128M     1G
     [15  7  8]    64M    32M    32M     2M   512M
    
    EDIT: Oops, I had left some scoring variants disabled. I've removed the attached sources here.
  • evanhevanh Posts: 15,854
    edited 2017-05-08 23:10
    Ramon,
    Check out me "extract_score" Bash function. I'm a little proud of that one. Lots of googling involved to get it right.

    I still don't really know any of the meanings of the various bracketing mechanisms [], [[]], (), (()), {} ... about the only one I'm confident in using is "" and '', and even then I'm not sure if there is any distinction between single and double quotes.

    EDIT: Corrected spelling of Ramon's name.
  • cgraceycgracey Posts: 14,134
    edited 2017-05-08 19:04
    I just noticed something. The higher up the adder chain, the better the random quality gets. Bytes, even a bit, extracted from the top is better quality than data extracted from lower bits.
  • evanhevanh Posts: 15,854
    Yeah, that was the reason I was historically sometimes trimming the search range to improve completion times but I've decided that's too early to be blindly eliminating candidates.

    Here's the current sources again but with all scoring variants enabled now:
  • Wow! Evanh, you are The Shell Ninja.

    I had to search what the ^^ was doing there. Also the same for shopt.

    I prefer not to employ commands that depend on some version of bash. That uppercase (^^) needs at least bash version 4.0. They can also be done with 'tr' or 'awk'. It will be much slower but maybe is more portable.
    $ echo "4 kilobytes" | tr '[a-z]' '[A-Z]'
    4 Kilobytes
    
    $ echo "4 kilobytes" | tr '[:lower:]' '[:upper:]'
    4 KILOBYTES
    
    $ grep 'length=' test-s9/PRscore-xs-s9bit-a1b2c6.out | tail -n1 | awk '{ printf "%3s%1s", $2, toupper(substr($3,0,1)) }'
    

    Also the readline can be reordered to the first command and grep can be put inside a '<( )'to avoid the pipe (and shopt).
    $ read length value suffix line < <(grep 'length=' test-s9/PRscore-xs-s9bit-a1b2c6.out | tail -1)
    

    Or much better ... you can directly modify PractRand source code to output your desired printf format. (but maybe it's not as funny as shell scripting) :
    $ diff RNG_test.cpp.orig RNG_test.cpp
    455c455
    < 	const char *unitstr[6] = {"kilobyte", "megabyte", "gigabyte", "terabyte", "petabyte", "exabyte"};
    ---
    > 	const char *unitstr[6] = {"K", "M", "G", "T", "P", "E"}; // Kilobyte, Mega...
    458,460c458,459
    < 	if (length & (length-1))
    < 		std::printf("%.3f %ss", length * std::pow(0.5,units*10.0+10), unitstr[units] );
    < 	else std::printf("%.0f %s%s", length * std::pow(0.5,units*10.0+10), unitstr[units], length != (Uint64(1024)<<(units*10)) ? "s" : "" );
    ---
    > 	// std::printf("%.3f %s", length * std::pow(0.5,units*10.0+10), unitstr[units] );
    > 	std::printf("%3.0f%s", length * std::pow(0.5,units*10.0+10), unitstr[units] );
    

    Thanks for the update. The scripts worked great (for bit test) (I have just seen that you updated the scripts to enable all test). Previous version had some issue detecting the CPU load with my computer. Before I needed to comment out the section that detected CPU load, but this time they worked perfect the first time.
  • evanhevanh Posts: 15,854
    Thanks Ramon,
    I'll do both those.

    Yeah, there was a bug in the CPU load test early on. It would just keep launching a new task irrespective. I think that was another example of me not understanding the bracketing I was using.

    Chip,
    I've just finished s19 and s20 bit and byte1 scoring runs and I'm thinking it's showing weakness in the lower bits in general, not just bit 0. It's more obvious in the bigger word sizes.

    I'll do some more testing ...
  • evanhevanh Posts: 15,854
    edited 2017-05-11 13:27
    evanh wrote: »
    Chip,
    I've just finished s19 and s20 bit and byte1 scoring runs and I'm thinking it's showing weakness in the lower bits in general, not just bit 0. It's more obvious in the bigger word sizes.

    I'll do some more testing ...
    No, there's no real consistency. Other than "byte1" variant never showing any peaks the way the others do, any other signs aren't enough to be concerned over given how sensitive Practrand seems to be.
  • evanhevanh Posts: 15,854
    edited 2017-05-11 13:35
    another deleted file ...
  • evanhevanh Posts: 15,854
    Latest sources - using awk to help format score table.
  • evanhevanh Posts: 15,854
    edited 2017-05-12 06:57
    Here's the s16 scores again with a couple of extra variants added from the above experimental look into bit1 quality. Our chosen 14 2 7 combination holds up well again with its consistency.

    Variant Key:
    - "Word1" is the full summed word size minus the lsb, for s16 (Xorshift32+/Xoroshiro32+) that is the top 15 bits [15:1].
    - "Word2" is the full summed word size minus two lsb, for s16 that is the top 14 bits [15:2].
    - "Byte08" is the most significant 8 bits of the summed word, for s16 that is bits [15:8]. Label changes per position.
    - "Byte04" is half way down the summed word, for s16 that is bits [11:4]. Label changes per position.
    - "Byte2" is always bits [9:2] for all word sizes.
    - "Byte1" is always bits [8:1] for all word sizes.
    - "Bit" is msb.
    
    
        Xorshift32+ PractRand Score Table
    
    Combination    Word1   Word2  Byte08  Byte04   Byte2   Byte1    Bit
    ====================================================================
     [ 1  1  7]      2M      1M    128K      4M      8M      8M      4K
     [ 1  1 12]      1M      2M     64K    256K    512K    512K      4K
     [ 1  1 13]    512K      1M     64K    256K    256K    256K      4K
     [ 2  5  8]      8M     16M    256K     32M    512M    128M    256K
     [ 2  5 13]      8M     16M    256K     16M     32M      8M    256K
     [ 2 13 15]      1M      2M    128K    256K    256K    256K    256K
     [ 2 15 13]      1M      2M    128K    256K    256K    256K    256K
     [ 3  7  6]      8M      8M    256K      4M     32M     16M     32M
     [ 5  3  1]    512M    512M      8M    128M     32M     16M      1G
     [ 5  3  8]    128M    128M      2M    512M    512M    128M      1G
     [ 5  3 13]    128M    128M      2M    128M     64M     64M      1G
     [ 5  7  4]     64M     64M      2M     64M     64M     16M     32M
     [ 6  3  8]    256M    128M      8M    512M    512M    128M      1G
     [ 7  1  6]    128M    256M     16M     32M     32M      8M     64M
     [ 7  1 15]      2M      4M      8M      1M     32K     32K    128M
     [ 7  2  1]     64M    128M     32M     32M      8M      1M      1G
     [ 8  3  9]    128M     32M      1G    128M    512M    128M    256M
     [ 9 14  5]     16M     16M     16M      8M      4M      1M    256M
     [11  8  5]     32M     64M      4M    256M    256M    128M      1G
     [13 12  3]      4M      4M    512K    512K      2M      4M    512M
     [14  1 15]    256K    256K     64K     32K     32K     32K      4K
     [15 10  1]    256K    256K     32K     32K    128K      1M    512M
    
    
        Xoroshiro32+ PractRand Score Table
    
    Combination    Word1   Word2  Byte08  Byte04   Byte2   Byte1    Bit
    ====================================================================
     [ 1  2  8]     32M     64M     16M     64M     64M     32M      1G
     [ 2  1 11]      8M      8M    128K    128K    128K    256K      1G
     [ 2  1 15]    128K    128K     32K     32K     64K     32K      1G
     [ 2  2  7]      8M      8M     32M     32M     32M      4M      1G
     [ 2  6 15]      2M      4M     16M    128K    128K    128K      1G
     [ 2  8  7]      8M     16M    128M     32M     32M      4M      1G
     [ 2  9  9]    256K    256K     16K    128K    512K     64M      1G
     [ 2 11  3]      2M      4M    512K    256K    256K    256K      1G
     [ 3  2  6]     64M     32M      2M      2M      2M      1M      1G
     [ 3  3 10]     16M     16M    256K    256K    512K     64M      1G
     [ 3 11  2]    256K    512K    128K     64K     64K     64K      1G
     [ 3 11 14]      4M      4M    256K    256K    512K    512K      1G
     [ 4  1  9]    512K      1M    512K      8M    256K    512K    512M
     [ 4  7 15]     64M     64M    128M    512K    512K    512K      1G
     [ 4  8  5]    512M    128M     32M      8M      4M      2M      1G
     [ 5  2  6]    128M    128M      8M      4M      4M      4M      1G
     [ 5  8  4]      4M     16M      4M      1M    256K    128K     64M
     [ 5 14 12]    128M     64M      2M      4M     16M    128M      1G
     [ 6  2  3]     32M     32M      1M      1M      1M      2M      2M
     [ 6  2  5]     16M     16M    256K    512K    512K    512K      4M
     [ 6  2 11]    512M    256M      8M      8M    256M    128M      1G
     [ 6  3 15]     32M     32M      1M      1M      2M      4M      1G
     [ 6 14 11]    128M    128M     32M     64M    128M     16M      1G
     [ 7  1  8]     64M    128M     16M    128M     16M     16M      1G
     [ 7  2  2]     16M     16M      2M      2M      4M      4M      4M
     [ 7  2 10]      1G    512M    128M    256M     64M     32M      1G
     [ 7  2 14]    512M    256M      4M      8M     32M     32M    256M
     [ 7  8  2]    512M    512M     64M      8M      4M      4M    512M
     [ 7 10 10]    128M    256M     32M     32M    256M    128M      1G
     [ 7 15  8]      2M      2M    256K      1M    512K    256K      1G
     [ 8  1  7]      1M      4M      1M      1M      1M      1M     64M
     [ 8  2  1]     32M     64M      8M      8M     16M     16M     16M
     [ 8  5 13]     32M     64M     16M      8M    128M    128M      1G
     [ 8  7 15]      2M      2M    512K      2M      2M      1M      1G
     [ 8  9 13]    512M    128M    512M      1G    512M    128M      1G
     [ 8 15  7]    512K      1M    256K    512K    512K    256K      1G
     [ 9  1  4]     32M     64M      8M     16M     16M     16M     64M
     [ 9  2 14]      2G      1G    128M    256M     64M    128M      1G
     [ 9  8 14]     16M     32M    256M     32M     32M     64M      1G
     [ 9  9  2]     32M     64M     16M     16M     16M     32M     64M
     [10  2  7]    256M    128M     16M     16M     64M     32M      1G
     [10  3  3]     64M     64M     32M     32M     32M     64M    256M
     [10  3 11]      2G      1G    256M    256M    512M    128M      1G
     [10  5 13]      2G    512M      1G    512M    256M    128M    512M
     [10  7 11]    512M    128M    512M    128M    512M    128M      1G
     [10 10  7]     16M     16M     16M     16M    512M    128M    512M
     [11  1  2]    256M    256M     32M     32M     32M     32M      1G
     [11  1 14]      1G      1G     32M     16M     16M     16M      1G
     [11  2  6]    512M    512M     32M     64M    512M    128M    512M
     [11  3 10]     32M     32M     64M      1M    128K     64K    128M
     [11  7 10]      4M      8M     32M    128K    128K     64K      1G
     [11  8 12]    256M    256M    512M    512M    512M    128M      1G
     [11 10 12]    128M    128M      1G    512M    512M    128M      1G
     [11 11 14]     64M    128M     16M     16M     16M      8M      1G
     [11 14  6]    256M    128M    256M     64M    512M    128M      1G
     [12  4 15]    512M      1G    128M     64M     16M     16M      1G
     [12  8 11]      4M      4M     64M    128K    128K     64K      1G
     [12 10 11]      2M      2M    128K    128K    128K     64K      1G
     [12 10 13]     32M     64M     16M     16M     32M      8M      1G
     [12 14  5]     32M     32M    128M    128M     64M     64M      1G
     [13  3 14]    512M    512M     64M     32M     16M      8M      1G
     [13  5  8]    256M    256M    512M    128M    256M    128M      1G
     [13  5 10]     64M     32M     64M     16M    256K    256K      1G
     [13  9  8]    128M     64M    512M    512M    512M    128M      1G
     [13 10 12]      2M      2M    128K    128K    128K     64K    512M
     [13 11 14]     16M     16M      8M      8M      8M      8M      1G
     [13 12 14]     16M     16M      8M      8M      4M      4M      1G
     [13 13 14]     16M     16M      8M      8M      8M      8M      1G
     [14  1 11]     64M     32M    256K    256K    256K    256K      1G
     [14  2  7]      1G    512M      1G    512M    512M    128M      1G
     [14  2  9]    512M    256M    128M    128M     64M    128M      1G
     [14  3 13]     16M     16M      1M    256K     64K     64K      1G
     [14  8  9]    128M     64M      1G     64M     64M     64M      1G
     [14 11  3]    512M    512M    256M    128M    128M     64M      1G
     [14 11 11]     16M      8M    256K    256K    128K    256K      1G
     [14 11 13]    512K    512K    128K     64K     64K     64K      1G
     [14 12 13]    512K    256K    128K     64K     64K     64K      1G
     [14 13 13]    128K    256K     64K     64K    128K     64K    256M
     [15  1  2]      2M      2M      1M    512K    512K    256K      1G
     [15  3  6]      1G    512M    512M    512M    256M    128M      1G
     [15  4 12]    128M     64M    128M      8M    256K    256K      1G
     [15  6  2]    128M    256M    128M    128M     32M     16M      1G
     [15  7  4]    512M    128M    512M    512M    256M    128M      1G
     [15  7  8]     64M     64M     32M     32M      4M      2M    512M
    
  • cgraceycgracey Posts: 14,134
    Looks good, Evanh.

    I think ours will probably be used for bits 15:01 and for bits 15:08. So, (14,2,7) looks the best for that, and that's what's in the Verilog.
  • I've been thinking about XORO32, as illustrated below.
    cgracey wrote: »
    I've got it working:
    dat	org
    
    	bmask	dirb,#15	'drive LEDs
    
    loop	getword	outb,state,#1	'output sum to LEDs (LSB is low-quality)
    	add	outb,state
    	waitx	##80_000_000/10
    
    	xoro32	state		'iterate xoroshiro32+
    
    	jmp	#loop
    
    
    state	long	1		'seed {s1,s0} with 1
    

    The top 15 bits of the 16-bit sum 'are high-quality pseudo-random data' but bit 0 is poor quality. A 16-bit addition gives a 17-bit result and I was wondering whether bit 16 is more random than bit 0 and could replace it?
  • evanhevanh Posts: 15,854
    Nope, the carry bit is horribly poor quality. I had posed that very question myself back when I first got interested. In fact it was that question that got me doing my own testing.
  • Heater.Heater. Posts: 21,230
    Slippery things these PRNGs. Make one harmless looking change and the resulting "randomness" is ruined.

    How many hours did we all spend testing this and that here?
  • evanhevanh Posts: 15,854
    edited 2017-09-13 09:08
    Hehe, many many hours. So many mistakes with having to do all the testing over and over again. But I certainly learnt a few things along the way. :) Although the knowledge is fading already. :(
  • evanhevanh Posts: 15,854
    edited 2017-09-13 12:43
    Dang, it just dawned on me, far too late, that DRAM prices have been going through the roof all this year! And even with those prices, supply appears to have gone to virtually nil.

    One of my new 16GB DIMMs failed and there is no exact replacements in stock. I've got the full credit for them (Returned the pair) and can claim it as a refund, but the options for replacement are slim pickings.

    I'm back using me old PC for the moment, and the remainder of the year looks bleak. :(
  • TonyB_TonyB_ Posts: 2,178
    edited 2017-09-13 23:53
    TonyB_ wrote: »
    I've been thinking about XORO32 ...
    The top 15 bits of the 16-bit sum 'are high-quality pseudo-random data' but bit 0 is poor quality. A 16-bit addition gives a 17-bit result and I was wondering whether bit 16 is more random than bit 0 and could replace it?
    evanh wrote: »
    Nope, the carry bit is horribly poor quality. I had posed that very question myself back when I first got interested. In fact it was that question that got me doing my own testing.

    Thanks Evan. Before you forget even more, could you please explain how the most random triplet combination you found of 14,2,7 translates into Chip's logic below?
    cgracey wrote: »
    Here's the final fruit of all the xoroshiro32+ testing that Evanh did:
    wire [15:0] xoro32x  = d[31:16] ^ d[15:0];
    wire [31:0] xoro32   = {xoro32x[8:0], xoro32x[15:9], {d[1:0], d[15:2]} ^ xoro32x ^ {xoro32x[13:0], 2'b0}};
    

    Does this algorithm work for arbitrary data widths? Specifically, could there be a xoro34?
    wire [16:0] xoro34x  = {d[31:16],C} ^ {d[15:0],Z};
    wire [33:0] xoro34   = {f(xoro34x,d,Z)} ^ xoro34x ^ {f(xoro34x)};
    

  • Above post edited to be less confusing.

    Whole point of xoro34 is to produce 16 bits of high-quality pseudo-random data on a P2-like CPU some time in the future.

  • evanhevanh Posts: 15,854
    Chip wants to keep the state storage to 32 bits so it fits cleanly in a single Cog register. So, its really as good as it can be given the requirements.


    As for how the final triplet combination was chosen, here's where I came into sync with what Chip was after - http://forums.parallax.com/discussion/comment/1407995/#Comment_1407995 You can see there why Chip picked out 14,2,7 with its even quality in all three sampling variants. I'd sort of favoured 10,5,13 with its higher full word quality but wasn't pushing it due to its non-conforming triplet combination.

    Prior to that, I'd kind of been off on my own mission not realising that starting with known full repeat lengths was pretty important. Heater beat me up about that.
  • evanhevanh Posts: 15,854
    edited 2017-09-14 13:51
    You do have the option of retaining the bottom bit of the summing when using the XORO32 instruction. It's not that bad a quality. Someone suggested it lowered the results to equivalent of Xorshift, which is still considered a perfectly okay PRNG.
  • TonyB_TonyB_ Posts: 2,178
    edited 2017-09-15 13:52
    evanh wrote: »
    As for how the final triplet combination was chosen, here's where I came into sync with what Chip was after - http://forums.parallax.com/discussion/comment/1407995/#Comment_1407995 You can see there why Chip picked out 14,2,7 with its even quality in all three sampling variants. I'd sort of favoured 10,5,13 with its higher full word quality but wasn't pushing it due to its non-conforming triplet combination.

    Prior to that, I'd kind of been off on my own mission not realising that starting with known full repeat lengths was pretty important. Heater beat me up about that.

    Thanks again Evan. I've been reading all the earlier posts in this thread today once again, which I should have done before. Sorry to all for asking newbie-style questions.
    evanh wrote: »
    Chip wants to keep the state storage to 32 bits so it fits cleanly in a single Cog register. So, it's really as good as it can be given the requirements.

    Ah, but is it? Can we please put an end to these malicious rumours that the P2 is a 32-bit processor? It's 34-bit! :)

    XORO32 (existing)
    dat	org
    
    		bmask	dirb,#15	'drive LEDs
    
    loop		getword	outb,state,#1	'output sum to LEDs
    					'(LSB is low-quality)
    		add	outb,state
    		waitx	##80_000_000/10
    
    		xoro32	state		'iterate xoroshiro32+
    
    		jmp	#loop
    
    
    state		long	1		'seed {s1,s0} with 1
    

    XORO34 (possible future option?)
    dat	org
    
    		bmask	dirb,#15
    
    loop		getword	outb,state,#1
    		add	outb,state
    if_c_and_z	add	outb,#1		'extra instruction
    					'increment sum if LSBs both 1
    					'all 16 bits are high-quality
    		waitx	##80_000_000/10
    
    		xoro34	state	wcz	'modified instruction, wcz added
    					'state includes c and z where
    					'c is LSB of high 17-bit word
    					'z if LSB of low  17-bit word
    		jmp	#loop
    
    
    state		long	1		'or 0
    

    There's a bit more to seeding than shown as c & z are involved, but not very much.

    Presumably 14,2,7 for xoro32 translates to 15,3,8 for xoro34?
    I wonder how that performs in randomness tests? Or is there a better set?
  • evanhevanh Posts: 15,854
    TonyB_ wrote: »
    Presumably 14,2,7 for xoro32 translates to 15,3,8 for xoro34?
    I wonder how that performs in randomness tests? Or is there a better set? (Probably, I imagine, assuming all of this new xoro34 stuff isn't just like a chimp bashing away on a typewriter.)
    I'm at work so I'll get back to you in full later. Suffice to say I also learnt there is no patterns to the combinations. Even my "conforming" to the original ratios is probably worthless.

    I did some full run tests on many word sizes. But only the s16 size was done with the latest scoring variations. You can find some s17 (Xoroshiro34) results here - http://forums.parallax.com/discussion/comment/1410683/#Comment_1410683 They aren't very pretty, the odd sizes were that way. I'll do some more testing with the extra variants to see if anything sticks out ...
  • TonyB_TonyB_ Posts: 2,178
    edited 2017-09-15 11:07
    Chosen Xoro32 set:
        Xoroshiro32+ PractRand Score Table
    
    Combination   Word  Byte3  Byte2  Byte1   Bit
    ===============================================
     [14  2  7]     1G     1G   512M            1G
    

    Best Xoro34 sets appear to be:
        Xoroshiro34+ PractRand Score Table
    
    Combination   Word  Byte3  Byte2  Byte1   Bit
    ===============================================
     [10 15 12]   512M     4G     2G            4G
     [ 7  8 12]   512M     2G     2G            4G
     [16  1  6]   512M     1G     2G            4G
    

    Are any of these max length? Any other possible sets?

    For the comparison with Xoro32 to be fair, Xoro34_sum[15:1] should be tested as well as Xoro34_sum[15:0].
  • evanhevanh Posts: 15,854
    edited 2017-09-15 13:09
    TonyB_ wrote: »
    Are any of these max length? Any other possible sets?
    The score charts cover every full period combination for the specified word size.

    Finding all the full period triplet candidates is where the long search times come from. The PractRand quality testing is short in comparison. Beyond s20 (Xoroshiro40) it became weeks per attempt on my 8-core CPU. I had thought about trying to use CUDA or something to help out, but never got around to it.

    PS: My early testing was just doing quality testing of groups of arbitrary triplets. Some got very high scores, but they weren't full period triplets.

    Clarification: I keep reverting back to the term "full length repeat", I note you've used the term "max length". The researchers I read seem to use the term "full period". I'll try to stick to "full period" naming.
  • Are the three XORO34 candidates "full period"?
    If so, I'm really keen to know how Xoro34_sum[15:1] perform.
  • evanhevanh Posts: 15,854
    edited 2017-09-15 13:39
    evanh wrote: »
    Finding all the full period triplet candidates is where the long search times come from. The PractRand quality testing is short in comparison. Beyond s20 (Xoroshiro40) it became weeks per attempt on my 8-core CPU.
    It's a basic brute force of cycle the PRNG state until the starting seed repeats. Full period is identified by a repeat at exactly the maximum possible. For s16, it's 2^32-1. For s17, it's 2^34-1. My code also checks to see if any exceed the presumed full period. So far, none have.

    The more long period triplets there are in a particular word size the longer the whole search takes. This is amplified massively by the raised to power X nature of the word size.

    The researchers that do the basic development also devise methods for jumping ahead in the PRNG state sequence. That way they can find (probably in some statistically fashion) full period candidates even when the period is enormous. Not something I've even come close to comprehending.
  • Chip wrote a program to test whether seed value repeats after 2^32-1 iterations:
    http://forums.parallax.com/discussion/comment/1407299/#Comment_1407299
    Any triplets that failed were ruled out immediately.

    Change to 2^34-1 and we're good to go?
  • evanhevanh Posts: 15,854
    Yup, my version completed an s16 search in a few minutes. s17 was probably about an hour. s20 was a couple of days. s21 never completed when I terminated it after maybe a week.
  • evanhevanh Posts: 15,854
    edited 2017-09-15 13:50
    The old sizes take disproportionately longer time than the evens. That was consistent at all stages of my work.
Sign In or Register to comment.