Shop OBEX P1 Docs P2 Docs Learn Events
Random/LFSR on P2 - Page 58 — Parallax Forums

Random/LFSR on P2

1555658606192

Comments

  • TonyB_TonyB_ Posts: 2,195
    edited 2018-09-30 17:36
    evanh wrote: »
    TonyB_ wrote: »
    evanh wrote: »
    BTW: That makes the Crush reports for aperture 16>>0 of candidates [13 5 8 10] and [7 2 14 10] clean reports.

    Aperture 16>>0 for [14 2 7 5] seems to be good in most of my tests, but there is one arrangement producing e-12 that I'd class as VERY SUSPICIOUS (Practrand naming). It was my experimental candidate so, until I get back to retesting it, I'm not completely sure what the various arrangements were.

    Aperture 16>>0 for [14 2 7 6] produces an eps in all the arrangements so it's out.

    Long holiday away from PRNGs now over. :)

    Evan, which of [13,5,10,9] and [13,5,8,10] are better overall in PractRand and Crush tests?

    [13 5 8 10] seems the better, it was already visible in the good scoring Practrand scores before we used the distribution scores. I haven't done any exhaustive comparing with the Crushes. You've got the reports.

    Evan, I do have the Crush reports you emailed me thanks and I also have Crush and BigCrush reports for Chris's 32-bit generator. Did you ever run BigCrush tests for [13,5,10,9] and [13,5,8,10]? If not, I think they would be good to have.

    A respin might be the opportunity to improve XORO32 and make it the best it can be by replacing [14,2,7,5] with [13,5,10,9] or [13,5,8,10]. No other candidates really matter at this time. The XORO32 logic would be the same, just the signal routing would change due to different rotate and shift constants.
  • TonyB_TonyB_ Posts: 2,195
    edited 2018-10-03 02:15
    I think there is a third distribution we should look at. We have the xoroshiro32++ pair (XORO32 output) and corresponding zero run frequencies but we don't have the non-zero runs. There are longer non-zero runs than zero runs, which means there are more non-zero run frequencies > 0.

    Now, this might not affect the ranking of candidates at all, but we won't know until we have the non-zero results. The C code could be modified to record non-zero runs as well as zero runs quite easily. As I don't have 4GB of RAM, I tested only the first and last 64K blocks, i.e. high word of 32-bit output was 0000 or FFFF, in x86 assembly. I could deduce the equation for expected nzfreq(x) and testing Chris's generator first should confirm it.
  • TonyB_TonyB_ Posts: 2,195
    edited 2019-04-24 17:29
    Expected pfreq(x) = 1/x! * N * 1/e ... well-known binomial distribution
    Expected zfreq(x) = (1-1/e)^2 * N * (1/e)^x ... proven
    Expected nzfreq(x) = (1/e)^2 * N * (1-1/e)^x ... unproven EDIT proven

    N = 2^32. Very nice symmetry in the zfreq and nzfreq equations.

    zfreq(x) > nzfreq(x) when x = 1
    zfreq(x) = nzfreq(x) when x = 2
    zfreq(x) < nzfreq(x) when x > 2

    Rounding to the nearest integer, the highest frequencies > 0 are pfreq(12), zfreq(21) and nzfreq(45).
  • TonyB_TonyB_ Posts: 2,195
    edited 2018-10-04 01:15
    Expected distribution values to the nearest integer are listed below. Evan, could you please modify your C program to also record nzfreq for Chris's generator? If results agree with expected values then we'll need nzfreq distributions for some of the xoroshiro32++ candidates. I suggest testing the top 10 for zfreq to see if their ranking is the same for nzfreq and if so we could probably stop there.

    pfreq(x) proven
     x,   expected
     0, 1580030169
     1, 1580030169
     2,  790015084
     3,  263338361
     4,   65834590
     5,   13166918
     6,    2194486
     7,     313498
     8,      39187
     9,       4354
    10,        435
    11,         40
    12,          3
    13,          0
    

    zfreq(x) proven
     x,   expected
     1,  631342768
     2,  232258025
     3,   85442952
     4,   31432706
     5,   11563446
     6,    4253954
     7,    1564942
     8,     575710
     9,     211792
    10,      77914
    11,      28663
    12,      10544
    13,       3879
    14,       1427
    15,        525
    16,        193
    17,         71
    18,         26
    19,         10
    20,          4
    21,          1
    22,          0
    

    nzfreq(x) unproven EDIT proven
     x,   expected
     1,  367426785
     2,  232258025
     3,  146815072
     4,   92804826
     5,   58663838
     6,   37082618
     7,   23440685
     8,   14817339
     9,    9366344
    10,    5920659
    11,    3742570
    12,    2365756
    13,    1495443
    14,     945300
    15,     597544
    16,     377720
    17,     238764
    18,     150928
    19,      95405
    20,      60307
    21,      38121
    22,      24097
    23,      15232
    24,       9629
    25,       6087
    26,       3847
    27,       2432
    28,       1537
    29,        972
    30,        614
    31,        388
    32,        245
    33,        155
    34,         98
    35,         62
    36,         39
    37,         25
    38,         16
    39,         10
    40,          6
    41,          4
    42,          3
    43,          2
    44,          1
    45,          1
    46,          0
    
  • evanhevanh Posts: 16,032
    TonyB_ wrote: »
    A respin might be the opportunity to improve XORO32 and make it the best it can be by replacing [14,2,7,5] with [13,5,10,9] or [13,5,8,10]. No other candidates really matter at this time.
    They are all much the same to me. I don't have any complaint with the existing [14 2 7 5], [3 2 6 9] also looks particularly good in Crush testing and was a top candidate with Practrand.

    I can't see any particular judgement as better. I also have no clue as to the value of any specific test, including the distributions. I won't complain if you ask Chip to make a change.

  • evanhevanh Posts: 16,032
    edited 2018-10-03 08:34
    TonyB_ wrote: »
    Evan, could you please modify your C program to also record nzfreq for Chris's generator
    Code sample please.

    EDIT: Here's the existing code:
    	// read pair array sequentially forward (fast) and increment freq values (fits in L1 caches easy)
    	pairi = 0;
    	ztally = 0;
    	do {
    		// pair freq values
    		freqi = pairs[pairi++];
    		pfreq[freqi]++;         // 8-bit limited indexing
    
    		// zero-run freq values
    		if( freqi == 0 )  ztally++;
    		else if( ztally != 0 )
    		{
    			if( ztally > 255 )  ztally = 255;
    			zfreq[ztally]++;
    			zfreq[0]++;
    			ztally = 0;
    		}
    	} while( pairi < PAIRPOW-1 );
    
    	// pair freq values, tail record
    	freqi = pairs[pairi];
    	pfreq[freqi]++;
    
    	// zero-run freq values, tail record
    	if( freqi == 0 )  ztally++;
    
    	if( ztally != 0 )
    	{
    		if( ztally > 255 )  ztally = 255;
    		zfreq[ztally]++;
    		zfreq[0]++;
    	}
    
  • TonyB_TonyB_ Posts: 2,195
    edited 2018-10-03 12:19
    evanh wrote: »
    TonyB_ wrote: »
    Evan, could you please modify your C program to also record nzfreq for Chris's generator
    Code sample please.

    EDIT: Here's the existing code:
    	// read pair array sequentially forward (fast) and increment freq values (fits in L1 caches easy)
    	pairi = 0;
    	ztally = 0;
    	do {
    		// pair freq values
    		freqi = pairs[pairi++];
    		pfreq[freqi]++;         // 8-bit limited indexing
    
    		// zero-run freq values
    		if( freqi == 0 )  ztally++;
    		else if( ztally != 0 )
    		{
    			if( ztally > 255 )  ztally = 255;
    			zfreq[ztally]++;
    			zfreq[0]++;
    			ztally = 0;
    		}
    	} while( pairi < PAIRPOW-1 );
    
    	// pair freq values, tail record
    	freqi = pairs[pairi];
    	pfreq[freqi]++;
    
    	// zero-run freq values, tail record
    	if( freqi == 0 )  ztally++;
    
    	if( ztally != 0 )
    	{
    		if( ztally > 255 )  ztally = 255;
    		zfreq[ztally]++;
    		zfreq[0]++;
    	}
    

    It might be better to have separate code for non-zero runs as zero runs already done in their entirety. Changes:

    ztally -> nztally
    zfreq -> nzfreq
    freqi == 0 -> freqi > 0
    zero-run -> non-zero-run
    	// read pair array sequentially forward (fast) and increment freq values (fits in L1 caches easy)
    	pairi = 0;
    	nztally = 0;
    	do {
    		// pair freq values
    		freqi = pairs[pairi++];
    		pfreq[freqi]++;         // 8-bit limited indexing
    
    		// non-zero-run freq values
    		if( freqi > 0 )  nztally++;
    		else if( nztally != 0 )
    		{
    			if( nztally > 255 )  nztally = 255;
    			nzfreq[nztally]++;
    			nzfreq[0]++;
    			nztally = 0;
    		}
    	} while( pairi < PAIRPOW-1 );
    
    	// pair freq values, tail record
    	freqi = pairs[pairi];
    	pfreq[freqi]++;
    
    	// non-zero-run freq values, tail record
    	if( freqi > 0 )  nztally++;
    
    	if( nztally != 0 )
    	{
    		if( nztally > 255 )  nztally = 255;
    		nzfreq[nztally]++;
    		nzfreq[0]++;
    	}
    

    pfreq could be commented out as already done.
  • TonyB_TonyB_ Posts: 2,195
    edited 2018-10-03 12:15
    evanh wrote: »
    TonyB_ wrote: »
    A respin might be the opportunity to improve XORO32 and make it the best it can be by replacing [14,2,7,5] with [13,5,10,9] or [13,5,8,10]. No other candidates really matter at this time.
    They are all much the same to me. I don't have any complaint with the existing [14 2 7 5], [3 2 6 9] also looks particularly good in Crush testing and was a top candidate with Practrand.

    I can't see any particular judgement as better. I also have no clue as to the value of any specific test, including the distributions. I won't complain if you ask Chip to make a change.

    I agree that switching to [3,2,6,9] would be fine if the XORO32 output were 16-bit, but as it's 32-bit using two successive xoroshiro32++ outputs we need to also consider how it compares to an ideal 32-bit generator, which Chris's is close to being. This is where the distributions come in: pfreq, zfreq and the new nzfreq, which might be a better test than zfreq as there are more non-zero frequencies > 0.

    If nzfreq ranking is no different to zfreq, then it's matter of selecting either [13,5,10,9] or [13,5,8,10], based on Crush and BigCrush tests. However, latter not yet run, I think. We should use a candidate that is very good as 16-bit, based on PractRand and TestUI results, and very good as 32-bit, based on the distributions.
  • evanhevanh Posts: 16,032
    edited 2018-10-03 13:07
    I don't understand what can be achieved by looking at the scores of Chris's tuned algorithm other than to say they are better than any of the Xoroshiro algorithms. Which is no surprise.

    Oops, it was [6 2 3 9], not [3 2 6 9], that did well in Crush. [6 2 3 9] wasn't terrible in the Practrand gridding but did have a low minimum of 16 MB (32 MB double iterated).

    They've all got some weakness. Melissa has it right. Xoroshiro++ might be the best of the Xoroshiro group but it still isn't ideal for quality.
  • evanhevanh Posts: 16,032
    edited 2018-10-03 13:42
    Just eyeballed the two Practrand grids for Xoroshiro32++ [6 2 3 9]. Nothing below a 512 MB score for 32, 16 and 8-bit grid lines. So looks better than [14 2 7 5] if ignoring all the other apertures.

    Double iterated [13 5 10 9] has one 256 MB at aperture 8>>14.
    Double iterated [13 5 8 10] has a couple of 128 MB and several 256 MB scores on the 8-bit line.

    Of these two, I'd now choose [13 5 10 9].

    I'll get to work on the nzfreq distribution ...
  • evanh wrote: »
    I don't understand what can be achieved by looking at the scores of Chris's tuned algorithm other than to say they are better than any of the Xoroshiro algorithms. Which is no surprise.

    Oops, it was [6 2 3 9], not [3 2 6 9], that did well in Crush. [6 2 3 9] wasn't terrible in the Practrand gridding but did have a low minimum of 16 MB (32 MB double iterated).

    They've all got some weakness. Melissa has it right. Xoroshiro++ might be the best of the Xoroshiro group but it still isn't ideal for quality.

    The pfreq and zfreq results for Chris's generator tell me that (a) his generator is excellent and the output is very close to the ideal distributions, (b) as a consequence of (a) the distributions are a good test of randomness and should be applied to XORO32 and (c) the very close match to the expected zfreq proves that the equation I deduced but cannot prove mathematically is correct. A nzfreq test of Chris's generator will prove or disprove that equation.

    A quick reminder of something you wrote on May 6:
    evanh wrote: »
    Tony and I are still searching for better constants to use here even.

    The search is almost over. If I could do the tests that I'm asking you to do myself I would, but I can't.

    P.S. I'm not sure Melissa has given xoroshiro++ a thorough test.
  • evanhevanh Posts: 16,032
    Frequency distribution of non-zero runs for Scro-rarns, sampling aperture 32>>0
      0  998749247
      1  367409237
      2  232249822
      3  146806621
      4  92822979
      5  58653366
      6  37074317
      7  23448505
      8  14822833
      9  9366603
     10  5924351
     11  3742556
     12  2364557
     13  1495255
     14  945073
     15  597224
     16  376964
     17  238020
     18  151211
     19  95677
     20  60491
     21  37941
     22  24158
     23  15310
     24  9608
     25  6148
     26  3831
     27  2415
     28  1568
     29  998
     30  609
     31  350
     32  233
     33  154
     34  100
     35  64
     36  39
     37  19
     38  14
     39  13
     40  2
     41  6
     42  3
     43  0
     44  0
     45  0
     46  2
     47  0
     48  0
     49  0
    
  • Many thanks, Evan, the results prove my nzfreq equation. :)
    I'll create a Chi-square score from the binary later today.
  • evanhevanh Posts: 16,032
    TonyB_ wrote: »
    evanh wrote: »
    I don't understand what can be achieved by looking at the scores of Chris's tuned algorithm other than to say they are better than any of the Xoroshiro algorithms. Which is no surprise.

    Oops, it was [6 2 3 9], not [3 2 6 9], that did well in Crush. [6 2 3 9] wasn't terrible in the Practrand gridding but did have a low minimum of 16 MB (32 MB double iterated).

    They've all got some weakness. Melissa has it right. Xoroshiro++ might be the best of the Xoroshiro group but it still isn't ideal for quality.

    The pfreq and zfreq results for Chris's generator tell me that (a) his generator is excellent and the output is very close to the ideal distributions, (b) as a consequence of (a) the distributions are a good test of randomness and should be applied to XORO32 and (c) the very close match to the expected zfreq proves that the equation I deduced but cannot prove mathematically is correct. A nzfreq test of Chris's generator will prove or disprove that equation.
    I see all that as only a correlation, not proof. But then I'm not trying to fully understand it.
    P.S. I'm not sure Melissa has given xoroshiro++ a thorough test.
    True, but the differences aren't huge. The biggest hole in the + scrambler is plugged but I'm sure she'll still have the same gripes about the engine itself. Luckily, it doesn't have the static flow of the ** scrambler. Melissa was most unimpressed with that move.
  • evanh wrote: »
    TonyB_ wrote: »
    evanh wrote: »
    I don't understand what can be achieved by looking at the scores of Chris's tuned algorithm other than to say they are better than any of the Xoroshiro algorithms. Which is no surprise.

    Oops, it was [6 2 3 9], not [3 2 6 9], that did well in Crush. [6 2 3 9] wasn't terrible in the Practrand gridding but did have a low minimum of 16 MB (32 MB double iterated).

    They've all got some weakness. Melissa has it right. Xoroshiro++ might be the best of the Xoroshiro group but it still isn't ideal for quality.

    The pfreq and zfreq results for Chris's generator tell me that (a) his generator is excellent and the output is very close to the ideal distributions, (b) as a consequence of (a) the distributions are a good test of randomness and should be applied to XORO32 and (c) the very close match to the expected zfreq proves that the equation I deduced but cannot prove mathematically is correct. A nzfreq test of Chris's generator will prove or disprove that equation.
    I see all that as only a correlation, not proof. But then I'm not trying to fully understand it.

    Had I been wrong with the equation, the nzfreq test would have disproved it. A better than 99.9% match for Chris's pfreq and zfreq is proof enough for me! :) And the pfreq equation was known and proved already.
  • evanhevanh Posts: 16,032
    edited 2018-10-03 15:31
    I wish I'd asked Chris what "biasing" means - in terms of, what would his example algorithm look like unbiased?

  • evanhevanh Posts: 16,032
    Tony,
    You're looking at the wrong end of the scores for proof. While distribution candidates #1 and #4 do have good Practrand gridding scores, candidates #2 and #5 have rather bad scores. And there is worse still in the top 100 candidates.

  • evanhevanh Posts: 16,032
    Bugger, I've run the wrong script. Forgetting what is what. :( It finished when I went to bed 7 hours ago, too!
  • TonyB_TonyB_ Posts: 2,195
    edited 2018-10-04 01:50
    evanh wrote: »
    I wish I'd asked Chris what "biasing" means - in terms of, what would his example algorithm look like unbiased?

    Does Chris's PractRand (or PracRand) forum still exist on sourceforge.net? I can't view it now on this PC, seeing as it's browser-challenged.
    evanh wrote: »
    Tony,
    You're looking at the wrong end of the scores for proof. While distribution candidates #1 and #4 do have good Practrand gridding scores, candidates #2 and #5 have rather bad scores. And there is worse still in the top 100 candidates.

    Evan, I've always said good scores in distribution tests are no guarantee of good scores in other tests. That's why we need to select the candidate that does well in several tests: distribution, PractRand and TestUI. It has to be said that [14,2,7,5] does badly on distribution: 487 out of 1260 on combined prank+zrank ranking.

    The nzfreq Chi-square score for Chris's generator is 41 for nzfreq(1-45), or 18 for nzfreq(1-16) which make up over 99.9% of the total non-zero runs. This doesn't mean much without other results but I'm sure no xoroshiro32++ will match it. How do you feel about running the full set of nzfreq tests?
  • The engine for Chris's generator is xorshift, which produces equidistributed output and, as well as improving the output quality, his scrambler destroys this equidistribution by ensuring that certain outputs occur more than once and others never. I think this is what he means by "biasing".

    Also a good pfreq score does not imply a good zfreq score and vice-versa.
  • Repeat of earlier post:
    TonyB_ wrote: »
    Here are some combined scores for frequency distributions. The table belows show the top 10 for prank only and zrank only when the two rankings are added and sorted from lowest to highest. The lowest prank is the tiebreaker for equal prank+zrank. Candidates that are not in either top 10 have been omitted.
    #a,  b,  c,  d, prank, zrank, prank+zrank
    13,  5, 10,  9,    6,    5,    1
    13, 13, 14, 13,    3,   10,    2
    14,  2,  9,  9,   12,    1,    3
    13,  5,  8, 10,    8,    7,    4
     7,  1,  8,  6,   14,    9,    5
     7,  2, 14, 10,   30,    6,    6
    14,  2,  9, 10,   35,    8,    8
     7,  2, 14, 13,   41,    2,    9
     8,  1,  7,  5,   47,    3,   12
     5,  2,  6,  5,    5,   51,   16
    11,  2,  6,  2,    4,   61,   18
     3, 11, 14,  7,    9,   62,   19
     9,  2, 14, 13,    1,   80,   24
     6,  2, 11,  3,   83,    4,   26
     8,  5, 13,  2,    2,  238,  112
    15,  4, 12, 14,    7,  272,  133
     2,  8,  7, 11,   10,  351,  160
    

    With nzfreq results we could have combined prank+zrank+nzrank.
  • evanhevanh Posts: 16,032
    You're trying to use Chris's single data point as representative for proof that distribution is a good predictor. Yet, the much wider selection of best distribution Xoroshiro candidates demonstrates it's just random.

  • evanhevanh Posts: 16,032
    edited 2018-10-04 01:47
    I've started the paired distribution run, with nzfreq, for a full sweep of the 1260 candidates again. Five minutes per candidate, seven candidates in parallel -> 15 hours.

    EDIT: It's reproducing all the distribution data. Even with all three data sets, pfreq, zfreq and nzfreq, being produced, that part of the program is still only 5% of execution time. Filling the initial 4 GB of pair tallies takes the other 95%.

    EDIT2: I can feel the sluggishness of the computer while it's doing this. Parctrand doesn't have this much impact even when hogging much more CPU. It'll be cache/RAM thrashing because of the large memory allocations, causing a lot of blocked waiting. Of note, when running a single case the run time is four minutes instead of five.
  • TonyB_TonyB_ Posts: 2,195
    edited 2018-10-05 01:30
    evanh wrote: »
    I've started the paired distribution run, with nzfreq, for a full sweep of the 1260 candidates again. Five minutes per candidate, seven candidates in parallel -> 15 hours.

    EDIT: It's reproducing all the distribution data. Even with all three data sets, pfreq, zfreq and nzfreq, being produced, that part of the program is still only 5% of execution time. Filling the initial 4 GB of pair tallies takes the other 95%.

    EDIT2: I can feel the sluggishness of the computer while it's doing this. Parctrand doesn't have this much impact even when hogging much more CPU. It'll be cache/RAM thrashing because of the large memory allocations, causing a lot of blocked waiting. Of note, when running a single case the run time is four minutes instead of five.

    Thanks for starting the tests, Evan. Best to leave the computer alone to get on with it!
  • evanhevanh Posts: 16,032
    Done. I cut it a little short at the end so there is some empty files. Wanted to go to bed and those candidates are known poor quality.
  • evanh wrote: »
    Done. I cut it a little short at the end so there is some empty files. Wanted to go to bed and those candidates are known poor quality.

    Thanks, Evan! I've had a quick look and [13,5,10,9] ranking is 1 for nzfreq.
  • Top 20 distributions based on combined pfreq + zfreq + nzfreq rankings, plus selected others:
    #a,  b,  c,  d, prank, zrank, nzrank, prank+zrank+nzrank
    13,  5, 10,  9,    6,    5,    1,    1
    13,  5,  8, 10,    8,    7,    6,    2
    14,  2,  9,  9,   12,    1,   17,    3
     7,  1,  8,  6,   14,    9,    8,    4
     7,  2, 14, 13,   41,    2,    4,    5
     7,  2, 14, 10,   30,    6,   16,    6
    14,  2,  7, 10,   34,   19,    5,    7
    14,  2,  9, 10,   35,    8,   20,    8
     8,  1,  7,  5,   47,    3,   14,    9
    11,  1,  2, 13,   27,   16,   25,   10
     3,  3, 10,  2,   43,   13,   21,   11
     8,  7, 15, 13,   21,   24,   41,   12
    15,  6,  2,  9,   57,   34,    3,   13
     6,  2, 11,  3,   83,    4,   12,   14
    14, 12, 13,  6,   17,   33,   62,   15
    13, 13, 14, 13,    3,   10,  103,   16
    14,  2,  7,  6,   78,   14,   33,   17
     7,  2, 14,  3,   52,   37,   51,   18
    14,  2,  7, 15,   16,   58,   70,   19
    11,  2,  6,  2,    4,   61,   80,   20
    
     6,  2,  3,  9,   79,   30,   90,   37
    
     3,  2,  6,  9,  249,  226,  114,  161
    
    14,  2,  7,  5,  477,  491,  353,  418
    
  • TonyB_ wrote: »
    A quick reminder of something you wrote on May 6:
    evanh wrote: »
    Tony and I are still searching for better constants to use here even.

    The search is almost over.

    I think the search is now over and [13,5,10,9] is the one.

    Something that we should prove is that the distribution results are identical when they are generated in the same way as XORO32. To save time, the C code saves the previous 16-bit output and uses it again. What was the high word in one 32-bit output becomes the low word in the next. XORO32 repeats each of the 2^32-1 iterations but not until 2^32-2 others have been done first. The ordering of the data is different but the distributions should be the same and checking a single candidate would suffice.
  • evanhevanh Posts: 16,032
    So you want double iterated 32-bit output from the Xoroshiro generator. And remove the pairing from the distribution test code so that it still only has to deal with 32-bit indexing. Right?

Sign In or Register to comment.