XORO32 scrambler output — Parallax Forums

Chip,
A couple of times now I've wanted to feed another operation directly with a random number. But in both cases it has been a D operand so XORO32 couldn't do it without an intermediate MOV.

A variation of XORO32 that feeds the D port instead of the S port would be nice to have.

Comments

  • cgracey Posts: 14,152
    evanh wrote: »
    Chip,
    A couple of times now I've wanted to feed another operation directly with a random number. But in both cases it has been a D operand so XORO32 couldn't do it without an intermediate MOV.

    A variation of XORO32 that feeds the D port instead of the S port would be nice to have.

    Okay. Let me look into it....

    It could be done. Could you give me an example of how you would use it?
  • evanh Posts: 15,915
    WRLONG was the previous one I think. But just now I was wanting to use SETDACS, albeit just for diagnostics.
  • evanh Posts: 15,915
    edited 2019-01-13 10:18
    With something like ADD it gets really interesting because it becomes a three-operand arrangement, with the ALU result port retaining the specified D address of the ADD instruction.
  • evanh Posts: 15,915
    Heh, oh, there's something to ponder for Prop3 architecture. Have a bit-field in all opcodes just for specifying if the ALU result goes to specified D or to next instruction's D input.
  • cgracey Posts: 14,152
    edited 2019-01-13 10:36
    evanh wrote: »
    Heh, oh, there's something to ponder for Prop3 architecture. Have a bit-field in all opcodes just for specifying if the ALU result goes to specified D or to next instruction's D input.

    That's a really neat idea.

    As far as having XORO32 report to D, can you give me a more compelling case? It takes more logic, so it needs to be worth it. XORO32 is an oddball instruction in how it works and it's really nice that it reports to one of D or S. I think S is way more useful if you could pick only D or S, and the S circuit exploits the same path that SCA/SCAS use. An alternate D path would take a whole new set of circuitry.
  • evanh Posts: 15,915
    edited 2019-01-13 10:44
    You know what, SCA can probably work better outputting to next D input itself. Maybe eliminate the S versions of both.
  • cgracey Posts: 14,152
    evanh wrote: »
    You know what, SCA can probably work better outputting to next D input itself. Maybe eliminate the S versions of both.

    But, then you couldn't do a multiply-accumulate:

    SCA X,Y
    ADD A,B

    The multiply result would get added to B and then written to A.

    The way it works now, B is ignored and the multiply result gets added into A.

    Am I missing something?
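
The two cases can be compared with a small Python model. This is only a sketch under stated assumptions: `sca`, `add_after_sca` and the `override` parameter are hypothetical names invented for illustration, and SCA is assumed to yield the upper 16 bits of the unsigned 16x16 product.

```python
MASK32 = 0xFFFFFFFF

def sca(d, s):
    """Assumed SCA behaviour: upper 16 bits of the unsigned 16x16 product."""
    return ((d & 0xFFFF) * (s & 0xFFFF)) >> 16

def add_after_sca(regs, dest, src, forwarded, override):
    """Model ADD dest,src when the preceding SCA result replaces one input.

    override="S": forwarded value replaces the S input (current behaviour).
    override="D": forwarded value replaces the D input (proposed behaviour).
    The sum is written to dest in both cases.
    """
    d_in = forwarded if override == "D" else regs[dest]
    s_in = forwarded if override == "S" else regs[src]
    regs[dest] = (d_in + s_in) & MASK32
    return regs[dest]

regs = {"X": 0x8000, "Y": 0x4000, "A": 100, "B": 7}
product = sca(regs["X"], regs["Y"])      # SCA X,Y

# ADD A,B under each scheme, each run on its own copy of the registers
s_way = add_after_sca(dict(regs), "A", "B", product, "S")  # A + product, B ignored
d_way = add_after_sca(dict(regs), "A", "B", product, "D")  # product + B, old A ignored
```

In this model the S way accumulates into A, while the D way computes product + B and overwrites A, which is the difference described above.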
  • TonyB_ Posts: 2,178
    edited 2019-01-14 00:31
    XORO32 is a compound instruction that takes four cycles if both state and PRN are to be updated:
    	XORO32	state
    	ANYDS	PRN,0-0
    
    where ANYDS is any D,S instruction (not just a MOV) with S replaced by XORO32 PRN output.

    It might be possible to skip PRNs as follows:
    	XORO32	state
    	XORO32	state		' PRN in S field ignored?
    	ANYDS	PRN,0-0
    

    One of the C or Z opcode bits in XORO32 could be used to specify which of S or D is replaced by the PRN in the next instruction, but is it really worth it?
  • evanh Posts: 15,915
    cgracey wrote: »
    SCA X,Y
    ADD A,B

    The multiply result would get added to B and then written to A.

    The way it works now, B is ignored and the multiply result gets added into A.
    "B is ignored" is the key to it working just as well for A too. ADD A,A is a perfectly fine solution there. And could be considered tidier looking even.

  • evanh Posts: 15,915
    edited 2019-01-13 20:32
    TonyB_ wrote: »
    	XORO32	state
    	XORO32	state		' PRN in S field ignored?
    	ANYDS	PRN,0-0
    
    Correct, the first output is discarded. I have used that to "jump" in testing.

  • evanh wrote: »
    cgracey wrote: »
    SCA X,Y
    ADD A,B

    The multiply result would get added to B and then written to A.

    The way it works now, B is ignored and the multiply result gets added into A.
    "B is ignored" is the key to it working just as well for A too. ADD A,A is a perfectly fine solution there. And could be considered tidier looking even.

    Wouldn't it stay as ADD A,B?

    Replacing the S field has at least two drawbacks:

    1. D must be set to something beforehand for many D,S instructions, probably most, requiring an extra instruction.
    2. An immediate or register constant cannot be used, again adding an extra instruction for these cases.

    It's not clear cut that replacing S is better than D. In fact, replacing D might be the best option overall, assuming that not reading D but writing D is no more complicated than not reading S.

    GET/SET/ROLxxxx wouldn't work as they do now, however.
  • evanh Posts: 15,915
    Yeah, that's the basis of thinking for overriding next instruction's D input instead of S input.

    As for why I said "ADD A,A", that is because SCA is intended for multiply-accumulate function, and the ADD is the accumulate part. To do that the register to be accumulated to has to be both the input and output. D would normally provide exactly that. But if SCA overrides D input, leaving D output intact, then S input can be pointed to the same register to achieve an accumulator.

    ALTxx + GET/SET/ROLxxxx are their own case. Same for all ALTx instructions.
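
The `ADD A,A` argument above can be checked with a toy Python model. It is a sketch only: `add_with_override` and `accumulate` are hypothetical helpers, and the forwarded value simply stands in for whatever SCA would produce.

```python
def add_with_override(regs, dest, src, forwarded, override):
    """Model ADD dest,src with one input replaced by a forwarded value.

    override="S" replaces the S input, override="D" replaces the D input.
    The result is always written to dest.
    """
    d_in = forwarded if override == "D" else regs[dest]
    s_in = forwarded if override == "S" else regs[src]
    regs[dest] = (d_in + s_in) & 0xFFFFFFFF

def accumulate(override, product=50, start=1000):
    """Run ADD A,A after a (modelled) SCA and return the new accumulator."""
    regs = {"A": start}
    add_with_override(regs, "A", "A", product, override)
    return regs["A"]

same_s = accumulate("S")   # A (D input) + product (S overridden)
same_d = accumulate("D")   # product (D overridden) + A (S input)
```

Both calls give the same accumulator value in this model, since whichever operand is not overridden still reads A.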
  • cgracey Posts: 14,152
    I'm confused. What's the consensus here?
  • evanh Posts: 15,915
    My interpretation is Tony was running over the pros and cons of overriding D inputs vs S inputs and concluded that D always seemed best as a generalisation. The one exception is that example above where one wanted to rapidly iterate XORO32 on its own.

    SCA, since it'll create the three operand arrangement, might even gain a new ability by changing to overriding the next instruction's D input.
  • evanh Posts: 15,915
    edited 2019-01-14 21:53
    I suppose that's the one advantage overriding the S input does have: overriding the D input destroys the fastest case of self-recursion.
    EDIT: Correction, overriding the D input destroys the fastest case of self-recursion only when it's a single operand instruction like XORO32.

    EDIT2: Hmm, was just trying to picture recursing ADD with an override D but that doesn't have any advantage over a regular ADD, it's too simple an instruction. XORO32 is unique in this way I think.
  • evanh Posts: 15,915
    In the Prop2:
    MOV  pa,raw
    ADD  pa,offset
    MUL  pa,scale
    CALL  #graph
    
    In the Prop3, it becomes:
    ADD  raw,offset   WQ
    MUL  pa,scale
    CALL  #graph
    
    :D That is cool! Covers the very reason why three-operand instructions ever existed while also eliminating any need for superscalar "move elimination".
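
A rough Python model of the two sequences above, as a sketch only. The `WQ` flag is the hypothetical Prop3 idea from this thread, taken here to mean "do not write D, forward the ALU result to the next instruction's D input"; MUL is assumed to be an unsigned 16x16 multiply, and the function names are invented for illustration.

```python
MASK32 = 0xFFFFFFFF

def mul16(d, s):
    """Assumed MUL behaviour: unsigned 16x16 multiply of the low words."""
    return ((d & 0xFFFF) * (s & 0xFFFF)) & MASK32

def prop2_sequence(raw, offset, scale):
    pa = raw                        # MOV pa,raw
    pa = (pa + offset) & MASK32     # ADD pa,offset
    pa = mul16(pa, scale)           # MUL pa,scale
    return pa, raw                  # raw is left untouched

def prop3_sequence(raw, offset, scale):
    forwarded = (raw + offset) & MASK32  # ADD raw,offset WQ: result not written to raw
    pa = mul16(forwarded, scale)         # MUL pa,scale: D input replaced by forwarded value
    return pa, raw                       # raw is left untouched

p2_result = prop2_sequence(10, 5, 3)
p3_result = prop3_sequence(10, 5, 3)
```

In this model both sequences leave the same value in pa and leave raw unchanged, with the second one needing no MOV.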
  • TonyB_ Posts: 2,178
    edited 2019-01-15 00:48
    cgracey wrote: »
    I'm confused. What's the consensus here?

    My change of mind has probably caused some of the confusion and apologies if so.
    evanh wrote: »
    My interpretation is Tony was running over the pros and cons of overriding D inputs vs S inputs and concluded that D always seemed best as a generalisation. The one exception is that example above where one wanted to rapidly iterate XORO32 on its own.

    SCA, since it'll create the three operand arrangement, might even gain a new ability by changing to overriding the next instruction's D input.

    Skipping PRNs by following one XORO32 with another was not intended as a serious practical suggestion.

    I now think that P2 rev B would be improved if the SCA/SCAS/XORO32 instructions were changed so that the output is used as the next instruction's D value.

    SCA/SCAS would work just as well as now (better actually) and XORO32 would benefit even more. Code would be smaller and faster for most cases. No need for both S and D options, just the latter.

    If it's too late or D is too difficult, we could live with S.
  • May I suggest that it would be a really, really good idea to minimize changes in the instruction set. We're trying to get tools developed and used now, and instruction set changes are just going to screw up any libraries that users develop on the P2 eval board. Not to mention that requiring changes in the tools, and having the tools support both revs of silicon, are going to be more difficult the more changes that get made.

    At some point the P2 needs to be labelled as "done"!!
  • evanh Posts: 15,915
    Everyone is sensitive to that concern, Eric.

    I wouldn't have raised the topic if I thought it would have a significant impact on libraries. Or any impact for that matter. I'd be surprised if anyone has built a library around XORO32 or SCA yet.

  • For the SCA+ADD case, if you always write `ADD A, A` in code intended for the P2-ES, which does it the S way, your code would still work on the next silicon revision if it were changed to do it the D way.
  • For the SCA+ADD case, if you always write `ADD A, A` in code intended for the P2-ES, which does it the S way, your code would still work on the next silicon revision if it were changed to do it the D way.

    Yes and the D way would give you the extra functionality of ADD A,B. SCA/SCAS could have identical code for P2 revs A and B, but that would not apply to XORO32, e.g. writing the PRN to cog RAM:
    	XORO32	state		' PRN = S in next instruction
    	MOV	PRN,0
    ' or
    	XORO32	state		' PRN = D in next instruction
    	ADD	PRN,#0
    

    However, the D way would allow writing the PRN to LUT/HUB RAM or to DACs or bit testing with the minimum possible coding and often XORO32 would be only a net two cycle instruction.

    I'm a bit sceptical that a future P3 would cater for both S and D, so the decision made now is likely to be permanent.
  • Electrodude Posts: 1,657
    edited 2019-02-11 16:48
    Can single-operand assembler aliases for `MOV x, x` (to receive the result of XORO32) and `ADD x, x` (to receive the result of SCA) be added to encourage people to write forward-compatible code in case this change is ever made?
  • TonyB_ Posts: 2,178
    edited 2019-02-11 18:51
    Can single-operand assembler aliases for `MOV x, x` (to receive the result of XORO32) and `ADD x, x` (to receive the result of SCA) be added to encourage people to write forward-compatible code in case this change is ever made?

    But would MOV x,x actually work as intended after XORO32 if the PRN is in the D field?
  • evanh Posts: 15,915
    Good point, Tony. MOV doesn't use the D operand as an input.
    ADD dest, #0 would do the equivalent job. Of course, this scrappy-looking code line makes Electrodude's point more urgent.
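
Why MOV and ADD behave differently here can be sketched in Python. This is a toy model, not the real pipeline: `execute` is a hypothetical helper, the PRN value is arbitrary, and the only property relied on is that MOV never reads its D input while ADD reads both inputs.

```python
MASK32 = 0xFFFFFFFF

def execute(op, d_in, s_in, forwarded, override):
    """Return the value written to D when one input is replaced by `forwarded`."""
    if override == "D":
        d_in = forwarded
    else:
        s_in = forwarded
    if op == "MOV":
        return s_in & MASK32            # MOV ignores its D input
    if op == "ADD":
        return (d_in + s_in) & MASK32
    raise ValueError(op)

PRN = 0x1234ABCD   # stand-in for the XORO32 output
old = 0x55         # previous contents of the destination register

mov_s = execute("MOV", old, 0, PRN, "S")    # MOV x,0-0 with S overridden: PRN captured
mov_d = execute("MOV", old, old, PRN, "D")  # MOV x,x with D overridden: PRN never read
add_d = execute("ADD", old, 0, PRN, "D")    # ADD x,#0 with D overridden: PRN + 0
add_s = execute("ADD", old, 0, PRN, "S")    # ADD x,#0 with S overridden: old + PRN
```

In this model `MOV` only captures the PRN the S way and `ADD x,#0` only captures it the D way, so neither form gives the plain PRN on both schemes.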
  • You're right. On thinking about it more, all I could come up with that would work both ways for XORO32 are tricks equivalent to your `ADD dest, #0` that, while they are sufficiently random, aren't what XORO32 is supposed to output.

    If it's OK for it to compile differently for different chips, the alias could translate to `MOV x, #0` on the current chip and `ADD x, #0` if it's changed.
  • evanh Posts: 15,915
    Yeah, it's not really an alias but, as with the AUGx instruction, the assembler can use a difference in the mnemonic's parameters to assemble different instruction combinations. Existing code can stay as is. E.g.:
    		xoro32	state, result
    
    'with S port insertion, could reassemble to
    		xoro32	state
    		mov	result, 0-0
    
  • evanh Posts: 15,915
    And:
    		xoro32	state, result
    
    'with D port insertion, could reassemble to
    		xoro32	state
    		add	result, #0
    
  • evanh Posts: 15,915
    edited 2019-02-11 23:54
    I suppose that is still an alias, just composed of multiple instructions is all. Make it user definable and it becomes a macro. :)
  • As pointed out by Seairth here, GETXACC would also be affected by changing from S to D, but it would act in a similar way as XORO32. Putting Goertzel X into D and Y into the next D has a nice symmetry and overall D is better anyway, I think.