ALTDS replaced with ALTI/ALTR/ALTD/ALTS - Page 2 — Parallax Forums

ALTDS replaced with ALTI/ALTR/ALTD/ALTS

Comments

  • cgracey Posts: 14,152
    David Betz wrote: »
    78rpm wrote: »
    cgracey wrote: »
    If you look at the opcodes, there's no more bits to use for something like 'indirect'.

    It either has to be via special registers, or with a prefix instruction like I've done here.

    I am being very non-serious, but my isr is used as a code example in this thread. Consider changing ram size from 32 to 33-1/3 bits per long, the extra bit and a bit will facilitate long play. :crazy: :clown: :crazy:

    If we did that wouldn't we have to also switch from digital to analog?

    Not if we go with direct servo drive.
  • cgracey wrote: »
    David Betz wrote: »
    78rpm wrote: »
    cgracey wrote: »
    If you look at the opcodes, there's no more bits to use for something like 'indirect'.

    It either has to be via special registers, or with a prefix instruction like I've done here.

    I am being very non-serious, but my isr is used as a code example in this thread. Consider changing ram size from 32 to 33-1/3 bits per long, the extra bit and a bit will facilitate long play. :crazy: :clown: :crazy:

    If we did that wouldn't we have to also switch from digital to analog?

    Not if we go with direct servo drive.
    Powered by a flux capacitor no doubt.

  • Only some food for thought...

    Since ALTR, ALTD and ALTS are "prefixing" instructions, isn't it possible to "extend" at least some of the instructions that were "prefixed", by adding "extra" GOx stage(s) to them, thus gaining enough time for the addition(s) to happen?

    I know the way I've described above will not be exactly how they'd perform. It's more likely there would be two decode paths for the same instruction, discriminated by selectors resulting from the prefixing instructions that were executed before them, but I'm sure you understand what I meant.

  • cgracey Posts: 14,152
    Yanomani wrote: »
    Only some food for thought...

    Since ALTR, ALTD and ALTS are "prefixing" instructions, isn't it possible to "extend" at least some of the instructions that were "prefixed", by adding "extra" GOx stage(s) to them, thus gaining enough time for the addition(s) to happen?

    I know the way I've described above will not be exactly how they'd perform. It's more likely there would be two decode paths for the same instruction, discriminated by selectors resulting from the prefixing instructions that were executed before them, but I'm sure you understand what I meant.

    That would be possible, and it would make the instructions one cycle longer. However, it would mean making a MULTICYCLE assignment, which opens a Pandora's box of other, necessitated MULTICYCLE assignments. I used to have parts of the chip settle over two clocks, but the ins and outs of making every requisite MULTICYCLE assignment were getting to be way too complicated and I was not 100% confident that I had covered every case. The trouble is, logic flows all over the place, in ways you don't realize at first. Unless you go from only one set of flops to another, which is actually not practical in a design where signals are borrowed all over, you may never get it straight.

    A few months ago, I got rid of all the MULTICYCLE paths in the Prop2 and inserted some flops in key places, so that the tool could resolve all timing, based on one constant, everywhere. I only suffered a small Fmax reduction, but now I KNOW that I haven't left anything out. So, I don't want to go back to the MULTICYCLE abyss.
  • Thanks, Chip, for showing why it couldn't follow that path.
  • Electrodude Posts: 1,657
    edited 2015-11-04 04:18
    What if you change the immediate version of ALT{R,D,S} to act like PTRx? In other words, when bit 8 of the immediate source is set, D will be affected in the same way the address of a "RDLONG x, #%1_SUPNNNNN" would be affected. Then, you could also use PTRx (which would now just be a normal register) for stuff like LUT and cogram and anything else involving things like array[x += y].
            ALTS    myptr, #%1SUPNNNNN
            RDLONG  x,     0-0
    
    myptr   res 1
    

    EDIT: This wouldn't work for RDLONG, but I still think it would be useful in other cases.
  • evanh Posts: 15,915
    Electrodude,
    I don't think there is any gain there, because the ALT prefixes are always a full accumulate from S to D. That's very flexible. The PTRA/B based operations are far more limited in inputs, so they need the special limited indexing instead.
  • evanh Posts: 15,915
    In other words, I think the new arrangement totally covers everything that PTRA/B can do, and more. The sacrifice is that it's a little bigger and slower.
  • evanh Posts: 15,915
    edited 2015-11-04 11:26
    Oops, I'm flat wrong. The new ALTR/ALTD/ALTS instructions don't accumulate at all. It's just an add.

    However, the ALTI instruction maintains the old ALTDS functionality which could increment similarly to the PTRx operations. So, we may still have an equivalent ... not sure, I remember it being difficult to handle the S fields which is where the HubOps address HubRAM from.
  • evanh Posts: 15,915
    Electrodude,
    Why don't you think your suggestion will work for a RDLONG?
  • cgracey wrote: »
    In the next release...

    What was ALTDS is now ALTI.

    There are three new instructions that share the opcode space with ALTI (no more C/Z writing options, as they were meaningless for these instructions):

    ALTR D,S/# - use the sum of D and S/# for the result register in the next instruction
    ALTD D,S/# - use the sum of D and S/# for the D register in the next instruction
    ALTS D,S/# - use the sum of D and S/# for the S register in the next instruction

    The idea is that D is an offset and S/# is a base:

    ALTx offset,#base

    So, now we'll have simple-to-use instructions for R/D/S alterations.

    This cleans up 78rpm's task switcher.

    OLD CODE
    task_switcher   ' this is the task switcher isr
            ' save current task
                    addct1  task_time, ##TASKS_TIMER    
                    mov     modify, curr_task_index
                    add     modify, #task_ctrl_blk
                    altds   modify, #%000_100_000           ' replace D reg
                    mov     0-0, IRET1                      ' tcb[ task index]
                    add     curr_task_index, #1             ' next task
                    and     curr_task_index, #$3            ' in range 0 - 3
            ' set new task
                    mov     modify, curr_task_index
                    add     modify, #task_ctrl_blk
                    altds   modify, #%000_000_100           ' replace S reg
                    mov     IRET1, 0-0                      ' tcb[ task index]
    
                    reti1
    

    NEW CODE
    task_switcher   addct1  task_time,##TASKS_TIMER         'set next interrupt time
    
                    altd    curr_task_index,#task_ctrl_blk  'save current task
                    mov     0,IRET1
    
                    incmod  curr_task_index,#3              'inc/reset task pointer
    
                    alts    curr_task_index,#task_ctrl_blk  'set new task
                    mov     IRET1,0
    
                    reti1                                   'execute new task
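
For anyone who wants to poke at the base-plus-offset idea without hardware, here is a rough Python model of the NEW CODE sequence. It is only a sketch under stated assumptions: the register addresses (`TCB`, `IDX`, `IRET`) are made-up placeholders rather than real P2 addresses, the helper names are hypothetical, the ALTx sum is assumed to wrap to the 512-entry cog register space, and INCMOD is assumed to reset to 0 once the register equals its limit.

```python
COG_SIZE = 512  # assumed: cog register space of 512 longs


def alt_sum(cog, d_addr, s_value):
    """Register address an ALTx prefix substitutes: (cog[D] + S), wrapped."""
    return (cog[d_addr] + s_value) % COG_SIZE


def incmod(cog, d_addr, limit):
    """Increment cog[D]; reset to 0 once it has reached the limit (assumed)."""
    cog[d_addr] = 0 if cog[d_addr] == limit else cog[d_addr] + 1


def task_switch(cog, idx_addr, tcb_base, iret_addr, num_tasks=4):
    # altd curr_task_index,#task_ctrl_blk  /  mov 0,IRET1
    cog[alt_sum(cog, idx_addr, tcb_base)] = cog[iret_addr]
    # incmod curr_task_index,#3
    incmod(cog, idx_addr, num_tasks - 1)
    # alts curr_task_index,#task_ctrl_blk  /  mov IRET1,0
    cog[iret_addr] = cog[alt_sum(cog, idx_addr, tcb_base)]


# Placeholder layout for illustration only, not real P2 register addresses.
TCB, IDX, IRET = 0x100, 0x110, 0x120
cog = [0] * COG_SIZE
cog[TCB:TCB + 4] = [0x1000, 0x2000, 0x3000, 0x4000]  # saved resume addresses
cog[IRET] = 0x1234  # where task 0 was interrupted

task_switch(cog, IDX, TCB, IRET)
print(hex(cog[TCB]), cog[IDX], hex(cog[IRET]))  # 0x1234 1 0x2000
```

The point of the model is that the prefix does the `base + offset` add itself, so the two `mov`/`add` setup instructions in the OLD CODE disappear.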
    


    It may be too late in the game to do this, but I will throw this idea into the pot here.

    There are pros and cons with what I suggest here.

    Reduce the CCCC conditional execution field from 4 bits to 2 bits. This allows the IF_Z and IF_C combination fields to still be utilised directly.

    Introduce a FLAGS / COND / CONDREP / STATUS / CHECK instruction which performs the condition check required and sets a counter, in a similar fashion to REP, to specify that the next sequence of instructions is an extended conditional, i.e., IF_BE, which is IF_C | IF_Z == 1.
    
    	if_z	jmp	#elsewhere		' IF_Z and variants still function for speed
    
    '		FLAGS	#sequence, #condition
    
    		flags	#.cond_seq, IF_BE
    		add	my_value, #1		' only executed IF_BE
    		shl	my_value, #3		' only executed IF_BE
    .cond_seq					' end of conditional sequence
    		add	my_value, #1		' this is always executed, outside of conditional
    	if_c	jmp	#somewhere
    		flags	#.old_code, IF_NEVER
    		shl	my_value, #4		' this is never executed.
    .old_code
    

    The FLAGS / COND / CONDREP / STATUS / CHECK instruction has the same attributes as the REP instruction. It inhibits interrupts until complete, and it is cancelled when a branch is executed.

    PNut and other assemblers can generate an error if an extended conditional, i.e. IF_BE, is specified without an enclosing FLAGS specifier.


    The 2 new bits from the CCCC field are for pointer selection, but first we rename:
    	adra to ptrc
    	adrb to ptrd
    

    This allows ptra and ptrb to continue to be associated with loc, calla/b, reta/b, pusha/b, and popa/b. In HLL use, one may be used for the stack pointer and the other for the stack frame pointer.

    Ptrc and ptrd are used with all other instructions.


    The pointer select bits indicate whether ptrc and/or ptrd are used in the instruction, in a similar manner to RDLONG etc., but using ptrc and ptrd instead. This allows pointer operation on all instructions, with 1 bit for the S reg and 1 bit for the D reg. When specified, the pattern in S or D is similar to the current RDLONG etc. pattern, 1SUPnnnnn. However, we do not need the always-1 bit, so for our 9 bits we get SUPnnnnnn.
    	S selects ptrc or ptrd
    	U specifies pointer update
    	P specifies pointer post modification
    	nnnnnn = index -32 to 31
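
To make the proposed 9-bit SUPnnnnnn layout concrete, here is a small Python decoder. It is a sketch of the proposal only, not of any existing P2 encoding. Assumptions: the bits are packed in the order written (S in bit 8, U in bit 7, P in bit 6), the index is 6-bit two's complement, and S = 0 picks ptrc while S = 1 picks ptrd (the post does not say which way round). `PtrField` and `decode_ptr_field` are hypothetical names.

```python
from typing import NamedTuple


class PtrField(NamedTuple):
    use_ptrd: bool  # S bit: assumed 0 = ptrc, 1 = ptrd
    update: bool    # U bit: pointer is updated
    post: bool      # P bit: post modification
    index: int      # nnnnnn as a signed value, -32..31


def decode_ptr_field(field):
    """Split a 9-bit SUPnnnnnn value into its parts (assumed bit order)."""
    if not 0 <= field < 512:
        raise ValueError("field must fit in 9 bits")
    raw = field & 0x3F
    index = raw - 64 if raw & 0x20 else raw  # sign-extend the 6-bit index
    return PtrField(
        use_ptrd=bool((field >> 8) & 1),
        update=bool((field >> 7) & 1),
        post=bool((field >> 6) & 1),
        index=index,
    )


print(decode_ptr_field(0b1_1_0_111111))
```

Dropping the always-1 bit is what buys the sixth index bit, which is where the -32 to 31 range comes from.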
    

    As pointers can now be specified for all instructions, we can now do:
    		loc	ptrc, #my_long_array	' array or long
    		loc	ptrd, #my_structure	' a data array structure
    		rep	#.loop, #num_entries
    		mov	ptrc++, ptrd[8]
    		add	ptrd, #sizeof_structure ' next element of array structure
    .loop
    

    It may be possible to remove the RDLONG etc instructions externally, or keep them to show explicit Cog/Hub movement, if address resolution in the Cog can operate on a move:
      address bits b31:b9 == 0, then cog
      address bits b31:b11 == 0 and not cog address, then lut
      not cog and not lut then hub.
    
    So the Verilog internally can deduce that mov #$100, #$408 would be hub to cog move and execute accordingly.
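
The three address rules can be written out as a short Python function to check which region a given address would select. This is a sketch of the rules exactly as posted, with a hypothetical function name, and it treats addresses as unsigned 32-bit values. One thing it shows: with these bit ranges, $408 lands in the lut window ($200 to $7FF), so an address of $800 or above would be needed for the move to come from hub.

```python
def classify_address(addr):
    """Pick cog, lut or hub using the bit tests from the proposal."""
    addr &= 0xFFFFFFFF      # treat as an unsigned 32-bit address
    if addr >> 9 == 0:      # bits 31:9 all zero  -> $000..$1FF
        return "cog"
    if addr >> 11 == 0:     # bits 31:11 all zero -> $200..$7FF
        return "lut"
    return "hub"            # everything else


for a in (0x100, 0x1FF, 0x200, 0x408, 0x7FF, 0x800):
    print(hex(a), classify_address(a))
```

A side effect of decoding this way is that hub addresses below $800 would not be reachable through a plain mov.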


    So what do we gain from this:
      Four flexible pointers instead of two, with double the index range.
      No additional Special Registers.
      Possibly freeing or reallocating instruction opcodes for RDLONG etc externally
    

    What do we lose:
      Conditional execution, except for the most basic Z and C cases, must be placed in a block.
    

    Logic wise:
      An additional counter for the FLAGS / COND execution.
      An additional indexing logic / circuit per cog.
      Additional logic to prohibit interrupts whilst the conditional block is executing.
    

    Also, can this be done without slowing the clock speed?
  • evanh Posts: 15,915
    Ugh! I know I can't talk but ... New topic please!
  • evanh wrote: »
    Electrodude,
    Why don't you think your suggestion will work for a RDLONG?

    Reposting code for reference:
            ALTS    myptr, #%1SUPNNNNN
            RDLONG  x,     0-0
    
    myptr   res 1
    

    That would not act like PTRx. It would instead make RDLONG get the address from a different register each time.

    Using "RDLONG x, #0-0" instead wouldn't work either (when address & $100 <> 0), because then you would be doing indirect PTRx configuration or something silly that wouldn't make any sense (quadratically increasing addresses?).
  • evanh Posts: 15,915
    Oh gee, it's a mind bender. Bugger, all those ideas are out the door. :(