INDA/INDB alternative? - Parallax Forums

INDA/INDB alternative?

SeairthSeairth Posts: 2,474
edited 2013-11-22 11:59 in Propeller 2
What if INDA/INDB were changed as follows:
  • Make them 11-bit registers, where the two MSBs have the same encodings as found in the COND bits, and the nine LSBs are the same as before.
  • Add an instruction (or two) to set the two MSBs.

The thought here is that this would allow COND bits to be used with INDA/INDB instructions. This is all predicated on the idea that code using INDA/INDB will typically be mutating those registers in a consistent manner (e.g. iterating over an array), so mode changes (e.g. switching from pre-increment to post-increment) would be relatively infrequent. This would also simplify the pipeline processing a bit, which can only be a good thing. Note that this would also require the SETINDx opcode to be moved to the extended instruction set and broken into two separate instructions. It might be possible to use the Z/C flags to also set the two-bit mode within the same instruction, since that's when you're likely to do a mode change anyhow.
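As a rough illustration of the register layout proposed above, here is a minimal Python sketch. It assumes the two mode bits sit in bits 10..9 and the address in bits 8..0; the function names are hypothetical, and the meaning of each mode value is left open because the post does not spell it out.

```python
# Hypothetical model of the proposed 11-bit INDA/INDB register:
# bits 10..9 hold a 2-bit mode, bits 8..0 hold the 9-bit register address.
ADDR_BITS = 9
ADDR_MASK = (1 << ADDR_BITS) - 1  # 0x1FF
MODE_MASK = 0b11


def pack_ind(mode, addr):
    """Combine a 2-bit mode and a 9-bit address into one 11-bit value."""
    if not 0 <= mode <= MODE_MASK:
        raise ValueError("mode must fit in 2 bits")
    if not 0 <= addr <= ADDR_MASK:
        raise ValueError("address must fit in 9 bits")
    return (mode << ADDR_BITS) | addr


def unpack_ind(value):
    """Split an 11-bit value into (mode, address)."""
    return (value >> ADDR_BITS) & MODE_MASK, value & ADDR_MASK


def set_mode(value, mode):
    """Model of the proposed instruction that rewrites only the two MSBs."""
    _, addr = unpack_ind(value)
    return pack_ind(mode, addr)
```

For example, `set_mode(pack_ind(0b10, 0x1F6), 0b01)` changes the mode while leaving the nine address bits untouched.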

Comments

  • SeairthSeairth Posts: 2,474
    edited 2013-11-19 09:38
    Hmm. Of course, I've already thought of a counter-example (and a rather obvious one, I'm sad to say): JMPRET INDA,++INDA WZ,WC (i.e. TASKSW). I suppose you could make the registers 13-bit, allowing separate d-field and s-field modes.
  • cgraceycgracey Posts: 14,206
    edited 2013-11-19 17:01
    Seairth wrote: »
    Hmm. Of course, I've already thought of a counter-example (and a rather obvious one, I'm sad to say): JMPRET INDA,++INDA WZ,WC (i.e. TASKSW). I suppose you could make the registers 13-bit, allowing separate d-field and s-field modes.

    The big problem is that all these INDA/INDB pointer usages and modifications occur very early in stage 2 of the pipeline, whereas conditional execution takes place in stage 4. You couldn't know, two instructions in advance, which INDA/INDB modifications shouldn't occur in stage 2, in order to maintain the correct INDA/INDB values. Also, register addresses must be resolved in stage 2.

    Having said that, I got initially excited about your proposal. I had to try it out in my head before I recognized the old, familiar problem that I've faced a dozen times, already. Working on Prop2 is becoming, in some ways, like being stuck in a "fun house" with warped mirrors covering every wall. I can say I'm almost out, though.
  • SeairthSeairth Posts: 2,474
    edited 2013-11-19 18:35
    Ah, that's a good point. Even if you could delay the update of INDA/INDB to stage 4, you would have to put two instructions between sequential uses of INDA/INDB (when in incrementing/decrementing modes). In that case, I suppose you could propagate the updated INDA/INDB back through the pipeline like you do for the d-field, but that might still mean that you have a one-instruction gap requirement. A two-instruction gap might still be worth considering, though. We already expect a similar rule for self-modifying code. This would at least ensure that all side-effects are strictly limited to stage 4.


    NOTE: your comments made me realize that, with the current design, one must be very careful to avoid using INDA/INDB in the next two (or maybe just one?) instructions that immediately follow a non-delayed branch instruction. Otherwise, even though the instruction itself is *never* meant to be executed prior to the branch (like would happen with a delayed branch instruction), the INDA/INDB registers would still be updated due to the pipeline behavior.
  • rjo__rjo__ Posts: 2,114
    edited 2013-11-19 18:53
    Chip,

    The operative word is "fun." The entire Prop2 experience is fun and is about to become "funnest." I only wish you had been born 20 years earlier… or that I had been born twenty years later.


    Rich
  • cgraceycgracey Posts: 14,206
    edited 2013-11-19 20:16
    Seairth wrote: »
    NOTE: your comments made me realize that, with the current design, one must be very careful to avoid using INDA/INDB in the next two (or maybe just one?) instructions that immediately follow a non-delayed branch instruction. Otherwise, even though the instruction itself is *never* meant to be executed prior to the branch (like would happen with a delayed branch instruction), the INDA/INDB registers would still be updated due to the pipeline behavior.

    You're right, that WAS a problem, but I fixed it early on. I added an extra bit to each pipeline stage that signals whether or not that stage is still valid, regardless of the 4-bit condition field. So, those instructions trailing branches that get cancelled don't wreak the havoc that you might suppose.
  • cgraceycgracey Posts: 14,206
    edited 2013-11-19 20:24
    rjo__ wrote: »
    Chip,

    The operative word is "fun." The entire Prop2 experience is fun and is about to become "funnest." I only wish you had been born 20 years earlier… or that I had been born twenty years later.


    Rich

    I hope a lot of people wind up feeling that way. In this day, there seems to be an attitude that 'implementation' is a bad word. Everything should be written in a high-level language and nobody should ever have to get to know anything about any hardware, unless maybe VHDL or Verilog defines it.

    There will be specifics to learn and exploit to use the Prop2 well, and I get a big charge out of that kind of thing. You and I think it's fun. I hope the world isn't too spoiled, already, to enjoy it, also. The implication of the other mode of thought is that a processor might as well be designed only to run compiled code efficiently, and an assembly-language programmer's perspective is utterly irrelevant.

    This chip is made to be enjoyed in assembly language, first.
  • ozpropdevozpropdev Posts: 2,793
    edited 2013-11-19 20:44
    cgracey wrote: »
    This chip is made to be enjoyed in assembly language, first.

    I think your quote Chip will become an HISTORIC one! :)

    It's the PRIMARY reason I use propeller chips!

    Fun mixed with power, a rare combination.
  • potatoheadpotatohead Posts: 10,261
    edited 2013-11-19 22:05
    Seconded. PASM brought back the fun in this stuff for me. Rock on Chip!

    Can't wait. I've been holding off until this update is done. Seems like it's a final kind of thing, so it will be time to dig in, unlearn some stuff, then go forward on what is now to be the P2!
  • Cluso99Cluso99 Posts: 18,069
    edited 2013-11-19 22:25
    ozpropdev wrote: »
    I think your quote Chip will become an HISTORIC one! :)

    It's the PRIMARY reason I use propeller chips!

    Fun mixed with power, a rare combination.
    I could not have said it any better than this.

    I love the regular P1 instruction set. P2 is of course more complex, but with the new mods, the P2 set is still quite regular. And when I don't need the speed - there is almost always a section of main code that doesn't require speed - I use spin.

    Just today, a project that is almost shipping requires an extra mod - I have just added another cog (I have a few spare in my 3 prop solution :) ) to do that part in assembler with almost no changes to the main code. However, if it was on a single micro, it would have been a big mess to do!
  • SeairthSeairth Posts: 2,474
    edited 2013-11-20 03:46
    cgracey wrote: »
    You're right, that WAS a problem, but I fixed it early on. I added an extra bit to each pipeline stage that signals whether or not that stage is still valid, regardless of the 4-bit condition field. So, those instructions trailing branches that get cancelled don't wreak the havoc that you might suppose.

    Okay, so if the indirect registers are updated only when an instruction is shifting from stage 2 to 3 (and it's still valid), then you would have to make sure that there was at least one instruction between a non-delayed branch and the indirect-addressing instruction. Is that correct?
  • cgraceycgracey Posts: 14,206
    edited 2013-11-20 07:37
    Seairth wrote: »
    Okay, so if the indirect registers are updated only when an instruction is shifting from stage 2 to 3 (and it's still valid), then you would have to make sure that there was at least one instruction between a non-delayed branch and the indirect-addressing instruction. Is that correct?

    No. For stage 1, there is no flop yet. It's a logic signal. Thereafter, for stages 2 and 3, there are flops combined with logic after the Q output. So, the 'valid' flops propagate forward, but can be cancelled at any stage. Once an instruction is cancelled as it travels through the pipeline, it stays cancelled. Cancellation can happen in stages 1 through 3. If an instruction is still valid at stage 4, it was never cancelled, or never a trailing instruction behind a cancelling branch.
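    The "once cancelled, stays cancelled" behaviour described above can be sketched in a few lines of Python. This is only a toy model of the rule, not the actual flop-plus-logic circuit; the function names and the dictionary input are assumptions made for illustration.

```python
def advance_valid(valid_in, cancel_now):
    """Valid bit leaving a stage: it can be cleared, but never set again."""
    return valid_in and not cancel_now


def valid_at_stage4(cancel_at_stage):
    """Return True if an instruction arrives at stage 4 still valid.

    cancel_at_stage maps a stage number (1..3) to True when a cancellation
    is asserted while the instruction sits in that stage.
    """
    valid = True
    for stage in (1, 2, 3):
        valid = advance_valid(valid, cancel_at_stage.get(stage, False))
    return valid
```

    An instruction cancelled in stage 1 stays cancelled even if nothing else happens in stages 2 and 3, which is the property that protects instructions trailing a taken branch.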
  • SeairthSeairth Posts: 2,474
    edited 2013-11-20 09:17
    cgracey wrote: »
    No. For stage 1, there is no flop yet. It's a logic signal. Thereafter, for stages 2 and 3, there are flops combined with logic after the Q output. So, the 'valid' flops propagate forward, but can be cancelled at any stage. Once an instruction is cancelled as it travels through the pipeline, it stays cancelled. Cancellation can happen in stages 1 through 3. If an instruction is still valid at stage 4, it was never cancelled, or never a trailing instruction behind a cancelling branch.

    I think I am asking the question wrong. Suppose the following code:
    DJNZ counter, #loop
    MOV field, INDA++
    

    In the pipeline, you end up with the MOV in stage 3 when the DJNZ is executed in stage 4. Supposing that the branch occurs, the MOV would be cancelled, except the update of INDA itself had already occurred one clock cycle earlier. I don't see how the "valid" flops would help here. What am I missing?
  • cgraceycgracey Posts: 14,206
    edited 2013-11-20 16:00
    Seairth wrote: »
    I think I am asking the question wrong. Suppose the following code:
    DJNZ counter, #loop
    MOV field, INDA++
    

    In the pipeline, you end up with the MOV in stage 3 when the DJNZ is executed in stage 4. Supposing that the branch occurs, the MOV would be cancelled, except the update of INDA itself had already occurred one clock cycle earlier. I don't see how the "valid" flops would help here. What am I missing?

    The MOV gets cancelled at stage 3 and INDA gets put back to the appropriate prior value. There is circuitry which handles late cancellations for INDA/INDB, and it took me a couple of days to figure out the rules for it. The insurmountable problem is that a read address has to be produced at stage 2, and if another instruction higher in the pipeline later gets cancelled, that stage 2 address that was already issued is now known to be bad. There's no way to back up.
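    To make the "put back to the appropriate prior value" idea concrete, here is a hedged Python sketch. It is not the real circuit (whose rules are not given here); the class, its method names, and the simple 9-bit wrap are all assumptions chosen only to show early update plus undo on late cancellation.

```python
class IndPointer:
    """Toy pointer that is updated early (stage 2) and can be rolled back."""

    def __init__(self, value=0):
        self.value = value
        self._undo = []  # values saved before each speculative update, oldest first

    def speculative_post_increment(self):
        """Stage 2: issue the current address, then increment (like INDA++)."""
        issued = self.value
        self._undo.append(self.value)
        self.value = (self.value + 1) & 0x1FF  # assumed plain 9-bit wrap
        return issued

    def commit_oldest(self):
        """The oldest in-flight instruction reached stage 4 still valid."""
        self._undo.pop(0)

    def cancel_all(self):
        """A branch cancelled every in-flight update: restore the oldest saved value."""
        if self._undo:
            self.value = self._undo[0]
            self._undo.clear()
```

    In the DJNZ example, the MOV's `speculative_post_increment()` would run first, and `cancel_all()` would then restore the pointer when the branch is taken. The address already issued at stage 2 cannot be recalled, which matches the limitation described above.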
  • SeairthSeairth Posts: 2,474
    edited 2013-11-20 17:10
    I understand now! Thanks!
  • SeairthSeairth Posts: 2,474
    edited 2013-11-20 21:27
    First, a caveat: my Verilog knowledge is basically non-existent (I learned VHDL, and only the basics, even then!). So it is totally possible that what I'm proposing isn't even possible without extra clock cycles, complex state machines, etc. If nothing else, it might spur an idea/thought in someone who *is* more knowledgeable, so it still seems worth putting the idea out there. Now, with that said...

    Let's start with the current design (indirect-addressing instructions are encoded in the COND bits and execute unconditionally). Now, let's suppose that a local copy of INDA is propagated through Stages 2, 3, and 4. Now, suppose the following pseudo-code for Stage 2 of the pipeline:
    IF stage3.IsValid
        stage2.INDA = stage3.INDA
    ELSE IF stage4.IsValid
        stage2.INDA = stage4.INDA
    ELSE
        stage2.INDA = registers.INDA
    

    Note that this code is ALWAYS reading a value into stage2.INDA, even if Stage 2 doesn't contain an indirect-addressing instruction. The pseudo-code that I left out was the fetching of the value for the d-field/s-field and the increment/decrement update of stage2.INDA. I am assuming that this will work roughly the same as it does now, except for which set of INDA registers are being read/written.

    Further, suppose that:
    • registers.INDA is updated in Stage 4 (not Stage 2 or Stage 3)
    • registers.INDA is always updated if Stage4.IsValid evaluates to TRUE
    • registers.INDA is never updated if Stage4.IsValid evaluates to FALSE
    • registers.INDA is not updated if Stage 4 has an instruction that self-jumps (e.g. WAITxxx in multitasking mode)

    If there are no cancelled instructions in the pipeline, then the above code should result in Stage 2 getting the same INDA value as in the current approach (where the value is stored in Stage 2 of the prior cycle). If the instruction in Stage 4 (e.g. DJNZ) causes the instructions in Stage 3 and Stage 2 to be cancelled, then the next instruction would read "registers.INDA", which was updated in Stage 4 of the prior clock cycle. The modified values in this clock cycle's Stage 3 and Stage 4 would be ignored because those instructions were cancelled (in the prior clock cycle). Maybe a better example of this would be "JMPRET INDA, ++INDA". If that instruction reaches Stage 4 without being cancelled, it will ensure that any values in stage2.INDA and stage3.INDA will be ignored, while the value in stage4.INDA will be written to registers.INDA.
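    The selection rule and the Stage 4 write-back rules above can be written as a small runnable Python sketch. This only restates the pseudo-code under the stated assumptions; the `Stage` class and function names are made up for illustration, and the self-jump case is assumed to take precedence over the "always update when valid" rule.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    is_valid: bool  # False once the instruction in this stage is cancelled
    inda: int       # local copy of INDA carried along with the instruction


def select_stage2_inda(stage3, stage4, registers_inda):
    """Pick the newest INDA copy that belongs to a still-valid instruction."""
    if stage3.is_valid:
        return stage3.inda
    if stage4.is_valid:
        return stage4.inda
    return registers_inda


def commit_stage4(stage4, registers_inda, self_jump=False):
    """Return the new registers.INDA after Stage 4 finishes this cycle."""
    if stage4.is_valid and not self_jump:
        return stage4.inda
    return registers_inda
```

    With both later stages cancelled, `select_stage2_inda` falls back to the architectural register, which is the DJNZ case described above.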

    Unfortunately, multitasking would have to change slightly. When an instruction cancels the other instructions in the pipeline, it must cancel ALL instructions, even those in other tasks. This is due to the fact that the other tasks would be propagating potentially invalid INDA values, even if they weren't indirect-addressing instructions. This would also mean that multiple PCs would potentially have to be updated. Yes, I know that this would mean that one task would cause another task to stall, but it's not like we're writing deterministic code in that mode anyhow. Note, by the way, that a self-jumping instruction (e.g. WAITxxx), assuming it cancels the other instructions just like a branching instruction, should not encounter the issue discussed in the thread "Using WAITVID with INDA++ and multi-tasking + other observations" (thread 151114).


    I hope that was understandable. But I'm not done yet. Supposing that I'm not completely off base with the above approach, let's consider my original proposal: allow indirect-addressing instructions to be conditionally executed (see earlier posts for details). With the above approach, INDA would not be useful in conditional instructions, because the conditionally-skipped instruction would still cause registers.INDA to be updated (after all, the instruction was skipped, not cancelled). To get around this, we could make use of another technique that's already in the processor: self-jumping. In this case, however, the instruction would jump to PC+1. This would cause the remaining instructions in the pipeline to be cancelled, then reloaded. Of course, you would get a three-cycle penalty for this, but INDA would stay consistent. Not a bad trade-off, I think, for getting back full use of the COND bits.



    Okay, that's it for now. I realize it's a lot to chew over. If something doesn't make sense, make sure to ask for clarification. I'm fully aware that I may not have explained some of this very well. And I'm also fully aware that some of this just might not be possible (no matter how much I clarify my thoughts). Like I said, in the end, it might do nothing more than spur a better idea in the mind of someone who understands this stuff better than I do. And if that's all that I manage to do with this, that would be fantastic!
  • Cluso99Cluso99 Posts: 18,069
    edited 2013-11-20 21:39
    Seairth wrote: »
    Okay, that's it for now. I realize it's a lot to chew over. If something doesn't make sense, make sure to ask for clarification. I'm fully aware that I may not have explained some of this very well. And I'm also fully aware that some of this just might not be possible (no matter how much I clarify my thoughts). Like I said, in the end, it might do nothing more than spur a better idea in the mind of someone who understands this stuff better than I do. And if that's all that I manage to do with this, that would be fantastic!
    ... and for those of us who only have a minimal understanding, we have learned something too. So thanks ;)
  • BaggersBaggers Posts: 3,019
    edited 2013-11-21 02:26
    I too agree, the Prop brought back so much fun to programming, especially being able to do PASM. Not only that, it has such a great instruction set too, to the point that we're spoilt rotten. Before Prop1 I was programming some PICs, and I wouldn't even think of going back to them; they were such a pain to set up that even that ruined the fun factor, and that was before you even started coding.
    I know some people knock SPIN also, but that too is such a great language, and it really complements PASM well. I think Chip did a top-notch job having the complete package of both languages being able to run on the Prop, giving you the awesomely quick prototyping speed of SPIN with the punch of PASM. Writing any Prop app was a pure pleasure! I'm really looking forward to when P2 comes and all that fun to be had!

    Anyway, sorry I was off the main topic a little, but thought it just had to be said! ;) Hats off to you Chip! you sir are a genius and a gent :) let the good times roll!
  • Bill HenningBill Henning Posts: 6,445
    edited 2013-11-21 05:43
    Seairth wrote: »
    Unfortunately, multitasking would have to change slightly. When an instruction cancels the other instructions in the pipeline, it must cancel ALL instructions, even those in other tasks. This is due to the fact that the other tasks would be propagating potentially invalid INDA values, even if they weren't indirect-addressing instructions. This would also mean that multiple PCs would potentially have to be updated. Yes, I know that this would mean that one task would cause another task to stall, but it's not like we're writing deterministic code in that mode anyhow. Note, by the way, that a self-jumping instruction (e.g. WAITxxx), assuming it cancels the other instructions just like a branching instruction, should not encounter the issue discussed in the thread "Using WAITVID with INDA++ and multi-tasking + other observations" (thread 151114).

    I hope that was understandable. But I'm not done yet. Supposing that I'm not completely off base with the above approach, let's consider my original proposal: allow indirect-addressing instructions to be conditionally executed (see earlier posts for details). With the above approach, INDA would not be useful in conditional instructions, because the conditionally-skipped instruction would still cause registers.INDA to be updated (after all, the instruction was skipped, not cancelled). To get around this, we could make use of another technique that's already in the processor: self-jumping. In this case, however, the instruction would jump to PC+1. This would cause the remaining instructions in the pipeline to be cancelled, then reloaded. Of course, you would get a three-cycle penalty for this, but INDA would stay consistent. Not a bad trade-off, I think, for getting back full use of the COND bits.

    I am coming in late to this discussion... but I woke up early today :-)

    If I correctly understand, you were proposing to make INDA/B instructions be conditionally executable. Thanks for catching that, I did not realize they were not already conditionally executable!

    Again, if I get it, your latest (above) way of allowing that would

    - not make them execute conditionally
    - make multi-tasking non-deterministic if INDA/INDB are used

    I agree that it would be useful to have INDA/B using instructions be able to be conditional, however only if it does not add delays to P2 production, and only if it does not cause the pipeline issues you outlined above.

    I also have strong reservations about robbing two bits in the INDA/B registers, as that would introduce compatibility headaches in future Props (P3? P4?) that will likely need more than 9 bits of indirect addressing.
  • Cluso99Cluso99 Posts: 18,069
    edited 2013-11-21 12:45
    I have been wondering if we could not get minimal conditions with the use of INDA/INDB. So here is an idea...

    If only 1 operand has INDA/INDB specified...
    2 bits for the PRE/POST & INC/DEC (as currently)
    2 bits for Z & C conditional subset only
    - 00 = always
    - 01 = if_C
    - 10 = if_Z
    - 11 = if_Z_and_C

    If 2 operands have INDA/INDB specified...
    2 bits for the PRE/POST & INC/DEC for Dest (as currently)
    2 bits for the PRE/POST & INC/DEC for Srce (as currently)

    Just a thought. I know it does not allow for NC and NZ testing, but a subset is better than nothing and hopefully simple to implement.
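    The two-bit condition subset listed above is small enough to model directly. Here is a minimal Python sketch of that table; the function name is hypothetical and the code only mirrors the 00/01/10/11 encoding proposed in this post.

```python
def cond_subset_passes(cc, z, c):
    """Evaluate the proposed 2-bit condition subset against the Z and C flags."""
    if cc == 0b00:
        return True      # always
    if cc == 0b01:
        return c         # if_C
    if cc == 0b10:
        return z         # if_Z
    if cc == 0b11:
        return z and c   # if_Z_and_C
    raise ValueError("cc must be a 2-bit value")
```

    As noted above, nothing in this subset can express NC or NZ tests.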

    BTW Don't we have the same issue with PTRA/PTRB ?
  • SeairthSeairth Posts: 2,474
    edited 2013-11-21 12:57
    I like your idea. It's simple. You'd still have to deal with all of the side effects of conditionally executing indirect-addressing instructions.

    As for PTRA/PTRB, I believe those are only used in dedicated instructions and are therefore encoded differently. I think the same goes for SPA/SPB.
  • Cluso99Cluso99 Posts: 18,069
    edited 2013-11-21 12:59
    Cluso99 wrote: »
    I have been wondering if we could not get minimal conditions with the use of INDA/INDB. So here is an idea...

    If only 1 operand has INDA/INDB specified...
    2 bits for the PRE/POST & INC/DEC (as currently)
    2 bits for Z & C conditional subset only
    - 00 = always
    - 01 = if_C
    - 10 = if_Z
    - 11 = if_Z_and_C

    If 2 operands have INDA/INDB specified...
    2 bits for the PRE/POST & INC/DEC for Dest (as currently)
    2 bits for the PRE/POST & INC/DEC for Srce (as currently)

    Just a thought. I know it does not allow for NC and NZ testing, but a subset is better than nothing and hopefully simple to implement.

    BTW Don't we have the same issue with PTRA/PTRB ?

    I just re-read the older specs and found this (could be a killer to the idea of using conditionals anyway)...
    Because indirect addressing must occur in the 2nd stage of the pipeline, long before C and Z are valid for conditional execution in the 4th stage, all instructions which use indirect addressing are forced to always execute. This frees the conditional bit field (CCCC) for specifying indirect operations. The top two bits of CCCC are used for indirect D and the bottom two bits are used for indirect S. If only D or S is indirect, the other two bits in CCCC are ignored.
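    The CCCC split quoted from the spec can be illustrated with a short Python sketch. It only separates the field into its D and S halves; what each 2-bit value means (which pre/post, increment/decrement mode) is not stated in the quote, so the sketch deliberately leaves that undecoded. The function name is hypothetical.

```python
def split_cccc(cccc, d_indirect, s_indirect):
    """Split the 4-bit CCCC field into (d_mode, s_mode).

    The top two bits apply to an indirect D operand and the bottom two bits
    to an indirect S operand. A half that does not apply is ignored, which
    is modelled here by returning None for it.
    """
    d_mode = (cccc >> 2) & 0b11 if d_indirect else None
    s_mode = cccc & 0b11 if s_indirect else None
    return d_mode, s_mode
```

    When only one operand is indirect, the `None` half is the unused bit pair that the condition-subset idea in this thread tries to reclaim.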
  • ozpropdevozpropdev Posts: 2,793
    edited 2013-11-21 16:42
    It always seems that Chip is 10 steps ahead of us!
    That's a good thing :)
  • YanomaniYanomani Posts: 1,524
    edited 2013-11-21 16:48
    IMHO (and begging your pardon in advance, for a little joke) it seems to be a somewhat relativistic problem of time and space!

    I'm sure that everyone, everyday, is learning a little bit more, about the wonderful job done by Chip on the Propeller 2.

    Cases as such described in these posts drove my mind back to early studies about speculative instruction execution, its caveats, consequences and advantages.

    To get around the problems with INDA++ and similar cases, one possibility is to have two registers for each pointer involved, with associated enable bits, so that only one of them is really treated as INDA and the other as INDAshdw.

    Every time an INDA++ reference enters the first stage of the pipeline, INDA gets copied to its shadow register, i.e. the one that has its enable bit negated.

    When INDA++ is executed in the second stage of the pipeline, the shadow retains the earlier value.

    When any instruction at the fourth stage causes the other ones still progressing through the pipeline to be abandoned, the enable bits should be reversed, so that the effects of incrementing at the second stage are logically cancelled.

    No one will know for sure whether INDA or its shadow will emerge from the pipeline's operation as the final INDA, but its name doesn't matter at all, only its contents.

    Something like the OGT Z80 EXX or EX AF, AF', but exclusively under pipeline's behavior control.
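    The shadow-register idea above can be sketched as a toy Python model. The class and method names are hypothetical, the increment uses an assumed plain 9-bit wrap, and only a single in-flight INDA++ is modelled.

```python
class ShadowedPointer:
    """Two registers plus an 'active' selector standing in for the enable bits."""

    def __init__(self, value=0):
        self.regs = [value, value]
        self.active = 0  # index of the register currently acting as INDA

    @property
    def value(self):
        return self.regs[self.active]

    def enter_pipeline(self):
        """Stage 1: copy the live value into the disabled (shadow) register."""
        self.regs[1 - self.active] = self.regs[self.active]

    def execute_increment(self):
        """Stage 2: modify only the live register; the shadow keeps the old value."""
        self.regs[self.active] = (self.regs[self.active] + 1) & 0x1FF

    def cancel(self):
        """Stage 4 abandoned the trailing instructions: swap the enable bits."""
        self.active = 1 - self.active
```

    After `enter_pipeline()`, `execute_increment()`, and `cancel()`, the pointer reads its pre-increment value again without any copy-back, which is the appeal of swapping enable bits.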

    Perhaps it's all a nonsense of mine, perhaps not.

    Yanomani
  • cgraceycgracey Posts: 14,206
    edited 2013-11-21 19:02
    Cluso99 wrote: »
    I have been wondering if we could not get minimal conditions with the use of INDA/INDB. So here is an idea...

    If only 1 operand has INDA/INDB specified...
    2 bits for the PRE/POST & INC/DEC (as currently)
    2 bits for Z & C conditional subset only
    - 00 = always
    - 01 = if_C
    - 10 = if_Z
    - 11 = if_Z_and_C

    If 2 operands have INDA/INDB specified...
    2 bits for the PRE/POST & INC/DEC for Dest (as currently)
    2 bits for the PRE/POST & INC/DEC for Srce (as currently)

    Just a thought. I know it does not allow for NC and NZ testing, but a subset is better than nothing and hopefully simple to implement.

    BTW Don't we have the same issue with PTRA/PTRB ?

    Well, this got me really excited, but then I remembered the problem of having to back up the INDA/INDB states. Adding another layer of backup (stage 4) would certainly blow the timing. That circuit is quite complicated, already, and I flattened it as much as I could (it's rather big in area for what it does). I don't want to say never, but it's hard to think of overcoming the INDA/INDB limitations. That thing is already doing way more than I initially figured was reasonable. I love the idea of sneaking some use out of an unused bit pair in CCCC, but having to back up states across 3 pipeline stages would be murder.

    By the way, there are no such problems with PTRA/PTRB, as they are confined to stages 3 and 4, and 4 is where the conditionality happens. Same story with SPA and SPB.
  • cgraceycgracey Posts: 14,206
    edited 2013-11-21 19:06
    Yanomani wrote: »
    IMHO (and begging your pardon in advance, for a little joke) it seems to be a somewhat relativistic problem of time and space!

    I'm sure that everyone, everyday, is learning a little bit more, about the wonderful job done by Chip on the Propeller 2.

    Cases as such described in these posts drove my mind back to early studies about speculative instruction execution, its caveats, consequences and advantages.

    To get around the problems with INDA++ and similar cases, one possibility is to have two registers for each pointer involved, with associated enable bits, so that only one of them is really treated as INDA and the other as INDAshdw.

    Every time an INDA++ reference enters the first stage of the pipeline, INDA gets copied to its shadow register, i.e. the one that has its enable bit negated.

    When INDA++ is executed in the second stage of the pipeline, the shadow retains the earlier value.

    When any instruction at the fourth stage causes the other ones still progressing through the pipeline to be abandoned, the enable bits should be reversed, so that the effects of incrementing at the second stage are logically cancelled.

    No one will know for sure whether INDA or its shadow will emerge from the pipeline's operation as the final INDA, but its name doesn't matter at all, only its contents.

    Something like the OGT Z80 EXX or EX AF, AF', but exclusively under pipeline's behavior control.

    Perhaps it's all a nonsense of mine, perhaps not.

    Yanomani

    That's pretty much how it works, already!
  • SeairthSeairth Posts: 2,474
    edited 2013-11-21 19:39
    cgracey wrote: »
    Well, this got me really excited, but then I remembered the problem of having to back up the INDA/INDB states.

    Would the approach I outlined in post #16 not avoid this?
  • Cluso99Cluso99 Posts: 18,069
    edited 2013-11-21 20:47
    cgracey wrote: »
    Well, this got me really excited, but then I remembered the problem of having to back up the INDA/INDB states. Adding another layer of backup (stage 4) would certainly blow the timing. That circuit is quite complicated, already, and I flattened it as much as I could (it's rather big in area for what it does). I don't want to say never, but it's hard to think of overcoming the INDA/INDB limitations. That thing is already doing way more than I initially figured was reasonable. I love the idea of sneaking some use out of an unused bit pair in CCCC, but having to back up states across 3 pipeline stages would be murder.

    By the way, there are no such problems with PTRA/PTRB, as they are confined to stages 3 and 4, and 4 is where the conditionality happens. Same story with SPA and SPB.
    Obviously this only works for instructions that only reference INDa/b in one operand...

    If the INDa/b is non-incrementing (ie respective CC bits =00), then the other pair of CC bits represent IF_Z, IF_C, IF_Z_AND_C and 00=ALWAYS.

    This should solve the backing up problem and we could do this conditional sequence...
    IF_Z MOV/etc INDn,someval
    IF_Z INCINDn 'new instructions to ADD/SUB INDn,#1 using the wrap FIXINDx settings.

    If this is doable, what would be the best 3 Z & C options?
    00 = always
    01 = if_C
    10 = if_Z
    11 = if_Z_OR_C / if_Z_AND_C or some other condition such as IF_NC ???

    or maybe this is more in keeping
    00 = IF_NC_AND_NZ
    01 = IF_C
    10 = IF_Z
    11 = ALWAYS
  • ozpropdevozpropdev Posts: 2,793
    edited 2013-11-21 21:29
    Cluso99 wrote: »
    INCINDn 'new instructions to ADD/SUB INDn,#1 using the wrap FIXINDx settings.

    I like the idea of this instruction.
    This instruction would also help in multi-tasking operations that use INDx in self-jumping instructions.
    Nice. :)
  • SeairthSeairth Posts: 2,474
    edited 2013-11-22 04:02
    I just realized how to make my INDA suggestion multitask-friendly: have a set of INDA/INDB registers for each TASK! This means that $1F6/$1F7 would map to the appropriate set for the associated TASK of the instruction in Stage 2. Change the above pseudo-code to:
    IF stage3.IsValid AND stage3.TaskIndex = stage2.TaskIndex
        stage2.INDA = stage3.INDA
    ELSE IF stage4.IsValid AND stage4.TaskIndex = stage2.TaskIndex
        stage2.INDA = stage4.INDA
    ELSE
        stage2.INDA = TASK[stage2.TaskIndex].INDA
    

    Then, when an instruction in stage 4 cancels instructions, it does so only for the instructions of the current TASK (which I believe is already the behavior). This would mean that every task could safely use INDA/INDB at the same time (okay, technically, they are using their own copy). And the hope is that this approach would simplify INDA/INDB handling enough overall to make up for the addition of the extra sets of registers.
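    The per-task variant of the selection rule can be sketched in runnable Python as well. The `TaskStage` class, the function name, and the four-entry list of per-task registers used in the example are assumptions for illustration; only the comparison logic comes from the pseudo-code above.

```python
from dataclasses import dataclass


@dataclass
class TaskStage:
    is_valid: bool   # False once the instruction in this stage is cancelled
    task_index: int  # task that owns the instruction in this stage
    inda: int        # local INDA copy carried with the instruction


def select_stage2_inda_per_task(stage2_task, stage3, stage4, task_inda):
    """Forward INDA only from later stages that belong to the same task."""
    if stage3.is_valid and stage3.task_index == stage2_task:
        return stage3.inda
    if stage4.is_valid and stage4.task_index == stage2_task:
        return stage4.inda
    return task_inda[stage2_task]
```

    For instance, with `task_inda = [10, 20, 30, 40]`, a Stage 2 instruction from task 2 reads 30 no matter what tasks 0 and 1 have in flight, so one task's cancellations never disturb another task's pointer.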
  • ozpropdevozpropdev Posts: 2,793
    edited 2013-11-22 04:49
    Multiple INDx registers would be awesome.

    At the moment in multi-tasking you have to decide which task gains the
    most benefit from the INDx registers and use MOVS/MOVD (soon to be SETS/SETD) in the other tasks to achieve the same indexing.

    Sadly I think I hear the sound of bullets bouncing off the walls again! :)