Propeller II update - BLOG - Page 202 — Parallax Forums

Propeller II update - BLOG


Comments

  • jmgjmg Posts: 15,173
    edited 2014-03-08 14:01
    potatohead wrote: »
    Bill, this last round of discussions, needed to whittle the whole thing down to that core bit of silicon really turned out rather nice. Thanks for the clear examples.

    Keep in mind that what Bill and I are discussing, is not quite what present FPGA does and his code will not run as written on present P2.
  • jmgjmg Posts: 15,173
    edited 2014-03-08 14:04
    David Betz wrote: »
    Single stepping could be done by adding one bit to the state saved by TPAUSE such that another TPAUSE is automatically triggered after a single instruction is executed.

    I think the current idea for single step, is to feed the slave task one clock slot, and read either side of that.
  • Bill HenningBill Henning Posts: 6,445
    edited 2014-03-08 14:08
    It is not used as a boolean

    out of 2**32 possible combinations, ONE (0) is reserved.

    One of your flags uses up 2**31 possibilities.

    If you need more flags, make a 'taskXflags' register.

    Compact mixing is irrelevant, it takes several instructions to pack/unpack.

    My way takes less code, and is flexible.

    Heck, it can even be used to implement flags.

    BUT

    The key is that TPAUSE, and TJNZ can use 0 as a very quick mechanism.

    One positive thing has come out of my trying to explain it to you - if there is logic for it, it would be nice if TPAUSE could exit on its own if the task1req register becomes 0, as this would remove the need for a TRESUME instruction. Which could not be done with flags done the way you propose.
    jmg wrote: »
    I still do not like the inefficiency of using 32 bits as a boolean.., and the asymmetry of message passing.

    mov task3result,result ' optionally pass back result
    mov task3req,#0 ' release task if PC not incremented past TPAUSE

    - but I also do not see an opcode that neatly allows compact mixing of flags and params
  • Bill HenningBill Henning Posts: 6,445
    edited 2014-03-08 14:12
    TPAUSE can

    - be used as breakpoints
    - wait for events
    - be used as system calls

    It is not an interrupt, as the scheduler has to poll for the change in the request register.

    TLB's etc will be a fun discussion for P3!

    Classic interrupts would be a much larger change, and would require that all four tasks' states be capable of being saved/restored, plus interrupt vectors, which would lead to interrupt priorities etc. We avoid that huge headache with TPAUSE, which is a lot more flexible than classical interrupts.
    David Betz wrote: »
    It seems like this TPAUSE/TRESUME feature is very close to what is needed to support traps which will be needed for handling TLB misses if we ever get to trying to execute code from external memory through pages cached in hub memory. A TLB miss could automatically pause the task that causes it and jump to some predefined location. It would also need to store a trap reason in another predefined location. The code at that location would then service the trap and possibly modify the state saved by the hardware triggered TPAUSE and then execute a TRESUME on itself to return to the code that was running prior to the trap. I think this could all be done in the context of a single task rather than requiring a scheduler task running in parallel with the task being scheduled. In fact, if you add a timer as a possible trap source then you can do a scheduler within a single task. And Bill's YIELD instruction could essentially be a software trap that is processed as a breakpoint. Single stepping could be done by adding one bit to the state saved by TPAUSE such that another TPAUSE is automatically triggered after a single instruction is executed. I suppose this is essentially introducing interrupts to the P2 but it seems a lot simpler than two tasks running in tandem to effect essentially the same thing.
  • David BetzDavid Betz Posts: 14,516
    edited 2014-03-08 14:22
    Bill Henning wrote: »
    TPAUSE can

    - be used as breakpoints
    - wait for events
    - be used as system calls

    It is not an interrupt, as the scheduler has to poll for the change in the request register.

    TLB's etc will be a fun discussion for P3!

    Classic interrupts would be a much larger change, and would require that all four tasks' states be capable of being saved/restored, plus interrupt vectors, which would lead to interrupt priorities etc. We avoid that huge headache with TPAUSE, which is a lot more flexible than classical interrupts.

    I don't see why it would be any more complex than what I already described. The external stimulus (TLB fault, timer, etc) does the following:

    1) Store a "reason" to a known location. This would be "TLB miss", "timer", etc.
    2) Save the current state of the task using the TPAUSE mechanism.
    3) Transfer control to another known location.

    Then that code can do whatever it wants to handle this "trap" and when it's done it can just execute a TRESUME to resume execution stopped by the TPAUSE. If these "known locations" were in the area of registers that get remapped for each task then every task could operate independently with its own trap reason and handler registers. And YIELD can serve as a breakpoint by being a software triggered stimulus following the same sequence as above. This all requires only a single task and so all four HW tasks can do this at the same time without interfering with each other. There is no need to waste a task to run the scheduler.

    Edit: Or is there only one WIDE for holding the state saved by TPAUSE shared by all tasks?

    Edit2: Also, I'm not suggesting adding TLB to P2, just saying P2 already has about half of what is needed to do it.
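
    The three-step sequence proposed above can be modelled in a few lines. This is a minimal sketch in Python rather than PASM, and every name in it (`Task`, `raise_trap`, `resume`, `handler_pc`) is hypothetical. It only illustrates the proposed ordering of reason, save and transfer, not any actual P2 hardware behaviour.

```python
# Hypothetical model of the proposed trap sequence. Nothing here corresponds
# to a real P2 register or instruction; the names are illustrative only.

class Task:
    def __init__(self, pc, handler_pc):
        self.pc = pc                  # current program counter
        self.handler_pc = handler_pc  # "known location" holding the handler address
        self.trap_reason = None       # "known location" holding the trap reason
        self.saved_pc = None          # stands in for the state saved on a pause


def raise_trap(task, reason):
    """Apply the three steps an external stimulus would perform."""
    task.trap_reason = reason     # 1) store a reason to a known location
    task.saved_pc = task.pc       # 2) save the current state of the task
    task.pc = task.handler_pc     # 3) transfer control to the handler


def resume(task):
    """Handler is done: restore the saved state and clear the trap."""
    task.pc = task.saved_pc
    task.saved_pc = None
    task.trap_reason = None


if __name__ == "__main__":
    t = Task(pc=42, handler_pc=100)
    raise_trap(t, "timer")
    print(t.pc, t.trap_reason)    # the task is now running its handler
    resume(t)
    print(t.pc, t.trap_reason)    # the task is back where it was stopped
```

    Because each `Task` carries its own reason and handler slots, four such objects could trap independently, which is the point of placing the "known locations" in the remapped register area.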
  • ctwardellctwardell Posts: 1,716
    edited 2014-03-08 14:27
    David,

    TPAUSE doesn't do any of the work, it just writes a value to a register and loops to itself. You need the scheduler to be watching the register set by TPAUSE, and then the scheduler does the actions to save state, etc.

    Currently only TASK 3 can have its full state persisted.

    C.W.
  • Bill HenningBill Henning Posts: 6,445
    edited 2014-03-08 14:27
    Hi David,

    I'll explain. Except I am skipping anything TLB/MMU related as being out of scope for P2.

    To have interrupts:

    - you need multiple interrupt sources.
    - you need to be able to enable/disable them
    - you would need an interrupt vector for each interrupt
    - if you have more than one source, you usually need a priority mechanism
    - you would need these for all four tasks
    - you would need all four tasks' internal states to be saved on an interrupt
    - or you would need to bind a specific interrupt to a specific task, needs more state and instructions
    - you would need a specific 'return from interrupt' instruction that knows how to restore the state and for which task

    All of the above needs far more silicon, instructions etc than the current tasking scheme being discussed.

    An interrupt would be the hardware equivalent of TPAUSE, which is not an interrupt.

    And you are correct, there is only one WIDE for saving the state of task3 only (to save gates & complexity)

    David Betz wrote: »
    I don't see why it would be any more complex than what I already described. The external stimulus (TLB fault, timer, etc) does the following:

    1) Store a "reason" to a known location. This would be "TLB miss", "timer", etc.
    2) Save the current state of the task using the TPAUSE mechanism.
    3) Transfer control to another known location.

    Then that code can do whatever it wants to handle this "trap" and when it's done it can just execute a TRESUME to resume execution stopped by the TPAUSE. If these "known locations" were in the area of registers that get remapped for each task then every task could operate independently with its own trap reason and handler registers. And YIELD can serve as a breakpoint by being a software triggered stimulus following the same sequence as above. This all requires only a single task and so all four HW tasks can do this at the same time without interfering with each other. There is no need to waste a task to run the scheduler.

    Edit: Or is there only one WIDE for holding the state saved by TPAUSE shared by all tasks?
  • potatoheadpotatohead Posts: 10,261
    edited 2014-03-08 14:31
    @JMG: Yes, but I have seen Chip's final output enough times to know about where he will be on it.

    To be frank, you tend to ask for every possible option, and push for more in silicon than most here do. Chip tends to filter out and design away the need for as many of those as he can, and Bill has a good grasp on where sweet spot cases may or do lie.

    At the end of all that, less is often more, and generally speaking, that is "the propeller way" where we can use software for a lot of things where more emphasis on hardware would typically be seen.

    That dynamic is why I am here. It is a great philosophy, because software improves over time. Where we have carved out the silicon sweet spots, we maximize that potential.

    An easy example is the P1 video system. It is just enough to take the really ugly bits out of that task, without being overly limiting otherwise. Had a bit more hardware been applied, we would have seen some tasks easier and faster, however we may well have not seen the advanced uses as well as non-video related uses we have. Over time, we ended up doing things on the P1 that were not even a consideration at the time it was designed. The only sweet spot case missed was PAL, nicely corrected in P2, which retains most of the "it will do far more than we think" qualities P1 had, while at the same time leveraging all we learned on P1. One of those seriously improved cases is mixed mode and dynamically drawn displays. Having a graphics capable window on a text display, is one example possible on P1, but difficult. Few of us did it, due to the overall difficulty. Another is dynamically drawn displays intended to maximize the RAM efficiency, again possible on P1, but difficult. Both of these are going to be considerably easier and more effective on P2.

    It is going to be the same way with these advanced tasking features. We need enough silicon to open the door to as much of the possible as we can, while not closing off options possible in software by defining too much now in the hardware.

    We will put it all to use, just as we did P1, and out of that will fall the really sweet spot cases for P3, based on actual application and innovation. Put simply, we think we know the optimal use cases, etc... but we may well not know about what is really effective and or possible, until after software gets written and applied so as to reveal them.

    In the balance is something people can use easily, great performance, no OS needed to multi-task and multi-process if desired, etc...

    Where the "include it all just in case" approach is taken, complexity is too high and adoption is more difficult and that is easily seen out there on other perfectly capable, but maddening and painful to use devices.

    You mention control as very important. Agreed, but moving as much of that to software as is practical means not having to wade through tons of options and initialization just to run some basic concept code, or to get started. Most PASM programmers here picked up on it quickly due to that dynamic being well realized.

    Yes, that sometimes means a peak performance case or two isn't as well realized as some would like, but it also means just doing the vast majority of things is lean, fast, easy, consistent.

    Less is very often more. This is why we don't see more people attempting the kinds of things we see people attempting on a Propeller on other devices that have so many options and controls one doesn't even know where to start!

    So far, we have preserved this for the vast majority of what I see the P2 capable of, and I'm very excited and pleased because it means we can and will have experiences that are bigger than P1 can bring us, but the overall feel we all got so much out of was not lost amidst a sea of well meaning, but sadly obtuse options.

    These differences in ideology have played out well, in that our end results are inclusive without being a burden. Again, that is primary for a whole lot of us.
  • David BetzDavid Betz Posts: 14,516
    edited 2014-03-08 14:34
    Bill Henning wrote: »
    Hi David,

    I'll explain. Except I am skipping anything TLB/MMU related as being out of scope for P2.

    To have interrupts:

    - you need multiple interrupt sources.
    - you need to be able to enable/disable them
    - you would need an interrupt vector for each interrupt
    - if you have more than one source, you usually need a priority mechanism
    - you would need these for all four tasks
    - you would need all four tasks' internal states to be saved on an interrupt
    - or you would need to bind a specific interrupt to a specific task, needs more state and instructions
    - you would need a specific 'return from interrupt' instruction that knows how to restore the state and for which task

    Of course you are correct that to fully implement interrupts you'd have to do most or all of those things. However, replacing the two-task scheduler scheme with one where each task could run its own scheduler wouldn't require any of that.

    Bill Henning wrote: »
    All of the above needs far more silicon, instructions etc than the current tasking scheme being discussed.

    Yes, it would probably require a little more. Has anyone suggested just storing the state on TPAUSE into COG registers rather than into a special unmapped WIDE? If you run four tasks using register remapping with each task having 32 registers, 8 of them could be used to store the task's state when interrupted and two more could be used for the reason/handler registers. That leaves 22 registers for general use, which seems like a fair number, and also leaves 128 COG registers to be shared among the tasks or used for fast COG code functions.

    Bill Henning wrote: »
    An interrupt would be the hardware equivalent of TPAUSE, which is not an interrupt.

    Not sure I understand this statement. I'm just saying that most of what is needed to do this is already implemented in the TPAUSE instruction.

    Bill Henning wrote: »
    And you are correct, there is only one WIDE for saving the state of task3 only (to save gates & complexity)

    Again, is there any reason that the state couldn't be stored in a WIDE in COG memory within the remapped register region? Doing this actually saves gates because you don't need the special purpose WIDE used to store the state of task3.
  • jmgjmg Posts: 15,173
    edited 2014-03-08 14:44
    Bill Henning wrote: »
    Compact mixing is irrelevant, it takes several instructions to pack/unpack.
    My way takes less code, and is flexible.
    Heck, it can even be used to implement flags.
    BUT

    The key is that TPAUSE, and TJNZ can use 0 as a very quick mechanism.

    One positive thing has come out of my trying to explain it to you - if there is logic for it, it would be nice if TPAUSE could exit on its own if the task1req register becomes 0, as this would remove the need for a TRESUME instruction. Which could not be done with flags done the way you propose.

    Correct, avoiding TRESUME is a good idea.

    Your other claims are not true in the general sense.
    There is nothing fundamental about flags that excludes self resume, or has to dictate larger code.
    An example on a virtual P2, designed for packed atomic semaphore and message
    ' compact master scheduler loop, uses packed atomic semaphore and message 
    ' this is a bare-bones service provider that can serve as a skeleton for a debugger or scheduler
    
    scheduler
           jb31  task1req, #task1handler      'B31 signals Slave is done, and waiting
           jb31  task2req, #task2handler  ' Slave sets B31 and waits looping until B31=0 
           jb31  task3req, #task3handler
           jmp  #scheduler
    
    task1handler
          ' decode the request, and handle it
          mov   task1req,#MessToSlave1     ' also does ClrB31 => releases Slave, and pass (optional) message
          jmp   #scheduler
    
    task2handler
          ' decode the request, and handle it
          mov   task2req,#MessToSlave2     ' ClrB31 = releases Slave, and pass  (optional) message
          jmp   #scheduler
    
    task3handler
          ' decode the request, and handle it
          mov   task3req,#MessToSlave3     ' ClrB31 = releases Slave, and pass  (optional) message
          jmp   #scheduler
    
    task1req     long  0    ' two way message and semaphore register 
    task2req     long  0    ' two way message and semaphore register 
    task3req     long  0    ' two way message and semaphore register 
    
    'Slave task1 
             TPAUSEb     task1req,#MessFromSlave     'Sets B31, ORs 9b #MessFromSlave onto task1req, allows 2^31 messages
    ' TPAUSEb  Loops here until B31 is Zero, then can read other 31 bits in task1req as messages 
    ' Test message from master, or just continue 
    

    Notice this is both smaller (in code and registers) and has a higher message ceiling than your code.
    ( jb31 can use the saved TRESUME opcode, so adds no more opcodes).
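
    The packing idea above, one 32-bit register carrying both the semaphore (bit 31) and a message (the other 31 bits), can be checked with plain bit arithmetic. This is a minimal sketch in Python, not PASM. The helper names `tpauseb_post`, `is_waiting` and `master_reply` are hypothetical stand-ins for the proposed `TPAUSEb`, `jb31` and the master's `mov`, and the 9-bit mask assumes the immediate operand width mentioned in the code comment.

```python
# Hypothetical model of a packed semaphore + message register.
# Bit 31 is the "slave is waiting" flag; bits 30..0 carry the message.

MASK32 = 0xFFFFFFFF
B31 = 1 << 31


def tpauseb_post(reg, message):
    """Slave side: set bit 31 and OR a 9-bit immediate message onto the register."""
    return (reg | B31 | (message & 0x1FF)) & MASK32


def is_waiting(reg):
    """Master side: the test a jb31-style jump would make on bit 31."""
    return bool(reg & B31)


def master_reply(message):
    """Master side: a single write that clears bit 31 and passes a 31-bit message."""
    return message & (B31 - 1)


if __name__ == "__main__":
    task1req = 0
    task1req = tpauseb_post(task1req, 5)      # slave posts request 5 and waits
    print(hex(task1req), is_waiting(task1req))
    task1req = master_reply(0x1234)           # master answers and releases the slave
    print(hex(task1req), is_waiting(task1req))
```

    The trade-off debated in the following posts is visible here: the reply is limited to 31 bits, but releasing the slave and passing the message happen in one write.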
  • Bill HenningBill Henning Posts: 6,445
    edited 2014-03-08 14:53
    Hi David,

    Chip wants to use WIDE's as apparently that is the easiest and requires the least logic from what I recall.
    ' TPAUSE as originally proposed was equivalent to:
    
          MOV reg,#code
    lp:  JMP  #lp
    
    ' Revised TPAUSE is equivalent to:
    
          MOV reg,#code
    lp:  TJNZ reg,#lp
    

    The reason it needs an instruction is to save memory, as it will be used very frequently, including as breakpoints.

    What I meant is that it is a simple instruction, not an interrupt mechanism.

    T3SAVE and T3LOAD implement the WIDE state saving.
  • Bill HenningBill Henning Posts: 6,445
    edited 2014-03-08 14:56
    jmg,

    1) You are introducing a new two-op instruction... which I don't need

    2) Your way only allows passing back a 31 bit result, not a 32 bit

    3) TPAUSE has no way of affecting b31

    Sorry my friend, my way is simpler, requires one less instruction, a bit less logic, and allows a fuller return value.
    jmg wrote: »
    Correct, avoiding TRESUME is a good idea.

    Your other claims are not true in the general sense.
    There is nothing fundamental about flags that excludes self resume, or has to dictate larger code.
    An example on a virtual P2, designed for packed atomic semaphore and message
    ' compact master scheduler loop, uses packed atomic semaphore and message 
    ' this is a bare-bones service provider that can serve as a skeleton for a debugger or scheduler
    
    scheduler
           jb31  task1req, #task1handler      'B31 signals Slave is done, and waiting
           jb31  task2req, #task2handler  ' Slave sets B31 and waits looping until B31=0 
           jb31  task3req, #task3handler
           jmp  #scheduler
    
    task1handler
          ' decode the request, and handle it
          mov   task1req,#MessToSlave1     ' also does ClrB31 => releases Slave, and pass (optional) message
          jmp   #scheduler
    
    task2handler
          ' decode the request, and handle it
          mov   task2req,#MessToSlave2     ' ClrB31 = releases Slave, and pass  (optional) message
          jmp   #scheduler
    
    task3handler
          ' decode the request, and handle it
          mov   task3req,#MessToSlave3     ' ClrB31 = releases Slave, and pass  (optional) message
          jmp   #scheduler
    
    task1req     long  0    ' two way message and semaphore register 
    task2req     long  0    ' two way message and semaphore register 
    task3req     long  0    ' two way message and semaphore register 
    
    'Slave task1 
             TPAUSEb     task1req,#MessFromSlave     'Sets B31, ORs 9b #MessFromSlave onto task1req, allows 2^31 messages
    ' TPAUSEb  Loops here until B31 is Zero, then can read other 31 bits in task1req as messages 
    ' Test message from master, or just continue 
    

    Notice this is both smaller (in code and registers) and has a higher message ceiling than your code.
    ( jb31 can use the saved TRESUME opcode, so adds no more opcodes).
  • David BetzDavid Betz Posts: 14,516
    edited 2014-03-08 14:57
    Bill Henning wrote: »
    Hi David,

    Chip wants to use WIDE's as apparently that is the easiest and requires the least logic from what I recall.
    ' TPAUSE as originally proposed was equivalent to:
    
          MOV reg,#code
    lp:  JMP  #lp
    
    ' Revised TPAUSE is equivalent to:
    
          MOV reg,#code
    lp:  TJNZ reg,#lp
    

    The reason it needs an instruction is to save memory, as it will be used very frequently, including as breakpoints.

    What I meant is that it is a simple instruction, not an interrupt mechanism.

    T3SAVE and T3LOAD implement the WIDE state saving.

    What are T3SAVE and T3LOAD? Are they new names for TPAUSE and TRESUME? It is probably impossible to write a WIDE into COG registers in one tick, since the COG memory is only 32 bits wide, so I guess my idea to make all four tasks able to do threading won't work without separate thread state storage for each task.
  • cgraceycgracey Posts: 14,152
    edited 2014-03-08 15:03
    David Betz wrote: »
    What are T3SAVE and T3LOAD?


    These are the instructions that save and load task 3's context to and from the WIDE registers. These are single-cycle instructions that save/load upwards of 256 bits of context data.
  • ctwardellctwardell Posts: 1,716
    edited 2014-03-08 15:06
  • David BetzDavid Betz Posts: 14,516
    edited 2014-03-08 15:07
    cgracey wrote: »
    These are the instructions that save and load task 3's context to and from the WIDE registers. These are single-cycle instructions that save/load upwards of 256 bits of context data.
    Then what do TPAUSE and TRESUME do?
  • jmgjmg Posts: 15,173
    edited 2014-03-08 15:10
    Bill Henning wrote: »
    Sorry my friend, my way is simpler, requires one less instruction, a bit less logic, and allows a fuller return value.

    Given my solution uses less code and data memory, I'm not sure where your simpler (?) claim comes from ??

    Of course, I already said it is a virtual P2, so it replaces TRESUME with something useful, and it proves you can have packed atomic semaphore and messages, to well above the 512 values you claimed were so important earlier, and does it smaller in the most vital register memory resource.
  • Bill HenningBill Henning Posts: 6,445
    edited 2014-03-08 15:18
    Your proposal requires:

    1) 'jb31 reg,#addr' new jump instruction
    2) more complicated TPAUSE
    ' your suggested TPAUSEb equivalent in instructions
    
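          mov   reg,#code          ' could be S
          or    reg,bit31const     ' set B31 to signal the master
    lp:   test  reg,bit31const wz  ' test B31 without overwriting the message bits
    if_nz jmp   #lp                ' loop until the master clears B31
    
    ' my revised TPAUSE in instructions
    
          mov   reg,#code          ' could be S
    lp:   tjnz  reg,#lp            ' loop until the master writes 0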
    
    ' mine is simpler to implement, less logic required
    

    3) limits return value to 31 bits

    4) my solution uses 3 more registers in the whole cog, and can return 32 bit values

    Sorry, I believe my solution is far superior
    jmg wrote: »
    Given my solution uses less code and data memory, I'm not sure where your simpler (?) claim comes from ??

    Of course, I already said it is a virtual P2, so it replaces TRESUME with something useful, and it proves you can have packed atomic semaphore and messages, to well above the 512 values you claimed were so important earlier, and does it smaller in the most vital register memory resource.
  • cgraceycgracey Posts: 14,152
    edited 2014-03-08 15:19
    David Betz wrote: »
    Then what do TPAUSE and TRESUME do?


    TPAUSE D,S/# 'write S/# to D and loop in place

    TRESUME D/# 'increment PC of dormant task D/#
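
    The two one-line definitions above can be modelled as follows. This is a minimal sketch in Python rather than PASM, and the `Cog` class and function names are hypothetical. It assumes the write to D repeats on every pass through TPAUSE, which the definition does not state either way.

```python
# Hypothetical model of TPAUSE / TRESUME as defined in the post above.

class Cog:
    def __init__(self):
        self.regs = {}   # register name -> 32-bit value
        self.pc = {}     # task number -> program counter


def tpause(cog, task, d, s):
    """TPAUSE D,S/#: write S/# to D and loop in place (PC is not advanced)."""
    cog.regs[d] = s & 0xFFFFFFFF
    # The PC is deliberately left untouched so the task re-executes TPAUSE.


def tresume(cog, task):
    """TRESUME D/#: increment the PC of the dormant task, stepping it past TPAUSE."""
    cog.pc[task] += 1


if __name__ == "__main__":
    cog = Cog()
    cog.pc[3] = 10                       # task 3 is sitting on a TPAUSE at address 10
    for _ in range(3):
        tpause(cog, 3, "task3req", 7)    # spins in place with the request posted
    print(cog.pc[3], cog.regs["task3req"])
    tresume(cog, 3)                      # the supervisor lets task 3 continue
    print(cog.pc[3])
```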
  • mindrobotsmindrobots Posts: 6,506
    edited 2014-03-08 15:23
    cgracey wrote: »
    TPAUSE D,S/# 'write S/# to D and loop in place

    TRESUME D/# 'increment PC of dormant task D/#

    TPAUSE is used by a switchable thread, TRESUME is used by the supervisor to put a switchable thread back on the air.
  • jmgjmg Posts: 15,173
    edited 2014-03-08 15:32
    Bill Henning wrote: »
    Your proposal requires:

    1) 'jb31 reg,#addr' new jump instruction
    2) more complicated TPAUSE

    Correct, it is a virtual P2 - one that is designed to use less code and less memory.
    Bill Henning wrote: »
    3) limits return value to 31 bits
    4) my solution uses 3 more registers in the whole cog, and can return 32 bit values

    Hehe, and yet earlier you claimed 512 states was important ?

    That 512 is limited by the immediate operand, and that is exactly the same in both implementations.
    Besides, If anyone really wanted 32b fields, they can always use more wasteful extra registers :)

    Less code and less register memory, is a very clear win, as those are what matter to designers. Gate-count is essentially invisible to a user.
    JB31 is only a slight variant (subset actually) on existing JZ opcode. Both read and test a register, JB31 only tests upper bit.
    (and it is useful not just in this code case )
    The TPause is a variant opcode in both our cases.
  • David BetzDavid Betz Posts: 14,516
    edited 2014-03-08 15:34
    cgracey wrote: »
    TPAUSE D,S/# 'write S/# to D and loop in place

    TRESUME D/# 'increment PC of dormant task D/#
    Okay, I guess I'm too late to join this party. I'm too out of touch with the current plans. :-)

    Anyway, looking at Bill's summary I guess T3SAVE and T3LOAD are what I should have mentioned in my message about having a task handle its own scheduling.
  • Bill HenningBill Henning Posts: 6,445
    edited 2014-03-08 16:47
    Yep, 512 was important.

    Worst case, my method uses 3 more longs in the cog.

    But it makes the verilog simpler and needs less logic.

    Btw, in most cases, only one request and result long will be needed for task 3.

    I stand by my assertion that my way is strongly preferable due to the KISS principle.
    jmg wrote: »
    Correct, it is a virtual P2 - one that is designed to use less code and less memory.

    Hehe, and yet earlier you claimed 512 states was important ?

    That 512 is limited by the immediate operand, and that is exactly the same in both implementations.
    Besides, If anyone really wanted 32b fields, they can always use more wasteful extra registers :)

    Less code and less register memory, is a very clear win, as those are what matter to designers. Gate-count is essentially invisible to a user.
    JB31 is only a slight variant (subset actually) on existing JZ opcode. Both read and test a register, JB31 only tests upper bit.
    (and it is useful not just in this code case )
    The TPause is a variant opcode in both our cases.
  • jmgjmg Posts: 15,173
    edited 2014-03-08 17:14
    Bill Henning wrote: »
    Worst case, my method uses 3 more longs in the cog.

    .. and don't forget to add the extra lines of your code, to load the return values.
    ( You did compare my code, with yours ? )

    Code that just works the same going in both directions, wins any code-level KISS contest.

    As far as Verilog code goes, neither approach is particularly challenging, and Verilog is written only once. KISS decisions there are more interested in ease of use of the final device.

    Thousands of users will write millions of lines of P2 code and the P2 has a hard register ceiling.
    Anything that lets users pack more into those registers, is worth a serious look.
  • Bill HenningBill Henning Posts: 6,445
    edited 2014-03-08 17:23
    What extra code?

    task3result can be referenced directly, exactly the same as referencing task3req ... and more readable.

    It takes the same amount of code to say

    mov somereg, task3req

    as

    mov somereg, task3result

    The only difference in memory is the extra long per task in the scheduler/debugger for taskXresult.
    jmg wrote: »
    .. and don't forget to add the extra lines of your code, to load the return values.
    ( You did compare my code, with yours ? )

    Code that just works the same going in both directions, wins any code-level KISS contest.

    As far as Verilog code goes, neither approach is particularly challenging, and Verilog is written only once. KISS decisions there are more interested in ease of use of the final device.

    Thousands of users will write millions of lines of P2 code and the P2 has a hard register ceiling.
    Anything that lets users pack more into those registers, is worth a serious look.
  • jmgjmg Posts: 15,173
    edited 2014-03-08 17:36
    Bill Henning wrote: »
    What extra code?

    Try this - I've highlighted the extra line of code, in your case, vs mine - for 3 instances, that is 3 more lines of code.
    task2handler
          ' decode the request, and handle it
    [B]      mov   task2result,result   ' optionally pass back result[/B]
          mov   task2req,#0           ' release task if PC not incremented past TPAUSE
          jmp   #scheduler
    

    vs
    task2handler
          ' decode the request, and handle it
          mov   task2req,#MessToSlave2     ' ClrB31 = releases Slave, and pass  (optional) message
          jmp   #scheduler
    
    

    Did you maybe miss that by merging the message and the semaphore, my code updates both in a single line ? In your code, it is one line per item.
  • Bill HenningBill Henning Posts: 6,445
    edited 2014-03-08 17:45
    Ah, ok. I see what you mean. I was thinking of the client task.

    Ok, I agree - I need 1 extra long to hold the result per task, and one extra instruction to clear taskXrequest, but only in the debugger/scheduler.

    No difference in client tasks.
    jmg wrote: »
    Try this - I've highlighted the extra line of code, in your case, vs mine - for 3 instances, that is 3 more lines of code.
    task2handler
          ' decode the request, and handle it
    [B]      mov   task2result,result   ' optionally pass back result[/B]
          mov   task2req,#0           ' release task if PC not incremented past TPAUSE
          jmp   #scheduler
    

    vs
    task2handler
          ' decode the request, and handle it
          mov   task2req,#MessToSlave2     ' ClrB31 = releases Slave, and pass  (optional) message
          jmp   #scheduler
    
    

    Did you maybe miss that by merging the message and the semaphore, my code updates both in a single line ? In your code, it is one line per item.
  • cgraceycgracey Posts: 14,152
    edited 2014-03-08 19:16
    TPAUSE and TRESUME are gone.

    In TPAUSE's old place is:

    TCHECK D,S/# 'Write S/# into D and jump to self. On subsequent iterations, don't write D, but jump to self if D <> 0.

    This gets rid of the need for TRESUME. It takes one bit of state storage to track TCHECK now, so that we know if it's on its first or a subsequent iteration. On the first iteration, it writes S/# into D and jumps to itself. On subsequent iterations, it doesn't write D, but jumps to itself if D <> 0.

    So, task A does a TCHECK to write a non-zero value into some register. Task B notices the non-0 value and can do whatever it wants about it, but can write 0 to the register to release Task A.
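
    The TCHECK behaviour described above, including the one bit that distinguishes the first iteration from later ones, can be modelled like this. It is a minimal sketch in Python rather than PASM. The names `TaskState`, `tcheck` and `first_pass_done` are hypothetical, and clearing the tracking bit when the task falls through is an assumption the post does not spell out.

```python
# Hypothetical model of TCHECK D,S/# as described in the post above.

class TaskState:
    def __init__(self, pc=0):
        self.pc = pc
        self.first_pass_done = False   # the one bit of state storage for TCHECK


def tcheck(task, regs, d, s):
    """Execute one pass of TCHECK for `task`; regs maps register names to values."""
    if not task.first_pass_done:
        regs[d] = s & 0xFFFFFFFF       # first iteration: write S/# into D
        task.first_pass_done = True    # and jump to self (PC unchanged)
        return
    if regs[d] != 0:
        return                         # later iterations: jump to self while D <> 0
    task.first_pass_done = False       # assumed: bit resets once the task is released
    task.pc += 1                       # D == 0, so execution continues past TCHECK


if __name__ == "__main__":
    regs = {}
    task_a = TaskState(pc=5)
    tcheck(task_a, regs, "req", 9)     # task A posts a non-zero request
    tcheck(task_a, regs, "req", 9)     # still waiting, D is not rewritten
    print(task_a.pc, regs["req"])
    regs["req"] = 0                    # task B services the request and writes 0
    tcheck(task_a, regs, "req", 9)     # task A is released
    print(task_a.pc, regs["req"])
```

    Because D is written only on the first pass, task B can overwrite it with 0 without the value being clobbered on the next iteration, which is what removes the need for TRESUME.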
  • mindrobotsmindrobots Posts: 6,506
    edited 2014-03-08 19:29
    cgracey wrote: »
    TPAUSE and TRESUME are gone.

    In TPAUSE's old place is:

    TCHECK D,S/# 'Write S/# into D and jump to self. On subsequent iterations, don't write D, but jump to self if D <> 0.

    This gets rid of the need for TRESUME. It takes one bit of state storage to track TCHECK now, so that we know if it's on its first or subsequent iteration. On the first iteration, it writes S/# into D and jumps to itself. On subsequent iterations, it doesn't write D, but jumps to itself if D <> 0.

    So, task A does a TCHECK to write a non-zero value into some register. Task B notices the non-0 value and can do whatever it wants about it, but can write 0 to the register to release Task A.

    Simple, elegant, Propeller-like!

    Enough hardware to let the software play!
  • cgraceycgracey Posts: 14,152
    edited 2014-03-08 19:32
    mindrobots wrote: »
    Simple, elegant, Propeller-like!

    Enough hardware to let the software play!


    It feels like the right solution. These days, tons of good ideas are developing because of the synergy on this forum.

    I really appreciate all you guys!