Bill, this last round of discussion, which was needed to whittle the whole thing down to that core bit of silicon, really turned out rather nicely. Thanks for the clear examples.
Keep in mind that what Bill and I are discussing is not quite what the present FPGA does, and his code will not run as written on the present P2.
Single stepping could be done by adding one bit to the state saved by TPAUSE such that another TPAUSE is automatically triggered after a single instruction is executed.
I think the current idea for single-step is to feed the slave task one clock slot, and read either side of that.
Out of 2**32 possible combinations, ONE (0) is reserved.
One of your flags uses up 2**31 possibilities.
If you need more flags, make a 'taskXflags' register.
Compact mixing is irrelevant; it takes several instructions to pack/unpack.
My way takes less code, and is flexible.
Heck, it can even be used to implement flags.
BUT
The key is that TPAUSE and TJNZ can use 0 as a very quick mechanism.
One positive thing has come out of my trying to explain it to you - if there is logic for it, it would be nice if TPAUSE could exit on its own when the task1req register becomes 0, as this would remove the need for a TRESUME instruction. That could not be done with flags implemented the way you propose.
- be used as breakpoints
- wait for events
- be used as system calls
It is not an interrupt, as the scheduler has to poll for the change in the request register.
TLBs etc. will be a fun discussion for P3!
Classic interrupts would be a much larger change, requiring all four tasks' states to be capable of being saved/restored, plus interrupt vectors, which would lead to interrupt priorities, etc. We avoid that huge headache with TPAUSE, which is a lot more flexible than classical interrupts.
It seems like this TPAUSE/TRESUME feature is very close to what is needed to support traps, which will be required for handling TLB misses if we ever get to trying to execute code from external memory through pages cached in hub memory. A TLB miss could automatically pause the task that causes it and jump to some predefined location. It would also need to store a trap reason in another predefined location. The code at that location would then service the trap, possibly modify the state saved by the hardware-triggered TPAUSE, and then execute a TRESUME on itself to return to the code that was running prior to the trap. I think this could all be done in the context of a single task, rather than requiring a scheduler task running in parallel with the task being scheduled. In fact, if you add a timer as a possible trap source, then you can do a scheduler within a single task.

And Bill's YIELD instruction could essentially be a software trap that is processed as a breakpoint. Single stepping could be done by adding one bit to the state saved by TPAUSE, such that another TPAUSE is automatically triggered after a single instruction is executed. I suppose this is essentially introducing interrupts to the P2, but it seems a lot simpler than two tasks running in tandem to effect essentially the same thing.
I don't see why it would be any more complex than what I already described. The external stimulus (TLB fault, timer, etc) does the following:
1) Store a "reason" to a known location. This would be "TLB miss", "timer", etc.
2) Save the current state of the task using the TPAUSE mechanism.
3) Transfer control to another known location.
Then that code can do whatever it wants to handle this "trap", and when it's done it can just execute a TRESUME to resume the execution stopped by the TPAUSE. If these "known locations" were in the area of registers that get remapped for each task, then every task could operate independently with its own trap reason and handler registers. And YIELD can serve as a breakpoint by being a software-triggered stimulus following the same sequence as above (a rough sketch follows below). This all requires only a single task, so all four HW tasks can do this at the same time without interfering with each other. There is no need to waste a task to run the scheduler.
Edit: Or is there only one WIDE for holding the state saved by TPAUSE shared by all tasks?
Edit2: Also, I'm not suggesting adding TLB to P2, just saying P2 already has about half of what is needed to do it.
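To make that concrete, here is a very rough sketch of the flow I have in mind. None of this exists today; trapreason, traphandler and the RSN_* codes are invented names for the "known locations" and reasons described above:
' hypothetical per-task trap registers, living in that task's remapped register window
' trapreason  - written by the stimulus: RSN_TLBMISS, RSN_TIMER, RSN_YIELD, ... (invented codes)
' traphandler - the COG address the hardware jumps to after the TPAUSE-style state save
traphandler
cmp trapreason,#RSN_TIMER wz 'timer tick? a thread switch could be done right here
if_z jmp #do_schedule
cmp trapreason,#RSN_YIELD wz 'Bill's YIELD, treated as a breakpoint
if_z jmp #do_breakpoint
' ... service any other reason, possibly editing the saved state first ...
TRESUME 'resume, on itself, the code that was running before the trap
' (do_schedule / do_breakpoint are not shown; each would end with the same TRESUME)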
TPAUSE doesn't do any of the work; it just writes a value to a register and loops to itself. You need the scheduler to be watching the register set by TPAUSE, and then the scheduler does the actions to save state, etc.
Currently only TASK 3 can have its full state persisted.
I'll explain. Except I am skipping anything TLB/MMU related as being out of scope for P2.
To have interrupts:
- you need multiple interrupt sources.
- you need to be able to enable/disable them
- you would need an interrupt vector for each interrupt
- if you have more than one source, you usually need a priority mechanism
- you would need these for all four tasks
- you would need all four tasks' internal states to be saved on an interrupt
- or you would need to bind a specific interrupt to a specific task, which needs more state and instructions
- you would need a specific 'return from interrupt' instruction that knows how to restore the state and for which task
All of the above needs far more silicon, instructions etc than the current tasking scheme being discussed.
An interrupt would be the hardware equivalent of TPAUSE, which is not an interrupt.
And you are correct: there is only one WIDE, and it saves the state of task3 only (to save gates & complexity).
@JMG: Yes, but I have seen Chip's final output enough times to know about where he will be on it.
To be frank, you tend to ask for every possible option, and push for more in silicon than most here do. Chip tends to filter out and design away the need for as many of those as he can, and Bill has a good grasp of where the sweet-spot cases may or do lie.
At the end of all that, less is often more, and generally speaking, that is "the Propeller way", where we can use software for a lot of things where more emphasis on hardware would typically be seen.
That dynamic is why I am here. It is a great philosophy, because software improves over time. Where we have carved out the silicon sweet spots, we maximize that potential.
An easy example is the P1 video system. It is just enough to take the really ugly bits out of that task, without being overly limiting otherwise. Had a bit more hardware been applied, we would have seen some tasks become easier and faster; however, we may well not have seen the advanced uses, as well as the non-video uses, we have. Over time, we ended up doing things on the P1 that were not even a consideration at the time it was designed.

The only sweet-spot case missed was PAL, nicely corrected in P2, which retains most of the "it will do far more than we think" qualities P1 had, while at the same time leveraging all we learned on P1.

One of those seriously improved cases is mixed-mode and dynamically drawn displays. Having a graphics-capable window on a text display is one example, possible on P1 but difficult. Few of us did it, due to the overall difficulty. Another is dynamically drawn displays intended to maximize RAM efficiency, again possible on P1, but difficult. Both of these are going to be considerably easier and more effective on P2.
It is going to be the same way with these advanced tasking features. We need enough silicon to open the door to as much of what is possible as we can, while not closing off options possible in software by defining too much in the hardware now.
We will put it all to use, just as we did with P1, and out of that will fall the really sweet-spot cases for P3, based on actual application and innovation. Put simply, we think we know the optimal use cases, etc... but we may well not know what is really effective and/or possible until after software gets written and applied so as to reveal it.
In the balance is something people can use easily, great performance, no OS needed to multi-task and multi-process if desired, etc...
Where the "include it all just in case" approach is taken, complexity is too high and adoption is more difficult and that is easily seen out there on other perfectly capable, but maddening and painful to use devices.
You mention control as very important. Agreed, but moving as much of that to software as is practical means not having to wade through tons of options and initialization just to run some basic concept code, or to get started. Most PASM programmers here picked up on it quickly due to that dynamic being well realized.
Yes, that sometimes means a peak-performance case or two isn't as well realized as some would like, but it also means doing the vast majority of things is lean, fast, easy, and consistent.
Less is very often more. This is why we don't see people attempting, on other devices, the kinds of things we see people attempting on a Propeller; those devices have so many options and controls that one doesn't even know where to start!
So far, we have preserved this for the vast majority of what I see the P2 capable of, and I'm very excited and pleased, because it means we can and will have experiences that are bigger than what P1 can bring us, while the overall feel we all got so much out of was not lost amidst a sea of well-meaning, but sadly obtuse, options.
These differences in ideology have played out well, in that our end results are inclusive without being a burden. Again, that is primary for a whole lot of us.
I'll explain. Except I am skipping anything TLB/MMU related as being out of scope for P2.
To have interrupts:
- you need multiple interrupt sources.
- you need to be able to enable/disable them
- you would need an interrupt vector for each interrupt
- if you have more than one source, you usually need a priority mechanism
- you would need these for all four tasks
- you would need all four tasks' internal states to be saved on an interrupt
- or you would need to bind a specific interrupt to a specific task, which needs more state and instructions
- you would need a specific 'return from interrupt' instruction that knows how to restore the state and for which task
Of course you are correct that to fully implement interrupts you'd have to do most or all of those things. However, to replace the two-task scheduler scheme with one where each task could run its own scheduler wouldn't require any of that.
All of the above needs far more silicon, instructions etc than the current tasking scheme being discussed.
Yes, it would probably require a little more. Has anyone suggested just storing the state on TPAUSE into COG registers rather than into a special unmapped WIDE? If you run four tasks using register remapping, with each task having 32 registers, 8 of them could be used to store the task's state when interrupted and two more could be used for the reason/handler registers. That leaves 22 registers for general use, which seems like a fair number, and also leaves 128 COG registers to be shared among the tasks or used for fast COG code functions.
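If it helps to picture it, the split would look roughly like this; the names and the 8-long guess for the saved state are mine, purely for illustration:
' hypothetical layout of one task's 32 remapped registers
task_state res 8 ' $00..$07 state saved by the hardware-triggered TPAUSE
task_reason res 1 ' $08 trap reason ("TLB miss", "timer", "YIELD", ...)
task_handler res 1 ' $09 COG address of this task's trap handler
task_regs res 22 ' $0A..$1F the 22 registers left for general use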
An interrupt would be the hardware equivalent of TPAUSE, which is not an interrupt.
Not sure I understand this statement. I'm just saying that most of what is needed to do this is already implemented in the TPAUSE instruction.
And you are correct: there is only one WIDE, and it saves the state of task3 only (to save gates & complexity).
Again, is there any reason that the state couldn't be stored in a WIDE in COG memory, within the remapped register region? Doing this actually saves gates, because you don't need the special-purpose WIDE used to store the state of task3.
Compact mixing is irrelevant; it takes several instructions to pack/unpack.
My way takes less code, and is flexible.
Heck, it can even be used to implement flags.
BUT
The key is that TPAUSE and TJNZ can use 0 as a very quick mechanism.
One positive thing has come out of my trying to explain it to you - if there is logic for it, it would be nice if TPAUSE could exit on its own when the task1req register becomes 0, as this would remove the need for a TRESUME instruction. That could not be done with flags implemented the way you propose.
Correct, avoiding TRESUME is a good idea.
Your other claims are not true in the general sense.
There is nothing fundamental about flags that excludes self resume, or has to dictate larger code.
An example on a virtual P2, designed for a packed atomic semaphore and message:
' compact master scheduler loop, uses a packed atomic semaphore and message
' this is a bare-bones service provider that can serve as a skeleton for a debugger or scheduler
scheduler
jb31 task1req, #task1handler 'B31 signals Slave is done, and waiting
jb31 task2req, #task2handler ' Slave sets B31 and waits looping until B31=0
jb31 task3req, #task3handler
jmp #scheduler
task1handler
' decode the request, and handle it
mov task1req,#MessToSlave1 ' also does ClrB31 => releases Slave, and pass (optional) message
jmp #scheduler
task2handler
' decode the request, and handle it
mov task2req,#MessToSlave2 ' ClrB31 = releases Slave, and pass (optional) message
jmp #scheduler
task3handler
' decode the request, and handle it
mov task3req,#MessToSlave3 ' ClrB31 = releases Slave, and pass (optional) message
jmp #scheduler
task1req long 0 ' two way message and semaphore register
task2req long 0 ' two way message and semaphore register
task3req long 0 ' two way message and semaphore register
'Slave task1
TPAUSEb task1req,#MessFromSlave 'Sets B31, ORs 9b #MessFromSlave onto task1req, allows 2^31 messages
' TPAUSEb Loops here until B31 is Zero, then can read other 31 bits in task1req as messages
' Test message from master, or just continue
Notice this is both smaller (in code and registers) and has a higher message ceiling than your code.
(jb31 can use the freed-up TRESUME opcode slot, so it adds no more opcodes.)
Chip wants to use WIDEs, as apparently that is the easiest and requires the least logic, from what I recall.
' TPAUSE as originally proposed was equivalent to:
MOV reg,#code
lp: JMP #lp
' Revised TPAUSE is equivalent to:
MOV reg,#code
lp: TJNZ reg,#lp
The reason it needs an instruction is to save memory, as it will be used very frequently, including as breakpoints.
What I meant is that it is a simple instruction, not an interrupt mechanism.
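As a breakpoint it really is just one line dropped into the code under test. A minimal sketch, assuming the revised TPAUSE that releases when the register goes back to 0 (dbg_req and the BRK_ code are made-up names):
' in the slave task being debugged
TPAUSE dbg_req,#BRK_AFTER_INIT 'stop here, leaving a breakpoint code in dbg_req
' ... execution continues only after the debugger writes 0 to dbg_req ...
' in the debugger/scheduler task
chk
tjz dbg_req,#chk 'wait for any breakpoint to fire
' ... inspect the slave's registers, dump state, etc ...
mov dbg_req,#0 'release the paused task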
T3SAVE and T3LOAD implement the WIDE state saving.
http://forums.parallax.com/showthread.php/125543-Propeller-II-update-BLOG?p=1248810&viewfull=1#post1248810
What are T3SAVE and T3LOAD? Are they new names for TPAUSE and TRESUME? I guess it is probably impossible to write a WIDE into COG registers in one tick, since the COG memory is only 32 bits wide, so my idea to make all four tasks able to do threading won't work without separate thread-state storage for each task.
These are the instructions that save and load task 3's context to and from the WIDE registers. These are single-cycle instructions that save/load upwards of 256 bits of context data.
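So a scheduler watching task 3 could swap software threads with something roughly like this. Only T3SAVE/T3LOAD and the request/release handshake come from this thread; the thread save areas and the WIDE-to-hub copy are hand-waved:
' sketch: scheduler task switching the software thread running in hardware task 3
switch_req
tjz task3req,#switch_req 'wait for task 3 to ask for a switch (via its TPAUSE)
T3SAVE 'snapshot task 3's context into the WIDE
' ... park the WIDE in the outgoing thread's save area,
'     fetch the incoming thread's saved context into the WIDE ...
T3LOAD 'restore the incoming thread's context
mov task3req,#0 'release task 3; it carries on as the new thread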
1) You are introducing a new two-op instruction... which I don't need
2) Your way only allows passing back a 31 bit result, not a 32 bit one
3) TPAUSE has no way of affecting b31
Sorry my friend, my way is simpler, requires one less instruction, a bit less logic, and allows a fuller return value.
Given my solution uses less code and data memory, I'm not sure where your "simpler" (?) claim comes from.
Of course, I already said it is a virtual P2, so it replaces TRESUME with something useful, and it proves you can have packed atomic semaphores and messages, well above the 512 values you claimed were so important earlier, and does it smaller in the most vital resource: register memory.
1) 'jb31 reg,#addr' new jump instruction
2) more complicated TPAUSE
' your suggested TPAUSEb equivalent in instructions
mov reg,#code ' could be S
or reg,bit31const
lp: and reg,bit31const wz
if_nz jmp #lp
' my revised TPAUSE in instructions
mov reg,#code ' could be S
lp: tjnz reg,#lp
' mine is simpler to implement, less logic required
3) limits return value to 31 bits
4) my solution uses 3 more registers in the whole cog, and can return 32 bit values
Sorry, I believe my solution is far superior
TPAUSE D,S/# 'write S/# to D and loop in place
TRESUME D/# 'increment PC of dormant task D/#
TPAUSE is used by a switchable thread; TRESUME is used by the supervisor to put a switchable thread back on the air.
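In use, that is one line in the thread and a short poll-service-resume loop in the supervisor; a bare sketch (threadreq and REQ_SLEEP are invented names, and it assumes the thread runs in task 3):
' switchable thread (runs in task 3 here)
TPAUSE threadreq,#REQ_SLEEP 'park with a request code visible to the supervisor
' ... execution continues only after the supervisor's TRESUME ...
' supervisor
svc
tjz threadreq,#svc 'poll for a request
' ... service the request code found in threadreq ...
mov threadreq,#0 'clear it for next time
TRESUME #3 'bump the dormant task's PC past its TPAUSE
jmp #svc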
Okay, I guess I'm too late to join this party. I'm too out of touch with the current plans. :-)
Anyway, looking at Bill's summary I guess T3SAVE and T3LOAD are what I should have mentioned in my message about having a task handle its own scheduling.
Correct, it is a virtual P2 - one that is designed to use less code and less memory.
Hehe, and yet earlier you claimed 512 states was important?
That 512 is limited by the immediate operand, and that is exactly the same in both implementations.
Besides, if anyone really wanted 32b fields, they can always use more wasteful extra registers.
Less code and less register memory is a very clear win, as those are what matter to designers. Gate count is essentially invisible to a user.
JB31 is only a slight variant (a subset, actually) of the existing JZ opcode. Both read and test a register; JB31 only tests the upper bit.
(and it is useful not just in this code case)
The TPAUSE is a variant opcode in both our cases.
Worst case, my method uses 3 more longs in the cog.
But it makes the Verilog simpler and needs less logic.
Btw, in most cases, only one request and result long will be needed for task 3.
I stand by my assertion that my way is strongly preferable due to the KISS principle.
.. and don't forget to add the extra lines of your code, to load the return values.
(You did compare my code with yours?)
Code that just works the same going in both directions wins any code-level KISS contest.
As far as Verilog code goes, neither approach is particularly challenging, and the Verilog is written only once. KISS decisions there are more about ease of use of the final device.
Thousands of users will write millions of lines of P2 code, and the P2 has a hard register ceiling.
Anything that lets users pack more into those registers, is worth a serious look.
task3result can be referenced directly, exactly the same as referencing task3req ... and it is more readable.
It takes the same amount of code to say
mov somereg, task3req
as
mov somereg, task3result
The only difference in memory is the extra long per task in the scheduler/debugger for taskXresult.
Try this - I've marked the extra line of code in your case, vs mine - for 3 instances, that is 3 more lines of code.
task2handler
' decode the request, and handle it
mov task2result,result ' optionally pass back result   <=== the extra line
mov task2req,#0 ' release task if PC not incremented past TPAUSE
jmp #scheduler
vs
task2handler
' decode the request, and handle it
mov task2req,#MessToSlave2 ' ClrB31 = releases Slave, and pass (optional) message
jmp #scheduler
Did you maybe miss that, by merging the message and the semaphore, my code updates both in a single line? In your code, it is one line per item.
Ok, I agree - I need 1 extra long to hold the result per task, and one extra instruction to clear taskXrequest, but only in the debugger/scheduler.
No difference in the client tasks.
In TPAUSE's old place is:
TCHECK D,S/# 'Write S/# into D and jump to self. On subsequent iterations, don't write D, but jump to self if D <> 0.
This gets rid of the need for TRESUME. It takes one bit of state storage to track TCHECK now, so that we know if it's on its first or a subsequent iteration. On the first iteration, it writes S/# into D and jumps to itself. On subsequent iterations, it doesn't write D, but jumps to itself if D <> 0.
So, task A does a TCHECK to write a non-zero value into some register. Task B notices the non-0 value and can do whatever it wants about it, but can write 0 to the register to release Task A.
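In use, the handshake would look something like this (a bare sketch; req and the REQ_CODE value are arbitrary names):
' task A - raise a request and sit on it
TCHECK req,#REQ_CODE 'first pass: writes the code into req, then jumps to self
' ... later passes just keep jumping to self while req <> 0 ...
' task B - notice it, act on it, release task A
poll
tjz req,#poll 'wait until req goes non-zero
' ... do whatever the non-zero code asks for ...
mov req,#0 'writing 0 lets task A's TCHECK fall through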
Simple, elegant, Propeller-like!
Enough hardware to let the software play!
It feels like the right solution. These days, tons of good ideas are developing because of the synergy on this forum.
I really appreciate all you guys!