What's needed for preemptive multitasking? - Page 2 — Parallax Forums

What's needed for preemptive multitasking?


Comments

  • Heater. Posts: 21,230
    edited 2014-02-23 04:25
    evanh,
    It could be said that preemptive multitasking is the super-set.
    It could be said. But then there are the weird systems where even reading from a socket, file, or other I/O does not halt your code. Rather, the call returns immediately, your code can continue running, and the data arrives in your program later via a callback function. Such systems can be preemptive (they have to be to get the IO working) but never voluntarily yielding.
  • Heater. Posts: 21,230
    edited 2014-02-23 04:36
    evanh,
    jmg has given a possible solution to creating preemption. That would be doable assuming his question is answered with a yes: we can read and write the other PC/flags in a Cog.
    I'm not sure I follow you.

    My question assumes:
    a) We don't have preemptive scheduling. As we don't now.
    b) We want to create cooperative threads, say in C using pthreads.
    c) We are running from HUB. In HUBEXEC mode.

    Then, how does a thread voluntarily give up execution and allow a scheduler to run the next thread that is ready?

    The first part is easy: just make a call into some scheduler primitive function. Call it "suspend" or "yield".

    But then what? How does that "suspend" swap out the state of the running thread for another one (PC, stack pointer, etc.)?

    Perhaps I can answer it myself:

    The thread's PC is on the stack at this point, or in a link register, so that is easy to fetch and save somewhere.
    Whatever we are using as the stack pointer can be saved and pointed to some other thread's stack.
    Likewise the important registers.

    Anything missing here?
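
    For concreteness, the whole of that "suspend"/"yield" can be sketched in a few lines of portable C. This is only a rough illustration of what a cooperative yield has to do, not anything P2-specific: POSIX swapcontext() stands in for the hand-written save/restore a Cog would need, the names are made up, and it assumes every thread's context has already been set up (e.g. with getcontext/makecontext).

    #include <ucontext.h>

    #define MAX_THREADS 4

    static ucontext_t contexts[MAX_THREADS]; /* saved PC, stack pointer and registers, per thread */
    static int current;                      /* index of the running thread */
    static int num_threads = 2;              /* threads whose contexts are already initialised */

    void yield(void)
    {
        int prev = current;
        current = (current + 1) % num_threads;      /* trivial round-robin pick */
        /* Save this thread's state into contexts[prev] and resume from
           contexts[current]; we return here when some other thread yields back. */
        swapcontext(&contexts[prev], &contexts[current]);
    }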
  • evanh Posts: 15,235
    edited 2014-02-23 04:47
    Heater. wrote: »
    ... Such systems can be preemptive (they have to be to get the IO working) but never voluntarily yielding.

    But they still do voluntarily yield somewhere else. They would become very sluggish otherwise; hence my initial comment about smooth operation.
  • evanh Posts: 15,235
    edited 2014-02-23 04:58
    Heater. wrote: »
    My question assumes:
    a) We don't have preemptive scheduling. As we don't now.

    I believe jmg is thinking we already have a means of forcing a thread redirection without any yielding. I wouldn't have a clue if pthreads can be cooperative.

    The suspending jmg is referring to is just a means of controlling a task via a second task by forcing the first to stop execution prior to setting a new task context for that thread.
  • Heater. Posts: 21,230
    edited 2014-02-23 05:06
    evanh,

    Here things get a bit woolly.

    The whole concept of a thread is that the functionality you want to implement is written as an endless loop. The important part here is that the loop potentially never exits. The programmer can write his functionality that way because he knows other threads will get a chance to run, either preemptively or when he inevitably hits an I/O operation. (Yeah, he might have to insert a "suspend" in some cases.)

    But what of event based systems? Here there is no endless loop. The programmer writes his functionality as a collection of event handling functions. The concept of a thread goes away. No need for "yield", you handle the event and you are done until the next one.
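
    For concreteness, a bare-bones dispatcher for that model could look like the sketch below. It is a generic illustration, not a P2 API, and all the names are made up; note the only loop is the dispatcher's own, while each handler simply runs to completion.

    typedef void (*handler_t)(void);

    #define QUEUE_SIZE 16
    static handler_t queue[QUEUE_SIZE];   /* pending events, stored as handler pointers */
    static volatile int head, tail;

    void post_event(handler_t h)          /* called by I/O code, timers, etc. */
    {
        queue[tail] = h;
        tail = (tail + 1) % QUEUE_SIZE;
    }

    void dispatch(void)
    {
        for (;;) {
            while (head == tail)
                ;                         /* nothing pending: the system just sits idle */
            handler_t h = queue[head];
            head = (head + 1) % QUEUE_SIZE;
            h();                          /* handle the event, then done until the next one */
        }
    }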
  • Heater. Posts: 21,230
    edited 2014-02-23 05:11
    evanh,
    The suspending jmg is referring to is just a means of controlling a task via a second task by forcing the first to stop execution prior to setting a new task context for that thread.
    That is how I understood it.

    Basically it's an interrupt.

    In this case instead of a simple hardware interrupt controller mechanism stomping on a running piece of code and redirecting execution elsewhere, it is another thread doing the stomping.

    All in all I have a bad feeling about the idea. As I would if introducing interrupts were suggested.
  • evanh Posts: 15,235
    edited 2014-02-23 05:11
    Heater. wrote: »
    But what of event based systems? Here there is no endless loop. The programmer writes his functionality as a collection of event handling functions. The concept of a thread goes away. No need for "yield", you handle the event and you are done until the next one.

    Each handler is effectively another tasklet. If no events are pending then no tasklets are running. A yielded idle state is the norm.
  • evanh Posts: 15,235
    edited 2014-02-23 05:16
    Heater. wrote: »
    In this case instead of a simple hardware interrupt controller mechanism stomping on a running piece of code and redirecting execution elsewhere, it is another thread doing the stomping.

    Obviously, the system integrator has to want to use such a mechanism. It comes with caveats and a 1/16 processing penalty. It's not like Chip has added any hardware or instructions to make this happen.
  • Heater. Posts: 21,230
    edited 2014-02-23 05:20
    Yes, I did say it was "woolly".

    The whole "thread" thing is an abstraction for the programmer. It arises because of those endless loops. The idea is to free the programmer to be able to write what the hell he likes and know that the system won't get hung up as a result. On top of that we add memory protection and create "big threads" or processes, so that the thread programmer can no longer trample other threads' memory.

    Event based programming is a different abstraction. There are no threads in that abstraction.

    Of course, all this gets support from hardware in the CPU, interrupt controller, memory manager, etc., so concepts at the hardware level get messed up with concepts at the software abstraction level.

    An event based programming abstraction may well be running on a machine with all the normal thread based hardware support. That's just because that's how the machines we have are made. It need not be so.
  • Heater. Posts: 21,230
    edited 2014-02-23 05:23
    evanh,
    It's not like Chip has added any hardware or instructions to make this happen.
    Quite so. And isn't that the topic of this thread? Should the P2 have hardware added or changed to support preemption?

    My vote is no because it is the same as asking "Should we add interrupt support?". It sounds like a whole huge level of complexity that does not fit the P2 design. A big can of worms.
  • Dave Hein Posts: 6,347
    edited 2014-02-23 06:53
    So with the current state of thread support on P2 you can have up to 4 threads per cog. One of the threads could be doing a waitcnt/passcnt or waitpeq/jp. Once that thread exits the waitcnt/passcnt or waitpeq/jp it can then prevent further task swapping and take over the cog for a short period of time, correct? So isn't that essentially the same as interrupt support?
  • Bill Henning Posts: 6,445
    edited 2014-02-23 07:28
    Two ways (without interrupts/pre-emption):

    1) any I/O lib call (including sockets) re-schedules

    2) compiler inserts an occasional YIELD

    But both of the above fail when someone makes a mistake and compiles

    while(1){}

    into a pthread.
    Heater. wrote: »
    Yes, "yielding" can happen. When reading from a socket or file for example. It is assumed that the thread/process has nothing else to do until the data arrives, so it's time to see if anyone else can run. That's not the kind of rescheduling that distinguishes "preemptive" from "non-preemptive". They generally both do that.

    Which is related to my question: when multiple pthreads are running in HUBEXEC mode, how do we get that yield or thread reschedule done? Still a question in non-preemptive scheduling.
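
    For what option 1 above looks like in practice: a library read call can simply poll and yield until the data arrives. A rough sketch, where rx_ready()/rx_take() are hypothetical driver calls and yield() is the cooperative reschedule primitive discussed earlier:

    extern int  rx_ready(void);   /* hypothetical: is a byte waiting? */
    extern int  rx_take(void);    /* hypothetical: fetch the waiting byte */
    extern void yield(void);      /* cooperative reschedule primitive */

    int read_byte(void)
    {
        while (!rx_ready())
            yield();              /* nothing to do here, so let other threads run */
        return rx_take();
    }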
  • Heater. Posts: 21,230
    edited 2014-02-23 07:48
    Bill,

    You did not answer the question. A yield can happen by calling an I/O function or a dedicated yield/suspend. The question was: having done that, how is the scheduler going to suspend one thread and start another? My guess, as I outlined above, is that it happens as on any other CPU performing cooperative scheduling.

    Unless I have missed some special P2 points here.

    while(1) {} is always a problem. That is why there are languages designed for safety critical real-time systems that are event based and do not support loops!
    That is to say they don't have threads (at the language abstraction level), preemptive or otherwise.
    One could imagine building a processor that supports that abstraction in hardware. Sequences of instructions start at the start, completely finish at the end, and backward jumps are not allowed!
  • Bill Henning Posts: 6,445
    edited 2014-02-23 08:07
    Sorry, I just got up, and no coffee yet - I misunderstood the question.

    Without interrupts, or a thread cycle counter that jumps to a scheduler (an interrupt basically), the scheduler cannot suspend a thread.

    An alternative is having one task be able to suspend another (like jmg's 1/16 thread as a controlling thread) but that would need something like:

    STOPTASK pc_z_c_save_reg, #task0..3 ' task refers to hardware tasks

    STARTTASK pc_z_c_load_reg, #task0..3 ' task refers to hardware tasks

    That way a task running at 1/16 could implement pre-emption: whenever the current time slice expires, it enters the TSINGLE state, saves PTRA/B/X/Y, optionally saves the contents of AUX and cog RAM, loads the next thread's PC, Z, C, PTR*, {AUX}, goes back to TMULTI, and starts the next thread.

    Hmm.. this is more suited to the prop architecture, and avoids interrupts, at the cost of 1/16 cycles.

    BUT, those 1/16 cycles could be used to support all the normal wakeup and scheduling functionality (including checking for events being waited on, e.g. sockets, serial data available, etc.), and STOPTASK/STARTTASK should be easy to implement.

    What is cool is that the temporary full speed task control Chip added ties in with this perfectly. Sorry I did not remember the latest names for those ops.

    In order to preserve sanity, such a "threaded cog" should only run a scheduler (say task 0) and the user threads (task 1).

    (I am trying VERY hard not to ask for a STEPTASK pc_z_c_savereg, which would give us hardware single stepping for cog and hub-exec code.)
    Heater. wrote: »
    Bill,

    You did not answer the question. A yield can happen by calling an I/O function or a dedicated yield/suspend. The question was: having done that, how is the scheduler going to suspend one thread and start another? My guess, as I outlined above, is that it happens as on any other CPU performing cooperative scheduling.

    Unless I have missed some special P2 points here.

    while(1) {} is always a problem. That is why there are languages designed for safety critical real-time systems that are event based and do not support loops!
    That is to say they don't have threads (at the language abstraction level), preemptive or otherwise.
    One could imagine building a processor that supports that abstraction in hardware. Sequences of instructions start at the start, completely finish at the end, and backward jumps are not allowed!
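
    In C-like pseudocode, the 1/16 scheduler task described above boils down to something like the sketch below. stop_task()/start_task() are hypothetical stand-ins for the proposed STOPTASK/STARTTASK ops, and struct thread_ctx for the PC/Z/C/PTRA/PTRB state; none of this exists on the P2 today, it only illustrates the control flow.

    #include <stdint.h>

    struct thread_ctx {               /* the per-thread state the scheduler swaps */
        uint32_t pc;
        uint8_t  z, c;
        uint32_t ptra, ptrb;
    };

    /* Hypothetical wrappers around the proposed STOPTASK/STARTTASK ops. */
    extern void stop_task(int hw_task, struct thread_ctx *save);
    extern void start_task(int hw_task, const struct thread_ctx *load);

    #define NUM_THREADS 4
    static struct thread_ctx threads[NUM_THREADS];
    static int current;

    void scheduler_tick(void)         /* run by the 1/16 task when a time slice expires */
    {
        stop_task(1, &threads[current]);           /* freeze the user task, save its PC/flags/PTRx */
        current = (current + 1) % NUM_THREADS;     /* pick the next software thread */
        start_task(1, &threads[current]);          /* load its context and restart hardware task 1 */
    }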
  • pjv Posts: 1,903
    edited 2014-02-23 08:08
    For a moment here I thought it was April Fools' Day! But no... very funny, Chip!

    But on a more serious plane, in my opinion (and as yet I don't understand much about the P2) pre-emptive scheduling would best be served by an interrupt on pin change, and an interrupt on counter match. And the interrupt would save the CPU state, as well as restore that on resumption.

    What preemptive would give you is the ability to write threads in a more natural manner, with endless loops permitted. But that, and perhaps a slight improvement in speed, would be about all. So it really comes down to how one thinks about threads. Co-operative threads require release of the processor at appropriately timed intervals, and that is not difficult.

    So without an interrupt mechanism, thoughts about a pre-emptive approach are probably pointless.

    Cheers,

    Peter (pjv)
  • mindrobots Posts: 6,506
    edited 2014-02-23 08:24
    In my old but heavily multi-processing, multi-tasking mainframe experience, it always started with an interrupt: either an external event (usually an I/O completion of some sort), or a timer expiration for time-sliced tasks that had a time limit set by the task dispatcher/scheduler. In this case, the architecture supported a user register file and an executive register file (where the interrupt processing ran). The job of the interrupt handlers was to put things into a switchable task as quickly as possible and go back to the dispatcher to let the switched work continue.

    On the prop, you can have timer expirations in a kernel cog act as interrupts or have a WAITPxx instruction sitting on a pin act as an interrupt. So now, you can generate simulated interrupts to a COG (or part of a COG). Now you need to be able to cause whatever thread you want to pre-empt to stop, and then have some way to save execution state (everything up to COG RAM in this case?), potentially load something or start something to "handle" the interrupt, and then go back to your task list (with all its underlying data structures and saved states) to give an execution resource to the next ready, highest-priority task. Then your "executive" can go back to waiting for an interrupt.

    Of course, the more of this you push into hardware, the easier the software is to write...usually.

    Is this really where the P2 design needs to invest additional hardware resources? I know we can't predict the future, but with all that the P2 now has, is this REALLY, HONESTLY going to be something that is going to be used a lot? As a research and experimentation effort, I can see it being a lot of fun to play with. The interrupt and task dispatching part of the OS was always my favorite part to play in, but it was also the most complicated to get your head around.

    Again, is this really a P2 (or microcontroller) feature that people think will get real world use?

    Preventing while(1) {} is always a problem at the executive level. At the user level, all tasks have a maximum timeslice, so if they don't yield or aren't interrupted, they run until their timeslice is used up and go back on the queue for future dispatching. Each time they exhaust their timeslice, they get dropped in priority. Again, fun stuff, but it can be messy code.
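
    As a rough illustration of that timeslice/priority-decay policy (not code from any real dispatcher, all names made up):

    #define LOWEST_PRIORITY 7

    struct task {
        int id;
        int priority;                 /* 0 = highest */
    };

    /* Called by the dispatcher when a task burns its whole slice without
       yielding or being interrupted. */
    void on_timeslice_exhausted(struct task *t)
    {
        if (t->priority < LOWEST_PRIORITY)
            t->priority++;            /* drop it so better-behaved tasks win next time */
        /* ...then put it back on the ready queue for future dispatching. */
    }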

    Please, just say "NO" :smile:
  • Heater. Posts: 21,230
    edited 2014-02-23 08:32
    pjv,
    So without an interrupt mechanism, thoughts about a pre-emptive approach are probably pointless.
    Exactly. I imagine there is a ton of complexity there:

    1) What is going to generate the interrupts: pin changes, timers, video, software interrupts, other threads, other Cogs?
    2) What is going to get interrupted? Which COG, which thread?
    3) What about priorities?
    4) What about enabling/disabling interrupts?

    Sorting all that out is another ton of YAFI and/or register bits. Gack.
  • Sapieha Posts: 2,964
    edited 2014-02-23 08:41
    Hi ALL.

    I have only one question.

    Is the P2 going to be a micro-controller, or do you all want it to be a MAINFRAME CPU?
  • mindrobots Posts: 6,506
    edited 2014-02-23 08:42
    Heater. wrote: »
    pjv,

    Exactly. I imagine there is a ton of complexity there:

    1) What is going to generate the interrupts: pin changes, timers, video, software interrupts, other threads, other Cogs?
    2) What is going to get interrupted? Which COG, which thread?
    3) What about priorities?
    4) What about enabling/disabling interrupts?

    Sorting all that out is another ton of YAFI and/or register bits. Gack.

    Yes, it gets nasty, and you do need the hardware to support it unless you simulate all of this. I haven't looked at pjv's code, but I've wanted to see how he does it.

    In my case, the processors were all identical, so whoever got the interrupt handled it and then found something else to work on in the dispatch queue. The Prop probably wouldn't work this way - especially with some of the ideas presented so far. It got even uglier when we considered processor affinity. If something was ready but had affinity for a different processor, the processor running the dispatching code at the time found some other work and then, before it left, interrupted the other processor (at least I think we did, it was long ago). Another cool feature but another layer of complexity.
  • Heater. Posts: 21,230
    edited 2014-02-23 08:49
    Sapieha,
    Is the P2 going to be a micro-controller, or do you all want it to be a MAINFRAME CPU?

    +1
  • mindrobots Posts: 6,506
    edited 2014-02-23 08:53
    Sapieha wrote: »
    Hi ALL.

    I have only one question.

    Is the P2 going to be a micro-controller, or do you all want it to be a MAINFRAME CPU?

    My $$'s have always voted Micro-controller! My interests have always voted micro-controller!
  • pjv Posts: 1,903
    edited 2014-02-23 09:19
    Mindrobots,

    If you look at my code, please keep in mind that it is very old and not SPIN-friendly. I have continued to develop it, incorporating major enhancements and making it very SPIN-friendly. The concepts of the early versions are still valid, though, and you could build your own based on them.

    Regrettably I have not been able to publish it because of MIT license requirements. I would love to get it out there for every hobbyist to use for free. It's the free commercial applications that are unfair. If only there was some way to exclude free commercial applications from the MIT license ........

    Cheers,

    Peter (pjv)
  • potatohead Posts: 10,255
    edited 2014-02-23 09:31
    Well then, it is settled. A P3 chip is needed. :)

    Stay strong Bill. I appreciate it. :)
  • mindrobots Posts: 6,506
    edited 2014-02-23 09:34
    potatohead wrote: »
    Well then, it is settled. A P3 chip is needed. :)

    I didn't think that issue needed to be settled; I always thought P3 was a given. :lol:

    What needs to be settled is the P2 and actually getting a P2 CHIP!! That's where the problem lies.......
  • JRetSapDoog Posts: 954
    edited 2014-02-23 10:13
    Yikes! Duck for cover!
  • jmg Posts: 15,148
    edited 2014-02-23 12:21
    Heater. wrote: »
    ....

    In that way our critical thread has, in a way, preempted all the others to grab all the CPU capacity for a while.

    Is that what you were meaning or am I way off?

    If not, can what I describe above be done currently?

    Yes, the new ONE/MULTI opcode allows any one thread to remap the time slice to ME=100% for any duration.
    That could be done before, in multiple lines, but now it is more atomic.

    The detail others have asked for, and which I can see is also vital for good Debug, is the ability to read/modify another Task's PC/Flags.

    If that can be done, then the rest can pretty much be added as needed in SW, and good Debug can be achieved.

    The new ONE/MULTI opcode has added a way to 'freeze' other tasks, so that does buy the time to do as much replacing as you want.


    With Hub-exec the need to swap out and then replace a whole code chunk has become less important, but Debug needs to be able to view/edit all tasks' PCs and Flags, and once you have a means to do that, you can (I think) completely capture/restore an operating state, should you want to. I guess that could be swapped right out to SDRAM, if someone wanted to.

    I think the present lower limit for Debug overhead is 1 task minimum, or 1/16 of the time resource, and 1 of 4 tasks.
    (so you can easily debug 3 other tasks)
    A fractional task may be possible, if the Debug stub can co-operate with user code for some share of that 1/16 task. In that case, you could run (effectively) 4 user tasks, with debug, and lose some % of time in the simplest, slowest task.
    That gets closer to totally invisible, but needs some SW co-operation.

    Another benefit of Read/Modify of other tasks PC/Flags, is in the area of Watchdog / Self checking code.
    A user can launch 3 control tasks, and the 4th is there purely as a watchdog/blackbox type operation.
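
    As a software-only illustration of that watchdog idea (this variant does not need access to other tasks' PCs; it just uses heartbeat counters, and every name here is made up):

    #include <stdint.h>

    #define NUM_CONTROL_TASKS 3

    volatile uint32_t heartbeat[NUM_CONTROL_TASKS];  /* each control task bumps its own entry */

    extern void wait_ms(unsigned ms);                /* hypothetical delay primitive */

    void watchdog_task(void)                         /* the 4th task */
    {
        uint32_t last[NUM_CONTROL_TASKS] = {0};

        for (;;) {
            wait_ms(100);
            for (int i = 0; i < NUM_CONTROL_TASKS; i++) {
                if (heartbeat[i] == last[i]) {
                    /* task i made no progress: log it, reset it, flag the blackbox... */
                }
                last[i] = heartbeat[i];
            }
        }
    }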
  • jmg Posts: 15,148
    edited 2014-02-23 12:29
    <snip>

    An alternative is having one task be able to suspend another (like jmg's 1/16 thread as a controlling thread) but that would need something like:

    STOPTASK pc_z_c_save_reg, #task0..3 ' task refers to hardware tasks

    STARTTASK pc_z_c_load_reg, #task0..3 ' task refers to hardware tasks

    <snip>

    (I am trying VERY hard not to ask for a STEPTASK pc_z_c_savereg, which would give us hardware single stepping for cog and hub-exec code.)

    Provided you can build STEPTASK pc_z_c_savereg, with the ones above, then you can avoid asking for it ;)

    As long as one task can somehow Read/Write another task's PC & Flags, then functions for STOPTASK pc_z_c_save_reg can be built.
    Smaller is always good, but the important step is the full access & replace ability.
  • Bill Henning Posts: 6,445
    edited 2014-02-23 12:40
    Not sure you could build STEP from START & STOP, as I am not sure even back to back START/STOP would execute only one instruction on the desired task.

    Also this will need

    GETTASKPTRA reg,#0-3
    GETTASKPTRB reg,#0-3

    as there are task local pointers

    and the behaviour of rep loops?

    I really like this idea for P3... but I am afraid it is too big a change at this point for P2
    jmg wrote: »
    Provided you can build STEPTASK pc_z_c_savereg, with the ones above, then you can avoid asking for it ;)

    As long as one task can somehow Read/Write another task's PC & Flags, then functions for STOPTASK pc_z_c_save_reg can be built.
    Smaller is always good, but the important step is the full access & replace ability.
  • jmg Posts: 15,148
    edited 2014-02-23 13:31
    Not sure you could build STEP from START & STOP, as I am not sure even back to back START/STOP would execute only one instruction on the desired task.

    Good clarifying point. Yes, it would need START/STOP to execute one opcode, as a minimum.
    (The present Task mapping register flexibility may help here)
    Other controllers have this type of rule, to allow simpler debug/monitors.
    Also this will need

    GETTASKPTRA reg,#0-3
    GETTASKPTRB reg,#0-3

    as there are task local pointers

    While that is nice to have, if you can R/W memory and change PC, you can patch in code to read any other task-private items.
    The debug just grows a little larger in size.
  • Ahle2 Posts: 1,178
    edited 2014-02-23 15:08
    PLEASE MAKE IT STOP!!!

    This thread is the exact reason why I am afraid of saying much in the main thread. A few words of mine all of a sudden trigger a massive thread with so many replies that I don't have time to read them all. First of all, I want to be clear that I never asked Chip to implement any new instructions or hw support regarding PM. My question was whether there was a way of getting registers from another task. The reason I did not say why I wanted to know was because I was afraid that something like this thread would happen. :(
    Chip asked what I wanted it for and I had to reply. As I have said in the main thread... the P2 does not need a microkernel, interrupts, hw support for scheduling, or memory management in hw... It is just me wanting to tinker with these things for my own pleasure. And I want to do it all in SW. As far as I know, the only thing extra that is needed is a way of retrieving registers from another thread, and it can all be done in software.

    Btw, I feel so young when all you gurus go on about things that you did 10-20 years before I was born. But I think respect goes both ways. I have a lot to learn from you, and maybe my energy and creativity can give something to the community as well. BUT PLEASE MAKE THIS THREAD DIE!!!

    /Johannes