Head Spinning Around JMPRET
JonnyMac
Those who know me know I'm a pretty regular guy who writes straightforward -- easy for my fellow actors to understand -- code. I have happily used FDS for years without worrying about the mechanics. Until now. I want (hence need) a simple FDS UART that can also run a 1ms timer.
I do tend to swallow my own medicine so I read what I could about JMPRET and even created a project with three LEDs that happily blink at independent rates. Next I converted one of those LED processes to a timer process that updates a hub variable and it works great.
So what's the problem? Me. Every time I look at the way I've set things up my mind gets into a twist thinking it's going to break somewhere. Here's my skeletal code for three processes:
dat

entry         mov     p1code, #proc1
              mov     p2code, #proc2
              mov     p3code, #proc3
              ' other setup

proc1         jmpret  p1code, p2code
              ' run process 1
              jmp     #proc1

proc2         jmpret  p2code, p3code
              ' run process 2
              jmp     #proc2

proc3         jmpret  p3code, p1code
              ' run process 3
              jmp     #proc3

My mind thinks there's a lurking landmine. Is there, or am I seeing things being tired and stressed about another big project?
Comments
But as the cog doesn't have a stack, you have to give jmpret the location of the jmp that ends the subroutine: the instruction writes the 9-bit return address (current PC+1) into the source field of that jmp, so the code self-modifies.
https://www.parallax.com/sites/default/files/downloads/AN014-Coroutines-v1.0.pdf
p.s. jmp is just a jmpret with the result write disabled (NR), so no return address is stored.
I suspect you could lose some bits on RX if you created a 1ms pause at higher baud rates and were transmitting at the same time. What's the baud rate?
The plot usually thickens with additional segments in each process. That is where jmpret shows its stuff, otherwise you might as well execute the tasks in sequence. For each process it will be something like:
Basically JMPRET is a GOTO. It does a GOTO some address, call it "targetAddr". However it also saves the address of the following instruction some place, call it "returnAddr".
In this way you can set up two loops. Call them "A" and "B". Loop A runs around and can do a JMPRET into loop B at any time, even into some place in the middle of the loop. Similarly loop B can do a JMPRET to returnAddr, which gets you back to continuing loop A from wherever it jumped out.
If I'm making sense there.
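A minimal PASM sketch of the two-loop arrangement just described. The labels loopA and loopB and the vectors avec and bvec are made-up names for illustration, not taken from FDS, and the nop lines stand in for real work.

```pasm
DAT
              org     0
entry         mov     bvec, #loopB          ' B has not run yet, so it starts at its top

loopA         nop                           ' (placeholder) first part of A's work
              jmpret  avec, bvec            ' leave from the middle of A: save A's resume point, continue B
              nop                           ' (placeholder) second part of A's work
              jmpret  avec, bvec            ' leave again at the end of A's loop
              jmp     #loopA

loopB         nop                           ' (placeholder) B's work
              jmpret  bvec, avec            ' save B's resume point, continue A where it left off
              jmp     #loopB

avec          res     1                     ' where loop A resumes
bvec          res     1                     ' where loop B resumes
```

Each jmpret writes the address of the instruction after it into its own vector and jumps to whatever address the other vector holds, so each loop picks up exactly where it last left off.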
In FDS there are two such loops, RX and TX, that do just that.
This used to be known as coroutines back in the day. A neat way to make threads. Kind of.
It's not clear to me if/how you can extend that to three cooperating loops without introducing some kind of scheduler loop in the middle. Gets complex, messy and slow.
Perhaps it's best not to mess with the JMPRETs in FDS but just add your 1ms handling to the RX or TX threads.
Whatever you do it will take time, cause jitters in the timing and limit the usable baud rates.
jmpret is all about (the appearance of) cooperative scheduling. A coroutine can read and act on CNT or it can manipulate its cog counters.
-Phil
In the above "suspend()" gets you out of the infinite while loop and into another. "ping pong" as mentioned above.
The whole point here is that you can write two totally independent infinite loops (threads). The logical flow of one, as you read it, does not depend on the other.
Except for that pesky suspend() thing that you have to sprinkle around.
JMPRET is that suspend().
Arguably FullDuplexSerial can be written without such coroutines. In fact I seem to recall there is a serial driver that does that and performs better.
That's the more traditional approach, but it only works with two tasks using jmpret (i.e. ping-pong fashion). It relies upon the src field of the jump vector being written after it's read. If you have more than two, I think you're stuck with something like my example above, which can be extended to as many tasks as you want.
-Phil
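For readers following along, here is a sketch of the two-task, single-vector ping-pong form as I understand it from the description above. It is not a listing from the thread; the names taskA, taskB and other are invented, and the nop lines are placeholders.

```pasm
DAT
              org     0
entry         mov     other, #taskB         ' the single vector starts out pointing at task B

taskA         nop                           ' (placeholder) A's work
              jmpret  other, other          ' jump to where B left off, then store A's resume point
              jmp     #taskA

taskB         nop                           ' (placeholder) B's work
              jmpret  other, other          ' jump to where A left off, then store B's resume point
              jmp     #taskB

other         res     1                     ' always holds the resume point of the task that is not running
```

This only works because the jump target is read from the vector before the return address is written back into it, and because there are exactly two tasks: with a third task there would be nowhere to keep the extra resume point.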
I was thinking that if you want many tasks then you need a supervisor in the middle. All the tasks JMPRET to their coroutine, which is the supervisor. But the supervisor can juggle return addresses and JMPRET to some other task. I have not thought it through much.
Actually at first glance I don't see how your example can be extended to more than two tasks. It looks like the PASM version of my pseudo code. No doubt I'm missing a point.
Got it. Cool. You are chaining the tasks, round robin style.
Not quite like my pseudo code. It requires every task to know about another peer to JMPRET to. In the right order. But so what.
When they are all together at the top of the cog, the vector addresses are simply 0, 1, 2, 3 ... out to the number of separate processes. Suppose there are 4 processes each with several sections. The jmprets look like this: jmpret 0,1 throughout process A, jmpret 1,2 in process B, jmpret 2,3 in process C, and jmpret 3,0 in process D. Of course the vectors will in practice be symbolic names like p0code...p3code, but that is how they evaluate to numbers when they are positioned at the top of the cog memory.
It is not so hard to understand. Execution of the first jmpret in process A stores the address of its own next instruction to be executed at address 0, then the execution jumps to the address stored at 1. That is in process B. So on around the circle, until in process D the jmpret 3,0 stores the address of its own next instruction at address 3 and the execution jumps back to the address that was just recently stored at address 0 for process A. It is important to note that the contents of the jmpret instructions do not self-modify. Only the vectors change as the sections of each coroutine execute. If process A has three sections, then at different times address 0 will point to and vector to one of three code addresses in process A as those sections come into play.
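A sketch of that circle of four in PASM, using symbolic vectors. The vector names p0code...p3code come from the post; the process labels and the nop placeholders are assumptions, and because the vectors are reserved at the end here (my choice, for simplicity) they do not sit at cog addresses 0 to 3 as in the description.

```pasm
DAT
              org     0
entry         mov     p1code, #procB        ' each vector starts at the top of its process
              mov     p2code, #procC
              mov     p3code, #procD        ' p0code is filled in by A's first jmpret

procA         nop                           ' (placeholder) A, section 1
              jmpret  p0code, p1code        ' store A's resume point, jump to where B left off
              nop                           ' (placeholder) A, section 2
              jmpret  p0code, p1code
              jmp     #procA

procB         nop                           ' (placeholder) B's work
              jmpret  p1code, p2code        ' B hands off to C
              jmp     #procB

procC         nop                           ' (placeholder) C's work
              jmpret  p2code, p3code        ' C hands off to D
              jmp     #procC

procD         nop                           ' (placeholder) D's work
              jmpret  p3code, p0code        ' D hands off to A, closing the circle
              jmp     #procD

p0code        res     1
p1code        res     1
p2code        res     1
p3code        res     1
```

Note that the jmpret instructions themselves never change while running; only the four vectors do.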
Consider the 4-port serial object. It has 8 coroutines, each with 3 sections, a total of 24 jmpret instructions. There are 8 vectors. A complication is that its initialization has to selectively enable or disable any of the 8 coroutines. The application might require only 2 1/2 serial ports, not all 4. To do that, the initialization has to patch the jmpret instructions. In the above example with 4 processes, suppose it is necessary to disable process B. The initialization has to patch all the jmprets in process A to read jmpret 0,2, so that process A hands off directly to process C. The 4-port object contains 8 separate processes, threaded together in the same fashion. The order of threading does not matter so long as it completes the circle.
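One way the patching step could be written, as a sketch only. This is not the 4-port object's actual initialization: the enable flag read from PAR, the labels a_yield1 and a_yield2, and the register t1 are all hypothetical.

```pasm
DAT
              org     0
entry         mov     p1code, #procB
              mov     p2code, #procC
              mov     p3code, #procD
              rdlong  t1, par         wz    ' assumed: the hub long at PAR is 0 when process B is unused
        if_z  movs    a_yield1, #p2code     ' rewrite the source field of each jmpret in A ...
        if_z  movs    a_yield2, #p2code     ' ... so that A hands off to C and B is never entered

procA         nop                           ' (placeholder) A, section 1
a_yield1      jmpret  p0code, p1code
              nop                           ' (placeholder) A, section 2
a_yield2      jmpret  p0code, p1code
              jmp     #procA

procB         nop                           ' (placeholder) B's work
              jmpret  p1code, p2code
              jmp     #procB

procC         nop                           ' (placeholder) C's work
              jmpret  p2code, p3code
              jmp     #procC

procD         nop                           ' (placeholder) D's work
              jmpret  p3code, p0code
              jmp     #procD

p0code        res     1
p1code        res     1
p2code        res     1
p3code        res     1
t1            res     1
```

Every jmpret in the process ahead of the disabled one has to be patched, which is why the count of jmprets per process matters for the initialization code.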
I think this would do the job:
-Phil
If you take concepts you have described a couple of steps farther, and make the JMPRETs from the individual tasks jump to a common entry point in a scheduler which then dispatches the next-to-be-executed thread, then you can get a very effective albeit basic "RTOS".
I have written such a scheduler, and it can run 8 (or theoretically more) individual threads in a single cog very nicely, all without regard for each others' sequence or timing. The basic kernel is only 34 instructions long and has an overhead of about 2 usec plus 1/3 usec per active thread. The technology dates back almost 10 years, and has evolved considerably since that time.
Some enhanced features (with more code) permit you to SUSPEND and RESUME operation of any thread, from any other thread. It also polls a mailbox in hub and can execute commands such as upload, download and others from SPIN routines that are running in other cogs, all while the threads' sequence and timing are largely unaffected.
There is a caveat however, and that is some jitter is caused by individual threads competing for the cog's CPU at the same instant. This typically is not an issue except at high baud rates (>115200) and having many threads running simultaneously.
The Propeller is such a wonderful machine to let us do all these neat functions.
I would consider publishing this code for use in non commercial applications.
Cheers,
pjv (Peter)
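To make the idea of a common scheduler entry point concrete, here is a bare-bones round-robin sketch. It is not pjv's kernel, which is unpublished and does considerably more; NTASKS, the task and vector labels, and the registers cur and t1 are invented for illustration.

```pasm
CON
  NTASKS = 3                                ' number of threads in this sketch

DAT
              org     0
entry         mov     vec0, #task0          ' every thread starts at its top
              mov     vec1, #task1
              mov     vec2, #task2
              mov     cur, #NTASKS-1        ' so the first dispatch picks thread 0

sched         add     cur, #1               ' round robin: advance to the next thread ...
              cmpsub  cur, #NTASKS          ' ... wrapping back to 0
              mov     t1, #vec0
              add     t1, cur               ' t1 = cog address of that thread's vector
              movs    :go, t1               ' patch the jmp below to read that vector
              nop                           ' at least one instruction between patch and use
:go           jmp     0-0                   ' resume the thread where it last yielded

task0         nop                           ' (placeholder) thread 0 work
              jmpret  vec0, #sched          ' yield: save resume point, enter the scheduler
              jmp     #task0

task1         nop                           ' (placeholder) thread 1 work
              jmpret  vec1, #sched
              jmp     #task1

task2         nop                           ' (placeholder) thread 2 work
              jmpret  vec2, #sched
              jmp     #task2

vec0          res     1                     ' the three vectors must be consecutive
vec1          res     1
vec2          res     1
cur           res     1                     ' index of the thread now running
t1            res     1
```

Compared with chaining the tasks directly, no thread needs to know about any other thread, at the cost of a few extra instructions on every yield.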
Jon? You asked about straightforward code and the mechanics of jmpret in order to add a 1ms timer to FDS. The 1ms timer should indeed be an easy add-on to FDS, provided either that there is a little tolerance in how close the duration has to be to exactly one ms, or that the baud rate is low enough that the whole cycle can be synchronized to cnt in a way that brings it to 1ms right on the dot. The latter would be something like an RTOS. Something makes me suspect that you want to be your own Actor and roll out the code yourself.
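A rough sketch of the "little tolerance" variant: a timer coroutine that polls cnt each time it gets control, using the same deadline comparison FDS uses for bit timing. The hub layout (ticks per ms at PAR, the millisecond count in the following long) and all labels are assumptions, and the uart task is only a placeholder.

```pasm
DAT
              org     0
entry         rdlong  mstix, par            ' assumed: the long at PAR holds clock ticks per ms
              mov     msptr, par
              add     msptr, #4             ' assumed: the next hub long receives the ms count
              mov     millis, #0
              mov     uartcode, #uart       ' the UART task starts at its top

timer         mov     tmrcnt, cnt           ' first deadline = now + 1 ms
              add     tmrcnt, mstix
:wait         jmpret  tmrcode, uartcode     ' let the UART task run
              mov     t1, tmrcnt
              sub     t1, cnt               ' t1 = ticks left until the deadline
              cmps    t1, #0          wc    ' C is set once the deadline has passed
        if_nc jmp     #:wait
              add     tmrcnt, mstix         ' next deadline is relative to the last, so errors do not accumulate
              add     millis, #1
              wrlong  millis, msptr         ' publish the count to the hub
              jmp     #:wait

uart          nop                           ' (placeholder) one short slice of rx/tx work
              jmpret  uartcode, tmrcode     ' hand control back to the timer
              jmp     #uart

mstix         res     1
msptr         res     1
millis        res     1
tmrcnt        res     1
tmrcode       res     1
uartcode      res     1
t1            res     1
```

The count stays correct on average, but each individual update can be late by however long the other task runs between yields, which is the tolerance mentioned above.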
No it is not. A requirement for posting in OBEX is the open MIT license, and I am not in agreement with that in respect of commercial applications.
In discussing this with Ken, his alternate suggestion was to publish it as an app note, where the commercial restriction conditions could be specified.
And this I have not bothered to do.
I think if there is enough interest for personal applications, I could do that.
Cheers,
Peter (pjv)
Our scheduler's suspend worked by juggling with the return address and such on the stack. The scheduler maintained simple queues of ready and waiting tasks. It had a simple round robin scheduling scheme. We had other primitives like "wait" and "delay". Because we used the stack those suspends could be buried deep in a call stack and still work, unlike the JMPRET scheduling.
One sneaky thing it had was the ability to read/write disk without blocking. It would schedule tasks to run while the disk access was going on behind the scenes. Took some juggling with BIOS interrupts to do that.
I'm not sure most of the world would call such a cooperative scheduler an RTOS. "Real Time Operating System" implies strict timing deadlines that a cooperative scheduler cannot really provide. Usually involving interrupts and priorities. Of course in the Propeller we achieve real-time guarantees by using dedicated cores.
@pjv,
Why is the MIT license a sticking point? Seems unlikely any commercial enterprise is going to get rich just by using your code. Even less likely they would want to pay you for use of it. If they ever needed such a scheduler they could make their own easily enough. Especially given Phil's inspiration above. I would have thought the benefit to the Propeller community, mostly non-commercial use, would outweigh such concerns. And think of the fame, respect and standing in the community you would earn!
As you can see, it is derived from the bit-timing code used in FDS (using ticks per ms passed via cognew).
Again, I was concerned about my overall structure, even though my initial test code was working just fine. I'm not worried anymore, and my serial code will actually be simpler than what's in FDS because I only need True mode, and there are no circular buffers required -- just a one-byte command [tx] and a one-byte status [rx]. I will probably add a running flag when a start command is issued that will allow the timer to increment, and will clear and stop the timer when an EOF comes back from the player.
There is a serial PAUSE command -- unfortunately, it's a toggle (would prefer to have separate RESUME). I'm going to XOR a bit
Here's the PASM -- please remember that this is a WIP; your suggestions are appreciated, and please go easy on me.
Updated 20 FEB 2017
About the serial receive coroutine: once it receives one of the command characters, it takes quite a long time to process the command before it gets back to test the other coroutines. Depending on the baud rate, it might mess up if it happened to be in the middle of transmitting a command at the same time that it received one, full duplex. However, I suspect that may never happen in this controller. It works, yeh!
When I first started playing with it I connected it to a Propeller with FDS and just watched the status returns. The player will report the current file at about a 12 Hertz rate -- "about" being the key word. The gentleman who sells the product in the US has a lot of experience with the BASIC Stamp so he made it (working with off-shore engineers) Stamp friendly. I have some customers who can benefit from a device like this using EFX-TEK controllers. The player loops its "zero" file, can play any other file, then goes back to file zero. I wanted to add a timer to facilitate chaining files based on time. Admittedly, this is a bit of an exercise for me, but I think it will be useful; with this object I don't have to worry about creating a timer elsewhere.