Help Chip with design of Prop II

John Abshier · 2011-08-30 14:50

Chip asked (if I remember correctly) if users would rather have a 100 MHz Prop II with 3 cycle branches or a 160 MHz Prop II with 5 cycle branches. Could anyone who has experience with Prop simulators modify them to provide statistics for PASM and possibly Spin programs to let Chip make an informed decision?

John Abshier

Sapieha · 2011-08-30 14:56

Hi John.

It is not that simple.
160/5 is preferred ---> BUT what impact that has on HUB access IS more important!

Ps. 1 - What you describe does not give the entire picture of all the problems that need to be considered.
Ps. 2 - 160/5 gives a ratio of 32, which is nearly the same as the ratio of 33.3333 that 100/3 gives.
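
To put rough numbers on those two ratios, here is a minimal Python sketch of effective throughput for each option. It assumes every non-branch instruction takes one cycle and every branch pays the full 3- or 5-cycle cost. The function name `effective_mips` and the branch fractions are illustrative only, not statistics from any Prop simulator.

```python
def effective_mips(clock_mhz, branch_cycles, branch_fraction):
    """Average instructions per microsecond under a simple cost model.

    Assumption: non-branch instructions cost 1 cycle and every branch
    costs `branch_cycles` cycles (no delayed-jump slots are filled).
    """
    avg_cycles = (1.0 - branch_fraction) + branch_fraction * branch_cycles
    return clock_mhz / avg_cycles


if __name__ == "__main__":
    # Hypothetical branch fractions, from "no branches" to "all branches".
    for fraction in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
        slow = effective_mips(100, 3, fraction)
        fast = effective_mips(160, 5, fraction)
        print(f"branches {fraction:4.0%}: 100 MHz/3 -> {slow:6.1f} MIPS, "
              f"160 MHz/5 -> {fast:6.1f} MIPS")
```

Under this model the two options only break even when 75% of executed instructions are branches. The 32 versus 33.3 ratios describe the all-branch extreme, and the model says nothing about HUB access, timers or I/O.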

John Abshier wrote: »

Chip asked (if I remember correctly) if users would rather have a 100 MHz Prop II with 3 cycle branches or a 160 MHz Prop II with 5 cycle branches. Could anyone who has experience with Prop simulators modify them to provide statistics for PASM and possibly Spin programs to let Chip make an informed decision?

John Abshier

jmg · 2011-08-30 16:01

Sapieha wrote: »

160/5 is preferred ---> BUT what impact that has on HUB access IS more important!

- and also the impact on Timers and ports.
If the timers and port access can also work at 160MHz, then that is the clear choice...

What is the silicon impact of giving users the choice? (hehe..)

Cluso99 · 2011-08-30 16:07

From the basic question, the 160MHz/5 was a no-brainer. But Chip then elaborated that there were other parts of the picture such as requiring flipflops on the I/O pins that would mean the I/O pins would be delayed a clock cycle. I am not exactly sure of the ramifications in all this, but it is not the simple question that it seemed.

Mark_T · 2011-08-30 17:25

Starting to sound like speculative execution (where you actually do work on the predicted arm of the branch that has to be undone later if branch prediction was wrong). The I/O pins can't be undone so need a delay to protect them from speculative side-effects. IIRC most instructions are 1 cycle with the Prop2 pipelining? I'm guessing only mis-predicted branches cost the multiple cycles and that the branch-prediction scheme is as on the Prop1, branches are predicted to be taken.

Kevin Wood · 2011-08-31 00:41

I'm guessing that Chip already has the best simulator available at this time.

Dave Hein · 2011-08-31 06:40

Mark_T wrote: »

Starting to sound like speculative execution (where you actually do work on the predicted arm of the branch that has to be undone later if branch prediction was wrong). The I/O pins can't be undone so need a delay to protect them from speculative side-effects. IIRC most instructions are 1 cycle with the Prop2 pipelining? I'm guessing only mis-predicted branches cost the multiple cycles and that the branch-prediction scheme is as on the Prop1, branches are predicted to be taken.

I don't think Prop 2 will implement speculative execution. It will be capable of delayed jumps and normal jumps like in Prop 1. The delayed jumps will execute the instructions already loaded in the pipeline before it starts executing the instruction at the jump target address. So this means that a delayed jump will execute the 2 or 4 instructions following the delayed jump instruction. The normal non-delayed jump will flush the pipeline, which means that it will essentially execute 2 or 4 NOP instructions after the jump.

More levels of pipelining will make it a bigger challenge to write efficient code that contains lots of jumps. It will require re-ordering instructions to execute after a delayed jump or using the conditional execution flags efficiently. A good compiler should be able to handle delayed jumps efficiently.
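
The cost difference between the two jump types described above can be sketched with a small cycle-count model. This is only an illustration under stated assumptions: one cycle per instruction, a normal jump wasting all of its pipeline slots, and loop-body instructions that can be freely moved into the slots after a delayed jump. The function name `cycles_per_iteration` is made up for this sketch.

```python
def cycles_per_iteration(body_instructions, delay_slots, delayed):
    """Cycles for one pass of a loop that ends in a jump.

    Assumptions: each instruction issues in one cycle; a normal jump
    wastes `delay_slots` cycles flushing the pipeline; a delayed jump
    lets up to `delay_slots` body instructions run after the jump,
    with any unfilled slots padded by NOPs.
    """
    if delayed:
        moved = min(body_instructions, delay_slots)
        before_jump = body_instructions - moved
        return before_jump + 1 + delay_slots
    return body_instructions + 1 + delay_slots


if __name__ == "__main__":
    for slots in (2, 4):
        for body in (1, 4, 8):
            normal = cycles_per_iteration(body, slots, delayed=False)
            delayed = cycles_per_iteration(body, slots, delayed=True)
            print(f"slots={slots} body={body}: normal={normal} delayed={delayed}")
```

With 4 slots and a 4-instruction body, the model gives 9 cycles per pass for a normal jump and 5 for a delayed jump. Loops with fewer body instructions than slots gain less, because the leftover slots still cost a cycle each.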

Seairth · 2011-09-12 05:46

Based on this conversation, I'm a bit concerned about the direction that Prop II seems to be taking. The beauty of Prop I (in my mind, at least) was its simplicity despite being multi-processor. It was not necessary to think about things like interrupts, execution pipelines, etc. in order to write efficient and effective code. But it seems to me that Prop II is eschewing a great deal of that simplicity by adding a complex execution pipeline. I understand that it's a performance feature, but that only works if the developer can fully utilize the feature. Further, I think this would make cross-cog coordination even more difficult (without extensive use of hub operations).

Instead, I would rather see the focus be on significant improvements to the coordination and inter-communication of the cogs (and other propellers). I want the Prop II to be easier to use, not harder. If I have to start thinking about things like pipeline stalls or predictive branching (which means I cannot easily estimate MIPS) and have to add more code to keep cogs synchronized, I may as well go back to traditional processor development. Put another way, if Prop II is going to end up being as complex to code for as Microchip, Atmel, and other processors, I don't see enough incentive to stick with the Prop.

If there is a feature (other than avoiding the use of interrupts) that still makes the Prop II stand out from the crowd, I would like to know what it is and why that feature makes this new processor that much better than the competition.

Heater. · 2011-09-12 06:21

Seairth,

I agree that the beauty of the Prop I is its simplicity. That extends all the way from the overall architecture through the assembler instruction set, through the Spin language and interpreter all the way up to the Prop Tool (and its clones now). It even includes the simplicity of dropping objects from OBEX or elsewhere into your projects with little worry that they may break some of your existing project.

A great deal of that drop dead ease of use simplicity is made possible by the freedom from having to worry about hooking into interrupts, setting priorities, worrying that some functionality is going to "steal" time from another. In that way multi-cores and the lack of interrupts are a god send.

Another big part of it is the fully deterministic timing of the instruction execution.

I do not believe Chip is about to sacrifice any of that in the Prop II. What is it about this conversation that has made you think that might happen?

I'm confident that the Prop II will be as easy to program as the Prop I and that most existing code for Prop I will run on it with minor changes. Although it might well have some extra goodies that take some serious study to utilize fully. But then how many Prop I users are fully conversant with its video generation capabilities?

Dave Hein · 2011-09-12 09:41

I believe Parallax is committed to supporting Spin on the Prop II, so most of the complexities of the pipeline architecture will be hidden from the Spin language programmer. Prop I Spin and PASM code should run on the Prop II after being re-compiled by a new and improved Prop tool. Therefore, current users of the Prop I should be able to program the Prop II without having to learn all of the new features of the Prop II.

Prop II's delayed jumps are fairly easy to handle. It just requires putting some of the instructions in the loop after the delayed jump instruction. Even if we end up with 5 stages of pipelining it would still allow for a tight 5-cycle loop that contained a delayed jump followed by 4 instructions in the loop. Hopefully, the Prop tool will support an #ifdef directive that would allow switching between Prop I and Prop II code in the same file.

BTW, I believe Prop II will support a direct cog-to-cog port so that hub accesses aren't required for one cog to talk to another one.

Seairth · 2011-09-12 13:26

Heater. wrote: »

What is it about this conversation that has made you think that might happen?

As I do not have a complete picture of the inner workings of the Prop II, I admit this is more of a gut feeling than anything else. With the standing up of Parallax Semiconductor and the obvious increase in complexity of the internal design of the Prop, it seems to me that the company is attempting to compete with "the big dogs". And I think that's a great thing, as long as they keep the Prop true to its original design intent. I went away from the other micro company precisely because of the strengths of the Prop. If those strengths are still going to be there in the new version, then I don't have anything to worry about.

pjv · 2011-09-12 14:53

@ Heater;

It's clear that you have a very strong distaste for interrupts because they can be somewhat difficult to tame, and in the hands of inexperienced or incompetent programmers, that could be problematic. Well understood. But calling the lack of interrupts a "god send" is, I think, a bit over the top.

Frankly, I wish the Prop HAD an interrupt capability, for me to choose to use or not to use. There are things I do in assembler that would benefit from a simple "int-on-count", or "int-on-pin" interrupt to get away from those evil (IMHO) waitcnts and waitp's. The "wait" functions, while very convenient, are an enormous resource waster, at least in maximally deployed Props. For simple programs or LED Flasher, of course it is not an issue.

So, for me, lack of an interrupt (single priority) is tolerable, but far from "a god send". If an interrupt was available, you COULD choose not to use it.

Without intending any offence, I hope you can appreciate some others' point of view.

Cheers,

Peter (pjv)

User Name · 2011-09-12 16:16

pjv wrote: »

The "wait" functions, while very convenient, are an enormous resource waster, at least in maximally deployed Props.

Peter, you COULD choose not to use them.

pjv · 2011-09-12 17:02

Hi User;

Unfortunately I am not offered an alternative.

Cheers,

Peter (pjv)

Seairth · 2011-09-12 18:32

pjv wrote: »

@ Heater;

It's clear that you have a very strong distaste for interrupts because they can be somewhat difficult to tame, and in the hands of inexperienced or incompetent programmers, that could be problematic. Well understood. But calling the lack of interrupts a "god send" is, I think, a bit over the top.

This thread isn't about Heater.

pjv wrote: »

Frankly, I wish the Prop HAD an interrupt capability, for me to choose to use or not to use. There are things I do in assembler that would benefit from a simple "int-on-count", or "int-on-pin" interrupt to get away from those evil (IMHO) waitcnts and waitp's. The "wait" functions, while very convenient, are an enormous resource waster, at least in maximally deployed Props. For simple programs or LED Flasher, of course it is not an issue.

I understand the need for interrupts on platforms that were designed with them in mind (e.g. Microchip PICs). However, the Propeller was specifically designed without an interrupt mechanism. It is one of the things that makes a Propeller what it is. My original concern (though not specifically about interrupts) was that adding or changing one of the core design features would make the Propeller something else entirely. In my case, I was concerned about the determinism of the code (especially across cogs), but that does not seem to be a concern of others (well, at least one other) who's been more involved in the Prop II discussions.

As for the use of WAITPNE (etc.), I somewhat share your feelings on this. The notion of using up a cog to just sit and wait for an input change seems like a waste of a cog. However, I feel that this will be less of a concern on Prop II. With the single-cycle execution time and the (potential) 160MHz operating speed, the seven other cogs should be able to handle about 4-8 times the work load. Also, with the inter-cog capabilities being added, it should be significantly more efficient for a dedicated WAITPNE cog to perform interrupt-like processing without having to go through the hub. Sure, it's still not the same as an actual interrupt, but that doesn't bother me in the least on the Propeller platform.

Cluso99 · 2011-09-12 18:36

pjv: I know Heater is a master of interrupts from knowing a bit about his past and day job. I too have mastered interrupts. Perhaps a good example of this is in the 80's with a modem controller to the Rockwell chipset (and others) that performed the control of the Rockwell DSPs and Analog chip (3 chip set), implementing the AT command set, and implementing a soft UART (no internal UART in the MC68705U3S!), all on a 4MHz xtal (resulting in a 1MHz clock, 4 cycle instruction = 4us per instruction) processor. In addition, the UART had to be able to run at 2% tolerance for the Apple //c which was out of spec.

However, IMHO the lack of interrupts is fantastic! It forces the user to think in different ways. We do not have objects that use interrupts. Yes, it may be true that interrupts would be useful occasionally. But, that is far outweighed by forcing the user to use different methods.

The new Prop II is a much more complex chip. However, that only comes into play when those functions are required to extract somewhat special applications out of the chip. For example, a great majority of these will be specific to objects for various intelligent peripherals. Therefore, these complexities will be hidden from many users. How many fully understand the VGA or composite video??? Even I don't fully understand the TV, yet I wrote a 1pin version (using code by ericball and potatohead).

Prop II also contains a register interface between cogs, supported by special instructions. Full specs for this are not available, but there were preview discussions.

And for the final reality check... The Prop 1 is absolutely NOT being replaced by Prop II. They are different animals and all the work on Prop I is not wasted because the Prop I will still have its advantages too. Prop II is another leap step ahead (if that makes sense) and covers some things desirable in certain Prop I applications.

jazzed · 2011-09-12 18:45

Prop1 is designed for everyone. Prop2 is designed for everyone else.

Waitpeq/waitpne will have an abort mechanism in Prop2.
Complaining about interrupts or lack thereof is futile.

Seairth · 2011-09-12 19:13

jazzed wrote: »

Waitpeq/waitpne will have an abort mechanism in Prop2.

Has there been any discussion about adding a WAITCHG (or WAITANY) type of instruction? To be clear, this would allow the WAIT to wake for a change to *any* of multiple masked pins. I realize that this can be done right now with a WAITPNE, but it requires several additional instructions for setup. As a result, the WAITPNE approach is not atomic (i.e. a pin could pulse during the setup code and therefore get missed once the WAITPNE is entered). This hasn't really been an issue for me up to this point, but I wouldn't mind tighter WAIT loops and fewer instructions. I think it's also a more natural way to use WAIT instead of polling (especially if Prop II will have a timeout/abort capability).
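
The missed-pulse window described above can be illustrated with a tiny timeline model. This is a sketch only, not PASM: the pin trace, the tick units and the function name `wake_time` are all hypothetical, and it simply models "read the pin state, spend some clocks on setup, then wait for the pin to differ from what was read".

```python
def wake_time(trace, snapshot_at, setup_clocks):
    """Return the tick at which a snapshot-then-wait sequence wakes.

    trace: pin level (0 or 1) at each clock tick.
    snapshot_at: tick at which the code reads the pin state.
    setup_clocks: ticks spent on setup before the wait starts watching.
    Returns None if the pin never differs from the snapshot while waiting.
    """
    reference = trace[snapshot_at]
    wait_from = snapshot_at + setup_clocks
    for tick in range(wait_from, len(trace)):
        if trace[tick] != reference:
            return tick
    return None


if __name__ == "__main__":
    # Hypothetical trace: a short pulse on ticks 2 and 3, then low again.
    trace = [0, 0, 1, 1] + [0] * 8
    print("slow setup (8 clocks):", wake_time(trace, 0, 8))  # pulse missed
    print("no setup gap:", wake_time(trace, 0, 0))           # pulse caught
```

With an 8-clock setup gap the pulse has come and gone before the wait begins, so the model never wakes. With no gap between the snapshot and the wait, which is what a WAITCHG-style instruction would amount to, the same pulse is caught at tick 2.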

Phil Pilgrim (PhiPi) · 2011-09-12 19:33

Interrupts exist as a mere kludge for single-processor micros that can only pretend to be a Propeller. Time spent idling in a cog in a waitxxx instruction is no more wasteful than the same time spent by the interrupt processor in a more traditional micro. Craving interrupts in a Propeller is like the pining of a cured paraplegic to have his crutches back.

I'm astonished that this fact isn't so patently obvious by now that it hasn't long since disappeared over last year's horizon of discussion.

-Phil

pjv · 2011-09-12 20:13

Sorry Phil, I'm not in agreement with you here.

While I truly love the Prop, and many wonders can be worked with it, my belief still is that for a very busy ("fully deployed") Prop, the capability of an interrupt would still be very useful. This situation exists mainly when all cogs are busy executing code, and one assembler cog is stuck in one of the "wait" instructions looking for an event. That cog would be better used executing more code threads, and having an interrupt briefly perform some activity.

Without interrupts, there are probably always work-arounds of some sort, but having an interrupt available would to me be "a god send" as I would then be able to squeeze more yet out of a cog.

I also realize that interrupts are not part of the design philosophy, and in many respects that does make things easier. And usually easier is better, unless you need more from the Prop......

Perhaps I was misunderstood when I reacted to Heater's exclamation on the miracle of no interrupts.

Cheers,

Peter (pjv)

jazzed · 2011-09-12 21:34

Seairth wrote: »

Has there been any discussion about adding a WAITCHG (or WAITANY) type of instruction? To be clear, this would allow the WAIT to wake for a change to *any* of multiple masked pins. ...

There is no WAITCHG-like instruction that I know of. I believe there is an analog comparator per pin though. The comparator would allow state detection on any of multiple pins. The actual details of how states are reported are very vague at the moment. This information comes from a post by Chip that shows a white on blue diagram.

Today you can use the COG's counter module to capture pin change events which can be registered in the PHSx accumulator. You will end up with a change count which is not blocked by code or other hardware. The downside is that only 2 pins can be monitored per COG. The upside is you can check the state of PHSx any time to tell if there was an event. To me this is just as good as an interrupt except for lack of an associated ISR.
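
The idea of a counter that records pin events in the background while the main code checks the total whenever it likes can be modelled in a few lines. This is a behavioural sketch in Python, not Propeller code: the class `EdgeCounter` and its fields are hypothetical stand-ins, it counts rising edges only, and it does not show the actual counter register setup.

```python
class EdgeCounter:
    """Software model of a counter that accumulates on rising edges."""

    def __init__(self, frq=1):
        self.frq = frq        # amount added per detected edge
        self.phs = 0          # accumulator the main code can read any time
        self._previous = 0    # pin level seen on the previous clock

    def clock(self, pin_level):
        """Advance one clock; accumulate if the pin went from low to high."""
        if pin_level and not self._previous:
            self.phs += self.frq
        self._previous = pin_level


def count_edges(trace):
    """Feed a whole pin trace through a fresh counter and return the total."""
    counter = EdgeCounter()
    for level in trace:
        counter.clock(level)
    return counter.phs


if __name__ == "__main__":
    # The "main code" only looks at the accumulator once, after the fact.
    print(count_edges([0, 1, 1, 0, 1, 0, 0, 1]))
```

The main loop never blocks here. It compares the accumulator against the value it saw last time and treats any difference as "an event happened", which is the interrupt-like behaviour described above, minus the ISR.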

4x5n · 2011-09-13 09:33

I've only been working with the propeller for a couple of weeks now and have only scratched the surface on its capabilities and don't know about half the instructions available in spin yet. Back in another life I did a lot of interrupt driven machine control programming on the 6809 and 68hc11 and have to admit to missing interrupts! I think the waitp* instructions are nice but the entire cog stops and waits until the input arrives. In my mind that makes it less efficient than having an interrupt service a handler and return when complete. An example would be in the case of an injection molder where I would energize a solenoid to move a hydraulic ram and that ram is to continue to move until it trips a limit switch. No need to have the cpu sit and wait for the switch. The switch triggers an interrupt and when the switch is tripped a handler deals with it quickly and moves on.

It does take a while to learn how to write "clean" and maintainable interrupt driven code but it can be done with experience. Would it be too much to ask for a "watchp*" type command? :-) That way when a pin goes high or low a routine gets run. It doesn't have to replace anything but simply be an additional option. Again I spent ~10 years writing interrupt driven code and have spent less than two weeks with the propeller. :-)

Mike Green · 2011-09-13 11:24

The whole idea is that with 8 cogs, one of them can indeed be assigned to just wait ... idle ... for something to occur. The biggest issue ... power consumption ... is moot because the cog is effectively powered off during the wait. Would you worry about using an I/O pin because you have only 16 available? ... Only if that were a limited resource for your specific project.

Heater. · 2011-09-13 11:43

4x5n,
You might find this suggestion a bit upsetting but there is another way, polling.
Take your example of detecting a hydraulic ram limit. Let's say you have a cog waiting on a timer such that it wakes up every 1ms. So every 1ms it has time to do 20000 instructions. Easily enough to check a pile of such rams and other conditions and perform some action.
Ah, you say, polling is bad. It wastes processor time and it can have a high latency. So what? We have a bunch of other cores to play with so the fact that polling eats time on one of them does not matter so much. The reaction time is probably fine for things like hydraulic rams. The little sleeps reduce power consumption but can be made shorter or removed if a quicker reaction time is required.
I suspect that for a lot of applications interrupts are not required when you have multiple cores. Having all that interrupt hardware on each core will waste silicon that could be used for better things. Interrupts would soil and complicate a very clean architecture for not much gain.
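
The 20000-instructions figure and the latency trade-off can be checked with some quick arithmetic. The sketch below assumes an 80 MHz Prop I with 4 clocks per instruction, which is what that figure implies. The function names and the 100-instruction poll routine are hypothetical.

```python
def instructions_per_wake(clock_hz, clocks_per_instruction, period_us):
    """Instructions a cog can run between two timed wake-ups."""
    return clock_hz * period_us // (clocks_per_instruction * 1_000_000)


def worst_case_latency_us(clock_hz, clocks_per_instruction,
                          period_us, poll_instructions):
    """Longest delay from an event to the end of the poll that sees it.

    Assumption: the event arrives just after a poll sampled the pin, so
    it waits one full period and then for the poll routine to finish.
    """
    poll_us = poll_instructions * clocks_per_instruction * 1_000_000 // clock_hz
    return period_us + poll_us


if __name__ == "__main__":
    print(instructions_per_wake(80_000_000, 4, 1000))        # budget per 1 ms
    print(worst_case_latency_us(80_000_000, 4, 1000, 100))   # microseconds
```

Under those assumptions a 100-instruction poll uses 0.5% of the 1 ms budget and reacts within about 1005 µs in the worst case. Shortening the sleep trades power for reaction time.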

User Name · 2011-09-13 12:00

pjv wrote: »

Unfortunately I am not offered an alternative.

The alternative is not to use them. I could show you a listing 1000 lines long which incorporates real-time asynch I/O, and yet doesn't have a single wait-type command in either SPIN or PASM. I left them out simply because I couldn't afford to tie up a cog that long. Yes, it does polling. But the polling was very easily interspersed.

4x5n · 2011-09-13 12:08

Heater,

The problem with polling is that you then need to weave the polling throughout the code. This makes things far more complicated than handling interrupts. It also makes the software very difficult to maintain and ends up causing the biggest evil (if you can call it that) of meaning that mainline code needs to stop running to handle the input. In reality it's a form of software interrupt.

I agree that with 8 cogs running at 80MHz it wouldn't cause any real performance hits to dedicate a cog for handling inputs. I've been working on developing the framework for doing exactly that. Complete with the ability to inject the request for that input to be "handled" by other objects. While I've done little "real" programming in languages that directly support objects (like spin, python, c++, etc) I started to write my code in objects many years ago and expect to take full advantage of the object support provided by spin and pasm. That means having an object to monitor inputs, another to control temp, another to run stepper motors, another for servos, well you get the idea.

Who knows, maybe in six months I'll "come around" and agree that interrupts are more evil than good. :-) Until then I'd like even primitive support such as a command like "watchp*" that would watch a pin and execute a handler of some type and another "unwatchp*" command to stop watching. That way we could set that watch in an object and instead of having a cog sitting and waiting for an input it would actually be doing "real work". It would be of course in addition to the current command set.

Phil Pilgrim (PhiPi) · 2011-09-13 12:30

4x5n wrote:

The problem with polling is that you then need to weave the polling throughout the code.

In PASM this is easier to do than you might think, by using coroutines: http://www.parallaxsemiconductor.com/an014

-Phil
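
As a rough illustration of the coroutine idea, here is a sketch using Python generators as a stand-in for the PASM technique. It is not taken from the linked application note: the task names, the `run` scheduler and the fake pin reader are all hypothetical. The point is only that the polling lives in its own routine and each task hands control over at a `yield`.

```python
def main_task(log):
    """Mainline work, written without any polling code in it."""
    step = 0
    while True:
        log.append(f"work {step}")
        step += 1
        yield  # hand control to the next task


def limit_switch_task(read_pin, log):
    """Polling kept in its own routine instead of woven into the mainline."""
    while True:
        if read_pin():
            log.append("limit switch tripped")
        yield


def run(tasks, rounds):
    """Round-robin between the tasks, one step each per round."""
    for _ in range(rounds):
        for task in tasks:
            next(task)


if __name__ == "__main__":
    levels = iter([0, 0, 1, 0])  # hypothetical pin samples, one per round
    log = []
    run([main_task(log), limit_switch_task(lambda: next(levels), log)], 4)
    print(log)
```

Each task reads as straight-line code, yet the switch is still sampled once per round. The cost is that a task must reach its next `yield` before the other one runs, so reaction time depends on how far apart the yields are.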

mindrobots · 2011-09-13 12:41

4x5n,

I'm curious what types of interrupts you are looking at processing as far as urgency and frequency. If it's an urgent (data will be lost if not processed NOW!), frequent (microsecond) interrupt, then dedicating a COG seems like a reasonable approach to meet the demand. If it's just a peripheral device that has no other way to notify the processor except for interrupts (because that's what all the cool kids understand), then polling seems reasonable. If the peripheral(s) just hold up hands every so often and then the processor still needs to read the data from them (Yo, the temperature changed out here), poll and process or poll and pass the notice off to some other interested COG.

4x5n · 2011-09-13 13:35

I should mention that in the applications I'm currently turning to, interrupts aren't needed and would needlessly add complexity since I will have more cogs than I could use efficiently even if I wanted to use them all. However back in the day when I worked as a PE designing hardware and writing software to handle industrial machine control that wasn't always the case. Remember that in a factory environment it's rare to only need to control one aspect of a line. In the case of an injection molder for example there is a lot going on around the molder that needs to be controlled and it would be easy to have situations in which all cogs are "busy" controlling aspects of a machine or line. Back in the dark ages when I worked in the field we in frequently needed to use multiple cpu's to handle the multiple subsystems of say a large printing press, injection molding line, heat treater, etc. In the case of an industrial situation when a "panic stop" button is pressed the line needs to be brought to a "safe state" NOW!! That is a clear case for an interrupt. For the most part hardware interrupts were used to allow a processor to turn on a solenoid, motor, pump, etc and keep it on until an external hardware condition requires it to be turned off. By using interrupts the processor could go on doing other things and not have to poll or wait.

They're not always needed but nice to have when the most important thing for a processor to do is respond to an input without delay!!

Mike Green · 2011-09-13 14:03

In case of emergencies, one cog can stop other cogs dealing with the crucial hardware and take control of the signals needed for an orderly stop.

4x5n · 2011-09-13 16:28

I'm not trying to be argumentative about this and I see that people on both sides of the interrupts/ no interrupts debate have very strong feelings about it both ways. :-)

I also understand that with spin, pasm and multi cogs that interrupts aren't strictly needed. I do however feel that interrupts aren't the evil that I get the impression some people seem to think they are and a lot of times can make the resulting code easier to maintain, more extensible and "cleaner" than having to write code to poll inputs throughout the code.

Since this thread is about the design of the prop II, and hopefully about ideas for ways to improve the current prop, I'd like to suggest the addition of spin and pasm commands along the lines of "watchp*" and "unwatchp*". The commands would be set to monitor input pins and execute a routine when a pin goes high or low. Again it would be an addition to the current list of commands. They would be a variation of the waitp* commands and of course their use would be optional and not frequently needed. :-)
