The New 16-Cog, 512KB, 64 analog I/O Propeller Chip

Heater. · 2014-04-09 23:22

Chip,

There are no more multi-personality wait/loop instructions, though there probably ought to be for WAITVID, so that other tasks can continue when one is doing a WAITVID.

Please clarify. Does this mean that a task cannot do a WAITXXX instruction on a pin or CNT without blocking all tasks? i.e. tasks have to poll for pin and time events?
I guess we are OK with that, it's what FullDuplexSerial does in the P1 now.

Heater. · 2014-04-09 23:26

Eh, where did all this SONOS thing come from? As far as I know it's not on the table as OnSemi don't offer it. Not to mention it's yet another huge change in plans that we don't need at this point.

cgracey · 2014-04-09 23:34

Heater. wrote: »

Chip,

Please clarify. Does this mean that a task cannot do a WAITXXX instruction on a pin or CNT without blocking all tasks? i.e. tasks have to poll for pin and time events?
I guess we are OK with that, it's what FullDuplexSerial does in the P1 now.

WAITxxx instructions will stall the cog, so they are not useful for multitasking programs. They wouldn't make sense in multitasking, anyway, because what they are looking for may come and go before the task gets another chance to check for the WAITxxx target condition. Tasks will need to code those checks in other ways.
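To make the "may come and go" point concrete, here is a minimal sketch in Python (not Propeller code). It assumes a task that only gets to sample the pin every 4th cycle; the trace, the function names and the 4-cycle period are all illustrative assumptions, not anything from the chip spec.

```python
def polled_samples(pin_trace, period, offset=0):
    """Pin values seen by a task that samples once every `period` cycles."""
    return [pin_trace[t] for t in range(offset, len(pin_trace), period)]


def sees_high(pin_trace, period, offset=0):
    """True if any of the task's samples catches the pin high."""
    return any(polled_samples(pin_trace, period, offset))


# Hypothetical pin trace: high for 2 cycles (cycles 5 and 6) out of 16.
trace = [0] * 16
trace[5] = trace[6] = 1

every_cycle = sees_high(trace, period=1)             # dedicated cog, checks every cycle
time_sliced = sees_high(trace, period=4)             # task samples at cycles 0, 4, 8, 12
lucky_phase = sees_high(trace, period=4, offset=1)   # task samples at cycles 1, 5, 9, 13
```

Whether the time-sliced task sees the pulse depends purely on where its slot falls relative to the pulse, which is why a plain poll is not a drop-in replacement for WAITxxx.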

ErNa · 2014-04-10 00:38

Just an idea: if there is more cog ram than can be addressed by 9 bits, it can be reached by pointers: "mov reg, ptr" and ptr can address 32 bits. Or: whenever ptr exceeds core space, it is redirected to hub memory. ErNa

Heater. · 2014-04-10 00:59

... if there is more cog ram than can be addressed by 9 bits,...

There isn't. The COG does not have RAM. The COG only has registers. At least on a P1. Hopefully on a P2 as well.
496 registers is huge by comparison to any other processor you can buy.
And as a bonus you can use those registers to cache instructions you want to run fast. Can't do that on any other processor I know of.

See, how naming and describing things differently changes the entire expectation.

...it can be reached by pointers: "mov reg, ptr"...

That instruction looks awfully like a RDLONG or WRLONG except working into some RAM space that is not shared. Would it actually be much more performant than using HUB RAM with RDLONG/WRLONG or using hubexec?


Brian Fairchild · 2014-04-10 01:39

Heater. wrote: »

And as a bonus you can use those registers to cache instructions you want to run fast. Can't do that on any other processor I know of.

And, run fast deterministically at that.

Seairth · 2014-04-10 07:15

ErNa wrote: »

Just an idea: if there is more cog ram than can be addressed by 9 bits, it can be reached by pointers: "mov reg, ptr" and ptr can address 32 bits. Or: whenever ptr exceeds core space, it is redirected to hub memory. ErNa

This could easily be done using INDx. At which point, you could actually do the following:

Increase the COG registers to 1024.
Increase INDA, INDB, and PC to 10 bits.
All addresses over 511 are only accessible via INDA/INDB (this might make an argument for a few more INDx registers).

An obvious usage pattern would be to load the program into the upper 512 registers, then use the lower 512 just for data. And, this also has the advantage that even more cog registers could be added without any changes to compiled code!
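A rough model of the idea, in Python rather than PASM. The class and method names are made up for illustration; the only things taken from the proposal are the 1024-register file, the 9-bit limit on direct addresses and a wider INDA pointer.

```python
class Cog:
    """Toy register file: direct operands are 9 bits, INDA can reach everything."""

    DIRECT_BITS = 9  # D and S fields stay 9 bits wide

    def __init__(self, size=1024):
        self.regs = [0] * size
        self.inda = 0

    def read_direct(self, addr):
        # A 9-bit instruction field cannot encode addresses above 511.
        if not 0 <= addr < (1 << self.DIRECT_BITS):
            raise ValueError("address does not fit in a 9-bit field")
        return self.regs[addr]

    def set_inda(self, addr):
        # The widened pointer may address the whole register file.
        if not 0 <= addr < len(self.regs):
            raise ValueError("address outside register file")
        self.inda = addr

    def read_inda(self):
        return self.regs[self.inda]

    def write_inda(self, value):
        self.regs[self.inda] = value & 0xFFFFFFFF  # registers are 32 bits


cog = Cog()
cog.set_inda(700)            # above 511: only reachable through the pointer
cog.write_inda(0xDEADBEEF)
```

Growing `size` later changes nothing for code that only uses direct addresses, which is the "no changes to compiled code" argument.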

tonyp12 · 2014-04-10 07:15

>You can bet SONOS is patented and requires $ per chip.
You license it and they will provide the expertise, of course they need to get paid.
But with only 2 extra masks vs 7 for flash, the manufacturing cost is maybe something Parallax could afford? (if OnSemi can license and fab it)

Having 16 Insta-Core with Smart Insta-IO, will sound good for the marketing department.
Having some type of Non-Volatile storage, no matter how little, without relying on external ICs is what the Prop badly needs.

On the Prop the SONOS is not addressable.
It's just copied to the Cog at power up; there should be a hubop that copies a cog back to SONOS (5ms to erase and 2ms to write).
If the 2 extra masks can just be put on top of SRAM for fast parallel loading, it should simplify the design.

potatohead · 2014-04-10 07:36

The COG only has registers. At least on a P1.

That's an important differentiator Heater. Heck, if we are going to tweak on the lingo, that's worth some discussion, and would help people better understand the addressing differences they find on a Prop.

Heater. · 2014-04-10 07:57

Potatohead,

"Tweaking" on the lingo. Yep, very important.

You see, if you tell someone this new fangled micro-controller has 512KB of RAM, usable for code or data, and 16 thirty two bit processors running at however many MIPS they might be quite interested. Especially if they want lots of pins and analogy stuff.

You might throw in the fact that the processors have 496 registers. At that point they might fall off their chairs given that every other CPU they have met only has a handful of registers. People, especially compiler writers, are always complaining there are not enough registers. Well now they have them.

Oh, and by the way, you can run code from those registers directly for extra boost and fine grained real-time control. At that point you might be getting open mouthed stares of amazement and disbelief.

See?

That is such a different story than the old P1. It runs code from COG RAM, but that's really small, you have to have some interpreter, Spin, in order to make bigger programs, but that's horribly slow. Oh you can use C but the LMM interpreter makes very big code.

potatohead · 2014-04-10 08:14

Totally agreed. I've got the better part of an essay done about this. Spent some time with peers as I said I would. After talking things through, we came to the conclusion there is no net gain for Parallax, US, and Propeller in general, to be had by "being like the other guy".

There is a basic conflict between education needs, marketing, and our overall discussion here. Unifying that around what the other guys do isn't good for us at all.

What we need to do is compartmentalize these things, maximize them, and present as needed. I'm in full agreement with you on how to "spin" our distinctive features. Embrace this, don't dilute it!

Bill Henning · 2014-04-10 08:25

Possible solution:

SETEVENT pin,#event ' rising edge, falling edge, statechange etc etc etc, could be 9 pin mask so any of 9 types of events

POLLPIN event, pin wc ' ask pin for last event, could optionally loop in place (task, cog) until it gets an event, clears event from smart pin for that cog, D is set to event sent from the pin

Basically, have a smart pin send a message back, releasing cog/task to continue. While cog is waiting, it only uses quiescent current.

Task can't miss event it was looking for.
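A minimal Python sketch of the sticky-event idea, for illustration only. SETEVENT and POLLPIN are just the proposed names above, not existing instructions, and the class below is a made-up model of a pin cell that latches the event until a task reads it.

```python
RISING, FALLING = "rising", "falling"


class StickyPin:
    """Toy smart pin: latches the watched edge until a task polls for it."""

    def __init__(self, watch):
        self.watch = watch      # which edge to latch (the SETEVENT part)
        self.level = 0
        self.latched = None

    def clock(self, level):
        """Feed one pin sample per cycle; runs whether or not any task is looking."""
        if level != self.level:
            edge = RISING if level else FALLING
            if edge == self.watch and self.latched is None:
                self.latched = edge
        self.level = level

    def poll(self):
        """Return the latched event (or None) and clear it (the POLLPIN part)."""
        event, self.latched = self.latched, None
        return event


pin = StickyPin(RISING)
for level in [0, 0, 0, 0, 0, 1, 1, 0, 0, 0]:   # short pulse, long since gone
    pin.clock(level)

first = pin.poll()    # the task checks late and still gets the event
second = pin.poll()   # reading cleared it
```

The task can poll whenever its slot comes around and still not miss the edge, at the cost of not knowing exactly when it happened.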

cgracey wrote: »

WAITxxx instructions will stall the cog, so they are not useful for multitasking programs. They wouldn't make sense in multitasking, anyway, because what they are looking for may come and go before the task gets another chance to check for the WAITxxx target condition. Tasks will need to code those checks in other ways.

jazzed · 2014-04-10 08:32

Heater. wrote: »

That is such a different story than the old P1. It runs code from COG RAM, but that's really small, you have to have some interpreter, Spin, in order to make bigger programs, but that's horribly slow. Oh you can use C but the LMM interpreter makes very big code.

And the C CMM interpreter makes smaller code. LMM code will not be smaller on the new chip, but the new chip will have more memory. LMM will be faster than the old LMM.

jazzed · 2014-04-10 08:38

potatohead wrote: »

Totally agreed. I've got the better part of an essay done about this. Spent some time with peers as I said I would. After talking things through, we came to the conclusion there is no net gain for Parallax, US, and Propeller in general, to be had by "being like the other guy".

I agree mainly because of the cost (cash and opportunity) for Parallax to be like the other guy. Industry standards would always need to be met for example, and Parallax usually pooh-poohs such things as not being important. It would take a lot of effort for Parallax to understand and implement current industry standards, so it's better to just do what they have always done, i.e. whatever suits Parallax is good enough for the customers they have or may expect to get.

Heater. · 2014-04-10 08:39

potatohead,

I've got the better part of an essay done about this.

Oh my God. It takes me the better part of a day and ten cups of tea to read some of your posts here. A whole essay might be too much.

I agree, "being like the other guy" or claiming to be is a waste of effort. The other guys do that "other guy" thing much better and cheaper. Luckily a Propeller is not like them.

"spin" our distinctive features. Embrace this, don't dilute it!

Exactly. My only points really are:

1) The Prop II, with its speed, memory size and hubexec, is not confined like the P1. It is a radically different beast despite being very similar. The "spin" should be very different from the P1.

2) Use terms people can identify with. Things they want. And tell them we have lots of it!

Kerry S · 2014-04-10 08:45

potatohead wrote: »

What we need to do is compartmentalize these things, maximize them, and present as needed. I'm in full agreement with you on how to "spin" our distinctive features. Embrace this, don't dilute it!

Exactly.

The terms need to be 'standard' so that people get it at first mention THEN those terms can be used to describe how unique the Propeller is. That way your discussion is, right from the start, about the cool features of the Propeller rather than a long vocabulary lesson so they have any clue as to what you are saying.

Your description of:

Multi-core processor with 512K ram and 16 - 32bit cores each with equal access to all 64 smart I/O and 512 32bit registers each that can be used for data, pointers or local program cache for fast execution of code as needed. Shared access to RAM allows core to core communication and data sharing.

Really is something that people would GET the first time it is heard. Now the conversation turns to how they can leverage these amazing capabilities to their advantage.

For the new user, the first experience programming a prop needs to be as familiar to what they are used to doing as possible.

The term "HUBEXEC" should be dropped as that should be the 'default' mode as it would be what many people would use to start. Then you can get into using the 512 registers to run cached code for driver optimization as an advanced programming tool. If the tool chain is set up from the beginning to do what we know as HUBEXEC as default then programming the Prop would feel very familiar to any programmer.

Again it is all about TERMS and how they create impressions when first heard. Someone new to the Prop hearing "and you do this with HUBEXEC mode..." would instantly think "oh, great, another learning curve...".

They should be able to open SPIN or GCC for the first time and write a standard template example program and have it just work. They don't need to know it is technically running in what is called HUBEXEC mode. The performance option would be to select 'compile to cache code'. The guys that really want to maximize the thing will dive in and read the back-end docs that get into the details of how the cogs run code and hand-code PASM.

Brian Fairchild · 2014-04-10 09:15

Kerry S wrote: »

The term "HUBEXEC" should be dropped as that should be the 'default' mode as it would be what many people would use to start.

That's a very good point and follows on from HUB becoming just "memory".

jazzed · 2014-04-10 09:22

Brian Fairchild wrote: »

That's a very good point and follows on from HUB becoming just "memory".

+1

Yes, HUBEXEC is a feature most other controllers have too (sort of), but they don't have to call it that. Reiterating what Kerry S. said: it is the default mode of operation.

Propeller Chip may get less eccentric in spite of itself.

Brian Fairchild · 2014-04-10 09:30

jazzed wrote: »

Propeller Chip may get less eccentric in spite of itself.

Reminds me of an episode of "Yes Prime Minister"...

James Hacker: Eccentricity can be a virtue.
Sir Humphrey: If you call it individualism.
Bernard Woolley: That's one of those irregular verbs, isn't it. I have an independent mind, you are an eccentric, he is round the twist.

Alex.Stanfield · 2014-04-10 11:48

Kerry S wrote: »

Exactly.

The terms need to be 'standard' so that people get it at first mention THEN those terms can be used to describe how unique the Propeller is. That way your discussion is, right from the start, about the cool features of the Propeller rather than a long vocabulary lesson so they have any clue as to what you are saying.

Your description of:

Multi-core processor with 512K ram and 16 - 32bit cores each with equal access to all 64 smart I/O and 512 32bit registers each that can be used for data, pointers or local program cache for fast execution of code as needed. Shared access to RAM allows core to core communication and data sharing.

Really is something that people would GET the first time it is heard. Now the conversation turns to how they can leverage these amazing capabilities to their advantage.

For the new user, the first experience programming a prop needs to be as familiar to what they are used to doing as possible.

The term "HUBEXEC" should be dropped as that should be the 'default' mode as it would be what many people would use to start. Then you can get into using the 512 registers to run cached code for driver optimization as an advanced programming tool. If the tool chain is set up from the beginning to do what we know as HUBEXEC as default then programming the Prop would feel very familiar to any programmer.

Again it is all about TERMS and how they create impressions when first heard. Someone new to the Prop hearing "and you do this with HUBEXEC mode..." would instantly think "oh, great, another learning curve...".

They should be able to open SPIN or GCC for the first time and write a standard template example program and have it just work. They don't need to know it is technically running in what is called HUBEXEC mode. The performance option would be to select 'compile to cache code'. The guys that really want to maximize the thing will dive in and read the back-end docs that get into the details of how the cogs run code and hand-code PASM.

+1

Alex

jmg · 2014-04-10 12:40

Bill Henning wrote: »

Possible solution:

SETEVENT pin,#event ' rising edge, falling edge, statechange etc etc etc, could be 9 pin mask so any of 9 types of events

POLLPIN event, pin wc ' ask pin for last event, could optionally loop in place (task, cog) until it gets an event, clears event from smart pin for that cog, D is set to event sent from the pin

Basically, have a smart pin send a message back, releasing cog/task to continue. While cog is waiting, it only uses quiescent current.

Task can't miss event it was looking for.

This probably should be in the Pin thread, as it relates to Pin-smarts ? (and this thread is diverted on semantics)

I think you are asking for the pin to manage the WAIT and have a sticky flag the tasking COG can check ?
eg setting a Counter to -1 and waiting on Overflow, would wait on any INC condition on the counter (H,L,_/=,=\_ etc_)

The latency would be in setting up the Pin Cell to do what you want.

That would avoid missing events, but it also loses the phase-locked-release nature of WAIT.
I guess a single task keeps that, and if you really need that phase-locked-release, you dedicate a 1-task COG.
In most cases, the 4 Cy sampling window in the task would be ok ?

Could be another case for the more parallel bus access to the pin Cells ?

Cluso99 · 2014-04-10 16:10

Heater. wrote: »

Potatohead,

"Tweaking" on the lingo. Yep, very important.

You see, if you tell someone this new fangled micro-controller has 512KB of RAM, usable for code or data, and 16 thirty two bit processors running at however many MIPS they might be quite interested. Especially if they want lots of pins and analogy stuff.

You might throw in the fact that the processors have 496 registers. At that point they might fall off their chairs given that every other CPU they have met only has a handful of registers. People, especially compiler writers, are always complaining there are not enough registers. Well now they have them.

Oh, and by the way, you can run code from those registers directly for extra boost and fine grained real-time control. At that point you might be getting open mouthed stares of amazement and disbelief.

See?

That is such a different story than the old P1. It runs code from COG RAM, but that's really small, you have to have some interpreter, Spin, in order to make bigger programs, but that's horribly slow. Oh you can use C but the LMM interpreter makes very big code.

This is a really smart analogy. Now I am drooling

Cluso99 · 2014-04-10 16:19

potatohead:
Don't forget to list all the intelligent peripherals that the prop can have. List them all, and that they can be on (almost) any pins. Then say they are soft and emulated by the cores, so there is no limitation to what mix of peripherals you can have. (this is a problem with P1 - prospective customers don't see the big list of peripherals that those other chips have)

mindrobots · 2014-04-10 17:11

And don't forget to have Parallax "Gold Standard" code available in one easy to find spot with App Notes. Code that is known, understood and supported by Parallax FEs. There can be optional, experimental and derived versions in OBEX but not the standard Parallax soft peripherals.

Alex.Stanfield · 2014-04-10 17:21

Cluso99 wrote: »

potatohead:
Don't forget to list all the intelligent peripherals that the prop can have. List them all, and that they can be on (almost) any pins. Then say they are soft and emulated by the cores, so there is no limitation to what mix of peripherals you can have. (this is a problem with P1 - prospective customers don't see the big list of peripherals that those other chips have)

Totally agree. In addition to building the list, Parallax (IMHO) must decide if it officially supports that list or not as it becomes kind of an SDK for the chip. With other chips what gets between the real world and the programmer's code is the chip (with all its included HW peripherals), on the Propeller you have to include an additional layer (SW peripherals) which you didn't develop and which have variations in style and also bring a certain degree of uncertainty. A warranted support from the chip manufacturer (rather than the unpaid community) erases that uncertainty.

Now let's put ourselves in the shoes of the newcomer. It's kind of similar to the open-source community, which is free but implies hidden costs. When you buy a Propeller did you pay already for that support/warranty or is it up to you the developer to fill/clear that uncertainty?

Alex

David Betz · 2014-04-10 17:42

jazzed wrote: »

And the C CMM interpreter makes smaller code. LMM code will not be smaller on the new chip, but the new chip will have more memory. LMM will be faster than the old LMM.

Will there even be a need for LMM at all now that we can execute code directly from hub memory? What I'd like to see is a hybrid hub execution and CMM mode where some code can be compiled to CMM opcodes if it isn't time critical and some can be in native PASM. We couldn't mix LMM and CMM before because both kernels wouldn't fit in COG memory at the same time but we should be able to have a CMM kernel in COG memory (or even hub memory) and use it along with native code running from either hub memory or COG memory or both. (Eric will probably kill me for saying that though since it might be complex to support in the compiler!)

Cluso99 · 2014-04-10 17:56

Here are the instructions Chip posted 2 days ago, in my usual format

Propeller P16X512x Instructions as of 2014/04/10
-------------------------------------------------------------------------------------------------------------------
ZCxS Opcode  ZC I Cond  Dest       Source     Instr00 01      10      11        Operand(s)              Flags
-------------------------------------------------------------------------------------------------------------------
ZCWS 00000ff ZC 1 CCCC  DDDDDDDDD  SSSSSSSSS  RDBYTE  RDBYTEC RDWORD  RDWORDC   D,PTRA/PTRB             ZC ZC ZC ZC 
ZCWS 00001ff ZC 1 CCCC  DDDDDDDDD  SSSSSSSSS  RDLONG  RDLONGC RDQUAD  SYSOP     D,PTRA/PTRB ||| D,#     ZC ZC ZC ZC 
ZCMS 00010ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  ISOB    NOTB    CLRB    SETB      D,S/#                   ZC ZC ZC ZC 
ZCMS 00011ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  SETBC   SETBNC  SETBZ   SETBNZ    D,S/#                   ZC ZC ZC ZC 
ZCMS 00100ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  ROR     ROL     SHR     SHL       D,S/#                   ZC ZC ZC ZC 
ZCMS 00101ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  RCR     RCL     SAR     REV       D,S/#                   ZC ZC ZC ZC 
ZCMS 00110ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  ANDN    AND     OR      XOR       D,S/#                   ZC ZC ZC ZC 
ZCMS 00111ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  MUXC    MUXNC   MUXZ    MUXNZ     D,S/#                   ZC ZC ZC ZC 
ZCMS 01000ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  ADD     SUB     ADDX    SUBX      D,S/#                   ZC ZC ZC ZC 
ZCMS 01001ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  SUMC    SUMNC   SUMZ    SUMNZ     D,S/#                   ZC ZC ZC ZC 
ZCWS 01010ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  MOV     NOT     ABS     NEG       D,S/#                   ZC ZC ZC ZC 
ZCWS 01011ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  NEGC    NEGNC   NEGZ    NEGNZ     D,S/#                   ZC ZC ZC ZC 
ZCMS 01100ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  SETS    SETD    SETCOND SETINST   D,S/#                   ZC ZC ZC ZC 
ZCMS 01101ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  MIN     MAX     MINS    MAXS      D,S/#                   ZC ZC ZC ZC 
ZCMS 01110ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  MUL     MULS    JMPSW   TOPBIT    D,S/# || D,S/@ | D,S/#  ZC ZC ZC ZC 
ZCWS 01111ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  DECOD5  ESWAP8  SPLITW  MERGEW    D,S/#                   ZC ZC ZC ZC 
ZCRS 10000ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  TESTB   TESTN   TEST    CMP       D,S/#                   ZC ZC ZC ZC 
ZCRS 10001ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  CMPX    CMPS    CMPSX   CMPR      D,S/#                   ZC ZC ZC ZC 
ZCMS 10010ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  INCMOD  DECMOD  CMPSUB  SUBR      D,S/#                   ZC ZC ZC ZC 
ZCWS 10011ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  MSGINA  MSGINB  ROLNIB  ROLBYTE   D,S/#                   ZC ZC ZC ZC 
ZCMS 10010ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  DJZ     DJNZ    TJZ     TJNZ      D,S/@                   ZC ZC ZC ZC 
ZCMS 10011ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  WAITCNT                           D,S/#                   ZC          
-------------------------------------------------------------------------------------------------------------------
--WS 1010xnn nf I CCCC  DDDDDDDDD  SSSSSSSSS  GETNIB  SETNIB                    D,S/#,#0..7             -- --       
--WS 1011xxn nf I CCCC  DDDDDDDDD  SSSSSSSSS  GETBYTE SETBYTE                   D,S/#,#0..3             -- --       
--LS 1100000 fL 1 CCCC  DDDDDDDDD  SSSSSSSSS  WRBYTE          WRWORD            D/#,PTRA/PTRB           --    --    
--LS 1100001 fL 1 CCCC  DDDDDDDDD  SSSSSSSSS  WRLONG          WRQUAD            D/#,PTRA/PTRB           --    --    
--LS 1100010 fL I CCCC  DDDDDDDDD  SSSSSSSSS  WAITPAE         WAITPAN           D/#,S/#                 --    --    
--LS 1100011 fL I CCCC  DDDDDDDDD  SSSSSSSSS  WAITPBE         WAITPBN           D/#,S/#                 --    --    
--LS 1100100 fL I CCCC  DDDDDDDDD  SSSSSSSSS  MSGDIRA         MSGDIRB           D/#,S/#                 --    --    
--LS 1100101 fL I CCCC  DDDDDDDDD  SSSSSSSSS  MSGOUTA         MSGOUTB           D/#,S/#                 --    --    
--LS 1100110 fL I CCCC  DDDDDDDDD  SSSSSSSSS  JP              JPD               D/#,S/@                 --    --    
--LS 1100111 fL I CCCC  DDDDDDDDD  SSSSSSSSS  JNP             JNPD              D/#,S/@                 --    --    
--LS 1101000 fL I CCCC  DDDDDDDDD  SSSSSSSSS  SETINDS         ADJINDS           D/#,S/#                 --    --    
--LS 1101001 fL I CCCC  DDDDDDDDD  SSSSSSSSS  WAITVID         REP               D/#,S/#                 --    --    
-------------------------------------------------------------------------------------------------------------------
---- 11100xx x0 n nnnn  nnnnnnnnn  nnnnnnnnn  AUGS    AUGD    <AUGS>  <AUGD>    #23bits                 -- -- -- -- 
-------------------------------------------------------------------------------------------------------------------
---- 11101xx 00 0 CCCC  fnnnnnnnn  nnnnnnnnn  JMP     JMP                       #abs | @rel             -- --       
---- 11101xx 00 1 CCCC  fnnnnnnnn  nnnnnnnnn  CALL    CALL                      #abs | @rel             -- --       
---- 11101xx 01 0 CCCC  fnnnnnnnn  nnnnnnnnn  CALLA   CALLA                     #abs | @rel             -- --       
---- 11101xx 01 1 CCCC  fnnnnnnnn  nnnnnnnnn  CALLB   CALLB                     #abs | @rel             -- --       
---- 11101xx 10 0 CCCC  fnnnnnnnn  nnnnnnnnn  LINK    LINK                      #abs | @rel             -- --       
---- 11101xx 10 1 CCCC  fnnnnnnnn  nnnnnnnnn  LOCINST LOCINST                   #abs | @rel             -- --       
---- 11101xx 11 0 CCCC  fnnnnnnnn  nnnnnnnnn  LOCPTRA LOCPTRA                   #abs | @rel             -- --       
---- 11101xx 11 1 CCCC  fnnnnnnnn  nnnnnnnnn  LOCPTRB LOCPTRB                   #abs | @rel             -- --       
-------------------------------------------------------------------------------------------------------------------
--L- 1111xxx 00 L CCCC  DDDDDDDDD  xxx0000ff  OFFP    NOTP    CLRP    SETP      D/#                     -- -- -- -- 
--L- 1111xxx 00 L CCCC  DDDDDDDDD  xxx0001ff  SETPC   SETPNC  SETPZ   SETPNZ    D/#                     -- -- -- -- 
ZCL- 1111xxx ZC L CCCC  DDDDDDDDD  xxx0010ff  GETP    GETNP   WAITPR  WAITPF    D/#                     ZC ZC -- -- 
--L- 1111xxx 00 L CCCC  DDDDDDDDD  xxx0011ff  WAITPX  SETTASK SETPTRA SETPTRB   D/#                     -- -- -- -- 
--L- 1111xxx 00 L CCCC  DDDDDDDDD  xxx0100ff  SETINDA SETINDB ADJINDA ADJINDB   D/#                     -- -- -- -- 
--L- 1111xxx 00 L CCCC  DDDDDDDDD  xxx0101ff  JMPT0   JMPT1   JMPT2   JMPT3     D/#                     -- -- -- -- 
--L- 1111xxx 00 L CCCC  DDDDDDDDD  xxx0110ff  WAIT    SETVID  PUSH              D/#                     -- -- --    
Z-W- 1111xxx Z0 0 CCCC  DDDDDDDDD  xxx1000ff  GETPTRA GETPTRB POP               D                       Z- Z- Z-    
--R- 1111xxx 00 0 CCCC  DDDDDDDDD  xxx1100ff  LOCPTRA LOCPTRB JMP     CALL      D                       -- -- ZC ZC 
--R- 1111xxx ZC 0 CCCC  DDDDDDDDD  xxx1101ff  CALLA   CALLB   LINK              D                       ZC ZC ZC    
ZC-- 1111xxx ZC 0 CCCC  xxxxxxxxx  xxx1110ff  RET     <RET>   RETA    RETB                              ZC ZC ZC ZC 
---- 1111xxx 00 0 CCCC  xxxxxxxxx  xxx1111ff  DCACHEX ICACHEX <empty> <empty>                           -- -- -- -- 
-------------------------------------------------------------------------------------------------------------------

P16X512x_20140410.spin
Note the fix to REP (not fixed in the attached file)
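For anyone who wants to play with the encoding, here is a small Python sketch that splits a 32-bit instruction word into the columns of the table above. It assumes the columns are printed most significant bit first (7-bit opcode, ZC, I, 4-bit condition, 9-bit D, 9-bit S, which adds up to 32 bits). That bit order is an assumption read off the column widths, and the function names are made up.

```python
from collections import namedtuple

Fields = namedtuple("Fields", "opcode zc imm cond dest src")


def decode(word):
    """Split a 32-bit word into the table's columns (assumed MSB-first layout)."""
    return Fields(
        opcode=(word >> 25) & 0x7F,   # 7 bits; low 2 bits ("ff") pick 1 of 4 instructions
        zc=(word >> 23) & 0x3,        # write-Z / write-C flags
        imm=(word >> 22) & 0x1,       # I: source is an immediate
        cond=(word >> 18) & 0xF,      # CCCC condition
        dest=(word >> 9) & 0x1FF,     # DDDDDDDDD
        src=word & 0x1FF,             # SSSSSSSSS
    )


def encode(opcode, zc, imm, cond, dest, src):
    """Inverse of decode(), for round-trip experiments."""
    return (opcode << 25) | (zc << 23) | (imm << 22) | (cond << 18) | (dest << 9) | src


# One row of the table: opcode 01000ff -> ADD, SUB, ADDX, SUBX
ADD_GROUP = ("ADD", "SUB", "ADDX", "SUBX")


def add_group_mnemonic(word):
    """Name the instruction if the word falls in the 01000ff row, else None."""
    opcode = decode(word).opcode
    if opcode >> 2 != 0b01000:
        return None
    return ADD_GROUP[opcode & 0b11]


word = encode(0b0100001, zc=0b11, imm=1, cond=0b1111, dest=5, src=3)
fields = decode(word)
name = add_group_mnemonic(word)
```

Only the ADD row is filled in here; the other rows of the main block would follow the same pattern.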

Cluso99 · 2014-04-10 18:13

I am going to suggest the instructions be simplified (from a user's point of view). Here is one example...

ZCxS Opcode  ZC I Cond  Dest       Source     Instr00 01      10      11        Operand(s)              Flags
-------------------------------------------------------------------------------------------------------------------
--L- 1111xxx 00 L CCCC  DDDDDDDDD  xxx0000ff  OFFP    NOTP    CLRP    SETP      D/#                     -- -- -- -- 
--L- 1111xxx 00 L CCCC  DDDDDDDDD  xxx0001ff  SETPC   SETPNC  SETPZ   SETPNZ    D/#                     -- -- -- --

become...

                                                ' was...    sssssssss
              SETPIN    #0      ' or #LO        ' CLRP      xxx000000
              SETPIN    #1      ' or #HI        ' SETP      xxx000001
              SETPIN    #XOR                    ' NOTP      xxx000010
              SETPIN    #OFF                    ' OFFP      xxx000011
              SETPIN    #Z                      ' SETPZ     xxx000100
              SETPIN    #C                      ' SETPC     xxx000101
              SETPIN    #NZ                     ' SETPNZ    xxx000110
              SETPIN    #NC                     ' SETPNC    xxx000111

This only changes the compiler, which now must recognize #LO, #HI, #XOR, #OFF, #Z, #NZ, #C, #NC as reserved names, but simplifies it down to one instruction.

This reduces the instruction count (theoretically) and simplifies the manual.

The only change to the existing instruction coding is I have rearranged the lower 3 bits of "S" (mainly to permit #0 and #1 for CLR and SET).
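On the compiler side this is little more than a lookup table. A rough Python sketch, assuming the rearranged 3-bit codes listed above; the dictionary and function names are purely illustrative and not part of any existing toolchain.

```python
# Low 3 bits of "S" for each proposed SETPIN mode (taken from the list above).
SETPIN_MODES = {
    "LO": 0b000, "HI": 0b001, "XOR": 0b010, "OFF": 0b011,
    "Z": 0b100, "C": 0b101, "NZ": 0b110, "NC": 0b111,
}

# Old mnemonic -> proposed mode name.
OLD_TO_MODE = {
    "CLRP": "LO", "SETP": "HI", "NOTP": "XOR", "OFFP": "OFF",
    "SETPZ": "Z", "SETPC": "C", "SETPNZ": "NZ", "SETPNC": "NC",
}


def setpin_mode_bits(operand):
    """Turn '#0', '#1', '#HI', '#NC', ... into the 3-bit mode code."""
    name = operand.lstrip("#").upper()
    if name in ("0", "1"):          # #0 and #1 are accepted as CLR and SET
        return int(name)
    return SETPIN_MODES[name]


def rewrite(old_mnemonic):
    """Show how an old mnemonic would be spelled with the single SETPIN form."""
    return "SETPIN #" + OLD_TO_MODE[old_mnemonic.upper()]
```

For example, `rewrite("NOTP")` gives `"SETPIN #XOR"`, and `setpin_mode_bits("#NC")` gives the code 7.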

Thoughts anyone???

jazzed · 2014-04-10 18:54

David Betz wrote: »

Will there even be a need for LMM at all now that we can execute code directly from hub memory?

Technically LMM is not the right word. The point is that LMM used 32 bit instructions, and there will be little difference in code size for the same programs between LMM and hubexec/native/default whatever it's called. CMM will probably still be a valid thing to have. Given the amount of head-room we may end up with more interesting things too. Mixed code models though? Isn't that a bit bizarre? Seems rather complicated - maybe worth a white paper.

RossH · 2014-04-10 19:00

jazzed wrote: »

Technically LMM is not the right word. The point is that LMM used 32 bit instructions, and there will be little difference in code size for the same programs between LMM and hubexec/native/default whatever it's called. CMM will probably still be a valid thing to have. Given the amount of head-room we may end up with more interesting things too. Mixed code models though? Isn't that a bit bizarre? Seems rather complicated - maybe worth a white paper.

As long as you have the ability to hook external RAM onto the Propeller, there will be an LMM mode.

The Hubexec mode is an intermediate mode between normal cog execution and LMM. Maybe it should be called HMM.

Ross.
