An idea they got from this very forum many years ago, when we were suggesting an ARM + COGs.
The issue with that is the ARM licensing. It's a no go. RISCV would be a viable choice if one really wanted to do that. Free to use, core designs available, simpler, smaller, same speed.
This is all cool and fascinating speculation but I'm trying to figure out how you would actually come up with a RISC-V Propeller (and what that actually means to be a RISC-V propeller)
The beauty of where we are now:
Propeller Instructions => Propeller Architecture => PASM => SPIN => Simple & Elegant (everything maps nicely from step to step and flows from start to finish)
Now if we build a RISC-VP
[code]
RISC-V Instructions ==> Multicore Extensions     =?> PASM-V =?> SPIN-V
                    ==> Counter Extensions
                    ==> HUB Extensions
                    ==> Smart I/O Pin Extensions
                    ==> Video Extensions?
[/code]
Can this still end up simple and elegant? The extensions are the secret sauce that makes a Propeller not just another MCU.
RISC-V application processor with Propeller like minions? Cool but what do you program it in? C? Probably. Certainly not SPIN since you'd have to back Spin into the RISC-V, probably not a pretty mapping.
What would a RISC-V/Propeller combination buy you over a RISC-V with embedded RISC-V minions?
(I'm glad they called them minions, that's cool!)
Maybe a RISC-V core (or multi-core - yes, multi-core is mandatory!) with Parallax "Smart-I/O" pins?
Spin can run on any machine. I've run Spin programs on various machines using spinsim. However, for fastest execution a Spin VM should be used that is written in assembly for the target machine. Or spin2cpp could be used to run Spin code on other machines. The tricky part is running PASM code. It would be better to port the PASM code to the native assembly of the processor rather than using a VM that emulates a cog.
You should be able to run Spin natively by using spin2cpp and a native C++ compiler. You wouldn't, of course, be able to include any PASM code though.
They got the idea from Freescale with their TPU's used on the MC68332 and PPC automotive controllers - which also predate Chip's notion of cogs.
Personally I find the notion of a real CPU core coupled with beefier cogs (that can run C code without a nasty performance hit like on the P-1) that do all the I/O grunt work quite attractive, over the cog-and-only-cog design of the P-2.
The problem Parallax faces, even if they adopted a RISC-V, ARM or MIPS central core, is that they are going to run right into deep-pocketed, established players like TI and others in the SoC arena who can release new products and variants in a short period of time.
They're going to have to find a niche to survive.
Parallax would have to adopt RISC-V quickly. If they came to it late, they'd just be another small player doing more or less what everyone else is. Anything with license fees (ARM) would, I think, be out of the question for them.
The problem Parallax faces even if they adopted a RISCV, ARM or MIPS central core, is that they are going to run right into deep pocket, established players...
I will put it to you that Parallax already has that problem. Not only that, they are also up against a world full of very small players.
There are a lot of small players in the world designing custom chips that include processors, or SoCs for use as central processors, with their own custom widgets on board. A great example is the recent very cheap ESP??? WIFI adapter that is not just an adapter for your MCU; it contains its own 32-bit processor. All for a couple of dollars.
Those small players would love not to have to licence ARM for such cheap products. They would love not to create their own CPU and have to support the software for it.
Hence they are rallying behind the open specification of RISC-V.
Already in England the LowRISC SoC, complete with its COG-like "minion" processors, is under development by such a small chip design company.
I suspect this RISCV thing could get a lot of traction very quickly. KC_Rob is right.
They got the idea from Freescale with their TPU's used on the MC68332 and PPC automotive controllers - which also predate Chip's notion of cogs.
Who's to say. There is nothing new under the sun. Even mainframes and minis of days gone by had peripheral I/O processors. Pretty much everything we see in micro-processors and MCU's today is just a very small implementation of ideas that existed 3 or 4 or 5 decades ago. Interrupts, virtual memory, virtualization etc etc. I'm hard put to think of anything new that way that we have now that was not thought of half a century ago.
So, I maintain the Sitara and its PRUs were inspired by the Propeller. As was the XMOS range of multi-core chips.
The big deal about the Propeller, and those XMOS chips is the ease of actually using the capability they have.
As noted by mindrobots, the "synergy" (oh I hate to use that word) of the Propeller architecture, its instruction set, Spin, PASM, the Prop Tool, all blend together to make a truly easy to use system. Makes you wonder why everything else is so unnecessarily complex.
Change the Propeller instruction set to something else and boom that simplicity is blown.
I don't buy that. The main reason we have to tightly integrate Spin with PASM is that Spin is so slow. If you had a better compiler for Spin, or an architecture that could run compiled languages better, then there would be far less need for assembly code.
I could imagine that instead of Spin we have C++ like the Arduino. The language 99% of the world is happy with, professional or hobbyist. Compiled, directly executed and fast.
That still leaves the issue of getting down to assembler, either when you really need to or just for educational purposes. This is universally far more complicated than just throwing some PASM into a Spin object. Partly because the tools are not very good at integrating HLL and PASM. Partly because most instruction sets are far less easy to deal with.
The only other multi-core, multi-threaded device designed for embedded systems and real world interfacing that is easy to program that I know of is the XMOS (Not in assembler though). But even the XMOS IDE is a huge complex beast to have to deal with for the casual programmer.
I seem to recall that there once was a C compiler where you could basically open an assembler block in the middle of your C source code and write simple assembler instructions. Something like:
int myFunc(int x, int y)
{
    int z = x + y;
    asm
    {
        add z, 32
    }
    return z;
}
Was that Borland Turbo C++ or something ?
Why does GCC and Clang and friends make this so hard?
Edit: Or am I thinking of Intel's old PL/M 86 compiler ?
Looks like nice code, but they have deftly sidestepped a true recursive call, and instead used a loop with an exit.
So rather than testing call / SW stack overhead, it shows how RISC-V would much rather avoid that. (as already seen)
One problem with such 'benchmarks', is the compiler-writers know them, and can drop in an equivalent - and they then cease to be benchmarks at all... I've never shipped a product that used fibo.
Simple test cases are not the real world, and ultimately, you do have to make real calls, using a sw stack.
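To make the distinction concrete, here is a minimal sketch in C of the two forms being contrasted. The names fibo_recursive and fibo_loop are made up for illustration and are not taken from the benchmark under discussion. Only the first form makes real calls at every step; the second gives the same results with no calls and no stack growth, which is why swapping one for the other changes what a "fibo" benchmark measures.

```c
#include <stdint.h>

/* Truly recursive Fibonacci: every step above the base case makes two
   real calls, so this exercises call/return and software stack traffic. */
uint32_t fibo_recursive(uint32_t n)
{
    if (n < 2)
        return n;
    return fibo_recursive(n - 1) + fibo_recursive(n - 2);
}

/* Loop form: same results, but no calls and no stack growth at all. */
uint32_t fibo_loop(uint32_t n)
{
    uint32_t a = 0, b = 1;
    while (n-- > 0) {
        uint32_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```

Note that an optimizing compiler is free to turn part of the recursive form into a loop by itself, so looking at the generated assembly is the only way to be sure what a given build actually measures.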
Why does GCC and Clang and friends make this so hard?
I think GCC is difficult partly because it tries to make it possible to give information to the optimizer to help it integrate the inline assembly with the rest of the C/C++ code. This is probably thought to be important since, if you find you have to resort to writing assembly code, you want to make sure it is as fast as possible. You can, of course, write assembly more simply by just putting it in a separate .S file where you have use of a macro assembler that is much more powerful than the PASM assembler.
Even though it's unlikely and maybe not even desirable to replace the P2 COG with a RISC V core, it might be interesting to consider adding a RISC V core to a P2 chip and then remove all of the tricky hub execution logic from the P2 COG. The RISC V could run the "application code" and the COG could do what it's best at doing. Using a RISC V core would make more sense than adding an ARM core since there wouldn't be any licensing issues or fees associated with RISC V as I understand it. Still probably won't happen though.
I'd agree, the RISC-V is not looking great for deeply embedded minion work, but it is a compact 32 bit RISC that can run Linux, and it would couple well to a new Spansion HyperRAM
Would be interesting to see some same-fpga build numbers for P1V , P2 and RISC-V
Take RISC V, add some micro controller specifics, video, counters, SERDES, etc... and simply redo SPIN, and its byte code. Put that into a COG, with the HUB concept, and it's going to look an awful lot like what we have now.
Seems to me, one could inline assembly and add macro capability to make it a bit more friendly, and offer compiled SPIN or bytecode and get something pretty nice. Not as nice as PASM is, but nice enough.
Re: "synergy" Yeah, me too.
The thing about this is one design vision. And it's why some of us, and myself in particular, are wanting Chip to do that again on the P2. (and we will get inline this time --sweet!)
I'm sure there are other words for that. One I would be inclined to float out there is, "resonant" in that the components and their attributes of the design all combine to make a greater thing out of the whole than dissonant ones would.
Why does GCC and Clang and friends make this so hard?
Indeed.
Looking back, the Apple ][ was this way, as was the BBC Micro. Both offered not just BASIC, but lower level tools too. The Apple ][ had a monitor, mini-line assembler, and the tools needed to build programs, save them, load them back, combine and save the whole. One can make a large, fast program in common sense parts with just the basic tools in the box. There were better tools made, but just about everything somebody needed was in there, and writing a few utilities patched that up nicely enough. The BBC was similar, but offering inline assembly made it all a bit more "resonant" IMHO. End result was the same though.
Props are like this. Better / different tools exist now. We've got a ton of them and they run all over the place too. But the "in the box" SPIN + PASM is resonant, and it's fairly lean.
I sure wish I had time and more skill in these things. A RISC V "themed" prop could be very interesting!
As for the need for assembly code:
I see two camps.
One is to make assembly code easy to do and make it well integrated with higher level tools.
The other is to near completely eliminate it by making the higher level tools capable enough.
In both scenarios there remains reason to write assembly code. Not to mention reverse engineering, boot strapping, hackery of all kinds. This is why I favor the former, though it doesn't take away from the latter. Plenty of room for both to be maximized, and they should be. Which leaves people to their preferences and goals.
If you want to, it's possible to write fairly friendly, high level looking code with macros. I've done a couple little projects with this, and it's a lot of fun. Support for most 8 bitters is in the can too.
There are now a couple of compiled BASIC languages targeting the 6502 and old game consoles. Their inline assembly is stupid easy. All variable names and labels are visible. You literally type "asm:" and then write a few instructions, and then "end asm" and carry on. Nearly completely transparent. Showed it to Chip, and I'm hoping our SPIN gets this for the "make it easy" case.
This RISC V instruction set would bend toward the "nearly eliminate" case for sure. Sure would be interesting to see it play out. Maybe somebody will make a little, cheap CPU, or something we can play with someday.
Finally, SPIN was tightly integrated with PASM for a few reasons:
1. SPIN is slow, though I don't think it's that slow for most things. Heck, it's on par with assembly language I wrote for years. But, yes. Slow by comparison to a lot of things these days.
2. The isolated nature of the COG. To me, this is the real defining attribute of the Propeller, and it's awesome. It made great sense to compartmentalize what happens in a COG in ways that makes them easy to combine and use together. I think this gets regularly under appreciated in light of the strong desire to run bigger, faster, serial programs. And that's a valid need, don't get me wrong. But, a lot of things can be done differently to get around this need. Chip himself pointed out where the speed matters and where it doesn't and how that all works on a Propeller.
3. Being designed together means maximizing the commonalities inherent in both. Most of PASM is directly exposed in SPIN. Eric Ball wrote a video driver in SPIN! It's not all that high resolution, but it was very interesting to see that effort.
Again, seems to me a RISC V type ISA, with common sense Prop style instructions and extensions would lead right to where SPIN and PASM are, and it would exhibit many of the same traits too. A raw assembler would present RISC V as more obtuse than PASM (and what isn't?), but a few macros, conventions, helpers, and it could be a whole lot nicer without losing hardly anything in terms of raw code capability. One downside might be optimization. That's pretty easy in PASM.
Looking back, the one omission was compiled SPIN and we didn't get that because we didn't really grok how the COG works, until Bill got LMM sorted.
This time, that's known, planned for, etc... I suspect compiled SPIN will improve things considerably.
There really is nothing new under the sun (or seldom there is anyway). I remember using the TPU on the 332 years ago. I assume the eTPU is an upgrade from that. Another relevant example, of slightly more recent vintage, from Freescale would be the XGATE: http://www.freescale.com/webapp/sps/site/overview.jsp?code=LPXGATEARTICLE.
Who came first? Who knows. But there does seem to be a trend toward more little assistant CPUs ("minions") here of late.
I'd agree, the RISC-V is not looking great for deeply embedded minion work,
That is your recurring proposition. But so far without any basis. The issue of code space has been demonstrated to be a non-issue. See all my experiments above.
What else is missing to satisfy your demands of "deeply embedded minion" work?
Yes I know we can put assembler inline with GCC and other compilers. God forbid I have had to do it from time to time. It looks like this:
asm("or %0, %1, %2\n"
    "or %0, %0, %3\n"
    : "=&r"(result)
    : "r"(a), "r"(b), "r"(c));
WTF? Nobody wants to write that.
That was my whole point. Why is it so damn complex and impenetrable?
1) Because the instruction set and assembly language are not simple. Unlike the Propeller instruction set.
2) Because of all that Smile you need to add to write it as assembler in your HLL source. Unlike the Spin system.
This is totally not the same as pointing COGNEW at PASM in a DAT block.
That is my primary objection to inline. When it's that ugly, why bother?
I know those things are there for portability, but why?
Much better to compartmentalize. Here are these inputs, or a state, perform assembly to actualize outputs and or a state, exit, carry on.
If the program gets moved to a different CPU, then you do the assembly again, done, next. Well compartmentalized uses of this kind make a lot of sense.
I've mentioned it before too, but the PASM / SPIN ways to specify data are excellent. You get binary, delimiters, octal, two bits (whatever that is called %%), hex, decimal, strings, alignments, and all sorts of quick and dirty things. It's not only easy to write PASM. It's almost as easy to stuff data into the environment, and do so in an easy to read, common sense way.
This is totally not the same as pointing COGNEW at PASM in a DAT block.
You're comparing apples to oranges. Spin does not allow you to put PASM code inline with your Spin code. You have to put it in a DAT block and manually manage the linkage between the Spin code and the PASM code. You don't have to do that with GCC. Your code is automatically combined with your assembly code and you get to use the normal C calling conventions to invoke it. In that sense, it is easier. I agree though that the syntax is ugly. You might be able to leave a lot of that out if you want to embed knowledge of the function calling conventions. For example, use R0, R1, R2 as the function parameters and return the result in R0 or whatever. I think the funky %x syntax is supposed to insulate you from those details.
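For what it's worth, here is a rough sketch of that snippet wrapped in an ordinary C function, so the call linkage comes for free. The function name or3 is made up for illustration. The asm path is only compiled when targeting RISC-V (the three-operand "or" is assumed to be a RISC-V instruction here), and everything else falls back to plain C so the file still builds.

```c
/* ORs three values together. On RISC-V the work is done by two inline
   "or" instructions; elsewhere a plain C expression stands in. */
unsigned or3(unsigned a, unsigned b, unsigned c)
{
    unsigned result;
#if defined(__riscv)
    /* %0 is result, %1..%3 are a, b, c. The compiler picks the registers.
       "=&r" marks result as early-clobber: it is written by the first
       instruction before input %3 has been read, so it must not share
       a register with any input. */
    __asm__("or %0, %1, %2\n\t"
            "or %0, %0, %3"
            : "=&r"(result)
            : "r"(a), "r"(b), "r"(c));
#else
    result = a | b | c;
#endif
    return result;
}
```

The caller just writes or3(x, y, z) like any other function. The %0..%3 placeholders are what let the optimizer choose registers and inline the whole thing, which is the trade being made for the ugly syntax.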
Spin does not allow you to put PASM code inline with your Spin code. You have to put it in a DAT block and manually manage the linkage between the Spin code and the PASM code.
True. But we will on P2.
And it's going to be pretty easy. PASM keyword, possibly a mode modifier, then the PASM code block, then continue on.
The only reason we didn't get this early on is due to how the COG works. No need for it, until LMM.
Frankly, we could add this to SPIN right now, and it would get done in LMM mode, though it would come with some oddities one would need to keep in mind.
Chip explored two models:
One was the snippet. It gets loaded into the COG, run, then when it exits, back to SPIN. The other was real, in-line, based on the idea of HUBEXEC. It's my read, the latter is most likely.
And it's going to be pretty easy. PASM keyword, possibly a mode modifier, then the PASM code block, then continue on.
I'm sure this is true but one of the reasons this is possible is that Spin makes no attempt to do any code optimization. It's perfectly predictable what values will be where during code execution. Some may consider this an advantage but it does result in much slower code.
Thanks for the interesting treatise, a couple of points:
...seems to me a RISC V type ISA, with common sense Prop style instructions...
No can do. The RISC V ISA is the RISC V ISA. No room to add anything "common sense". Unlike PASM and the instruction sets of the 8080 (hence 8086), Z80 and other early micros, the RISC V instruction set is not designed for ease of use by the assembler programmer. It's designed to efficiently do what HLL compilers can produce.
...the one omission was compiled SPIN and we didn't get that because we didn't really grok how the COG works, until Bill got LMM sorted.
The issue is that compiled to native code or LMM (Had Chip realized that was possible) the code size is huge. One can fit a lot more functionality into a Prop with Spin byte codes. Spin byte codes are slow but given that we can very easily put the high speed stuff into COG in PASM it's a good trade off.
That's part of the "synergy" thing I was talking about. (Blech, I said that word again, spit, spit).
The GCC inline assembly is ugly because it needs to know how to insert operands into the assembly, and the optimizer is able to do something with it. Or maybe the developers of GCC just wanted to discourage inline assembly. The syntax looks similar to the syntax used to create intrinsic functions, so it seems like the inline feature is just an extension of that.
I think the best way to insert assembly into a C program is to create assembly functions in a separate file. This way you have full control over the assembly code, and there's no confusion about mixing assembly and C. The files are all compiled/assembled into object files, which are then linked together. It's sort of like how Spin and PASM work together, except that with C/assembly they both run in the same cog. That's assuming you write the assembly as LMM code. The assembly can also run in a different cog, and in that case is very similar to how Spin and PASM work together, except they are in separate files.
That is your recurring proposition. But so far without any basis. The issue of code space has been demonstrated to be a non-issue. See all my experiments above.
What else is missing to satisfy your demands of "deeply embedded minion" work?
Nope, the results above serve to prove that it is very poor at handling calls, and still a work-in-progress on interrupts, and needs still more wrappers for Port io and Boolean work....
There is so much 'extended' / still-coming stuff needed before RISC-V can come close to a 60c Cortex M0 that it is best left in the sand-pit it targets, which is Free-core Linux.
Better interrupts was one significant area of work on the Cortex
- ARM have traveled this path already: taking a basic RISC core (the original ARM) and turning it into something actually usable for embedded control is not trivial.
I see Microchip have added shadow registers to their PIC 32, and so the list goes on.. and on...
Here are some comments from the web
["The Cortex-M parts have a lower interrupt latency than the PIC32 parts -- fewer registers to save/restore and the hardware overlaps some of the work. Cortex interrupt handlers can be written in C with no special prolog/epilog code required, which is nice.
The Cortex parts also have hardware support for separate main and process stacks, if desired -- something that needs to be implemented in software on the PIC32.
Overall, the Cortex-M architecture is much newer than the MIPS architecture the PIC32 is based on, and it shows."]
No can do. The RISC V ISA is the RISC V ISA. No room to add anything "common sense".
? Err, the RISC-V docs actually say this (as well as all the other 'extensions' they mention ) ["Opcodes marked as reserved should be avoided for custom instruction set extensions as they might be used by future standard extensions. Major opcodes marked as custom-0 and custom-1 will be avoided by future standard extensions and are recommended for use by custom instruction-set extensions within the base 32-bit instruction format."]
Plenty of room to add-stuff...
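As a sketch of what "room to add stuff" looks like at the bit level, the C below packs a standard R-type instruction word and then reuses the same layout with the custom-0 major opcode. The rv_encode_cognew helper is purely hypothetical: no such instruction exists in any RISC-V spec, and the field assignments (funct3 and funct7 both zero) are invented here just to show where a Prop-style extension could live.

```c
#include <stdint.h>

/* Major opcodes set aside for custom extensions in the 32-bit base format. */
#define RV_OPCODE_CUSTOM0 0x0Bu /* binary 0001011 */
#define RV_OPCODE_CUSTOM1 0x2Bu /* binary 0101011 */

/* Pack an R-type instruction word:
   funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0] */
uint32_t rv_encode_rtype(uint32_t opcode, uint32_t rd, uint32_t funct3,
                         uint32_t rs1, uint32_t rs2, uint32_t funct7)
{
    return ((funct7 & 0x7Fu) << 25) |
           ((rs2 & 0x1Fu) << 20) |
           ((rs1 & 0x1Fu) << 15) |
           ((funct3 & 0x07u) << 12) |
           ((rd & 0x1Fu) << 7) |
           (opcode & 0x7Fu);
}

/* HYPOTHETICAL "start a cog" instruction placed in custom-0.
   Not part of RISC-V; the operand meaning is an assumption:
   rd = returned cog id, rs1 = code address, rs2 = parameter. */
uint32_t rv_encode_cognew(uint32_t rd, uint32_t rs1, uint32_t rs2)
{
    return rv_encode_rtype(RV_OPCODE_CUSTOM0, rd, 0, rs1, rs2, 0);
}
```

As a sanity check on the field layout, rv_encode_rtype(0x33, 10, 0, 11, 12, 0) gives 0x00C58533, which is the standard encoding of add a0, a1, a2. The hypothetical custom instruction with the same registers differs only in the low seven opcode bits.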
I can quite understand that cognew is not the same as in line assembler.
You have obviously been toying with such things as GCC in line assembler for long enough to no longer see how impossible it is.
No normal human being who is up to doing some casual programming in C++ on an Arduino or Spin on a Prop is going to want to deal with the hideous mess that is GCC (and other) in line assembler. Life is too short.
Heck, even professional software teams will call in a specialist to deal with such things if they need it.
We plan to define more optional instruction set extensions for RISC-V beyond the ones we already have, including Packed-SIMD Instructions (P), Bit Manipulation (B), Decimal Floating-Point (L), and Transactional Memory (T). One goal for the RISC-V foundation is to manage development of these future standard instruction-set extensions.
The currently defined extensions to the base Integer (I) ISA are Multiply-Divide (M), Atomic (A), Floating-point in multiple precisions (F, D, and Q), and Compressed Instructions (C).
From the RISC V FAQ
For an embedded, micro controller specific type application, extensions may well make sense. They are going to add them and manage them anyway. The core ISA is the core ISA, of course.
And I didn't mean, "easy for the programmer" types of things anyway. More like, "if a Prop like device were to center on the core ISA" types of things. COGNEW, and friends.
No normal human being who is up to doing some casual programming in C++ on an Arduino or Spin on a Prop is going to want to deal with the hideous mess that is GCC (and other) in line assembler. Life is too short.
Have you read the documentation on GCC inline assembler? Take a look at the "Basic Inline" section of this page:
Perhaps you have a point.
I could imagine that instead of Spin we have C++ like the Arduino. The language 99% of the world is happy with, professional or hobbyist. Compiled, directly executed and fast.
That still leaves the issue of getting down to assembler, either when you really need to or just for educational purposes. This is universally far more complicated than just throwing some PASM into a Spin object. Partly because the tools are not very good at integrating HLL and PASM. Partly because most instruction sets are far less easy to deal with.
The only other multi-core, multi-threaded device designed for embedded systems and real world interfacing that is easy to program that I know of is the XMOS (Not in assembler though). But even the XMOS IDE is a huge complex beast to have to deal with for the casual programmer.
I seem to recall that there once was a C compiler where you could basically open an assembler block in the middle of your C source code and write simple assembler instructions. Something like:
Was that Borland Turbo C++ or something ?
Why does GCC and Clang and friends make this so hard?
Edit: Or am I thinking of Intel's old PL/M 86 compiler ?
So rather than testing call / SW stack overhead, it shows how RISC-V would much rather avoid that. (as already seem)
One problem with such 'benchmarks', is the compiler-writers know them, and can drop in an equivalent - and they then cease to be benchmarks at all... I've never shipped a product that used fibo.
Simple test cases are not the real world, and ultimately, you do have to make real calls, using a sw stack.
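To make the fibo point concrete, here is a minimal sketch in C (my own illustration, not the benchmark code discussed above; the names `fibo`, `fibo_iter` and `fibo_calls` are made up). The recursive version does almost nothing except call itself, which is why it gets used to measure call and stack overhead. The iterative version returns the same values with no calls at all, which is the kind of equivalent a benchmark-aware compiler could substitute.

```c
#include <stdint.h>

/* Hypothetical instrumentation: counts how many times fibo() is entered. */
static unsigned long fibo_calls = 0;

/* Naive recursive Fibonacci: nearly all of the work is call/return overhead. */
uint32_t fibo(uint32_t n)
{
    fibo_calls++;
    if (n < 2) {
        return n;
    }
    return fibo(n - 1) + fibo(n - 2);
}

/* Same result with a simple loop: no calls, no software stack traffic. */
uint32_t fibo_iter(uint32_t n)
{
    uint32_t a = 0;  /* F(0) */
    uint32_t b = 1;  /* F(1) */
    while (n-- > 0) {
        uint32_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```

Computing `fibo(10)` this way enters the function 177 times to produce 55, while `fibo_iter(10)` gets there in ten loop iterations. If the tool chain quietly turns the first into the second, the "benchmark" no longer measures calls at all.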
I'd agree, the RISC-V is not looking great for deeply embedded minion work, but it is a compact 32 bit RISC that can run Linux, and it would couple well to a new Spansion HyperRAM
Would be interesting to see some same-FPGA build numbers for P1V, P2 and RISC-V.
https://www.xmos.com/download/public/Inline-Assembly%28X1969F%29.pdf
Take RISC V, add some micro controller specifics, video, counters, SERDES, etc... and simply redo SPIN, and its byte code. Put that into a COG, with the HUB concept, and it's going to look an awful lot like what we have now.
Seems to me, one could inline assembly and add macro capability to make it a bit more friendly, and offer compiled SPIN or bytecode and get something pretty nice. Not as nice as PASM is, but nice enough.
Re: "synergy" Yeah, me too.
The thing about this is one design vision. And it's why some of us, and myself in particular, are wanting Chip to do that again on the P2. (and we will get inline this time --sweet!)
I'm sure there are other words for that. One I would be inclined to float out there is, "resonant" in that the components and their attributes of the design all combine to make a greater thing out of the whole than dissonant ones would.
Indeed.
Looking back, the Apple ][ was this way, as was the BBC Micro. Both offered not just BASIC, but lower level tools too. The Apple ][ had a monitor, mini-line assembler, and the tools needed to build programs, save them, load them back, combine and save the whole. One can make a large, fast program in common sense parts with just the basic tools in the box. There were better tools made, but just about everything somebody needed was in there, and writing a few utilities patched that up nicely enough. The BBC was similar, but offering inline assembly made it all a bit more "resonant" IMHO. End result was the same though.
Props are like this. Better / different tools exist now. We've got a ton of them and they run all over the place too. But the "in the box" SPIN + PASM is resonant, and it's fairly lean.
I sure wish I had time and more skill in these things. A RISC V "themed" prop could be very interesting!
As for the need for assembly code:
I see two camps.
One is to make assembly code easy to do and make it well integrated with higher level tools.
The other is to near completely eliminate it by making the higher level tools capable enough.
In both scenarios there remains reason to write assembly code. Not to mention reverse engineering, boot strapping, hackery of all kinds. This is why I favor the former, though it doesn't take away from the latter. Plenty of room for both to be maximized, and they should be. Which leaves people to their preferences and goals.
I've been quite intrigued by advances in the 8 bit scene. Check this thing out: http://www.wudsn.com/index.php/ide
If you want to, it's possible to write fairly friendly, high level looking code with macros. I've done a couple little projects with this, and it's a lot of fun. Support for most 8 bitters is in the can too.
There are now a couple of compiled BASIC languages targeting the 6502 and old game consoles. Their inline assembly is stupid easy. All variable names and labels are visible. You literally type "asm:" and then write a few instructions, and then "end asm" and carry on. Nearly completely transparent. Showed it to Chip, and I'm hoping our SPIN gets this for the "make it easy" case.
This RISC V instruction set would bend toward the "nearly eliminate" case for sure. Sure would be interesting to see it play out. Maybe somebody will make a little, cheap CPU, or something we can play with someday.
Finally, SPIN was tightly integrated with PASM for a few reasons:
1. SPIN is slow, though I don't think it's that slow for most things. Heck, it's on par with assembly language I wrote for years. But, yes. Slow by comparison to a lot of things these days.
2. The isolated nature of the COG. To me, this is the real defining attribute of the Propeller, and it's awesome. It made great sense to compartmentalize what happens in a COG in ways that makes them easy to combine and use together. I think this gets regularly under appreciated in light of the strong desire to run bigger, faster, serial programs. And that's a valid need, don't get me wrong. But, a lot of things can be done differently to get around this need. Chip himself pointed out where the speed matters and where it doesn't and how that all works on a Propeller.
3. Being designed together means maximizing the commonalities inherent in both. Most of PASM is directly exposed in SPIN. Eric Ball wrote a video driver in SPIN! It's not all that high resolution, but it was very interesting to see that effort.
Again, seems to me a RISC V type ISA, with common sense Prop style instructions and extensions would lead right to where SPIN and PASM are, and it would exhibit many of the same traits too. A raw assembler would present RISC V as more obtuse than PASM (and what isn't?), but a few macros, conventions, helpers, and it could be a whole lot nicer without losing hardly anything in terms of raw code capability. One downside might be optimization. That's pretty easy in PASM.
Looking back, the one omission was compiled SPIN and we didn't get that because we didn't really grok how the COG works, until Bill got LMM sorted.
This time, that's known, planned for, etc... I suspect compiled SPIN will improve things considerably.
Who came first? Who knows. But there does seem to be a trend toward more little assistant CPUs ("minions") here of late.
What else is missing to satisfy your demands of "deeply embedded minion" work?
Yes I know we can put assembler inline with GCC and other compilers. God forbid I have had to do it from time to time. It looks like this:
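For anyone who has not seen it, a minimal sketch of GCC extended inline assembly in C (my own illustration, assuming an x86 or x86-64 target and a GCC-compatible compiler; the function name `add_asm` is made up, and other targets fall back to plain C):

```c
#include <stdint.h>

/* Hypothetical helper: add two 32-bit integers with one x86 ADD instruction. */
static inline uint32_t add_asm(uint32_t a, uint32_t b)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    uint32_t result = a;
    __asm__ ("addl %1, %0"     /* AT&T syntax: result = result + b */
             : "+r" (result)   /* %0: read-write operand in any general register */
             : "r" (b)         /* %1: input operand in any general register */
             : "cc");          /* clobbers: the condition flags are modified */
    return result;
#else
    /* Fallback so the sketch still builds on non-x86 targets. */
    return a + b;
#endif
}
```

One instruction, and it still needs an operand template, two constraint strings and a clobber list so the compiler knows what the assembly touches.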
WTF? Nobody wants to write that.
That was my whole point. Why is it so damn complex and impenetrable?
1) Because the instruction set and assembly language are not simple. Unlike the Propeller instruction set.
2) Because of all that Smile you need to add to write it as assembler in your HLL source. Unlike the Spin system.
This is totally not the same as pointing COGNEW at PASM in a DAT block.
I know those things are there for portability, but why?
Much better to compartmentalize. Here are these inputs, or a state, perform assembly to actualize outputs and or a state, exit, carry on.
If the program gets moved to a different CPU, then you do the assembly again, done, next. Well compartmentalized uses of this kind make a lot of sense.
I've mentioned it before too, but the PASM / SPIN ways to specify data are excellent. You get binary, delimiters, octal, two bits (whatever that is called %%), hex, decimal, strings, alignments, and all sorts of quick and dirty things. It's not only easy to write PASM. It's almost as easy to stuff data into the environment, and do so in an easy to read, common sense way.
True. But we will on P2.
And it's going to be pretty easy. PASM keyword, possibly a mode modifier, then the PASM code block, then continue on.
The only reason we didn't get this early on is due to how the COG works. No need for it, until LMM.
Frankly, we could add this to SPIN right now, and it would get done in LMM mode, though it would come with some oddities one would need to keep in mind.
Chip explored two models:
One was the snippet. It gets loaded into the COG, run, then when it exits, back to SPIN. The other was real, in-line, based on the idea of HUBEXEC. It's my read, the latter is most likely.
Thanks for the interesting treatise, a couple of points: No can do. The RISC V ISA is the RISC V ISA. No room to add anything "common sense". Unlike PASM and the instruction sets of the 8080 (hence 8086), Z80 and other early micros, the instruction set is not designed for ease of use by the assembler programmer. It's designed to efficiently do what HLL compilers can produce. The issue is that compiled to native code or LMM (had Chip realized that was possible) the code size is huge. One can fit a lot more functionality into a Prop with Spin byte codes. Spin byte codes are slow but given that we can very easily put the high speed stuff into COG in PASM it's a good trade off.
That's part of the "synergy" thing I was talking about. (Blech, I said that word again, spit, spit).
I think the best way to insert assembly into a C program is to create assembly functions in a separate file. This way you have full control over the assembly code, and there's no confusion about mixing assembly and C. The files are all compiled/assembled into object files, which are then linked together. It's sort of like how Spin and PASM work together, except that with C/assembly they both run in the same cog. That's assuming you write the assembly as LMM code. The assembly can also run in a different cog, and in that case is very similar to how Spin and PASM work together, except they are in separate files.
Seems synergistic to me.
There is so much 'extended' / still coming stuff needed before RISC-V can come close to a 60c Cortex M0, that it is best left in the sand-pit it targets, which is Free-core Linux.
Better interrupts was one significant area of work on the Cortex
- ARM have traveled this path already, taking a basic RISC core (original ARM), and then turning it into something actually usable for embedded control is not trivial.
I see Microchip have added shadow registers to their PIC 32, and so the list goes on.. and on...
Here are some comments from the web
["The Cortex-M parts have a lower interrupt latency than the PIC32 parts -- fewer registers to save/restore and the hardware overlaps some of the work. Cortex interrupt handlers can be written in C with no special prolog/epilog code required, which is nice.
The Cortex parts also have hardware support for separate main and process stacks, if desired -- something that needs to be implemented in software on the PIC32.
Overall, the Cortex-M architecture is much newer than the MIPS architecture the PIC32 is based on, and it shows."]
Err, the RISC-V docs actually say this (as well as all the other 'extensions' they mention):
["Opcodes marked as reserved should be avoided for custom instruction set extensions as they might be used by future standard extensions. Major opcodes marked as custom-0 and custom-1 will be avoided by future standard extensions and are recommended for use by custom instruction-set extensions within the base 32-bit instruction format."]
Plenty of room to add-stuff...
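As a rough sketch of what using that reserved space could look like, here is C that packs a standard 32-bit R-type instruction word into the custom-0 or custom-1 major opcode. The opcode values (0001011 and 0101011) and the R-type field layout come from the RISC-V base spec; the function name `encode_rtype` and the idea of spending a custom-0 slot on a Prop-style operation are purely my assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Major opcodes set aside for custom extensions in the 32-bit base format. */
#define RISCV_OPCODE_CUSTOM_0 0x0Bu  /* binary 0001011 */
#define RISCV_OPCODE_CUSTOM_1 0x2Bu  /* binary 0101011 */

/*
 * Pack an R-type instruction word:
 *   funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0]
 * Hypothetical helper, not part of any real RISC-V tool chain.
 */
uint32_t encode_rtype(uint32_t opcode, uint32_t rd, uint32_t funct3,
                      uint32_t rs1, uint32_t rs2, uint32_t funct7)
{
    /* Each field must fit its bit width. */
    assert(opcode < 128u && funct7 < 128u && funct3 < 8u);
    assert(rd < 32u && rs1 < 32u && rs2 < 32u);

    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | (rd << 7) | opcode;
}
```

For example, `encode_rtype(RISCV_OPCODE_CUSTOM_0, 1, 0, 2, 3, 0)` gives `0x0031008B`: a made-up "x1 = custom_op(x2, x3)" that a Prop-flavoured core could decode however it likes, with funct3 and funct7 free to select between many such operations.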
I can quite understand that cognew is not the same as in line assembler.
You have obviously been toying with such things as GCC in line assembler for long enough to no longer see how impossible it is.
No normal human being who is up to doing some casual programming in C++ on an Arduino or Spin on a Prop is going to want to deal with the hideous mess that is GCC (and other) in line assembler. Life is too short.
Heck, even professional software teams will call in a specialist to deal with such things if they need it.
From the RISC V FAQ
For an embedded, micro controller specific type application, extensions may well make sense. They are going to add them and manage them anyway. The core ISA is the core ISA, of course.
And I didn't mean, "easy for the programmer" types of things anyway. More like, "if a Prop like device were to center on the core ISA" types of things. COGNEW, and friends.
http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#s4