Parallax generally isn't competing on price as much as it is on value.
There are lots of ways to show value and things are worth what people will pay for them.
This is recognized value and it helps when the discussion is price driven.
Essentially, the dialog gets expanded to a more generic one associated with needs, goals, and the impact of those on business.
If we compare one bare cpu or board to another in a vacuum, the Parallax products do not come across as optimal from a feature price perspective.
However, if the ecosystem, tools, expertise, performance features, methodology, etc... are compared, now there is robust competition, and value for the dollar becomes primary instead of it just being dollars.
This is not always possible to do. Some prospects really do only fixate on price and value narrowly. They typically will not be Parallax prospects, just as many laptop buyers aren't Apple Mac prospects.
One artifact of that economic reality is absolute market share isn't a meaningful measure of success. A value based niche player need only secure enough people who recognize their value added to fund the business and deliver the owners enough financially to meet their goals.
By contrast a public corporation must also meet market expectations. Those do generally focus more on price than value, but there are exceptions.
Apple Computer is a pure value player: they get nice margins on their products because they add a lot of value, and they ask for it and get it. Despite their smallish share numbers, they are one of the most well capitalized companies ever to exist, if not the most.
When we discuss the market viability of the P2, we would do well to keep these things in mind.
It is really easy to find dime chips that perform well and believe all is lost, when the truth is much different and far more positive than the popular economic ideas and expectations would otherwise indicate.
It is also worth looking at the Pi, which is almost entirely a price play. Look at their goals as a foundation: to enable more people to do real computing, not just be users, and compare those with the likes of Intel and others who seek to make a clear and substantial profit.
The impact on price is significant and very little out there competes with a Pi on that basis.
The Pi depends on the body of Open Source software to add considerable value to the already low price.
Despite high volumes, profits back to everyone involved are lean and well distributed as well.
Just be sure and keep all the relevant dynamics in mind and the discussion will be more positive and productive.
A simpler P2 could not only be used to verify OnSemi,
? If that 'simpler' chip includes all the full custom, it is no longer a 'simpler' chip.
ie you cannot do your claimed verify step, without making something very close to P2.
There could be merit in a Shuttle run to prove critical custom blocks, but shuttle runs do not give commercial volume parts, and that call is for Ken et al, based on the Contract Custom designers track record, and sign-off deliverables.
Might be time to look at a Pi 2 and see what can be done at a low level with 4x 900MHz cores, GPU and 1GB DDR2 (no *nix/*droid/w10, just ARMv7 Assembler).
.. and GPU assembler
Certainly will be worth tracking Pi 2.
Even if P2 was available today, with the specs of the failed "hot" chip, how is one to produce a P2 board (with just some of those Pi2 specs) for $35 ???
I half agree; however, that price point is met by things like the BeMicro MAX 10 ($30), using a part that is NOT cheap, and the market does expect low cost development systems these days.
So I expect Parallax to manage it.
However, I'm not sure about "Produce a P2 board (with just some of those Pi2 specs) " ?? - P2 is not trying to displace Pi2, but it would make strong sense to include a Pi2 header on any P2 board, so it can be used with Pi2.
One more basic realization comes from this whole dynamic, and that is the value of sales, marketing and community these days.
Take two teams both producing identical products. Engineering is well matched and everyone is at par. Additionally, component prices are the same.
Those should price and sell the same, right?
But do they?
Almost never, unless the market is very highly regulated.
Say one of those teams has superior skill in showing value.
That team will likely get the majority of dollar margins, while potentially not getting superior share, though they could do that too, depending.
What about the team who develops and actualizes a great community to compensate?
Suddenly we see robust competition and meaningful choices despite there being no other material difference in the product otherwise.
My point here isn't to devalue engineering or over inflate sales and marketing.
It is to highlight important dynamics for niche players and their potential paths to success and ideally a sustainable business.
Whatever we end up doing with P2, it absolutely has to have high value and that value must be made clear and it must be asked for in order for the niche player to maximize their chances and deliver that value to those who see it for what it is.
IMHO, Parallax is very well positioned to do that. The product is going to be highly differentiated, the community is stellar, and the move to open tools captures the very high use value perception so important for niche devices to make it.
The final price is important, but not in that "how do we compete with a Pi" way. More like a more inclusive, "was it worth it?"
I'm not sure how you relate the Pi to Parallax or Apple or whoever but let's look at that....
...which is almost entirely a price play
Well, yes, they wanted a minimal price, at the cost of performance, so as to be able to reach their target audience: school kids.
But "entirely a price play"? No way. The Raspberry Pi Foundation is not a corporation trying to ship a ton of product and maximize profit. They have an educational mission, it's specified in their charitable status. They add a huge amount of value in documentation, educational materials, community support etc, etc, etc.
It's not clear how we can compare the Raspberry Pi Foundation and its mission and the work of countless unpaid volunteers to that of a for-profit corporation like Apple, Intel, MS or even Parallax.
The Pi depends on the body of Open Source software to add considerable value to the already low price.
Yes indeed. I'm not sure I know where you are going with that though.
Apple also depends on a huge body of Open Source and is still as overpriced as it was back in the early eighties.
Parallax is using a good amount of the same Open Source: propgcc, SimpleIDE, PropellerIDE, OpenSpin, anyone?
Seems to me everyone depends on Open Source nowadays. Have a look at what MS is up to.
Heck, I'm not even sure what we are debating here any more. P2, good value or not? And how?
I think extremely highly of the Pi Foundation. They are just nailing their goals.
They are adding a ton of value at a low price and performance is secondary.
All true, and very high value.
Naysayers early on cited the performance as a "why bother", completely missing the high value.
My point stated another way is the P2 will have similar trade offs. Maximizing the other value potentials is important.
It is not so important to match up with products that offer different trade offs.
One thing recognized in the last iteration was the lack of an OS. That is as debatable as it is differentiated.
We may see that still true on this iteration.
To my economic and business eye, that kind of thing and its implications require more analysis, so that the potential is seen and communicated well enough for people to see themselves actualizing their goals in ways that make the no-OS feature read as high value.
...for just one example.
In the end, I think the gloom and doom type discussions aren't a bad thing. They will drive us to think differently.
I just want to make it clear price isn't a primary consideration, if we do the work to maximize what P2 can or will do. It will operate differently, and that is where a lot of the value will be.
That we understand it and can communicate it well, and support those who see it and want to realize it for themselves, is where the rest of the value will come from.
When that all shakes out, the price will be the best price Parallax can deliver. If it goes well, that price will fall too.
But early adopters will very likely buy in on a value-for-dollar basis, more than any other. IMHO.
I wrote this in response to an earlier post a page or two earlier in the thread and I don't want to struggle with edits on mobile... please forgive.
There is this too. I didn't mean it as debate as much as I did food for thought and perspective as we think about and ideally get back to real work on the current iteration.
Naysayers early on cited the performance as a "why bother", completely missing the high value.
Yep, when the Pi came out we were using tiny ARM boards that cost 200 dollars and more. I had an ongoing task for ages to keep an eye on that and find cheaper stuff. (Despite the fact that that was already 800 dollars cheaper than the old industrial PC's in use before I arrived)
The Pi was a shock to the system. Nearly a factor of ten cheaper!
And then there are all those jerks complaining that it has half the performance of the latest whatever. Totally missing the point.
Aside: We are still using expensive ARM boards as there is some insistence on an industrial temperature spec. I'm sure the Pi could do the job but there we go.
So, where does this put the P2?
On the one hand we have the little 8 bit Arduino dominating the landscape for hobbyists. Industrially that translates into AVR and PIC chips all over the place.
On the other hand we have the Raspberry Pi dominating the high end "must have an OS, and graphics and servers, and security and ..." landscape. Industrially that translates to all kinds of ARM boards.
So, again, where does that put the P2?
Like the P1, there is that application area of things that need precise real-time control, but don't justify the expense of custom logic or an FPGA, yet aren't actually all that fast.
How big is that application area?
There is how big, and there is what it is worth.
Both questions could use some analysis at some point.
Until then, I very strongly favor maximizing the differentiators. Compatibility / me-too features make some sense, but I don't see them making or breaking the project.
I'm very interested in the smart pin concept too.
It will be nice to get an image and get into the details.
I would not rule out the "no OS" ideas seen on the last iteration.
That could be disruptive and compelling. I personally am waiting on details before thinking too much. Maybe being able to toss a bunch of stuff together and get real time behavior sans the OS and other complexities will expand that niche considerably.
I'm coming to this thread a little late since I have largely given up on ever seeing a P2. Still, I check the forums occasionally in the hope that something has actually been accomplished. I absolutely love the propeller. It is by far my favorite microcontroller. I use it professionally whenever I can.
Having said that, a product that is available sells a lot more than one that isn't. Parallax has released the code for the P1. Make a chip based on that code (unchanged) at a smaller feature size so that it runs several times faster than the P1. Development is already largely done (per my limited understanding of chip design). A Propeller that can run 100 million instructions or more per cog per second would be great. It would still have other limitations but it would be on the market and would open up other possibilities that didn't exist before. It's an easy incremental change. Make it a drop-in replacement for the P1.
Next step: Give it more pins. Release that version too. Incremental change. Better. Sells more chips than something that isn't released.
Next step: Add some instructions or RAM or something incremental. Release it. Sell chips. Bring in revenue.
Every company that I have ever worked for that couldn't set a fixed target for their products failed. A moving target means endless development. That's where the propeller is. Parallax has other revenue streams than the propeller so I don't think that it would go under (at least I hope not) but endless Propeller development has to be a big financial drag on the company.
Release something, anything, and it would be better than nothing.
Working on two at once is obviously a recipe for disaster. The Prop2 is well on its way already and long past the point of any Prop1 revisit.
Potatohead: Good info, thanks. The term "value", I've got qualms about its use/meaning; have you got a formal definition?
Producing a shuttle run of a simpler, but upmarket P1 (or a cut-down P2) would serve to prove a number of P2 features. Now, if this works first up, Parallax gets a chip that it can sell. Provided it has some special features that many require (not want, but require) then it would have a market even at a higher price. For this, IMHO the required section is more hub RAM.
Add the Monitor ROM and the Fuses into this mix, as it proves the Fuses and Security. If it (security and fuses) fails then this just gets sold without, and it's back to the drawing board on this for the P2; else it is passed as proven. Same with SPI FLASH.
This way makes the most sense to me, as it gets a chip out there for Parallax to get some ROI and gets some credibility back, as well as proving that OnSemi and the contractors can work together properly.
Meanwhile, P2 can continue with Smart Pins and Instruction Set and Hub Interface, etc.
How much does it cost Parallax to do this??? I don't know. But it's my best suggestion without knowing this.
R Pi 2
There are a few cheap ARM quad core processors out there now! Without the overhead of *nix, I am pretty sure that I could make some cores run "deterministically". Besides, 800-900MHz (or even 400MHz) is way better than 160-200MHz on the P2 (or the 80-104MHz that we have in the P1).
I hate to say it, but the Pi 2 has just changed the way I see much of the microcontroller market. It's no longer an ARM *nix vs P2 (or P1) because multi-core ARMs are here and cheap. They are indeed an alternative to the P1 & P2. Note that I have always previously argued they are not the same, but cheap multi-core ARMs without the OS can also be regarded the same - ie progress has just changed my opinion. This is the point I am trying to get across - things have moved on!
Maybe I am wrong. I hope I am for Parallax's sake.
The RPI-2 isn't a microcontroller and shouldn't be compared to the P-2. The P-2 is what you should be bolting on to a RPi-2. Sure it's way faster than a P-2, but clock speed isn't everything. Freescale produces a boat load of embedded micros that aren't anywhere near the speed of a P-2 and they sell millions. A NASA probe near Pluto is using a rad hard 12 MHz MIPS (Mongoose V) processor and is working just fine. Yeah, 12 MHz.
Most of the embedded ARMs and PIC32s seem to do just fine at under 200 MHz.
If the P-2 becomes working silicon, it's going to be compared to whatever else is out there in the embedded world and popular at the time. Let Parallax worry about the advertising and product placement of it: white papers, benchmarks/CoreMarks, etc.
Meanwhile, P2 can continue with Smart Pins and Instruction Set and Hub Interface, etc.
Err, then you have actually 'proven' very little of the P2, which is delayed even further by the sign-offs needed...
I get the impression the Custom design is more on the critical path than Verilog, and certainly more directly costly
(as in cheques being signed).
A SECOND Parallel Custom design, needed for the trial device, would be very costly.
Verilog can be proven to a large extent on FPGAs, which is how everyone does it these days.
If you wanted to get a P2 asap, there could be a case for a Shuttle run on the exact P2-Custom design, but to test that really needs a good chunk of P2 Verilog, and once there, why not just do it properly ?
It's no longer an ARM *nix vs P2 (or P1) because multi-core ARMs are here and cheap. They are indeed an alternative to the P1 & P2.
NXP have had Dual Core Microcontrollers for a while now, and many PCBs are designed with more than one chip, to achieve the same Multi-core benefits. Pi2 does not really change any of that microcontroller reality.
Maybe it makes Multi-core ARM just a little bit more visible, but I'm not sure how many you have to order to actually buy a BCM2836, but I cannot find anyone selling them.
I still wait to see how one can actually use the BCM2836 Multi-cores in a deterministic design.
It would need parts of the cache to be locked to selected cores, and even there Pin access is likely to be not quite independent.
@rod1963,
The RPI-2 isn't a microcontroller and shouldn't be compared to the P-2.
Quite so.
The Raspi is a board, a finished computer; the Propeller is a chip.
The Raspi processor is not available as a chip by itself. The Propeller 2 is (well, we have to imagine it is for the sake of argument).
The Raspi processor is built and optimized to run an OS like Linux; it has all the complications of virtual memory and caches and so on. The Propeller is designed for real-time, real-world interfacing, to the metal as it were.
The Raspi processor requires external RAM in order to operate; it's a micro-processor. The Propeller does not; it's a micro-controller.
But, Clusso has a point or two.
What if you can buy quad core ARMs off the shelf for a dollar or two? Seems there are already such things.
What if you can program them "to the metal" with no OS. Which you can. What if you can have the OS and nail a core or two to your real-time code? Which you can under Linux.
All of a sudden we may not care about that fuzzy distinction between micro-controller and micro-processor. As long as it does the job who cares what it is called?
We have a spectrum of devices and capabilities available, from the tiny 8 bitters (AVR, PIC), to the 32 bit ARM MCUs, (STM32 etc) to the big ARM SOC for Linux and other OS based devices.
Distinctions are getting blurred. Those "big" devices are actually now very small and cheap.
@Clusso, don't get too excited about this yet. Programming an ARM SoC to behave like a Prop is not going to be the easy walk in the park of programming a Prop or other MCU.
@heater, Don't forget we don't have to program these ARMs to be peripherals, as the peripherals are built in. So some of the Prop sections (cores and objects) are no longer required.
Once an ARM is broken down into its cores where they can be used separately, we can have a main program (in C - ugh, did I say that), and special functions for the other cores.
For me, one of the huge advantages of the Prop was the 8 32-bit cores, and secondly the 32KB Hub Ram.
Alternative micros had very small RAM and plenty of FLASH. That has changed with a number of micros (eg PIC32Fxxx) now having 128KB+ of RAM plus Flash, and lots of peripherals.
Previously I had seen the ARM as a microprocessor running *nix. This was not a viable Prop alternative.
But now it seems to me that the ARMs have also caught up sufficiently (cheap and multiple cores) to pose a viable alternate solution to many of the uses where I would use a Prop or two.
Rockchip has a quad core ARM that has to be cheap - lots of devices use it: dongles, phones, tablets.
In the end, it doesn't matter what you call it; it is just what gets the job done efficiently and cost-effectively. That doesn't necessarily mean the cheapest either.
Previously I had seen the ARM as a microprocessor running *nix.
Ah, well, ARMs have been micro-controllers for a long time. And certainly the earlier generations of ARM based phones were not using Linux or iOS etc.
The STM32 range is great for example. The STM32 F4 is a beast of a micro-controller, loads of FLASH and RAM space, heaps of pins and peripherals, floating point support! And dead cheap.
So what can a Propeller 2 do that these devices cannot do?
It basically comes down to that nano-second by nano-second, fine grained and deterministic timing with which we can wiggle IO pins. Which is made easy to program by virtue of having many cores and hence not needing to mess with interrupts.
Now that is a great feature and I love it, but I have to wonder how many people actually need it. How big can that market ever be?
Just now I'd still rather have my ARMs running Linux and the Props as super intelligent I/O expanders for them.
That "highly deterministic" claim is based on using COG RAM. Which they call tightly coupled memory (TCM).
They have Instruction TCM and Data TCM. According to ARM it can be up to 16MB, but currently the STM32 F7 (the first M7 to be implemented) has 16 KB of I-TCM and 64 KB of D-TCM. Sizes far above anything we can dream of on P2.
Those STM32s currently cover just about any package option you can think of, from TSSOP-20 up to LQFP-176 and everything in between, with prices ranging from $1.45 to $19 (single quantity). There are 319 different package options currently stocked at Digi-Key. They have 7 different families: low power/low speed, more speed, ..., even more speed, ..., up to 200 MHz.
And they have recently made evaluation boards (called NUCLEO) that have the programmer included (ST-LINK v2) at US$15.
Their programming software is still Smile (it only runs on Windows out of the box, even though they developed the tool in Java), and they focus on commercial tools that cost $3,000 per seat (Keil, IAR, and Atollic only). BUT some people have found a way to use the software just to generate initialization files for plain makefile projects, and there is a free programmer that works on Linux too. So some people can use free software without any code size limit, with the free GNU compiler developed by ARM itself (by some of its own employees).
They are moving towards eating the market the P1 is (was) good at. They are moving fast, and they are dirt cheap.
They are currently at 90nm, and I think that we will see the first 65nm or 40nm STM32 microcontroller at the same time we have the P2 in our hands.
I love the current P2; wish I had it now.
This idea of "tightly coupled memory" seems like what we will have when hub execution appears in P2. COG memory will be the tightly coupled memory and will be deterministic and executing from hub memory will be less deterministic. Or maybe not. Is there still a cache in the new P2? Maybe even hub execution will be deterministic (but slow) if there is no instruction cache.
It seems like deterministic hub execution will be a little iffy. With the Mix-Master hub memory, only straight-line hub execution will be deterministic. Random jumps will usually end up in another hub slot, which will cause a random number of hub stalls.
While that is most likely true (without painfully careful programming if even possible), will any sizable chunk of hard real-time tasks require more than ~512k longs?
No, most real-time tasks will not require more than ~512k longs. However, I assume you meant ~512 longs. Many real-time tasks will loop or perform other jumps. If the jump target is not in the correct memory bank there will be stalls. So even a very small real-time task would need to be hand-tweaked to avoid stalls.
EDIT: Perhaps you were implying that all real-time tasks would be small enough to fit in COG memory. That's probably true. My comment was related to hub execution. High level programs compiled to run in hub exec mode will most likely encounter lots of hub stalls. Even straight-line code will have stall issues when it is reading data from hub RAM.
EDIT2: I've never been a fan of the Mix-Master scheme. It will work great for hand-tweaked code, but I don't think it will work so well for HLL compiled code. I think there will be more motivation to use P2 with HLLs, but they won't get the most out of the Mix-Master design. I think this will tend to keep people away from the P2 instead of attracting new customers.
Generalizing a bit, I think the points are (using ARM as an example):
ARM is steadily increasing in core count (level of parallelism)
ARM is steadily increasing in clock speed (effective per-core performance)
ARM can provide "effectively deterministic" behavior for many use cases
ARM is staying low-price
ARM is staying low-power
ARM is manufactured by a large number of chip makers, resulting in both variety and stability
ARM is increasingly showing up in the hacker/maker areas
(NOTE: "effectively deterministic" is another way of saying "as long as I can guarantee that X critical instructions happen in Y time, it's deterministic enough for my needs". Propeller's claim of "deterministic" is only critical when you are working at the threshold of its performance limits. For instance, if you are just coding a 115Kbps serial channel, then you can make the Propeller "effectively deterministic" without having to perform clock-cycle level timings.)
It is for these reasons that I think Cluso commented that RPi 2 is changing how he's looking at all of this. It was nearly impossible to compare the original Propeller to the ARM, because they were so dissimilar. But the Propeller 2 has moved closer to ARM (hub execution, multi-tasking, etc), and therefore increasingly invites legitimate comparison. Still, Propeller 2 and ARM are not an apples-to-apples comparison, but they are now certainly a fruit-to-fruit comparison. And, to belabor the metaphor, the big product manufacturers are selling fruit pies, not apple pies (unless, of course, we are talking about Apple).
I like the Propeller precisely because it is a different paradigm. The Propeller 2 may be more capable, but it is also less different. At the end of the day, less different might be exactly what Parallax needs to thrive (I really don't know one way or another, so I'm certainly not making any judgement calls). But it will most certainly result in more comparisons to other products, and we should all accept that reality instead of arguing against it.
It seems like deterministic hub execution will be a little iffy. With the Mix-Master hub memory, only straight-line hub execution will be deterministic. Random jumps will usually end up in another hub slot, which will cause a random number of hub stalls.
Yes, it means a recompile can move code, and slightly change speeds, in unrelated areas, unless a means to hard-snap addresses to a boundary is provided.*
( I think that optimum boundary varies by COG, as the FIFO LSB is COG related)
* Another possible way to handle this, reducing jitter at the expense of speed, is to have an optional FIFO fill mode that always takes 16 clocks after any jump.
Page location then has no effect on delays, but raw code throughput is down on average.
Code where jitter does not matter, would disable that option.
Absolute code placement could still be used, but it is less mandatory.
It's another reason to favour skip-style opcodes, that can avoid multiple jumps (such opcodes also co-operate well with SDRAM and QuadSPI memory).
No, most real-time tasks will not require more than ~512k longs. However, I assume you meant ~512 longs. Many real-time tasks will loop or perform other jumps. If the jump target is not in the correct memory bank there will be stalls. So even a very small real-time task would need to be hand-tweaked to avoid stalls.
EDIT: Perhaps you were implying that all real-time tasks would be small enough to fit in COG memory. That's probably true. My comment was related to hub execution. High level programs compiled to run in hub exec mode will most likely encounter lots of hub stalls. Even straight-line code will have stall issues when it is reading data from hub RAM.
EDIT2: I've never been a fan of the Mix-Master scheme. It will work great for hand-tweaked code, but I don't think it will work so well for HLL compiled code. I think there will be more motivation to use P2 with HLLs, but they won't get the most out of the Mix-Master design. I think this will tend to keep people away from the P2 instead of attracting new customers.
Oops, meant 512 longs, not 512k longs (wouldn't that be nice?). And indeed, I was implying that they should fit in cog memory. I know you were specifically talking about hub RAM, which is why I was asking if most real-time tasks couldn't just avoid it, because for the most part they could fit in the available cog memory.
As far as the new hub access scheme goes, worst-case performance will be similar to that of the original round-robin method but in practice will be somewhere above that. It would seem that you would want to unroll your loops as much as possible. The way I always saw it was that hub execute was for big, non-hard real-time code, but still an option if those tasks don't have to operate at high rates. There's not really any perfect solution when you have to deal with parallel processes all sharing the same bandwidth-limited resource. What would you propose?
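For reference, this is the sort of hand pacing the original round-robin hub rewards on the P1, and roughly what the new scheme is meant to relax: two ordinary 4-clock instructions fit between hub operations without losing the cog's 16-clock hub slot, so once the first rdlong has synchronized, the loop below settles into exactly 16 clocks per pass, one long read per hub rotation, with no jitter.

DAT
              org     0
fast          rdlong  val, ptr        ' hub read; waits for this cog's hub slot
              add     ptr, #4         ' 4 clocks
              djnz    n, #fast        ' 4 clocks (taken), back in time for the next slot
done          jmp     #done           ' park the cog when finished
ptr           long    0               ' hub address to read from (set before launch)
n             long    16              ' number of longs to read
val           res     1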
EDIT2: I've never been a fan of the Mix-Master scheme. It will work great for hand-tweaked code, but I don't think it will work so well for HLL compiled code. I think there will be more motivation to use P2 with HLLs, but they won't get the most out of the Mix-Master design. I think this will tend to keep people away from the P2 instead of attracting new customers.
I'd only agree up to a point - what you say is also true of ASM vs HLL on any MCU - hand tweaked code always works better (but is only needed in rare cases).
The HLL SW tools can help in P2, by making it easy to allocate code to in-COG execute - and that's a strong selling feature in favour of P2.
Even better : such SW tools can improve over time, and are not on the P2 tapeout critical path.
How about we devise a test for it? Even if only as a gedanken experiment.
Let's say you have two single bit digital inputs A and B, and a digital output X.
Your requirement is to compute the Exclusive OR of A and B and output the result in X.
A Prop executes code at 20MIPS or 50ns per instruction.
Let's say it takes 10 instructions to perform the given task. I'm too tired to imagine what the actual PASM required there is. Perhaps someone could offer a solution.
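Something along these lines would do it: a minimal Spin/PASM sketch for the P1, with placeholder pin assignments (P0 and P1 in, P2 out) and the usual 80 MHz clock assumed. The working loop is about five instructions, so the worst-case response lands in the same ballpark as the estimate below.

CON
  _clkmode = xtal1 + pll16x             ' 80 MHz, assumed
  _xinfreq = 5_000_000
PUB main
  cognew(@entry, 0)                     ' one channel; start more cogs for more channels
DAT
              org   0
entry         or    dira, maskX         ' X is an output; A and B default to inputs
xorlp         test  maskA, ina    wc    ' C := state of pin A
              test  maskB, ina    wz    ' Z := NOT state of pin B
    if_c_eq_z or    outa, maskX         ' A <> B  ->  X high
    if_c_ne_z andn  outa, maskX         ' A == B  ->  X low
              jmp   #xorlp
maskA         long  1 << 0              ' placeholder pins
maskB         long  1 << 1
maskX         long  1 << 2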
So, we can produce the correct result X 500ns after the inputs have changed. That's on a P1, perhaps a P2 can do it in 50ns, who knows.
So far so good. I'm very sure an STM32 F4 or whatever newfangled ARM MCU could also do that. Perhaps in a similar software loop. Perhaps even from an interrupt routine.
Now, let's make it harder. Let's say you have to do that task four times over simultaneously. Four pairs of inputs and four outputs.
On the Prop that is easy. Just dedicate 4 COGS to run the same loop. Job done.
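The launch side really is that small. Here's a hypothetical Spin sketch, assuming the XOR cog above is reworked to read its three pin masks from PAR instead of using fixed constants:

VAR
  long pins[12]                         ' 4 channels x (maskA, maskB, maskX)
PUB start4 | i
  repeat i from 0 to 3
    pins[i*3]     := 1 << (i*3)         ' A input
    pins[i*3 + 1] := 1 << (i*3 + 1)     ' B input
    pins[i*3 + 2] := 1 << (i*3 + 2)     ' X output
    cognew(@entry, @pins[i*3])          ' PAR points at this channel's masks

Each cog runs the same loop on its own pins, and none of them can disturb the timing of the others.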
On a single CPU ARM or whatever it is a bit tricky. You have to modify that loop or interrupt routine to deal with four pairs of inputs and four outputs. Perhaps doable.
Now, let's make it harder. At the same time you have to be able to do some other useful processing, which may also have some stringent real time requirements.
On a Prop that's easy, just throw another COG at it. On that ARM it will take work to integrate with that high speed loop, and it will no doubt impact the speed of that loop.
Now let's make it harder again. I want my code and your code to run together without impacting the timing of either AND without having to hack around with either code to make the system work. Mix and match of reusable components. I don't even want to know or care about the timing requirements of your code.
So tell me. Which is the most "highly deterministic" solution?
I kind of disagree with Seairth when he says the "Propeller 2 ... is also less different." Seems to me that with 16 COGS to do what I describe above it's twice as different as the P1 was:)