About P1+ Is It Viable ?
pjv
Posts: 1,903
in Propeller 2
Hi All;
I absolutely love the current Propeller, it is so easy to use, and delivers great performance. Also I have great hopes for the upcoming P2, though it seems a bit like chasing a rainbow, albeit getting closer! Chip is clearly working his buns off getting P2 to a conclusion, and adding capabilities as he goes. With considerable and admirable contribution from the forumistas. Well done !
That said, I have for some time been seriously considering developing (with a SMALL "d") the enhanced version of the P1, embodying most, perhaps all of the items on Ken's published wish list. For myself, just the second 32 bit port B and some fuses would suffice. But some other enhancements, if simple to implement could be very desirable, and add commercial benefit to the project.
There also are some more substantial additions I would like such as indirect addressing and more speed. The first might be tough, but Chip figures that moving the P1 to 180 nm technology might up the clock to the 200 MHz range. Nice! But what will the smaller geometry do to static leakage current? Any significant increase over the present few µA will be a killer for my battery apps.
Then there are Cluso's suggestions of 16 cogs. Again nice, but how much are we changing the basic design? The more changes we make, the greater the risk, but the better the commercial reward. So where is the balance..... clearly I would not pursue this dream if the reward cannot be assured.
And in order to assure success, I believe Chip's involvement would be required for at least a short period of time. How long would best be judged by him, once the feature set is determined which in itself would require some cycles from him to assess reasonability.
And then there are the programming tools. They would need to be modified to accommodate whatever silicon changes are made. Who could do that, and at what cost?
So the VERY rough informal estimate I have from others much closer to those details is that a set of 180 nm masks alone will cost $200K. And the current sweet spot for cost vs return is the 60 nm size; more mask costs, but less silicon, so more chips per wafer. I fear however that will cripple any battery operation due to leakage, but still might be a viable product for other applications.
Ken's feeling is that it would be tough to get a positive return on the total investment in less than 5 years.
So, I have not yet abandoned my dream, but also have not committed to complete it. More investigation needs to be done, and so I am soliciting the forum members to add insight, positive as well as negative, into a decision tree to a final position for me.....
I "want" to do it, but does it make sense ??
Cheers,
Peter (pjv)
Comments
Although it seems like the road to the P2 is taking forever, doing the P1+ will delay the P2 at least a little. I think we're close enough to a P2 that a P1+ would not be available significantly sooner ... that's really the crux of this discussion.
Absolutely agreed. I cannot believe Chip's mastery - just a single person with a little input from the forum.
The P2 will be worth the wait for us hobbyists, but I fear way too late for those commercial users who required just those 4-5 items which could easily have been done years ago
The P1V has been modified to have more hub RAM, and more cog RAM, and 64 I/O (in fact 128 I/O has been achieved with removing something). Instructions have been added but not what I think should be in a commercial P1+.
The only risk I can see is that the fuses may not work. If not, then hopefully it's just a matter that we cannot use them. IIRC, if not blown, the default is to ignore them (ie failsafe).
Not sure about indirect addressing complexity. Would AUGD/AUGS work? I had this working at one time.
Yes, static leakage goes up. 160MHz is achievable but 200MHz is uncertain. The P2 has reduced below 160MHz currently, but not because of the features required in a P1+.
Since the P1+ would run 2x speed but still 1:16 hub, LMM speed would double. Perhaps an auto-incremented RDLONG might help, but let's avoid feature creep.
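To put rough numbers on the LMM point, here is a minimal sketch. It assumes the best case of one LMM instruction fetched per hub window, an 80 MHz P1 as the baseline, and a 160 MHz P1+ (the figure called achievable above). The function name and constants are only for illustration.

```python
# Hub access window comes around once every 16 clocks (the 1:16 hub ratio).
HUB_WINDOW_CLOCKS = 16

def lmm_instructions_per_second(clock_hz, clocks_per_hub_window=HUB_WINDOW_CLOCKS):
    """Best-case LMM rate, ASSUMING one LMM instruction per hub window."""
    return clock_hz / clocks_per_hub_window

p1 = lmm_instructions_per_second(80_000_000)        # assumed P1 baseline clock
p1_plus = lmm_instructions_per_second(160_000_000)  # assumed P1+ clock

print(f"P1:  {p1 / 1e6:.1f} M LMM instructions/s")
print(f"P1+: {p1_plus / 1e6:.1f} M LMM instructions/s")
print(f"Speed-up: {p1_plus / p1:.1f}x")
```

Because the hub ratio stays the same, the speed-up is just the clock ratio, whatever the real per-window instruction count turns out to be.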
I have 4 cogs working but not as cleanly as I would like. But Chip is already a master who can change the number of cogs easily, and knows the requirements and the traps. I would say easy as for Chip.
Perhaps Chip could have 8 cogs completely powered off if not used, or each cog powered off if not used. Don't know if this is possible or viable. This is how the phone chips etc get the power down - by dynamically switching off sections of the chip during idle times.
Absolutely.
Since the changes would be minor, I believe they could be done easily just to support the additions without feature creep. I know we all have wanted macros, ifdefs, etc for years. This is not about those, but just about the new instructions and bigger memory. I believe INB/DIRB is already supported (ie 64 I/O). Nothing more.
Spin supports 64 I/O but is limited to 64KB of hub ram/rom. So we can get 64KB of hub RAM for spin without any changes. My faster spin works although not exhaustively tested. It can be made faster again with 4KB cog ram.
BTW PropTool supports compilation of the 4 unused opcodes under their original names. We have used this feature to test new instructions on the P1V.
No idea about costs, as the original estimates from Chip appear to be way too low.
OnSemi cannot do 60nm
Since Treehouse seem to be an OnSemi expert, and they don't have a license for >300K gates, I presume that 60nm would be out of their league. Thus I think <180nm is out of the question.
Note that <180nm seems to be a breakpoint where leakage per gate begins to fall ???
But what about just the incremental cost of doing a P1+ now versus positive return on that P1+ ???
ie consider the investment to date to be sunk on the P2 project that will still continue.
Makes sense to me.
The only real risk I see is that the Verilog -> Treehouse -> OnSemi process fails. The only OnSemi P2 failed due to catastrophic shorts marrying the new pins to the RTL generated masks - human error. I don't believe the P1+ would use this pin outline because it would use OnSemi's standard I/O pin circuits. Chip would need to comment about this one.
And of course real costs involved.
Regarding the power in waitloops, I have that covered with my task scheduler.... threads simply WAIT until the time they are supposed to run. Negligible active power. The static leakage is the main concern here.
Regarding the realistic availability of a P1+, depending on the features, I suspect it could take a year. And as for the availability of the P2, for me it isn't a choice of one or the other; I like the simplicity of the P1 so much, yet wish it had the extra port and fuse features. Any more would be just phenomenal, but perhaps too pricey.
The P1+ would need to justify its own existence, otherwise the investment would not be a wise one.
Cheers,
Peter (pjv)
I can confidently say that's what's referred to as a necessary evil. Powering down has some nasty subsequent power-up delays at the very least. I bet there are plenty of gotchas in there also. Being able to idle at nominal voltage allows for instant action.
I suspected you'd have an interest in this.
As to financial concerns, this would have to be funded outside of Parallax, and I would need to know that it's "worth it". Hence I would want to keep feature creep to a minimum, and just get it done.
That said, and depending on how the + development went, I could have an appetite for a subsequent further development with a somewhat, or perhaps, much bigger "d" to encompass some of the things you and others have put forth. But that would be a discussion for a future time.
Let's see what comes out of this discussion, and see if this first pass is viable.
Cheers,
Peter (pjv)
I presume you mean a P1V+, as there is no incremental editing possible to the original P1.
That's the tricky one.
Keep in mind that fuses are not standard verilog, neither are the Analog PLL's or xtal oscillators or ADC pin-thresholds, all found inside the P1....
Adding any items outside standard verilog, adds more engineering cost, and risk.
Possibly the Xtal Osc + Analog PLL can be side-stepped with the inclusion of a support device like Si5351A. ( $0.8640 / 1k )
MSOP10 package, and uses a compact 24~27 MHz xtal.
Test vehicles need to be chosen for this, and I think a P1 + FPGA module could be released much faster than any silicon flow, and give important code-proof signoffs.
ie someone field or bench testing P1V+, is not testing P2, and that testing resource is something Ken is relying on, to get to a sign-off stage.
I came to this conclusion in a very roundabout way, starting with the P1, running out of pins, building Cluso's 3 prop board, building my own, adding multiplexers to share functions on pins, and finally moving over to FPGA chips which had more pins.
There are some interesting synergies with building retrocomputers on a cyclone IV vs the P1V work. Things such as building minimal bootloaders that have enough smarts to get SD working so more data can be stored externally, hence freeing up more internal ram on the FPGA.
The great thing about the P1V design is that it is possible to build the ideal propeller chip.
So I have been looking at some of the design files. A question for the boffins - what is taking up all the space?
I know the propeller is a different beast to, say, an 8080/Z80, but on the other hand, a cog has a smaller instruction set compared to a Z80. We have a design that has a Z80, internal 'rom', vga driver, sd driver, 4 uarts, keyboard, external ram memory driver, but it still only fills about half of the smallest cyclone IV chip (EP4CE6 which is 6000 LEs).
On page 2 of this thread http://forums.parallax.com/discussion/157850/a-de2-115-propeller-retromachine-1920x1080-enabled/p2 there is a quote from pik33 saying the (?) P1V is 14,000 LEs.
I dunno, maybe this is going off on a tangent, but the high end Cyclone IV and V chips are pretty expensive, and maybe there is a cost effective solution at the lower end of the Cyclone family that could run a stripped down propeller, with just a P1 and more pins and nothing much else?
How small can a fpga propeller go in terms of LEs?
On the Cyclone V, a clean P1V takes 8,320 logic ALMs; 5,543 Registers; 655,360 block memory.
On the Cyclone IV, each video (ie per cog) takes 216 LEs & 160 comb & 152 regs.
From my notes it looks like a clean P1V takes 14,785 LEs & 13,366 comb & 5,430 regs.
Remember, we have 8 cogs/cores/processors, so 14,785/8 = 1,848 LEs per cog.
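The same division as a quick sketch, using only the Cyclone IV figures quoted above. Note it treats the hub and any other shared logic as spread evenly across the cogs, which is a simplification.

```python
# Figures quoted in this thread for a clean P1V on a Cyclone IV.
TOTAL_LES = 14_785  # logic elements for the full 8-cog design
COGS = 8

def les_per_cog(total_les, cogs):
    """Average LEs per cog; shared logic (hub etc.) is averaged in."""
    return total_les / cogs

per_cog = les_per_cog(TOTAL_LES, COGS)
print(f"about {per_cog:.0f} LEs per cog")
```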
The cheapest Cyclone IV is US$11.95 for an EP4CE6E22C8N for qty 1. It is virtually impossible to get a price from Altera for volume. Even Parallax had problems. Yes, we know from China and from the BeMicro series that they buy these FPGAs from Altera way below the quoted prices. And don't attempt to use their config flash devices at around US$16 ea. Alternatives are sub $1.00.
At one stage I did think about making a cheap FPGA board, but there is no way to get competitive pricing on the FPGAs. I am not sure about Xilinx. With my contacts years ago, I was able to source the Xilinx FPGAs for a not too unreasonable price in quantity. BTW I hand configured the FPGA cells with Xilinx software. I used it to interface to the system bus on an ICL mini. Worked beautifully.
Nothing like seeing 10 of your boards plugged into the 19 slot mini computer and connecting over 120 remote PCs and Terminals concurrently, located all over Oz, to it. This was the early 90's.
Depends a lot on how many COGs you choose
IIRC the number above was 14k for 8 COGS or 1750 per COG
Hmm - roughly 1800 LEs per cog. *ponders this*
I know this isn't quite the propeller way of doing things, but it is interesting to look at the tradeoffs of, say, a uart in a cog vs a uart in vhdl/verilog. I suspect the latter might use fewer resources?
There is a photo of the cyclone IV board I'm using here http://www.smarthome.jigsy.com/fpga
That won't fit a full P1V, but it would fit a P1 with one, maybe two cogs, and then do video/keyboard/sd/uarts/external ram in vhdl. Hub could be internal or external ram or a hybrid of both.
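A rough way to sanity-check that cog count, as a sketch only. The device size (6000 LEs) and the per-cog figure (14,785/8) come from numbers quoted in this thread. The 3000 LE peripheral budget is an assumption, loosely based on the Z80 design filling about half the chip, and that design includes the Z80 itself.

```python
DEVICE_LES = 6_000      # EP4CE6 size as quoted in this thread
LES_PER_COG = 1_848     # 14,785 LEs / 8 cogs, hub logic averaged in
PERIPHERAL_LES = 3_000  # ASSUMPTION: about half the device for video/sd/uarts/etc.

def cogs_that_fit(device_les, per_cog_les, reserved_les=0):
    """Whole cogs that fit after setting aside reserved_les for other logic."""
    available = max(device_les - reserved_les, 0)
    return available // per_cog_les

print(cogs_that_fit(DEVICE_LES, LES_PER_COG))                  # nothing else on chip
print(cogs_that_fit(DEVICE_LES, LES_PER_COG, PERIPHERAL_LES))  # with peripherals
```

With nothing else on the chip the estimate is 3 cogs, and with the assumed peripheral budget it drops to 1, which is in line with "one, maybe two".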
In terms of how people might build up a propeller project, grab a chip, grab objects from the obex and join them together with some glue code, there is a rather similar way of doing this with grabbing vhdl/verilog code and plugging it together with some glue code.
The EP4CE6 boards are under $25US. Cyclone II boards are cheaper again, but might only fit one cog, plus not sure if the latest version of Quartus actually can program cyclone IIs any more.
Playing around with P1V ideas on FPGAs is certainly going to be cheaper than getting chips fabricated...
You can choose the cogs, memory, peripherals, pins etc, and fit it to a wide variety of off the shelf boards. Even better you can inspect the verilog to see how it brought it all together
This FPGA price curve is one reason I favour using P1 and a Small FPGA.
The P1 represents a low cost per COG, but there are some things it simply cannot do, which is where a 'better COG' (or two) comes into play.
eg If you need high performance quadrature counting, the FPGA comes into its own.
A FPGA would also allow more advanced serial memory support.
Of course if someone has the time and the cash they can take the open source verilog and get whatever chip made from it they like...by way of proving the viability.
But that call is down to Ken and Chip, as they would have more of an idea of the costs and projected sales that both solutions would come to.
So I am hoping the success of the P2 will eventually fund a P1+.
And then, both the P2 and P1+ might drive Parallax prosperity for years to come.
For the interim, the P1V code on FPGA is available and working well.
Edit: I just get to play with the toys!
P1+ takes Chip 1 month
P1+ has negligible tool changes and does not require Chip's time (probably free to Parallax)
P1+ takes 3 months to real silicon for sale, so mid February 2016
P1+ sales are already there waiting for a better P1
P1+ is defined so commercial users will be ready to use, hence sales
P1+ costs say $200K (I have no idea)
P2 is delayed 1 month (Chip occupied with P1+)
P2 will have a little pressure relieved by the P1+ availability
P2 still has Chip adding video features
P2 has smart pins to do. Not started. Realistic absolute minimum 3 months, probably more!!!
P2 has debugging and testing - many gotchas still being discovered - few testing
P2 has to be heat and speed profiled, followed by another round of tweaks
P2 requires Treehouse to combine the Verilog/RTL with the hand laid I/O pin logic (or however that is going to be done)
P2 requires a lot of documentation. How long is this going to take???
P2 requires new tools. How long is this going to take??? How much will it cost Parallax???
P2 to silicon for sale... at least 9 months absolute minimum
P2 costs >>$200K + tool change costs $??? - again no idea, but definitely way more than P1+
P2 earliest silicon for sale mid August 2016 (providing first silicon works)
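The dates in the list above follow from simple month arithmetic. Here is a sketch, assuming a mid-November 2015 start (implied by "3 months ... so mid February 2016", not stated outright). The helper name is made up for illustration.

```python
from datetime import date

def add_months(d, months):
    """Shift a date by whole months, keeping the day (fine for days <= 28)."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    return d.replace(year=year, month=month0 + 1)

START = date(2015, 11, 15)          # ASSUMPTION: implied start of work
p1_plus_sale = add_months(START, 3)  # P1+: 3 months to silicon for sale
p2_sale = add_months(START, 9)       # P2: 9 months absolute minimum

gap_months = (p2_sale.year - p1_plus_sale.year) * 12 + (p2_sale.month - p1_plus_sale.month)

print("P1+ for sale:", p1_plus_sale.strftime("%B %Y"))
print("P2 for sale: ", p2_sale.strftime("%B %Y"))
print("Head start for P1+:", gap_months, "months")
```

These are the estimates as posted, not a prediction. Change the two durations to see how quickly the P1+ head start shrinks or grows.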
P1+ will have a different market to P2, so both will have their rightful marketplace. P1 will continue because it uses less power, etc.
How many P1+ chips can Parallax sell in the 6 months until P2 could be ready ???
How much ROI could Parallax recover from P1+ sales in 6 months ??? and then it's ongoing revenue in parallel with P2 sales !!!
So, the only real question is can Parallax fund the extra P1+ short term? Ken ???
Or could it be crowd funded ???
Downside for Parallax is the possible perception that Parallax cannot afford to fund the P1+ which might scare commercial purchasers of Parallax products. This needs to be considered by Ken & co.
I am going to start a separate thread for crowd funding the P1+
P1+ is even less defined than P2, and any silicon release is going to need extensive FPGA testing, before sign-off.
P1V has had more thorough testing than any of the P2 stuff!
Any argument against the P1V/P1+ just makes P2 look more like a pipe dream.
Some points:
Personally, I am not in favor of crowdfunding a project such as this, and rather wish that concept had not been suggested. The effort to raise funds that way, and doubts about success, just put the project back a bunch of time.
I believe your time estimates are quite a ways out. I think the project would take a year before first operational silicon could be released.
It would be a given that Parallax would need to bless the development, and that Chip's involvement would be minimal.... advisory capacity only. Other professionals would need to do the actual implementation.
Features beyond the current P1 would need to be minimal or easy (cheap) to implement. I'm happy with PortB and some code security.
Regarding P2, I believe that, judging by where it's at today, we are still 2 years out from being able to buy a chip.
If I can get concurrence from Parallax, I plan to move this development one step forward.
Cheers,
Peter (pjv)
- Cluso convinces Chip that he only needs to spend 1 week, no, 1 month on P1+
- Chip is convinced by Cluso's projections and stops working on the P2
- After 3 months Chip is still working on integrating the P1V core with the I/O portion of the chip
- People on the forum realize that the P2 will be delayed even further and lobby for some P2 features to be added to P1+
- After 6 months everyone becomes so frustrated by the P1+ diversion that customers quit buying Parallax products
- After 9 months the P1+ design is still not completed and Parallax cancels all chip development projects
Cluso, stop this P1+ pipe dream.
- P1V is viable.
Now...
- A number of us have made and tested most of the extensions required by P1+.
- I was sure I clearly stated that the I/O will be standard OnSemi digital pins but I see that apparently I didn't
-- I just asked Chip in case the P2 I/O is worth adding
- A long time ago Chip gave timing estimates to get the prototype silicon. These estimates were valid; it was just the last chip prototype run that failed, and that failure is now clearly understood.
I don't believe Chip needs to spend more than a week of his time on P1+, but I bowed to pressure and made it a month.
There will be some parts that he would need to do for P2 that may come a little earlier if there is a P1+, but that will be offset by later P2 time savings.
Dave,
If everyone supported my stance (and a few others) a couple of years ago that we needed a P1+, we would have had P1+ year(s) ago, Parallax would be generating a revenue stream from P1+, and Chip would still be doing the P2.
If you are right and the P1+ takes 9 months then the P2 cannot possibly be done in 9 months either!
pjv,
This isn't the first time crowd funding has been mentioned, and it certainly won't be the last. If you are wanting to fund it, absolutely fantastic!
If I was in the same financial position as 15 years ago, we wouldn't even be having this discussion. I would have built it for Parallax soon after the P1V was released.
However, it is now 15 months later and I believe Chip and the rest of us should be focusing on completing the P2 without being sidetracked by diversions like a P1+.
A leap to P1(plus/minus) silicon seems ill conceived. You first walk, before you run.
(minus applies because any cheapest Verilog P1 has to lose some P1 features)
I believe a smarter discussion is what FPGA modules can be used/created for development of P1V.
P1V+/- silicon is a long way off, and there is a market for modules.
Possible candidates (sub $100):
Compact Form factor : DIPFORTy1 "Soft-Propeller" €59.00 :
http://shop.trenz-electronic.de/en/TE0722-01-DIPFORTy1-Soft-Propeller
( 28K Logic Cells )
Good form factor, but shows 0 stock ?
Arduino Shield option form factor : DE0-Nano-SoC Kit $99, with
http://www.terasic.com.tw/cgi-bin/page/archive.pl?Language=English&CategoryNo=167&No=941
(40K Logic Elements)
The latter one uses a CV A4 5CSEMA4U23C6N, so should manage some P2 testing too.
Smaller FPGAs, lower cost, 1,2,3 COG region, and these would be used with P1 on module.
LCMXO3LF-6900C-S-EVN MachXO3LF CPLD 6864 LUTS $25.50 P1 Piggyback needed.
LCMXO2-7000HE-B-EVN MachXO2 CPLD 7000 LUTs $25~$29 P1 Piggyback needed.
or devices possibly iCE5LP 3520 LUTS - new, and currently lacks a 48p variant breakout board.
iCE5LP is low cost (OTP & RAM) and has a QFN48 7 x 7 mm package.
Others ?
Crystal Oscillators and Analog PLLs and RC oscillators and BOD are non-verilog cells.
Even mid-threshold pin use for ADC is ?? in FPGA or Verilog.