The "shiny bauble" of the P2 seems to have had us all mesmerized for the past year or two.
Yes it has!
And the improvements made after Thanksgiving created a spec that was very compelling. So compelling that I started designing a product around it that I have wanted to do for a very long time. Creating something Parallax needs... a potential NEW customer.
I still have hope and faith that, in the end, Chip will come up with something that is leading edge and also practical. So today I'm spending the day doing the tedious pad-to-pad testing of my first boards for this design...
Anything that is not code compatible and nearly identical in operating theory to the P1 is not a P1 variant. I am not in favor of tearing stuff out of the P2 to make it more energy efficient.
And I certainly don't want to see any new features added to the P1; the features it already has are just fine. If you want more features, then in my mind you are talking about a P2, whose design is still open.
Chip's proposed 16 Cog P1 with more RAM would be powerful enough to do lots of things that we can't do now. In my opinion, it wouldn't compare with a P2 at any level, but that is not to say that it wouldn't improve the P1 by about an order of magnitude and still be best in class.
I even disagree with Chip about the usability of a 4 Cog P2. Because of the hardware design, most of the drivers I use now, which are happiest in their own cog, would all fit nicely into one P2 cog; certainly everything that I do on a regular basis would.
I understand that some of the visions about what the 8 Cog P2 could do can't be reasonably implemented with 4 Cogs, and that to really get the best out of the architecture you really need 8 cogs. But jeesh, if that kind of functionality is the end goal and the only one that inspires people, then build a board with two 4 Cog P2's, spiff up the hardware just a little so that all the cogs look the same in software, and charge just a little more for the board - about $9 more. Fix the hardware so the cooperation between the two 4 Cog P2's doesn't require a cog of its own (so that people don't complain that you are really only getting a 6 Cog P2).
If you fix up the 4Cog P2 to allow it to seamlessly cooperate with its twin, then 3rd parties can easily offer variants of that design directed at all kinds of problems that the P2 now approaches but does not quite solve.
Here is something that I have no right to say… but if I understand things halfway… it seems to me that an opportunity is sitting there in the Trace hardware… that seems a perfect place to go back to and elaborate a little.
...build a board with two 4 Cog P2's, spiff up the hardware just a little so that all the cogs look the same in software...
Do you have any suggestions as to how that might be done?
How are those two chips going to allow access to each other's HUB RAM? How are they going to give access to each other's pins? What about locks? And so on.
Heck, if that were so simple, we would only need a very small one-COG chip and just put as many on the board as we need.
Devices like the XMOS xcores and the old Transputers can indeed divide the code of a single program over many processors, even when those processors are on different chips. However, they do so by making some rather rigid rules in their programming languages. For example: you cannot share memory between threads; threads can only communicate via "channels". And so on.
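Just to make that "channels only" style concrete, here is a rough C sketch of what such rules force on you: two processes that share no memory at all after the fork and can only pass data through a pipe acting as the channel. This is plain POSIX C for illustration, not actual XMOS XC code.

/* "No shared memory, channels only": the two processes created by fork()
 * share nothing afterwards; the pipe is the only way data moves between them. */
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int chan[2];                              /* chan[0] = read end, chan[1] = write end */
    if (pipe(chan) != 0) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {                           /* child: the "worker" node */
        long value;
        if (read(chan[0], &value, sizeof value) == (ssize_t)sizeof value)  /* blocks until a message arrives */
            printf("worker received %ld over the channel\n", value);
        _exit(0);
    }

    long msg = 42;                            /* parent: the "producer" node */
    write(chan[1], &msg, sizeof msg);         /* send over the channel, no shared state */
    waitpid(pid, NULL, 0);
    return 0;
}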
I would be thrilled with a 4 cog P2 with some kind of built in data sharing between chips.
Even if it was as simple as a dedicated serial line that just shares the values of some data exchange blocks in the background. For my process control apps that could be as simple as each chip having 4-8 LONG values that all other P2s on the net could see, broadcast in a round-robin format. I could use that, along with flag bits contained within, to communicate and coordinate!
That is exactly how a lot of the large automation systems work.
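To make that concrete, here is a very rough C sketch of what such a data exchange block and its background round-robin broadcast might look like. Everything here - the block layout, the 8-long size, the MAX_CHIPS limit, and the uart_send_long()/uart_recv_long() helpers - is made up for illustration and is not any real Propeller API.

/* Hypothetical shared "data exchange block" scheme: every chip owns one block
 * of longs and broadcasts it round-robin on a shared serial line; every chip
 * also keeps a local copy of every other chip's block. */
#include <stdint.h>

#define MAX_CHIPS   8              /* fixed upper limit keeps the table small */
#define BLOCK_LONGS 8              /* 4-8 longs per chip, as suggested above  */

typedef struct {
    uint32_t data[BLOCK_LONGS];
} exchange_block_t;

/* Copy of everyone's block; volatile because a background task/cog updates it. */
static volatile exchange_block_t net_table[MAX_CHIPS];
static uint8_t my_id;              /* this chip's slot on the shared line */

/* Placeholder serial helpers - stand-ins for whatever UART driver is used. */
void     uart_send_long(uint32_t value);
uint32_t uart_recv_long(void);

/* Background task: when it is our turn, broadcast our block; otherwise listen
 * and update the local table for whichever chip is talking. */
void exchange_task(void)
{
    for (uint8_t turn = 0; ; turn = (uint8_t)((turn + 1) % MAX_CHIPS)) {
        if (turn == my_id) {
            uart_send_long(my_id);                        /* frame header: who is talking */
            for (int i = 0; i < BLOCK_LONGS; i++)
                uart_send_long(net_table[my_id].data[i]);
        } else {
            uint8_t sender = (uint8_t)uart_recv_long();   /* header from the current talker */
            for (int i = 0; i < BLOCK_LONGS; i++) {
                uint32_t v = uart_recv_long();
                if (sender < MAX_CHIPS)
                    net_table[sender].data[i] = v;
            }
        }
    }
}

On real hardware the broadcast loop would live in its own cog or be timer driven, which is exactly the "in the background" part above.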
The most obvious outcome would, of course, be to do both. Developing a 65nm P2 will take time and cost money. Developing the PX16X32B would be relatively quick and raise money - which could then be used to fund the 65nm P2.
In the end, Parallax would have a family of three superb chips that all have quite a long life ahead of them.
Ross.
A PX16X32B will cost money (actual and opportunity costs due to P2 delays) and will HOPEFULLY RAISE money.
Carrying three chip inventories? What is the additional cost of that?
Parallax currently has 52,753 chips in 4 different packages. Assuming $3 per chip cost to Parallax, that is over $158,000 in inventory (52,753 x $3 = $158,259) that they are carrying. I don't know when they last did a production run, so these could be high or average inventory numbers.
Assuming these are average numbers based on yield from production runs and other factors.
You've just added a few hundred thousand dollars to Parallax's inventory burden. I don't think California taxes inventory, but if it does, this is another concern.
How much ESD storage space do they have? (This could be an additional one-time expense, but probably relatively small if they have the space.)
Divergent documentation, the cost to develop tools (many are open source but there is still a cost), training FTEs (time that takes away from something else), and more little things that are COSTS to Parallax.
As others have pointed out, it may just shift money from the P1 pot to the PX16 pot.
Since you are still up, could the TRACE hardware be modified to enhance communication between two 4 COG P2's? How hard would this be, and how much extra real estate would it take?
Do you have any suggestions as to how that might be done?
Based on past experience with automation systems, you could do something like this via user-defined software looking at the shared variables.
Each chip broadcasts 4 long values in order, round-robin, over a shared serial line. Each chip reads in said values (in the background) and stores them in a simple table (you could set a max number of chips allowed to simplify the data storage).
Let's say chip 1 needs to send data to chip 2. It first sets an agreed-upon bit in one of its longs to indicate that an exchange is wanted. Chip 2 sets a corresponding 'ok ready' bit in one of its variables. Chip 1 sees that bit set, posts the value to transfer in LONG1, and sets a 'data there' bit. Chip 2 reads the data and looks at that value to see what the command is. Based on the received value they handshake a data exchange. This could be as simple as passing an analog value and a pin number to update, or as complex as a complete script command string - all based on agreed-upon command (software) routines.
For simple control, just using one or more longs as state-machine flags would allow a lot of coordination between chips.
Would it be uber fast for large data transfers? No, but for real world applications it would be fine for sharing most data in a usable time frame.
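Continuing the made-up C sketch from earlier in the thread, the flag-bit handshake between chip 1 and chip 2 might look roughly like this. The FLAG_* names are invented, and net_table[]/my_id come from that earlier sketch; a real version would also clear the flags after each exchange so the sequence can repeat.

/* Hypothetical flag bits packed into long 0 of each chip's exchange block. */
#define FLAG_REQ  (1u << 0)        /* "I want to send you something"      */
#define FLAG_ACK  (1u << 1)        /* "OK, I'm ready to receive"          */
#define FLAG_DATA (1u << 2)        /* "the value is now posted in long 1" */

/* Sending side (chip 1): publish one value for a peer chip to pick up. */
void send_value(uint8_t peer, uint32_t value)
{
    net_table[my_id].data[0] |= FLAG_REQ;           /* ask for attention                */
    while (!(net_table[peer].data[0] & FLAG_ACK))   /* wait for the 'ok ready' bit      */
        ;                                           /* (table updated by exchange_task) */
    net_table[my_id].data[1]  = value;              /* post the payload in LONG1        */
    net_table[my_id].data[0] |= FLAG_DATA;          /* flag 'data there'                */
}

/* Receiving side (chip 2): wait for a request from a peer and fetch the value. */
uint32_t receive_value(uint8_t peer)
{
    while (!(net_table[peer].data[0] & FLAG_REQ))   /* wait for a request        */
        ;
    net_table[my_id].data[0] |= FLAG_ACK;           /* signal 'ok ready'         */
    while (!(net_table[peer].data[0] & FLAG_DATA))  /* wait for the payload      */
        ;
    return net_table[peer].data[1];                 /* the value posted in LONG1 */
}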
Absolutely not!!! I don't understand at least half of what is new in the P2. But I'm not sure that sharing HUBRAM should be a deal breaker. There will be more HUBRAM available.
If an executive function needs access to a shared memory space with a driver, then it would by definition (?) need that access at a lower bandwidth than the driver it is sharing memory with (?). I'm not a language guy… how this complicates language development is beyond me, but in terms of cog execution and functionality at the cog level, I don't see a huge problem. With the benchmarks currently on the table… the higher level language functions could take quite a hit and still achieve landmark functionality.
Since you are still up, could the TRACE hardware be modified to enhance communication between two 4 COG P2's? How hard would this be, and how much extra real estate would it take?
Thanks
Rich
Hi Rich
You could use the pin transfer (SETXFR) in 16-bit mode and burst data into another P2's AUX RAM, using a similar method to Chip's SDRAM driver with a tweak or two.
At 80MHz and 16 bits per clock, that's 160 megabytes/sec of transfer. As long as you limited your packets to fit in AUX, it should work pretty well.
I/O counts need to be looked at. My feeling is that with external RAM connections (which are needed to compete in the memory race) we need at least 32 I/O free. With the lower cog count I would hope that we could fit that in, even if the external RAM I/O is digital only...
It is Chip's chip, so we will play with whatever he wants to make :-)
I know, and agree. I raised that concern, but apparently due to thermal envelope concerns, package cost, thermal pad size, etc., Chip really wants the 64 I/O TQFP-100 package for a P1E variant or 4 cog P2 - one or the other, Chip's choice.
Which leaves 20 pins after 16 bit SDRAM is added.
Frankly, I'd just make the P2 as designed with 8 cores, 92 I/O, characterize it, and provide a nice table something like: (made up numbers for illustration purposes only) ... as it would most likely get to market fastest.
20MHz P2-8C-256K 0.1W typical, 0.2W maximum
40MHz P2-8C-256K 0.2W typical, 0.4W maximum
80MHz P2-8C-256K 0.4W typical, 0.8W maximum
100MHz P2-8C-256K 0.5W typical, 1W maximum
120MHz P2-8C-256K 0.6W typical, 1.4W maximum
140MHz P2-8C-256K 0.7W typical, 1.7W maximum
160MHz P2-8C-256K 0.8W typical, 3W maximum
Typical figures obtained running XXXXXX on 8 cogs
Maximum figures obtained running YYYYYY on 8 cogs
NOTE: Extreme worst case simulation result is 8W power consumption at 180MHz with all logic on all cogs active
Thermal management is strongly recommended above 1W
- heatsink at ... W
- heatsink and fan at ... W
- overclocking over 160MHz not recommended without extreme cooling measures; if you must, use a Peltier element, thermal paste, heatsink, fan, or water/liquid nitrogen cooling. Voids warranty.
THAT, I believe, is what engineers would expect and want before PCB/product design. Forewarned is forearmed.
I never expected the level of panic over the power envelope! (Chip warned a long time ago of 1W-2W typical power consumption.)
I was quite shocked and disappointed at it.
I even showed how a 180nm Celeron, with roughly twice the overall integer performance of a P2 at 160MHz running eight cogs, maxed out at 60W-66W.
I don't recall seeing a Celeron burn or cause a fire, and I used to overclock them like CRAZY! (2x - 3x spec)
p.s.
Engineers tend to be smart people.
It would be very difficult, if not impossible, to make the P2 or the PCB it is on catch fire.
Frankly, I suspect it is impossible (see my discussion with tonyp12, and new catch fire thread)
I/O counts need to be looked at. My feeling is that with external RAM connections (which are needed to compete in the memory race) we need at least 32 I/O free. With the lower cog count I would hope that we could fit that in...
I completely agree. It is Chip's deal and I will support whatever he does as best I can. I also believe that the P2 is viable with proper engineering documentation. You have flexibility at low power and extreme power if you need it and do a proper design to use it. This is not a 'mobile' chip, and people pushing for it to be power sipping at full throttle are killing an amazing design.
"With great power, comes great responsibility (to properly engineer)".
The power usage of the current P2 design is part of why I'm so excited about the P1b and its development. Add enough IO, RAM, speed, etc. to it so it remains competitive in the "mobile" battery powered space, and use the bigger P2 in situations where power consumption isn't a real concern. Again, I spent my EE days involved with industrial process controls, and a programmable control board with a fraction of the capabilities of the P2 consumed 5+ watts. In those situations capability and reliability were what mattered. The 5W P2 can open up other markets for Parallax in the industrial control world! Like a power supply that outputs 1.8V at 2-5 amps! :-)
Ah! A fellow frequent hard hat steel toe boot wearer!!!!
Went into IT as a sys admin ~17 years ago for the money, but I have to admit I miss the work. When I got out of school I went to work for a company and spent a lot of time designing control systems for new applications or, more frequently, replacing obsolete control systems with "modern" electronics. In the late '80s I got a RadioShack Color Computer 2 and noticed that the cartridge connector had all but one of the pins coming from the CPU (6809), and had the idea of using the processor board from the CoCo as a control board. We made a custom board that plugged into the expansion port, and soon we were selling over 100 a month! Those were fun days. Of course, looking back at the kind of things that I did back then, I have no idea how I did some of the things I did! Clearly I've forgotten more than I remember!! :-)
In the late '80s I got a RadioShack Color Computer 2 and noticed that the cartridge connector had all but one of the pins coming from the CPU (6809), and had the idea of using the processor board from the CoCo as a control board.
The COCO2 was cool like that! RS had the tech document book and a good assembly book for it. I designed and built an EPROM programmer for 2716/32's that plugged into it.
I trashed all that stuff in the 90's, wish I had kept some of it.
C.W.
The CoCos were great machines and I'm glad I kept mine. About 9 years ago I did a little looking online and found that a local CoCo club was holding a CoCoFest in the Chicago area! At those fests CoCos are cheap and easy to come by!
Back then I would have killed for a propeller to use as a controller!
Comments
Not all of us.
No, that's true - some of us have made good use of the P1 in this period!
Ross.
Yes it has!
And the improvements made after Thanksgiving created a spec that was very compelling. So compelling that I started designing a product around it that I have wanted to do for a very long time. Creating something Parallax needs... a potential NEW customer.
I still have hope and faith that, in the end, Chip will come up with something that is leading edge and also practical. So today I'm spending the day doing the tedious pad-to-pad testing of my first boards for this design...
And I certainly don't want to see any new features added to the P1; the features it already has are just fine. If you want more features, then in my mind you are talking about a P2, whose design is still open.
Chip's proposed 16 Cog P1 with more RAM would be powerful enough to do lots of things that we can't do now. In my opinion, it wouldn't compare with a P2 at any level, but that is not to say that it wouldn't improve the P1 by about an order of magnitude and still be best in class.
I even disagree with Chip about the usability of a 4 Cog P2. Because of the hardware design, most of the drivers I use now, which are happiest in their own cog, would all fit nicely into one P2 cog; certainly everything that I do on a regular basis would.
I understand that some of the visions about what the 8 Cog P2 could do can't be reasonably implemented with 4 Cogs, and that to really get the best out of the architecture you really need 8 cogs. But jeesh, if that kind of functionality is the end goal and the only one that inspires people, then build a board with two 4 Cog P2's, spiff up the hardware just a little so that all the cogs look the same in software, and charge just a little more for the board - about $9 more. Fix the hardware so the cooperation between the two 4 Cog P2's doesn't require a cog of its own (so that people don't complain that you are really only getting a 6 Cog P2).
Rich
If you fix up the 4Cog P2 to allow it to seamlessly cooperate with its twin, then 3rd parties can easily offer variants of that design directed at all kinds of problems that the P2 now approaches but does not quite solve.
How are those two chips going to allow access to each other's HUB RAM? How are they going to give access to each other's pins? What about locks? And so on.
Heck, if that were so simple, we would only need a very small one-COG chip and just put as many on the board as we need.
Devices like the XMOS xcores and the old Transputers can indeed divide the code of a single program over many processors, even when those processors are on different chips. However, they do so by making some rather rigid rules in their programming languages. For example: you cannot share memory between threads; threads can only communicate via "channels". And so on.
All in all very different machines.
I would be thrilled with a 4 cog P2 with some kind of built in data sharing between chips.
Even if it was as simple as a dedicated serial line that just shares the values of some data exchange blocks in the background. For my process control apps that could be as simple as each chip having 4-8 LONG values that all other P2s on the net could see, broadcast in a round-robin format. I could use that, along with flag bits contained within, to communicate and coordinate!
That is exactly how a lot of the large automation systems work.
A PX16X32B will cost money (actual and opportunity costs due to P2 delays) and will HOPEFULLY RAISE money.
Carrying three chip inventories? What is the additional cost of that?
Parallax currently has 52,753 chips in 4 different packages. Assuming $3 per chip cost to Parallax, that is over $158,000 in inventory (52,753 x $3 = $158,259) that they are carrying. I don't know when they last did a production run, so these could be high or average inventory numbers.
Assuming these are average numbers based on yield from production runs and other factors.
You've just added a few hundred thousand dollars to Parallax's inventory burden. I don't think California taxes inventory, but if it does, this is another concern.
How much ESD storage space do they have? (This could be an additional one-time expense, but probably relatively small if they have the space.)
Divergent documentation, the cost to develop tools (many are open source but there is still a cost), training FTEs (time that takes away from something else), and more little things that are COSTS to Parallax.
As others have pointed out, it may just shift money from the P1 pot to the PX16 pot.
Sorry, just trying to keep things real.
Since you are still up, could the TRACE hardware be modified to enhance communication between two 4 COG P2's? How hard would this be, and how much extra real estate would it take?
Thanks
Rich
Based on past experience with automation systems, you could do something like this via user-defined software looking at the shared variables.
Each chip broadcasts 4 long values in order, round-robin, over a shared serial line. Each chip reads in said values (in the background) and stores them in a simple table (you could set a max number of chips allowed to simplify the data storage).
Let's say chip 1 needs to send data to chip 2. It first sets an agreed-upon bit in one of its longs to indicate that an exchange is wanted. Chip 2 sets a corresponding 'ok ready' bit in one of its variables. Chip 1 sees that bit set, posts the value to transfer in LONG1, and sets a 'data there' bit. Chip 2 reads the data and looks at that value to see what the command is. Based on the received value they handshake a data exchange. This could be as simple as passing an analog value and a pin number to update, or as complex as a complete script command string - all based on agreed-upon command (software) routines.
For simple control, just using one or more longs as state-machine flags would allow a lot of coordination between chips.
Would it be uber fast for large data transfers? No, but for real world applications it would be fine for sharing most data in a usable time frame.
Absolutely not!!! I don't understand at least half of what is new in the P2. But I'm not sure that sharing HUBRAM should be a deal breaker. There will be more HUBRAM available.
If an executive function needs access to a shared memory space with a driver, then it would by definition (?) need that access at a lower bandwidth than the driver it is sharing memory with (?). I'm not a language guy… how this complicates language development is beyond me, but in terms of cog execution and functionality at the cog level, I don't see a huge problem. With the benchmarks currently on the table… the higher level language functions could take quite a hit and still achieve landmark functionality.
You could use the pin transfer (SETXFR) in 16-bit mode and burst data into another P2's AUX RAM, using a similar method to Chip's SDRAM driver with a tweak or two.
At 80MHz and 16 bits per clock, that's 160 megabytes/sec of transfer. As long as you limited your packets to fit in AUX, it should work pretty well.
Cheers
Brian
64 I/O P2, quad core, after adding 16 bit SDRAM, has 20 I/O left.
8 bit xfer, with strobe in / strobe out, leaves 10 pins on the SDRAM prop, enough for video out and a few I/O's
BUT WE NOW HAVE a ~100MB/sec comm channel to another prop for more I/O! (roughly gigabit!)
AND
We can use 8 bit SDRAM, which would leave 29 I/O free ... far better than 20.
we might be able to use (for a few more pins) 480Mbps USB PHYs
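For anyone checking the arithmetic behind those numbers, as I read them: 64 I/O minus roughly 44 for 16-bit SDRAM leaves the 20 quoted above; an 8-bit transfer path plus strobe-in/strobe-out takes 10 of those, leaving 10 for video and spare I/O. With 8-bit SDRAM it's 64 minus roughly 35, hence 29 free. And 8 bits per strobe at around 100MHz works out to ~100 megabytes/sec, which is about 800 megabits/sec - hence "roughly gigabit".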
I know, and agree. I raised that concern, but apparently due to thermal envelope concerns, package cost, thermal pad size, etc., Chip really wants the 64 I/O TQFP-100 package for a P1E variant or 4 cog P2 - one or the other, Chip's choice.
Which leaves 20 pins after 16 bit SDRAM is added.
Frankly, I'd just make the P2 as designed with 8 cores, 92 I/O, characterize it, and provide a nice table something like: (made up numbers for illustration purposes only) ... as it would most likely get to market fastest.
20MHz P2-8C-256K 0.1W typical, 0.2W maximum
40MHz P2-8C-256K 0.2W typical, 0.4W maximum
80MHz P2-8C-256K 0.4W typical, 0.8W maximum
100MHz P2-8C-256K 0.5W typical, 1W maximum
120MHz P2-8C-256K 0.6W typical, 1.4W maximum
140MHz P2-8C-256K 0.7W typical, 1.7W maximum
160MHz P2-8C-256K 0.8W typical, 3W maximum
Typical figures obtained running XXXXXX on 8 cogs
Maximum figures obtained running YYYYYY on 8 cogs
NOTE: Extreme worst case simulation result is 8W power consumption at 180MHz with all logic on all cogs active
Thermal management is strongly recommended above 1W
- heatsink at ... W
- heatsink and fan at ... W
- overclocking over 160MHz not recommended without extreme cooling measures; if you must, use a Peltier element, thermal paste, heatsink, fan, or water/liquid nitrogen cooling. Voids warranty.
THAT, I believe, is what engineers would expect and want before PCB/product design. Forewarned is forearmed.
I never expected the level of panic over the power envelope! (Chip warned a long time ago of 1W-2W typical power consumption.)
I was quite shocked and disappointed at it.
I even showed how a 180nm Celeron, with roughly twice the overall integer performance of a P2 at 160MHz running eight cogs, maxed out at 60W-66W.
I don't recall seeing a Celeron burn or cause a fire, and I used to overclock them like CRAZY! (2x - 3x spec)
p.s.
Engineers tend to be smart people.
It would be very difficult, if not impossible, to make the P2 or the PCB it is on catch fire.
Frankly, I suspect it is impossible (see my discussion with tonyp12, and new catch fire thread)
I completely agree. It is Chip's deal and I will support whatever he does as best I can. I also believe that the P2 is viable with proper engineering documentation. You have flexibility at low power and extreme power if you need it and do a proper design to use it. This is not a 'mobile' chip, and people pushing for it to be power sipping at full throttle are killing an amazing design.
"With great power, comes great responsibility (to properly engineer)".
The power usage of the current P2 design is part of why I'm so excited about the P1b and its development. Add enough IO, RAM, speed, etc. to it so it remains competitive in the "mobile" battery powered space, and use the bigger P2 in situations where power consumption isn't a real concern. Again, I spent my EE days involved with industrial process controls, and a programmable control board with a fraction of the capabilities of the P2 consumed 5+ watts. In those situations capability and reliability were what mattered. The 5W P2 can open up other markets for Parallax in the industrial control world! Like a power supply that outputs 1.8V at 2-5 amps! :-)
I'm only halfway through reading this thread ----> BUT already with User Name
Before You start another piece of work, give us an instruction list for the current P2,
and how the serial out/in work now,
before we start talking about a NEW PX chip.
THANKS
My alternative is
>
It was posted to Chip.
That quoted one of Your answers to him.
And I like Your answer to this:
Went into IT as a sys admin ~17 years ago for the money, but I have to admit I miss the work. When I got out of school I went to work for a company and spent a lot of time designing control systems for new applications or, more frequently, replacing obsolete control systems with "modern" electronics. In the late '80s I got a RadioShack Color Computer 2 and noticed that the cartridge connector had all but one of the pins coming from the CPU (6809), and had the idea of using the processor board from the CoCo as a control board. We made a custom board that plugged into the expansion port, and soon we were selling over 100 a month! Those were fun days. Of course, looking back at the kind of things that I did back then, I have no idea how I did some of the things I did! Clearly I've forgotten more than I remember!! :-)
The COCO2 was cool like that! RS had the tech document book and a good assembly book for it. I designed and built an EPROM programmer for 2716/32's that plugged into it.
I trashed all that stuff in the 90's, wish I had kept some of it.
C.W.
The CoCos were great machines and I'm glad I kept mine. About 9 years ago I did a little looking online and found that a local CoCo club was holding a CoCoFest in the Chicago area! At those fests CoCos are cheap and easy to come by!
Back then I would have killed for a propeller to use as a controller!