Looking into the future.
TC
Posts: 1,019
What do you think the future will hold for the Propeller?
I was just looking back on the Basic Stamp and its upgrades, and how over the years a small but reliable weed eater (BS1) turned into a top fuel dragster (BS2px) and even learned a different language (Javelin). I can see the same or better changes for the Propeller. I say, “Bring on the upgrades, Parallax, you are doing a wonderful job.”
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
We all make mistakes when we are young………That’s why paste is edible!
Comments
Hardware multipliers - The instructions for this are already present, but are unimplemented in the current silicon.
More Hub RAM - The video support in the COGs is great - but, it could really use more framebuffer space!
More COG RAM - For complex things (like VMs, or USB stacks) 512 instructions is a bit cramped. 1024 or even 4096 is a bit easier to work in.
More IO pins - 32 is nice, but 64 is nicer!
More COGs - with 64 IO pins, you could do more at once, so you'll need more multiprocessing to do it with.
A shared FPU on the hub bus - because some applications could really use fast floating point; even if there's just one unit of it.
More MHz - for handling faster events.
P.S. No, I don't have any inside info; that's just my personal "wish list", backed up by observing this forum and general experience monkeying with stuff...
Fortunately for us, some of these items are already planned, like the multiplier, 64 I/O pins and speed enhancements.
The memory expansion is very dependent on chip area. From the photo in Circuit Cellar, it looks like the HUB RAM
occupies about 25% of the chip area and the COG RAM occupies maybe 20%. Doubling both of those causes a major
increase in chip area for an already large chip. We may have to wait for Parallax's chip foundry to move to whatever
the next finer feature size would be before we'll see this. Similarly for doubling the number of COGs, particularly if
COG memory is also to be doubled. I'm not sure we need a special purpose FPU since there's a very nice floating
point package that runs in 2 COGs (with the optional functions included). If the number of COGs is increased, there
would not be a need for a dedicated function unit and the COGs could be used for other things when not needed for
floating point. If COG memory is increased, the whole FP package could fit in one COG.
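As a rough back-of-the-envelope check of the area argument above, here is a minimal Python sketch. The 25% and 20% figures are the eyeballed estimates from the post; the assumption that RAM area scales linearly with capacity while the rest of the die stays unchanged is mine, and the function name is hypothetical.

```python
# Rough die-area estimate. The two fractions are the eyeballed figures
# from the post; linear scaling of RAM area with capacity is an assumption.
HUB_RAM_FRACTION = 0.25   # "about 25%" of the die
COG_RAM_FRACTION = 0.20   # "maybe 20%" of the die


def relative_area(hub_scale: float, cog_scale: float) -> float:
    """Return the new die area relative to the current die (1.0 = unchanged)."""
    everything_else = 1.0 - HUB_RAM_FRACTION - COG_RAM_FRACTION
    return (everything_else
            + HUB_RAM_FRACTION * hub_scale
            + COG_RAM_FRACTION * cog_scale)


if __name__ == "__main__":
    print(f"Double hub RAM only: {relative_area(2, 1):.2f}x")
    print(f"Double cog RAM only: {relative_area(1, 2):.2f}x")
    print(f"Double both:         {relative_area(2, 2):.2f}x")
```

Under those assumptions, doubling both memories grows the die by roughly 45%, which is the "major increase" the post is worried about.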
Or accept a larger package... (i.e., no DIP - the smallest package being a PGA the size of the die).
>> I'm not sure we need a special purpose FPU since there's a very nice floating point package that runs in 2 COGs (with the optional functions included).
For some things, it's too slow - hence the hardware. I'm thinking of DSP-ish applications, in particular... like, say, soft-radios.
Also, it does not need to have every possible FP op in hardware - just the basics, like add, multiply (and perhaps a multiply-and-accumulate op), sine, root, and exponent.
MSFT is in my blood.
I would like to see a .NET CLR footprint on the Propeller. But, being candid, I must also say that bringing .NET into the mix (in its present mode) will bring some latency in performance, unless inline ASM gets supported. Adding support for VS.NET via the VSIP program is another integration effort.
Just thinking out loud :)
I do agree about more memory, more cogs, faster speed and more I/Os.
Also, it would be nice to have a DIP 64 I/O Propeller that can talk to whatever I hook to it without me doing anything, and listen to my wife when I spend way too much time trying to learn Spin... but then I woke up from my daydream, and there she was, saying something about listening to her.
TC
There's no advantage to putting specialized peripherals in hardware as flexible as the Propeller. That's the whole point of the pChip: to provide a plurality of processors that can be configured via software for any task, without having to rely on an acre or two of single-purpose silicon. Built-in peripherals are more likely to be "wasted", because when they're not used for their design purpose they can't be used at all. A cog can be repurposed at will.
Bit-banging has gotten a bad rap, because it's historically been the method of last resort in slow, poorly-provisioned microcontrollers. But that began to change with the SX and its deterministic instruction flow and raw speed. In the Propeller, the "soft peripheral" concept has matured even further, to where it's preferable to its hardware kin.
-Phil
Mmm... sorry to hear that. Perhaps someone could recommend a good hematologist?
.NET is evil... even MS is moving away from that monstrosity... Oh, yeah - it probably wouldn't fit, either...
Much better just to have a common ABI that all compilers on the platform target; with no run-time thunking or JIT required. Also, by going straight to native binary, the compiler can generate better code, because it can spend much more time and resources on it during compilation, than if it was time and resource constrained by being a JIT.
Now, if you just want a VB or C# style compiler, that's another matter entirely... and much more do-able.
>> listen to my wife
Oh, but you can... didn't you download the WifeInterface.spin module?
My 3.14159...bits,
Marty
OK, how's this for an idea:
How 'bout being able to stack chips (much like PC/104 stacks boards) to get the feature set that you want? Need more RAM? Clip it on! More cogs? Same thing - just clip it on! I/O? Clips on, with the leads coming out the side of the chip on a ribbon cable.
The idea would be to run the hub bus up the stack, and have the bottom chip be a basic prop (with all the existing features of the current prop - 32 I/Os, 32K hub ram, 8 cogs, etc).
Yes - it might be tricky to find a packager willing to do this at the chip level; you might have to go to having a really small PCB for it (BGA on PCB, or even die-on-PCB; and the PCB has pins/headers around its edges - so the whole thing resembles a small Pentium, but with a socket on top); and there might be thermal issues involved - but I don't think these issues would be insurmountable...
Graham
But on to the stacked Propeller, which reminds me of the transputer idea of a decade or two ago, in particular how it was to be vertically scalable by plugging one CPU into the next and so on. So we make DIP headers that can stack on top of each other. The 40-pin DIP seems the best form to start with. Look at the way the Propstick is made, with the EEPROM and comms chip between its legs. By using this (the Propstick) as a base and extending it sideways with two new rows of pins, probably 24 or 28 pins in each row, we have a stacking platform.
Various headers would have pads for SOIC, LQFP and other common SMD ICs. To manage the interconnections, base the whole thing on wire-wrap techniques. The extra pins would provide power to, and multiplexing of, each header. One variation of the header would be like a Propstick but with delay electronics that let it act as a slave to another Propeller, with its (the slave's) I/O pins wired to IDC header pins for another 32 or so I/Os.
This train of thought was prompted by my looking for SOIC-to-DIL adapters, which cost about $20. Add another $30 for the components (Prop, RAM, EEPROM, etc.) and each module ends up costing around $50.
So we could end up with a simple, cost-effective multiple multi-processor stack that trades some of its I/O pins for unlimited scalability. I know this is wild speculation, but would it interest anyone as an experiment or even further speculation?
Frans...
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Read a book, under a lamp and think what it took to get here.
I don't know enough about the innards of the current prop to be able to say if this is possible with the current die... but, if it's not, then it shouldn't be too much effort to modify it so that the requisite lines are available for pinning (er, at least, I think it shouldn't be hard -- I'm really a software guy, not a silicon design guru).
One pin forced high or low because of a fault will disable ALL COGs...
Also, with no way of knowing how many COGs are in the system, there's no way of knowing the timing of HUB memory accesses. (Yes, it can probably be calculated by timing how long a certain number of accesses takes, but that will only be valid until the unit is switched off)
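To make the timing concern above concrete, here is a small Python sketch of how the hub rotation period would stretch as cogs are added. It assumes a strict round-robin hub that gives each cog one access slot in turn, and a slot length of 2 system clocks (which gives a 16-clock rotation for 8 cogs); both the model and the constant are assumptions for illustration, not figures from this thread.

```python
# Illustrative model only: a strict round-robin hub where every cog gets
# one access slot per rotation. CLOCKS_PER_SLOT is an assumed value.
CLOCKS_PER_SLOT = 2


def hub_rotation_clocks(num_cogs: int) -> int:
    """System clocks for the hub to visit every cog once."""
    return num_cogs * CLOCKS_PER_SLOT


def max_wait_clocks(num_cogs: int) -> int:
    """Upper bound on how long a cog waits for its next slot.

    A cog that has just missed its slot waits at most one full rotation.
    """
    return hub_rotation_clocks(num_cogs)


if __name__ == "__main__":
    for cogs in (8, 16, 24, 32):
        print(f"{cogs:2d} cogs: rotation = {hub_rotation_clocks(cogs)} clocks")
```

Code that was cycle-counted against one rotation period would see different hub timing on a stack with a different cog count, which is the portability problem being raised.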
As for how many pins it would take...
Remember that the internal DATA structures are 32-bit, then you need address pins (no, multiplexing Data and Address, as was done on some older 8-bit CPUs, is not a good idea, timing-wise) and control signals...
Then there are the programming issues...
With several chips, you'll have to be able to specify on WHICH chip a program is run if you want it to use the correct I/O-pins.
Timing issues...
Should they all run on the same clock?
Then you need to have the clock among all those pins 'exposed'. Clock signals MUST be stable, which means they're a pain in the @ss to lay down on the PCB. (The further, the more pain)
And yes, they need to transfer the 16x PLL tap. A 'local' PLL in each chip may drift away from the main one, particularly if the chips are some distance apart so that noise is introduced.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Don't visit my new website...
RAM... RAM is good, and in quantity it's great, BUT it also lets the programmer get sloppy with code... I wouldn't let it get out of hand.
Hubs... Hubs can be stopped and started, recycled in effect, so it's not really the hub count that matters, but the RAM contained and shared. I would rather see hub-to-hub communications made easier than more hubs. Maybe a hunk of specialized shared registers that are read-only to all but the source hub. Don't misunderstand me, 8 hubs are great, and 16 hubs could be better, but without some form of inter-hub communication, this could bring performance down elsewhere.
Speed... Yes, speed is a good thing, but I really believe that it could be a downfall, especially when working with devices that require different speeds. Maybe the ability to globally set a master speed, and to impose a slower speed on selected hubs, would be better. In effect, it would reduce power consumption if only the hubs that needed to run at 80 MHz were running there, whereas the HID hubs can run very happily at 1 kHz, thus saving power.
I/O pins... Interfacing with the world will always be limited by the interfaces chosen or created. If the Stamp had 64 I/Os, I'm sure someone would need 128. The need is never-ending and always growing. My only request is that when the pin count goes beyond what can be made in DIP, they make a break out board that will put it into a DIP interface; otherwise a lot of hobbyists aren't going to 'get in' on it.
One thing that has me somewhat concerned is most everyone's "do it in the software" approach. There are applications, interfaces and the like that could most assuredly be used if created in hardware. I'd give up 32 I/Os on a 64 I/O chip for SPI and I2C interfaces to be hardware... the speed alone, and the fact that "it works, period", is worth it.
...
thanks for reading... now it's back to getting caught up with all the posts!
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Propeller + Hardware - extra bits for the bit bucket = 1 Coffeeless KaosKidd
>> One pin forced high or low because of a fault will disable ALL COGs...
For many applications, having the whole machine stop is desirable behavior, instead of having it start doing random bad things... Also, I'm not suggesting that the closed, single-chip prop be done away with; as there are many applications where that's exactly what's needed.
>> Also, with no way of knowing how many COGs are in the system
What makes you think that you wouldn't know how many cogs are in the system? You, as the developer, would know how many of what you attached! For generic software modules, you could have a constant stored in a compile-time environment variable that specified it (which is something that's going to be required for code portability to any future "turboprops" anyway).
>> but that will only be valid until the unit is switched off
OK... now for the stupid question: what on earth would you be doing with that data when the unit is off?
>> As for how many pins it would take...
power - 2
eeprom - 2
reset - 1
osc - 4
I/O - 32
hub data - 32
hub address - 32
hub control - 2
device id - 4
Total - 111 pins. On a 144 pin BGA, the rest can be used for extra power & ground connections.
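The pin tally above is easy to re-check, and to re-run if any line item changes. This is just a sketch that adds up the figures from the list; the dictionary and function names are made up for the example.

```python
# Pin counts copied from the list above; names are illustrative only.
PIN_BUDGET = {
    "power": 2,
    "eeprom": 2,
    "reset": 1,
    "osc": 4,
    "I/O": 32,
    "hub data": 32,
    "hub address": 32,
    "hub control": 2,
    "device id": 4,
}
PACKAGE_PINS = 144  # the 144-pin BGA mentioned in the post


def total_pins(budget: dict) -> int:
    """Sum every line item in the pin budget."""
    return sum(budget.values())


def spare_pins(budget: dict, package_pins: int) -> int:
    """Pins left over for extra power and ground connections."""
    return package_pins - total_pins(budget)


if __name__ == "__main__":
    print("signal pins:", total_pins(PIN_BUDGET))
    print("spare pins: ", spare_pins(PIN_BUDGET, PACKAGE_PINS))
```

With these numbers the total comes to 111, leaving 33 pins of a 144-pin package for extra power and ground.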
>> you'll have to be able to specify on WHICH chip a program
Yes and no... yes, you need to be able to assign a program to a specific cog; but with a unified address space, any cog can reference any resource by a canonical address.
>> Should they all run on the same clock?
Yes! That's a must.
>> Clock signals MUST be stable, which means they're a pain in the @ss to lay down on the PCB. (The further, the more pain)
Yes - but we are only talking about going maybe an inch or so. Keeping the whole package small is important.
>> And yes, they need to transfer the 16x PLL tap.
Yup.
____________________________________________________________________________________________________________________
>> also lets the programmer get sloppy with code
Stupid users are not a good excuse for hobbling a machine. Instead, the IDE should detect this, and refuse to program the chip - "Error: ID10T. Please get somebody smarter than you to do this task."
>> Hubs can be stopped and started
Um... you mean cogs, right?
And no - I can envision having more than 8 things going on at once - hence the requirement for more cogs.
>> inter hub cog communication
That's what hub ram is for. Cog A writes to a variable, and Cog B reads it. For large data-structures (bigger than a word) use a mutex.
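Here is a minimal sketch of that pattern, with Python threads standing in for cogs and an ordinary `threading.Lock` standing in for whatever lock the hardware provides. The class and function names are hypothetical; the point is only that a reader never sees a half-updated multi-field record when both sides take the lock.

```python
import threading


class SharedRecord:
    """Stand-in for a multi-field structure in shared (hub) memory."""

    def __init__(self):
        self._lock = threading.Lock()
        self._a = 0
        self._b = 0

    def write(self, value):
        # Both fields must change together, so update them under the lock.
        with self._lock:
            self._a = value
            self._b = value

    def read(self):
        # Take the same lock so we never observe a partial update.
        with self._lock:
            return self._a, self._b


def run_demo(iterations=10_000):
    """Run one writer thread against a reading loop; return the torn-read count."""
    record = SharedRecord()

    def writer():
        for i in range(iterations):
            record.write(i)

    thread = threading.Thread(target=writer)
    thread.start()
    torn = 0
    for _ in range(iterations):
        a, b = record.read()
        if a != b:
            torn += 1
    thread.join()
    return torn


if __name__ == "__main__":
    print("torn reads:", run_demo())
```

A single variable that fits in one memory access does not need the lock; it only matters once an update spans more than one access.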
>> could be a downfall, especially when working with devices that require different speeds
Er... how? Just throw in some wait states if you must.
As for power consumption, selectively throttling back individual cogs could help; what one would do is just add a clock divider to each cog (but hub-access instructions would still have to run at full speed - so, when one comes up, the divider register just gets held cleared by the instruction, forcing full-speed clock pulses for the duration of the instruction).
>> beyond what can be made in DIP, they make a break out board that will put it into a DIP interface
My idea was that it would resemble a small Pentium with pins on the bottom of the carrier board, but with an expansion socket on top with the hub bus on it. It would definitely be hobby-friendly, as it has thru-hole (or socket-able) pins. The BGA package would be soldered to the carrier board at the factory. The whole package (chip-in-BGA, mounted to carrier board with sockets on it) might only be 1/2" to 3/4" square (which also helps out with the clock distribution issue)...
>> One thing that has me somewhat concerned is most everyone's "do it in the software" approach.
That's what the whole design ethic of the prop is about... and as the toolset matures, getting the exact configuration desired will become a lot easier. As for the speed issues, I don't think that's a problem - remember, the existing prop is running at 80 MHz, and future ones are bound to be faster. You already can do video generation completely in software...
RAM takes up die space, which could be used for other things.
Like in another thread, someone suggested an OOP-type approach. I don't think that would work here. Leave the die space for other things; if you need RAM, add it yourself.
I like the idea of FRAM on the I2C bus, and if they could include a RAM paging system, that would work as well. The best of both worlds?
For me, RAM hasn't been the issue. The issue has been hardware and fast inter cog communications.
The issue being this... it takes TIME to write to hub RAM... then TIME again to read hub RAM... a lot of time, and that time adds up real fast if you're moving a large amount of data across it. A mutex won't make it faster; in fact, at this level, it will slow it down. Inter cog communication is key for quite a few applications.
Wait states consume resources: cog RAM.
I do understand the ethic of the prop, but some things are better left in hardware. Look through the 100+ posts on I2C and SPI. Both protocols are similar in hardware and software: bit-level flip-flops. The idea is that base-level communications shouldn't be software, but hardware.
Don't misunderstand me; I love the fact that I can create my own communications protocol, use an existing one, or even modify something for a hybrid. The Propeller is one of the best "what if" platforms created: feature and resource rich.
Hamerhead, you make good points, but for me, I'd rather see some other things.
Chip Gracey
Parallax, Inc.
>> but would be a very different product for a different market
Yes - it would. Like I said above - I'm not suggesting that the current prop be dropped... not at all.
Chip -
>> Power consumption would skyrocket
Mmm... OK - market it this way: it's the "Northern Edition: keeps you nice and toasty!"
Um - what would it take to come up with some sort of high-speed serial interface? Then there would be many fewer drivers involved... so it should be simpler and smaller, right???
Kaos Kidd -
>> Inter cog communication is key for quite a few applications.
Nearest neighbor connections shouldn't be that hard to do; but a dedicated connection to each cog in the system doesn't scale well; as you are looking at an n-squared problem. For 8 cogs, that's 64 communications channels; for a 16 cog machine, 256; and for a 32 cog machine, 1024!
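For anyone who wants to play with the scaling, here is a small Python sketch. It prints the n-squared figure used above next to two tighter counts (a one-way link for every ordered pair of distinct cogs, and a shared link per pair) and the cost of a nearest-neighbor ring. Which of these matches a real design depends on how the links would be built; the function names are just for this example.

```python
def n_squared(n: int) -> int:
    """The quick upper bound used in the post."""
    return n * n


def directed_links(n: int) -> int:
    """One dedicated one-way channel for every ordered pair of distinct cogs."""
    return n * (n - 1)


def shared_links(n: int) -> int:
    """One bidirectional channel per pair of cogs."""
    return n * (n - 1) // 2


def ring_links(n: int) -> int:
    """Nearest-neighbor ring: each cog links only to the next one around."""
    return n


if __name__ == "__main__":
    print("cogs  n^2  one-way  shared  ring")
    for cogs in (8, 16, 32):
        print(f"{cogs:4d} {n_squared(cogs):5d} {directed_links(cogs):7d} "
              f"{shared_links(cogs):7d} {ring_links(cogs):5d}")
```

Whichever full-mesh count you pick, it grows quadratically, while the nearest-neighbor ring grows linearly with the number of cogs.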
>> and if they could include a RAM paging system
That can be done in software (and is probably best done that way, as you don't want unexpected timing glitches to hit you if you access an unmapped page). A simple bulk copy routine shouldn't be too big...
If the 64 i/o version of the Propeller was packaged as the 32 i/o version this would result in a Virtual Port B.
If this proves useful then perhaps other virtual ports could be added.
Alternatively each cog could have registers that are shared with the cog before and after.
Thus each cog would have 4 registers as diagramed below for cog #4.
1st reg    C3 < C4
2nd reg    C3 > C4
3rd reg          C4 > C5
4th reg          C4 < C5
Separating into R and W would avoid problems with simultaneous access that RW would have (but would use 4 cog ram places).
Perhaps this could be reduced to only 2 registers or even ONE.
1st reg for C4 write to both C3 and C5
same reg for C4 read data written from either C3 or C5 (with possible simultaneous write from C3 and C5 to C4).
Thus one cog memory address would act as 2 registers conserving cog ram space.
Phillip Y.
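The four-register scheme above can be modelled in a few lines of Python. This is only a software sketch of the idea: each one-way register is keyed by (writer, reader), and writes are allowed only between adjacent cogs. The wrap-around (cog 7 next to cog 0) is my assumption, and the class and method names are hypothetical.

```python
class NeighborRegisters:
    """One-way mailbox registers between adjacent cogs.

    Each register has a single writer and a single reader, so there is
    no simultaneous-write conflict. Wrap-around from the last cog to
    cog 0 is an assumption made for this sketch.
    """

    def __init__(self, num_cogs=8):
        self.num_cogs = num_cogs
        self._regs = {}  # (writer, reader) -> last value written

    def _check_adjacent(self, a, b):
        distance = (a - b) % self.num_cogs
        if distance not in (1, self.num_cogs - 1):
            raise ValueError(f"cog {a} and cog {b} are not neighbors")

    def write(self, writer, reader, value):
        """Writer cog stores a value in its register facing the reader cog."""
        self._check_adjacent(writer, reader)
        self._regs[(writer, reader)] = value

    def read(self, reader, writer):
        """Reader cog fetches what its neighbor last wrote (0 if nothing yet)."""
        self._check_adjacent(reader, writer)
        return self._regs.get((writer, reader), 0)


if __name__ == "__main__":
    regs = NeighborRegisters()
    regs.write(4, 5, 0xAB)   # "C4 > C5"
    regs.write(3, 4, 0xCD)   # "C3 > C4"
    print(hex(regs.read(5, 4)), hex(regs.read(4, 3)))
```

Collapsing this to a single address per cog, as suggested above, would mean two writers per register, which brings back the simultaneous-write question the split was meant to avoid.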
A way might be to simply shadow the last (first) 4 (2) longs of cog ram into hub ram, read only for all but the owner.
Prolly not..
Yes, it is...
>> One pin forced high or low because of a fault will disable ALL COGs...
> For many applications, having the whole machine stop is desirable behavior, instead of having it start doing random bad things...
You see, before, this could only be IO-pins, but now you want to expose internal connections.
>> Also, with no way of knowing how many COGs are in the system
> What makes you think that you wouldn't know how many cogs are in the system? You, as the developer,
> would know how many of what you attached! For generic software modules, you could have a constant
> stored in a compile-time environment variable that specified it (which is something that's going to be required
> for code portability to any future "turboprops" anyway).
Trust me, this will cause problems for any third-party machine-code modules which need high-speed access to the HUB memory.
>> As for how many pins it would take...
> power - 2
> eeprom - 2
> reset - 1
> osc - 4
> I/O - 32
> hub data - 32
> hub address - 32
> hub control - 2
> device id - 4
> Total - 111 pins. On a 144 pin BGA, the rest can be used for extra power & ground connections.
How do you know that HUB control is just two pins?
>> you'll have to be able to specify on WHICH chip a program
> Yes and no... yes, you need to be able to assign a program to a specific cog; but with a unified address space,
> any cog can reference any resource by a canonical address.
THAT will only work if the 'extra' chips just contain COGs, and not I/O-pins.
>> also lets the programmer get sloppy with code
> Stupid users are not a good excuse for hobbling a machine. Instead, the IDE should detect this,
> and refuse to program the chip - "Error: ID10T. Please get somebody smarter than you to do this task."
Don't I wish that IDEs had this function...
Preferably also with the ability to control a shaped-charge explosive placed pointing upwards in the user's chair...
> As for power consumption; selectively throttling back individual cogs could help; what one would do is to
> just add a clock divider to each cog (but, hub-access instructions would still have to run at full speed - so,
> when one comes up, the divider register just gets held cleared by the instruction, forcing full speed clock
> pulses for the duration of the instruction).
I'm not certain that would be feasible...
A sudden ramp-up of the speed will cause a surge in power, and it may not be 'entirely stable' soon enough...
> My idea was that it would resemble a small pentium with pins on the bottom of the carrier board,
> but with an expansion socket on top with the hub bus on it. It would definitely be hobby-friendly;
> as it has thru-hole (or socket-able) pins. The BGA package would be soldered to the carrier board
> at the factory. The whole package (chip-in-BGA, mounted to carrier board with sockets on it) might
> only be 1/2" to 3/4" square (which also helps out with the clock distribution issue)...
Still not hobby-friendly...
Ever tried to lay down tracks on a PCB for such ICs?
And what about solderless experimenter boards?
>> One thing that has me somewhat concerned is most everyone's "do it in the software" approach.
> That's what the whole design ethic of the prop is about... and as the toolset matures, getting the exact
> configuration desired will become a lot easier. As for the speed issues, I don't think that's a problem - remember,
> the existing prop is running at 80 MHz, and future ones are bound to be faster. You already can do video generation completely in software...
Doing it in software means that precious silicon real estate isn't wasted on functions you don't need...
Now you CAN have a RS232 port going at 115200 FULL DUPLEX, or two if you need them... Or you can use the pins for I2C, SPI, 1-Wire...
(How many microcontrollers can deliver 2 or more UARTs? Or manage them efficiently?)
I look at the Propeller, then at old schematics of the WLAN at the office, from back when we used leased V35 lines of 64 or 128Kbps...
How difficult would it be to program a Propeller as a 3 or 4-way 64Kbps router?
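For a feel of how much headroom a software serial port has, here is a quick Python sketch of the per-bit time budget. The 80 MHz clock is the figure quoted in this thread; the 4 clocks per instruction is my assumption about a typical cog instruction, and the function names are made up for the example.

```python
SYSTEM_CLOCK_HZ = 80_000_000   # clock speed quoted in the thread
CLOCKS_PER_INSTRUCTION = 4     # assumption: typical cost of one cog instruction


def clocks_per_bit(bits_per_second: int) -> float:
    """System clocks available during one bit time."""
    return SYSTEM_CLOCK_HZ / bits_per_second


def instructions_per_bit(bits_per_second: int) -> int:
    """Whole instructions that fit into one bit time."""
    return int(clocks_per_bit(bits_per_second) // CLOCKS_PER_INSTRUCTION)


if __name__ == "__main__":
    for rate in (64_000, 115_200, 128_000):
        print(f"{rate:7d} bps: {clocks_per_bit(rate):7.1f} clocks/bit, "
              f"about {instructions_per_bit(rate)} instructions/bit")
```

Under those assumptions a cog has on the order of 170 instructions per bit at 115200 baud, and over 300 per bit at 64 kbps, which is the headroom the bit-banging argument relies on.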
I believe that Chip has mentioned that higher speeds are in the works for the next generation, and that the COG-HUB communication is to be improved so that the HUB 'spins' at higher speeds.
Going to 16 COGs may have been mentioned, too...
I agree. I don't see any real upgrades to the Propeller happening until people start running out of computation resources. However, it is possible the I/O may limit what will be done with the chip. With 8 Cogs, the Propeller can do a LOT, but there are only so many I/O pins to attach things to. I am hoping to be able to get a Propeller board next month and am in the process of working my way through the Propeller manual now.
Er... so? I don't see how that's worse?
> How do you know that HUB control is just two pins?
Umm... A little bird told me???
Ok... upon thinking about it; you'd actually need four - read, write, lock, and phase.
>> Yes and no... yes, you need to be able to assign a program to a specific cog; but with a unified address space,
>> any cog can reference any resource by a canonical address.
> THAT will only work if the 'extra' chips just contain COGs, and not I/O-pins.
Um... why? Extra IO pins would be addressed over the hub bus, much like a memory location, in an atomic read-modify-write cycle...
>> Preferably also with the ability to control a shaped-charge explosive placed pointing upwards in the user's chair...
No, no, no - shaped-charges are a one-time thing... what's the IDE supposed to do when the NEXT idiot sits down? Um... wait - I know! What you really want are high-powered lasers fired from on top of the monitor...
> Still not hobby-friendly...
> Ever tried to lay down tracks on a PCB for such ICs?
Yes. It's no harder than laying down tracks for a QFP - because that's basically what it is, except with pins, not surface-mount pads.
> And what about solderless experimenter boards?
That might require a carrier board; or something like the existing demo board or propstick, but set up for the pin layout of the prop stack.