hicount_0 := cnt64_high
locount := cnt64_low
while ( (hicount_1 := cnt64_high) <> hicount_0 )
    hicount_0 := hicount_1
    locount := cnt64_low
' now hicount_1 and locount are consistent (no rollover problem in hicount)
If two successive reads of the high-order bits are the same, then the low-order bits read in-between cannot have caused a roll-over.
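To see the high/low/high trick in action, here is a small Python simulation. It is only a sketch: the `SplitCounter` class and its "advance one tick per register read" behaviour are made up for illustration and are not a model of any real Propeller register.

```python
class SplitCounter:
    """Simulated 64-bit counter that can only be read as two 32-bit halves.

    Assumption for illustration: the counter advances by `step` ticks on
    every register read, which mimics time passing between accesses.
    """

    MASK32 = 0xFFFFFFFF
    MASK64 = 0xFFFFFFFFFFFFFFFF

    def __init__(self, start, step=1):
        self.value = start
        self.step = step

    def _tick(self):
        self.value = (self.value + self.step) & self.MASK64

    def read_high(self):
        high = self.value >> 32
        self._tick()
        return high

    def read_low(self):
        low = self.value & self.MASK32
        self._tick()
        return low


def read_consistent(counter):
    """Read high, low, high; retry until both high reads agree."""
    high_0 = counter.read_high()
    low = counter.read_low()
    high_1 = counter.read_high()
    while high_1 != high_0:
        # The high word changed, so the low word may belong to either side
        # of the rollover. Re-read the low word and check again.
        high_0 = high_1
        low = counter.read_low()
        high_1 = counter.read_high()
    return (high_1 << 32) | low


# Start one tick below a low-word rollover so the first attempt sees the
# high word change between its two reads.
start = 0x00000001_FFFFFFFF
counter = SplitCounter(start)
value = read_consistent(counter)
print(hex(value))
```

A naive "read high, then read low" on the same starting state would pair the old high word with the new low word and report a value roughly 2^32 ticks in the past.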
OK! Nick McClick's info on 'jazzed's SDRAM board (thread NEW: SDRAM Module for Propeller Platform) explained a lot for me.
But his SDRAM part isn't a '~$3' part; more like $6+ in 1000's quantity. I wonder if Chip was thinking of Parallax's getting many thousands qty pricing?
Anyway, now I realize the 'D' in SDRAM implies 'dynamic', and I now have one part number to work from.
If anyone knows the part number of Chip's '~$3' SDRAM, please let me know!
I looked this part up. It is Qimonda's HYI25DSD512160CE-5, which is organized as 64M x 16 bit width. It was $6, and there was a note "Digi-Key has discontinued this item, limited quantity available". I don't know if other distributors have also done so, but I don't think I'd want to design around this part.
I'd guess a dynamic, rather than a static, RAM might imply a bit of extra hardware like a latch for holding address info. If totally not requiring extra hardware, hopefully Parallax might provide info on the external RAM hook-up before Prop 2 is available. I'd guess many of us would like a lot more info to get started on designs using Prop 2.
Will the SDRAM stuff that Chip has discussed for the Prop 2 work with DDR memory? DDR signalling is different...
The DDR signalling I've looked at is very similar to SDRAM. The command signal set is identical. The mode register settings are different to allow data access window tuning. Data is accessed on both clock edges for DDR and one edge for SDRAM.
Today's Propeller would not benefit from double data clock edges.
I would like more info too, but I doubt it will be much different from today. There is a special SDRAM clock counter - perhaps in addition to current counters.
With the number of pins available on Prop2 latches will be optional. You could do an entire 8 bit interface with 28 pins unlatched for up to 128MB SDRAM with today's technology - with cumbersome address line sharing other devices could be added. With Prop2 you get the bonus of 16 (36pins) or 32 (42pins) bit data which can work wonderfully for any application with a cache.
I am sure Parallax will give us plenty of notice to design our "memory" based boards. There is no point in designing yet.
We will have to be more careful with SDRAM as it has a much shorter life than SRAM. I will not be looking for SDRAM chips until the PropII is close - it's still 6-12+ mths away.
Wow, we better dump all our PCs now while we have a chance
@ Cluso99, Thanks, I didn't know that SDRAM parts had a shorter life.
@ jazzed, The points you brought up were one reason why I was curious. I didn't want to 'dial into' just any SDRAM, in case there were too many differences. Though I suppose looking into any one of 32M x ?? size would be helpful.
I realize it is a bit too soon to do too much on any Prop 2 design. That really wasn't my point.
It was just that, after Chip's remark about ~$3 SDRAM, I couldn't find ANY. So I was hoping he or someone had part number(s) for such 'animals'. I could only find parts at twice that price, unless he meant huge-quantity pricing.
I tried searching Digi-Key for SRAM parts in 32M x ?? size; there don't seem to be any. I didn't go to any other distributor as I too can wait. I've not dealt with RAM of any sort in the last decade or so, though, so I just wanted to look at any data sheets available for such a part.
Future Electronics are a better source and there are a couple of 8Mx16 and 16Mx16 which are <$3. IIRC the package was TSSOPII-56 with 0.8mm spacing which is easy to solder.
Of course, another idea may be to put a DIMM socket on a pcb and use old laptop DDRs. 1GB anyone??? Note: Check the voltage - must be 3V-3V3 otherwise the conversion is a nightmare.
I have some DIMM on my desk now that has 16 TSOPII-66 DDR 32Mx8 chips on it. Yes, I've thought about the DIMM idea just a tiny little bit.
SODIMM used in laptops is small enough to be a reasonable solution. The normal size DIMM is monstrous, but is cheaper. DIMM would allow flexibility and hardware re-use.
Would using a dimm require more pins and logic over a single sd ram chip?
Yes.
One issue is that most PC3200 DIMM have 64 bits. To use all 64 bits 16 extra connections are required for chip selects. To use a 32 bit only interface (half the DIMM capacity) requires 8 extra connections.
It is not clear at this point how much the extra effort, flexibility, and FAB cost is worth. The DIMM preferences do seem to change with market conditions. The chips themselves are stable.
Added:
There is a 144 pin DDR2 SODIMM standard that provides 32 bit data. The only compatible product that I found quickly was a Dell Printer DIMM for $80
Newegg.com and many other sites apparently still sell the old PC100 and PC133 144 pin SDR SODIMM modules. Like the DIMM for 32 bit data, 8 extra pins are required for byte-lane enables. I have several of these left over from battered old laptops.
Re: 'Pipes', even though I read all (I hope) of that 'Friday w/Chip' session covered, can anyone explain how a pipe is helpful?
In my mind I picture 8 cogs with lines between each one. Cog 0 has lines to Cogs 1..7, etc. But presently it seems a cog doesn't 'KNOW' what cog# 'he' is. So how is he going to know in advance of run time to know which pipe to listen to or post to? And is this one LONG that gets written/read? Like when PAR is used (sort of!)?
I suppose there will be some instruction to read/write to a pipe? "Oh, I see cog n pipe value; have you seen my pipe?" Now cogs sort of can talk to each other, rather than through the HUB.
Seems to me, the general concept of somehow mux'ing the pipes will keep objects portable. That was the intent Chip seemed most interested in preserving. At that time, Kye registered some concern, that I share, over dedicated resources introducing dependencies. Chip acknowledged that, and I suspect he's addressed it, or the pipes would not be there.
Why have them? Well, some trick stuff has been done with the lower pins, and it's shown how COGs being able to communicate can significantly improve their ability to work together. I'm all for it, so long as we don't have some ugly kludges to work out the sharing of the pipes. Might as well just have the HUB, IMHO. It's gonna be a lot faster anyway.
As compelling as that is, just having COGs be COGs is more compelling, IMHO.
Unless I'm misunderstanding something, I don't see why knowing what cog is running what object is important for the sake of using the pipe. Since the object will be specifying what bits of the pipe it will be using for tx/rx, cog# is irrelevant because all cogs are sharing a single 32-bit wide pipe (I'm assuming).
@jazzed
Pardon my ignorance, but is the extra logic required to operate a DIMM strictly for addressing the individual memory chips? If so, I presume some fairly simple external circuitry along the lines of an incremental counter + shift registers could be used to minimize the additional prop pins that would be required, correct?
While using dimms might not be an ideal solution for an embedded device of any volume, I think incorporating a dimm port on a project board would be awesome just for the sake of memory capacity flexibility.
Yes. Looking at the schematics, it's just a matter of using the right DM data masks for each byte lane chip - thats a 3 to 8 74LVT138 so only 3 extra address pins are necessary.
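As a rough illustration of the 3-to-8 idea, the sketch below models a '138-style decoder with active-low outputs driving eight byte-lane DM lines. The assumptions are mine, not from a schematic: the selected output goes low (lane unmasked) while the other seven stay high (lanes masked), and `byte_lane_masks` is a hypothetical helper name.

```python
def decode_138(a2, a1, a0, enabled=True):
    """Model a 3-to-8 decoder with active-low outputs ('138 style).

    Returns a list of 8 output levels. The selected output is 0 and all
    others are 1. When the decoder is disabled, every output stays 1.
    """
    outputs = [1] * 8
    if enabled:
        selected = (a2 << 2) | (a1 << 1) | a0
        outputs[selected] = 0
    return outputs


def byte_lane_masks(lane):
    """Hypothetical helper: DM levels for the 8 byte lanes, given a lane number 0..7."""
    a2 = (lane >> 2) & 1
    a1 = (lane >> 1) & 1
    a0 = lane & 1
    return decode_138(a2, a1, a0)


lanes = byte_lane_masks(5)
print(lanes)
```

Only three Propeller pins (plus whatever enable scheme is used) pick one of eight byte lanes, which is where the "only 3 extra address pins" figure comes from.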
I agree to a point. Like I said before, SODIMM is the most likely cost-effective candidate for sockets because of the small physical size. I'm not crazy about the right-angle SODIMM SMT connectors though. Most straight-up DIMMs are through-hole, which is easier to manufacture.
PC100 and PC133 SDR SDRAM 144 pin SODIMM are still generally available apparently. Just google this "pc133 sodimm" ....
jazzed: I think the smaller SODIMM would be better. As long as it requires 3V3 then all would be good. The unused pins/chips or whatever could be tied or driven by a latch. I was thinking you may just end up using a single SDRAM chip on the pcb and the others disabled. There are plenty around so I am sure people could find them cheaply.
As I understand it, the "pipe" is a simple 32-bit I/O port without external pins, and as such it is not a dedicated resource. ALL COGs see it, and all COGs can write to and read from it.
The only thing I'm not sure about is whether it can be divided into smaller portions so that more than 2 COGs can use it at the same time in both directions.
It is my understanding that they are not full fledged ports that just don't connect out of the chip. That would waste a lot of space. So they don't have things like ADC/DAC, SDRAM mode, and so on.
Sapieha, I originally thought the same thing (single 32-bit port the same as pins but without physical pins, common to all cogs), but Chip's recent answers indicate otherwise:
CHIP: Forgot to mention... 3 ports, minus 4 pins, are implemented, the last 'port' is the cog pipe, where each cog can filter what he's seeing from the other cogs' pipe outputs.
I imagine this will be similar to the current prop's OUT register - currently (prop I) each cog has its own register, which are then OR'd together for the physical pin's state. It sounds like in the prop II each cog will have its own register for the internal 'pipe' but can choose which other cogs it is OR'd with.
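Here is a toy model of that guess, just to make the OR-with-filter idea concrete. Everything in it is speculative: `listen_mask` is a hypothetical name for "which cogs' pipe outputs this cog has chosen to see", and nothing here reflects a confirmed Prop II mechanism.

```python
NUM_COGS = 8


def pipe_view(out_registers, listen_mask):
    """OR together the pipe outputs of the cogs selected by listen_mask.

    out_registers: list of 32-bit values, one per cog.
    listen_mask:   bit n set means "include cog n's output" (hypothetical).
    """
    seen = 0
    for cog, value in enumerate(out_registers):
        if listen_mask & (1 << cog):
            seen |= value & 0xFFFFFFFF
    return seen


outs = [0] * NUM_COGS
outs[1] = 0x0000_00FF  # cog 1 drives the low byte
outs[4] = 0xAB00_0000  # cog 4 drives the high byte
outs[6] = 0x0000_F000  # cog 6 drives bits 12..15

all_cogs = pipe_view(outs, 0b1111_1111)      # listen to everyone
only_1_and_4 = pipe_view(outs, 0b0001_0010)  # ignore cog 6
print(hex(all_cogs), hex(only_1_and_4))
```

With a filter like this, two pairs of cogs could share the same 32 bits without stepping on each other, as long as each listener masks out the cogs it does not care about.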
As for what it can be used for (Harley's question), one simple answer is easy signalling between cogs as they can WAITPNE/WAITPEQ on the "pipe", the more advanced answer being high-speed long-wide comms between cogs - keep in mind that the cogs are automatically clock-sync'd (because they're all running on the same silicon, using the same internal clock) so you would not need any poll-reply or send-ack-send overhead (though presumably you would need some initial handshaking to set things up via the hub, sharing cog #'s etc).
Roy: It sounds like (someone can feel free to correct me if I'm wrong here) the counters live in the pins rather than the cogs now, and these are used to give the ADC/DAC/RAM etc. That being the case, I suspect you are right and that the internal port will not have them, seeing as many of the things wont be applicable to an internal-only port.
Random note on the internal port:
Chip said the "last" port is the internal one. Assuming the ports are numbered 0..3 (0, 1, and 2 being the ports with physical pins) port #3 would be the internal one. Here's my thought: wouldn't it be better to make port 0 the internal one? Not a huge deal, just means that if, on the off chance, a prop comes along with more ports you can add them without confusing the numbering scheme (or having to alter your code).
@Kal: What I proposed is that the third physical port have its last 4 pins accessible to all of the cogs, even though there are not enough pins on the package to bring them out (only 92 of 96 will come out). That way, handshaking can be done on these 4 leftover pins instead of through the hub. This should be much faster via WAITPNE or WAITPEQ.
hinv: I think that would cause more problems than it solves.
Firstly, if internal-pins are not 'full class citizens' then you'll end up with a port that is not homogenous (goes against prop philosophy). Or, you end up with internal pins that have functionality that makes no sense (an internal ADC for example) which wastes silicon that could be used for more HUB. There's also the thought I mentioned that if the prop II winds up in a larger package one day (could be unlikely, but hey, we programmers love needless future-proofing ) then those pins will no longer be internal.
At any rate, keep in mind that the internal port will allow cogs to (I assume) communicate at clock-rate (i.e. 160mhz). Handshaking only needs to be done at the start of the operation and so speeding that up will result in minimal improvement. If there is any overhead required (i.e. poll-reply, WAITPNE, etc) you might as well go via the hub, since it now supports quad-long reads meaning that you can either transfer 32-bits per clock (internal port) or 128-bits per 8 clocks (averaging 32-bits per 2-clocks).
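Putting numbers on those two figures, here is a quick back-of-envelope calculation. It only uses the values from the post (160 MHz, 32 bits per clock for the internal port, 128 bits per 8 clocks for the hub); those are the poster's assumptions about an unreleased chip, not confirmed specs.

```python
CLOCK_HZ = 160_000_000  # clock rate assumed in the post


def bytes_per_second(bits_per_transfer, clocks_per_transfer, clock_hz=CLOCK_HZ):
    """Sustained throughput for a transfer that repeats every N clocks."""
    return bits_per_transfer / clocks_per_transfer * clock_hz / 8


port_rate = bytes_per_second(32, 1)   # internal port: one long per clock
hub_rate = bytes_per_second(128, 8)   # hub: one quad-long per 8 clocks

print(f"internal port: {port_rate / 1e6:.0f} MB/s")
print(f"hub quad-long: {hub_rate / 1e6:.0f} MB/s")
```

So under these assumptions the internal port has twice the peak bandwidth of the hub, which matches the "32-bits per 2-clocks" average quoted for the hub.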
I understand what you are saying about the synchronous nature of using the internal port, and getting twice the bandwidth, but what about latency? After all, it is latency that ties up a lot of time, which is why cache's exist. With 4 more internal "pins" we could use WAITPNE to trigger a transfer, which I would guess would be 4 cycles max, rather than 7 to 22 for the hub.
As for the non-homogenous 4 pins...I don't really care if they are full featured or not..except they take up more space if they are full featured. Because of some of the neat features of the pins, I am guessing that some pretty neat tricks can be done with full featured internal pins like sine based trig.....you will have to talk to PhiPi or Chip about the possibilities. OK, not knowing the cost, I want those missing pins to be full featured ;^). How much space can they possibly take up?...there's already 92 of them planned. I think we have me convinced. (As if that matters ;^)
Comments
Try this:
http://search.digikey.com/scripts/DkSearch/dksus.dll?Detail&name=675-1022-1-ND
64MB for $6.00
The catch is it's DDR and not available in 3.3V.
That's why Propeller2 has QVDD to change IO voltages.
Thanks guys...
Try Digi-Key's new price search. I get a page full under $4, and avoiding the smaller ones, which will be nearing EOL, there are (for example):
Part             Density   Qty    Price
W9412G6JH-5-ND   128MBit   100+   1.81260
IS42S16400D-7TL  64MBit    100+   2.89050
SDRAM needs a refresh cycle, so is not as invisible as SRAM, but you can see it is much cheaper.
The design lifetimes of SDRAM parts have tended to be shorter, but I think when the industry settles on an embedded size, that lifetime will increase.
Embedded is also seeing SDR (Single Data Rate) DRAM; see ISSI's website for 3.3V SDR (Single Data Rate) Synchronous DRAM
(ie the very largest, PC usage ones will continue to follow that wave, but embedded apps will be more stable - the 128MBit winbond one may be a candidate ? ).
The highest SRAM density per chip I've found is the Cypress 2Mx8 - of course that's $20 a pop. The same price will get you 128MB of SDRAM.
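For a rough sense of scale, the prices quoted in this thread can be turned into dollars per megabyte. This is only arithmetic on the numbers posted here (100+ quantity prices and the "$20 a pop" estimate), so treat the results as ballpark figures.

```python
def dollars_per_megabyte(price_usd, megabits):
    """Convert a per-chip price and a density in Mbit to USD per MB."""
    return price_usd / (megabits / 8)


# Prices and densities as quoted in the thread.
parts = {
    "W9412G6JH-5 (128 Mbit)": dollars_per_megabyte(1.81260, 128),
    "IS42S16400D-7TL (64 Mbit)": dollars_per_megabyte(2.89050, 64),
    "Cypress 2Mx8 SRAM (16 Mbit)": dollars_per_megabyte(20.00, 16),
    "128 MB of SDRAM for $20": dollars_per_megabyte(20.00, 128 * 8),
}

for name, cost in sorted(parts.items(), key=lambda item: item[1]):
    print(f"{name}: ${cost:.2f} per MB")
```

On these figures the SRAM works out to about $10 per MB, while the SDRAM options land somewhere between roughly $0.11 and $0.36 per MB.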
@jmg,
The ISSI SDR 32Mx8 is what I use on the GadgetGangster SDRAM Module.
Refresh is pretty easy to do. SDRAM or DDR is the way to go guys.
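To give a feel for how light the refresh load is, here is a small timing sketch. The numbers are assumptions for illustration only: 8192 rows refreshed every 64 ms is a typical figure for SDR SDRAM parts of this class, and 80 MHz is simply a common Propeller clock. Check the datasheet of the actual part before relying on either.

```python
def refresh_interval_us(refresh_period_ms, rows):
    """Average time between auto-refresh commands, in microseconds."""
    return refresh_period_ms * 1000.0 / rows


def clocks_between_refreshes(interval_us, clock_hz):
    """How many system clocks fit between two refresh commands."""
    return int(interval_us * clock_hz / 1_000_000)


# Assumed values, not taken from a specific datasheet.
interval = refresh_interval_us(refresh_period_ms=64, rows=8192)
budget = clocks_between_refreshes(interval, clock_hz=80_000_000)

print(f"one refresh every {interval} us, about {budget} clocks apart")
```

Under these assumptions a driver only has to slip in one refresh command every several hundred clocks, which is easy to hide between bursts.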
Is the 32bit "pipe" between the cogs just like pins without the pins? This to me would seem the most flexible.
On that note; since there are planned 92 iopins on the package, can we have the missing 4 pins (96-92) just like the 32bit "pipe" between the cogs so we could use them for handshake?
That way, we could implement 2 16bit buses between cogs with handshaking, or just use 2 of them to have a full 32bit data buss between two cogs that need high bandwidth.
Thanks,
Doug
On your question: I think the answer is YES.
But I have a question for Chip:
Can these pins also be used internally by user code?
CHIP: Forgot to mention... 3 ports, minus 4 pins, are implemented, the last 'port' is the cog pipe, where each cog can filter what he's seeing from the other cogs' pipe outputs.
In my mind I picture 8 cogs with lines between each one. Cog 0 has lines to Cogs 1..7, etc. But presently it seems a cog doesn't 'KNOW' what cog# 'he' is. So how is he going to know in advance of run time to know which pipe to listen to or post to? And is this one LONG that gets written/read? Like when PAR is used (sort of!)?
I suppose there will be some instruction to read/write to a pipe? "Oh, I see cog n pipe value; have you seen my pipe?" Now cogs sort of can talk to each other, rather than through the HUB.
Each cog knows what cog it is at run-time.
From a previous discussion Chip said there will be special instructions.
I suggest you look at the old very long thread for a better discussion.
Firstly, if internal-pins are not 'full class citizens' then you'll end up with a port that is not homogenous (goes against prop philosophy). Or, you end up with internal pins that have functionality that makes no sense (an internal ADC for example) which wastes silicon that could be used for more HUB. There's also the thought I mentioned that if the prop II winds up in a larger package one day (could be unlikely, but hey, we programmers love needless future-proofing ) then those pins will no longer be internal.
At any rate, keep in mind that the internal port will allow cogs to (I assume) communicate at clock-rate (i.e. 160mhz). Handshaking only needs to be done at the start of the operation and so speeding that up will result in minimal improvement. If there is any overhead required (i.e. poll-reply, WAITPNE, etc) you might as well go via the hub, since it now supports quad-long reads meaning that you can either transfer 32-bits per clock (internal port) or 128-bits per 8 clocks (averaging 32-bits per 2-clocks).
Unless there's something I'm missing here.
I understand what you are saying about the synchronous nature of using the internal port, and getting twice the bandwidth, but what about latency? After all, it is latency that ties up a lot of time, which is why cache's exist. With 4 more internal "pins" we could use WAITPNE to trigger a transfer, which I would guess would be 4 cycles max, rather than 7 to 22 for the hub.
As for the non-homogenous 4 pins...I don't really care if they are full featured or not..except they take up more space if they are full featured. Because of some of the neat features of the pins, I am guessing that some pretty neat tricks can be done with full featured internal pins like sine based trig.....you will have to talk to PhiPi or Chip about the possibilities. OK, not knowing the cost, I want those missing pins to be full featured ;^). How much space can they possibly take up?...there's already 92 of them planned. I think we have me convinced. (As if that matters ;^)
If you build it they will come.
Doug