Propeller II (Chat with CHIP) __ Some extra questions will come



Sapieha
11-07-2010, 03:40 AM
zappman: What types of FPGAs do you use? Altera, Xilinx?
Chip Gracey: I'm using an Altera Stratix III part now. It's an EP3SL150.

Chris (http://www.savagecircuits.com/forums/showthread.php?292-Friday-Chat-Room-3&p=2228#post2228), Chip here. This is the place, right?

Hi, Sapieha!
Hey, roy!
Hello, Everyone!

CHIP: Chris, do you want to order this discussion in any way?
CHIP: Ok, Prop II guts...
Let me open Quartus (Altera FPGA software), because there's more than I can remember....

localroger: How about basics like cores, hub and cog RAM, counters, and pins?
CHIP: Basics....
8 cores
128KB hub RAM
92 I/O's
32-bit pipes between all cogs
Each cog has a scientific calculator in hardware, pretty much.
We wanted more hub RAM, but these cogs have 7.4x the amount of logic as the current cogs - takes ~12mm of silicon for cog guts.
There are very nice I/O pins now...
13-bit quality delta-sigma ADC on every pin.
300MHz 9-bit DACs on every pin w/dither for 18-bit settable values.
DACs are 75-ohm.
Each cog has a colorspace converter matrix for ANY video standard.
SDRAM hooks up very easily for ~$3 of 32MB RAM external.
Lots of funny math functions in hardware, like stepping fixed-1's count patterns.
Sapieha: Chip, can SDRAMs be used as video buffers?
CHIP: SDRAMs for video - Absolutely! 1080p @ 60Hz, 24 bpp.
Sapieha: Chip, on SDRAMs, can't you combine data/address lines on the same pins, like the Intel 8085 CPU does? That can save pins, and with the speed of the Propeller II the access timing should still not be missed.
CHIP: Sapieha, on SDRAM, data and address must be fed concurrently, so no mux'ing.
Sapieha: Yes, but with muxing possibilities it needs only some latches to separate the addresses, and that can even be usable with some standard I/O ICs.
CHIP: Sapieha, yes, you could use external latches.
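
To put the 1080p @ 60Hz, 24 bpp figure next to the 32MB SDRAM size, here is a rough back-of-envelope sketch in Python. It assumes an active resolution of 1920x1080 and that "32MB" means 32 MiB; neither was stated in the chat, and blanking intervals and SDRAM refresh overhead are ignored.

```python
# Back-of-envelope numbers for a 1080p, 60 Hz, 24 bpp frame buffer in SDRAM.
WIDTH, HEIGHT = 1920, 1080          # assumed active resolution for "1080p"
BYTES_PER_PIXEL = 24 // 8           # 24 bits per pixel
REFRESH_HZ = 60
SDRAM_BYTES = 32 * 1024 * 1024      # assumed: "32MB" means 32 MiB

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
stream_bytes_per_second = frame_bytes * REFRESH_HZ
whole_frames_in_sdram = SDRAM_BYTES // frame_bytes

print(frame_bytes, stream_bytes_per_second, whole_frames_in_sdram)
```

Under these assumptions one frame is about 6.2 MB, the video stream needs roughly 373 MB/s of read bandwidth, and five whole frames fit in one 32 MiB part.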

CHIP: There's a lot of other stuff, but I can't think of it all. Need more questions.....
Bean: Chip any details about how the hardware registers are accessed ? I assume it is via some kind of index/pointer register ?

Sapieha: Chip, how about serializers?
CHIP: Serializers... There are dual-edge-triggered serializers in each cog which can talk to OTHER propeller chips at ~400Mb/s.
Sapieha: But can they send/receive both synchronously and asynchronously?
CHIP: Yes, the serializers are asynchronous. They all feed off a common clock that someone supplies. ???

localroger: Does DACs on-chip mean video without the external resistors?
CHIP: Yes, DACs drive video DIRECTLY

CHIP: About accessing registers...
There are only PIN and DIR registers mapped in register space. All other special registers are written via dedicated instructions.
These dedicated instructions do nice things, though, like read and clear, so you don't have to compute deltas.
Sapieha: (comment) More space for code
CHIP: Code space is 512 longs (-8, actually), so no big change there.

CHIP: Core runs at 1.8V, while I/O run best at 3.3V, but can be powered at 1.8V for logic compatibility.

reltham: Do the cogs still have a CLUT as separate memory?
CHIP: Each cog has a CLUT of 128 longs.

Sapieha: Chip -- If Video use DAC's --- It needs only 5 pins?
CHIP: Video DACs.... 1 pin for composite, 5 for VGA: R/G/B/H/V, 3 for component/HDTV.

JohnR: Does the SDRAM access use some of the 92 I/O or have its own pins?
CHIP: The SDRAM eats 16 or 32 pins for data and another 20 for control+clock.

ratronic: Chip, 32 bit pipes between all cogs - does that mean cogs can talk to each other directly without using the hub?
CHIP: Yes, the 32-bit pipes between cogs are instantaneous and useable with WAITPNE/WAITPEQ + timeout.

bittled: Will it be 5V tolerant?
CHIP: I/Os are not 5V tolerant - 3.6V max.

Bean: Yes, I assume PINA, PINB, PINC, PIND ?
CHIP: Bean, yes. I/O regs start at $1F8: PINA, DIRA, PINB, DIRB, PINC, DIRC, PIND, DIRD.
Bean: I assume DIRA, DIRB, DIRC, DIRD, OUTA, OUTB, OUTC, OUTD are the 8 registers ?
CHIP: Yes, PINA..PIND with each of those REASSIGNABLE to whatever you want, so you can hardcode PINA..PIND, but initialize them on entry to point to whatever ports you want. They initialize to Port 0..3.

CHIP: Localroger, yes, same I/O voltages as current Propeller chip.
CHIP: Power requirements... Don't know yet, but extensive clock gating is used, so should be lower than the current Prop.
CHIP: Supply voltages: core = 1.8V, I/O = 1.8V..3.3V
Al Booth: Do the 1.8V and 3.3V supplies need special sequencing on power-up and power-down?
CHIP: No special sequencing is needed on power.
CHIP: Packages... right now just planning on 14x14mm TQFP-128.
CHIP: Yes, the package is very dense, but it has to be to pack 128 pins. It's either that or a ball-grid array.

bittled: Will you have a breakout board for the Prop II for breadboarding?
CHIP: Breakout board... Absolutely!
CHIP: ...Probably a few breakout boards.

CHIP: There is 3D graphics stuff, too... texture lookup/lighting/alpha-blending. 8-bit quality on R/G/B.

reltham: With DIRA-DIRD and such, that is room for 128 ports/pins... with only 92 physical pins, are the other spots some form of internal thing?
CHIP: Roy, three 32-bit ports are implemented, though the last port's 4 highest pins are missing, since those got used on the package for XI/XO/BOEn/RESn.
CHIP: Each 8-bit port has its own set of power/ground pins for voltage selection and noise reduction.
CHIP: Forgot to mention... 3 ports, minus 4 pins, are implemented, the last 'port' is the cog pipe, where each cog can filter what he's seeing from the other cogs' pipe outputs.

Roger Lee: scientific calc. Floatpoint?
CHIP: No floating point, but multiply, divide, square root, and transcendentals via cordic, plus EXP and LOG via cordic.
Yeah, it seems fine to me.
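
For readers unfamiliar with CORDIC, the following is a minimal Python sketch of the rotation-mode algorithm computing sine and cosine with only shifts, adds, and a small arctangent table. It uses floating point for readability and an iteration count picked for illustration; the chat gave no details of how the Prop II hardware implements it, so treat every parameter here as an assumption.

```python
import math

def cordic_sin_cos(angle, iterations=32):
    """Return (sin, cos) of angle in radians, for angles within [-pi/2, pi/2]."""
    # Table of elementary rotation angles atan(2^-i).
    atans = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector; pre-scale by the combined gain.
    gain = 1.0
    for a in atans:
        gain *= math.cos(a)
    x, y, z = gain, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0          # rotate toward zero residual angle
        shift = 2.0 ** -i                    # a right shift in fixed-point hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * atans[i]
    return y, x

s, c = cordic_sin_cos(0.5)
print(s, c)
```

Each iteration adds roughly one bit of accuracy, which is why a fixed number of shift-and-add steps suits a hardware implementation.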
John A. Zoidberg: btw are hardware dividers and multipliers included in Prop II?
CHIP: John, yes. 64/32 divide, 32x32 multiply, 32->16 sqrt, and cordic is all in hardware.
Al Booth: Does the 32x32 bit multiply have a 32 bit or 64 bit result?
CHIP: The 32x32 multiplier has a 64-bit result. The 64/32 divider has a 32-bit quotient and remainder results.
CHIP: There is a separate 16x16 signed/unsigned single-cycle multiplier that executes from these instructions: MUL, MULS, MAC, MACS. The MAC instructions sum into a signed 64-bit accumulator.
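
The operand and result widths described above can be modelled in a few lines of Python. This is only a sketch of the arithmetic: the function names are made up for illustration, the 32x32 multiply is shown unsigned, and the wrap-around behaviour of the 64-bit accumulator is an assumption, since overflow handling was not discussed in the chat.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def mul32x32(a, b):
    """Unsigned 32x32 multiply; the 64-bit product is returned as (high, low)."""
    product = (a & MASK32) * (b & MASK32)
    return product >> 32, product & MASK32

def div64by32(dividend, divisor):
    """64/32 divide giving a 32-bit quotient and a 32-bit remainder."""
    quotient, remainder = divmod(dividend & MASK64, divisor & MASK32)
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in 32 bits")
    return quotient, remainder

def macs(accumulator, a, b):
    """Signed 16x16 multiply summed into a signed 64-bit accumulator (wrapping assumed)."""
    product = to_signed(a & MASK16, 16) * to_signed(b & MASK16, 16)
    return to_signed((accumulator + product) & MASK64, 64)

print(mul32x32(0xFFFFFFFF, 0xFFFFFFFF), div64by32(100, 7), macs(10, 3, 4))
```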

CHIP: Some stuff about the CTRs...
CHIP: They can input ADC streams from the pins and run a Goertzel algorithm, which is like a live slice of an FFT.
CHIP: Also, CTRs can read ADCs from pins and do PWM output without any hand-holding from the cog, so these will work in Spin, or other high-level languages.
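
As an illustration of why the Goertzel algorithm is "like a live slice of an FFT", here is a minimal software sketch in Python. It evaluates the power in a single DFT bin with one multiply and two adds per sample. The sample rate, block length, and test tone are arbitrary illustrative choices, not Prop II parameters, and the hardware version in the CTRs may differ in detail.

```python
import math

def goertzel_power(samples, sample_rate, target_freq):
    """Return the squared magnitude of the DFT bin nearest target_freq."""
    n = len(samples)
    k = round(n * target_freq / sample_rate)   # nearest bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2               # second-order resonator update
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

# Hypothetical test signal: a unit-amplitude 1 kHz sine sampled at 8 kHz.
RATE, N = 8000, 200
tone = [math.sin(2.0 * math.pi * 1000 * i / RATE) for i in range(N)]
print(goertzel_power(tone, RATE, 1000), goertzel_power(tone, RATE, 2000))
```

The bin at the tone frequency comes out large while the other bin stays near zero, which is the property a detector or demodulator relies on.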

Sapieha: Chip, can those CTRs generate 3-phase synchronized sinusoidal waves in one cog?
(Can't find an adequate answer)

CHIP: Oh... some more stuff about the new CTRs. Each is a sine/triangle/sawtooth/square function generator w/scale and amplitude for output to the DACs.
eod_punk: Any chance of built in DTMF support like the BS2?
CHIP: Since each cog has two CTRs, and each CTR can generate a sine, and each DAC channel can sum the outputs of its cog's CTRs, you can do this over one pin, hands-free.
CHIP: So, with the Goertzel in CTR hardware, and 50MHz ADC BW, you should be able to demodulate IF's pretty well.
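
The "two sines summed onto one pin" idea is exactly what a DTMF tone is. The sketch below builds such a tone in software, using the standard DTMF row and column frequencies. The sample rate, sample count, and 0.5 amplitude per tone are illustrative assumptions; on the Prop II the summing would happen in the DAC channel rather than in code.

```python
import math

ROW_HZ = (697, 770, 852, 941)          # standard DTMF low-group frequencies
COL_HZ = (1209, 1336, 1477, 1633)      # standard DTMF high-group frequencies
KEYPAD = ("123A", "456B", "789C", "*0#D")

def dtmf_freqs(key):
    """Return the (low, high) frequency pair for a single keypad character."""
    if len(key) != 1:
        raise ValueError("key must be a single character")
    for row_index, row in enumerate(KEYPAD):
        col_index = row.find(key)
        if col_index >= 0:
            return ROW_HZ[row_index], COL_HZ[col_index]
    raise ValueError("not a DTMF key: %r" % key)

def dtmf_samples(key, sample_rate=8000, count=400):
    """Sum two half-amplitude sines, as two function generators feeding one DAC would."""
    low, high = dtmf_freqs(key)
    return [0.5 * math.sin(2.0 * math.pi * low * n / sample_rate)
            + 0.5 * math.sin(2.0 * math.pi * high * n / sample_rate)
            for n in range(count)]

print(dtmf_freqs("5"), len(dtmf_samples("5")))
```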
John A. Zoidberg: By the way - will the new Prop II compiler have easier access to counter modules?
John A. Zoidberg: Because I had a hard time using the counters for the Prop I in Spin and ASM
CHIP: In a way, because of the hands-free modes, the CTRs will be more useful under Spin.
John A. Zoidberg: On top of that - I read that these have built-in function generators?

trodoss: Are you still planning on having the embedded ROM font?
CHIP: ROM font is still the same, though we'll probably add a smaller font, too, like an 8x12.
CHIP: Roy, 8x8 what?
CHIP: Okay... 8x8 font. Can do. It's small.

John A. Zoidberg: So are there shared ports in DIRA/DIRB for ADCs and comparators? Do they all need to be set in the respective config registers during startup?
tdlivings: Connecting to the A/D: how close? For delta-sigma on the P1, the parts had to be right on top of the chip.
CHIP: About connecting ADC stuff - no need for multiple pins - it's all built-in, just connect your signal to the pin.
CHIP: There are special ADC modes within the pins for doing a very high-Z feedback delta-sigma between two adjacent pins for digitizing data from passive weak-signal sensors. Should be useful for lots of new things.
CHIP: Because the ADC can be done inside the pin, there are no special considerations about trace length or pin numbers.
microcontrolled: Can the built in ADC be used in direct replacement of an external IC?
CHIP: Yeah, there should be no need for external ADC chips. Not only can you do voltage ADC, but you can do current-feedback ADC between two pins to read funny stuff, like a loop of wire, or something.
microcontrolled: I saw something mentioned earlier about sigma-delta. Does this mean that the Prop II will take direct potentiometer input, with no external caps needed?
CHIP: For reading a potentiometer, just connect the outer legs to VSS/VDD and connect the center-tap to an I/O pin for conversion. Nothing else needed.
tdlivings: Sounds like I do not need my DDS chip anymore
tubular: any idea of ADC input impedance yet?
CHIP: ADC input impedance is ~4M ohms.

CHIP: About DDS... The sine generator uses a 9-bit phase lookup table with 9-bit quality output. The scaling is done with a separate 9-bit coefficient, hence the 18-bit settable DAC outputs.
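
One way to picture the DDS description is a phase accumulator whose top 9 bits index a 512-entry sine table, with the 9-bit table value multiplied by a 9-bit amplitude coefficient to give a result that fits in 18 bits. The Python sketch below models that reading. The 32-bit accumulator width, the signed table encoding, and the function name are assumptions made for illustration and were not stated in the chat.

```python
import math

TABLE_BITS = 9
TABLE_SIZE = 1 << TABLE_BITS
# Assumed encoding: signed 9-bit sine values in the range -255..255.
SINE_TABLE = [round(255 * math.sin(2.0 * math.pi * i / TABLE_SIZE))
              for i in range(TABLE_SIZE)]

def dds(freq_word, amplitude, count, acc_bits=32):
    """Generate `count` samples; amplitude is an unsigned 9-bit coefficient (0..511)."""
    if not 0 <= amplitude < (1 << 9):
        raise ValueError("amplitude must fit in 9 bits")
    phase = 0
    samples = []
    for _ in range(count):
        index = phase >> (acc_bits - TABLE_BITS)   # top 9 bits select the table entry
        samples.append(SINE_TABLE[index] * amplitude)
        phase = (phase + freq_word) & ((1 << acc_bits) - 1)
    return samples

# A frequency word of 2^23 steps exactly one table entry per sample.
one_cycle = dds(1 << 23, 511, TABLE_SIZE)
print(min(one_cycle), max(one_cycle))
```

The output frequency in such a scheme is freq_word / 2^acc_bits times the sample clock, which is the usual DDS relationship.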
John A. Zoidberg: So, how many DDS channels does the prop II have?

Chris Savage: So "PINx" handles both INx and OUTx that the Prop 1 has as separate registers ?
CHIP: Yes, PINx is both OUTA and INA in one, but don't get worried... Whenever you perform a write/r-m-w to a PINx register, it writes to OUT, whereas a read-only operation causes IN to be read.
CHIP: About the PINx registers. These: TEST PINA,mask or TEST mask,PINA both cause the IN to be read, not the OUT.
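
The PINx behaviour can be pictured with a small Python model: writes and read-modify-write operations act on the OUT latch, while read-only operations such as TEST see the IN state of the pins. The class and method names are invented for illustration, and the sketch assumes that a read-modify-write reads back the OUT latch, which the chat did not spell out.

```python
class PinRegister:
    """Toy model of a combined PINx register (hypothetical, for illustration only)."""

    def __init__(self):
        self.out = 0      # OUT latch, driven by writes
        self.pins = 0     # IN: the actual pin states, set externally

    def read(self):
        """Read-only access (e.g. TEST PINA,mask) returns the IN state."""
        return self.pins

    def write(self, value):
        """A plain write lands in the OUT latch."""
        self.out = value & 0xFFFFFFFF

    def modify(self, operation):
        """Read-modify-write: assumed to read OUT, apply the operation, write OUT."""
        self.out = operation(self.out) & 0xFFFFFFFF

pina = PinRegister()
pina.pins = 0b1010                     # external signals on the pins
pina.write(0b0001)                     # like MOV PINA,#1
pina.modify(lambda v: v | 0b0100)      # like OR PINA,#4
print(bin(pina.out), bin(pina.read()))
```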

CHIP: Extra about the 3D stuff - the texture lookup is perspective-correct, which means a hardware divide is being performed for each texel lookup.


Oldbitcollector: talk about multi-propeller2 considerations. Anything special there for us? Timetable to release. What's next?
CHIP: Multi-Prop II possibilities. They're there, facilitated by those fast serializers.
CHIP: To do multi-PropII apps, it's a matter of development tools. The chips can talk amongst themselves pretty easily.

Oldbitcollector: Timetable to release date? What's next?
CHIP: Release date... We've got a test chip in fab now, but it's just to prove the memories, PLLs, I/O pins, etc. A final chip is a matter now of converting my FPGA code into a schematic, and then Beau laying it all out.
CHIP: Release date... If we go standard cell place-and-route, it could be 6 months; otherwise, full custom will take another several months. Hard to say. Never good at it.

eod_punk: Have you had to take something out/reduce something that you didn't want to leave out/reduce? Biggest trade off you had to make?
CHIP: Well, we wanted a heck of a lot more RAM, like 512KB, ideally. The cogs are such pigs now that we can only fit 128KB. The good news is the pins are set up for doing SDRAM signalling, and there is a hardware facilitator for moving data between SDRAM and hub.
CHIP: SDRAM comes in 32MB for about $3. That's enough RAM to build console applications in. I really want to make a nice PCB layout program, but don't want to write it on the PC. I hope to do it on Prop II.
John A. Zoidberg: By the way - u mentioned about SDRAM controllers. How are SDRAMs important in microcontroller applications? Work things better?
CHIP: SDRAM is just BIG, cheap RAM!!! Having lots of memory opens lots of new doors.

CHIP: Chip design tools, too, would be fun to write.

tubular: Chip can you please give us an overview of what you're aiming for regarding security features?
CHIP: For security, we are planning to use poly silicon fuses for giving each chip a 128-bit key that can be used with some block encryption standard. The user can program the key himself.
CHIP: More about the security fuse bits... The ROM program will read the key (which is afterwards hidden) and perform downloading/loading, then start execution.

reltham: How is the CLUT arranged for colors? 16-bit or 24-bit color?
CHIP: The CLUT is 128 longs. There are many modes of output which can use 4, 8, 16, or 24 bits to signify color.
CHIP: The CLUT can stream to the video like a FIFO, or you can hand off one set of pixels and scale at a time per WAITVID.


Sapieha: What types of media to load from?...
CHIP: What types of media to load from?... serial, USB, I2C EEPROM, SPI EEPROM, SD CARD. Any others that it should be sensitive to?
SDCARD, from what I know is just SPI, maybe 4-5 pins to work.
We would use FAT for loading off SDCARD, or, if some signature was present in the first bytes, just stream it in directly.
John A. Zoidberg: Btw any special interfaces for SDCARD?
CHIP: FAT would have to be supported to make things nicely compatible with the PC world.
CHIP: The chip, with SDCARD and SDRAM, could nicely host some O.S.


What are MADDs?
reltham: @JohnR, Multiply ADD
CHIP: MADD will go well with the SEUSSF and SEUSSR instructions, which circulate and invert bits through a Dr. Seuss-like pattern.

blittled: Will the Prop II be able to interface with USB easily since PS2 keyboards and mice are becoming outdated?
CHIP: About USB... Each even/odd pair of pins can form a 30MHz differential input, so you can make USB all over the place. There are also 1.5k resistors you can turn on in the pads to set the USB mode.

CHIP: How many hours designing Prop II... I don't know. It's been 4.5 years since the Propeller was finished, and this is the only project I've been working on. It's got to be several thousand hours, so far.
CHIP: Yes, Beau's been working several years now on the layout stuff. He designed every single polygon that is in the chip. There is nobody else's "IP" in there.

How about a double-barreled bit shifter?
CHIP: Yes, there is a barrel shifter just like in the current Prop. We've just got lots of new instructions.
CHIP: A barrel shifter is something that can shift any number of bits at once, as opposed to just one bit-position at a time.
CHIP: If you can do SHR reg,#20, that's a barrel shifter.
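
To make the distinction concrete, the Python sketch below compares a barrel shift (any shift count in one operation) with the one-bit-at-a-time alternative. Masking the count to its low five bits is an assumption about how a 32-bit shifter treats larger counts; the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF

def shr_barrel(value, count):
    """Shift right by any count 0..31 in a single step, like SHR reg,#20."""
    return (value & MASK32) >> (count & 31)   # assumed: only the low 5 bits of count are used

def shr_one_bit_at_a_time(value, count):
    """The same result without a barrel shifter: one single-bit shift per loop pass."""
    value &= MASK32
    for _ in range(count & 31):
        value >>= 1
    return value

print(hex(shr_barrel(0x80000000, 20)))
```

Both functions return the same value; the difference in hardware is that the barrel shifter needs one cycle where the loop needs one cycle per bit position.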


eod_punk: Has any other microcontroller influenced you in making the Propeller 2?
CHIP: I've never used an Arduino, but have heard that many people like them. I heard it uses a type of C. That's about all I know about its technicality.

localroger: Not a P2 question but something I have always wanted to ask: how did you make the break from what everyone else was doing to the P1? I have been swimming in CPU's since 1974 and I would never have thought of it. So tell me what I missed!
CHIP: @localroger - I just knew we needed something that could execute multiple programs at once. I tried lots of different things on the FPGA, but then settled on simple cogs w/each having its own peripherals, and then all cogs sharing a common RAM, for comm.


CHIP: Compatibility... There are many new instructions in the Prop II. The memory map is 32-bits, though we can't realize much of it.
Oldbitcollector: yes. will I be able to run my Prop1 code on the Prop2?
CHIP: You will have to rewrite your code in cases where you are doing new I/O stuff. For the most part, Spin routines should compile as they do now.

Beau Schwabe: @Chip Gracey - on compatibility between PropI and PropII - it sounds as though in some cases a simple text translator might be able to convert the differences.
CHIP: @Beau, the differences will probably go beyond text translation, as they mainly revolve around new possibilities.

CHIP: About 3D graphics... We have, thanks to Andre patiently helping me to understand what we needed to do, a texture-mapping system with lighting and blending. It's rudimentary by today's standards, but gets you into the ballpark.
CHIP: I think 3D graphics will be especially neat for showing sensor data in nice ways.

CHIP: Instructions are almost all 1-cycle. Exceptions are the same we have now, like hub accesses. You can read four longs at once into cog ram, then execute them.
Bean: Are instructions 1 cycle ? Also is LMM execution (running code from the hub) supported by hardware / faster ?
CHIP: That is what facilitates LMM programming. Otherwise, there is no special case of LMM execution.

CHIP: @localroger - if a child could visualize a CPU, he could design one. It's like lego's.

CHIP: ADC bandwidth... If the chip is running at 160MHz, Nyquist says you'll get 80MHz bandwidth, but 50MHz would be more like it, as the aliasing would average out better.

Sapieha: We are missing You on FORUM!
CHIP: I want to get back on the forum, but haven't been able to log in, so I've just been working.


electromanj: Given the track record of the BS1, I assume the PROP1 will be around for a while?
CHIP: Yes, we plan on making the Prop I for as long as the process is available. It'll be around for at least another 15 years. MOST new IC designs actually use .35um technology, which is what the Prop I uses. Something like 70% of tapeouts target that process.

CHIP: Full instruction set... I'll give to Daniel at Parallax for posting. Or, I'll find out how to get on the forum and do it myself.

Sapieha
11-07-2010, 03:41 AM
Reserved For extra Questions!

Hi CHIP.


As I said in the thread title, "Some extra questions will come". Now they come. Still not all of them.

CHIP, my first question is to clarify a misunderstanding about the SERIALIZER.

I asked: "Sapieha: But can they send/receive both synchronously and asynchronously?" It needed to be: asynchronous/synchronous.
That may have confused you!
As you answered: "CHIP: Yes, the serializers are asynchronous. They all feed off a common clock that someone supplies."

Now look at the attached PDF. It may give you a clearer picture of what I meant.

The next question I have is: will there be instructions that make it possible to use the cog's CLUT memory in a user-friendly way?

Bill Henning
11-07-2010, 05:49 AM
Thanks for the great summary - I was unable to attend the chat.

Cluso99
11-07-2010, 07:41 AM
Thanks Sapieha for posting this.

WOW! The Prop II just keeps getting better (apart from the loss of hub Ram). I want one, I want one, I want one... LOL

markaeric
11-07-2010, 08:44 AM
This is awesome! The P2 will be a killer SoC.


"CHIP: Power requirements... Don't know yet, but extensive clock gating is used, so should be lower than the current Prop."

So if the same clock speed was used, the P2 would be more efficient than the P1? This is contrary to what has been said before, and is a very good thing!

Phil Pilgrim (PhiPi)
11-07-2010, 09:20 AM
Don't forget: 1.8V @ 100 mA is the same amount of power as 3.3V @ 54mA. If you use linear regulators, though, you'll use nearly twice the power at 1.8V as you would at 3.3V.

But I agree: Chip's assertion is contrary to what's been said before.

-Phil
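
The arithmetic behind this comparison can be sketched in Python. The load figures come from the post above; the 5V input rail for the linear-regulator case is an assumption chosen for illustration, and regulator quiescent current is ignored.

```python
def load_power_mw(volts, milliamps):
    """Power dissipated in the load itself."""
    return volts * milliamps

def linear_regulator_input_power_mw(input_volts, load_milliamps):
    """A linear regulator passes the load current through, so it draws Vin * Iload."""
    return input_volts * load_milliamps

core_18 = load_power_mw(1.8, 100)      # about 180 mW
core_33 = load_power_mw(3.3, 54)       # about 178 mW

INPUT_RAIL_V = 5.0                     # assumed supply feeding both regulators
drawn_18 = linear_regulator_input_power_mw(INPUT_RAIL_V, 100)
drawn_33 = linear_regulator_input_power_mw(INPUT_RAIL_V, 54)

print(core_18, core_33, drawn_18 / drawn_33)
```

With a linear regulator the power drawn from the input depends only on the load current, so the 100 mA case draws about 1.85 times as much as the 54 mA case whatever the input rail is.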

Bean
11-07-2010, 01:20 PM
I didn't get a chance to ask Chip, but does anyone remember anything about debugger "hooks" on the P2 ?

I really liked what the SX had. Being able to run assembly code at full speed and then stop at a breakpoint is really useful. It would be nice if the P2 was able to do that.

Bean

Harley
11-07-2010, 07:51 PM
Sapieha, Thank you for the collection of comments from the Friday's 'meeting' on the Prop 2.

I wasn't able to register for some reason. Your summary might even be better than 'attending', tho some comments might have been missed. Thank you for doing this.

Thanks to Chip Gracey for the update on the Prop 2. Sounds like no one will be able to use everything on this smorgasbord microcontroller. WOW!!! I think it will be a Joy to use it.

Sapieha
11-07-2010, 07:55 PM
Hi Harley.

The only comments missing from what Chip said are about walnuts.


jazzed
11-07-2010, 07:56 PM
... Being able to run assembly code at full speed and then stop at a breakpoint is really useful.
You can do that with today's Propeller.

Thanks Sapieha for posting all this.

I noticed a question about serializers that seemed to be unanswered.
Was there any mention at all about deserializers or even DMA?

Heater.
11-07-2010, 08:19 PM
Jazzed,

Did you miss this:


Sapieha: Chip, how about serializers?
CHIP: Serializers... There are dual-edge-triggered serializers in each cog which can talk to OTHER propeller chips at ~400Mb/s.
Sapieha: But can they send/receive both synchronously and asynchronously?
CHIP: Yes, the serializers are asynchronous. They all feed off a common clock that someone supplies. ???


Or this:


CHIP: To do multi-PropII apps, it's a matter of development tools. The chips can talk amongst themselves pretty easily.


All sounds quite amazing.

Sapieha
11-07-2010, 08:24 PM
Hi jazzed.

That question was partially answered.
Serializers, as I proposed to Chip, are BOTH serialize/deserialize.
There was no discussion of DMA. I was not directly interested in it in the first place, and nobody else raised that question.


Rayman
11-07-2010, 08:57 PM
Nice to see the 1080p is up to 60Hz.

So, I think I'll need 3 SDRAMs... One for code and 2 as video buffers...

Let me see... That's (16 data + 20 control) * 3 = 108 ... Uh oh, I'm out of pins already :(
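
The same pin budget, written out as a small Python sketch. It follows the assumption in the post that each SDRAM gets its own full set of data and control pins with nothing shared between chips; the function name is illustrative.

```python
TOTAL_IO_PINS = 92          # I/O count given in the chat
CONTROL_PINS = 20           # control + clock pins per SDRAM, per the chat

def sdram_pins(data_bits):
    """Pins used by one SDRAM with a 16- or 32-bit data bus."""
    if data_bits not in (16, 32):
        raise ValueError("the chat mentioned 16- or 32-bit data buses only")
    return data_bits + CONTROL_PINS

needed = 3 * sdram_pins(16)             # three unshared 16-bit SDRAMs
shortfall = needed - TOTAL_IO_PINS

print(needed, shortfall)
```

Three fully independent 16-bit interfaces would need 108 pins, 16 more than the chip has, so any such design would have to share lines between the memories.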

jazzed
11-07-2010, 09:02 PM
Well, no I saw them both. Funny how the quotes disappear with the new forum software :) Inter-Propeller communications is intriguing. Can't wait :)

Most of us have asked about all kinds of stuff on Propeller2. No one "owns any of the ideas" except Chip and that is only because he is the implementer :) He was curious about the need for Manchester and NRZ encoding though. Of course there was also a discussion about almost 1/2MB of HUB at some point, but that was more optimism than all the special COG stuff ended up taking. Having just an extra little bit of HUB, say even 160KB over 128KB, would be of great value since the "square block" can be used for data and the other 32KB used for code. The video generator is a DMA engine, it's just limited to output only.

There were discussions of other types of DMA before - I wonder if the SDRAM offers any input kick other than providing the clock (which can be done today). Curious that 16 or 32 bit only interfaces were mentioned for SDRAM :) Any SDRAM including DDR starts to make a lot of sense with flexible IO voltages. Of course any of the LMM compilers should feel comfortable with the SDRAM as backstore.

@Rayman, shhh! Give Chip a chance to ship this design :)

jmg
11-07-2010, 11:00 PM
CHIP: John, yes. 64/32 divide, 32x32 multiply, 32->16 sqrt, and cordic is all in hardware.
The 32x32 multiplier has a 64-bit result. The 64/32 divider has a 32-bit quotient and remainder results.
There is a separate 16x16 signed/unsigned single-cycle multiplier that executes from these instructions: MUL, MULS, MAC, MACS. The MAC instructions sum into a signed 64-bit accumulator.


Nice; - what silicon cost is there, to have these HW maths operations on all COGs ?

I see NXP have an Asymmetric Dual core part just released.
That seems a very good way to avoid silicon bloat, and so allow the vital MORE RAM.

Could the P2 fit in more RAM, if the COGs were made with a common base, but
less 'fruit' on some ?

As an example Compact ones could call ROM Maths routines, to make the differences between HW and SW extended precision, more user-invisible ?.

jmg
11-07-2010, 11:16 PM
300MHz 9-bit DACs on every pin w/dither for 18-bit settable values.
DACs are 75-ohm.
Each cog has a colorspace converter matrix for ANY video standard.

CHIP: Serializers... There are dual-edge-triggered serializers in each cog which can talk to OTHER propeller chips at ~400Mb/s.
The serializers are asynchronous. They all feed off a common clock that someone supplies.

CHIP: Oh... some more stuff about the new CTRs. Each is a sine/triangle/sawtooth/square function generator w/scale and amplitude for output to the DACs.
eod_punk: Any chance of built in DTMF support like the BS2?
CHIP: Since each cog has two CTRs, and each CTR can generate a sine, and each DAC channel can sum the outputs of its cog's CTRs, you can do this over one pin, hands-free.


Just trying to make sure I have this counter detail right.
triangle/sawtooth/square are possible with a flexible counter+DAC, but Sine needs a look-up table. (RAM or ROM ? )

Does this mean each counter/DAC includes a small table ram, which can also do that
'colorspace converter matrix' ?

Q: What size is what I'll call Pin-Table memory ?

If the ports can dual-edge-triggered serialize at 400Mb/s, can the counters
count/capture at that 2.5ns speed, or just 200MHz, or is it some lower-still speed ?

My ideal Counter-pin-structure config option, allows One counter to Divide a PinFreq by N (1..2^32), and the TC of that Divider, to capture a highest-speed timebase counter.
(viz a reciprocal frequency counter. just add the 32*32->64 -> 64/32 )
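
The reciprocal frequency counter described here boils down to one multiply and one divide: count N input cycles, capture how many timebase ticks elapsed, then compute N * f_timebase / ticks. A minimal Python sketch follows; the 160MHz timebase and the example counts are hypothetical values picked to make the arithmetic easy to follow.

```python
def reciprocal_frequency_hz(input_cycles, timebase_hz, captured_ticks):
    """Input frequency from N counted cycles and the timebase ticks they spanned."""
    if captured_ticks <= 0:
        raise ValueError("captured_ticks must be positive")
    product = input_cycles * timebase_hz        # the 32x32 -> 64-bit multiply
    return product // captured_ticks            # the 64/32 -> 32-bit divide

# Hypothetical measurement: 1000 input cycles took 16,000,000 ticks of a 160 MHz timebase.
print(reciprocal_frequency_hz(1000, 160_000_000, 16_000_000))
```

Because the gate time is set by the input signal rather than a fixed window, the resolution depends on the timebase rate and not on the input frequency, which is the attraction of the reciprocal method.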

With 2 Counters present per pin, the large building blocks seem to all be there!!.

Q:Will the details support the interconnections I describe ?

Cluso99
11-08-2010, 02:35 AM
The low power comment is a very significant one. Hopefully Chip will explain a little further.

It has been said before that the price of the Prop II was aimed to be (IIRC) < $10.

Now, I am presuming from what has been said, that the Prop II die just fits the QFP128 package and that is why we cannot have more Hub Ram.

Could a larger die be used with a bigger package to get say 512KB of Hub Ram??? Everything else the same, so either the extra pins would be no-connect or wider spacing (I am unsure of ramifications here).

I would much prefer a 512KB Hub Ram Prop II in a QFP196 (or QFP128 with bigger pin spacing preferred) for $12-$14. Perhaps BEAU could comment further???

BTW I agree, nothing will kill the current Prop 1. It will continue to be used in various places. The Prop II will only seek to further legitimise the Prop 1 and to make it more visible to professionals.

Beau Schwabe
11-08-2010, 05:38 AM
Cluso99,

Right now with the pitch size of the I/O's and the current 128 pin package, we are "CORE" limited as opposed to "PAD" limited. What this means is that we are limited physically as to what we can fit in the core based on the current package and available room remaining in the core. The I/O's could actually grow slightly in size if needed.

Saying that it's not by a huge amount that they 'could' grow. The I/O's themselves are packed with stuff of their own.

It's a balancing act ... reduce the I/O pad size and you gain more core space, losing functionality of the PAD ... reduce the core space requirement and gain I/O functionality, but at the cost of losing core functionality.

In a system that requires both to make it work the way everyone wants it to, it's a complex juggling act.

Going to a larger package isn't such a huge deal, going to a larger die size becomes cost prohibitive, but it would eliminate being "CORE" limited.


EDIT:
One of the things that we can determine with the test die is just how much current and I/R drop we get across the power and ground rails, which are built within the PADs that make up part of the power ring that goes all the way around the chip.

Right now they are 100 microns wide totaling just over 400 microns 'JUST' for the power/ground rings... remember VIO,GIO,VDD, and GND ... The 'I/O PAD guts' are built underneath these power/ground lines. However if they can be reduced even by 50 microns, that's a difference of about 1.4 Million square microns gained in the core.

Phil Pilgrim (PhiPi)
11-08-2010, 05:58 AM
Beau,

Can you comment on the power requirements? It was my impression that the smaller feature size entailed more leakage current and a much higher power consumption than the Prop I. But Chip's recent comments seem to indicate otherwise.

Thanks,
-Phil

Beau Schwabe
11-08-2010, 06:04 AM
Phil Pilgrim,

"It was my impression that the smaller feature size entailed more leakage current and a much higher power consumption" - Compared to 350nm (<-- Prop I) this is true, but the leakage for the Prop II should be comparable to other 180nm processes. The exact numbers we won't know until we get our hands on the test die. Outside of that we only have what the simulator is telling us.

Phil Pilgrim (PhiPi)
11-08-2010, 06:13 AM
Outside of that we only have what the simulator is telling us.
And what is the simulator telling you, in comparison with the Prop I? ('Sorry to press you, Beau, but this is a big deal, as it affects the Prop II's market position relative to the Prop I. The conventional wisdom has been that one big reason the Prop I will not be obsoleted by the Prop II is the Prop I's much smaller power consumption.)

-Phil

Humanoido
11-08-2010, 06:30 AM
Thanks Sapieha.

Cluso99
11-08-2010, 06:37 AM
Beau: Thanks for the reply.

I sort of understand that you have balanced the I/O space with the core space. So, if you increase the core size (i.e. increase hub ram) then there will essentially be unused I/O space which has to be paid for in wasted die space. Die size translates directly to chip cost.

I do not know if anyone else agrees, but I would rather 512KB of Hub Ram with a larger die size. I am prepared to pay $2-$4 for this.

So, my question is, everything else remains the same (except maybe the package to accommodate the larger die) .....

Is it feasible to increase the hub RAM to 512KB and how much would the cost increase (estimate) ???

If it is within say 25% extra, then perhaps a poll to see if it would be viable?

Beau Schwabe
11-08-2010, 07:54 AM
Phil Pilgrim (PhiPi),

Yeah, I know it's a big deal, but the difference between 350nm and 180nm is really low in comparison to other processes.

- At 350nm the leakage is about 7% of the total consumed power

- At 180nm the leakage is about 16% of the total consumed power

- At 90nm the leakage jumps to almost 40% of the total consumed power



Cluso99,

I'm not sure it would be a simple matter of just going to a bigger die size along with a bigger package size for a couple of dollars extra per chip. I think there is a significant jump in price.

What you are paying for is silicon real-estate. Look at it this way... a die that measures 6mm square is going to have 36 million square microns. Compare that to a die that measures 8mm square with 64 million square microns ... increasing the die just 2mm almost doubles the silicon real-estate.

The 'next' available die size increase looking at it this way would be cost prohibitive.

Another factor... the bigger the die, the lower the yield in two ways... the lower the yield in terms of how many can physically fit on a wafer, and the lower the yield as far as how many IC's are actually good. A typical yield per wafer due to process variations is about 85%-95%... this factor goes down as the size of the die increases.
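
The area arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name is made up for the example, and the only figures taken from the post are the 6mm and 8mm die edges.

```python
def die_area_um2(edge_mm: float) -> float:
    """Area of a square die in square microns (1 mm = 1000 microns)."""
    edge_um = edge_mm * 1000.0
    return edge_um * edge_um

small = die_area_um2(6)   # 36 million square microns
large = die_area_um2(8)   # 64 million square microns
ratio = large / small     # about 1.78, i.e. "almost doubles"

print(f"{small:,.0f} um^2 -> {large:,.0f} um^2 (x{ratio:.2f})")
```

Because area grows with the square of the edge length, a 33% longer edge costs roughly 78% more silicon before any yield loss is counted.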

potatohead
11-08-2010, 08:00 AM
Costs then could easily hit 4X, due to waste (fewer IC's, coarser shape = less wafer utilization), excessive area dedicated to IC (excessive features / cost), and yield percentage drop (fewer "attempts" per wafer)?

, or (and this is a wild guess)

volumes need to be 8x to justify that?

Seems to me then, the smaller size, has less variance, but isn't as competitive, but costs less to compete.

The larger sizes must then be reserved for known proven designs in demand? A new design, really needs to be optimal, in order to yield a "burn rate" sufficient to spark demand? Don't know about optimal, but the overall risk / attempt at the market is favorable to the smaller size by a considerable factor, thus a smarter wager.

Close on that?

If so, very interesting dynamics in play here. Thanks for sharing!

Beau Schwabe
11-08-2010, 08:08 AM
potatohead,

That's a good summary. The cost per wafer is also very high; I'm not sure offhand what it is, but it factors in just as well.

potatohead
11-08-2010, 08:11 AM
Yeah, and that's a double dip on risk costs, which is why the smaller size has to be favorable in higher risk conditions.

...unless risk is mitigated with some known demand, then it's just a cost / margin equation.

Heater.
11-08-2010, 09:43 AM
potatohead,


...unless risk is mitigated with some known demand, then it's just a cost / margin equation.

I love it when you talk MBA :)

Ale
11-08-2010, 10:27 AM
There is a missed question: can we access (external) SDRAM using RD/WRxxx instructions, or is it going to be bit-banged using a COG? (It looks like the second answer is the probable one.)

This chip will pack some very serious features, good !!!

Roy Eltham
11-08-2010, 11:35 AM
Ale, it won't be able to read external SDRAM with the rd/wrxxx instructions. Pretty sure Chip said as much in the chat. The SDRAM stuff is special modes on the i/o pins, so it'll either be done via in/out instructions (most likely), or some new ones.

Rayman
11-08-2010, 11:46 AM
Well, since the SD card interface will be inherently supported, perhaps they could rig the Prop Tool to build SPIN in LMM mode and put the code on the SD card and just use HUB ram for buffers...

That would make the size of hub RAM almost irrelevant and let people write SPIN code of unlimited size...

evanh
11-08-2010, 02:44 PM
I believe Chip has said that SDRAM bandwidth can be as fast as Hub bandwidth. I'm getting the feeling that 128KB HubRAM is quite sufficient.

Cog count is more likely to be the next bottleneck.

evanh
11-08-2010, 03:20 PM
Also, kudos to Sapieha for posting the chat.

potatohead
11-08-2010, 03:22 PM
@Heater... Heh, guess I did.

Well, when I learn a new thing now, I do both. I enjoy the tech, and the business end of things. Self employed by force a while back taught me that lesson. :) Let's just say the bell rang at the school of hard knocks.

In this context, both are really interesting, because I've never had any exposure to the dynamics. Truth is, this little thread has answered a lot of questions I had about why things happen as they do with semiconductors. I've often been puzzled by trade-offs made, thinking, if they had only done... That enriches the hobby on a lot of levels for me. Probably I still don't know squat, but I at least have some plausible musings now. I'll take it!

I agree with Evanh, BTW. I consider all the details necessary to get off chip RAM operating optimally worth more than a large amount of on-chip RAM. 128K is a lot, compared to what we have now. Honestly, the thing could use more pins too, but now we know the dynamics around that not being a viable option right now. IMHO, the parallel nature of Propellers always will call for more pins. Call that a constant at this point.

One other thing seems clear to me, and that's the early ramping up ideas of what could happen makes sense now. Go big, then as the layout constraints hit home, scale back, balancing all the way down to the best trade-off possible at the die and package that makes the best financial and technical sense.

If we look at it that way, I think we are getting a LOT!! Lessons learned from Prop I are really gonna pay off in the next chip. I'm pretty stoked, and pleased as heck Chip and Beau are sharing the ride along.

Sapieha
11-08-2010, 03:50 PM
Hi Humanoido and ALL.

I think the BIGGEST THANKS should go to -->
Chris Savage (http://www.savagecircuits.com/forums/showthread.php?292-Friday-Chat-Room-3&p=2212#post2212) for giving us the chance to chat on his site.

And

Chip Gracey for answering our questions.


Thanks Sapieha.

Sapieha
11-08-2010, 04:15 PM
Hi CHIP.

I added some extra questions to the second post in this thread.

Can YOU answer them if possible?

Ps. In time, more questions will be added to that post.
I know feedback is useful to You, BUT I also need some feedback from You on whether my ideas help or whether they are VERY bad.


The next thing I'm waiting for is, as You said -->
Full instruction set... I'll give to Daniel at Parallax for posting. Or, I'll find out how to get on the forum and do it myself.

I think we are ALL waiting for that. Maybe we can even see what is missing ---> and suggest ideas that could still be implemented!

jmg
11-08-2010, 06:42 PM
Now look at the attached PDF -- it may give You a clearer picture of what I mean.

The next question I have is: will there be instructions that make the COG's CLUT memory usable in a user-friendly way?

You do not mention FIFOs, or variable-width serialization?

Dual and Quad SPI is now quite common (80MHz+), and devices like XMOS allow user assignable width.

Supporting QuadSPI would be quite important, especially on a memory constrained device.

jmg
11-08-2010, 06:46 PM
Well, since the SD card interface will be inherently supported, perhaps they could rig the Prop Tool to build SPIN in LMM mode and put the code on the SD card and just use HUB ram for buffers...


Here QuadSPI (and even byte-wide, too?) would be a natural expansion.

Thus far, CODE is not (directly) from SDRAM, so other code paths should be provided.
QuadSPI is 80MHz+, which would stream 32 bits at 10MHz.
Not stellar, but fine for many code tasks.
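
The 10MHz figure follows from simple bandwidth arithmetic, sketched below in Python. The helper name is invented for the example, and it assumes one bit per data line per clock with no command or address overhead, which a real SPI flash transaction would add.

```python
def longs_per_second(clock_hz: float, data_lines: int, word_bits: int = 32) -> float:
    """Peak 32-bit words per second for a serial bus moving
    one bit per data line per clock (overhead ignored)."""
    return clock_hz * data_lines / word_bits

quad = longs_per_second(80e6, 4)    # QuadSPI: 4 data lines
single = longs_per_second(80e6, 1)  # plain SPI: 1 data line

print(f"QuadSPI: {quad / 1e6:.1f} M longs/s, SPI: {single / 1e6:.1f} M longs/s")
```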

Phil Pilgrim (PhiPi)
11-08-2010, 07:12 PM
Be careful what you wish for! Remember the Homer (http://onscreencars.com/tv/the-homer-the-car-built-for-homer/)? (Not that Chip would ever allow that to happen! :) )

-Phil

Ravenkallen
11-08-2010, 07:19 PM
The big question I wanted to ask is....
What kind of development platform is going to be released with the Prop 2? I think I remember Chip saying something like there will be 3 different kinds? I hope they will release a platform similar to the C3, but one that uses a Prop 2 instead... Maybe they could call it C4 (like the explosive, haha). For those of us who can't hand solder SMD parts (well, at least not very well), this is going to be imperative.

Maybe they could make a board similar in size to the STAMP or Prop stick and only run out 32 or so I/O lines (even though it would mean that I/O pins would go unused, it would provide a simple interface to a breadboard). I mostly want the new Prop for the 128 Kilos of memory, speed and direct boot from SD. That ought to put a cramp in Arduino's style.

Parallax has done well by making all of their products available in an easy to use DIP package. I think they should be wary about deviating too far from this "unspoken pledge".
Parallax is a good company, run by real people, and I thank them for being so open with their designs...

One smaller question..
What kind of speed increase are we talking about for SPIN on the Prop 2? I know that Chip said that ASM would be like 8 times faster. Is it safe to assume that SPIN will have an equal increase?

Harley
11-08-2010, 07:51 PM
Ravenkallen, there was this info from Chip:

bittled: Will you have a breakout board for the Prop II for breadboarding?
CHIP: Breakout board... Absolutely!
CHIP: ...Probably a few breakout boards. I hope the forum is allowed to provide some good info on what WE'd like also. Wow! 96 (including crystal, reset, brownout) + power/grounds; lots of header pins to provide

Yes, it might be nice to do development on the Prop 2.

Ravenkallen
11-08-2010, 08:22 PM
@Harley.... yes, but I was wondering if he had any details of the various proposed boards...

Electronegativity
11-08-2010, 08:32 PM
This all sounds truly awesome.

I have been wanting to build an oscilloscope for a long time.
Now I have a vision of a 32 channel scope that displays 3D data in HDTV.
As much as I enjoy the dials and switches, it will be much more cost effective to replace them all with on screen menus.

markaeric
11-08-2010, 09:01 PM
Parallax not building breakout boards would be like Ford not building cars. With all the capabilities of the prop2, I'm sure there will be a huge amount of boards from Parallax, and others.

I have to imagine that spin will see a significant increase in speed, considering the single cycle instructions including mul/div, and 4 Long rd/wr.

Cluso99
11-08-2010, 10:57 PM
Spin will be at least 8x faster (instructions 4x and clock 2x) plus the hub access is 2x. Hub can fetch 16bytes per access so a small cache could be implemented - would need to see the overhead as it may not be effective.

Now, Chip has my faster Interpreter code, together with his own ideas, we could see 20-25% improvement here too. If the decode table was in ROM too we see more cog space as well. And don't forget we can expand to LMM for extra functions. The maths functions I re-coded gave a huge improvement. MUL, DIV and SQRT are now in hardware so this will give a significant speed improvement here too.

All in all, a lot of speed improvement here!!!

william chan
11-09-2010, 02:24 AM
Cluso,

Can you implement local variables in spin to use cog local ram as well in your new spin code.
This would really speed things up.

Bill Henning
11-09-2010, 02:31 AM
Sorry, I disagree.

For an interpreted VM such as Spin, local cog variables cannot be much faster.

Currently a spin op code takes 25us-100us, with an average around 60us.

On Prop2, say it will be 5x+ faster - call it an average of 10us

On Prop2, a hub access will take 8 cycles, which is 50ns, or approx. 1/200th of the opcode execution time.

Even if a local cog access took 0 cycles, it would only speed up instructions by 0.5%

As a local cog access takes 1 cycle, it would cause a speedup of less than 0.5%

Conclusion:

It's better to use any cog space that can be freed up for an LMM engine, and FCACHE area.
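
The estimate above can be reproduced with a rough Python sketch. The 160MHz clock is an assumption (it is the clock at which 8 cycles equal the 50ns quoted), and the model assumes exactly one hub access per Spin opcode; the 10us opcode time and the cycle counts are the figures from the post.

```python
CLOCK_HZ = 160e6    # assumed clock: 8 cycles / 160 MHz = 50 ns
OPCODE_S = 10e-6    # estimated average Prop2 Spin opcode time from the post
HUB_CYCLES = 8      # hub access cost quoted in the post
COG_CYCLES = 1      # cog RAM access cost quoted in the post

def saving_fraction(cycles_saved: int) -> float:
    """Fraction of one opcode's run time removed by saving this many cycles."""
    return (cycles_saved / CLOCK_HZ) / OPCODE_S

best_case = saving_fraction(HUB_CYCLES)               # cog access were free
realistic = saving_fraction(HUB_CYCLES - COG_CYCLES)  # cog access costs 1 cycle

print(f"best case {best_case:.2%}, realistic {realistic:.2%}")
```

Both results stay under half a percent, which is the basis for preferring to spend freed cog space on an LMM engine and FCACHE instead.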


Cluso,

Can you implement local variables in spin to use cog local ram as well in your new spin code.
This would really speed things up.

Cluso99
11-09-2010, 04:14 AM
I am in agreement with Bill. There are much better ways to improve performance.

However, fast overlaying of some routines may now outperform LMM.

jazzed
11-09-2010, 04:54 AM
However, fast overlaying of some routines may now outperform LMM.
True if you always know the size of the routine to be used :)

With rdquad/wrquad or whatever it's called at least more instructions/data can be read into the COG per HUB access cycle. With that, more instructions between slots, and addition of the sorely missed indexed indirect access, maybe a 16MIPS+ aggregate LMM is possible. In any case, I think exciting times are ahead.

turbosupra
11-09-2010, 01:05 PM
How about the cnt function, is it still 32 bit?

Bill Henning
11-09-2010, 04:23 PM
Agreed - this is going to be fun...

Also, all virtual machines can also be re-written to be *MUCH* faster as well.


True if you always know the size of the routine to be used :)

With rdquad/wrquad or whatever it's called at least more instructions/data can be read into the COG per HUB access cycle. With that, more instructions between slots, and addition of the sorely missed indexed indirect access, maybe a 16MIPS+ aggregate LMM is possible. In any case, I think exciting times are ahead.

Bobb Fwed
11-09-2010, 04:32 PM
How about the cnt function, is it still 32 bit?
I don't see why they would change that. 32 bits is actually way more than you get on a lot of other micros, and even at the faster speed, it only rolls over every 26.8 seconds. Plus, they haven't mentioned a change.

A 64-bit cnt would be nice, and now that I think about it, seems like it would be simple and quite cool. If they did a 64-bit cnt, you could use the lower 32-bits just as the cnt is used now, but also access the upper 32-bits that would roll over every 3653-years. Maybe that's a bit overkill, but what the hay.
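
Both rollover figures can be checked with a short Python sketch. The 160MHz system clock is an assumption inferred from the 26.8 second figure; the year length of 365.25 days is also a choice made for this example.

```python
CLOCK_HZ = 160e6                   # assumed Prop 2 system clock
SECONDS_PER_YEAR = 365.25 * 86400  # Julian year, chosen for the example

def rollover_seconds(bits: int, clock_hz: float = CLOCK_HZ) -> float:
    """Time for a free-running counter of the given width to wrap around."""
    return (2 ** bits) / clock_hz

cnt32 = rollover_seconds(32)
cnt64_years = rollover_seconds(64) / SECONDS_PER_YEAR

print(f"32-bit: {cnt32:.1f} s, 64-bit: {cnt64_years:.0f} years")
```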

Cluso99
11-09-2010, 05:18 PM
Agreed - this is going to be fun...

Also, all virtual machines can also be re-written to be *MUCH* faster as well.


Bill: And imagine the speed we can get for the Floppy/HardDisk running in the 32MB SDRAM :smilewinkgrin:

Harley
11-11-2010, 06:02 PM
One of Chip's comments was about SDRAMs

SDRAM hooks up very easily for ~$3 of 32MB RAM external.
Searching Digi-Key I wasn't able to find anything at such a price. What might this SDRAM part number be? Anyone have an idea of such SDRAMs?

I note there were various organizations, from 4-bit wide to 32. I've not paid much attention to SDRAMs in the past; anyone using such parts? I need to bring my info up-to-date as it may be soon (about a year?) that Prop 2 might be available. Thanks in advance for any clues.

Bill Henning
11-11-2010, 06:41 PM
Drooling already...


Bill: And imagine the speed we can get for the Floppy/HardDisk running in the 32MB SDRAM :smilewinkgrin:

Harley
11-12-2010, 01:16 AM
OK! Nick McClick's info on 'jazzed's SDRAM board (thread NEW: SDRAM Module for Propeller Platform) explained a lot for me.

But his SDRAM part isn't a '~$3' part; more like $6+ in 1000's quantity. I wonder if Chip was thinking of Parallax's getting many thousands qty pricing?

Anyway, Now I realize the 'D' in SDRAM implies 'dynamic'. And now have one part number to work from.

If anyone knows of Chip's '$~3' SDRAM please let me know of such part number!

Roy Eltham
11-12-2010, 01:35 AM
Harley,
Here is an SDRAM chip in the Mouser online catalog that is $3.50 for 1. (under $3 for 100)
http://www.mouser.com/ProductDetail/ISSI/IS42S16400D-7TL/?qs=sGAEpiMZZMti5BT4iPSEnaWNh6hsejH%252bZrjO6t%252 btnf0%3d

They are often just called DRAM, but if you look at the data sheet (linked on that page) it is actually an SDRAM (Synchronous Dynamic RAM).

Harley
11-12-2010, 02:09 AM
Thanks Roy,

Mouser says it is a
DRAM 64M 4Mx16 143Mhz part

I thought Chip was referring to a 32 MB SDRAM, though he never gave a part number. He mentioned a x16 or x32 interface plus some 20 control lines. That's a lot of control lines; maybe it included refresh addressing?

I'd sure like to know about the part number Chip was referring to. Anyone know?

turbosupra
11-12-2010, 03:10 AM
Bobb,

Exactly, if you could count 3600 years or even 1800 years, no more count problems and confusion. The cnt feature has really been a PIMA.

It can't be that difficult to use two registers and have a 64 bit cnt feature ... but what do I know?



I don't see why they would change that. 32 bits is actually way more than you get on a lot of other micros, and even at the faster speed, it only rolls over every 26.8 seconds. Plus, they haven't mentioned a change.

A 64-bit cnt would be nice, and now that I think about it, seems like it would be simple and quite cool. If they did a 64-bit cnt, you could use the lower 32-bits just as the cnt is used now, but also access the upper 32-bits that would roll over every 3653-years. Maybe that's a bit overkill, but what the hay.

whicker
11-13-2010, 10:03 AM
Bobb,

Exactly, if you could count 3600 years or even 1800 years, no more count problems and confusion. The cnt feature has really been a PIMA.

It can't be that difficult to use two registers and have a 64 bit cnt feature ... but what do I know?

Then, such a program will read the upper and lower cnt registers at different times (this is still a 32 bit CPU after all), also generating strange bugs that only happen extremely sporadically.

Think about the Rollover from $FFFF_FFFF back to 0 on the lower cnt register just after reading the upper cnt register, now the upper cnt register is actually a count higher than just was read... combine that with the lower cnt, and we've gone back in time apparently.

Really, there's no clean way. There would have to be a "freeze timer" instruction. Ugly.

Yes. Dealing with rollover (basically modulus math) kind of stinks because they don't really teach it in school, other than the little bit in grade school with the minute and hour hands on clocks.
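
For the ordinary 32-bit counter, the modulus math mentioned above comes down to one masked subtraction. The Python sketch below is illustrative only: the function name is invented, and the result is correct only when the true interval is shorter than one full wrap (2^32 ticks).

```python
MASK32 = 0xFFFF_FFFF

def elapsed_ticks(start: int, now: int) -> int:
    """Ticks between two reads of a free-running 32-bit counter.
    Subtracting modulo 2**32 gives the right answer across a rollover."""
    return (now - start) & MASK32

before_wrap = 0xFFFF_FFF0   # read shortly before the counter wraps
after_wrap = 0x0000_0010    # read shortly after it wraps to zero

print(elapsed_ticks(before_wrap, after_wrap))  # 32 ticks, not a negative number
```

On the Propeller itself the same effect falls out of ordinary 32-bit subtraction, since the registers wrap the same way.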

Cluso99
11-13-2010, 10:20 AM
No problems with the rollover. Read the upper long first, then the lower and if the lower is say <20, re-read the upper to ensure you got it correct.

Ding-Batty
11-13-2010, 04:31 PM
I prefer the following approach (pseudo-code):


hicount_0 := cnt64_high
locount := cnt64_low
while ( (hicount_1 := cnt64_high) <> hicount_0 )
    hicount_0 := hicount_1
    locount := cnt64_low
' now hicount_1 and locount are consistent (no rollover problem in hicount)

If two successive reads of the high-order bits are the same, then the low-order bits read in-between cannot have caused a roll-over.
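
The read-until-stable idea can be exercised in Python against a simulated counter. Everything here is hypothetical: no 64-bit cnt has been announced, and `FakeCounter64`, `read_high`, `read_low` and `read_cnt64` are names invented for this sketch. The fake counter advances a few ticks on every read to mimic time passing between instructions.

```python
class FakeCounter64:
    """Stand-in for a hypothetical 64-bit counter read as two 32-bit halves."""

    def __init__(self, value: int, ticks_per_read: int = 4):
        self.value = value
        self.ticks_per_read = ticks_per_read

    def _tick(self) -> None:
        # Time passes between reads, so the counter keeps running.
        self.value = (self.value + self.ticks_per_read) & 0xFFFF_FFFF_FFFF_FFFF

    def read_high(self) -> int:
        self._tick()
        return self.value >> 32

    def read_low(self) -> int:
        self._tick()
        return self.value & 0xFFFF_FFFF


def read_cnt64(counter: FakeCounter64) -> int:
    """Read high, low, then high again; retry until both high reads agree."""
    hi = counter.read_high()
    lo = counter.read_low()
    while True:
        hi_again = counter.read_high()
        if hi_again == hi:
            return (hi << 32) | lo
        hi = hi_again            # low half rolled over: take the new high
        lo = counter.read_low()  # and re-read the low half to match it


start = 0x0000_0001_FFFF_FFFA   # low half is about to roll over
counter = FakeCounter64(start)
value = read_cnt64(counter)
print(hex(value))
```

A naive high-then-low read from the same starting point would combine the old high half with the new low half and appear to jump backwards; the loop retries once and returns a value that lies between the start and end of the read.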

jazzed
11-13-2010, 05:39 PM
OK! Nick McClick's info on 'jazzed's SDRAM board (thread NEW: SDRAM Module for Propeller Platform) explained a lot for me.

But his SDRAM part isn't a '~$3' part; more like $6+ in 1000's quantity. I wonder if Chip was thinking of Parallax's getting many thousands qty pricing?

Anyway, Now I realize the 'D' in SDRAM implies 'dynamic'. And now have one part number to work from.

If anyone knows of Chip's '$~3' SDRAM please let me know of such part number!

Try this:

http://search.digikey.com/scripts/DkSearch/dksus.dll?Detail&name=675-1022-1-ND

64MB for $6.00

The catch is it's DDR and not available in 3.3V.
That's why Propeller2 has QVDD to change IO voltages.

Roy Eltham
11-13-2010, 06:08 PM
jazzed,

Will the SDRAM stuff that Chip has discussed for the Prop 2 work with DDR memory? DDR signalling is different...

Harley
11-13-2010, 06:28 PM
jazzed, Thanks for this info.

I looked this part up. It is Qimonda's HYI25DSD512160CE-5, which is organized as 64M x 16 bit width. $6, and there was a note "Digi-Key has discontinued this item, limited quantity available". Don't know if other distributors also have done so, but don't think I'd want to design around this part.

I'd guess a dynamic, rather than a static, RAM might imply a bit of extra hardware like a latch for holding address info. If totally not requiring extra hardware, hopefully Parallax might provide info on the external RAM hook-up before Prop 2 is available. I'd guess many of us would like a lot more info to get started on designs using Prop 2.

jazzed
11-13-2010, 06:38 PM
jazzed,

Will the SDRAM stuff that Chip has discussed for the Prop 2 work with DDR memory? DDR signalling is different...

The DDR signalling I've looked at is very similar to SDRAM. The command signal set is identical. The mode register settings are different to allow data access window tuning. Data is accessed on both clock edges for DDR and one edge for SDRAM.

Today's Propeller would not benefit from double data clock edges.

jazzed
11-13-2010, 06:50 PM
Don't know if other distributors also have done so, but don't think I'd want to design around this part.
Of course not, but that is just one example. Some other manufacturer will fill the need. The other alternatives are not much more expensive.



I'd guess a dynamic, rather than a static, RAM might imply a bit of extra hardware like a latch for holding address info. If totally not requiring extra hardware, hopefully Parallax might provide info on the external RAM hook-up before Prop 2 is available. I'd guess many of us would like a lot more info to get started on designs using Prop 2.
I would like more info too, but I doubt it will be much different from today. There is a special SDRAM clock counter - perhaps in addition to current counters.

With the number of pins available on Prop2 latches will be optional. You could do an entire 8 bit interface with 28 pins unlatched for up to 128MB SDRAM with today's technology - with cumbersome address line sharing other devices could be added. With Prop2 you get the bonus of 16 (36pins) or 32 (42pins) bit data which can work wonderfully for any application with a cache.

Cluso99
11-13-2010, 08:22 PM
I am sure Parallax will give us plenty of notice to design our "memory" based boards. There is no point in designing yet.

We will have to be more careful with SDRAM as it has a much shorter life than SRAM. I will not be looking for SDRAM chips until the PropII is close - it's still 6-12+ mths away.

jazzed
11-13-2010, 09:10 PM
We will have to be more careful with SDRAM as it has a much shorter life than SRAM. I will not be looking for SDRAM chips until the PropII is close - it's still 6-12+ mths away.
Wow, we better dump all our PCs now while we have a chance :)

Harley
11-13-2010, 09:11 PM
@ Cluso99, Thanks, I didn't know that SDRAM parts had a shorter life.

@ jazzed, The points you brought up were one reason why I was curious. I didn't want to just 'dial into' any SDRAM, in case there were too many differences. Though I suppose looking into any one of 32M x ?? size would be helpful.


I realize it is a bit too soon to do too much on any Prop 2 design. That really wasn't my point.

It was just that for Chip's remark about ~$3 SDRAM I couldn't find ANY. So I was hoping he or someone had part number(s) for such 'animals'. Twice that price I could find unless it would be huge quantity pricing.

I tried searching Digi-Key for SRAM parts in 32M x ?? size; there don't seem to be any. I didn't go to any other distributor as I too can wait. Though, I've not dealt with RAM of any sort in the last decade or so. So I just wanted to look at any data sheets available for such a part.

Thanks guys...

jmg
11-13-2010, 09:40 PM
It was just that for Chip's remark about ~$3 SDRAM I couldn't find ANY. So I was hoping he or someone had part number(s) for such 'animals'. Twice that price I could find unless it would be huge quantity pricing.

I tried searching Digi-Key for SRAM parts in 32M x ?? size;

Try Digi-Key's new price search. I get a page full under $4, and avoiding the smaller ones, which will be nearing EOL, there is (for example)

W9412G6JH-5-ND 128MBit 100+ 1.81260
IS42S16400D-7TL 64MBit 100+ 2.89050

SDRAM needs a refresh cycle, so is not as invisible as SRAM, but you can see it is much cheaper.

The design-life times of SDRAM have tended to be shorter, but I think when the industry settles on an embedded size, that life-time will increase.

Embedded is also seeing SDR (Single Data Rate) DRAM; see ISSI's website for
3.3V SDR (Single Data Rate) Synchronous DRAM

(i.e. the very largest, PC usage ones will continue to follow that wave, but embedded apps will be more stable - the 128MBit Winbond one may be a candidate?).

jazzed
11-13-2010, 10:51 PM
... I didn't know that SDRAM parts had a shorter life.
Yah, it's only been around since 1993 :) DDR will be around much longer.

The highest SRAM density per chip I've found is the Cypress 2Mx8 - of course that's $20 a pop. The same price will get you 128MB of SDRAM.

@jmg,
The ISSI SDR 32Mx8 is what I use on the GadgetGangster SDRAM Module.
Refresh is pretty easy to do. SDRAM or DDR is the way to go guys.

Cluso99
11-14-2010, 03:08 AM
Future Electronics are a better source and there are a couple of 8Mx16 and 16Mx16 which are <$3. IIRC the package was TSSOPII-56 with 0.8mm spacing which is easy to solder.

Of course, another idea may be to put a DIMM socket on a pcb and use old laptop DDRs. 1GB anyone??? Note: Check the voltage - must be 3V-3V3 otherwise the conversion is a nightmare.

jazzed
11-14-2010, 03:42 AM
I have some DIMM on my desk now that has 16 TSOPII-66 DDR 32Mx8 chips on it. Yes, I've thought about the DIMM idea just a tiny little bit.

SODIMM used in laptops is small enough to be a reasonable solution. The normal size DIMM is monstrous, but is cheaper. DIMM would allow flexibility and hardware re-use.

markaeric
11-14-2010, 09:50 AM
Would using a dimm require more pins and logic over a single sd ram chip?

jazzed
11-14-2010, 05:23 PM
Would using a dimm require more pins and logic over a single sd ram chip?
Yes.

One issue is that most PC3200 DIMM have 64 bits. To use all 64 bits 16 extra connections are required for chip selects. To use a 32 bit only interface (half the DIMM capacity) requires 8 extra connections.

It is not clear at this point how much the extra effort, flexibility, and FAB cost is worth. The DIMM preferences do seem to change with market conditions. The chips themselves are stable.

Added:

There is a 144 pin DDR2 SODIMM standard that provides 32 bit data. The only compatible product that I found quickly was a Dell Printer DIMM for $80

Newegg.com and many other sites apparently still sell the old PC100 and PC133 144 pin SDR SODIMM modules. Like the DIMM for 32 bit data, 8 extra pins are required for byte-lane enables. I have several of these left over from battered old laptops.

hinv
11-14-2010, 05:27 PM
I have another question for Chip.

Is the 32bit "pipe" between the cogs just like pins without the pins? This to me would seem the most flexible.
On that note; since there are planned 92 iopins on the package, can we have the missing 4 pins (96-92) just like the 32bit "pipe" between the cogs so we could use them for handshake?
That way, we could implement 2 16bit buses between cogs with handshaking, or just use 2 of them to have a full 32bit data bus between two cogs that need high bandwidth.

Thanks,
Doug

Sapieha
11-14-2010, 06:20 PM
Hi. hinv

On Your question --- I think the answer is YES.

But I have a question for Chip:
Can even these pins be used internally by the user?


CHIP: Forgot to mention... 3 ports, minus 4 pins, are implemented, the last 'port' is the cog pipe, where each cog can filter what he's seeing from the other cogs' pipe outputs.



I have another question for Chip.

Is the 32bit "pipe" between the cogs just like pins without the pins? This to me would seem the most flexible.
On that note; since there are planned 92 iopins on the package, can we have the missing 4 pins (96-92) just like the 32bit "pipe" between the cogs so we could use them for handshake?
That way, we could implement 2 16bit buses between cogs with handshaking, or just use 2 of them to have a full 32bit data bus between two cogs that need high bandwidth.

Thanks,
Doug

Harley
11-14-2010, 09:25 PM
Re: 'Pipes', even though I read all (I hope) of that 'Friday w/Chip' session covered, can anyone explain how a pipe is helpful?

In my mind I picture 8 cogs with lines between each one. Cog 0 has lines to Cogs 1..7, etc. But presently it seems a cog doesn't 'KNOW' what cog# 'he' is. So how is he going to know in advance of run time to know which pipe to listen to or post to? And is this one LONG that gets written/read? Like when PAR is used (sort of!)?

I suppose there will be some instruction to read/write to a pipe? "Oh, I see cog n pipe value; have you seen my pipe?" Now cogs sort of can talk to each other, rather than through the HUB.

Cluso99
11-14-2010, 11:53 PM
The "pipe" is effectively just a register.

Each cog knows what cog it is at run-time.

From a previous discussion Chip said there will be special instructions.

I suggest you look at the old very long thread for a better discussion.

potatohead
11-15-2010, 12:02 AM
Seems to me, the general concept of somehow mux'ing the pipes will keep objects portable. That was the intent Chip seemed most interested in preserving. At that time, Kye registered some concern, that I share, over dedicated resources introducing dependencies. Chip acknowledged that, and I suspect he's addressed it, or the pipes would not be there.

Why have them? Well, some trick stuff has been done with the lower pins, and it's shown how COGs being able to communicate can significantly improve their ability to work together. I'm all for it, so long as we don't have some ugly kludges to work out the sharing of the pipes. Might as well just have the HUB, IMHO. It's gonna be a lot faster anyway.

As compelling as that is, just having COGs be COGs is more compelling, IMHO.

markaeric
11-15-2010, 01:03 AM
Re: 'Pipes', even though I read all (I hope) of that 'Friday w/Chip' session covered, can anyone explain how a pipe is helpful?

In my mind I picture 8 cogs with lines between each one. Cog 0 has lines to Cogs 1..7, etc. But presently it seems a cog doesn't 'KNOW' what cog# 'he' is. So how is he going to know in advance of run time to know which pipe to listen to or post to? And is this one LONG that gets written/read? Like when PAR is used (sort of!)?

I suppose there will be some instruction to read/write to a pipe? "Oh, I see cog n pipe value; have you seen my pipe?" Now cogs sort of can talk to each other, rather than through the HUB.


Unless I'm misunderstanding something, I don't see why knowing what cog is running what object is important for the sake of using the pipe. Since the object will be specifying what bits of the pipe it will be using for tx/rx, cog# is irrelevant because all cogs are sharing a single 32-bit wide pipe (I'm assuming).


@jazzed

Pardon my ignorance, but is the extra logic required to operate a dimm strictly for addressing the individual memory chips? If so, I presume some fairly simple external circuitry along the lines of an incremental counter + shift registers could be used to minimize the additional prop pins that would be required, correct?

While using dimms might not be an ideal solution for an embedded device of any volume, I think incorporating a dimm port on a project board would be awesome just for the sake of memory capacity flexibility.

jazzed
11-15-2010, 01:55 AM
Pardon my ignorance, but is the extra logic required to operate a dimm strictly for addressing the individual memory chips? If so, I presume some fairly simple external circuitry along the lines of an incremental counter + shift registers could be used to minimize the additional prop pins that would be required, correct?
Yes. Looking at the schematics, it's just a matter of using the right DM data masks for each byte lane chip - that's a 3 to 8 74LVT138, so only 3 extra address pins are necessary.
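
How 3 pins can pick one of 8 byte lanes is shown by the small Python model below of a '138-style 3-to-8 decoder with active-low outputs. It is a sketch only: the function name is made up, the enable inputs are ignored, and how the outputs would actually be wired to the DIMM's DM lines is not specified in the thread.

```python
def decode_3to8(select: int) -> int:
    """Return the 8 output bits of a 3-to-8 decoder with active-low outputs:
    the selected line is 0, the other seven are 1 (enable pins ignored)."""
    return 0xFF & ~(1 << (select & 0b111))

for lane in range(8):
    print(lane, format(decode_3to8(lane), "08b"))
```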


While using dimms might not be an ideal solution for an embedded device of any volume, I think incorporating a dimm port on a project board would be awesome just for the sake of memory capacity flexibility.
I agree to a point. Like I said before, SODIMM is the most likely cost effective candidate for sockets because of the small physical size. I'm not crazy about the right-angle SODIMM SMT connectors though. Most straight up DIMMs are through-hole, which is easier to manufacture.

PC100 and PC133 SDR SDRAM 144 pin SODIMM are still generally available apparently. Just google this "pc133 sodimm" ....

Cluso99
11-15-2010, 02:24 AM
jazzed: I think the smaller SODIMM would be better. As long as it requires 3V3 then all would be good. The unused pins/chips or whatever could be tied or driven by a latch. I was thinking you may just end up using a single SDRAM chip on the pcb and the others disabled. There are plenty around so I am sure people could find them cheaply.

Sapieha
11-15-2010, 02:31 AM
Hi potatohead.

As I understand it, the "pipe" is simply a 32-bit I/O port without external pins, and as such it is not a dedicated resource. ALL COGs see it, and all COGs can write to and read from it.
The only thing I'm not sure of is whether it can be divided into smaller portions so that more than 2 COGs can use it at the same time in both directions.



Seems to me, the general concept of somehow mux'ing the pipes will keep objects portable. That was the intent Chip seemed most interested in preserving. At that time, Kye registered some concern, that I share, over dedicated resources introducing dependencies. Chip acknowledged that, and I suspect he's addressed it, or the pipes would not be there.

Why have them? Well, some trick stuff has been done with the lower pins, and it's shown how COGs being able to communicate can significantly improve their ability to work together. I'm all for it, so long as we don't have some ugly kludges to work out the sharing of the pipes. Might as well just have the HUB, IMHO. It's gonna be a lot faster anyway.

As compelling as that is, just having COGs be COGs is more compelling, IMHO.

Roy Eltham
11-15-2010, 03:56 AM
Regarding the "pipe" between cogs.

It is my understanding that they are not full-fledged ports that just don't connect out of the chip. That would waste a lot of space. So they don't have things like ADC/DAC, SDRAM mode, and so on.

potatohead
11-15-2010, 03:59 AM
That was mine as well.

Kal_Zakkath
11-15-2010, 04:06 AM
Sapieha, I originally thought the same thing (single 32-bit port the same as pins but without physical pins, common to all cogs), but Chip's recent answers indicate otherwise:


CHIP: Forgot to mention... 3 ports, minus 4 pins, are implemented, the last 'port' is the cog pipe, where each cog can filter what he's seeing from the other cogs' pipe outputs.


I imagine this will be similar to the current prop's OUT register - currently (prop I) each cog has its own register, which are then OR'd together for the physical pin's state. It sounds like in the prop II each cog will have its own register for the internal 'pipe' but can choose which other cogs it is OR'd with.
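
A minimal sketch of that guess, in Python: each cog has its own 32-bit pipe output register, and a reader sees the OR of whichever cogs it chooses to listen to. This is only a model of the idea as described in this post, not confirmed Prop II behaviour; `pipe_view` and `listen_to` are names invented for the example.

```python
NUM_COGS = 8
MASK32 = 0xFFFFFFFF


def pipe_view(outputs, listen_to):
    """Return the 32-bit value one cog would see on the pipe.

    outputs:   list of 8 per-cog 32-bit output registers.
    listen_to: iterable of cog numbers whose outputs this cog lets through.
    """
    seen = 0
    for cog in listen_to:
        seen |= outputs[cog] & MASK32
    return seen


outputs = [0] * NUM_COGS
outputs[1] = 0x0000000F   # cog 1 drives the low nibble
outputs[2] = 0x000000F0   # cog 2 drives the next nibble
outputs[5] = 0xFF000000   # cog 5 drives the top byte

print(hex(pipe_view(outputs, [1, 2])))     # only cogs 1 and 2 let through
print(hex(pipe_view(outputs, range(8))))   # every cog OR'd together
```

Listening to every cog gives the same result as the Prop I style "everything OR'd together" behaviour; a narrower list is the filtering.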

As for what it can be used for (Harley's question), one simple answer is easy signalling between cogs as they can WAITPNE/WAITPEQ on the "pipe", the more advanced answer being high-speed long-wide comms between cogs - keep in mind that the cogs are automatically clock-sync'd (because they're all running on the same silicon, using the same internal clock) so you would not need any poll-reply or send-ack-send overhead (though presumably you would need some initial handshaking to set things up via the hub, sharing cog #'s etc).

Roy: It sounds like (someone can feel free to correct me if I'm wrong here) the counters live in the pins rather than the cogs now, and these are used to give the ADC/DAC/RAM etc. That being the case, I suspect you are right and that the internal port will not have them, seeing as many of the things won't be applicable to an internal-only port.

Random note on the internal port:
Chip said the "last" port is the internal one. Assuming the ports are numbered 0..3 (0, 1, and 2 being the ports with physical pins) port #3 would be the internal one. Here's my thought: wouldn't it be better to make port 0 the internal one? Not a huge deal, just means that if, on the off chance, a prop comes along with more ports you can add them without confusing the numbering scheme (or having to alter your code).

hinv
11-15-2010, 06:33 AM
@Kal: What I proposed is that the third physical port have its last 4 pins accessible via all of the cogs, even though there are not enough pins in the package to fit them (only 92 of 96 will come out). That way, handshaking can be done on these 4 leftover pins instead of the hub. This should be much faster via WAITPNE or WAITPEQ.

Kal_Zakkath
11-15-2010, 08:05 PM
hinv: I think that would cause more problems than it solves.

Firstly, if internal pins are not 'full class citizens' then you'll end up with a port that is not homogeneous (goes against prop philosophy). Or, you end up with internal pins that have functionality that makes no sense (an internal ADC for example), which wastes silicon that could be used for more HUB. There's also the thought I mentioned that if the prop II winds up in a larger package one day (could be unlikely, but hey, we programmers love needless future-proofing ;)) then those pins will no longer be internal.

At any rate, keep in mind that the internal port will allow cogs to (I assume) communicate at clock rate (i.e. 160 MHz). Handshaking only needs to be done at the start of the operation, so speeding that up will result in minimal improvement. If there is any overhead required (i.e. poll-reply, WAITPNE, etc.) you might as well go via the hub, since it now supports quad-long reads, meaning that you can either transfer 32 bits per clock (internal port) or 128 bits per 8 clocks (averaging 32 bits per 2 clocks).
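
To put numbers on that comparison, here is a quick back-of-envelope calculation in Python. It assumes the 160 MHz clock and the per-transfer figures given above, and it ignores setup and handshaking time; `bytes_per_second` is just a helper name for the sketch.

```python
CLOCK_HZ = 160_000_000  # assumed system clock, from the 160 MHz figure above


def bytes_per_second(bits_per_transfer, clocks_per_transfer, clock_hz=CLOCK_HZ):
    """Sustained throughput in bytes/s for a fixed-size transfer repeated back to back."""
    return bits_per_transfer * clock_hz // (clocks_per_transfer * 8)


pipe = bytes_per_second(32, 1)    # internal port: one long per clock
hub = bytes_per_second(128, 8)    # hub: one quad-long per 8-clock window

print(pipe // 10**6, "MB/s via the internal port")
print(hub // 10**6, "MB/s via hub quad-long reads")
print("ratio:", pipe / hub)
```

Under these assumptions the internal port comes out at twice the hub's sustained rate, which matches the "32 bits per clock versus 32 bits per 2 clocks" figure.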

Unless there's something I'm missing here.

hinv
11-15-2010, 08:49 PM
@Kal,

I understand what you are saying about the synchronous nature of using the internal port, and getting twice the bandwidth, but what about latency? After all, it is latency that ties up a lot of time, which is why caches exist. With 4 more internal "pins" we could use WAITPNE to trigger a transfer, which I would guess would be 4 cycles max, rather than 7 to 22 for the hub.
As for the non-homogeneous 4 pins...I don't really care if they are full featured or not..except they take up more space if they are full featured. Because of some of the neat features of the pins, I am guessing that some pretty neat tricks can be done with full featured internal pins like sine based trig.....you will have to talk to PhiPi or Chip about the possibilities. OK, not knowing the cost, I want those missing pins to be full featured ;^). How much space can they possibly take up?...there's already 92 of them planned. I think we have me convinced. (As if that matters ;^)
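
For scale, the cycle counts in the latency argument above can be converted to time. This Python sketch assumes the 160 MHz clock quoted earlier in the thread; the 4-cycle WAITPNE figure is the guess made above, not a measured value.

```python
CLOCK_HZ = 160_000_000  # assumed clock; the 160 MHz figure quoted earlier in the thread


def cycles_to_ns(cycles, clock_hz=CLOCK_HZ):
    """Convert a cycle count to nanoseconds at the given clock rate."""
    return cycles * 1e9 / clock_hz


for label, cycles in [("WAITPNE wake-up (guess)", 4),
                      ("hub access, best case", 7),
                      ("hub access, worst case", 22)]:
    print(f"{label}: {cycles} cycles = {cycles_to_ns(cycles):.2f} ns")
```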

If you build it they will come.

Doug

Harley
11-15-2010, 08:54 PM
Back to those 'pipes': does anyone know if it is a parallel interface or a serial one that cogs use to communicate with other cogs?

And, does anyone know if these pipes can be used by all cogs at the same instant?

I wish Chip had communicated how the pipes actually get used. 8 cogs and 32-bit wide pipe implies only 4 lines/cog; or is this a wrong use? Is it just a 'register' that all cogs can read/write? Somebody please unclog my thinking on this (small) concept.

markaeric
11-15-2010, 10:02 PM
Is it just a 'register' that all cogs can read/write? Somebody please unclog my thinking on this (small) concept.


This is my assumption as well. So it should be up to the programmer to decide how to implement the pipe width and comm protocol.

Cluso99
11-15-2010, 11:15 PM
Pipes:

Obviously we will have to wait and see. However, had we had the internal connections for portB in the Prop 1 we could have used this extensively.

Now most programs would likely only have comms between 2 cogs, but the option is here for doing anything we want.

The pins have very complex things on them and now use a lot of silicon, so it is not possible for the pins to actually exist in the silicon. So the pipes are just pins from the cog joined - no complex ADC, etc.

As for the counters, etc., I think from later discussions that the counter sections still reside per cog, not per pin as I had guessed. This means that what Chip has said, that high-speed comms can be done between cogs, would hold true. Remember, most programs consist of a main cog while the other cogs do helper functions including any I/O. I think this will continue to be the likely case into the future. Of course, we will be able to off-load a lot more code into the I/O cogs because of the additional hardware in the cogs/pins and speed.

From the register map, it seems the PortD is the pipe and has 2 registers PIN & DIR, just the same as the other 3 ports A, B & C.

Maybe the 4 unused pins of Port C are internally connected, but there will be no ADC etc for these pins.

Basically it is too late to ask for features for the Prop II. First test silicon should be wip as we speak.

So (me included) let's restrict discussion to what is in Prop II. If you want, start a Prop 3 wish list thread, but don't expect anyone to take notice for quite a while (years).

RossH
11-16-2010, 12:33 AM
CHIP: Full instruction set... I'll give to Daniel at Parallax for posting. Or, I'll find out how to get on the forum and do it myself.

Have I missed this? I have kept my eye on this thread, but I have not seen anything turn up yet. Was it posted somewhere else?

Ross.

potatohead
11-16-2010, 12:40 AM
Doubt it's been posted. The test chip will determine some of how the Prop II is built, and those trade-offs will impact instructions.

Harley
11-16-2010, 12:52 AM
Wonder when Parallax got, or will get, test chips. Had that been mentioned yet?

Lots of us are highly interested in seeing the new instructions.

hinv
11-16-2010, 04:28 AM
@cluso99,

The pins I was referring to, which I thought should be full featured pins, are the 4 leftover pins from port C that wouldn't fit in the package. I would agree that port D should be just like Prop1 pins, but internal only.

I don't know where you are getting your information, but last I heard they were waiting for their test chip to come back with only some test blocks laid out. Has the whole chip been laid out already?

Doug

markaeric
11-16-2010, 05:23 AM
I got the impression that the cogs will have two counters as it is now, while the pins themselves will also have one (or more?) - considering Chip was talking about how you can perform various automated functions on the pins with very little cog oversight.




BTW, does anybody know if there is a copy of the whole chat somewhere? I didn't see it on the Savage Circuits site.



EDIT: Reading through the chat again, it does seem to me now that the pins do NOT have their own counters.

eod_punk
11-16-2010, 12:21 PM
The chat starts here http://www.savagecircuits.com/forums/misc.php?do=ccarc&page=46
Chip joined in here http://www.savagecircuits.com/forums/misc.php?do=ccarc&page=31

Cluso99
11-16-2010, 02:18 PM
hinv: Unfortunately it is only the test chip. I am unsure whether Beau has actually sent it out yet without checking the thread.

I don't think there is space for more I/O pins on the die. If they did this it would most likely impact hub space from what Beau has said. The pin area has taken a huge amount of die space.

I would expect that they will test the cog for instruction set working.

There is definitely a lot more stuff in the pins and counters. I guess we will have to wait a little longer to find out all that is in there. One thing you can be assured of, we will be amazed even if it doesn't do all of our pet requirements.

Fingers crossed for successful test silicon.

evanh
11-16-2010, 04:08 PM
The chat starts here http://www.savagecircuits.com/forums/misc.php?do=ccarc&page=46
Chip joined in here http://www.savagecircuits.com/forums/misc.php?do=ccarc&page=31

All I get from those links is the Smilie List.

evanh
11-16-2010, 04:11 PM
I got the impression that the cogs will have two counters as it is now, while the pins themselves will also have one (or more?) - considering Chip was talking about how you can perform various automated functions on the pins with very little cog oversight.
There have to be counters in the I/O blocks. Can't do the ADC/DAC functions without them. How accessible they'll be for using in other ways is another question.

Sapieha
11-16-2010, 06:25 PM
Hi evanh.

You need to be a member to get Chat functions.




All I get from those links is the Smilie List.

SkyFx
11-21-2010, 10:54 PM
Thanks for posting the chatlog. The Propeller 2 becomes more interesting day by day. I mean: a 160 MHz clock and one-cycle instructions sound great on their own! A logic analyzer built with it would be a killer application, as would bit-banging high-speed serial protocols.

As far as I understood, the last "port" is a 4-bit bus (the rest of the last 32 bits not being connected as I/O pins) that all COGs can access like pins, without the hub access schedule (I wonder if it would be possible for one cog to mask it as an input and another as an output). That's a high-speed method to trigger another COG's processing (in contrast to a hub-access-dependent LOCKSET or Rdxxxx/Wrxxxx) or to feed some data directly for extra processing. I guess the "filter it out" statement means it's just a bus that requires some handshaking between COGs, but I guess mostly two COGs needing high-speed sync will have to share data without the Hub RAM scheduler, which means the clock is the handshake.

Wrt the silicon real estate discussion, I understand that the aim is for all I/O pins to have the same functionality; OTOH it could be that only some pins have special functions (saving that space on the others). But I guess having a lot of features available on any pin goes hand in hand with a little extra functionality for special purposes. This makes the Propeller such a great, versatile design.

It's a shame that lots of people tend to use more complex architectures instead of solving a lot of problems by parallel processing. I mean, I too like the small pin-count packages of e.g. AVRs, but when mixing several serial I/O it gets complicated.

wbr,
SkyFX