Propeller II update - BLOG

davidsaunders · 2011-05-14 07:56

Chip:
Is it going to be possible to use the integrated DACs for VGA output, so that we can have a 5-pin VGA solution (1 R, 1 G, 1 B, 1 HS, 1 VS)?
I think that this may be of interest to others as well.

markaeric · 2011-05-14 18:29

I remember reading somewhere Chip saying DVI should be possible.

potatohead · 2011-05-15 09:00

Re: DACs. Yes. Using lots of pins for video output is one of the things being resolved in Prop II.

Re: VGA compared to YCbCr. Coupla things, based on my use of both. For overall quality of display, I don't see a lot of difference. Some HDTV devices do not have VGA, but I've yet to see one that does not have component. IMHO, having a luma only portion, with color expressed as difference signals, could open the door for some tricks not as easily done with straight RGB. High resolution luma, and lower resolution color, for example.
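A minimal sketch of the "high resolution luma, lower resolution color" idea, assuming the standard BT.601 luma weights and a simple 2:1 horizontal chroma average (4:2:2 style). The function names are made up for illustration and nothing here is specific to Prop II hardware.

```python
# Illustrative only: BT.601 weights, with Cb and Cr centered on zero.
KR, KG, KB = 0.299, 0.587, 0.114

def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel (0..255 per channel) to (Y, Cb, Cr)."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / 1.772   # 1.772 = 2 * (1 - KB)
    cr = (r - y) / 1.402   # 1.402 = 2 * (1 - KR)
    return y, cb, cr

def subsample_422(pixels):
    """Keep every luma sample, but average chroma over horizontal pixel pairs."""
    ycc = [rgb_to_ycbcr(*p) for p in pixels]
    luma = [y for y, _, _ in ycc]
    chroma = []
    for i in range(0, len(ycc) - 1, 2):
        (_, cb0, cr0), (_, cb1, cr1) = ycc[i], ycc[i + 1]
        chroma.append(((cb0 + cb1) / 2, (cr0 + cr1) / 2))
    return luma, chroma

# Hypothetical 4-pixel scan line: white, gray, red, blue.
line = [(255, 255, 255), (128, 128, 128), (255, 0, 0), (0, 0, 255)]
luma, chroma = subsample_422(line)
samples_full = 3 * len(line)                 # R, G, B for every pixel
samples_422 = len(luma) + 2 * len(chroma)    # Y per pixel, Cb/Cr per pair
```

For this line, the subsampled form carries 8 samples instead of 12 while keeping full-resolution luma, which is the kind of trade that straight RGB does not offer as directly.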

Re: DVI I saw that too. Took a moment, but could not find it.

The current Prop is basically a few hundred pixel video machine, unless the display is rendering something simplistic like text. Prop II will exceed that by a large factor. Good times for VGA / HDTV users ahead. Component is one of the output formats I've really wanted to do on Prop I, but it takes 3 COGS just to get the signal portion up and running... At TV sweep frequencies, a whole lot is possible, if one wants to dedicate most of a Prop to rendering video, but component capable displays at TV resolution are not easily found. S-video then, is the next best option.

markaeric · 2011-05-15 15:51

Hmm, I thought there was more to the DVI discussion, but all I could find was this:

http://forums.parallax.com/showthread.php?117789-Prop-II-on-chip-development-question&p=859096&viewfull=1#post859096

Edit:
Looking over the DVI wiki, it states:

"Minimum clock frequency: 25.175 MHz"

"Pixels per clock cycle: 1 (single link at 24 bits or less per pixel)"

"Less than 24 bits per pixel is optional."

So it would seem that DVI on the P2 should be theoretically possible, but at a color depth significantly below 24bpp.
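A rough sketch of what the quoted minimum pixel clock means on the wire, assuming the usual single-link TMDS arrangement: three data channels, each sending one 10-bit character per pixel clock. The helper names are illustrative, and whether the P2 pins can reach these rates is not something this thread establishes.

```python
TMDS_BITS_PER_CHAR = 10   # each 8-bit value is encoded as 10 bits on the wire
TMDS_DATA_CHANNELS = 3    # one data channel each for R, G and B on a single link

def tmds_bit_rate_per_channel(pixel_clock_hz):
    """Serial bit rate on one TMDS data channel."""
    return pixel_clock_hz * TMDS_BITS_PER_CHAR

def tmds_total_bit_rate(pixel_clock_hz):
    """Combined serial bit rate across the three data channels."""
    return tmds_bit_rate_per_channel(pixel_clock_hz) * TMDS_DATA_CHANNELS

min_clock = 25_175_000                               # minimum DVI pixel clock quoted above
per_channel = tmds_bit_rate_per_channel(min_clock)   # 251,750,000 bit/s
total = tmds_total_bit_rate(min_clock)               # 755,250,000 bit/s
```

Under these assumptions the per-channel serial rate is ten times the pixel clock, so the minimum clock already implies roughly 252 Mbit/s on each data pair.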

Dave Hein · 2011-05-17 12:50

Maybe I'm asking the wrong questions about graphics capability. What I should be asking are questions about SDRAM access rate and DAC capability. Will the Prop 2 be able to access SDRAM and output analog signals at 8-bit resolution at 1080p60 rates? This will require 3 pins with 8-bit DACs that can update at a speed slightly over 1080x1920x60 = 124 mega-pixels/second. Wikipedia says that the maximum pixel clock rate for HDMI 1.0 is 165 MHz, but I don't know if a slower clock rate can be used, such as 160 MHz. Of course, that would not allow for any cycles to read from SDRAM, so it seems like 1080p60 is not possible unless a cog can transfer data directly into another cog's memory.

So now the question is, can the Prop 2 do 1080p30? This would require reading from SDRAM and then writing to a DAC. So, will Prop 2 be able to read from SDRAM in one cycle, and output the value to a DAC in the next cycle, and repeat this 1,920 times without any gaps? Can the output DACs provide 8-bit resolution at a sample rate of 80 mega-samples/second?

And, just to make it clear, I'm talking about displaying from a full-size frame buffer stored in external memory. I realize that amazing things can be done with sprites, but I'm not asking about that.
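The arithmetic in the two questions above can be sketched as follows. This assumes the 160 MHz clock figure used in the post and a hypothetical two-cycles-per-pixel loop (one SDRAM read, one DAC write); blanking intervals are ignored, so real pixel clocks would be somewhat higher.

```python
CLOCK_HZ = 160_000_000   # clock figure used in the post; not a confirmed spec

def active_pixel_rate(width, height, fps):
    """Active pixels per second, ignoring horizontal and vertical blanking."""
    return width * height * fps

def clock_needed(width, height, fps, cycles_per_pixel):
    """Clock rate needed if every pixel costs a fixed number of cycles."""
    return active_pixel_rate(width, height, fps) * cycles_per_pixel

p60 = active_pixel_rate(1920, 1080, 60)           # 124,416,000 pixels/s
p30 = active_pixel_rate(1920, 1080, 30)           # 62,208,000 pixels/s
p60_two_cycle = clock_needed(1920, 1080, 60, 2)   # 248,832,000 Hz
p30_two_cycle = clock_needed(1920, 1080, 30, 2)   # 124,416,000 Hz
fits_1080p60 = p60_two_cycle <= CLOCK_HZ          # False
fits_1080p30 = p30_two_cycle <= CLOCK_HZ          # True
```

With two cycles per pixel, 1080p60 needs well over 160 MHz while 1080p30 fits, which matches the reasoning in the post.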

Thanks,
Dave

potatohead · 2011-05-17 20:55

I have those questions too. I don't think we know enough yet. The improved hub throughput suggests we might get there. Might be that we have to interleave two or more COGs too.

John A. Zoidberg · 2011-05-17 21:24

Looks like the Propeller II is going to be a very strong one. I can't wait to get my hands on it!

Also, is there a specified release date? Forgive me for being too nosy.

davidsaunders · 2011-05-18 15:38

I thought that I had remembered Beau posting a pinout of the Propeller II somewhere. I cannot find it now. Does anyone have it or know where it was posted?

davidsaunders · 2011-05-18 16:11

Ok.
Is the pinout still represented by:
http://forums.parallax.com/attachment.php?attachmentid=74729&d=1288195867
And is it still planned to be a 128TQFP with 0.5mm pin spacing?

Dave Hein · 2011-05-22 09:18

Were there any updates to the Prop 2 presented at UPEW? It would be good to post them here, or maybe to a new thread on the Prop 2. I read on another thread that the hub RAM will be 192Kbytes and the clock speed will be 120 MHz. Is that correct? Were there any details about the SDRAM interface, and the graphics capabilities of the Prop 2 per the questions I asked earlier in this thread? What are the capabilities of the DACs on each pin -- bit-depth and speed? Any more detail on the instruction set, such as what each instruction does?

potatohead · 2011-05-22 09:33

We did not get the instruction set detail I was hoping for. Mostly a time limitation though.

Chip did confirm refreshing a 1080p display, RGB, or YCbCr, from SDRAM as "in the can", possible.
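A quick sketch of why a full 1080p frame has to live in SDRAM rather than hub RAM, assuming 3 bytes per pixel and the roughly 192 KB hub RAM figure discussed in this thread (which is not final).

```python
HUB_RAM_BYTES = 192 * 1024   # "about 192 KB" per the thread; the final size may differ

def frame_bytes(width, height, bytes_per_pixel):
    """Size of an uncompressed frame buffer in bytes."""
    return width * height * bytes_per_pixel

frame = frame_bytes(1920, 1080, 3)       # 6,220,800 bytes for one 24-bit frame
line = frame_bytes(1920, 1, 3)           # 5,760 bytes for one scan line
lines_in_hub = HUB_RAM_BYTES // line     # 34 whole scan lines
frame_to_hub_ratio = frame / HUB_RAM_BYTES
```

Under these assumptions one frame is about 31.6 times the size of hub RAM, so the hub could hold only a few dozen scan lines at a time and the display would have to be streamed from external memory.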

Confidence level is fairly high on that, from my read on his presentation. RAM will be ABOUT 192 KB, depending on what happens with the synthesized logic block and some minor details involving the boot ROM, which will be formed with the RAM rather than defined as a separate block on chip. It won't be a solid power of two, but will be sized to fill the remaining area on die once things are solidified more.

All the ROM plans go away, with the Prop II simply containing a boot loader program, that can fetch a program image, optionally decrypt, then launch on COG 0. From there, anything goes.

He suggested 120 MHz, instead of 160, though I am struggling to recall exactly why. The video has hard, scratchy audio...

I wanted to know about the math instructions, auto increment / decrement / index, rep, etc... but he did respond favorably to an updated instruction set listing. Maybe more detail there, and maybe somebody there could have a conversation about that. (I would have, had I been able to attend this year. Really miss the all night session)

If I am not mistaken, he also mentioned COGS being able to talk to SDRAM, so that's intriguing too. One thing about the SDRAM idea that is a concern is it being a non-parallel bottleneck. Might not have to be that way, if the pin cost is acceptable. Curious to learn more about that, and/or whether or not another prop could be "the RAM".

I/O has low voltage, slower operation, and higher voltage, fast operation. Comparisons between pins, levels and such can be examined quickly for a lower, coarse bit depth, longer for a very precise one. Lots of features there I would have to review in the video, as I missed that part of it.

Power busses are well thought out to keep overall voltage drop at the center of the die to a minimum, and the 3.3 V is routed only to the I/O section. RCSLOW is still RCSLOW, with Chip talking about making it 1 MHz, due to the minimum power draw being such that speed wouldn't matter anyway. Maybe that might happen.

Package planning is that power and I/O are placed for corner feeds, and so that manual assembly and/or test reduces the risk of shorting things.

COGS have 4 DAC channels that can be output to I/O pin groups. Not sure whether he mentioned the depth and speed though. Those are what I remember from yesterday.

Edit: The big news is the design process changes! All of the core things that make a propeller a propeller are done in house, by Chip and Beau, and those are very near complete. The glue logic that forms the core of the chip is going to be synthesized, trading somewhere around $70k in direct capital for people over three years. The time to market argument on that is a no-brainer, and something I personally am glad to see. It was hinted at a series of devices being far more possible and practical, once this has run its course, with a new device turning into a money problem more than a time problem. Again, very good news.

He mentioned that human optimization of that region of the device would take a very long time to be competitive with the synthesis process they plan to use. Say a complete and shipping Prop II is out the door, and in use. A design variation would then involve whatever custom block was needed, a synthesis run, test, then ship, for a IIa, or something, with a time line on the order of months or a year, as compared to the one we are experiencing right now. We could be moving into exciting times!

Cost $12 or so in smaller unit quantities.

Another Edit: (sorry, my recall is slow this morning it seems)

Roy and Andre' gave Chip considerable input on the basic hardware elements needed for graphics implementation, including alpha and textures. I was the one that mentioned YCbCr, for color space reasons, and because that is on darn near every HD display I've seen, where VGA isn't. (why is beyond me) Baggers and others have tossed in on that over time as well, contributing their graphics experience.

This has happened on a few other things too, like LMM, etc...

So, I suspect the reason why instructions have not been well detailed, is the last few polishes are being put on them. And Chip did mention not being able to finish some of that until the synthesis results are done. They output a great deal of timing info, which may constrain how things work, where an instruction might need a cycle, or a path changed, etc... Those consults probably impact a thing or two, which he explained takes a code change on his dev system, schematic update and vet from Beau, etc... It was implied that can't all be resolved for another few weeks.

jazzed · 2011-05-22 10:25

Adding to potatohead's list:

It has been mentioned that the on chip boot loader will be minimalist. Basically the loader gets code from off-chip storage. The types of storage devices will be SPI and I2C. SPI will be 4 wire like on SD card in SDIO mode (what we use today). A SPI Flash chip can also be used. I2C boot loader storage apparently will be kept. I2C pins can be a subset of SPI and the device is easily discernible because of I2C ACK/NAK.

The highest frequency a COG can generate on a pin using PLL will be near 300MHz.

Some instructions will need to be added to make better general use of the CLUT (Color Look Up Table). The per COG CLUT is 128 longs and will apparently have instructions added to make it behave as a COG data array in addition to a LIFO stack and FIFO queue.
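A toy model of the three access patterns mentioned for the CLUT: indexed array, LIFO stack and FIFO queue over 128 longs. The class and method names are hypothetical and say nothing about the actual instructions, which had not been published at this point in the thread.

```python
SIZE = 128            # longs in the per-COG CLUT, per the post
MASK32 = 0xFFFFFFFF   # a long is 32 bits

class Clut:
    """Plain array access; the index wraps within the 128 longs."""
    def __init__(self):
        self.mem = [0] * SIZE
    def write(self, index, value):
        self.mem[index % SIZE] = value & MASK32
    def read(self, index):
        return self.mem[index % SIZE]

class ClutStack:
    """LIFO use: one pointer; push increments it, pop decrements it."""
    def __init__(self, clut):
        self.clut, self.sp = clut, 0
    def push(self, value):
        if self.sp >= SIZE:
            raise OverflowError("stack full")
        self.clut.write(self.sp, value)
        self.sp += 1
    def pop(self):
        if self.sp == 0:
            raise IndexError("stack empty")
        self.sp -= 1
        return self.clut.read(self.sp)

class ClutQueue:
    """FIFO use: separate read and write pointers that wrap around."""
    def __init__(self, clut):
        self.clut, self.head, self.tail, self.count = clut, 0, 0, 0
    def put(self, value):
        if self.count == SIZE:
            raise OverflowError("queue full")
        self.clut.write(self.tail, value)
        self.tail = (self.tail + 1) % SIZE
        self.count += 1
    def get(self):
        if self.count == 0:
            raise IndexError("queue empty")
        value = self.clut.read(self.head)
        self.head = (self.head + 1) % SIZE
        self.count -= 1
        return value

stack, queue = ClutStack(Clut()), ClutQueue(Clut())
for v in (1, 2, 3):
    stack.push(v)
    queue.put(v)
last_in = stack.pop()    # 3: the stack returns the newest value
first_in = queue.get()   # 1: the queue returns the oldest value
```

The only difference between the two modes is pointer handling: a stack needs one pointer that moves both ways, while a queue needs two pointers that only move forward and wrap.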

Apparently SDRAM in 3.3V packages will be a better fit for the IO since the AC switching characteristics will be faster than for 1.8V. Lower voltage DDR SDRAM can be used as well but will be slower - DDR just means double clock edge really. Going forward DDR will be cheaper than SDRAM because it is more generally available, but SDRAM will remain available at some price.

Chip made some references to COGs having specialty SDRAM abilities but the details are still unclear.

Someone else may have picked up more on all this.

potatohead · 2011-05-22 10:31

Forgot Kye in the list of consults! Glad you mentioned it. Personally, I think that's very encouraging for the overall design. The early adopters are going to see a very favorable design.

Andrey Demenev · 2011-05-22 10:35

I am eager to get my hands on it. Was it mentioned about early samples availability?

potatohead · 2011-05-22 10:47

Engineering samples for testing was something like 40 chips... If they work, then lots of chips are made. If not... another small sample run would be made. That was my take on that discussion. Seems we either get Prop II chips or not. One thing that Chip did mention was the possibility of shipping Prop II chips, sans the code protect feature first, with that to follow, but it was not clear that was the plan, only that could be one potential implication of the design process change to synthesis.

All right you guys: http://www.ustream.tv/recorded/14872589

Time to pay the piper folks! Pick through that scratchy audio, and see what you can find!

(and I'm perfectly happy to have listened to that audio! Thank you very much, as I could not attend this year, and missed that interaction big) It's just that I think I've scraped all I can remember onto the forum. Somebody else will pick up something, so let's get to it! That way we know more, or think we know more...

jazzed · 2011-05-22 10:48

Andrey Demenev wrote: »

I am eager to get my hands on it. Was it mentioned about early samples availability?

I don't know if this is possible at all. It's up to Parallax in all respects of course..

I removed my note about Kye working with Chip because the specific detail of the contribution should be left to him and Chip to discuss publicly ... it was mentioned in Chip's speech I believe that USB would be an area of interest. It will definitely be a valuable contribution though.

Baggers · 2011-05-22 10:50

exciting times ahead

and looking like it's on the horizon too

especially for testing.

potatohead · 2011-05-22 10:52

Yeah, probably good call Jazzed. Let's see what those two come up with. Bet it's good. It is mentioned he's going there, so I'll leave mine, and the details to whatever they decide makes sense.

You know what strikes me is the open nature of it. I seriously benefit from each of these presentations, getting a good, rational, and understandable look at what it takes to make a chip, each time. And the overall interaction / synergy is very cool too. Intriguing, and very educational stuff. I like that Chip is just so humble, doing it, sharing it, and excited about it. It was fun to watch Beau drive the software, zooming in to highlight details as they were discussed. Never experienced that before. Damn cool.

@Baggers: Happy days! I think we've got a much shorter wait now. Exciting indeed! Tap... Tap... Tap... I got the sense from Chip that shipping product matters. What can be done to just start production of Prop II devices is gonna get done. That's a big shift, and an encouraging one, ideally marginalizing the "too late?" thread some!

Baggers · 2011-05-22 10:59

Yeah, it was awesome watching Chip go through it, I can't believe how open he is with it all! Kudos there Chip! I guess this is why we're here, and not with some other chip! He makes us all feel part of the company!

potatohead · 2011-05-22 11:45

Exactly. One of these days, we gotta figure out how to get you to an expo. Jamming all night with Chip around is a lot of fun. Catchy. One thing about open, that I really like, is all the nice positioning and vetting that goes on. When it's done the way Parallax seems to know how to do it, I think it's actually difficult to go wrong.

Heater. · 2011-05-22 12:08

All the ROM plans go away, with the Prop II simply containing a boot loader program,

Wow, is it possible that someone was listening to little old me when I suggested binning most of the PROM and using the space for RAM?

The anticipation is getting unbearable here.

Phil Pilgrim (PhiPi) · 2011-05-22 12:42

Although I know it's hard to find the right balance to suit everyone, I'm sad to see the ROM font and function tables go away. ROM takes up much less room than RAM, and now those who need the data formerly contained in ROM will have to sacrifice a much dearer portion of RAM to contain it.

-Phil

Heater. · 2011-05-22 12:59

Yes, but currently a lot of space is wasted in most Spin programs by those COG PASM areas that are generally only ever loaded to COG once. That could be up to 16K bytes of wasted space.
If the Spin compiler would provide for loading COG PASM code from whatever boot device, starting a COG and then reusing the load space for data, then a lot of that lost ROM data could be recovered by including whatever you need in the compiled program.
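For reference, the "up to 16K bytes" figure can be reproduced from the current Prop's layout: 8 COGs, each with 512 longs of COG RAM, against 32 KB of hub RAM. This is only a rounded upper bound; the top 16 longs of each COG are special-purpose registers, so a real PASM image is slightly smaller.

```python
COGS = 8                    # COGs on the current Propeller
LONGS_PER_COG = 512         # COG RAM size, including 16 special-purpose registers
BYTES_PER_LONG = 4
HUB_RAM_BYTES = 32 * 1024   # hub RAM on the current Propeller

cog_image_bytes = LONGS_PER_COG * BYTES_PER_LONG     # 2,048 bytes per COG image
worst_case_bytes = COGS * cog_image_bytes            # 16,384 bytes if every COG has its own image
fraction_of_hub = worst_case_bytes / HUB_RAM_BYTES   # 0.5
```

In that worst case, half of hub RAM holds code that is copied into a COG once and never read again, which is the space the post suggests reclaiming.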

potatohead · 2011-05-22 13:32

There are math tables in the COG, and instructions to compute with them.

@Heater: Agreed. And, trade-offs can be made. An 8x8 font, for example, is seen as a waste, when that big Parallax font is in ROM. However, getting some of that back as RAM, means being able to do "custom rom", and I think that can be very efficient, given the math in the COGS now. All in all, a trade-off I can't argue with on that basis.

One negative, which I spun as a positive on the Eclipse thread, is the potential for multiple and thus, fragmented operating environments. Prop tool images might work one way, C images another, etc... It's much more the generic CPU now, though still friendly.

Heater. · 2011-05-22 14:05

potatohead,

...multiple and thus, fragmented operating environments.

Hmm... Not sure if that is a positive or a negative. Most micros "suffer" from that and get along just fine.
Of course it helps if everyone can rely on the "One True Spin" and the OBEX, which I'm sure Parallax will continue with. Having multiple Spin versions popping up would tend to muddy the waters a bit but that encourages experimentation and progress.

Having the "One True Spin" and tool open sourced will hopefully enable it to progress more easily with contributions of the best ideas that are bred in those Spin variants.

As for C and other language users I suspect there is not much demand to build hybrid Spin/C programs so whatever happens in those environments is somewhat orthogonal.

potatohead · 2011-05-22 14:08

Yeah, me neither. I just know it's pretty much gonna happen. Could be really great! Probably will be, given our community. Hope so. I really would be concerned with 'the one true SPIN', but then again, on the new chip, a few extensions are bound to happen... Maybe there will be the original one true SPIN, and the current one true SPIN. Easy cheezy. Do you think it would remain exactly the same?

Heater. · 2011-05-22 14:12

potatohead,

One thing about the SDRAM idea that is a concern is it being a non-parallel bottleneck.

Not sure I understand the issue here.

Presumably external RAM is to be accessed by COG code much the same way as it is now but with some hardware assistance. So we end up with only one COG being able to run code from ext RAM efficiently. Or SRAM as a data store is accessed via the COG that drives it.

The other way to do this would be to arrange for ext RAM to be an extension to HUB RAM and accessible to all COGs via the HUB switch. I guess that might have a bad effect on the Prop's deterministic timing though and slow everything down.

potatohead · 2011-05-22 14:48

Baggers and I have discussed this in the context of video related tasks a few times. Where only one COG can operate to the RAM, the ability to use the COGS in parallel gets bottlenecked, turning things into streaming problems, instead of simple parallel ones. Streaming data in and out gets complex in that context.

I'm hoping more than one COG can talk to more than one SDRAM. Then it's a choice, depending on how things need to go. The model works best when the HUB is largest. Clearly we get what we get, and don't get me wrong, the SDRAM option is a great thing for XMM! Huge step up, and very welcome. It's going to do a lot of good on the large program, with COGS as peripherals model.

Yeah, you are right on the other way. IMHO, the next step beyond inclusive HUB RAM is to just make external RAM "the HUB", and apply the round robin behavior in that context, thus turning what is a MCU into a full on CPU.
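A small sketch of the round robin behavior being discussed, to show why it keeps timing deterministic and why slower memory behind it would slow everything down. The 8 COGs and 2 clocks per hub slot match the current Prop; the 6-clock external RAM slot is a made-up number purely for comparison.

```python
def slots_until_turn(cog, current_slot, num_cogs=8):
    """Hub slots a COG waits before its window arrives (0 means it is this COG's turn)."""
    return (cog - current_slot) % num_cogs

def worst_case_wait_slots(num_cogs=8):
    """The longest wait is one full rotation minus the COG's own slot."""
    return num_cogs - 1

def rotation_clocks(num_cogs, clocks_per_slot):
    """Clocks for the hub to visit every COG once."""
    return num_cogs * clocks_per_slot

wait = slots_until_turn(cog=3, current_slot=5)               # 6 slots
hub_rotation = rotation_clocks(8, clocks_per_slot=2)         # 16 clocks on the current Prop
ext_ram_rotation = rotation_clocks(8, clocks_per_slot=6)     # 48 clocks with the hypothetical slot
```

The wait depends only on the slot counter, never on what the other COGs are doing, which is where the determinism comes from. The cost is that every COG pays for the slowest slot, so putting external RAM behind the same rotation would stretch the whole cycle.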

Cluso99 · 2011-05-22 15:07

Great discussion and info guys. Here are a couple of things I heard from the videos...

Each I/O pad has about 6,700 transistors as against about 30 on the Prop I.
Only 120MHz may be realisable because of on chip delays. This is a wait and see from what I gather.
The video can be up to 32b per pixel (8 each for RGB, and 8 for something akin to luminance???) at 1080 YCbCr - I don't know much about video.
Chip is going to add a pop instruction to the Cog CLUT so it can be used as a LIFO stack.
The math tables are redundant in ROM because of the ROM in each COG for cordic (64x40b in ROM, 4x128x8b SIN ROM, etc.).
Lots of other quick maths routines possible in cog.
The font 16x16 is no longer in ROM. (IMHO this is a waste if you don't use it and takes 16KB ROM - I'd rather 4KB hub RAM, which is basically what we are now getting)
2 PLLs per Counter
I/Os should be used at 3V3 for best speed as 1.8V slows response (8pin groups)
Prop II (all except I/O) likely to use 100mA with all cores running (at 1.8V)
Prop II likely to use 1mA when idle due to leakage at this geometry, hence the low clock speed may be raised to 1MHz
The PLL can generate about 300MHz at the pin
Lots of I/O pin info - amazing but cannot recall it all.
92 I/O pins with 4 of those used for booting SPI (I didn't recall the I2C) from SPI Flash such as 25C80 (~$1.50 for 0.5-1MB??) or SD card.

Kye is helping Chip with NRZI and bit stuffing to improve the USB functions. Chip said 12MHz max USB speed.
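For anyone unfamiliar with the two terms, here is a minimal software sketch of USB-style bit stuffing and NRZI encoding: a 0 is inserted after six consecutive 1s, and on the wire a 0 is sent as a level change while a 1 leaves the level unchanged. This only illustrates the encoding and is not a description of how the Prop II hardware will do it.

```python
def bit_stuff(bits, run=6):
    """Insert a 0 after every run of six consecutive 1 bits."""
    out, ones = [], 0
    for b in bits:
        out.append(b)
        if b == 1:
            ones += 1
            if ones == run:
                out.append(0)   # stuffed bit forces a transition after NRZI
                ones = 0
        else:
            ones = 0
    return out

def nrzi_encode(bits, level=1):
    """NRZI as used by USB: 0 toggles the line level, 1 keeps it."""
    out = []
    for b in bits:
        if b == 0:
            level ^= 1
        out.append(level)
    return out

stuffed = bit_stuff([1] * 8)          # [1, 1, 1, 1, 1, 1, 0, 1, 1]
line_levels = nrzi_encode(stuffed)    # [1, 1, 1, 1, 1, 1, 0, 0, 0]
```

Bit stuffing exists because a long run of 1s would otherwise produce no transitions at all after NRZI encoding, leaving the receiver nothing to keep its clock aligned to.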

Best possible case is avail in 3 months.

65 bit Fuses. 1b prevents overwriting and access to the fuses after boot and decryption, 64b for user. Will be shipped with semi-random serial no. Apparently these are reprogrammable?? Unfortunately the test chip failed in this area.

As said above, it may be possible that the PropII could be delivered quicker without code protection, with a new version to follow soon after.
IMHO, if that delivers in 3 mths, go for it please. We could all start working on Prop II while we wait for code protection, if it were to follow shortly. Charge a premium until the unprotected chips are exhausted??

All-in-all, I am really excited. With the new way Parallax are putting the chip together, new variants could be available quite quickly at reasonable costs.

Future... Perhaps that elusive 16 cog and 512KB hub would not be out of the question? The cogs are blocks, and the hub is an extendable block. The internal core logic is contracted and synthesised. That only leaves the pins to be shifted to allow for the increased inner space. Keep the same number of I/O pins so the cog does not require a change. Another idea, perhaps put 2 chips together on the die with the common I/O pins simplified to provide an interface between the props.

Heater. · 2011-05-22 15:19

Cluso,

Each I/O pad has about 6,700 transistors

Amazing, that's a tad more than the Intel 8085 for each pin!!!

Best possible case is avail in 3 months.

How about we can put in some pre-orders now? No code protection required here.
