Propeller II (Chat with CHIP) __ Some extra questions will come

evanh · 2010-11-08 06:44

I believe Chip has said that SDRAM bandwidth can be as fast as Hub bandwidth. I'm getting the feeling that 128KB HubRAM is quite sufficient.

Cog count is more likely to be the next bottleneck.

evanh · 2010-11-08 07:20

Also, kudos to Sapieha for posting the chat.

potatohead · 2010-11-08 07:22

@Heater... Heh, guess I did.

Well, when I learn a new thing now, I do both. I enjoy the tech, and the business end of things. Being self-employed by force a while back taught me that lesson.

Let's just say the bell rang at the school of hard knocks.

In this context, both are really interesting, because I've never had any exposure to the dynamics. Truth is, this little thread has answered a lot of questions I had about why things happen as they do with semiconductors. I've often been puzzled by trade-offs made, thinking, if they had only done... That enriches the hobby on a lot of levels for me. Probably I still don't know squat, but I at least have some plausible musings now. I'll take it!

I agree with Evanh, BTW. I consider all the details necessary to get off-chip RAM operating optimally worth more than a large amount of on-chip RAM. 128K is a lot, compared to what we have now. Honestly, the thing could use more pins too, but now we know the dynamics that make that not a viable option right now. IMHO, the parallel nature of Propellers always will call for more pins. Call that a constant at this point.

One other thing seems clear to me, and that's that the early ramping up of ideas about what could happen makes sense now. Go big, then as the layout constraints hit home, scale back, balancing all the way down to the best trade-off possible at the die and package that makes the best financial and technical sense.

If we look at it that way, I think we are getting a LOT!! Lessons learned from Prop I are really gonna pay off in the next chip. I'm pretty stoked, and pleased as heck Chip and Beau are sharing the ride along.

Sapieha · 2010-11-08 07:50

Hi Humanoido and ALL.

I think the BIGGEST THANKS should go to -->
Chris Savage for giving us the chance to chat on his site.

And

Chip Gracey for answering our questions.

Humanoido wrote: »

Thanks Sapieha.

Sapieha · 2010-11-08 08:15

Hi CHIP.

Added some extra questions to second post in this thread.

Can YOU answer them if possible?

P.S. More questions will be added to that post over time.
I know some feedback is good for you, BUT I too need some feedback from you on whether my ideas help or whether they are VERY bad.

The next thing I'm waiting on is, as you said: -->
Full instruction set... I'll give to Daniel at Parallax for posting. Or, I'll find out how to get on the forum and do it myself.

I think we are ALL waiting for that. Maybe we can even see what is missing ---> and give ideas on what could be implemented!

jmg · 2010-11-08 10:42

Sapieha wrote: »

Now look at the attached PDF -- that may give you a clearer picture of what I mean.

The next question I have is: will there be instructions that make it possible to use the COG's CLUT memory in a user-friendly way?

You do not mention FIFOs? And also variable-width serializing?

Dual and Quad SPI is now quite common (80MHz+), and devices like XMOS allow user assignable width.

Supporting QuadSPI would be quite important, especially on a memory constrained device.

jmg · 2010-11-08 10:46

Rayman wrote: »

Well, since the SD card interface will be inherently supported, perhaps they could rig the Prop Tool to build SPIN in LMM mode and put the code on the SD card and just use HUB ram for buffers...

Here QuadSPI (and perhaps even byte-wide too?) would be a natural expansion.

Thus far, CODE is not executed (directly) from SDRAM, so other code paths should be provided.
QuadSPI is 80MHz+; at 4 bits per clock, that would stream 32-bit longs at 10MHz.
Not stellar, but fine for many code tasks.
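The 10MHz figure above is simple arithmetic: a quad interface moves 4 bits per clock, so a 32-bit long takes 8 clocks. A minimal Python sketch of that calculation follows; the function name is made up for illustration, and it assumes a sustained burst with no command or address overhead, which a real flash read would add.

```python
def longs_per_second(clock_hz: float, data_lines: int, word_bits: int = 32) -> float:
    """Peak transfer rate in words/second for a serial bus.

    Assumes one bit per data line per clock and ignores command/address
    overhead (an idealisation, not a measured figure).
    """
    bits_per_second = clock_hz * data_lines
    return bits_per_second / word_bits


single_spi = longs_per_second(80e6, 1)   # plain SPI, 1 data line
quad_spi = longs_per_second(80e6, 4)     # QuadSPI, 4 data lines
byte_wide = longs_per_second(80e6, 8)    # hypothetical byte-wide bus

print(f"SPI:       {single_spi / 1e6:.1f} M longs/s")
print(f"QuadSPI:   {quad_spi / 1e6:.1f} M longs/s")
print(f"Byte-wide: {byte_wide / 1e6:.1f} M longs/s")
```

On these assumptions the byte-wide option jmg wonders about would double the QuadSPI rate at the same clock.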

Phil Pilgrim (PhiPi) · 2010-11-08 11:12

Be careful what you wish for! Remember the Homer? (Not that Chip would ever allow that to happen!)

-Phil

Ravenkallen · 2010-11-08 11:19

The big question I wanted to ask is....
What kind of development platform is going to be released with the Prop 2? I think I remember Chip saying something like there will be 3 different kinds? I hope they will release a platform similar to the C3, but using a Prop 2 instead...Maybe they could call it C4 (like the explosive, haha). For those of us who can't hand solder SMD parts (well, at least not very well), this is going to be imperative.

Maybe they could make a board similar in size to the STAMP or Prop stick and only run out 32 or so I/O lines (even though it would mean that some I/O pins would go unused, it would provide a simple interface to a breadboard). I mostly want the new Prop for the 128KB of memory, speed and direct boot from SD. That ought to put a cramp in Arduino's style.

Parallax has done well by making all of their products available in an easy to use DIP package. I think they should be wary about deviating too far from this "unspoken pledge".
Parallax is a good company, run by real people, and I thank them for being so open with their designs...

One smaller question..
What kind of speed increase are we talking about for SPIN on the Prop 2? I know that Chip said that ASM would be like 8 times faster. Is it safe to assume that SPIN will have an equal increase?

Harley · 2010-11-08 11:51

Ravenkallen, there was this info from Chip:

bittled: Will you have a breakout board for the Prop II for breadboarding?
CHIP: Breakout board... Absolutely!
CHIP: ...Probably a few breakout boards.

I hope the forum is allowed to provide some good info on what WE'd like also. Wow! 96 (including crystal, reset, brownout) + power/grounds; lots of header pins to provide

Yes, it might be nice to do development on the Prop 2.

Ravenkallen · 2010-11-08 12:22

@Harley.... yes, but I was wondering if he had any details of the various proposed boards...

Electronegativity · 2010-11-08 12:32

This all sounds truly awesome.

I have been wanting to build an oscilloscope for a long time.
Now I have a vision of a 32 channel scope that displays 3D data in HDTV.
As much as I enjoy the dials and switches, it will be much more cost effective to replace them all with on screen menus.

markaeric · 2010-11-08 13:01

Parallax not building breakout boards would be like Ford not building cars. With all the capabilities of the prop2, I'm sure there will be a huge amount of boards from Parallax, and others.

I have to imagine that spin will see a significant increase in speed, considering the single cycle instructions including mul/div, and 4 Long rd/wr.

Cluso99 · 2010-11-08 14:57

Spin will be at least 8x faster (instructions 4x and clock 2x) plus the hub access is 2x. Hub can fetch 16 bytes per access so a small cache could be implemented - would need to see the overhead as it may not be effective.

Now that Chip has my faster Interpreter code, together with his own ideas, we could see a 20-25% improvement here too. If the decode table was in ROM too we would see more cog space as well. And don't forget we can expand to LMM for extra functions. The maths functions I re-coded gave a huge improvement. MUL, DIV and SQRT are now in hardware so this will give a significant speed improvement here too.

All in all, a lot of speed improvement here!!!

william chan · 2010-11-08 18:24

Cluso,

Can you implement local variables in Spin to use cog local RAM as well in your new Spin code?
This would really speed things up.

Bill Henning · 2010-11-08 18:31

Sorry, I disagree.

For an interpreted VM such as Spin, local cog variables cannot be much faster.

Currently a Spin opcode takes 25us-100us, with an average around 60us.

On Prop2, say it will be 5x+ faster - call it an average of 10us.

On Prop2, a hub access will take 8 cycles, which is 50ns, or approx. 1/200th of the opcode execution time.

Even if a local cog access took 0 cycles, it would only speed up instructions by 0.5%

As a local cog access takes 1 cycle, it would cause a speedup of less than 0.5%

Conclusion:

It's better to use any cog space that can be freed up for an LMM engine, and FCACHE area.
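The estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the post: the 160MHz clock is inferred from "8 cycles = 50ns", the 10us opcode time is Bill's rough guess, and it assumes one hub variable access per opcode. The function name is made up for illustration.

```python
CLOCK_HZ = 160e6          # inferred: 8 cycles in 50 ns implies 160 MHz
OPCODE_TIME_S = 10e-6     # Bill's guessed average Spin opcode time on Prop2


def fraction_saved(opcode_s: float, hub_cycles: int, cog_cycles: int) -> float:
    """Fraction of one opcode's time saved by replacing a hub access
    with a cog access (assumes one variable access per opcode)."""
    saved_s = (hub_cycles - cog_cycles) / CLOCK_HZ
    return saved_s / opcode_s


best_case = fraction_saved(OPCODE_TIME_S, hub_cycles=8, cog_cycles=0)
realistic = fraction_saved(OPCODE_TIME_S, hub_cycles=8, cog_cycles=1)

print(f"cog access free:     {best_case:.2%} faster at most")
print(f"cog access 1 cycle:  {realistic:.2%} faster at most")
```

Either way the gain stays at or below half a percent, because the interpreter overhead dominates the opcode time.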

william chan wrote: »

Cluso,

Can you implement local variables in Spin to use cog local RAM as well in your new Spin code?
This would really speed things up.

Cluso99 · 2010-11-08 20:14

I am in agreement with Bill. There are much better ways to improve performance.

However, fast overlaying of some routines may now outperform LMM.

jazzed · 2010-11-08 20:54

Cluso99 wrote: »

However, fast overlaying of some routines may now outperform LMM.

True if you always know the size of the routine to be used

With rdquad/wrquad or whatever it's called at least more instructions/data can be read into the COG per HUB access cycle. With that, more instructions between slots, and addition of the sorely missed indexed indirect access, maybe a 16MIPS+ aggregate LMM is possible. In any case, I think exciting times are ahead.

turbosupra · 2010-11-09 05:05

How about the cnt function, is it still 32 bit?

Bill Henning · 2010-11-09 08:23

Agreed - this is going to be fun...

Also, all virtual machines can also be re-written to be *MUCH* faster as well.

jazzed wrote: »

True if you always know the size of the routine to be used

With rdquad/wrquad or whatever it's called at least more instructions/data can be read into the COG per HUB access cycle. With that, more instructions between slots, and addition of the sorely missed indexed indirect access, maybe a 16MIPS+ aggregate LMM is possible. In any case, I think exciting times are ahead.

Bobb Fwed · 2010-11-09 08:32

turbosupra wrote: »

How about the cnt function, is it still 32 bit?

I don't see why they would change that. 32 bits is actually way more than you get on a lot of other micros, and even at the faster speed, it only rolls over every 26.8 seconds. Plus, they haven't mentioned a change.

A 64-bit cnt would be nice, and now that I think about it, seems like it would be simple and quite cool. If they did a 64-bit cnt, you could use the lower 32 bits just as the cnt is used now, but also access the upper 32 bits, and the whole thing would roll over every 3653 years. Maybe that's a bit overkill, but what the hay.
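Both rollover figures come from dividing the counter range by the clock rate. A quick Python check, assuming a 160MHz system clock (the rate that makes the 26.8 second figure work out) and a 365.25-day year:

```python
CLOCK_HZ = 160e6                         # assumed Prop 2 system clock
SECONDS_PER_YEAR = 365.25 * 24 * 3600    # Julian year


def rollover_seconds(counter_bits: int, clock_hz: float) -> float:
    """Time for a free-running counter to wrap back to zero."""
    return 2 ** counter_bits / clock_hz


cnt32_s = rollover_seconds(32, CLOCK_HZ)
cnt64_years = rollover_seconds(64, CLOCK_HZ) / SECONDS_PER_YEAR

print(f"32-bit cnt wraps every {cnt32_s:.1f} s")
print(f"64-bit cnt wraps every {cnt64_years:.0f} years")
```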

Cluso99 · 2010-11-09 09:18

Agreed - this is going to be fun...

Also, all virtual machines can also be re-written to be *MUCH* faster as well.

Bill: And imagine the speed we can get for the Floppy/HardDisk running in the 32MB SDRAM :smilewinkgrin:

Harley · 2010-11-11 10:02

One of Chip's comments was about SDRAMs

SDRAM hooks up very easily for ~$3 of 32MB RAM external.

Searching Digi-Key I wasn't able to find anything at such a price. What might this SDRAM part number be? Anyone have an idea of such SDRAMs?

I note there were various organizations, from 4-bit wide to 32. I've not paid much attention to SDRAMs in the past; anyone using such parts? I need to bring my info up-to-date as it may be soon (about a year?) that Prop 2 might be available. Thanks in advance for any clues.

Bill Henning · 2010-11-11 10:41

Drooling already...

Cluso99 wrote: »

Bill: And imagine the speed we can get for the Floppy/HardDisk running in the 32MB SDRAM :smilewinkgrin:

Harley · 2010-11-11 17:16

OK! Nick McClick's info on jazzed's SDRAM board (thread "NEW: SDRAM Module for Propeller Platform") explained a lot for me.

But his SDRAM part isn't a '~$3' part; more like $6+ in 1000's quantity. I wonder if Chip was thinking of Parallax's getting many thousands qty pricing?

Anyway, now I realize the 'D' in SDRAM implies 'dynamic'. And I now have one part number to work from.

If anyone knows of Chip's '~$3' SDRAM please let me know the part number!

Roy Eltham · 2010-11-11 17:35

Harley,
Here is an SDRAM chip in the Mouser online catalog that is $3.50 for 1. (under $3 for 100)
http://www.mouser.com/ProductDetail/ISSI/IS42S16400D-7TL/?qs=sGAEpiMZZMti5BT4iPSEnaWNh6hsejH%252bZrjO6t%252btnf0%3d

They are often just called DRAM, but if you look at the data sheet (linked on that page) it is actually an SDRAM (Synchronous Dynamic RAM).

Harley · 2010-11-11 18:09

Thanks Roy,

Mouser says it is a "DRAM 64M 4Mx16 143Mhz" part.

I thought Chip was referring to a 32 MB SDRAM, though he never gave a part number. He mentioned a x16 or x32 interface plus some 20 control lines. That's a lot of control lines; maybe it included refresh addressing?

I'd sure like to know about the part number Chip was referring to. Anyone know?
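For comparing parts, the "64M 4Mx16" description can be turned into a byte count: 4M words of 16 bits is 64 Mbit, or 8 MB. A short Python sketch follows; it assumes "M" means 2^20 and reads Chip's "32MB" as megabytes, and the helper name is made up for illustration.

```python
MI = 1024 * 1024  # "M" taken as 2**20, the usual memory convention


def sdram_bytes(words: int, width_bits: int) -> int:
    """Capacity in bytes of a memory organised as words x width_bits."""
    return words * width_bits // 8


mouser_part = sdram_bytes(4 * MI, 16)          # the 4Mx16 part above
target = 32 * MI                               # 32 MB, if Chip meant bytes

print(f"4Mx16 part: {mouser_part // MI} MB ({mouser_part * 8 // MI} Mbit)")
print(f"Parts needed for 32 MB: {target // mouser_part}")
```

So on that reading the Mouser part is a quarter of the size Chip mentioned, and a single-chip 32 MB solution would need a 256 Mbit device (for example a 16Mx16 organization).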

turbosupra · 2010-11-11 19:10

Bobb,

Exactly, if you could count 3600 years or even 1800 years, no more count problems and confusion. The cnt feature has really been a PIMA.

It can't be that difficult to use two registers and have a 64 bit cnt feature ... but what do I know?

Bobb Fwed wrote: »

I don't see why they would change that. 32 bits is actually way more than you get on a lot of other micros, and even at the faster speed, it only rolls over every 26.8 seconds. Plus, they haven't mentioned a change.

A 64-bit cnt would be nice, and now that I think about it, seems like it would be simple and quite cool. If they did a 64-bit cnt, you could use the lower 32 bits just as the cnt is used now, but also access the upper 32 bits, and the whole thing would roll over every 3653 years. Maybe that's a bit overkill, but what the hay.

whicker · 2010-11-13 02:03

turbosupra wrote: »

Bobb,

Exactly, if you could count 3600 years or even 1800 years, no more count problems and confusion. The cnt feature has really been a PIMA.

It can't be that difficult to use two registers and have a 64 bit cnt feature ... but what do I know?

Then, such a program will read the upper and lower cnt registers at different times (this is still a 32 bit CPU after all), also generating strange bugs that only happen extremely sporadically.

Think about the rollover from $FFFF_FFFF back to 0 on the lower cnt register just after reading the upper cnt register. Now the upper cnt register is actually one count higher than was just read... combine that stale upper value with the new lower cnt, and we've gone back in time, apparently.

Really, there's no clean way. There would have to be a "freeze timer" instruction. Ugly.

Yes. Dealing with rollover (basically modulus math) kind of stinks because they don't really teach it in school, other than the little bit in grade school with the minute and hour hands on clocks.
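For the 32-bit case, the modulus math mentioned above is short once written down: subtract and keep only the low 32 bits. A Python sketch of the idea (Python integers are unbounded, so the mask stands in for what a 32-bit register does automatically); the function name is made up for illustration, and the result is only valid when the real interval is shorter than one full rollover.

```python
MASK32 = 0xFFFF_FFFF


def elapsed_ticks(start: int, now: int) -> int:
    """Ticks between two 32-bit counter samples, correct across one wrap."""
    return (now - start) & MASK32


# Sample taken 16 ticks before the wrap, second sample 16 ticks after it.
start = 0xFFFF_FFF0
now = 0x0000_0010

print(elapsed_ticks(start, now))   # 32, even though now < start
```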

Cluso99 · 2010-11-13 02:20

No problems with the rollover. Read the upper long first, then the lower and if the lower is say <20, re-read the upper to ensure you got it correct.
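A software-only read along these lines can be sketched in Python. Note this uses the common "read upper, read lower, re-read upper and retry if it changed" variant rather than the "lower < 20" threshold test described above; both aim to catch a carry between the two reads. The `SimCounter` class is purely hypothetical: it models a 64-bit counter exposed as two 32-bit registers that keeps running between reads.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


class SimCounter:
    """Hypothetical free-running 64-bit counter read as two 32-bit halves."""

    def __init__(self, value: int = 0, ticks_per_read: int = 4):
        self.value = value
        self.ticks_per_read = ticks_per_read

    def _advance(self) -> None:
        # The counter keeps running while the CPU executes each read.
        self.value = (self.value + self.ticks_per_read) & MASK64

    def read_hi(self) -> int:
        sample = self.value >> 32
        self._advance()
        return sample

    def read_lo(self) -> int:
        sample = self.value & MASK32
        self._advance()
        return sample


def read_cnt64(counter: SimCounter) -> int:
    """Read upper, lower, then upper again; retry if the upper half moved."""
    while True:
        hi = counter.read_hi()
        lo = counter.read_lo()
        if counter.read_hi() == hi:
            return (hi << 32) | lo


# Start 2 ticks before the lower half wraps, the case described by whicker.
start_value = 0x0000_0001_FFFF_FFFE
result = read_cnt64(SimCounter(start_value))
print(hex(result))
```

A naive upper-then-lower read from the same starting point would combine upper = 1 with lower = 2 and return a value smaller than the starting count, which is the "gone back in time" bug; the retry avoids it without a freeze instruction.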
