Propeller II update - BLOG

Heater. · 2010-09-12 02:19

Never mind start-ups: designing and testing with an FPGA and then going to an ASIC or "real" chip is the way I saw it done in large companies nearly twenty years ago. It was very nice, as a software engineer, to be able to tell the hardware guys what to put into their chips to make life easier :)

Of course, at that time, even though the amount of logic they wanted was orders of magnitude smaller than the Prop, it took a couple of boards full of FPGAs to "run" the design.

I don't think anyone here can keep their hats on any more. It's kind of jaw-dropping to be able to watch how all this is done and follow the progress.

Still, very small teams have pulled off world changing microprocessor/controller designs before. The original ARM for example. Let's hope that Chip and Beau have as much success with the Prop II.

Cluso99 · 2010-09-12 21:53

heater: Yes, I know other companies use FPGAs for starters.

Most real innovation seems to come from the small guys who are not constrained within the current methodologies.

This is very true of Chip's outlook. I truly hope that he, Beau and the rest of Parallax achieve this quantum leap. The Prop is certainly a leap but has not yet really inspired enough users to gain the recognition it deserves. We already know this. I really do believe the Prop II will be awe-inspiring and in turn it will open up the use of the existing Prop as well.

Beau: Thank you for the time you have taken to answer and explain to us mere mortals what you are doing. There are not many who get to do chip design these days, and they certainly do not take the time to show us what they are doing.

David Betz · 2010-09-14 10:56

I posted this to another thread originally but it seems like it belongs here.

Question: Will the following work on the Propeller II?

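         movs :fetch, line     ' patch the source field of :fetch with the value of line
         nop                   ' spacer between the modifier and the modified instruction
:fetch   mov tag, 0-0          ' 0-0 is a placeholder source, overwritten by the movs above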

I'm wondering if the deeper pipeline on the P2 will mean that more NOP instructions will be needed between the movs and the mov. I suspect the answer might be no since I think Chip has said that register forwarding would be used to hide the pipeline in most cases. However, if that is the case, will the following code work on P2?

         movs :fetch, line     ' patch the source field of :fetch
:fetch   mov tag, 0-0          ' runs immediately after the patch, with no spacer

Seems like forwarding could handle this case as well.

Beau Schwabe · 2010-09-15 06:48

David Betz,

I asked Chip your question, and this is his response...

"You will need to insert ONE instruction between the modifier instruction and the modified instruction. The data forwarding circuitry helps, but it's not magic (as on the no-NOP example). -Chip"

David Betz · 2010-09-15 06:53

Beau Schwabe (Parallax) wrote: »

David Betz,

I asked Chip your question, and this is his response...

"You will need to insert ONE instruction between the modifier instruction and the modified instruction. The data forwarding circuitry helps, but it's not magic (as on the no-NOP example). -Chip"

Thanks for checking into that. I'm glad that forwarding allowed you to make the Prop II compatible with the Prop 1 in this regard. Another thing you could do is to introduce pipeline stalls to resolve these dependency issues. I guess that would make it harder to count cycles in time critical code though.
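Chip's answer can be illustrated with a toy model. The Python sketch below is not a model of the real Propeller pipeline. It only assumes that the next instruction is fetched before the current one writes its result, which is enough to show why one spacer instruction is needed. The `run` function, the dictionary instruction format and the register names `row7` and `placeholder` are all made up for illustration.

```python
# Toy model only: assumes the next instruction is fetched before the
# current instruction writes its result. Not the real P1 or P2 pipeline.

def run(program, regs):
    """Execute a list of instruction dicts and return the final registers."""
    mem = [dict(ins) for ins in program]   # private copy of the "cog RAM"
    regs = dict(regs)
    fetched = dict(mem[0])                 # first instruction is fetched up front
    for pc in range(len(mem)):
        current = fetched
        # Prefetch the following instruction BEFORE the current one writes back.
        fetched = dict(mem[pc + 1]) if pc + 1 < len(mem) else None
        if current["op"] == "movs":
            # Patch the source field of the target instruction in memory.
            mem[current["target"]]["src"] = regs[current["from"]]
        elif current["op"] == "mov":
            regs[current["dst"]] = regs[current["src"]]
        # "nop" does nothing
    return regs


REGS = {"line": "row7", "row7": 123, "placeholder": -1, "tag": None}

with_spacer = [
    {"op": "movs", "target": 2, "from": "line"},
    {"op": "nop"},
    {"op": "mov", "dst": "tag", "src": "placeholder"},   # the 0-0 slot
]
without_spacer = [
    {"op": "movs", "target": 1, "from": "line"},
    {"op": "mov", "dst": "tag", "src": "placeholder"},   # the 0-0 slot
]

tag_with_spacer = run(with_spacer, REGS)["tag"]
tag_without_spacer = run(without_spacer, REGS)["tag"]
```

In this model the version with a spacer reads through the patched source field, while the version without one executes the stale copy that was fetched before the patch landed.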

Harley · 2010-09-15 09:01

Google's browser Chrome also works well with the Intel iMac.

If the layout can be done as fast as some of the videos present, the chip should be done soon? I'd guess some of what was shown doesn't happen as fast in real time!

Beau Schwabe · 2010-09-15 09:05

"I'd guess some of what was shown doesn't happen as fast in real time!" - Nope, that's real time speed. What you don't see is the scope of the size. Those blocks are actually pretty small by comparison to the whole project.

Harley · 2010-09-15 09:40

Ah, that's right. Understood!

Although it would be like a mile for your work, I've done some PCB layouts where I needed to work with a 1 mil grid. I was not able to view much beyond the detail of interest at that moment. I had to route angled traces to connect to some weird device footprint, and the checker required that you connect to the centerline of the traces. It looked like lots of movement, but compared to the full layout, it was just some 'jiggles'.

Thank you for the videos, and also thanks to Parallax for allowing us a look-see.

KaosKidd · 2010-09-15 10:54

I just wanted to say thank you for sharing.
Like many others here, I'm waiting for this Super Prop to come out soon!

Thanks again!

KK

Unknown · 2010-09-15 18:24

Heater. wrote: »

Never mind start-ups: designing and testing with an FPGA and then going to an ASIC or "real" chip is the way I saw it done in large companies nearly twenty years ago. It was very nice, as a software engineer, to be able to tell the hardware guys what to put into their chips to make life easier :)

This chip making stuff is over my head. These are the assumptions I was told:

From what I understand, an FPGA design has to be turned into an ASIC, and FPGA VHDL isn't easy to convert unless you go with the brand ALTERA (an assumption). It has to be tested a lot (a long and involved process) to get it right. Unless you know how to do it yourself, you might also have to pay for SRAM, PLL, I/O and standard cell library IP, but you could do it. From what I understand, it costs a lot of money and you probably only get a once-in-a-lifetime chance to get it right.

We talked about it, and they are under the impression that it costs a lot of money. I imagined that they could offer to sell rights to their creation in return for a baked chip, which they thought was a possibility.

Cluso99 · 2010-09-15 18:45

Xilinx is supposedly the leader in FPGAs, having brought out the first ones. Unless you have large volumes, an FPGA makes sense these days.

Beau Schwabe · 2010-09-16 00:19

Just an official bump to the Blog. Ideally I would like to keep updates like this a Friday thing, but things have been going so well this week in terms of layout. Sometimes you can chase an elusive bug that drives you absolutely nuts. I'm sure you all can relate, it's the same kind of scenario that can happen with programming. Anyway all has been smooth sailing so far. There are 4 different memories that we will be testing as well as a few other critical blocks ...

RAM - MEM_COG
RAM - MEM_HUB
RAM - MEM_DUALPORT
ROM - MEM_ROM
OTP - FUSE_BIT

PLL - Phase Locked Loop
OSC - Internal RC Oscillator
BOD - Brown Out Detector
PUD - Power Up Detector

IO_PAD - Packed with a BUNCH of cool stuff I can't talk about :smilewinkgrin:

... Today ALL of the sub-blocks within the Test-Die passed LVS/DRC :smilewinkgrin: ... This means that the remainder of the Test-Die is just a wiring exercise... i.e. no active devices need to be placed. Much of the wiring can be done with an Auto-Router so, once the labels are in their proper place we can move closer towards our early November test chip shuttle run.

Cluso99 · 2010-09-16 00:38

Great work Beau and thanks again for the update.

Are you testing the new counter/video circuit in this shuttle run? I am sure there are as many secrets here as with the I/O pins - and we already know quite a bit about the I/O pins anyway.

We will all be keeping our fingers crossed for a successful run.

A quick question... once the I/O pad has been verified, is it just a simple copy/paste to lay the rest of the I/O pads down?

I noted you seemed to imply there may only be 128KB of hub ram. Is this correct or are we still at 256KB or 384KB ??

Maybe when Chip is not looking, you could squeeze another 4-8 cogs into the chip by mistake LOL. Of course you realise no matter what is inside, we will push it way beyond what Chip and you expect.

Magnetspin · 2010-09-16 02:19

Hi Beau
1) Is there a fuse bit for code protection?

2) I have done a lot of work with the SX Chip. What I miss in the prop architecture is:
a) more timers (with 64 pins you should be able to do more DAC and ADC work)
b) interrupts (at least timer (SX: retiw...)) to do deterministic time loops beside asynchronous tasks in the same cog. We are wasting a lot of cog time with waitcnt, wait....

Best Regards
Magnetspin

Cluso99 · 2010-09-16 06:27

1. I understand there will be some form of decoding. The code is still stored externally (EEPROM/FLASH or perhaps SD/microSD). There will be a few fuses - this may form the decoding or some other protection.

2. (a) The I/Os are full of extra functions. Not sure what the counters will be able to do extra.
(b) Definitely no interrupts. But I understand there will be waitxxx with an exit option. There will be some method to multitask within a cog.

Suggest you check out the very long PropII thread.

Beau Schwabe · 2010-09-16 21:20

Cluso99,

"Are you testing the new counter/video circuit in this shuttle run?" - critical components of the circuit will be tested.

"A quick question... once the I/O pad has been verified, is it just a simple copy/paste to lay the rest of the I/O pads down?" - That's correct.

"I noted you seemed to imply there may only be 128KB of hub ram. Is this correct or are we still at 256KB or 384KB ??" - The Physical area right now indicates that only 128KB will fit. ...Wait until I get underway with the COG logic, and then we will have a little more knowledge of the available real estate. It's relatively easy to add memory, providing you have the room to do so. ... I have looked at the possibility of changing the aspect ratio of the current memory, to see how much layout that would involve (takes about 2 weeks to put a memory together by hand), to even see if something like that would work, and that may be a possibility... but we are not to that bridge yet.

Magnetspin,

"1) Is there a fuse bit for code protection?" - That's the idea. OTP programmable fuses are tricky... since you can't really guarantee a 'complete' open circuit every time, i.e. the connection could just become highly resistive after metal vaporization. You must design the fuse in a way that the Fuse is compared differentially to a 'good' reference fuse that isn't purposefully destroyed. ...That said we are testing our implementation of that idea. The exact coding scheme used afterward is ***TOP SECRET***

... no, there are a few ideas that have been discussed, and at this point I'm not sure what will be implemented.
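The differential fuse read described above can be sketched as follows. This is only an illustration of the idea of comparing a fuse against an intact reference rather than against an absolute "open circuit" level. The function names, the resistances and the ratio threshold are hypothetical, since the thread gives no actual values or circuit details.

```python
# Hypothetical values throughout; the thread gives no resistances or thresholds.

def fuse_is_blown(r_fuse_ohms, r_reference_ohms, ratio=2.0):
    """Differential read: blown if clearly more resistive than the intact reference."""
    return r_fuse_ohms > ratio * r_reference_ohms


def fuse_is_open(r_fuse_ohms, open_threshold_ohms=1e6):
    """Naive read: only a near-open circuit counts as blown."""
    return r_fuse_ohms > open_threshold_ohms


reference = 100.0          # intact reference fuse (assumed value)
partially_blown = 5_000.0  # became highly resistive instead of fully open
intact = 110.0             # unblown fuse with a little process variation

print(fuse_is_blown(partially_blown, reference))  # differential read
print(fuse_is_open(partially_blown))              # naive read
```

With these assumed numbers, the differential comparison still classifies the partially blown fuse as programmed, whereas a test that waits for a true open circuit would miss it.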

"...with 64 Pin you should be able to more dac, adc work" - 64 Pin? try 92 individual IO's each with their own ADC and DAC built into each IO ... plus soo much more. In a way you can think of the IO's as their own mini processor with a comparable analogy to that of the current counters that are now in each COG with the Propeller I.

K2 · 2010-09-16 21:41

It's so interesting to consider the direction Prop II is going. The original approach of more cogs and more memory has given way to faster access, more I/O, and really crazy I/O. It is that last bit that seems so fascinating to me. My imagination is now running wild over what could be done with an I/O pad in the context of the Propeller architecture.

But it's clear that the Propeller II will still be very much an embedded controller. Only more so.

Beau Schwabe · 2010-09-16 21:42

Today's Blog update (09-16-2010):

--> Not that exciting really... lots and lots and lots of wire labeling. Fortunately I wrote a script (thank God for script based design editors) to help facilitate the labeling.

--> for the Test-Die, there is a huge FlipFlop chain with latches on every bit so that it can be serially programmed from outside the test chip. All of that is routed as far as power/ground and all the labels associated to the 'D' lines and the 'Q' lines.

--> Three of the four memories have all of their power connections in place

--> One of the memories has all of the appropriate text labeling and I am working on a script to generate labels for the remaining memories.

--> The IO's will need labeling also, since there is a substantial wire bus to and from each IO.
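For readers unfamiliar with the idea, a serially programmed flip-flop chain with a latch on every bit can be sketched like this. The `ScanChain` class and the 4-bit length are hypothetical and say nothing about the actual test-die circuit. The sketch only shows the general technique of shifting bits in one at a time and then copying them to the latched outputs in a single step.

```python
class ScanChain:
    """Minimal shift-register-plus-latch model (illustrative only)."""

    def __init__(self, length):
        self.shift = [0] * length     # flip-flop chain, position 0 is the serial input end
        self.latched = [0] * length   # latch outputs seen by the rest of the circuit

    def clock_in(self, bit):
        # The new bit enters position 0; every other bit moves one place down the chain.
        self.shift = [bit & 1] + self.shift[:-1]

    def latch(self):
        # Copy the whole chain to the latches at once.
        self.latched = list(self.shift)


chain = ScanChain(4)
for bit in (1, 0, 1, 1):
    chain.clock_in(bit)

before_latch = list(chain.latched)   # outputs are unchanged while shifting
chain.latch()
after_latch = list(chain.latched)    # outputs now reflect the shifted pattern
```

The point of the latches in this model is that the outputs do not ripple while the pattern is being shifted in; they change only when `latch()` is called.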

Cluso99 · 2010-09-16 22:38

WOW Beau. Those I/Os sound extremely attractive. I cannot wait to see what we can do with them beyond what you and Chip 'think' we can do with them :idea:

I seriously do hope you can squeeze at least 256KB of hub in. Guess we will just have to wait and see.

How much ROM do you expect? Would it be in order for me to start a thread to see what we could put in there, presuming of course Chip and you have not filled it already?

Heater. · 2010-09-16 23:41

Oooo...128K RAM is a serious disappointment after being teased with 256K and even more at some time.

But wait.. There will be 92 IO pins each with 7000 transistors. Good grief, that's as many transistors as an Intel 8008 per pin! Am I reading that right?

So I'm hoping there are ways to use those 644,000 transistors to enable attachment of external RAM, with assistance for the bus "wiggling", such that programs compiled to some kind of LMM (XMM?) by Catalina or ICC, or byte-code interpreted languages, can be run at speed from external RAM.

I presume a seamless integration of the external RAM into the memory map is unlikely.
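The transistor total quoted above follows directly from the two figures in the post. The snippet below just reproduces that arithmetic; the per-pin count is the poster's figure and is not confirmed elsewhere in this thread.

```python
# Figures as quoted in the post above; the per-pin count is unconfirmed.
io_pins = 92
transistors_per_pin = 7_000

total_io_transistors = io_pins * transistors_per_pin
print(f"{total_io_transistors:,}")
```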

Cluso99 · 2010-09-17 02:31

Thought I might speculate about the extra functions on those pins... (since I have no idea I can do that)

We know the counter/video section is going to be a lot smarter, with many extra functions. We also know there are almost as many transistors per pin as the Z80 had (8,500) and more than the 6800 & 6502 (4,000).

So, my speculation is that instead of the counter set being for each cog, they will be for each pin. Hey, that's 92 counter sets!!! Most likely this will allow such things as the refresh generation and address generation for SDRAM. My guess is that the UART/SPI/I2C will be much easier to implement and we will get much faster comms. This would result in less cog code for these functions. There has also been discussion about USB so those functions may be there too.

I am sooo excited :smilewinkgrin:

Heater. · 2010-09-17 02:52

Good speculation Cluso.

Begins to sound like we will have something akin to 92 little processors all running at 160MHz.

This is a very serious attack on the "software defined silicon", FPGA replacement idea as pushed by a certain other manufacturer.

The waiting is getting harder.

Sapieha · 2010-09-17 06:13

Hi Beau Schwabe.

You said 92 DACs. That is very nice, BUT it is one thing to have them and another to control them.

My question:
Will it be possible to synchronize 3 or 4 of those DACs so that they output their data at exactly the same time?

To control 3-phase motors, they need to be synchronized 120 degrees apart but strobed onto the outputs at the same time.

To control stepper motors in micro-steps, 4 DACs need to be strobed at the same time.

Beau Schwabe (Parallax) wrote: »

"...with 64 Pin you should be able to more dac, adc work" - 64 Pin? try 92 individual IO's each with their own ADC and DAC built into each IO ... plus soo much more. In a way you can think of the IO's as their own mini processor with a comparable analogy to that of the current counters that are now in each COG with the Propeller I.

User Name · 2010-09-17 08:04

Heater. wrote: »

Oooo...128K RAM is a serious disappointment after being teased with 256K and even more at some time.

I'm just a newbie and not an advanced thinker like Humanoido, but on a philosophical level it seems a waste to throw a lot of RAM into HUB. HUB access is too slow, even on the Prop II, to make it the smart place to stick most of your transistors.

If these super I/O cells can provide high-speed access to external SRAM, then HUB would start looking like nothing more than storage space for SPIN variables, temporary storage for COG code, and a transfer point for low-priority data and semaphores. (Which is all it ever was anyway.)

And even if the Super I/O cells don't facilitate external SRAM connection (but I wouldn't bet against it), I'll bet they can be used for high speed signaling and data transfer between COGs - at a far faster rate than through HUB. And clever people like Bill Henning will find a way to attach external RAM, FIFOs, shift-registers, etc, anyway.

Heater. · 2010-09-17 08:30

User Name,

Don't forget that the Propeller and its successor are micro-controllers, not micro-processors.

That is to say they should pretty much run by themselves without attached RAM and ROM and whatever external circuitry is required to decode, latch, buffer that stuff.

(OK the Prop has its EEPROM dependency but that is only a tiny thing hooked on two pins.)

Also don't forget that for a lot of micro-controller applications one would quite like to have all those pins for interfacing to real world gadgets rather than waste them on RAM/ROM buses.

So the demand for ever more internal RAM is still there.

Yes, it may look slow, but that is not a negative point for the Prop compared to other micro-controllers, because we still have those other seven COGs doing the necessary high-speed work whilst the app's main loop wanders around on the first COG.

Having said that I'm looking forward to what the likes of Bill, Cluso, Jazzed, Dr_A etc will come up with in the Prop II external RAM department.

User Name · 2010-09-17 09:16

Heater. wrote: »

User Name,

Don't forget that the Propeller and its successor are micro-controllers, not micro-processors.

Not forgotten at all. In fact the Prop I is as fine a microcontroller as I've ever seen. I'm not looking for more.

But I've become aware that you and others have made it more than just a microcontroller. It is for folks like you that I want it to have the necessary hooks. Haven't you spent great efforts to get small speed improvements in Zog? In light of that, wouldn't you be better served by having fast access to 512K of external SRAM than by slow access to 256K of HUB?

Heater. · 2010-09-17 09:43

Of course I want things like Zog, Catalina, and such to be able to run as fast as possible from external RAM. But I would not want any internal RAM to be sacrificed in order to do that.

From what I gather, the Prop II is going to be substantially faster anyway, and apparently the I/O system is going to give some extra assistance for external RAM and other bus usage. Those are the "hooks" you mention.

I'm happy :)

evanh · 2010-09-17 09:54

Sapieha wrote: »

My question:
Will it be possible to synchronize 3 or 4 of those DACs so that they output their data at exactly the same time?

To control 3-phase motors, they need to be synchronized 120 degrees apart but strobed onto the outputs at the same time.

To control stepper motors in micro-steps, 4 DACs need to be strobed at the same time.

Good question. But neither of those examples requires such simultaneous firing, since both are broadly modulated analogue signals that are far from noise-free and far from needing precision on a per-sample basis. The response of the motor windings is likely at least a couple of orders of magnitude slower than your carrier frequency. The currents are in phase even if the switching isn't.

The one exception would be if you were to independently drive the high-side and low-side stages with separate outputs on the Prop. And there is an argument for doing that.

A better example is multichannel direct A/D, where noise and phase alignment are critical for each and every sample and where you want to compare the signal between channels.

The answer is no problem really. That's the power of the Prop, it's deterministic even down to instruction to output response time. About the only unknown is clock jitter/skew.
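To make the three-phase case from the question concrete, the sketch below computes the three values, 120 degrees apart, that would be presented together at one update instant. It is plain Python with a hypothetical helper name, and it says nothing about how the Prop II DACs are actually written, which the thread does not describe.

```python
import math


def three_phase_samples(angle_deg, amplitude=1.0):
    """Return the three phase values, 120 degrees apart, for one update instant."""
    return tuple(
        amplitude * math.sin(math.radians(angle_deg - 120 * k)) for k in range(3)
    )


# One update instant: all three values are computed from the same angle.
a, b, c = three_phase_samples(90.0)
print(a, b, c)
```

Because all three values are derived from the same angle, a balanced set always sums to approximately zero, which is a quick sanity check on the phase spacing.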

User Name · 2010-09-17 10:08

Heater. wrote: »

Of course I want things ... to be able to run as fast as possible from external RAM. But I would not want any internal RAM to be sacrificed in order to do that.

That's probably where we differ. The only thing I wish I could add to the Prop is bit-mapped video. I'd be plenty willing to stay with 32K of HUB RAM if it meant that I could solder an SRAM to some I/O pins and get graphics.

evanh · 2010-09-17 10:14

Magnetspin wrote: »

b) interrupts (at least timer (SX: retiw...)) to do deterministic time loops beside asynchronous tasks in the same cog. We are wasting a lot of cog time with waitcnt, wait....

LMM and future VMs cover that one.
