The New 16-Cog, 512KB, 64 analog I/O Propeller Chip - Page 43 — Parallax Forums



Comments

  • TubularTubular Posts: 4,646
    edited 2014-04-22 22:15
    Very slick, Chip.

    Bodes well for multiple cogs in a DE0, even allowing for quite a few LEs for each smart pin.

    I'm curious how you divvied up the 6 ALU blocks to reduce current - by critical path? Instruction coding? Instruction frequency?
  • cgraceycgracey Posts: 14,133
    edited 2014-04-22 22:28
    jmg wrote: »
    Ahh.. oops, when you said 'the ALU', I took the Arithmetic to mean the single shared Arithmetic Block in the Hub.

    So the COG ALU does MUL, MULS and Addition and Subtraction opcodes ?


    That's right.
  • cgraceycgracey Posts: 14,133
    edited 2014-04-22 22:29
    T Chap wrote: »
    I am curious if since the ALU is slower than the cog, does the cog simply just wait for a reply from the ALU and then continue?


    The cog allows two clocks for the ALU to settle. By that time, the result WILL be ready.
  • cgraceycgracey Posts: 14,133
    edited 2014-04-22 22:39
    Tubular wrote: »
    Very slick, Chip.

    Bodes well for multiple cogs in a DE0, even allowing for quite a few LEs for each smart pin.

    I'm curious how you divvied up the 6 ALU blocks to reduce current - by critical path? Instruction coding? Instruction frequency?


    By functional blocks:

    add, inc, rot, logic, mul, mux

    Each block's internals take about the same time to resolve, while each block varies in response time. They are all within 60% of each other's sizes. I arranged the instruction op codes so that the blocks were easy to decode. Their different response times make the final mux easy to arrange, with the slowest (add) at the top of the mux.

    Right now, I'm going to make the 'add' block go faster by pre-decoding a few internal signals before the flops. This will get the adder off to a faster start and earlier finish time, as it is sticking WAY out, slowing everything down. In this design, the Z flag must be totally resolved by the end of the ALU cycle. On the Prop2, it was half-resolved, with other circuits getting the final result early in the next cycle. With these power-saving flop partitions, though, everything must be done at the end of the cycle, so the Z-test must complete AFTER the result mux. This tacks more time onto the cycle. By getting rid of SUMZ/SUMNZ (which were hardly used, according to the analyses Phil, Heater, and Roy did), I can pre-determine the bottom-level inputs to the 'add' block to make it go faster. This thing is sticking out like a sore thumb, causing the Z-test to stretch things out to 100MHz, when we were 120MHz before. I had gotten one compile to go 140MHz, but I haven't been able to repeat it.
  • ElectrodudeElectrodude Posts: 1,639
    edited 2014-04-22 22:50
    cgracey wrote: »
    This thing is sticking out like a sore thumb, causing the Z-test to stretch things out to 100MHz, when we were 120MHz before. I had gotten one compile to go 140MHz, but I haven't been able to repeat it.

    100/120/140 MHz or MIPS? Didn't it use to be 200MHz?
  • TubularTubular Posts: 4,646
    edited 2014-04-22 22:57
    Thanks, Chip. Fascinating stuff.
  • cgraceycgracey Posts: 14,133
    edited 2014-04-22 23:09
    100/120/140 MHz or MIPS? Didn't it use to be 200MHz?


    100MHz for the ALU means 100MIPS, or 200MHz clock.
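
The relationship between the ALU rate, MIPS, and the system clock can be sketched as below. This is only an illustration, assuming every ALU operation takes exactly two clocks as stated in the thread; the function names are made up for the example.

```python
# Assumption taken from the thread: every ALU operation takes two clocks.
CLOCKS_PER_ALU_OP = 2


def mips_from_clock(clock_mhz: float, clocks_per_op: int = CLOCKS_PER_ALU_OP) -> float:
    """Instruction rate in MIPS for a given system clock in MHz."""
    return clock_mhz / clocks_per_op


def clock_from_alu_rate(alu_mhz: float, clocks_per_op: int = CLOCKS_PER_ALU_OP) -> float:
    """System clock in MHz needed for the ALU to complete alu_mhz million ops per second."""
    return alu_mhz * clocks_per_op


# The three ALU rates mentioned in the thread.
for alu_mhz in (100, 120, 140):
    clock = clock_from_alu_rate(alu_mhz)
    print(f"ALU at {alu_mhz} MHz -> {mips_from_clock(clock):.0f} MIPS at a {clock:.0f} MHz clock")
```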
  • ElectrodudeElectrodude Posts: 1,639
    edited 2014-04-22 23:20
    OK, thanks. I didn't realize the ALU had a separate clock.
  • Cluso99Cluso99 Posts: 18,069
    edited 2014-04-22 23:56
    WTG Chip !!!

    New Cog ALU 1800 LEs
    P1 Cog complete 1850 LEs
    P2 Cog complete ~30,000 LEs

    Shows the complexity and hence power of the resulting P2 design.
    The shakedown seems to have resulted in something extra-ordinary!!
    A very efficient design plus >200MHz likely too.
  • Heater.Heater. Posts: 21,230
    edited 2014-04-23 02:23
    Chip,
    The entire Prop1 cog was 1850 LE's, while the ALU is always the biggest chunk. The Prop2 cog was ~30,000 LE's.
    Say what! And 100 MIPS. And 16 of them. Now I'm very worried. This sounds too good to be true :)

    Can't wait to see it in action. Just dusting off my nano board as we speak.
  • evanhevanh Posts: 15,349
    edited 2014-04-23 02:31
    cgracey wrote: »
    ... All ALU operations take two clocks.

    Not including branch instructions, right?
  • Heater.Heater. Posts: 21,230
    edited 2014-04-23 02:56
    Who cares? My code only goes straight ahead. It's moving too fast to make a turn :)
  • BeanBean Posts: 8,129
    edited 2014-04-23 05:04
    In the quest to get more instructions, I was wondering if anyone has thought about using the IF_xx bits for instructions that would make no sense if they were conditional.

    For example "IF_Z NEGZ value1, value2" would not make sense, since the instruction could be assembled as just "NEG value1,value2".
    Likewise "IF_NZ NEGZ value1,value2" would be assembled as just "MOV value1,value2".

    So could not the IF_Z condition flag be used to indicate a totally different instruction for NEGZ and the like ?

    Just a thought....

    Bean
  • evanhevanh Posts: 15,349
    edited 2014-04-23 05:15
    Start with NOP, or more accurately IF_NEVER. There are currently over 268 million possible variants of NOP in the instruction set.
  • SeairthSeairth Posts: 2,474
    edited 2014-04-23 12:19
    Bean wrote: »
    In the quest to get more instructions, I was wondering if anyone has thought about using the IF_xx bits for instructions that would make no sense if they were conditional.

    For example "IF_Z NEGZ value1, value2" would not make sense, since the instruction could be assembled as just "NEG value1,value2".
    Likewise "IF_NZ NEGZ value1,value2" would be assembled as just "MOV value1,value2".

    So could not the IF_Z condition flag be used to indicate a totally different instruction for NEGZ and the like ?

    No, you can't substitute like that. In the first example, if Z=0, then NEGZ doesn't execute at all and the D register (value1) does not get changed. NEG (without a predicate) always updates D, which may not be your intent.
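
A small model makes the difference visible. This is a sketch only: it uses plain Python integers instead of 32-bit registers, assumes the Propeller 1 meaning of NEGZ (D = -S if Z is set, else D = S), and all function names are invented for the illustration.

```python
def neg(d, s, z):
    """NEG D,S: always writes -S to D."""
    return -s


def mov(d, s, z):
    """MOV D,S: always writes S to D."""
    return s


def negz(d, s, z):
    """NEGZ D,S: writes -S if Z is set, otherwise S."""
    return -s if z else s


def if_z(op):
    """Predicate wrapper: run op only when Z is set, else leave D unchanged."""
    def predicated(d, s, z):
        return op(d, s, z) if z else d
    return predicated


def if_nz(op):
    """Predicate wrapper: run op only when Z is clear, else leave D unchanged."""
    def predicated(d, s, z):
        return d if z else op(d, s, z)
    return predicated


if_z_negz = if_z(negz)
if_nz_negz = if_nz(negz)

d, s = 7, 5  # arbitrary register contents for the comparison
for z in (0, 1):
    print(f"Z={z}: IF_Z NEGZ -> D={if_z_negz(d, s, z)}, NEG -> D={neg(d, s, z)}")
    print(f"Z={z}: IF_NZ NEGZ -> D={if_nz_negz(d, s, z)}, MOV -> D={mov(d, s, z)}")
```

The predicated forms match the plain instructions only when the condition holds; otherwise D keeps its old value, which is why the substitution is not safe.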
  • Cluso99Cluso99 Posts: 18,069
    edited 2014-04-23 13:27
    Bean,
    We are not short of opcodes. However, on the earlier P2, Chip used these bits to define some instructions.
  • T ChapT Chap Posts: 4,217
    edited 2014-04-23 13:58
    What is involved to offer a .65mm pitch version?
  • Roy ElthamRoy Eltham Posts: 2,996
    edited 2014-04-23 14:53
    T Chap,
    A LOT more money! (million+ instead of in the 10s-100s of thousands)
    Oops, misread.
  • T ChapT Chap Posts: 4,217
    edited 2014-04-23 14:56
    I am talking pin spacing.
  • Brian FairchildBrian Fairchild Posts: 549
    edited 2014-04-23 15:19
    T Chap wrote: »
    What is involved to offer a .65mm pitch version?
    Is there a 0.65mm 100-pin TQFP package?
  • Peter JakackiPeter Jakacki Posts: 10,193
    edited 2014-04-23 15:47
    Is there a 0.65mm 100-pin TQFP package?

    Even if there was, the 0.4mm pitch package is big enough already, having it too big would limit some applications I feel. Here's a footprint comparison between P1 and P2 (thermal pad in blue is approximate in size).

    P2-P1 FOOTPRINT.png

    Correction, I have used a QFP100 with 0.65mm pitch before (not TQFP).
    P2-P1 FOOTPRINT-2.jpg
  • RaymanRayman Posts: 14,132
    edited 2014-04-23 16:52
    Is there a 0.65mm 100-pin TQFP package?

    I googled and found someone selling a probe adapter for one, so I guess it does/did exist...

    I have to say that 0.65 is enormously easier than 0.5 mm for me.
    Don't know if it's an option, but I'd also much prefer 0.65 mm spacing.

    For the old P2, maybe it didn't matter because I guess we'd mostly buy Parallax made modules anyway...

    But, the new P2 has enough RAM that I think I could do cool things without the need for SDRAM.
    So, I'd like to be able to solder the chip myself.

    I can do 0.5 mm, but it often needs very difficult rework, needing a loupe to inspect the leads...

    BTW: Why does everybody quote everything somebody says when replying? People seem pretty quote happy here...
  • mindrobotsmindrobots Posts: 6,506
    edited 2014-04-23 17:01
    Don't some of the packaging options depend on the one Chip found with the large thermal pad? It sounded like he wanted to use that packager, so it's up to what they have to offer.
  • jmgjmg Posts: 15,155
    edited 2014-04-23 17:11
    T Chap wrote: »
    What is involved to offer a .65mm pitch version?

    Mainly bonding mapping and deciding what not to bond, as well as NREs for test and packaging setups.

    Amkor show these possible pin counts, in a Thermal Pad, 14x14 package : 52/64/80/100/120/128

    and the approximate pin pitches these pin counts imply are
    12.5/(128/4) = 0.390625
    12.5/(120/4) = 0.4167
    12.5/(100/4) = 0.5 << Package Chip has chosen
    12.5/(80/4) = 0.625 (0.65mm) << Might just manage 64io ?
    12.5/(64/4) = 0.78125 (0.8mm) << 64 pin 0.8mm seems common Asian option
    12.5/(52/4) = 0.9615
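
The same estimate can be written as a short script. It is a rough sketch that reuses the 12.5 mm usable lead span per side from the list above and assumes pins are split evenly over four sides; the names are made up for the example.

```python
# Usable lead span per side, in mm (the 12.5 figure used in the list above).
LEAD_SPAN_MM = 12.5

# Pin counts listed for the 14x14 thermal-pad package.
PIN_COUNTS = (128, 120, 100, 80, 64, 52)


def approx_pitch_mm(pin_count: int, span_mm: float = LEAD_SPAN_MM) -> float:
    """Approximate pitch, assuming the pins are divided evenly over four sides."""
    pins_per_side = pin_count / 4
    return span_mm / pins_per_side


for pins in PIN_COUNTS:
    print(f"{pins:3d} pins -> ~{approx_pitch_mm(pins):.4f} mm pitch")
```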

    and there is also Body Size 14mm x 20mm, Pitch 0.65mm, and also a 128 pin 0.8mm

    There is a bit of a trend for Asian uC vendors (eg Renesas) to offer both 0.5mm and 0.8mm packages, which must be production-flow/handling driven.

    I've seen consumer PCBs with very long pads / solder thieves, 45° placed, on what I think was 0.8mm, that looked to be wave soldering handled. - some info here http://www.ami.ac.uk/courses/topics/0170_wsp/

    Infineon also says this

    [" Wave soldering an exposed-pad QFP is not possible because the package has to be attached to the PCB by SMD glue. In QFPs with exposed pad, this is not possible if the exposed pad is intended to be soldered to the PCB. Furthermore, wave soldering is only possible if the products in QFPs are qualified for wave soldering (passing the solder heat test).
    Generally wave soldering of QFPs is difficult because the leads have to be soldered at all four sides of the component. Using a 45° rotated layout is recommended to allow the solder to wet the leads more easily. Wave soldering of QFPs with lead pitches of 0.65 mm or smaller may lead to excessive bridging and therefore is not recommended."]
    ie ~0.8mm is wave solder min.

    Looks like a common option is the 64p 0.8mm one, question is, what IO's can that give (~48?) , and is that still useful ?
  • T ChapT Chap Posts: 4,217
    edited 2014-04-23 17:28
    I will use a stainless stencil on the .5mm Prop, not wave, and there will be a percentage of bridge fixes. .65mm is very easy. I was more asking about .65mm as an option. I will be dropping 3 SO16 wide body parts and 1 SO8 with the implementation of the new Prop, so I would be happy with the large package.
  • RaymanRayman Posts: 14,132
    edited 2014-04-23 17:33
    Think I'll have to get a stereo microscope if P2 comes out with 0.5mm pitch, as planned.
    Need to figure out what X I need...
  • T ChapT Chap Posts: 4,217
    edited 2014-04-23 17:36
    So much space, yet 25 pins crammed together. I think I'd prefer a BGA 100 10x10 array at 1mm centers, although I understand the thermal needs. To get access to the TQFP pins you will need some tiny traces at the chip, fanned out as large as the .65mm part anyway.
  • jmgjmg Posts: 15,155
    edited 2014-04-23 17:47
    T Chap wrote: »
    I will use a stainless stencil on the .5mm Prop, not wave, and there will be a percentage of bridge fixes. .65mm is very easy. I was more asking about .65mm as an option.

    If adding another package, it might be smarter to gain more than just slightly better yields on manual assembly.

    The 0.8mm option seems to open up wave soldering, and I find there is a 88 pin 0.8mm offering in 20mm x 20mm.
    (which may be a better combination than the 28mm 128 pin 0.8mm ?)

    88 pins should manage 64io quite well, or the more common 80pins fits 0.8mm at 14mm x 20mm, but has fewer Vcc pins for still 64 io.


    http://www.pcb-3d.com/tag/qfp
  • T ChapT Chap Posts: 4,217
    edited 2014-04-23 17:58
    Yes, 88pin at .8mm is great.
  • RaymanRayman Posts: 14,132
    edited 2014-04-23 18:47
    Maybe they're thinking that using smaller pin spacing means that pins are closer to the die and therefore are better at dissipating heat...