The New 16-Cog, 512KB, 64 analog I/O Propeller Chip - Page 21 — Parallax Forums

The New 16-Cog, 512KB, 64 analog I/O Propeller Chip


Comments

  • Rayman Posts: 14,768
    edited 2014-04-11 15:50
    Actually, that article gives me the thought...

    Wouldn't it be nice if hubexec mode would also let you run routines that are in cog ram? Does it do that already?
    Would that give you more predictable timing on critical timing routines, say like SDRAM access or something?
  • mindrobots Posts: 6,506
    edited 2014-04-11 16:01
    jazzed wrote: »
    Right it's just running code "on the metal" as I suggested before. You can call it "Pink Floyd" if you like. I don't care. It doesn't matter, but I called it native. It's not important.

    Pink isn't well he stayed back at the hotel.
    They sent us along as a surrogate band
    We're gonna find out where you all really stand!
  • Heater. Posts: 21,230
    edited 2014-04-11 16:04
    I did not get it either.

    A Spin program may well have some assembler in it. In a DAT section.

    On a P1 this gets loaded to COG and runs.

    On a PII it may well never get loaded in a lump like that. It just gets run from where it is.

    Makes no difference to OBEX style objects.
  • Heater. Posts: 21,230
    edited 2014-04-11 16:12
    Rayman,
    I think executing from cog registers is the same thing as executing from L2 cache on a "normal" core...
    Not according to anything I ever read.


    Registers are registers. Memory is memory. There may well be one or more layers of cache.


    For sure on a normal processor, cache or not, you cannot do:

    mov    r0, #22
    jmp    r0
    


    On a Prop you can.
  • Heater. Posts: 21,230
    edited 2014-04-11 16:13
    mindrobots,

    Is there anybody in there?
  • Cluso99 Posts: 18,069
    edited 2014-04-11 16:40
    Rayman wrote: »
    Actually, that article gives me the thought...

    Wouldn't it be nice if hubexec mode would also let you run routines that are in cog ram? Does it do that already?
    Would that give you more predictable timing on critical timing routines, say like SDRAM access or something?
    Yes, you can JMP/CALL/RET between hub and cog at will. There are 4 sets of JMP/CALL/RET/PUSH/POP, where the return address is placed in $1EF, on the PTRA stack, on the PTRB stack, or on the 4-level internal stack. And there is JMPSW, which is the same as the old P1 JMPRET (not sure if this works for hub though).
    See the Instructions thread.
  • evanh Posts: 16,041
    edited 2014-04-11 16:47
    Rayman wrote: »
    Actually, that article gives me the thought...

    Wouldn't it be nice if hubexec mode would also let you run routines that are in cog ram? Does it do that already?
    Would that give you more predictable timing on critical timing routines, say like SDRAM access or something?

    Hubexec is integral to the redesigned normal Cog operating mode (unless the P1+ differs from the P2 implementation), rather than being one of a selection of processor modes that are switched between. So, by and large, Cogexec and Hubexec refer to where the executing code happens to reside.

    There are still timing/fetching issues with HubRAM that cause stalls when the executing code resides in HubRAM. So timing-critical code would still best reside in the Cog.
  • jazzed Posts: 11,803
    edited 2014-04-11 16:49
    Hmm.

    We're just two lost souls swimming in a fish bowl year after year.
  • Electrodude Posts: 1,660
    edited 2014-04-11 16:53
    Can there be a way to automatically wrquad ptra+=16 the tiny stack on overflow and rdquad ptra; ptra-=16 it on underflow, i.e. to automatically swap it into or out of the hub? It would be faster than using only a hub stack but would allow for a bigger stack than just the tiny stack.

    If you can't do this, can you please at least make the tiny stack have 8 levels?

    electrodude
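
The spill-and-refill idea in this post can be modelled in a few lines. What follows is only a toy sketch in Python, not anything from the instruction spec: the class name `SpillingStack`, the `hub` list standing in for hub RAM, and the choice to spill the whole 4-level stack as one block are all assumptions made for illustration.

```python
class SpillingStack:
    """Toy model of a small fast stack that spills to a larger backing store."""

    def __init__(self, depth=4):
        self.depth = depth      # levels held in the fast "tiny" stack
        self.fast = []          # fast levels; the newest entry is last
        self.hub = []           # stand-in for hub RAM; one item per spilled block
        self.hub_transfers = 0  # counts block transfers (the slow operations)

    def push(self, value):
        if len(self.fast) == self.depth:
            # Overflow: write the whole fast stack out as one block
            # (the "wrquad ptra+=16" idea in the post), then start afresh.
            self.hub.append(self.fast)
            self.fast = []
            self.hub_transfers += 1
        self.fast.append(value)

    def pop(self):
        if not self.fast:
            if not self.hub:
                raise IndexError("pop from empty stack")
            # Underflow: read the most recently spilled block back in.
            self.fast = self.hub.pop()
            self.hub_transfers += 1
        return self.fast.pop()


# Ten nested "calls" followed by ten "returns" on a 4-level stack.
stack = SpillingStack(depth=4)
for address in range(1, 11):
    stack.push(address)
popped = [stack.pop() for _ in range(10)]
```

In this model only every fourth push or pop touches the backing store, which is the speed argument made in the post. One caveat of spilling the whole block: a call/return pattern that hovers right at the boundary would transfer a block on nearly every operation, so a real design might spill only part of the stack.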
  • evanh Posts: 16,041
    edited 2014-04-11 17:02
    Is it even a good idea to have the tiny stack at all? From what I've read compilers just won't use it. Doesn't that then mean that assembly coding can do fine through other means?
  • Cluso99 Posts: 18,069
    edited 2014-04-11 17:11
    evanh wrote: »
    Is it even a good idea to have the tiny stack at all? From what I've read compilers just won't use it. Doesn't that then mean that assembly coding can do fine through other means?
    Yes, I have wondered that too. But without it, we have to use hub (or the fixed $1EF, which means we then need to save it somewhere, i.e. make our own stack). But 4 deep is small.

    Maybe we could have 1 hub stack (using PTRB) and one cog stack (using INDB) or a deeper LIFO ???
  • David Betz Posts: 14,516
    edited 2014-04-11 17:28
    evanh wrote: »
    Is it even a good idea to have the tiny stack at all? From what I've read compilers just won't use it. Doesn't that then mean that assembly coding can do fine through other means?
    Is there a tiny stack in P1+? I hadn't noticed. You're right that the compiler is unlikely to use this for anything other than COG helper functions. I think PASM programmers liked the idea of it though and I like the idea that self-modifying code isn't required.
  • David Betz Posts: 14,516
    edited 2014-04-11 17:32
    Rayman wrote: »
    Actually, that article gives me the thought...

    Wouldn't it be nice if hubexec mode would also let you run routines that are in cog ram? Does it do that already?
    Would that give you more predictable timing on critical timing routines, say like SDRAM access or something?
    I had assumed that was already possible.
  • Cluso99 Posts: 18,069
    edited 2014-04-11 17:42
    David Betz wrote: »
    Is there a tiny stack in P1+? I hadn't noticed. You're right that the compiler is unlikely to use this for anything other than COG helper functions. I think PASM programmers liked the idea of it though and I like the idea that self-modifying code isn't required.
    It's in the instruction spec (unless Chip has forgotten to remove it) as being 4-levels deep. Seems like the P2 version we had for tasks.
    Unfortunately 4 deep is a bit short to be of much use.
  • rjo__ Posts: 2,114
    edited 2014-04-11 18:23
    I am constantly amazed by how little information is necessary to give you guys a complete understanding... It is just f***ing astounding.
    The idea that ozpropdev does what he does on a regular basis is a perfect example.

    I think most of the people lurking have no idea what you guys are talking about. I have no clue how the next chip is going to operate... the instructions are fine, I get them very easily... after that it is less than a blur.

    How about throwing some diagrams into your arguments?
  • rjo__ Posts: 2,114
    edited 2014-04-11 18:25
    By the way, I like the idea of calling the default (first user experience) mode... native. It gives the natural impression that other things are possible.
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2014-04-11 18:48
    Well, this thread has certainly devolved into mediocrity! I stay away for a couple days, and arguments that used to be centered on substance now revolve around semantics. Where's the passion, guys? :)

    Seriously, it's probably a good sign that actual decisions are being made. Decisions presage progress; progress presages silicon. (I'm not holding my breath, though.)

    -Phil
  • mindrobots Posts: 6,506
    edited 2014-04-11 18:48
    rjo__ wrote: »
    By the way, I like the idea of calling the default (first user experience) mode... native. It gives the natural impression that other things are possible.

    When I first used "Native" last night, it was in parentheses and mostly as a space holder between CMM and XMM. The comment afterwards said "a big flat address space like everyone else"

    I'm REALLY glad now I didn't put "commando"...I was considering it last night as a joke.
  • Cluso99 Posts: 18,069
    edited 2014-04-11 18:49
    Chip,

    Are these registers correct?

    Should the first INDA/INDB be PTRA/PTRB ?
    They are INA & INB (the Port Inputs) - thanks Seairth.

    Are you implementing INDA/INDB ?
    addr        read        write        name        background
    --------------------------------------------------------------------------
    000..1EE    RAM        RAM           -           -
    1EF         RAM        RAM           (used by LINK to save return address)
                                                   
    1F0         CNT        -             CNT         DCACHE0
    1F1         RND        -             RND         DCACHE1
    1F2         INA        -             INA         DCACHE2
    1F3         INB        -             INB         DCACHE3
    1F4         RAM        RAM+OUTA      OUTA        -
    1F5         RAM        RAM+OUTB      OUTB        -
    1F6         RAM        RAM+DIRA      DIRA        -
    1F7         RAM        RAM+DIRB      DIRB        -
    1F8         RAM        RAM+CTRA      CTRA        -
    1F9         RAM        RAM+CTRB      CTRB        -
    1FA         RAM        RAM+FRQA      FRQA        -
    1FB         RAM        RAM+FRQB      FRQB        -
    1FC         PHSA       PHSA          PHSA        ICACHE0
    1FD         PHSB       PHSB          PHSB        ICACHE1
    1FE         indirect   indirect      INDA        ICACHE2
    1FF         indirect   indirect      INDB        ICACHE3
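
Read as an address decode, the table above amounts to a small lookup. The following is a minimal Python sketch that assumes the draft layout exactly as posted, which the post itself is still querying; the names `SPECIAL_REGISTERS`, `LINK_RETURN_ADDRESS`, and `describe` are hypothetical and exist only for this illustration.

```python
# Register names copied from the "name" column of the table in this post.
# The layout is a draft under discussion, not a confirmed specification.
SPECIAL_REGISTERS = {
    0x1F0: "CNT",  0x1F1: "RND",  0x1F2: "INA",  0x1F3: "INB",
    0x1F4: "OUTA", 0x1F5: "OUTB", 0x1F6: "DIRA", 0x1F7: "DIRB",
    0x1F8: "CTRA", 0x1F9: "CTRB", 0x1FA: "FRQA", 0x1FB: "FRQB",
    0x1FC: "PHSA", 0x1FD: "PHSB", 0x1FE: "INDA", 0x1FF: "INDB",
}

# Per the table, LINK saves its return address in this ordinary RAM location.
LINK_RETURN_ADDRESS = 0x1EF


def describe(addr):
    """Return a short label for a 9-bit cog address, following the table."""
    if not 0 <= addr <= 0x1FF:
        raise ValueError("cog addresses run from $000 to $1FF")
    if addr in SPECIAL_REGISTERS:
        return SPECIAL_REGISTERS[addr]
    if addr == LINK_RETURN_ADDRESS:
        return "RAM (LINK return address)"
    return "RAM"


labels = [describe(a) for a in (0x000, 0x1EF, 0x1F2, 0x1FF)]
```

The sketch covers only the "name" column; the read/write behaviour and the cache columns in the table are not modelled.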
    
  • Cluso99 Posts: 18,069
    edited 2014-04-11 18:54
    Phil and kuroneko,

    Do we need 2 sets of counters in each cog?
    Would one set suffice?
    I presume some modes could be removed because of the new Smart I/O features? If so, what?
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2014-04-11 19:11
    Cluso wrote:
    Phil and kuroneko,

    Do we need 2 sets of counters in each cog?
    Would one set suffice?
    I was rather hoping for three. My signal front-ends-plus-I/Q-mixers use three counters (actually five, including the local oscillators), and I've had to start an extra cog in the P1, just to accommodate the third one.

    A single counter per cog is useless for the kind of stuff I do. But, again, if more can't be accommodated, I'm content to keep using the P1. It'll be lower-power and less expensive anyway. At some point, it would be nice to see a multi-core chip from somebody that's optimized for RF apps, without having to resort to FPGAs.

    -Phil
  • Ramon Posts: 484
    edited 2014-04-11 19:11
    Well, this thread has certainly devolved into mediocrity!

    It looks like we cannot moderate ourselves in this forum. Would it be possible to close all threads in the P2 forum for a while, or limit the number of posts per day? (Allow just one post per person per day.) I think that right now we are a huge waste of time for Chip.

    Do we want to do the same again? The best microcontroller never made.
  • RossH Posts: 5,484
    edited 2014-04-11 19:12
    jazzed wrote: »
    Ya Heater that's it.

    Ross, I don't understand what you mean. OBEX is comprised of Spin code. How is it relevant?

    I'm assuming there will be an OBEX equivalent for high level languages other than Spin. Then it will become quite important to know which objects use COG mode and which use HUB mode.

    Ross.
  • Seairth Posts: 2,474
    edited 2014-04-11 19:17
    Cluso99 wrote: »
    Chip,

    Are these registers correct?

    Should the first INDA/INDB be PTRA/PTRB ?
    Are you implementing INDA/INDB ?
    addr        read        write        name        background
    --------------------------------------------------------------------------
    000..1EE    RAM        RAM           -           -
    1EF         RAM        RAM           (used by LINK to save return address)
                                                   
    1F0         CNT        -             CNT         DCACHE0
    1F1         RND        -             RND         DCACHE1
    1F2         INA        -             INA PTRA?   DCACHE2
    1F3         INB        -             INB PTRB?   DCACHE3
    1F4         RAM        RAM+OUTA      OUTA        -
    1F5         RAM        RAM+OUTB      OUTB        -
    1F6         RAM        RAM+DIRA      DIRA        -
    1F7         RAM        RAM+DIRB      DIRB        -
    1F8         RAM        RAM+CTRA      CTRA        -
    1F9         RAM        RAM+CTRB      CTRB        -
    1FA         RAM        RAM+FRQA      FRQA        -
    1FB         RAM        RAM+FRQB      FRQB        -
    1FC         PHSA       PHSA          PHSA        ICACHE0
    1FD         PHSB       PHSB          PHSB        ICACHE1
    1FE         indirect   indirect      INDA        ICACHE2
    1FF         indirect   indirect      INDB        ICACHE3
    

    That's "IN", not "IND".
  • rjo__ Posts: 2,114
    edited 2014-04-11 19:22
    I agree that from time to time things get messy around here. BUT there is plenty of information flying around. I wouldn't be coming here if I thought it was just wasted conversation.

    I don't think Chip is paying much attention at all. I think he sees a few words, knows what the rest of the conversation is going to be and then goes back to work.
  • RossH Posts: 5,484
    edited 2014-04-11 19:26
    Ramon wrote: »
    It looks like we cannot moderate ourselves in this forum. Would it be possible to close all threads in the P2 forum for a while, or limit the number of posts per day? (Allow just one post per person per day.) I think that right now we are a huge waste of time for Chip.

    Do we want to do the same again? The best microcontroller never made.

    The only time I really get worried is when I see Chip actually posting a lot. That tells us he's not busy concentrating on the P16X32, or perhaps he's got himself a knotty problem he just can't solve.

    Normally, I think he just drops by occasionally to poke a stick in the anthill, to see if anything interesting crawls out. :lol:

    Ross.
  • rjo__ Posts: 2,114
    edited 2014-04-11 19:27
    Phil,

    I respect your work. And if you say that there is something in the architecture that will keep you from being able to use it for your RF work... that is important to me.
    I know of all kinds of university-level research that depends upon RF modulation. This means that there are academic markets that this chip cannot penetrate.
    It also means that there is research that won't be done because budgets won't allow it.

    So, it isn't just you.

    I don't understand the problem, but if you say it is there, that is good enough for me. I wish you would be a little more verbose. I would like to understand it more, but I also understand that you are not exactly happy right now:)

    Rich
  • jmg Posts: 15,175
    edited 2014-04-11 20:10
    I was rather hoping for three. My signal front-ends-plus-I/Q-mixers use three counters (actually five, including the local oscillators), and I've had to start an extra cog in the P1, just to accommodate the third one.

    A single counter per cog is useless for the kind of stuff I do. But, again, if more can't be accommodated, I'm content to keep using the P1. It'll be lower-power and less expensive anyway. At some point, it would be nice to see a multi-core chip from somebody that's optimized for RF apps, without having to resort to FPGAs.

    My understanding was Chip was looking at removing Counters in the COGs entirely, and using the Pin-Cell counters.
    ( certainly not adding more COG counters )

    A 32-bit adder, a mux, and a carry out are needed to do NCO at the pins, and that does boost the size of the Pin-Cell and slow it down.
    I tried this, and got varying impacts depending on which Lattice device I targeted.
    The 65nm ECP3 FPGA (~same as Cyclone IV?) seems to have Clock Enables and ripple logic, and it has less impact from adding NCO+Carry mode, than older FPGA/CPLD choices. With a pipelined mux the ECP3 P&R reported over 220MHz

    So this may come down to Speed and Die area, with perhaps some backward compatibility questions, and may even depend on the On-Semi tools and how they think.

    I would think the 200MHz target is important to try to meet, especially as the cores now only need to hit 100MHz

    If the existing COG counters are small enough, it would seem sensible to keep them for backward-compatibility reasons, if nothing else.
    If that is done, the Pin Cells can be a little smaller, and faster, without duplicating the NCO feature.
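
For readers unfamiliar with the NCO mode discussed here, the adder-plus-carry arrangement is a phase accumulator: a frequency word is added to a 32-bit register every clock, and the carry out of the top bit is the output pulse. A minimal Python sketch of that behaviour follows. It models the general technique only, not Chip's pin-cell logic, and the names `nco_carries`, `frq`, and `phase` are illustrative assumptions.

```python
def nco_carries(frq, cycles, phase=0):
    """Step a 32-bit phase accumulator and collect the carry out on each clock.

    Returns (carries, final_phase); carries holds one 0 or 1 per clock.
    """
    mask = 0xFFFFFFFF
    carries = []
    for _ in range(cycles):
        total = phase + frq
        carries.append(total >> 32)  # 1 when the 32-bit adder overflows
        phase = total & mask         # keep only the low 32 bits
    return carries, phase


# A frequency word of 2**30 overflows once every four clocks.
carries, final_phase = nco_carries(frq=1 << 30, cycles=8)
# carries == [0, 0, 0, 1, 0, 0, 0, 1]
```

On average the carry pulses arrive at clock rate × frq / 2^32, so with the 200MHz target mentioned in this post, a frequency word of 2^30 would give carries at 50MHz.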
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2014-04-11 20:13
    rjo__ wrote:
    I don't think Chip is paying much attention at all.

    That would be my fondest hope and a good sign that real progress is being made.
    Phil ... but I also understand that you are not exactly happy right now

    I can't honestly say that I'm unhappy. I'll use -- or not -- whatever actual silicon results from this very flawed and much too-open dev process. And if I don't, I can make the P1 jump through whatever hoops I need it to.

    But I can't help grimacing occasionally at the tragedy unfolding before our eyes that seems, at times, to be a subconscious effort not to have a real end in sight -- that somehow the process of getting there trumps whatever "there" is. While I can throw my total respect behind such a posture for things like hiking, kayaking, sailing, etc., in this context, it can only have a tragic end. And I shudder thinking about it, because Parallax has held such a pivotal importance in my life, in the friends I've made, and in my business.

    -Phil
  • rjo__ Posts: 2,114
    edited 2014-04-11 20:17
    My understanding is that in order to get the numbers, Chip had to move some of the logic outward to the pins for a variety of reasons including heat dissipation. I think we are firm on the number of Cogs... and there might be no way to get Phil to where he wants to be without moving even more toward the pins. I think the functionality is the issue... and I see Phil as the canary in the coal mine right now. I am sure Chip is looking very, very carefully at what Phil is saying.