Any interest in (more than 8) cog chip? — Parallax Forums

Any interest in (more than 8) cog chip?

prof_braino Posts: 4,313
edited 2013-07-19 14:26 in Propeller 1
Another thread discusses a 1 cog chip

http://forums.parallax.com/showthread.php/123505-Any-interest-in-one-COG-chip

This got me thinking, but rather than derail that discussion, I thought a separate thread would be appropriate.

Is there any interest in a "more than 8 cog" chip?

Most simple apps have trouble using all 8 cogs. On the other hand, some complex apps can use many more than 8 cogs.

In the context of propforth, we can simply add more physical chips and communicate over MSC, at the cost of one cog on both the master and the slave chip, and two I/O pins each. MSC is similar to Beau's synchronous serial. MSC is also synchronous serial and includes a protocol, and the link continuously sends 96-bit packets at clock speed. The effect is that we drop in a character on this side, and it pops out the other side, into the I/O stream of whatever cog is connected.

This is fine, but really, for the larger apps, we would benefit from 16 or 32 cogs on a chip, and more I/O pins. Even if the cost is more than $1 per cog, this would be a benefit. The cogs would be full, complete cores; not pared down or otherwise reduced in function.

More I/O pins: Although the Prop 1 COULD control 32 servos at once (for example), we would not be able to talk to the chip. Another 32 pins would be the smallest reasonable increase, I think.

So it seems our needs go in the opposite direction from the other thread's needs in several ways:

- need more COMPLETE, identical cogs; 16 or 32.
- need more I/O pins. 64 pins total would be minimum on a chip with more cores.
- More cog memory would be secondary, but nice.
- More hub memory would be secondary, but nice.
- A faster clock would be secondary, but nice.
- VGA support could be removed. Rarely do apps require this. Even those that do only need it on a few cogs, so having it present on all cogs is not a big benefit. Any external solution might provide this functionality better.

Does anyone find a need for MORE cogs?

Could the cost still be around $1 per core?

Thanks for your opinions.

Comments

  • Duane Degn Posts: 10,588
    edited 2013-07-17 12:50
    Does anyone find a need for MORE cogs?

    Could the cost still be around $1 per core?

    I know this idea gets bounced around every so often.

    First off, I often need/want more cogs. I frequently use a second Prop in my projects.

    There are lots of severe problems with adding additional cogs to a Propeller.

    One problem is just the physical layout of the chip. I don't think there's much more room around the hub for the additional silicon the new cogs would require.

    The other big problem is with timing. More cogs means less access to the hub. Many drivers that work on an 8 cog Prop would not work on a 16 cog version.
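A rough sketch of the arithmetic behind the timing concern, in Python. On the 8-cog P1 each cog gets a hub window once every 16 system clocks, i.e. 2 clocks per slot. The assumption here (not something stated in this thread) is that a hypothetical 16- or 32-cog part keeps the same strict round-robin and the same 2 clocks per slot; the function names are illustrative only.

```python
# Sketch: per-cog hub access under strict round-robin.
# Assumption: every hub slot lasts 2 system clocks, as on the 8-cog P1.
# Parts with 16 or 32 cogs are hypothetical.

CLOCKS_PER_SLOT = 2  # P1 figure, assumed unchanged for larger parts


def hub_window_period(num_cogs: int) -> int:
    """System clocks between successive hub windows for one cog."""
    return num_cogs * CLOCKS_PER_SLOT


def hub_accesses_per_second(num_cogs: int, clock_hz: int) -> float:
    """Upper bound on hub accesses per second for a single cog."""
    return clock_hz / hub_window_period(num_cogs)


if __name__ == "__main__":
    for cogs in (8, 16, 32):
        period = hub_window_period(cogs)
        rate = hub_accesses_per_second(cogs, 80_000_000)
        print(f"{cogs:2d} cogs: window every {period} clocks, "
              f"{rate / 1e6:.2f} M accesses/s per cog")
```

Under these assumptions, doubling the cog count halves the hub bandwidth available to each cog, which is why a driver tuned to a 16-clock hub window might not survive on a 16-cog part.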

    I ran into the problem of having no pins left for input when I made my 32 servo demo. Sure the Prop could drive 32 servos but all movements had to be pre-programmed since there weren't any pins left over to receive input. This problem could easily be solved with a few extra chips though. One Prop pin could be used to drive many servos if external circuitry is used.
  • Ken Gracey Posts: 7,400
    edited 2013-07-17 12:52
    I don't have the answer you're looking for, Doug, but I can tell you what customers request from us.

    The most common requests we are receiving are lower pin-count chips, at a lower cost, maybe even with fewer cogs. Aside from this, they ask for more RAM, code protect, A/D and more pins (Propeller 2). This larger request might also include more cogs.
  • Dave Hein Posts: 6,347
    edited 2013-07-17 13:03
    The hub access thing could be handled by a more flexible bus arbiter. Some things we run in cogs require precise timing, but I think most applications do not require the timing to be predictable. For cogs that require a periodic bus slice, the arbiter could be told which cycles are assigned to each cog. For the rest of the cogs, they would get memory access on a first come, first served basis.

    With 16 cogs, each cog would be assigned by default one hub access slot out of 16. This could then be modified where one cog might get every 4th access slot, while other cogs get assigned on a first come, first served basis.
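One way the arbiter described above might behave, as a small Python simulation. This is only a sketch under stated assumptions: the 16-entry slot table, the class and function names, and the choice to hand an unused reserved slot to the first-come, first-served queue are all hypothetical and are not taken from any Parallax design.

```python
from collections import deque
from typing import Dict, List, Optional

NUM_SLOTS = 16  # one rotation of the hub on a hypothetical 16-cog part


def build_slot_table(reserved: Dict[int, int]) -> List[Optional[int]]:
    """Return a table mapping slot index -> cog id, or None for a shared slot.

    `reserved` maps a cog id to its period in slots, e.g. {0: 4} gives
    cog 0 every 4th slot. The starting slot is simply cog_id % period.
    """
    table: List[Optional[int]] = [None] * NUM_SLOTS
    for cog, period in reserved.items():
        for slot in range(cog % period, NUM_SLOTS, period):
            if table[slot] is not None:
                raise ValueError(f"slot {slot} is already reserved")
            table[slot] = cog
    return table


class HubArbiter:
    """Grants one hub access per slot: reserved owner first, then FIFO."""

    def __init__(self, table: List[Optional[int]]):
        self.table = table
        self.slot = 0
        self.reserved_pending = set()  # reserved cogs waiting for their slot
        self.fifo = deque()            # unreserved cogs, oldest request first

    def request(self, cog: int) -> None:
        """Register that `cog` wants a hub access."""
        if cog in self.table:
            self.reserved_pending.add(cog)
        elif cog not in self.fifo:
            self.fifo.append(cog)

    def tick(self) -> Optional[int]:
        """Advance one slot and return the cog granted access, if any."""
        owner = self.table[self.slot]
        self.slot = (self.slot + 1) % len(self.table)
        if owner is not None and owner in self.reserved_pending:
            self.reserved_pending.discard(owner)
            return owner
        # Shared slot, or a reserved slot its owner is not using (assumption).
        if self.fifo:
            return self.fifo.popleft()
        return None


if __name__ == "__main__":
    arbiter = HubArbiter(build_slot_table({0: 4}))
    for cog in (5, 0, 7):
        arbiter.request(cog)
    print([arbiter.tick() for _ in range(4)])
```

In this sketch the reserved cog keeps a fixed, predictable slot, while the timing seen by the other cogs depends on who else is queued at that moment.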

    It would be nice if the cog memory was increased. An additional 2K of memory per cog would be good. And more hub RAM would be good. 128K by P2 would be very nice.
  • average joe Posts: 795
    edited 2013-07-17 13:25
    A Propeller with more cogs would be great. More pins would be even better. A chip with more cog RAM would be nice, but the bigger hub would be my first choice. There have only been a couple of situations where I ran out of cog RAM. This was solved by implementing a state machine to do LMM-esque functions. There have been several times I have run out of hub RAM. Luckily I had SD on these projects, so segmenting code into smaller applications worked. A code protect scheme would be my first request over all the above options. There are not many projects I work on that are closed source, but the current one is. As nice as any of these improvements would be, I don't expect to see new silicon containing these features until well after the release of the P2.
  • prof_braino Posts: 4,313
    edited 2013-07-17 13:55
    Ken Gracey wrote: »
    ... but I can tell you what customers request from us.

    most common request... lower pin-count chips, at a lower cost, maybe even with fewer cogs

    Aside from this, they ask for more RAM, code protect, A/D and more pins (Propeller 2). This larger request might also include more cogs.

    Thanks for the input! So, it looks like more PINS is the driving request (and last place), and more cogs has been secondary (or off the list), to most users. That tells the story.
  • cavelamb Posts: 720
    edited 2013-07-17 14:59
    Ken Gracey wrote: »
    I don't have the answer you're looking for, Doug, but I can tell you what customers request from us.

    The most common requests we are receiving are lower pin-count chips, at a lower cost, maybe even with fewer cogs. Aside from this, they ask for more RAM, code protect, A/D and more pins (Propeller 2). This larger request might also include more cogs.

    And, of course, more memory!
  • localroger Posts: 3,452
    edited 2013-07-17 16:30
    I don't see much use for more cogs. The only way you regularly end up using them all up now is using high-speed drivers that need multiple cogs to interleave data, and the need for those will be much less on P2 with its higher clock rate and pipeline throughput. The system I'm working on right now that inspired the "modular programming in Spin" post is completely replacing an entire PC dedicated kiosk. It needs 1 cog for main Spin logic, 1 cog for PS/2 keyboard, 1 cog for 2 serial ports, and 3 cogs for my 40x18 ROM font VGA object. If I wanted to do archiving to a SD card or ENC28J60 ethernet that would be one more cog each (but I won't have the RAM to properly implement those on P1 along with the HMI). Cogs just aren't the limit, and on P2 I won't need three cogs for the video driver.
  • Cluso99 Posts: 18,069
    edited 2013-07-17 16:50
    For a larger P1 there seem to be 4 basic requirements, of which 3 have been met with P2...
    • more I/O
    • more hub ram
    • more cog ram
    • more cogs
    With a new P1C design (keeping power usage low)...
    • more I/O
      • 48-64 I/O would be nice (perhaps 2 bonding options?)
      • add in another 32 internal I/O for use between cogs (just I/O without the P2 support)
    • more hub ram
      • suggest ~60KB hub ram and use ~4KB for boot/debug (as P2)
      • 64KB is a current P1 limitation and suggest this be kept
      • yes it is a larger die
    • more cog ram
      • Would love to see this but it's just too complex
      • P2 adds a 256 long clut/fifo
    • more cogs
      • Personally I would love to see this. However, doesn't seem to be so much demand
    Other comments...
    • I would like to see the hub access go from 1:16 to 1:8 if possible.
    • I don't think VGA or the counters add much silicon space, so no real benefit to remove them from some cogs
    Smaller P1...
    I am really surprised by this request. Seems to me that no one really understands that Parallax cannot significantly reduce the P1's cost without huge volume. The die is a certain size, and removing pins does not really reduce this much.
    I believe the driving force to limit the P1 is to achieve reduced cost, not board space.
  • Dr_Acula Posts: 5,484
    edited 2013-07-17 20:29
    Is there any interest in a "more than 8 cog" chip?

    I wonder if some of these things can be done in software rather than hardware?

    Buy two propeller chips. Devote two pins to an interprop comms system (objects already written) and devote a small amount of hub ram for circular buffers. Open a spin editor, and start adding objects. Add one line to each object which defines which chip it resides in.
    A lot of the code could be hidden from the user, or added automatically. For instance, say one prop talks to a keyboard, and the other prop does the display. Use a keyboard object and a display object. Write a tiny bit of code that is a series of virtual ports, so the keyboard might decide to send all the captured data to virtual port #1. And on the display propeller, the characters to display are collected from virtual port #1. The virtual port handler object would have n ports, with circular buffers of size x.
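The virtual port idea can be sketched in a few lines of Python. This is an illustration only, not any existing Propeller object: the class names `RingBuffer` and `VirtualPortHandler` are made up for the example, and the two-pin link between the chips is left out, so both "ends" live in one process here.

```python
class RingBuffer:
    """Fixed-size circular byte buffer, standing in for a block of hub RAM."""

    def __init__(self, size: int):
        self.data = bytearray(size)
        self.size = size
        self.head = 0   # index of the next write
        self.tail = 0   # index of the next read
        self.count = 0  # bytes currently stored

    def put(self, byte: int) -> bool:
        """Store one byte; return False if the buffer is full."""
        if self.count == self.size:
            return False
        self.data[self.head] = byte & 0xFF
        self.head = (self.head + 1) % self.size
        self.count += 1
        return True

    def get(self):
        """Return the oldest byte, or None if the buffer is empty."""
        if self.count == 0:
            return None
        byte = self.data[self.tail]
        self.tail = (self.tail + 1) % self.size
        self.count -= 1
        return byte


class VirtualPortHandler:
    """n virtual ports, each backed by a circular buffer of size x."""

    def __init__(self, num_ports: int, buffer_size: int):
        self.ports = [RingBuffer(buffer_size) for _ in range(num_ports)]

    def send(self, port: int, byte: int) -> bool:
        return self.ports[port].put(byte)

    def receive(self, port: int):
        return self.ports[port].get()


if __name__ == "__main__":
    handler = VirtualPortHandler(num_ports=4, buffer_size=16)
    for ch in "Hi":                      # keyboard side writes to port 1
        handler.send(1, ord(ch))
    received = []                        # display side drains port 1
    while (byte := handler.receive(1)) is not None:
        received.append(chr(byte))
    print("".join(received))
```

In a real two-Prop setup, a comms cog on each chip would move bytes between the matching ring buffers on either side of the link.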

    I've got more things on a "wish list", but I'm thinking about what could be done now with the existing tools. Prop chips can program other prop chips. I have done experiments with the speed and it is possible for a prop chip to 'capture' a download and send it out again to a second prop chip. The virtual port object is a combination of existing code already out there. The second prop could have an eeprom or it could work without an eeprom and be loaded from the first prop's eeprom.

    So ok, not a prop with 16 cogs, but it could be done with two props and one 64k eeprom.
  • Cluso99 Posts: 18,069
    edited 2013-07-17 20:57
    Drac: I pretty much agree with your analysis. It is after all what a number of us have been doing for a while. And we certainly have not exhausted all the possibilities of what the P1 can do. The prime shortage is pins, and this is mainly due to hub ram shortage. But realistically, we are doing things that the P1 was not designed for, and most likely not its mainstream use for commercial volumes.
    I don't mind getting a few pcbs done, but I sure would not want to put the $$$$ into a new P1 just because a few of us have pet projects that may have use for them.
  • Heater. Posts: 21,230
    edited 2015-07-01 22:52
    Personally I find this kind of speculation disheartening.

    If I were Chip I'd be rather ticked off at it. I mean he spent years developing the fabulous P2 with much input and discussion with all us folks on the forum. And now before it even hits the shelves people are clamoring for more this and better that.

    At one point Chip did ask if we wanted more COGs or more RAM. Given that you have to make a trade-off, you don't have an infinite number of transistors or amount of space to play with, and you have a price point to consider. I think the majority said "Give us the RAM, 8 COGs is OK".

    The reasons for this being:

    1) Adding RAM adds a lot more utility to the device than adding COGs. You can put a lot more functionality into your code for the amount of silicon consumed.

    2) Adding COGs reduces bandwidth to HUB. Not good.

    There are those that say the bandwidth issue can be mitigated by changing the round robin HUB access somehow and allowing more HUB cycles to COGs that need it. Or arranging for a bus arbiter that can dynamically let COGs take access slots when other COGs do not.

    I believe this is a really bad idea because:

    1) It buggers up determinism. (You don't have real-time deterministic timing on your HUB accesses)

    2) It buggers up determinism. (You can no longer tell if random objects selected from OBEX will work together)

    Some say that 1) is OK because we rarely need real-time determinism in HUB access, that's all done on the COG side of things.

    BUT I say 2) is the most important thing about the Propeller. Consider: I write an object or two that actually requires the maximum hub bandwidth I can get with some asymmetric HUB access scheme. Then I select an object I like from OBEX that also requires maximum hub access bandwidth.
    BOOM, I'm screwed: those objects will not play together. And the reasons why they fail may not be obvious.

    Introducing a bus arbiter or such HUB access mechanism means that not all COGs are equal. That breaks a fundamental design principle of the Propeller. Not good.

    Don't forget the PII now has hardware multi-threading in each COG. So an 8 COG PII can run, what is it, 32 real time threads. That goes a very long way to mitigating the desire for more COGs.
  • jmg Posts: 15,183
    edited 2013-07-17 23:27
    Is there any interest in a "more than 8 cog" chip?

    Most simple apps have trouble using all 8 cogs. On the other hand, some complex apps can use many more than 8 cogs.

    Interest ? Sure, interest is free. Commercial critical mass ? : Nope.

    Once you add the threading of Prop 2, this chestnut simply goes away.

    The 8 Cogs that exist now, are very rarely fully code loaded, and it is that code-memory-usage that can now be fully harvested with threading/time slicing.

    That same threading, is what allows 4 or even 1 COG parts to have practical use.
  • jmg Posts: 15,183
    edited 2013-07-17 23:33
    Heater. wrote: »
    ... means that not all COGs are equal. That breaks a fundamental design principle of the Propeller. Not good.

    All COGs do not need to be equal. There is significant silicon cost to the many fast MUL and DIV, and I doubt many designs will use EIGHT sets of Screaming MathOps.
    I would rather have (say) 4 Full Speed math cores and 4 smaller, slower ones, which could release silicon for some smarter serial IO and smarter Counters/timers.
  • Heater. Posts: 21,230
    edited 2013-07-17 23:38
    What might be really cool is a way to make using those PII threads as easy to use as COGs are now from Spin.

    Imagine a "threadinit" function that acted just the same as coginit. Point it at some PASM, give it some PAR pointer and start up a thread.

    I'm sure the idea has issues. Like, how do you know the code you want to run as a thread will fit in the COG that is already running some threads? How does it get loaded, you would need a kernel in the COG to do the load as the hardware can't load a partial COG?

    Hmm..messy. Scratch that idea.
  • franksanderdo Posts: 14
    edited 2013-07-17 23:42
    Hi Folks

    Even though I am very much new to the Propeller, I can't stop to add my opinion here:

    I believe it is important to feed Chip and his team with ideas, but on the other hand we have to be fair enough to keep the commercial interest in mind as well.
    Companies have failed just because they started diversifying chip designs too early (e.g. Inmos).

    See you around
    Frank
  • Heater. Posts: 21,230
    edited 2013-07-17 23:53
    jmg,

    I disagree. All COGs do need to be equal.

    Of course if you are writing all your code yourself you can be sure you know what resources are needed by your different processes and allocate them as required. That's a bit of extra thought and planning required but doable. It's all under your control.

    BUT:

    One of the most brilliant features of the Prop is that I can fetch a bunch of objects from OBEX, that I did not write. Then without studying their internals I can stitch them together into an application with a bit of my own top level code. All I have to worry about is:
    0) Do I have as many COGs as those objects require free?
    1) Do I have pins available for them to use?
    2) Do I have enough memory space to fit them in.

    Given the above conditions I can be sure that mixing and matching any objects will "just work". This amazing simplicity of combining stuff is one of the Prop's greatest strengths.

    However if you destroy the symmetry, if all my COGS are different, all that ease of use of third party objects is also destroyed. I would have to check carefully the resources required of each object and check if they will play nicely together. Yuk.

    In a way it's like putting interrupts into the Prop. Which would bring us back to the world of having to organize resources between objects. Effectively having to assign priorities to objects. Having to give up when two objects both want the top priority, etc.

    This is not good.

    Your example of maths cores illustrates this perfectly.
    You say it's unlikely we need 8 cores with full high speed maths support.
    I say, what is unlikely is surely going to happen. What if Joe user happens to stumble across five objects that need those maths capabilities? BOOM he is thwarted. And thwarted for reasons he may not understand and should not have to worry about.

    By the way your plan for only four cores with high speed maths has just buggered up my plan for running my Fast Fourier Transform in parallel chunks over six cores:(
  • Cluso99 Posts: 18,069
    edited 2013-07-17 23:59
    Heater: I don't agree it is disrespectful to Chip at all.

    It's fun to discuss this sort of thing. It will probably never happen. The P2 will take us way up to the next level and beyond. However, power will be a problem - just how much we don't know yet. IMHO the only mistake we made was when asked about the P1B we said NO, go for the P2. In hindsight, I think most of us would have now preferred to get the P1B a year or two ago.

    I agree that threading will help a lot. But there is still the old 2KB cog restriction although we do have some clut to use as fifo or scratch registers.
    And of course, you are 100% correct in that we asked for more hub ram in preference to cogs, and this was before we got threading. But unfortunately we didn't get as much ram as we would have liked - but that's life.

    What really annoys me is that I just cannot think of that killer app that would sell millions of P1s or P2s :(
  • jmg Posts: 15,183
    edited 2013-07-18 00:04
    Heater. wrote: »
    However if you destroy the symmetry, if all my COGS are different, all that ease of use of third party objects is also destroyed. I would have to check carefully the resources required of each object and check if they will play nicely together. Yuk.

    What you describe is a luxury, and not a necessity.

    If you really want equal COGS, you will have that in Prop 2. - but you do need to be aware that luxury is not cost free.

    History shows us Asymmetric Cores are already being used more, where the price matters. Making all cores the same, simply adds too much to the price.
  • franksanderdo Posts: 14
    edited 2013-07-18 00:11
    Cluso, to be honest I believe if somebody would find that killer app right now, it would be exactly that: a killer. :innocent:
    Somehow I believe more in the concept of growing slow but steady. (Even though I admit that the Parallax team might feel better with the income of the killer app :lol:)

    Therefore don't be annoyed, keep thinking :smile:
  • Heater. Posts: 21,230
    edited 2013-07-18 00:11
    Cluso,

    "Disrespectful" might be a bit of an overstatement.

    I'm just thinking about it like a mother spending all day cooking a fabulous roast beef dinner with all the trimmings for her children. When it's done and on the table the kids look to mom and say "Where are the hamburgers?".

    I can't help thinking that all those clamoring for more this and better that should come back after they have had the Prop II in their hands for a couple of years and have exhausted all its possibilities.

    Or as that mother might say to the kids, "next Sunday you can make your own f'ing dinner."
  • Heater. Posts: 21,230
    edited 2013-07-18 00:38
    jmg,
    What you describe is a luxury, and not a necessity.
    What I describe may well be a luxury, but I believe it is a very important luxury that makes the Prop easier to use for an ocean of people. And not just those new to programming. Also for those that want to quickly throw something together and not have to invest a lot of time into sorting out the conflicts I describe.
    If you really want equal COGS, you will have that in Prop 2. - but you do need to be aware that luxury is not cost free.
    That is clear enough. I believe the cost is worth it.
    History shows us Asymmetric Cores are already being used more, where the price matters. Making all cores the same, simply adds too much to the price.
    Only if by "history" you mean the past couple of years:).

    We see this in ARM processors now. The logic goes something like this:

    1) Let's have a full-up ARM core with piles of memory space that can run Linux and hence Android. Or iOS.
    2) Better have two or more of those for performance.
    3) We'd like more for real-time I/O work in other embedded apps, but a full ARM is expensive. Let's tack on a couple of lesser non-ARM cores to do that.
    4) Of course you have the GPU to deal with if you want any display output.

    Well, if you want all that just buy a Beagle Bone Black. It will cost you about the same as a Prop board and has similar amounts of IO. There is no need for you to have a P II.

    You might find though that programming all that is a bit of a handful. That's OK if you have a team of software engineers and are going to deliver millions of units of the same product. You can afford to invest the time in development.

    It's not OK for where the Prop(s) end up being used.

    More philosophically. The symmetry and purity of the Prop is a unique selling point. The only other such devices come from XMOS. Even they don't quite do it (XMOS cores can only access "their" set of pins on the device). If we give that up we might as well give up the whole idea and go the way of the modern ARMs as you suggest.

    In general there is no point in trying to compete on those terms, there are already many players there.


    P.S. By the way history shows no such thing. I'm really glad my 4 or 8 core Intel box has all cores the same. That makes it dead easy for my app to be parallelized or for the OS to share app load among cores.
  • Cluso99 Posts: 18,069
    edited 2013-07-18 02:35
    IMHO unless there is considerable space saved by removing something from a cog, there is really no point in making some cogs different. That is after all, the beauty of all the cogs being the same and able to access all the hardware exactly the same. Apart from booting, the only pin restrictions are that VGA uses a bank of 8 I/Os and TV uses a bank of 4 I/Os, although the fourth is not usually required.

    You almost need a degree to fathom which devices you can use together with some other micros as they share specific sets of pins.

    A large portion of the silicon on the P2 is devoted to each pin configuration/circuit. Some die space could probably be saved here, but then again, the pins would not be equal. The same goes for the clut on each cog. I am totally in awe of what Chip has managed to cram into the P2. I am just hoping I cannot cook my eggs on the P2 ;)
  • prof_braino Posts: 4,313
    edited 2015-07-01 22:52
    Dr_Acula wrote: »
    I wonder if some of these things can be done in software rather than hardware?

    Buy two propeller chips. Devote two pins to an interprop comms system (objects already written) and devote a small amount of hub ram for circular buffers.
    ...

    So ok, not a prop with 16 cogs, but it could be done with two props and one 64k eeprom.

    As mentioned in the first post, this is what we do, except we use cog RAM for the buffers. This gives us 7 usable cogs on the first prop, 6 usable cogs on the last prop, and 5 usable cogs on each of the ones in between. So if N is the number of prop chips (2 or more), the number of available cogs is (N-2) * 5 + 13. The delay from the first prop to the last prop is about one cycle per prop chip; otherwise, the communication is the same as being on the same chip.
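The cog count above can be checked with a few lines of Python. The per-chip figures (7 on the first Prop, 6 on the last, 5 on each one in between) come from the post; the function name and the comparison against the raw 8 cogs per chip are just for illustration.

```python
def usable_cogs(num_props: int) -> int:
    """Usable cogs in a chain of Props linked as described in the post.

    First chip: 7 usable, last chip: 6 usable, every chip in between: 5.
    This is the same as (N - 2) * 5 + 13 for N >= 2.
    """
    if num_props < 2:
        raise ValueError("the formula applies to chains of 2 or more Props")
    first, last, middle = 7, 6, 5
    return first + last + (num_props - 2) * middle


if __name__ == "__main__":
    for n in range(2, 6):
        total = usable_cogs(n)
        overhead = 8 * n - total  # cogs spent on the links
        print(f"{n} Props: {total} usable cogs, {overhead} spent on comms")
```

The link overhead grows with every chip added to the chain, which is the cost the question about a single larger chip is trying to avoid.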

    But we lose resources to the over head. This question is to see if anyone else has found this to be a problem.
    Personally I find this kind of speculation disheartening. If I were Chip I'd be rather ticked off at it.

    Don't be silly. The only stupid question is the one that isn't asked. This is free market research, and if it confirms previous research, it means you did it right. If it doesn't, it means you better start thinking about it right now. In this case, it seems to be showing that folk haven't finished growing into 8 cores, and have not yet outgrown them to the point of needing more than 8 cores.

    As far as any implementation details, these should be left up to the guy doing the work. Unless you consider yourself Chip's peer in design skill.
  • prof_braino Posts: 4,313
    edited 2013-07-18 06:47
    FYI - this thread is JUST ASKING, not demanding, if anybody else feels a need for more than 8 cores. I was not heavily involved in the prop 2 planning discussion.

    The need we feel is for MANY IDENTICAL cores (16 or 32 or more). The cores should be complete CPU cores in their own right, not trimmed down "core-ettes" as found in the Green Arrays GA144.

    And it seems to confirm that this is NOT a general need, so there is no need to get agitated. But it's still a need for me, and I will continue to be patient.
  • Dave Hein Posts: 6,347
    edited 2013-07-18 07:06
    jmg wrote: »
    Once you add the threading of Prop 2, this chestnut simply goes away.
    Heater. wrote: »
    What might be really cool is a way to make using those PII threads as easy to use as COGs are now from Spin.

    Imagine a "threadinit" function that acted just the same as coginit. Point it at some PASM, give it some PAR pointer and start up a thread.

    I'm sure the idea has issues. Like, how do you know the code you want to run as a thread will fit in the COG that is already running some threads? How does it get loaded, you would need a kernel in the COG to do the load as the hardware can't load a partial COG?
    Threading does seem like a good way to simulate the functionality of more cogs. A Spin or C interpreter cog could handle multiple threads by maintaining a set of registers for each instance. It will have to avoid self-modifying code, which I think is feasible in P2. However, the cog RAM size is going to be a limiting factor for threads. With the current cog RAM size I think threading will be limited to running multiple drivers that don't require a lot of memory or speed, such as serial drivers.
  • prof_braino Posts: 4,313
    edited 2013-07-18 07:25
    As I understand it, The 32 threads on the P2 can be treated as "execution opportunities" and divided into pools. We can set up a pool for time critical functions that need deterministic execution, and another pool for non-time critical functions. It works for simple drivers, and for the entire application. (The drivers are time critical, the overall application itself need not be).

    This is being tested on the P1, with 8 threads. So far it appears to work really well, but multitasking is orders of magnitude more difficult to debug.

    The developer does have to be aware of what's going on to use it properly.
  • ManAtWork Posts: 2,178
    edited 2013-07-18 09:08
    Is there any interest in a "more than 8 cog" chip?

    Imho, this discussion (and also the 1-cog version) is a WOMBAT (waste of money, brain and time).
    Any special, downsized version of the P1 would be more expensive than the current one (because of lower production volume) and therefore useless.
    Any special, more complex version of the P1 will be inferior to the P2 but nearly as expensive and therefore useless.
    The only improvement to the P1 that has a slight chance of making sense would be scaling down the current design unchanged to a smaller process to cut cost. I don't know if this is possible.

    "More cogs" can be solved with multiple chips. "More pins" can be handled with external shift registers (HC595-something).
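To illustrate the shift register point, here is a small behavioral model in Python of 74HC595-style chips in a daisy chain. It is a simulation sketch only: the class and function names are made up, no real pin I/O is performed, and the bit order (MSB first, QA as bit 0) is a choice made for this example. Three control lines (data, shift clock, latch) drive 8 outputs per chip.

```python
class HC595:
    """Behavioral model of one 74HC595: 8-bit shift register plus output latch."""

    def __init__(self):
        self.shift_reg = 0  # bit 0 models QA (newest bit), bit 7 models QH
        self.outputs = 0    # what the output pins show after a latch pulse

    def clock_in(self, bit: int) -> int:
        """One rising edge on the shift clock.

        Shifts `bit` in and returns the bit that was on the serial output
        (QH') before the edge, which is what the next chip samples.
        """
        carry = (self.shift_reg >> 7) & 1
        self.shift_reg = ((self.shift_reg << 1) | (bit & 1)) & 0xFF
        return carry

    def latch(self) -> None:
        """One pulse on the latch clock: copy the shift register to the outputs."""
        self.outputs = self.shift_reg


def write_chain(chain, values):
    """Shift one byte per chip into the chain, then latch all outputs at once.

    values[i] ends up on chain[i]; chain[0] is the chip wired to the data pin.
    """
    # The byte for the far end of the chain has to be sent first.
    for value in reversed(values):
        for bit_index in range(7, -1, -1):  # MSB first
            bit = (value >> bit_index) & 1
            for chip in chain:              # all chips share the shift clock
                bit = chip.clock_in(bit)
    for chip in chain:
        chip.latch()


if __name__ == "__main__":
    chain = [HC595(), HC595()]
    write_chain(chain, [0xA5, 0x3C])
    print([hex(chip.outputs) for chip in chain])
```

The trade-off is speed: every update costs 8 clock pulses per chip in the chain, so this suits slow outputs better than tightly timed ones.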

    So I think Parallax's decision against many derivatives is right! All resources for further development should be concentrated on bringing out the long awaited P2. Basta.
  • Heater. Posts: 21,230
    edited 2013-07-19 02:48
    Braino,

    Stupid questions are like when your wife comes home with a new blue dress and you ask "Why didn't you buy the red one?"

    Which is kind of what we have here.

    After years of consultation, and tremendous effort by Chip and crew, we are about to get the Prop II.

    I don't want to be looking to the Prop III, IV, V... too hard before we have even celebrated the arrival of P II, let alone worked with it.
  • Dr_Acula Posts: 5,484
    edited 2013-07-19 05:27
    Re Prof Braino
    As mentioned in the first post, this is what we do, except we use cog RAM for the buffers. This gives us 7 usable cogs on the first prop, 6 usable cogs on the last prop, and 5 usable cogs on each of the ones in between. So if N is the number of prop chips (2 or more), the number of available cogs is (N-2) * 5 + 13. The delay from the first prop to the last prop is about one cycle per prop chip; otherwise, the communication is the same as being on the same chip.

    But we lose resources to the over head. This question is to see if anyone else has found this to be a problem.

    Good points. Sorry your post has gone off topic a bit. Debating the merits/new design features/existence etc of the P2 is easy. Actually solving problems is hard. Let's talk about hard problems.

    Firstly, interesting you are using cog ram as the buffer rather than hub ram. What are the merits of each system? I guess it depends where the data goes. If most of the data is going through a propeller and on to another one, then it can stay in the cog. If it is being passed to other cogs, presumably it has to go to the hub so other cogs can access it. That kind of implies data packets, with a header attached which contains the destination, and some code to work out where the packet should go.

    You Forth boffins are probably ahead of the game here.

    Ok, data comms systems. For years it was a 'common bus', eg 50 ohm cable around the place running ethernet. Then everyone seemed to go to a 'star' system with cat5 cable. So for a 'star' system, your master propeller will have maybe one cog and two pins per connection. You could maybe have a cog multitasking but it would slow down the comms. So the star maybe isn't the best as it runs out of cogs too quickly.

    I think your solution
    the number of available cogs is (N-2) *5 + 13
    may indeed be close to optimum.

    Perhaps there is something to learn from the biology of brains, which seem to have mostly local connections, and then a lesser number of longer distance connections. The brain does not appear to be one general purpose neural network, but instead, functions are divided into fairly distinct anatomical areas. When someone has a stroke, it becomes more obvious how this works, eg a patient I have in her late '30s who has had a stroke that has damaged the expressive part of speech, so she cannot say any words, but she can still communicate with the aid of customised software on an ipad. The speed with which she communicates makes it quite clear she is still able to 'think' words, they just don't get translated into speech so the data has to go through another path.

    I wonder if one could think about local networks of propellers and whether that leads to a hybrid star/single bus model? Take a cluster, say, of 8 propellers, and build the most efficient comms system in terms of cogs and pins. Maybe this is a common bus with 2 pins per propeller and one cog per propeller? Use data packets and label each one as local or external to the group. For external messages, one propeller in the group is devoted to handling external messages, perhaps on another common bus. So you have clusters of 8 propellers and then clusters of clusters?

    Could this change your equation
    the number of available cogs is (N-2) *5 + 13
    so the 5 becomes 6, and hence frees up another cog?
  • prof_braino Posts: 4,313
    edited 2013-07-19 06:45
    Dr_Acula wrote: »
    .. using cog ram as the buffer rather than hub ram. What are the merits....?
    Cog is smaller, faster. Don't need to use hub for this.
    ... where the data goes. If most of the data is going through a propeller and on to another one, then it can stay in the cog.

    The sending cog sends preprocessed -results-. The receiving cog processes those results and produces other results. So we don't send raw data, have less to buffer, and run faster, whenever possible, which is most of the time if it's done properly. (If it isn't, then we should think a little more.)
    You forth boffins are probably ahead of the game here.

    Just Sal, not me. I only talk.
    So the star maybe isn't the best as it runs out of cogs too quickly. ... mostly local connections, and then a lesser number of longer distance connections.

    So the limit is the number that can hang off one prop master, which is 7 slaves.
    clusters of 8 propellers and then clusters of clusters?

    Could do, but then the cost of support overhead goes up: the extra code needed to set up and organize all the props leaves less room for the code for each cog's activity. Each cog could run an individual script from SD, for example, but it's a whole series of trade-offs.