Max MIPs for propeller processor — Parallax Forums

webmasterpdx Posts: 39
edited 2009-09-01 15:44 in Propeller 1
Assuming an optimized application that runs with 8 processors in parallel without any waiting for the hub, what is the maximum processing speed in MIPs?

Is it one clock per instruction???

If so, that's 640 MIPs for $8 !!!

I'm assuming that the app has gone through its initialization and is running all out, again optimized without any waits for the hub.

Anyone got that number????

Thanks
-Donald

Comments

  • shanghai_fool Posts: 149
    edited 2009-08-31 07:06
    We wish. No, that's 4 clocks per instruction (except hub memory) for 20Mips/cog or 160 Mips total.
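
Editor's note: the arithmetic behind those figures can be sketched as below. This is only a back-of-the-envelope calculation, assuming the common 5 MHz crystal with the 16x PLL (an 80 MHz system clock), 4 clocks per non-hub instruction, and 8 cogs that never touch the hub. The function name `peak_mips` is made up for illustration.

```python
def peak_mips(crystal_hz=5_000_000, pll_multiplier=16,
              clocks_per_instruction=4, cogs=8):
    """Return (MIPS per cog, MIPS for all cogs) for non-hub code.

    Assumed defaults: 5 MHz crystal, 16x PLL, 4 clocks per instruction.
    """
    system_clock_hz = crystal_hz * pll_multiplier
    per_cog = system_clock_hz / clocks_per_instruction / 1e6
    return per_cog, per_cog * cogs


per_cog, total = peak_mips()
print(f"{per_cog:.0f} MIPS per cog, {total:.0f} MIPS total")
```

These are theoretical ceilings; any hub access or waiting lowers them.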
  • webmasterpdx Posts: 39
    edited 2009-08-31 07:45
    Thank you for the prompt response.

    That's still not bad... 160 MIPs is 5 times the fastest desktop PC less than 20 years ago.
    160MIPs was considered in the supercomputer domain 30-40 years ago.

    Compare that to the fastest 8-bit PIC, which is 16 MIPS at $4. For $4-6 you can get a high end dsPIC33, which is a 32 bit processor that runs at 40 MIPs.

    As long as the application can be implemented in the propeller, it has a lot to offer at those speeds.

    I do have one or two more questions please....

    1. Does the processor have hardware multiply or divide? I noticed a 16-bit multiply in the datasheet, but I'm unsure if that's SPIN or assembler.

    2. How wide are the processor's internal buses, i.e. are the instructions 32-bit or what?

    Thanks very much
    -Donald
  • heater Posts: 3,370
    edited 2009-08-31 07:55
    Prop COGs are 32 bit processors. The HUB is 32 bit RAM. It can be accessed from a COG as 16 or 8 bits though, with the right alignment.

    Sadly there are no multiply or divide instructions. BUT there are some damn fast 32, 16 and 8 bit multiplies that have been coded up on this forum.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • heater Posts: 3,370
    edited 2009-08-31 07:57
    Like so: http://forums.parallax.com/showthread.php?p=828096

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
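
Editor's note: since there is no hardware multiply, a software multiply is built from shifts and adds. The sketch below is a generic shift-and-add 16x16 multiply written in Python to show the idea; it is not the code from the linked thread, and the name `mul16` is hypothetical. A PASM version would follow the same loop shape using shift and conditional add instructions.

```python
MASK32 = 0xFFFFFFFF  # results are kept to 32 bits, like a cog register


def mul16(a, b):
    """Multiply two unsigned 16-bit values using only shifts and adds."""
    a &= 0xFFFF
    b &= 0xFFFF
    product = 0
    for _ in range(16):
        if b & 1:                      # lowest multiplier bit set?
            product = (product + a) & MASK32
        a = (a << 1) & MASK32          # next bit is worth twice as much
        b >>= 1                        # move on to the next multiplier bit
    return product


print(mul16(1234, 567))
```

Sixteen passes through a loop of a few instructions is why a software multiply costs tens of instruction times rather than one.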
  • Nick Mueller Posts: 815
    edited 2009-08-31 08:10
    No, there is no hardware MUL or DIV (it's just reserved, not implemented).

    The bus is 32 bits wide, instructions are 32 bits. This also is true for accessing HUB-RAM.

    Maybe you didn't get the HUB-RAM vs. COG-RAM thing absolutely right. So I repeat in other words:
    The HUB-RAM is common RAM that all COGs can access. To avoid concurrent accesses, a "rotating propeller" (that's where the name 'Propeller' comes from, I guess) grants access to each COG in a time-sliced manner.
    Each COG has its own limited RAM where every access is at full speed without any waits. So you could copy some data from HUB-RAM to COG-RAM if you need to do intensive computation on that. As long as that copied section fits into COG-RAM together with code, of course.
    ASM for the Prop tends to be very compact. So if 500 words sounds incredibly small, it is only very small. ;)
    Don't forget, the SPIN-interpreter fits into one single COG. And again, the SPIN code to be interpreted is outside of the COG in HUB-RAM.
    For C (ICCV7Prop), it is a bit different. There is an interpreter in a COG that is loading sections of ASM-code from HUB-RAM to COG-RAM and this is executed there. So C is actually in PASM (Propeller-ASM) that is paged in to a COG. Execution is about 7 times faster (I had the same experience) but the code is about 5 times bigger compared to SPIN. Again, SPIN is interpreted, think of a P-machine and C is ASM that is paged in.
    You might do a search for "LMM" (Large Memory Model) here in the forum. There's also an XMM that reads from (EE)PROM with paging to HUB-RAM. There's also an assembler (not from Parallax) that supports LMM (& XMM?) programming. But I have no experience with it; others will have to jump in if you have questions about that.


    Hope that helped to clarify some things that are a bit confusing / unclear at the beginning.


    Nick

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Never use force, just go for a bigger hammer!

    The DIY Digital-Readout for mills, lathes etc.:
    YADRO
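
Editor's note: the time-sliced hub access described above can be modelled roughly as follows. This is a simplified sketch, assuming the hub visits the 8 cogs round-robin once every 16 clocks and that a hub instruction costs 8 clocks once its window arrives; the constants and the exact window alignment are assumptions for illustration, not taken from this thread.

```python
HUB_PERIOD = 16       # assumed: clocks for the hub to visit all 8 cogs
CLOCKS_PER_SLOT = 2   # assumed: a new cog's window opens every 2 clocks
MIN_HUB_COST = 8      # assumed: best-case cost of one hub instruction


def hub_instruction_cost(cog_id, now):
    """Clocks a hub instruction takes if it starts at system clock `now`."""
    window_start = (cog_id * CLOCKS_PER_SLOT) % HUB_PERIOD
    wait = (window_start - now) % HUB_PERIOD   # clocks until our window
    return MIN_HUB_COST + wait


costs = [hub_instruction_cost(0, t) for t in range(HUB_PERIOD)]
print(min(costs), max(costs))
```

Under these assumptions a hub access costs anywhere from 8 to 23 clocks depending on where the "propeller" is when the cog asks, which is why tight loops are usually arranged to hit the same window every time.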
  • heater Posts: 3,370
    edited 2009-08-31 08:14
    You said you are interested in an audio application of the Prop.

    Which reminded me that I have long ago wondered if I could implement my little audio project on the Prop.
    Last time I thought about it I did not know much about the Prop so it went out of my mind.

    Basically I have an algorithm coded in C that performs as a 3 way digital crossover for use in driving multi element loud speakers and sub-woofers.

    For some years now a friend of mine has been using this crossover algorithm in his PC sound set up and he loves it.

    Now, apart from wondering if the Prop has the horsepower to run the algorithm at a 40-odd kHz sample rate, the hard part for me is the hardware: getting high quality 16 bit sound in and out of the thing. Ideally a 16 bit A/D in and two or three 16 bit D/As out.

    I'm beginning to think the horsepower is there.

    A Prop crossover in a small box with VGA interface for crossover point setting would be so cool.

    Any ideas for A/D, D/A anyone?

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.

    Post Edited (heater) : 8/31/2009 9:04:19 AM GMT
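
Editor's note: one quick way to judge whether "the horsepower is there" is to count how many instructions a single cog can execute per audio sample. The sketch below assumes an 80 MHz clock, 4 clocks per instruction, and no hub waits; the 44.1 kHz and 48 kHz rates are example values standing in for "40-odd kHz".

```python
def instructions_per_sample(sample_rate_hz, clock_hz=80_000_000,
                            clocks_per_instruction=4):
    """Whole non-hub instructions one cog can run between two samples."""
    return clock_hz // (clocks_per_instruction * sample_rate_hz)


for rate in (44_100, 48_000):
    print(rate, instructions_per_sample(rate))
```

That is a budget of a few hundred instructions per sample per cog, and a software multiply inside a filter will use up a sizeable share of it, so splitting the crossover bands across several cogs is one obvious option.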
  • Toby Seckshund Posts: 2,027
    edited 2009-08-31 09:14
    Heater

    I would be interested in the Xover project too. Years of listening to £1000+ speakers leaves you very disappointed with 99.99% of commercial speakers. I have not made any recently, as just as they were getting to sound reasonable SHE objected to size and ugliness (woodwork, I think) and so a compromise was struck and they still need tweaking. A configurable Xover may be the answer, even for old ears with tinnitus.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Style and grace : Nil point
  • Leon Posts: 7,620
    edited 2009-08-31 10:35
    Donald:

    The dsPIC is actually a 16-bit processor with a DSP engine. The PIC32 is 32-bit, and runs at 80 MIPS.

    Leon

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Amateur radio callsign: G1HSM
    Suzuki SV1000S motorcycle
  • Agent420 Posts: 439
    edited 2009-08-31 10:59
    Regarding the original post, I think the programming language used should be cited when discussing any kind of MIPS from an application performance perspective. It is possible to achieve very high MIPS using the native pasm and cog ram. As you increase the number of external memory access commands from the cog (which may be needed for cog-to-cog collaboration or memory storage), you begin to cut into the throughput because hub commands require at least twice as many clock cycles as non-hub commands (plus potential cycles wasted for synching the hub). I'm guessing a well written program would offset some of that, though I think designing and coding an efficient collaborative parallel processing app can be more complex than standard linear coding.

    As Spin is both interpreted and makes heavy use of external memory access, it is much slower than pasm; I believe I saw references claiming that Spin is approximately 40x slower than assembly.

    In short, like any hardware, there is a difference between native mips performance and actual throughput which is often determined by coding efficiency.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
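
Editor's note: the throughput cost of hub instructions can be estimated with a weighted average. This sketch assumes 4 clocks for a normal instruction and an illustrative average of 16 clocks per hub instruction; the real average depends on how well the code is synchronised to the hub, so treat `avg_hub_clocks` as a guess to be replaced with a measured value.

```python
def effective_mips(hub_fraction, avg_hub_clocks=16.0,
                   clock_hz=80_000_000, regular_clocks=4):
    """Estimated MIPS for one cog when `hub_fraction` of its
    instructions are hub accesses (0.0 = none, 1.0 = all)."""
    avg_clocks = ((1 - hub_fraction) * regular_clocks
                  + hub_fraction * avg_hub_clocks)
    return clock_hz / avg_clocks / 1e6


for fraction in (0.0, 0.1, 0.25):
    print(f"{fraction:.0%} hub instructions: {effective_mips(fraction):.1f} MIPS")
```

With these assumed numbers, one hub access in every four instructions already drops a cog from 20 MIPS to a little over 11.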
  • heater Posts: 3,370
    edited 2009-08-31 11:01
    Toby do remind me one day. I'd like to start a thread for it but I have a lot going on now.

    Need some suggestions/advice about selection of audio quality A/D and D/A that can
    a) Be bolted to the Prop easily.
    b) Are easy/fast for the Prop to drive.
    c) Don't cost arms and legs.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • Agent420 Posts: 439
    edited 2009-08-31 11:11
    heater said...
    Toby do remind me one day. I'd like to start a thread for it but I have a lot going on now.

    Need some suggestions/advice about selection of audio quality A/D and D/A that can
    a) Be bolted to the Prop easily.
    b) Are easy/fast for the Prop to drive.
    c) Don't cost arms and legs.

    I wonder if an audio codec chip would be useful for that... they seem to provide both adc/dac and interface via either spdif or i2s...

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
  • Leon Posts: 7,620
    edited 2009-08-31 11:28
    The TLV320AIC23B is a nice codec, and has an integrated headphone amp, an electret supply, and pre-amp. It uses an I2S interface for audio and SPI for control. I'm working on a PCB for one that is intended for interfacing to one of the XMOS kits via a 2x8 header and ribbon cable. It could also be used with a Propeller proto board.

    Leon

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Amateur radio callsign: G1HSM
    Suzuki SV1000S motorcycle

    Post Edited (Leon) : 8/31/2009 12:13:07 PM GMT
  • Humanoido Posts: 5,770
    edited 2009-08-31 17:17
    For supercomputers there are actually two quotes. One is the theoretical maximum attainable speed with the given hardware and the other is the actual bench-marked speed. The latter is language dependent. So Propeller MIPS can have two quotes, based on the app.

    humanoido
  • NetHog Posts: 104
    edited 2009-08-31 18:11
    humanoido said...
    For supercomputers there are actually two quotes. One is the theoretical maximum attainable speed with the given hardware and the other is the actual bench-marked speed. The latter is language dependent. So Propeller MIPS can have two quotes, based on the app.

    humanoido
    The other thing to note is that MIPS is not a very useful measurement.

    Millions of instructions per second... but what is an instruction?
    For an X86, a single instruction can do some really complex stuff.

    When I was studying super-computers (many years ago now) the preferred measurement was MFLOPS. But even that is not very useful.

    Application-specific benchmarks are better.
  • pgbpsu Posts: 460
    edited 2009-08-31 18:11
    @heater

    You might consider the AD1871 ADC. You get 2 channels ADC on one chip.

    www.analog.com/en/analog-to-digital-converters/audio-ad-converters/ad1871/products/product.html

    It needs an external clock but you might be able to drive it from the prop counters. I can share code with you for reading from it in its default mode written in PASM. Cost is ~$5 in quantity one (which gets you 2 channels of 24-bit, if you believe the specs).

    PM me if you are interested.
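
Editor's note: driving a converter clock from a Prop counter usually means NCO mode, where the counter adds FRQx to PHSx every system clock and the pin follows the top bit of PHSx, giving an output of FRQx * clkfreq / 2^32. The sketch below only computes the FRQx value. The 12.288 MHz target is a hypothetical example master clock, not a figure checked against the AD1871 datasheet, and an NCO output at a non-power-of-two ratio carries jitter that may or may not be acceptable for audio.

```python
def nco_frq(target_hz, clock_hz=80_000_000):
    """FRQx value for an NCO-mode counter to output about target_hz."""
    return round(target_hz * 2**32 / clock_hz)


def nco_actual_hz(frq, clock_hz=80_000_000):
    """Average output frequency actually produced by a given FRQx."""
    return frq * clock_hz / 2**32


frq = nco_frq(12_288_000)          # hypothetical master clock target
print(frq, nco_actual_hz(frq))
```

If the jitter turns out to matter, the counter's PLL modes or an external oscillator are the usual alternatives.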
  • Leon Posts: 7,620
    edited 2009-08-31 18:26
    DMIPS - Dhrystone MIPS - are popular, and a lot more meaningful than old-fashioned MIPS as they involve an actual benchmark program.

    Leon

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Amateur radio callsign: G1HSM
    Suzuki SV1000S motorcycle
  • HollyMinkowski Posts: 1,398
    edited 2009-08-31 18:35
    Agent420 said...
    As Spin is both interpreted and makes heavy use of external memory access, it is much slower than pasm; I believe I saw references claiming that Spin is approximately 40x slower than assembly.

    Coders do not live by mips alone!
    The prop is more powerful than the mips rating would suggest.


    The relative slowness of interpreted Spin (relative to pasm) made me wince a bit at first.

    But then I realized that with 8 cogs I could use one to run critical timing code in pasm
    and others to run parts of the program that did not really need speed, in Spin. Every
    program has code that would be running fast enough if it was coded in PICBASIC for an old 8bit PIC.
    So what if a cog or two are not running the code at max speed in pasm but instead in slow Spin?
    It just doesn't make any real difference....well, actually it does, you save a lot of time by
    using Spin when it suffices. With 8 cogs it's wasteful to spend any more programming time
    than is absolutely necessary writing asm code...wasting time wastes $ and $ is almost
    always the most important consideration. Spin is inside the beast to allow us to take "the easy
    way out" and still get the job done.

    When you program a single processor, even if it is a fast ARM, always foremost is
    the reality that any part of the program that is not optimized is slowing down the whole
    thing and could end up making a mess of critical timing....so you churn out the tightest
    C and asm you can manage. The code inside those interrupt routines has got to
    be made faster, faster and still faster or the whole thing will go down in flames.

    I do admit that at times I still cringe at the thought of that interpreter running my code.
    I just can't entirely shake the feeling that I have cheated a bit. Multiple processors
    have done this to me! ;) Chip is an evil genius, he knew the prop would bring us to
    this disgraceful condition :)


    When you are in big doodoo is when your application is so close to the limits of the prop
    chip that you need to code everything in pasm and tweak, tweak, tweak to fit it all in
    and make it fast enough. I have not come anywhere close to that yet...maybe the prop2 will
    come along before I have to face it.

    If someone told me a year ago I'd be writing code that would run on an interpreter I'd have
    called them crazy... On the AVR and ARM even compiled C was a compromise, asm was how the real coders
    created magic. But plodding along in Spin on a cog in a prop chip is just fine....you have more cogs!

    Way back in HS we had a science fair, I was programming PICs (insert scream here) and created a project
    that could unlock a door by tapping the metal doorknob with your finger in a coded pattern.
    I simulated a door lock by using a homemade solenoid to move a large nail in and out of
    a hole in a tiny homemade wooden door. (it was embarrassingly crude and noisy) I coded it
    in PIC asm and was very proud of it :)

    Someone else had a project that used what I now recognize as a basic stamp. I inquired
    about it and poked a bit of fun at them because it used basic and was interpreted and SLOW.
    But that project beat out mine at the end of the day. The little cpu was fast enough to do
    the job and it was so easy to use anyone could work with it.... The extra points I felt
    were due me because I slaved over that nasty PIC asm were not at all obvious to the 2 teachers
    that judged the fair. I shoulda used a stamp and spent the extra time working on that noisy
    solenoid and ugly little door.

    About the mips on the prop, remember that you go from 160 to 200 MIPS theoretical maximum by just using
    one of those 6.25 MHz xtals in place of the regular xtal. That extra 40 MIPS might save you having to write some
    pasm someday :)

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    - Some mornings I wake up cranky.....but usually I just let him sleep -
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2009-08-31 18:42
    NetHog said...
    The other thing to note is that MIPS is not a very useful measurement.
    Very true. Some particular Prop advantages worth noting:

    1. It has a two-register instruction architecture. This means any memory location can be a data source, and any location can be a destination. There's no accumulator bottleneck, and operations are 32 bits wide.

    2. Any instruction can be executed conditionally. You don't have to jump around instructions (or small groups of them) that don't need to be executed.

    3. Any instruction for which zero and carry results have meaning can selectively write the zero and carry flags, or not.

    4. Any non-hub instruction can selectively write its destination register, or not.

    5. There's no pipeline penalty for jumps (only for conditional jumps that aren't taken).

    6. Shift instructions use barrel shifters. This means any size shift can be accomplished in one instruction cycle.

    7. The mux instructions make quick work of bit setting and clearing.

    8. Several multiple-effect instructions (e.g. cmpsub, addabs, etc.) optimize common operations that would take more instructions on another processor.

    9. Timing granularity for the wait instructions is one clock, not one instruction cycle.

    In many ways, the Propeller instruction set resembles microcode. There's a lot that gets done, selectively and/or conditionally, during each instruction cycle.

    -Phil

    Post Edited (Phil Pilgrim (PhiPi)) : 8/31/2009 6:48:40 PM GMT
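
Editor's note: point 8 is easiest to appreciate with an example. The sketch below models `cmpsub` (compare, and subtract only if the result would not go negative, reporting the outcome in carry) and a rotate-carry-left in Python, then uses them for a 16-step restoring division of the kind commonly written in PASM. The helper names and the 16-bit input limits are assumptions of this sketch, not a transcription of any particular forum routine.

```python
MASK32 = 0xFFFFFFFF


def cmpsub(d, s):
    """Model of cmpsub: if d >= s, return (d - s, carry=1); else (d, 0)."""
    if d >= s:
        return d - s, 1
    return d, 0


def rcl1(d, carry):
    """Model of rotating carry into bit 0 while shifting left by one."""
    return ((d << 1) | carry) & MASK32


def divide16(x, y):
    """Return (quotient, remainder) of x / y for 16-bit x and nonzero y."""
    if not (0 <= x < 1 << 16 and 0 < y < 1 << 16):
        raise ValueError("this sketch only handles 16-bit operands")
    y <<= 15                      # line the divisor up with quotient bit 15
    for _ in range(16):
        x, c = cmpsub(x, y)       # one instruction: compare + subtract
        x = rcl1(x, c)            # shift the quotient bit in from the right
    return x & 0xFFFF, x >> 16    # quotient low half, remainder high half


print(divide16(100, 7))
```

On a processor without a compare-and-subtract instruction, each pass would need a compare, a conditional branch and a subtract, so the loop body here is roughly half the size.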
  • Toby Seckshund Posts: 2,027
    edited 2009-08-31 18:50
    Holly I think that was an early teaching of "Sod how clever it is, how shiney is the facia, and how many preety blue LEDs are there!"

    (And no I do not know how to spell)

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Style and grace : Nil point
  • HollyMinkowski Posts: 1,398
    edited 2009-08-31 18:55
    Good points Phil Pilgrim :)
    I'm going to copy those points and stick the txt file in my prop directory....
    I squirrel away everything I find useful about the prop....that way I don't have
    to come here and search for it later on. It's helpful to do when you are a noob like me.

    PhilPilgrimsPropPoints.txt :)

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    - Some mornings I wake up cranky.....but usually I just let him sleep -
  • HollyMinkowski Posts: 1,398
    edited 2009-08-31 19:03
    Toby said...
    how many preety blue LEDs are there!

    I actually hate blue LEDs....they seem too bright and harsh.
    I understand they were once quite rare? Now they seem to be overused.

    My fav LEDs are RGB so I can select from a large range of colors.

    Green I like best of standard colors.

    1 Green
    2 Yellow
    3 Orange
    4 red
    5 blue

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    - Some mornings I wake up cranky.....but usually I just let him sleep -
  • NetHog Posts: 104
    edited 2009-08-31 19:07
    Agreed Phil/etal.

    As has been pointed out elsewhere, the jobs that we ask a microcontroller to do, are usually jobs that it doesn't make sense to throw a more complex CPU at.

    e.g.
    Price
    Simplicity
    "Everything embedded" (give or take an external flash)
    I/O functionality (pin stickiness, ability to do ADC, DAC, signal edge detection, special functions, etc.)
    Instruction predictability (watch the Prop output to VGA and tell me this isn't important :) )

    As I check my 'bit' drawer, I have lots of different types of microcontrollers I constantly use. I have a few old generation X86 chips I've scavenged, but I doubt I'll ever use them. Now that said, I would not use a microcontroller to build a compute intensive supercomputer.
  • NetHog Posts: 104
    edited 2009-08-31 19:09
    HollyMinkowski said...
    Toby said...
    how many preety blue LEDs are there!

    I actually hate blue LEDs....they seem too bright and harsh.
    I understand they were once quite rare? Now they seem to be overused.

    My fav LEDs are RGB so I can select from a large range of colors.

    Green I like best of standard colors.

    1 Green
    2 Yellow
    3 Orange
    4 red
    5 blue

    In the early days of LEDs... I remember going "ooooo" when I learned about them after so many projects using regular bulbs... there was a lot of talk about the need to try and create a blue one. We had the option of Red, Green, or something between :)
  • Agent420 Posts: 439
    edited 2009-08-31 19:17
    HollyMinkowski said...
    Toby said...
    how many preety blue LEDs are there!

    I actually hate blue LEDs....they seem too bright and harsh.
    I understand they were once quite rare? Now they seem to be overused.
    watch out! Soon you'll become a curmudgeon like myself and revert back to warm, organic incandescent and neon ;-)

    frontui5.jpg

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔


    Post Edited (Agent420) : 8/31/2009 7:22:19 PM GMT
  • HollyMinkowski Posts: 1,398
    edited 2009-08-31 19:18
    NetHog said...
    I would not use a microcontroller to build a compute intensive supercomputer.


    But there are times you want to run a very low power controller just to watch and wait for something to
    happen, you can have it sleeping most of the time and wake it several times a second to see if it has happened.
    And sometimes once that something does happen then the lowly controller is not capable of dealing with it
    at all so you have it wake up a multi Bips machine to do the work, store the data and then turn back off
    and hand the lookout chore back to the controller. Both are needed sometimes...the little $2 controller
    keeps the big number cruncher from staying on all the time and killing the battery....he's the tireless little
    lookout that makes it all possible sometimes.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    - Some mornings I wake up cranky.....but usually I just let him sleep -
  • CounterRotatingProps Posts: 1,132
    edited 2009-08-31 20:57
    > revert back to warm, organic incandescent and neon ;)

    @Agent

    Nice clock!

    Only thing it seems to be missing is a big red lever on the side :))

    - H

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
  • NetHog Posts: 104
    edited 2009-08-31 21:05
    HollyMinkowski said...
    NetHog said...
    I would not use a microcontroller to build a compute intensive supercomputer.


    But there are times you want to run a very low power controller just to watch and wait for something to
    happen, you can have it sleeping most of the time and wake it several times a second to see if it has happened.
    And sometimes once that something does happen then the lowly controller is not capable of dealing with it
    at all so you have it wake up a multi Bips machine to do the work, store the data and then turn back off
    and hand the lookout chore back to the controller. Both are needed sometimes...the little $2 controller
    keeps the big number cruncher from staying on all the time and killing the battery....he's the tireless little
    lookout that makes it all possible sometimes.

    Precisely
    This happens in a PC today, known as "wake on hardware", e.g. when a network packet arrives. A low power device (microcontroller, PLA, ASIC, etc.) identifies an external event and wakes up the power hungry PC in response.

    I have a general saying when mentoring "right tool for the right job". Goes for both hardware and software.
  • jazzed Posts: 11,803
    edited 2009-08-31 23:24
    Agent420 said...

    frontui5.jpg

    Woohoo!!! Time Machine! Can you set that to 1977 and Press Enter?

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    --Steve

    Propeller Tools
  • hover1 Posts: 1,929
    edited 2009-08-31 23:26
    I'd like to see a big old DPST knife switch on the side. :P
    CounterRotatingProps said...
    > revert back to warm, organic incandescent and neon ;)

    @Agent

    Nice clock!

    Only thing it seems to be missing is a big red lever on the side :))

    - H

  • heater Posts: 3,370
    edited 2009-09-01 02:53
    Agent420: That's a great clock.

    I built a Nixie clock in about 1973. All of a sudden Nixie tubes are as rare as hen's teeth.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.