spin code incredible slow — Parallax Forums

Chris Micro Posts: 160
edited 2009-04-17 11:03 in Propeller 1
Hello together,

I tried to make a little waveform generator in Spin, but it seems to be incredibly slow.
I assume that the loop can only run at approximately 1 kHz. Is that possible?

CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000
  OneSecond = 80_000_000        'SysClk: Number of clk cycles in one second
  FSample = 2_000              ' sampling frequency

  LeftOutputPin = 10   ' PWM Pin for left analog signal
  RightOutputPin = 11  ' PWM Pin for right analog signal

VAR
  LONG Phase

OBJ
  DAC : "DualDac"

PUB MAIN
  Dac.Start(LeftOutputPin,RightOutputPin)
  Triangle(200,5000)
   

PUB Triangle(pitch,duration)|freqInc,temp,temp2
'' duration in ms
  freqInc:=65536/FSample*pitch
  temp:=duration*FSample/1000
  temp2:=OneSecond/FSample
  repeat temp
    phase+=freqInc
    phase&=$FFFF
    Dac.Write(phase,phase)
    waitcnt(temp2+cnt) 

Comments

  • Ale Posts: 2,363
    edited 2009-04-16 18:06
    Well you should count that every spin operation takes some 80 clocks or more. Adding a number to a variable takes 3 or 4 operations. Compact but not fast. Use asm where possible. I can give you a short course sometime next week if you want, just show up at the uni.
  • Chris Micro Posts: 160
    edited 2009-04-16 18:33
    Hi Ale,

    yes, I know, assembler is very fast. But I hoped that the Propeller has so much "horsepower" that even in Spin it should be possible to write a waveform-generating routine.
    Somebody said...
    Well you should count that every spin operation takes some 80 clocks or more.
    Is that 80 cycles at 80 MHz? Then a Spin operation should take 1 us?

    I will try a very "coarse" estimate. My loop looks like this:
    repeat temp                  ' 4 us
        phase+=freqInc           '--> 3 us ?
        phase&=$FFFF             '--> 1 us
        Dac.Write(phase,phase)   '--> let's give it 10 us
        waitcnt(temp2+cnt)       '--> 3 us

    ' total time 21 us

    In this case the estimate would be 21 us, i.e. roughly 50 kHz. But according to my measurements with the oscilloscope, the loop runs at under 2 kHz. That would mean the cycle time is more than 10 times slower.
    Probably the problem is in the Dac.Write method. It would be good to have a Spin code debugger.
  • mpark Posts: 1,305
    edited 2009-04-16 18:43
    How did you determine your assumed 1kHz speed?

    Edit: never mind, you posted before I did.

    temp2 is 40000, so that waitcnt takes ~500us, not 3.

    Post Edited (mpark) : 4/16/2009 6:49:17 PM GMT
  • Mike Huselton Posts: 746
    edited 2009-04-16 18:53
    Ale, what is uni?

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    JMH
  • Chris Micro Posts: 160
    edited 2009-04-16 18:53
    Somebody said...
    temp2 is 40000, so that waitcnt takes ~500us, not 3.

    Sorry, well, yes, temp2 depends on the sample frequency:
    FSample = 2_000              ' sampling frequency

    This leads to the 500 us delay. The time I estimated for the delay was the fastest possible time.
    But the fact is, as soon as I increase the FSample parameter (let's say to 20 kHz), the loop is much slower than the requested period.
  • mpark Posts: 1,305
    edited 2009-04-16 19:14
    Try making the following changes (marked with ****). You should be able to increase FSample to 15000.

    PUB Triangle(pitch,duration)|freqInc,temp,temp2,t ****
    '' duration in ms
      freqInc:=65536/FSample*pitch
      temp:=duration*FSample/1000
      temp2:=OneSecond/FSample
      t := cnt        ****
      repeat temp
        phase+=freqInc
        phase&=$FFFF
        Dac.Write(phase,phase)
        waitcnt(t+=temp2)         ****
    
    
  • MagIO2 Posts: 2,243
    edited 2009-04-16 19:17
    That's because the runtime of the code adds to the wait time. When your frequency is low, you don't notice that, but the higher the frequency, the larger the error.
    You should add temp2 to cnt before you execute the rest of the code. But then you have to be sure that your code is done before cnt reaches the calculated value, otherwise you'll wait for ~50 s.

    In Dac.write(phase, phase) you use the same parameter twice. So, you should modify write to only have one parameter. Passing parameters needs time.

    Dac.write can probably be run in an extra COG and maybe in PASM. That would speed things up as well.
  • Chris Micro Posts: 160
    edited 2009-04-16 19:32
    @mpark
    Thank you for this, I will try it.
    MagIO2 said...

    In Dac.write(phase, phase) you use the same parameter twice. So, you should modify write to only have one parameter. Passing parameters needs time.
    Dac.write is running in its own core.
    PUB Start(LeftDacPin,RightDacPin)
    ''  Start: Initialize dual dac outputs     
      cognew(Init(LeftDacPin,RightDacPin),@Stack)
    
    PUB Write(LeftValue,RightValue)
    {{  write analog value
        input range: 0..$FFFF
        }}

    What I'm not sure about is whether calling this method takes much more time than I estimated.
    Or is there a better way of passing variables to an object than implementing a dedicated method?
  • MagIO2 Posts: 2,243
    edited 2009-04-16 19:49
    Why don't you simply test the runtime? Get the cnt before an instruction and get the cnt after the instruction and see how long it took - maybe simply in a little test program.

    Instead of using variables in the DAC object you can use variables in the code using the object and pass the address of this variable to the Init. Then your code does not have to call the write function, which simply passes the LeftValue and RightValue to the object's internal variables. Instead you can write the values directly to the variables you know. So, one call less, with all its parameter-passing overhead.
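
    A minimal sketch of such a timing test. It assumes the FullDuplexSerial object from the Propeller Tool library and a terminal on the programming port (P31/P30) at 115200 baud; the value 6400 is only a placeholder. It reads cnt before and after one statement and subtracts the cost of the measurement itself:

    ```spin
    CON
      _clkmode = xtal1 + pll16x
      _xinfreq = 5_000_000

    OBJ
      ser : "FullDuplexSerial"          ' assumed: standard Propeller Tool library object

    VAR
      long phase

    PUB Main | t0, overhead, elapsed
      ser.start(31, 30, 0, 115_200)     ' assumed: programming port, P31 = rx, P30 = tx
      waitcnt(clkfreq + cnt)            ' give the terminal a second to connect

      t0 := cnt
      overhead := cnt - t0              ' clocks spent on the measurement itself

      t0 := cnt
      phase += 6400                     ' statement under test (placeholder increment)
      elapsed := cnt - t0 - overhead

      ser.dec(elapsed)                  ' elapsed system clocks for the statement
      ser.tx(13)
      repeat                            ' keep this cog alive
    ```

    At 80 MHz one clock is 12.5 ns, so dividing the printed number by 80 gives microseconds.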
  • Chris Micro Posts: 160
    edited 2009-04-16 19:52
    mpark said...
    Try making the following changes (marked with ****). You should be able to increase FSample to 15000.
    Hi mpark, you were right: with this little change the loop can run at 15 kHz.
    Strange: an FSample value above 15 kHz seems to disable the program.
    I'm wondering why your small code change sped up the loop that much.
  • Chris Micro Posts: 160
    edited 2009-04-16 19:59
    Magio said...
    Try making the following changes (marked with ****). You should be able to increase FSample to 15000.
    That was what I tried to achieve during the last few hours of coding. But because I'm new to Spin, I was not successful.
    Then I came to the solution of changing the values by invoking a method of the object, which seems much more elegant in terms of object-oriented programming.

    If you could show me how to pass the pointers, I would be very glad.
    What I do not understand: are the variables of the object then held in global or local memory?
  • MagIO2 Posts: 2,243
    edited 2009-04-16 20:04
    Read my post from 12:17. High frequency means a short wait time. With short wait times your code may run for as long as the wait time. In your old code the wait time and the runtime of the code added up. When you calculate the target time before running the code, the runtime no longer adds to the wait time: first the code runs, and waitcnt then only waits for the rest of the period. So the speed can roughly double.

    Too high an FSample does not disable the program. waitcnt merely missed the calculated cnt value, which means that you have to wait ~50 s, the time it takes cnt to wrap around.
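
    One way to avoid that long stall is to check the remaining time before calling waitcnt. The following is only a sketch: MinWait and PaceLoop are made-up names, and the 2_000-clock margin is an assumed, deliberately generous value rather than a measured one (Spin needs a few hundred clocks of lead time for waitcnt).

    ```spin
    CON
      _clkmode = xtal1 + pll16x
      _xinfreq = 5_000_000
      MinWait  = 2_000                  ' assumed safety margin in system clocks

    VAR
      long missed

    PUB Main
      ' one second of pacing at 15 kHz; "missed" counts the late periods
      missed := PaceLoop(80_000_000 / 15_000, 15_000)

    PUB PaceLoop(period, count) | t
      t := cnt
      repeat count
        ' per-sample work goes here
        t += period
        if (t - cnt) < MinWait          ' signed difference: deadline passed or too close
          result++                      ' result starts at 0 and is returned implicitly
          t := cnt + MinWait            ' resynchronise instead of waiting for cnt to wrap
        waitcnt(t)
    ```

    The subtraction wraps together with cnt, so the comparison stays valid as long as the period is well below 2^31 clocks.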
  • Chris Micro Posts: 160
    edited 2009-04-16 20:13
    OK, understood.
    It would be very interesting to see how fast I can get the loop running by passing the DAC values via a pointer. But as I mentioned one post above, I wasn't successful with implementing pointers.
  • mpark Posts: 1,305
    edited 2009-04-16 20:16
    I posted something in your dac thread that might help with the pointer problem.
  • MagIO2 Posts: 2,243
    edited 2009-04-16 20:18
    Ok ... that's how I'd do it:
    CON
      _clkmode = xtal1 + pll16x
      _xinfreq = 5_000_000
      OneSecond = 80_000_000        'SysClk: Number of clk cycles in one second
      FSample = 2_000              ' sampling frequency
    
      LeftOutputPin = 10   ' PWM Pin for left analog signal
      RightOutputPin = 11  ' PWM Pin for right analog signal
    
    VAR
      LONG Phase
    
    OBJ
      DAC : "DualDac"
    
    PUB MAIN
      Dac.Start(LeftOutputPin,RightOutputPin,@Phase)           ' ****** here you pass the pointer to the DAC
      Triangle(200,5000)
       
    
    PUB Triangle(pitch,duration)|freqInc,temp,temp2,t
    '' duration in ms
      freqInc:=65536/FSample*pitch
      temp:=duration*FSample/1000
      temp2:=OneSecond/FSample
      repeat temp
        t := temp2 + cnt
        phase := (phase + freqInc) & $FFFF
        waitcnt( t ) 
    
     
    
    

    I put Phase in the VAR section because the Dac itself is started in MAIN, so you can't use a local variable of Triangle. If you moved the Dac.Start to Triangle, you could use a local variable as well. But then you should make sure that the DAC cog also gets stopped when leaving Triangle.

    The DAC should then store the address passed as the 3rd parameter somewhere.
    VAR
      long dacValue_adress
    PUB Start(left,right,val_adr)
      dacValue_adress:=val_adr
    ....

    And later on, when the DAC needs the value, it can access it with:
      long[ dacValue_adress ]
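
    Put together, the DAC side could look roughly like the sketch below. This is a hypothetical object, not Chris's actual DualDac, whose source is not shown in this thread. It assumes the outputs are driven by the cog counters in single-ended DUTY mode (CTRMODE %00110) and that the shared value is in the range 0..$FFFF.

    ```spin
    '' Hypothetical DAC object that reads its value through a pointer
    VAR
      long stack[32]                    ' stack space for the Spin cog
      long dacValue_adress              ' hub address of the caller's variable

    PUB Start(left, right, val_adr)
      dacValue_adress := val_adr
      cognew(Run(left, right), @stack)

    PRI Run(left, right) | v
      dira[left]~~                      ' outputs must be set in the cog that owns the counters
      dira[right]~~
      ctra := (%00110 << 26) | left     ' DUTY single-ended on the left pin
      ctrb := (%00110 << 26) | right    ' DUTY single-ended on the right pin
      repeat
        v := long[dacValue_adress] << 16   ' scale 16-bit value to the 32-bit FRQ range
        frqa := v
        frqb := v
    ```

    The caller then only writes to its own variable (Phase in the example above), and no method call is needed inside the sample loop.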
  • Beau Schwabe Posts: 6,568
    edited 2009-04-16 20:41
    Chris Micro,

    What does your "DualDac" object look like? It seems to be different from the one distributed within "Source-Code-PE-Kit-Labs-Fundamentals.zip".

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Beau Schwabe

    IC Layout Engineer
    Parallax, Inc.
  • MagIO2 Posts: 2,243
    edited 2009-04-16 20:50
    Oh .. now that I read the DAC thread ... why do you start the DAC in a different COG at all?

    You can use the counters of the COG which runs your Triangle function.
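
    A rough sketch of that idea, reusing the constants from the first post and assuming the counters run in single-ended DUTY mode (CTRMODE %00110) with an RC filter on each pin. No second cog and no Dac object are involved; the loop writes FRQA/FRQB itself:

    ```spin
    CON
      _clkmode = xtal1 + pll16x
      _xinfreq = 5_000_000
      OneSecond = 80_000_000            ' system clocks per second
      FSample = 2_000                   ' sampling frequency

      LeftOutputPin = 10
      RightOutputPin = 11

    VAR
      long phase

    PUB Main
      dira[LeftOutputPin]~~
      dira[RightOutputPin]~~
      ctra := (%00110 << 26) | LeftOutputPin    ' DUTY single-ended, left channel
      ctrb := (%00110 << 26) | RightOutputPin   ' DUTY single-ended, right channel
      Triangle(200, 5000)

    PUB Triangle(pitch, duration) | freqInc, n, period, t
    '' duration in ms
      freqInc := 65536 / FSample * pitch
      n := duration * FSample / 1000
      period := OneSecond / FSample
      t := cnt
      repeat n
        phase := (phase + freqInc) & $FFFF
        frqa := phase << 16             ' scale the 16-bit phase to the 32-bit FRQ range
        frqb := frqa
        waitcnt(t += period)
    ```

    The counters belong to the cog that runs Main, so the output stops when Main returns.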
  • Chris Micro Posts: 160
    edited 2009-04-16 20:58
    Beau Schwabe (Parallax) said...
    What does your "DualDac" object look like?
    Because I couldn't find a simple DAC object in the demo libraries, I tried to develop one myself here.
    For beginners like me (in the sense of knowing the Propeller, not other microcontrollers) it is essential to have easy objects to be able to learn fast.

    I developed this little synthesizer on an ATmega8 in C, which was a bit challenging at the time because of the processing power of the ATmega8, but the C compiler's optimization was sufficient.

    I'm wondering now how complicated it would be in Spin on a Propeller with its huge processing power. For now it seems to me as if Spin consumes too many resources compared to C on an ATmega8.


    @MagIO2
    Thanks for the example. I will try to implement it tomorrow.

    I'm off for now. On this side of the globe it's 9 p.m. and my concentration is weakening.

    good night,
    chris
  • MagIO2 Posts: 2,243
    edited 2009-04-16 21:57
    Of course SPIN consumes more of the processing speed than C. C is compiled directly to machine code. SPIN is compiled as well, but to bytecode. The bytecode has to be interpreted at runtime. So, there are two speed issues with bytecode:
    1. bytecode stays in HUB-RAM and is fetched one instruction after the other to COG RAM.
    2. execution then needs several PASM instructions per bytecode

    For a lot of things SPIN is fine, but if you need speed PASM should be used.

    An alternative would be to use the ImageCraft C compiler. But this still produces PASM code which is slower than self-written PASM code because of the memory model. It's very difficult to write a C compiler for the kind of architecture the Propeller has. That's why they use the LMM (large memory model), which eliminates the 2nd problem of SPIN code, but still has problem number 1 (massive HUB RAM access).
  • WNed Posts: 157
    edited 2009-04-17 02:58
    If you are more comfortable with C, you should look into the two C compilers available for the Propeller. There's ImageCraft C, which is a commercial product available from Parallax, but you can test drive it free for 45 days. More info here: http://www.imagecraft.com
    There is also Catalina C, an open source product which was developed by RossH, a forum member. You can learn about that here: http://forums.parallax.com/forums/default.aspx?f=25&m=339139.
    There is currently a drive to get more objects written for the Prop in C, and there are several threads that go into some detail about the various benefits and drawbacks of Spin, PASM, and C. Like any tool, you need to decide which one fits the requirements of the task at hand, as well as your ability to wield it (a chain saw is a horrible tool for cutting wood, if you don't know how to start it).

    Ned

    Added: You might be interested in looking at the available objects in these two categories of the Object Exchange...
    (Signal Generation)
    http://obex.parallax.com/objects/category/9/
    (Speech and Sound)
    http://obex.parallax.com/objects/category/10/
    Even if they don't do exactly what you want, they make terrific examples to work from.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    "They may have computers, and other weapons of mass destruction." - Janet Reno

    Post Edited (WNed) : 4/17/2009 3:23:47 AM GMT
  • Chris Micro Posts: 160
    edited 2009-04-17 07:19
    Hello WNed,

    Usually I program in C, and I did try the Catalina C compiler. After a little hassle with installing, I was able to compile the test_led.c program. But unfortunately the LED didn't blink (my LED is at pin 16, not pin 1, and I changed the program accordingly). Because I had this problem right at the beginning, I thought it was not useful to follow this path.
    MagIO2 said...
    An alternative would be to use the ImageCraft C compiler. But this still produces PASM code which is slower than self-written PASM code because of the memory model. It's very difficult to write a C compiler for the kind of architecture the Propeller has. That's why they use the LMM (large memory model), which eliminates the 2nd problem of SPIN code, but still has problem number 1 (massive HUB RAM access).
    I think it would be possible to get around this if these C compilers offered a special technique to run short C slices inside a single cog without using LMM. I used to program Attiny13s in C, and they have only 2 KBytes of program memory. It is even possible to implement some small synthesizers in C with these chips.
  • Ale Posts: 2,363
    edited 2009-04-17 08:10
    Chris: It would be possible to use a simple C program compiled with Catalina or ICCs and just put it into the COG memory. I'd write it in pasm though, too much hassle just for a simple piece of code.


    James: Uni is short for university. I know Chris, so I was just telling him to show up at my lab so I could give him a short pasm course. I hope he realizes it is me!
  • Chris Micro Posts: 160
    edited 2009-04-17 08:33
    Ale said...
    I hope he realizes it is me!
    Of course I realized it.
    Ale said...
    I'd write it in pasm though, too much hassle just for a simple piece of code.
    It is probably easier in assembler. Could you create a simple assembler example for the Dual Dac with a Spin header?
    So far I have learned a lot about the structure of Spin and its timing limitations. My little sound generator project is good for this because it is so time-critical.
    MagIO2 said...
    The DAC should then store the address passed as the 3rd parameter somewhere.
    VAR
      long dacValue_adress
    PUB Start(left,right,val_adr)
      dacValue_adress:=val_adr

    I added the following method to my Spin DAC object to get the address of the left channel variable in my DualDac object:
    PUB GetLeftPointer
      RESULT:=@LeftChannel

    Then I changed the sound loop to access the pointer:
      t:=cnt
      repeat temp 
        phase+=freqInc
        phase&=$FFFF
    
        LONG[DacAddress]:=phase
    
        'Dac.Write(phase,phase)
        waitcnt(t+=temp2)

    Result: it does not work.

    Post Edited (Chris Micro) : 4/17/2009 8:38:59 AM GMT
  • MagIO2 Posts: 2,243
    edited 2009-04-17 08:49
    You should not forget that there is a difference between an 8-bit controller with 2 KB and a 32-bit controller with 2 KB, which is only 512 longs (minus special registers). On average, the number of instructions you can put in 2 KB is somewhere between 1 and 4 times lower for a 32-bit system. The real ratio depends on the 8-bit microcontroller used. Some have variable instruction length: there you have 1-byte instructions (like INC, DEC ...), 2-byte instructions, 3-byte instructions and maybe 4-byte instructions as well. In other controller families you have a fixed size, but the organisation of program memory is different (isn't it 12 bit for PICs? There you'd have 2 kwords with 1 word = 12 bits).
    With the propeller architecture you leave the track of classical C compilers and of standard C, of course. You'd need some intelligence to decide which block of code belongs together and should run in a COG. E.g. you'd need mechanisms to overlay code in one COG. This intelligence is easier to outsource to the developer himself ;o)
    By using the LMM, they can use existing C compiler methods which simply create plain PASM code. No effort is needed for building an intelligent compiler.

    I personally like the SPIN/PASM combination. SPIN is powerful enough to run managing code and PASM is for the high-speed drivers.
  • Chris Micro Posts: 160
    edited 2009-04-17 09:16
    MagIO2 said...
    With the propeller architecture you leave the track of classical C compilers and of standard C, of course. You'd need some intelligence to decide which block of code belongs together and should run in a COG. E.g. you'd need mechanisms to overlay code in one COG. This intelligence is easier to outsource to the developer himself ;o)
    I also think that outsourcing the intelligence to the developer is the right way to go. There should be some special compiler directives that let the developer do this.

    By the way: Do you have some hint for me what's wrong with my pointer access in the DAC-loop?
  • Ale Posts: 2,363
    edited 2009-04-17 10:29
    MagIO2: Attiny have 16 bit instruction words. The rest of your rationale is correct.

    Chris: Do you mind posting the whole code?
  • MagIO2 Posts: 2,243
    edited 2009-04-17 10:47
    Ale: I didn't mention the Attiny .. I talked about PIC, which in fact has 12 bit instruction words (looked it up on Wikipedia). So, my whole post is OK ;o)

    Chris: A lot has been said and changed since your first post, so I'd also suggest to give us a complete update of the code.
  • Ale Posts: 2,363
    edited 2009-04-17 11:03
    This is not a competition. Chris was talking about Attiny and you picked up from there, so you were _also_ talking about that one; later you changed. And no, not all PICs have 12 bit instructions, there are some with 16, 18, 32 and probably more. But all this does not help Chris any further!

    Have fun.