Speed of PropForth vs Spin - Page 3 — Parallax Forums


Comments

  • Ariba Posts: 2,690
    edited 2014-07-09 17:23
    For comparison, I tried the good old way, Spin plus optimized PASM, on the LED string code.
    The PASM code uses the counter trick (one counter generates the serial clock, the other acts as the shift register) to speed up the transfer to 20 Mbit/s. The RGB color data is still defined inside the Spin code, so this is no harder to use than the C or Forth versions.
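
    As a rough check on the 20 Mbit/s figure, here is a small Python model of a Propeller 1 counter in NCO mode, where PHS accumulates FRQ on every system clock and the pin follows PHS bit 31. It is only a sketch: the function names are made up for illustration, and the 80 MHz clock is taken from the CON block (5 MHz crystal, pll16x).

```python
# Sketch of the Propeller 1 NCO relationship (not code that runs on the chip):
# PHS += FRQ on every system clock, and the output pin follows PHS bit 31.
SYSCLK_HZ = 80_000_000      # 5 MHz crystal x pll16x, from the CON block
FREQ4 = 0x4000_0000         # the freq4 constant from the DAT block


def nco_frequency(sysclk_hz, frq):
    """Output frequency of an NCO counter: sysclk * FRQ / 2**32."""
    return sysclk_hz * frq / 2**32


def nco_pin_trace(frq, clocks):
    """Pin level (PHS bit 31) after each system clock, starting from PHS = 0."""
    phs = 0
    trace = []
    for _ in range(clocks):
        phs = (phs + frq) & 0xFFFF_FFFF   # 32-bit accumulator wraps around
        trace.append(phs >> 31)
    return trace


print(nco_frequency(SYSCLK_HZ, FREQ4))   # 20000000.0
print(nco_pin_trace(FREQ4, 8))           # [0, 1, 1, 0, 0, 1, 1, 0]
```

    The trace shows one full clock period every four system clocks, which is the length of one ordinary PASM instruction, so each shl of the data counter lines up with one clock pulse.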

    I don't have any of these LED strings, so I could not test it. On the scope, though, the data and clock look good, and I have used code with the same timing to write to SPI RAMs. If it does not work with the LEDs, it may be too fast and need some delays here and there.

    The benchmark result for this Spin+PASM version is 388 us (microseconds, not milliseconds!) for updating all the LED chips with one color, including the zero byte at the beginning.
    This is over 3600 times faster than the PropForth version in the first post and about 8 times faster than the fastest candidate so far.
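
    The 388 us figure can be sanity-checked by counting instructions in the PASM listing. The sketch below assumes 4 system clocks per instruction at 80 MHz, 13 instructions for the setup plus the leading zero byte, and 28 instructions per LED chip (3 to load and start, 23 shifts, 1 to stop the clock, 1 djnz). Those counts are my reading of the listing, and the model ignores hub-access waits and the longer final djnz.

```python
# Back-of-the-envelope timing model; the instruction counts are an assumption
# read off the PASM listing, not a measurement.
SYSCLK_HZ = 80_000_000
CLOCKS_PER_INSTR = 4        # ordinary PASM instructions; hub accesses ignored
NUM_LEDS = 264              # numLEDs from the CON block
PREAMBLE_INSTRS = 13        # setup plus the 8-bit zero byte
PER_LED_INSTRS = 28         # mov/mov/mov, 23 x shl, mov, djnz


def estimated_transfer_us(num_leds):
    """Time the PASM loop spends on one update, in microseconds."""
    instrs = PREAMBLE_INSTRS + num_leds * PER_LED_INSTRS
    clocks_per_us = SYSCLK_HZ / 1_000_000
    return instrs * CLOCKS_PER_INSTR / clocks_per_us


def raw_bit_time_us(num_leds, bit_rate_hz=20_000_000):
    """Time the bits alone would need at the serial clock rate."""
    bits = 8 + 24 * num_leds
    return bits / (bit_rate_hz / 1_000_000)


print(estimated_transfer_us(NUM_LEDS))   # 370.25
print(raw_bit_time_us(NUM_LEDS))         # 317.2
```

    The estimate of about 370 us sits roughly 18 us below the measured 388 us. That gap is plausibly hub-access waits plus Spin interpreter time for the assignment and the polling loop, but that attribution is a guess.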

    Here is the code:
    CON
      _clkmode = xtal1  + pll16x
      _xinfreq = 5_000_000
    
      LEDdata  = 0
      LEDclock = 1
      numLEDs  = 264
    
    OBJ
      term : "FullDuplexSerial"
    
    VAR
      long rgbdata
      
    PUB Main 
      term.start(31,30,0,115200)
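      cognew(@fspi,@rgbdata)         'start the PASM cog; PAR holds the address of rgbdata
      waitcnt(clkfreq*2 + cnt)       'wait 2 s before starting
    
      result := CNT                  'benchmark: start time in system clocks
      rgbdata := $808080             'a non-zero value triggers one update of all LEDs
      repeat until rgbdata == 0      'the PASM cog clears rgbdata when it is done
      result := CNT - result         'elapsed system clocks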
    
      rgbdata := $80FFFF            'another color
      waitcnt(clkfreq + cnt)         'some delay
    
      rgbdata := $FF80FF            'another color
      repeat until rgbdata == 0
    
      term.str(string(13,"time = "))
      term.dec(result/80)            '80 system clocks per us at 80 MHz
      term.str(string(" us",13))
      repeat
    
    DAT
    fspi    mov     dira,outmask
            mov     outa,#0
            mov     ctra,modeMux
            mov     frqb,freq4
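    loop    mov     t1,#0
            wrlong  t1,par           'clear rgbdata: tells Spin the cog is idle/done
    waitd   rdlong  t1,par  wz       'poll rgbdata until it is non-zero
      if_z  jmp     #waitd
            mov     lcnt,#numLEDs    'number of LED chips
            shl     t1,#8            'left-align the 24 RGB bits so the MSB goes out first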
            mov     phsa,#0          'ctra for mux+shiftreg
            mov     phsb,#0          'ctrb for 20MHz sclk
            mov     ctrb,modeClk     'start clock
            nop                      '$00 at begin (8 bits)
            nop
            nop
            nop
            nop
            nop
            nop
            mov     ctrb,#0          'stop clock
    leds    mov     phsa,t1          'send the RGB bytes (24 bits)
            mov     phsb,#0
            mov     ctrb,modeClk
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            shl     phsa,#1        
            mov     ctrb,#0
            djnz    lcnt,#leds      'repeat for all LED chips
            jmp     #loop
    
    modeClk       long      %00100<<26 + LEDclock    'counter mode for Clock
    modeMux       long      %00100<<26 + LEDdata     'counter mode for shiftreg/Mux
    freq4         long      $4000_0000               'frequency = sysclock/4
    outmask       long      1<<LEDclock | 1<<LEDdata
    
    t1            res       1
    lcnt          res       1
    
    You just write the desired RGB data to the rgbdata variable. The PASM cog notices the non-zero value, writes the data to the LEDs, and then clears the variable. You can wait in Spin until the variable is zero, but normally the code has a delay longer than 388 us after a color change anyway.
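
    The handshake on rgbdata can be modelled in a few lines of Python, with a thread standing in for the PASM cog. This is only an illustration of the protocol: Mailbox, led_cog and set_color are hypothetical names, and the "send" step just records the color instead of driving pins.

```python
import threading
import time


class Mailbox:
    """Stand-in for the shared hub long `rgbdata`."""
    def __init__(self):
        self.value = 0


def led_cog(mailbox, sent, stop):
    """Plays the PASM cog: wait for a non-zero value, 'send' it, then clear it."""
    while not stop.is_set():
        color = mailbox.value
        if color == 0:
            time.sleep(0)        # yield; the real cog spins on rdlong
            continue
        sent.append(color)       # the real cog shifts the bits out here
        mailbox.value = 0        # zero tells the caller the update is done


def set_color(mailbox, rgb):
    """Hypothetical helper: post a color and wait until the cog has taken it."""
    if rgb == 0:
        raise ValueError("0 means 'idle' in this scheme and never triggers a send")
    while mailbox.value != 0:    # wait for any update still in progress
        time.sleep(0)
    mailbox.value = rgb
    while mailbox.value != 0:    # wait for completion
        time.sleep(0)


mailbox, sent, stop = Mailbox(), [], threading.Event()
worker = threading.Thread(target=led_cog, args=(mailbox, sent, stop), daemon=True)
worker.start()
for rgb in (0x808080, 0x80FFFF, 0xFF80FF):
    set_color(mailbox, rgb)
stop.set()
worker.join()
print([hex(c) for c in sent])    # ['0x808080', '0x80ffff', '0xff80ff']
```

    One consequence of using zero as the idle marker is that plain black ($000000) never triggers an update. Judging from the shl t1,#8 in the listing, the top byte of rgbdata is discarded before sending, so a value such as $0100_0000 should send black; I have not tested that.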

    Andy
  • Peter Jakacki Posts: 10,193
    edited 2014-07-09 17:53
    I just knew that someone would pull this trick out of the hat, and it had to be you, Andy :)

    The Tachyon time is 2.66 ms, which makes your PASM counter code almost 7 times faster. I know I could code it this way too, but the challenge was to do it in an HLL. My version could be faster if I used the counter method, but I'm interested in running more than snippets of code, which can always seem to perform well when they "hog a cog". That said, your counter code would make a good object for this device, and when I get a round TUIT I might incorporate it into my resident run-time OBEX, which will live in upper EEPROM or on SD. The idea is that I can load objects at run time (or is that fun time?) and reuse cogs as needed.