Speed of PropForth vs Spin

Ariba · 2014-07-09 17:23

For comparison I tried the good old way with Spin + optimized PASM code for the LED String code.
The PASM code uses the counter trick to speed up the transfer to 20 Mbit/sec. The RGB color data is still defined inside the Spin code, so this is not harder to use than the C or Forth versions.

I don't have such LED strings, so I could not test it. But on the scope the data and clock look good, and I have used code with the same timing to write to SPI RAMs. If it does not work with the LEDs, it may be too fast and need some delays here and there.

The benchmark result for this Spin+PASM version is 388 us (microseconds, not milliseconds!) for updating all the LED chips with one color, including the zero byte at the start.
This is over 3600 times faster than the PropForth version in the first post and about 8 times faster than the fastest candidate so far.
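As a rough cross-check of the 388 us figure, the inner loop of the PASM code can be counted by hand. The sketch below does this as Spin constants. It assumes 4 clocks per PASM instruction at 80 MHz, and all constant names are made up for illustration and are not part of the posted program.

```spin
CON
  ' Hypothetical names, for illustration only
  estClocksPerInstr = 4                                 ' assumed: 4 clocks per instruction
  estBitRate        = 80_000_000 / estClocksPerInstr    ' 20_000_000: one shl per bit
  estInstrPerLED    = 28                                ' 3 mov + 23 shl + mov ctrb,#0 + djnz
  estClocksPerLED   = estInstrPerLED * estClocksPerInstr ' 112 clocks = 1.4 us
  estLEDs           = 264                               ' same value as numLEDs
  estClocksAll      = estClocksPerLED * estLEDs         ' 29_568 clocks
  estMicroseconds   = estClocksAll / 80                 ' 369 (integer part of 369.6 us)
```

The remaining difference to the measured 388 us is probably the zero byte, the hub accesses and the Spin polling loop, but that split is an estimate, not a measurement.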

Here is the code:

CON
  _clkmode = xtal1  + pll16x
  _xinfreq = 5_000_000

  LEDdata  = 0
  LEDclock = 1
  numLEDs  = 264

OBJ
  term : "FullDuplexSerial"

VAR
  long rgbdata
  
PUB Main 
  term.start(31,30,0,115200)
  cognew(@fspi,@rgbdata)
  waitcnt(clkfreq*2 + cnt)

  result := CNT
  rgbdata := $808080
  repeat until rgbdata == 0
  result := CNT - result

  rgbdata := $80FFFF            'another color
  waitcnt(clkfreq + cnt)         'some delay

  rgbdata := $FF80FF            'another color
  repeat until rgbdata == 0

  term.str(string(13,"time = "))
  term.dec(result/80)            '80 clocks per us at 80 MHz
  term.str(string(" us",13))
  repeat

DAT
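fspi    mov     dira,outmask     'clock and data pins as outputs
        mov     outa,#0
        mov     ctra,modeMux     'ctra puts phsa[31] on the data pin
        mov     frqb,freq4       'ctrb adds freq4 to phsb every clock
loop    mov     t1,#0
        wrlong  t1,par           'clear rgbdata: tells Spin we are done
waitd   rdlong  t1,par  wz       'poll rgbdata until it is nonzero
  if_z  jmp     #waitd
        mov     lcnt,#numLEDs    'number of LED chips
        shl     t1,#8            'left-align the 24 RGB bits (first bit in bit 31)
        mov     phsa,#0          'ctra for mux+shiftreg
        mov     phsb,#0          'ctrb for 20MHz sclk
        mov     ctrb,modeClk     'start clock
        nop                      '$00 at the start (8 bits)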
        nop
        nop
        nop
        nop
        nop
        nop
        mov     ctrb,#0          'stop clock
leds    mov     phsa,t1          'send the RGB bytes (24 bits)
        mov     phsb,#0
        mov     ctrb,modeClk
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        shl     phsa,#1        
        mov     ctrb,#0
        djnz    lcnt,#leds      'repeat for all LED chips
        jmp     #loop

modeClk       long      %00100<<26 + LEDclock    'counter mode for Clock
modeMux       long      %00100<<26 + LEDdata     'counter mode for shiftreg/Mux
freq4         long      $4000_0000               'frequency = sysclock/4
outmask       long      1<<LEDclock | 1<<LEDdata

t1            res       1
lcnt          res       1

You just write the desired RGB data to the rgbdata variable. The PASM cog notices that and writes the data to the LEDs, then clears the variable. You can wait in Spin until the variable is zero, but normally you have a delay longer than 388 us after a color change in the code anyway.
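The handshake described above could be wrapped in a small method. This is only a sketch: SetColor is a hypothetical name, and it assumes it is added to the same object as the code above, so that the rgbdata variable and the running PASM cog already exist.

```spin
PUB SetColor(rgb)               'hypothetical helper, not part of the posted program
  repeat until rgbdata == 0     'wait until the PASM cog has finished the previous update
  rgbdata := rgb                'a nonzero value starts the next transfer
```

Called as SetColor($80FFFF), this returns right after handing over the color, so Spin can continue while the cog shifts the data out. A value of 0 cannot be sent this way, because zero is what the cog uses to mean "idle".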

Andy

Peter Jakacki · 2014-07-09 17:53

I just knew that someone would pull this trick out of the hat, and it had to be you, Andy.

The Tachyon time is 2.66 ms, so your PASM counter code is almost 7 times faster. I know that I can code it this way too, but the challenge was to do it with an HLL. My version could be faster if I used the counter method, but I'm interested in running more than snippets of code, which can always seem to perform well when they "hog a cog". That said, your counter code would make a good object for this device, and when I get a round TUIT I might incorporate it in my resident run-time OBEX, which will live in upper EEPROM or SD. The idea there is that I can load objects at run time (or is that fun time?) and reuse cogs as needed.
