PropBASIC LMM JumpTable

Bean · 2019-09-10 12:47

For my latest PropBASIC project (a VGA, sound, and controller interface for the ZX81), I needed a jump table for some LMM code.
With a little inline assembly, I got it working.
This is for LMM code ONLY!!!

      regIndex = regIndex MAX 34 ' Only 35 entries in jump table

      ' Jump Table
      ASM
        MOV __param1,regIndex
        ADD __param1,#1 ' Skip RDLONG instruction
        SHL __param1,#2 ' 4 bytes per long
        ADD __param1,__PC
        RDLONG __PC,__param1
        LONG @@@LSB0
        LONG @@@LSB1
        LONG @@@LSB2
        LONG @@@LSB3
        LONG @@@LSB4
        LONG @@@LSB5
        LONG @@@LSB6
        LONG @@@LSB7
        LONG @@@LSB8
        LONG @@@LSB9
        LONG @@@LSB10
        LONG @@@LSB11
        LONG @@@LSB12
        LONG @@@LSB13
        LONG @@@LSB14
        LONG @@@LSB15
        LONG @@@LSB16
        LONG @@@LSB17
        LONG @@@LSB18
        LONG @@@LSB19
        LONG @@@LSB20
        LONG @@@LSB21
        LONG @@@LSB22
        LONG @@@LSB23
        LONG @@@LSB24
        LONG @@@LSB25
        LONG @@@LSB26
        LONG @@@LSB27
        LONG @@@LSB28
        LONG @@@LSB29
        LONG @@@LSB30
        LONG @@@LSB31
        LONG @@@LSB32
        LONG @@@LSB33
        LONG @@@LSBHandled
      ENDASM

!!! NOTICE: Do not put spaces between the @ characters; each label takes three of them with nothing in between (@@@). The forum doesn't always display them correctly. !!!
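To make the address arithmetic in the snippet easier to check, here is a small Python model of it. It is only a sketch: hub memory is modelled as a dict, the handler addresses are made up, and it assumes that __PC already points at the RDLONG when the ADD executes (the usual LMM behaviour, where the interpreter advances the PC before running the fetched instruction).

```python
# Toy model of the jump-table address arithmetic (not PropBASIC output).
# Assumption: __PC holds the hub address of the RDLONG when ADD executes.

LONG_BYTES = 4

def jump_target(hub, pc_at_add, reg_index, last_index=34):
    """Return the value RDLONG would load into __PC."""
    index = min(reg_index, last_index)   # regIndex = regIndex MAX 34
    param = index + 1                    # ADD __param1,#1   (skip RDLONG)
    param <<= 2                          # SHL __param1,#2   (4 bytes per long)
    param += pc_at_add                   # ADD __param1,__PC
    return hub[param]                    # RDLONG __PC,__param1

# Hypothetical layout: RDLONG at 0x1000, the 35-entry table right after it.
rdlong_addr = 0x1000
handlers = [0x2000 + 0x40 * i for i in range(35)]   # made-up handler addresses
hub = {rdlong_addr + LONG_BYTES * (i + 1): h for i, h in enumerate(handlers)}

print(hex(jump_target(hub, rdlong_addr, 0)))    # first entry (LSB0)
print(hex(jump_target(hub, rdlong_addr, 99)))   # clamped to the last entry
```

Index 0 reads the long immediately after the RDLONG, and anything above 34 is clamped to the final entry (LSBHandled in the original table).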

Bean

Cluso99 · 2019-09-10 13:03

You can use words instead of longs as you only need 16 bits for LMM addresses.
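As a rough illustration of what a word-sized table would change, the Python sketch below compares table sizes and entry offsets. It is a model only, not tested PropBASIC: the function names are hypothetical, and it assumes the table would be padded back to a long boundary so that any LMM code following it stays long-aligned.

```python
# Rough comparison of long vs. word jump tables (model only, names are hypothetical).

def table_bytes(entries, entry_size, align=4):
    """Bytes used by the table, padded up to the next long boundary."""
    raw = entries * entry_size
    return (raw + align - 1) // align * align

def entry_address(pc_at_add, index, entry_size, skip_bytes=4):
    """Hub address of table entry `index`; skip_bytes steps over the read instruction."""
    return pc_at_add + skip_bytes + index * entry_size

ENTRIES = 35
print(table_bytes(ENTRIES, 4))   # longs
print(table_bytes(ENTRIES, 2))   # words, padded to a long boundary

# With 2-byte entries the index is scaled by 2 instead of 4, while the
# skipped instruction is still one 4-byte long.
print(hex(entry_address(0x1000, 34, 2)))
```

In this model the 35-entry table shrinks from 140 bytes to 72 bytes including the padding.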

Bean · 2019-09-10 13:43

Good point. I'll have to try that.

Bean

ersmith · 2019-09-14 22:40

If you're looking at PropBASIC again, Bean, did you see my LMM experiments thread (https://forums.parallax.com/discussion/169998/stupid-lmm-tricks#latest)? It's possible to get a dramatic performance increase in LMM code by providing a small cache in COG memory, and that can be done without any compiler changes at all (just a change in the LMM interpreter). So it provides many of the benefits of FCACHE without requiring the compiler to figure out what code to cache and what not to.

Basically the way it works is that the main LMM loop is the same, but the LMM_JUMP routine for handling branches checks for short backwards branches. If it finds one, it loads the LMM instructions from the destination up to the current jump into a COG cache space and jumps into it there, where it can run at full COG speed.
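The branch check described above can be sketched in Python as follows. This is a toy model, not the attached interpreter source: the function name is hypothetical, "short" is assumed to mean "fits in the COG cache", and the copied span is assumed to run from the destination up to, but not including, the jump itself.

```python
# Toy model of the short-backwards-branch check (not the real LMM_JUMP code).
CACHE_LONGS = 128   # COG cache size quoted in the benchmark figures
LONG_BYTES = 4

def classify_jump(jump_addr, dest_addr, cache_longs=CACHE_LONGS):
    """Return ("cache", n_longs) for a short backwards branch, else ("hub", 0).

    jump_addr and dest_addr are hub byte addresses (hypothetical values).
    """
    if dest_addr >= jump_addr:
        return ("hub", 0)              # not a backwards branch: keep running from hub
    n_longs = (jump_addr - dest_addr) // LONG_BYTES
    if n_longs > cache_longs:
        return ("hub", 0)              # loop body too large for the cache
    return ("cache", n_longs)          # copy n_longs instructions into COG, run there

print(classify_jump(0x1040, 0x1000))   # 16-instruction loop
print(classify_jump(0x1000, 0x1040))   # forward branch
```

A tight loop is cached the first time its closing branch is taken, while forward branches and oversized loops fall through to the normal hub-fetch path.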

I haven't tried it with PropBASIC specifically, but it's a general LMM feature so it should work with any compiler. The current source code (as tested in fastspin) is attached.

Performance of Heater's fftbench benchmark (Spin version, compiled with fastspin):

plain LMM:        233538 us
unrolled LMM:     154896 us
auto-cache LMM:    58332 us   (128 instruction COG cache)
compiler FCACHE:   63019 us   (128 instruction COG cache)

In this particular case the auto-cache LMM beats the compiler. With smaller COG caches the compiler FCACHE does better, if I remember my testing correctly. I've still left the compiler FCACHE as the default in fastspin, since I've already done the compiler work for it, but for future projects I may switch to auto cached LMM instead.
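For a quick sense of scale, the timings above can be turned into speedups relative to plain LMM. The short Python snippet below does only that arithmetic on the four quoted figures; nothing else is measured or assumed.

```python
# Speedup of each variant relative to plain LMM, using the timings quoted above.
timings_us = {
    "plain LMM": 233538,
    "unrolled LMM": 154896,
    "auto-cache LMM": 58332,
    "compiler FCACHE": 63019,
}

baseline = timings_us["plain LMM"]
speedups = {name: baseline / t for name, t in timings_us.items()}

for name, s in speedups.items():
    print(f"{name:16s} {s:.2f}x")
```

On these numbers the auto-cache LMM comes out at roughly 4x plain LMM, and about 8% faster than the compiler FCACHE.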
