Executing code from lookup RAM — Parallax Forums

Executing code from lookup RAM

I was under the impression that on the Prop2 we could execute code from lookup RAM just like hub RAM, by putting the code at $200. But I can't seem to get it working. Can anyone tell me what's wrong with the following code? It works correctly if start_cog is at $100, but not if I force it to be at $200.

What am I doing wrong?
DAT
        org 0

start   cogid hex_value
        call #\Flash_Hex
        call #\start_cog
loop    jmp #loop

led_mask
        long     |< (56-32)
hex_value
        long     $10203040
hex_count
        long     $0
flash_count
        long     $0

'        orgf     $200 ' put start_cog in lookup RAM
        orgf     $100 ' put start_cog in cog RAM

start_cog
        coginit  #%1_0000, #start wc
        ret

'--DEBUGGING FUNCTIONS -----------------------------------------

CON

blip_time = 1_000_000
hex_time  = 10_000_000
flash_time = 5_000_000

DAT
        orgh $1000
'
' LED_On - turn LED on
'
LED_On
        or       dirb,led_mask
        andn     outb,led_mask
        ret
'
' LED_Off - turn LED off
'
LED_Off
        or       dirb,led_mask
        or       outb,led_mask
        ret
'
' Flash_LED - flash the LED flash_count times
'
Flash_LED
        cmp      flash_count,#0 wz
   if_z waitx    ##flash_time
   if_z jmp      #done_flash
flash_loop
        call     #LED_On
        waitx    ##flash_time
        call     #LED_Off
        waitx    ##flash_time
        djnz     flash_count,#flash_loop
done_flash
        ret
'
' Blip_LED - flash LED briefly (e.g. used to indicate zero)
'
Blip_LED
        call     #LED_On
        waitx    ##blip_time
        call     #LED_Off
        waitx    ##flash_time
        ret
'
' Flash_Hex - flash the LED to display up to 8 hex digits in hex_value
'
Flash_Hex
        cogid    hex_count
        mov      led_mask,##|<(56-32)
        shl      led_mask,hex_count
        mov      hex_count,#8
digit_loop1
        ' skip leading zeroes
        rol      hex_value,#4
        mov      flash_count,hex_value
        and      flash_count,#$f wz
 if_z   djnz     hex_count,#digit_loop1
        ' if all we have are zeroes, do one blip
        tjnz     hex_count,#digit_loop2
        call     #Blip_LED
        jmp      #done
digit_loop2
        tjnz     flash_count,#do_flash
        ' for zero digits, do one blip
        call     #Blip_LED
        jmp      #do_next
do_flash
        ' for non-zero digits, flash the digit count
        call     #Flash_LED
do_next
        djz      hex_count,#done
        waitx    ##hex_time
        rol      hex_value,#4
        mov      flash_count,hex_value
        and      flash_count,#$f
        jmp      #digit_loop2
done
        waitx    ##hex_time*2
        ret

Comments

  • Rayman Posts: 14,789
    edited 2019-03-29 00:57
    You have to jump into lut for it to work...

    Oh, you have to copy it into LUT first?
    https://forums.parallax.com/discussion/165903/how-to-write-lut-exec-assembly-code
  • TonyB_ Posts: 2,196
    _ret_ prefix can avoid separate ret.
  • RossH Posts: 5,502
    Rayman wrote: »
    You have to jump into lut for it to work...

    Oh, you have to copy it into LUT first?
    https://forums.parallax.com/discussion/165903/how-to-write-lut-exec-assembly-code

    Aha! Thanks.
  • RossH Posts: 5,502
    TonyB_ wrote: »
    _ret_ prefix can avoid separate ret.

    Yes, still getting used to that one :)
  • Rayman Posts: 14,789
    See my link above..
    Need to copy code into lut first ...
  • evanh Posts: 16,068
    edited 2019-03-29 04:12
    I've been stashing all my generic subroutines in lutRAM. This is where ORG becomes useful. :) You'll note that the copying is going from hubRAM to lutRAM even though there is an ORG $200 on the block to be copied. The only code block that usefully starts off in cogRAM of cog#0 is the ORG prior to first ORGH. And if that ORGH is set to typical $400 then only 1 kB can fit before it.

    Here's my subroutine wrapper code:
    ORGH
    ALIGNL
    _diaginit				'only called once at beginning
    		...
    
    '-------- Copy lut code into position --------
    		setq2	#(LUT_end - LUT_code - 1)	'copy length, in longwords
    		rdlong	0, ##@LUT_code			'the "0" is lutRAM zero, or $200 in memory map
    
    		...
    
    
    ORG  $200					'longword addressing
    '*******************************************************************************
    '  LUT Code  (Has to be copied from hubram to lutram)
    '*******************************************************************************
    LUT_code
    
    		...
    
    LUT_end
    FIT  $400
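
    Applied to the program in the opening post, the same pattern might look like the sketch below. It is untested and rests on a few assumptions: the labels lut_code and lut_end are hypothetical, the debugging calls are left out, and @lut_code is assumed to give the hub address of the image, as in the wrapper above.

    ```
    DAT
            org     0

    start   setq2   #(lut_end - lut_code - 1)   ' number of longs to copy, minus one
            rdlong  0, ##@lut_code              ' destination 0 = first lutRAM long ($200)
            call    #\start_cog                 ' start_cog is now present in lutRAM
    loop    jmp     #loop

            org     $200                        ' assemble for lutRAM addresses
    lut_code                                    ' hypothetical label: start of the image
    start_cog
            coginit #%1_0000, #start wc
            ret
    lut_end                                     ' hypothetical label: end of the image
            fit     $400
    ```

    The ORG $200 only sets the addresses the block is assembled for. The image itself still sits in hubRAM after the cog code until the SETQ2+RDLONG pair copies it across.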
    
    
  • RossH Posts: 5,502
    evanh wrote: »
    I've been stashing all my generic subroutines in lutRAM.

    Yes, I am intending to do the same thing. Thanks for the wrapper code.
  • Cluso99 Posts: 18,069
    Don't forget, the internal stack is only 8 deep.
    And the _RET_ cannot restore the C & Z flags, but you can with RET wc/wz/wcz
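
    As a rough illustration of both points, here is LED_Off from the opening post rewritten with the _RET_ prefix, followed by a hypothetical routine that uses RET WCZ. The names count_zero, value and zero_count are made up for the example, and the sketch has not been run on hardware.

    ```
    LED_Off
            or      dirb, led_mask
     _ret_  or      outb, led_mask      ' executes the OR, then returns; no separate RET

    count_zero                          ' hypothetical routine
            cmp     value, #0 wz        ' the routine changes Z while it works
      if_z  add     zero_count, #1
            ret     wcz                 ' return and restore the caller's C and Z
    ```

    Each nested CALL still uses one of the eight hardware stack levels, whichever form of return is used.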
  • RossH wrote: »
    I was under the impression that on the Prop2 we could execute code from lookup RAM just like hub RAM, by putting the code at $200.

    As others have already mentioned, you have to explicitly load the LUT RAM with code before trying to execute from it. A few other points:

    - Unlike P1+LMM, P2+hubexec runs at full speed except for branches, which incur a hub lookup penalty
    - As a result I was surprised at how little difference putting code in LUT makes. It's worth it for small loops (particularly if you can use REP), but in general hubexec works pretty well
    - The other reason to keep code in LUT is if you need to use the rdfast/wrfast mechanism, which conflicts with hubexec
    - Oddly enough, it's more efficient to keep code in LUT and tables in COG RAM than the reverse. "rdlut" takes 3 cycles to execute, as opposed to 2 cycles for accessing COG memory, so getting data from COG memory is 50% faster than getting it from LUT memory

    Regards,
    Eric
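
    For the small-loop case, a REP block could be sketched as follows. The names sum_steps, sum and step are made up for the example, and sum and step are assumed to be registers declared in cog RAM elsewhere. Going by the P2 documentation, a REP block run from cog or LUT RAM repeats without any branch overhead, while under hubexec each pass costs a hidden jump back to the top.

    ```
    sum_steps                           ' hypothetical routine, intended to sit in LUT
            mov     sum, #0
            rep     @.done, #16         ' repeat the instructions up to .done 16 times
            add     sum, step
            add     step, #1
    .done
            ret
    ```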
  • evanh Posts: 16,068
    Also, keeping data in cogRAM keeps full selection of manipulation on tap. No load-and-store when the data's all in the general registers already.
  • Rayman Posts: 14,789
    For P1, it was often best to put tables at register #0 for fastest access... Wonder if that's still true here...
  • evanh Posts: 16,068
    I don't know how fast the Prop1 was at doing that but the Prop2's ALT instructions makes tables in any part of cogram easy.
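
    A minimal sketch of an indexed table read with ALTS, assuming hypothetical registers named table, index and value that all live in cogRAM:

    ```
            alts    index, #table       ' patch the next instruction's S field to table + index
            mov     value, 0-0          ' value := table[index]; 0-0 is only a placeholder

    ' Data, placed in cogRAM away from the code path:
    table   long    10, 20, 30, 40
    index   long    2
    value   long    0
    ```

    Since ALTS adds the index to the table's base address, the table does not need to start at register 0.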
  • TonyB_ Posts: 2,196
    edited 2019-03-29 13:18
    If cog and LUT RAM are in one contiguous block of $400 longs in hub RAM, presumably adding $200 to PTRB after COGINIT gives start address of LUT code?
  • TonyB_ wrote: »
    If cog and LUT RAM are in one contiguous block of $400 longs in hub RAM, presumably adding $200 to PTRB after COGINIT gives start address of LUT code?

    It's rare for the COG code to actually fill $200 longs (especially since the last $10 are registers). Besides, instead of:
       add ptrb, ##$200
    
    it's actually 1 instruction shorter to do:
       loc ptrb, #\@cog_code_addr
    
  • RaymanRayman Posts: 14,789
    What's the backslash do?
  • The backslash forces absolute rather than relative addressing. It's optional as long as you don't move the code around, but in situations like this I personally like to know exactly what the assembler is generating. With the backslash, the assembler emits a direct move of the hub address of cog_code_addr into ptrb. Without it, the assembler may use a relative address, computing the result as the current PC plus the offset to @cog_code_addr.

    I'd still prefer to see different mnemonics for these instructions (loc and locrel, or locabs and locrel, or whatever) instead of the backslash, but that's another story.
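
    To tie this back to the LUT question, loading the LUT half from a known hub address might be sketched like this. The label lut_image is hypothetical and stands for wherever the LUT code sits in hub RAM. The sketch is untested.

    ```
            loc     ptrb, #\@lut_image  ' absolute: ptrb := hub address of lut_image
            setq2   #$200 - 1           ' copy all $200 LUT longs (count minus one)
            rdlong  0, ptrb             ' fill LUT RAM starting at LUT address 0 ($200 in the memory map)
    ```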