Are there current examples of PASM2 in LUT memory?

Larry Martin · 2024-06-26 17:12

Hi folks. I hope this is not a stupid question, but here goes...

I have a PASM2 cog that nearly fills the $1F0 limit. It's ported from a PASM1 cog with the same problem. In both cases, I have made the code fit by commenting out diagnostics. In P2, I want to uncomment the diagnostics by placing them in LUT memory. The examples I can find in the forum make it sound straightforward, but when I follow the examples my code freezes. I can't even get a version string out of it (via SmartPin serial).

I can't publish my company code, and hope to get away without contriving a full example. My working module looks like this:

CON
VAR
PUB Start function
PUB Other functions
DAT
ORG or ORG0
PASM2 Loop including commented diagnostics
PASM2 functions including commented diagnostics
long statements
FIT $1F0

My attempt at exec-in-LUT looks like this:

CON
VAR
PUB Start function
PUB Other functions
DAT
ORG or ORG0
PASM2 Loop including commented diagnostics
long statements
ORG $200
PASM2 functions including commented diagnostics

Pnut 42 accepts and loads the code, but the target does not respond to my serial Version command, which is in the top level SPIN2 code. It uses a variation of JonnyMac's jm_fullduplexserial, in a different cog, which I did not change for this exercise. I expected that, even if the cog with code in LUT went in the weeds, top level SPIN2 (i.e. MAIN), and the other cogs, would still work. As far as I can tell, everything just freezes. My command line to load is:
PNut_v42s.exe %1 -r

I have read through the posts I can find on LUT execution, and I'm afraid they are all over my head - they seem to be part of ongoing discussions between Chip Gracey and the early adopters, about 10 years ago. Example here:
https://forums.parallax.com/discussion/162631/examples-isr-in-cog-lut-hub

Can someone point me to a recent working example of exec-in-LUT or explain what else I need to do so the COG (register?) code can call the LUT code?

Christof Eb. · 2024-06-26 17:20

There is a tile vga driver in the Flex suite, which uses lut code, as far as I remember.

The lut has to be loaded with the code from hub. Is this the problem?

Wuerfel_21 · 2024-06-26 18:44

@"Larry Martin" said:
My attempt at exec-in-LUT looks like this:

CON
VAR
PUB Start function
PUB Other functions
DAT
ORG or ORG0
PASM2 Loop including commented diagnostics
long statements
ORG $200
PASM2 functions including commented diagnostics

There is no way to autoload LUT code, you need to load it manually. Since PTRB starts out with the hub pointer to your cog code when starting in cogexec, you can do this to load it:

PUB start()
  coginit(COGEXEC_NEW,@entry,0)
DAT
              org
entry
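              add ptrb,##@lutcode - @entry   ' PTRB holds the hub address of entry at cog start; advance it to lutcode
              setq2 #511                     ' make the next RDLONG a block read of 512 longs into LUT
              rdlong 512-512,ptrb            ' D = 0 is the LUT start address (execution address $200)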

              ' [useful code here]

              fit 496

              org 512
lutcode
              ' [even more code here]

              fit 1024

Example in actual source file (probably not the best example, but this is what I was just working on): https://github.com/Wuerfel21/tempest2k/blob/p2port/src/mikodsp.spin2

Larry Martin · 2024-06-26 20:13

@"Christof Eb.": The lut has to be loaded with the code from hub.
@Wuerfel_21: There is no way to autoload LUT code, you need to load it manually.

Does that mean that LUT code is like an overlay? It looks to me like once rdlong 512-512,ptrb executes, the original contents at entry are overwritten. So the code doesn't execute in LUT as much as reside in LUT, so it can be copied into COG memory for execution. Is that right?

I'll look at the examples, thanks for the help.

Wuerfel_21 · 2024-06-26 20:21

@"Larry Martin" said:
@"Christof Eb.": The lut has to be loaded with the code from hub.
@Wuerfel_21: There is no way to autoload LUT code, you need to load it manually.

Does that mean that LUT code is like an overlay? It looks to me like once rdlong 512-512,ptrb executes, the original contents at entry are overwritten. So the code doesn't execute in LUT as much as reside in LUT, so it can be copied into COG memory for execution. Is that right?

I'll look at the examples, thanks for the help.

No. The SETQ2+RDLONG combination is what you need to load stuff from the hub into LUT memory. The D parameter is "512-512" (i.e. zero) because it needs a 9-bit address in LUT space. In this case we want to load the entire LUT RAM from the beginning, so it's LUT address zero. This is a bit confusing: For execution, LUT appears between $200 and $3FF, but all instructions that read/write data in LUT memory use 9-bit addresses from $000 to $1FF.
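
To make the two address views concrete, here is a minimal sketch that loads a small routine at an offset into LUT rather than at its start. The labels `overlay` and `overlay_end`, the $300 location, and pin 56 (an LED on the P2 Eval board) are assumptions for illustration, and the snippet has not been run on hardware.

```spin2
DAT
              org
entry
              add     ptrb, ##@overlay - @entry     ' PTRB: hub address of entry -> hub address of overlay
              setq2   #overlay_end - overlay - 1    ' number of longs to copy, minus 1
              rdlong  overlay - $200, ptrb          ' D = $100: the 9-bit LUT data address
              call    #overlay                      ' $300: the same location as an execution address
.halt         jmp     #.halt

              fit     $1F0

              org     $300
overlay
              drvnot  #56                           ' hypothetical diagnostic: toggle pin 56
              ret
overlay_end
              fit     $400
```

The same label, `overlay`, shows up as $100 in the load and as $300 in the call.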

Larry Martin · 2024-06-26 21:16

@Wuerfel_21 said:

@"Larry Martin" said:
@"Christof Eb.": The lut has to be loaded with the code from hub.
@Wuerfel_21: There is no way to autoload LUT code, you need to load it manually.

Does that mean that LUT code is like an overlay? It looks to me like once rdlong 512-512,ptrb executes, the original contents at entry are overwritten. So the code doesn't execute in LUT as much as reside in LUT, so it can be copied into COG memory for execution. Is that right?

I'll look at the examples, thanks for the help.

No. The SETQ2+RDLONG combination is what you need to load stuff from the hub into LUT memory. The D parameter is "512-512" (i.e. zero) because it needs a 9-bit address in LUT space. In this case we want to load the entire LUT RAM from the beginning, so it's LUT address zero. This is a bit confusing: For execution, LUT appears between $200 and $3FF, but all instructions that read/write data in LUT memory use 9-bit addresses from $000 to $1FF.

That sounds more promising, thank you. I will read the examples and try again!

Larry Martin · 2024-06-28 18:44

@"Christof Eb.", not loading the code from HUB to LUT was indeed the problem.
@Wuerfel_21, thank you for your detailed example. Let me recap for the next person who tries this. There's a question at the end...

In a "standard" SPIN2 file meant for a cog, i.e., including a DAT section with "org" followed by PASM2 content, you first need to pick out the code that will be moved to LUT. It was easy for me since I had a few large functions. Cut that code out and move it to the end of the file, past any LONG statements with labels that will be used in the main code. The LONG variables all have to have addresses below $200 because of the 9-bit size of the PASM2 src and dst fields. Those LONGs will be accessible by both COG and LUT code.

After your LONG statements, and before the code you just moved, make a new ORG line for $200, the start of LUT memory (do not start in column 0). Add a label (in column 0), for the linker to find your code. It doesn't have to be lutcode, but it has to match the label in the next section. As I understand it, this code will now be link/located for org 512, but will actually be **loaded** somewhere in the 512 KB HUB memory.

    ORG 512

lutcode

Higher in your DAT section, right after your original ORG[ 0] statement, you need to add three simple lines to copy your lutcode from HUB to LUT memory:

                     add ptrb,##@lutcode - @entry   'ptrb is loaded with the hub address of entry (this instruction) by SPIN level coginit()
                     setq2 #511                     'sets block length minus 1 (512 longs) for the next instruction
                     rdlong 512-512,ptrb            'read code from HUB into LUT (destination 0 is a LUT address here, not COG register 0)

I did all this, loaded code and ran as usual, and it seemed to work.
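
Put together, the recap above might look like the following single file. This is a sketch under stated assumptions, not the code discussed in this thread: `main_loop`, `diag`, `counter`, and `lutend` are hypothetical names, and `setq2 #lutend - lutcode - 1` is a variation on `setq2 #511` that copies only the longs actually assembled. It has not been run on hardware.

```spin2
PUB start() : cog
  cog := coginit(COGEXEC_NEW, @entry, 0)      ' PTRB in the new cog = hub address of entry

DAT
              org
entry
              add     ptrb, ##@lutcode - @entry   ' point PTRB at the hub copy of the LUT code
              setq2   #lutend - lutcode - 1       ' longs to copy, minus 1
              rdlong  0, ptrb                     ' copy into LUT starting at LUT address 0

main_loop     call    #diag                       ' plain CALL # reaches LUT code directly
              jmp     #main_loop

counter       long    0                           ' below $200, so COG and LUT code can both use it

              fit     $1F0

              org     $200
lutcode
diag          add     counter, #1                 ' LUT code using a COG register as its operand
              ret
lutend
              fit     $400
```

Keeping the copy instructions first means the LUT is filled before anything can call into it.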

Per your example, in order for your original COG code to access the LUT code, it needs to redirect through a LONG variable with a 9-bit address, so you have to make pointers...

func1_p long func1

... and then do the call indirectly:

calld func1_p,func1_p wcz

I made the pointers but forgot to change my calls. The code seemed to work anyway. Can you think of an explanation?

Larry Martin · 2024-06-28 19:05

Per @Wuerfel_21's example, in order for your original COG code to access the LUT code, it needs to redirect through a LONG variable with a 9-bit address, so you have to make pointers...

func1_p long func1

... and then do the call indirectly:

calld func1_p,func1_p wcz

I made the pointers but forgot to change my calls. The code seemed to work anyway. Can you think of an explanation?

Update: I changed my calls to the CALLD pattern and the code stopped working. Changed back to the original CALL # pattern and it works again. Now I'm worried. What's going on?

Wuerfel_21 · 2024-06-28 19:59

CALL #A takes a 20-bit immediate address or offset, so it's fine. (however, calling from cog/lut code into hub code is broken in PNut unless you're doing a pure ASM program due to how spin programs are linked).

CALL D with a register would also work, but CALLD D,S is an entirely different instruction that does something else (instead of pushing the return address to the stack, it gets saved into D).
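
The three forms side by side, as a sketch based on the instruction descriptions in the P2 documentation. The names `func1`, `func2`, `func1_p`, `func2_p`, and `ret_addr` are hypothetical, the hub-to-LUT copy is omitted for brevity, and the fragment is untested.

```spin2
DAT
              org
entry         call    #func1              ' immediate form: 20-bit address, return address pushed on the hardware stack
              call    func1_p             ' register form: target read from func1_p, return address still pushed
              calld   ret_addr, func2_p   ' return address (and C/Z) written to ret_addr, nothing pushed
.halt         jmp     #.halt

func1_p       long    func1               ' pointers live in COG registers, below $200
func2_p       long    func2
ret_addr      long    0

              org     $200                ' LUT code; must be copied into LUT before the calls above run
func1         nop
              ret                         ' pops the hardware stack, so it pairs with CALL
func2         nop
              jmp     ret_addr            ' returns through the register that CALLD wrote
```

With `calld func1_p,func1_p`, D and S are the same register, so the pointer is replaced by the return address on the first call, and a routine ending in RET pops a stack entry that CALLD never pushed. That would be consistent with the failure described above.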

evanh · 2024-06-28 20:06

@"Larry Martin" said:
I made the pointers but forgot to change my calls. The code seemed to work anyway. Can you think of an explanation?

As Ada just hinted, there's nothing to manually change in the wider sources. The ORG 512 takes care of it for you. That's how ORG works, it informs the assembler of where the code is expected to be. You would be getting crash behaviour if you failed to copy the code there before making a call to it.

Larry Martin · 2024-06-29 00:39

Ok, thanks.
