SETQ2 with LUT RAM
ke4pjw
I need a quick sanity check. SETQ2 would be used with RDLONG and WRLUT, correct? Something like:
        mov     t2, #$200       ' Set the LUT address in t2
        setq2   #10-1           ' get 10 longs from hub to LUT
        rdlong  t1, ptra        ' populate intermediate register with HUB data @ptra
        wrlut   t1, t2          ' write intermediate register data into LUT @t2
Would it increment the intermediate cog register (t2) that is used to shuffle data from HUBram into the LUT?
Am I even close to getting this right?
As always, thanks in advance.
--Terry
Comments
Just the SETQ2+RDLONG is all you need. The SETQ2 completely changes how RDLONG operates.
What you've got there copies ten longwords from hubRAM (beginning at address ptra) to lutRAM (beginning at address t1). The WRLUT then places the single value from t1 into lutRAM at the address held in t2 ($200).
There are four types of modal instructions like SETQ. They all have one thing in common: they put a temporary hold on interrupts.
In no particular order:
@evanh so this should work?
No, you need
Yeah, the answer looks odd because the RDLONG assembly syntax is ill-fitting for this. Both operands are addresses but D is immediate direct mode while S is direct indirect mode. You get that when it's a repurposed instruction. ... And then, just to add fun, there's the memory map differences between data space and program space.
Uh, now I am really confused. What is the ($200-$200) syntax? Isn't that $00 ? I thought LUT memory started at $0200
Sorry for being so thick.
Yeah, understandable. Data addressing of lutRAM starts at zero. Only the instruction fetching for program execution maps lutRAM to $200 to $3ff. It's maybe something Chip should've aligned but it wasn't thought much about at the time. EDIT: Actually, there is a good reason: Direct address values (encoded in the instruction itself) are limited to 9 bits. By keeping the addresses in the 0 to $1ff range the program is more compact and faster running.
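To make the point about data addressing concrete, here is a minimal sketch of a hub-to-LUT block copy. It is an illustration, not code recovered from the thread. It assumes PTRA already holds the hub address of the source data and that ten longs are wanted, as in the opening post.

```spin2
' Sketch only, not from the thread.
' Assumption: PTRA already holds the hub address of the source data.
        setq2   #10-1           ' block length minus one: 10 longs
        rdlong  0, ptra         ' D = first lutRAM data address (0), written without "#"
' Result: lutRAM data addresses 0..9 hold the ten hub longs.
' The same longs sit at $200..$209 when lutRAM is used as program space.
```

No WRLUT and no intermediate register are involved: the SETQ2 prefix turns the RDLONG into the whole transfer.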
Got it. One last, and very dumb question. Can LUTs be reserved for symbols? If so, how is it done?
Yes, as program labels. But that means if you use them for a data access (RDLUT/WRLUT) then it's up to you to subtract the $200 appropriately.
How would that look? I mean, how would it make its way into the LUT address space?
Here's a list of addressing modes:
Wait, I can use a constant to point to a specific LUT address if I wanted to, right?
Yep. When it's the S operand it has a preceding #. When it's the special-case block copy then no #.
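A small sketch of how a LUT label might be used both ways, going by the replies above. The names buf, t1 and demo are hypothetical, and PTRA is assumed to hold a valid hub address already. Treat it as an illustration of the subtraction and of where the # goes, not as code from the thread.

```spin2
' Sketch only. "buf", "t1" and "demo" are hypothetical names.
DAT
        org     0
demo    wrlut   t1, #buf-$200   ' single write: S is an immediate, so it takes "#"
        setq2   #16-1           ' block copy of 16 longs from hub (PTRA assumed set)
        rdlong  buf-$200, ptra  ' block copy: D is the start address, no "#"
        jmp     #$              ' hypothetical stopping point for the sketch
t1      long    0

        org     $200            ' labels from here on get program-space values $200..$3FF
buf     res     16              ' hypothetical 16-long area; label value is $200
```

Subtracting $200 turns the program-space label into the data-space address that RDLUT, WRLUT and the block copy expect.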
Hi,
struggling with block move to LUT too. The code is executing from HUB.
I had:
which works.
(FOR: ... NEXT: r2 is a Taqoz macro for the loop, using djnz.)
And now I am trying to use a block move for the LUT part:
It does not work. I want to replace the cog move section later, but started with the LUT first here.
I don't understand why ($200-$200) is used in post #5 instead of $0. Is this a way to preload register 0 with zero?
Is it necessary to use a pointer register, meaning r4 will not work?
Edit: Hm, changed to use PTRA, did not work either.
What's wrong with my snippet? Thanks for any hints!
Christof
The r0 is wrong. The register number itself is the cogRAM or lutRAM address. The earlier ALTD is the solution to indexing the D operand. But you aren't wanting to index it so do this instead:
She's used $200-$200 as a kind of comment, an ill-fitting nod to lutRAM's program space, which begins at address $200.
It equates to zero, and zero is the correct first address of lutRAM data space. Using a plain zero is sensible.
Thank you very, very much! I had spent several hours on this mystery: trying to isolate it, checking the code that the assembler produces, experimenting...
So this is the code that works:
Don't you want to write a book on P2?
Christof
Ha, no way! I don't have the ambition. Nor do I have the will to even catalogue a list to write about. Not to mention I'd get sidetracked and not finish anything anyway.
BTW, does PTRA addressing work with WRLUT? I mean, can I write
to write a long to LUT with auto-incrementing pointer? And do I have to put $200 into ptra or $0?
And yes, somebody should finally finish the %&#! documentation of the P2. The assembly language manual still misses the explanation of some of the most important instructions.
Either. The addresses are truncated to 9 bits.
https://p2docs.github.io/lutmem.html#wrlut
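For readers following along, a minimal sketch of what such a write could look like, based on the reply above. The register name t1 is hypothetical, and how far the pointer steps per write is not stated in this thread, so check the linked WRLUT documentation before relying on it.

```spin2
' Sketch only. "t1" is a hypothetical register holding the value to store.
        mov     ptra, #0        ' per the reply above, $200 would select the same LUT long
        wrlut   t1, ptra++      ' write t1 into lutRAM at PTRA, then advance the pointer
```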
Great, thanks! So you wrote the book after all! I think somebody should put a link to it here.
.... still awfully struggling with setq.
The last long of LUT (at least) causes trouble. It gets overwritten with wrong values.
This is the routine, which swaps the tasks. For this it swaps the complete LUT and also the registers of the FORTH machine. It executes from HUB.
This is the slow version, that works:
What am I overlooking? Is this some timing problem? Is there some difference between rdlong and wrlong in combination with setq2?
Edit:
If I use lut510 instead of lut511, my code works. The slow version, though, works with lut511.
If someone can see something, I would be grateful!
Christof
Hm, spent some more hours here.
Is it possible to fill PTRB (register 505) with a fast block move using setq? Something very strange seems to happen here.
Edit: Very strange indeed, if I fill up to 506, which is DIRA, then PTRB gets filled correctly.
The last register is filled with garbage???
Is this known?
I don't think I ever tested interaction of block moves with hardware registers. Might actually be a real chip bug.
Thanks for the reply! It gave me some encouragement that I am not completely crazy.
So I did some more investigation and wrote a bug report. https://forums.parallax.com/discussion/175592/bug-in-setq-for-fast-block-move#latest
If you are ever looking for an example for https://p2docs.github.io/hubmem.html#block-transfers , you could consider the following code, which swaps all of LUT and also the relevant parts of the Taqoz Forth virtual processor in COG memory. There are 2 buffers in register a and b. As I am fed up with this, PTRA is also done the slow way.... These block moves make the code about 9 times faster than conventional loops.
Christof
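Since the posted code is not reproduced here, the following is only a rough sketch of the LUT half of such a swap, not Christof's routine. It assumes PTRA holds the hub address of the buffer that receives the outgoing task's LUT image and PTRB the hub address of the incoming task's image. Both are assumptions for illustration; a Taqoz system may well have these pointers in use.

```spin2
' Sketch only. Assumptions: PTRA = hub buffer for the outgoing LUT image,
' PTRB = hub buffer holding the incoming LUT image.
        setq2   #512-1          ' all 512 LUT longs; the thread reports trouble with the last one
        wrlong  0, ptra         ' lutRAM[0..] -> hub at PTRA
        setq2   #512-1
        rdlong  0, ptrb         ' hub at PTRB -> lutRAM[0..]
```

Given the report earlier in the thread that lut511 misbehaved where lut510 worked, it seems prudent to verify the final long after the transfer, or to move it separately.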