Correct syntax to reserve the first n longs for a run-time fast lookup table

godzich · 2017-07-16 19:22

Hi,

I want to run PASM in one cog so that I have 64 longs from address 0 reserved for a run-time generated table that I can access as fast as possible within the same COG's PASM program. The table must start at COG address 0 to have the fastest access possible (no table base address addition before accessing the data). What is the proper syntax for this? To save precious HUB memory I want to use the RES directive before my code, since the table is generated at the beginning of the PASM program. Can this be done? Which is the proper syntax?

Cheers,

Christian

frank freedman · 2017-07-16 23:26

The Prop manual says to put RES after your code, but I played with this and used a long $00_00_00_00[64] as the first line after org, and it appeared to work. I modified the blink demo from the online Propeller manual. Alternatively, though I have not made it work, the @entry_name in coginit/cognew seems to imply that the value could be a non-zero starting address, so that coginit(cog#, @starting_address, par_data) would start cog # at the point where your code begins (located after the lookup table area), with a value passed to par. Not sure if I did something wrong in how I set it up or if a cog can only start running code from address 0. Someone may have that answer out there!

{{ AssemblyToggle.spin }}
CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

PUB Main
  {Launch cog to toggle P16 endlessly}
  coginit(1, @start, 0)                 'Launch new cog

DAT
              {Toggle P16}
              org     0                 'Begin at Cog RAM addr 0
start         long    $00_00_00_00[64]  'Table space: 64 zero longs, which execute as NOPs
Toggle        mov     dira, Pin         'Set Pin to output
              mov     Time, cnt         'Calculate delay time
              add     Time, #9          'Set minimum delay here
:loop         waitcnt Time, Delay       'Wait
              xor     outa, Pin         'Toggle Pin
              jmp     #:loop            'Loop endlessly
Pin           long    |< 16             'Pin number
Delay         long    6_000_000         'Clock cycles to delay
Time          res     1                 'System Counter Workspace
              fit     496

Dave Hein · 2017-07-17 00:16

Comment removed.

Cluso99 · 2017-07-17 00:24

There are two ways that I know will work.

1. As Frank says, place 64 long zeros (NOPs) at the beginning and your code will fall thru the NOPs.

2. Place a JMP to the start of your code at Cog $000, then the remaining 63 longs of your table, followed by the first entry of your table (the one replaced by the JMP), then the start of your code. Label the first entry of your table TABLE0. Then at the start of your code (following the JMP, the 63 table longs, and the first table entry), perform a MOV $0, TABLE0.

Hope this helps.
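
A minimal sketch of option 2, assuming a statically initialised table. The labels `table` and `entry` and the zero placeholder values are hypothetical; only TABLE0 comes from the description above. Note that this layout still costs the full 64 longs of hub space.

```spin
DAT
              org     0
table         jmp     #entry            ' cog $000: skip over the table (overwritten below)
              long    0[63]             ' table entries 1..63 (placeholder values)
TABLE0        long    0                 ' the real value of table entry 0, at cog address 64
entry         mov     table, TABLE0     ' replace the JMP at $000 with entry 0
              ' ... rest of the program; the table now occupies 0..63 ...
              fit     496
```

After the first instruction at `entry` has run, address 0 holds table data instead of the JMP, so the program must never jump back to $000.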

frank freedman · 2017-07-17 02:01

Being careful not to hijack the OP's thread: does a cog-based program always start executing at cog address 0? So he reserves space with zeros as I did, then generates or loads the LUT into the 64 lowest addresses and continues as planned? Or he codes a static LUT with a jump in the first instruction, puts the correct value in place of the jump instruction at 0, and continues as planned?

Phil Pilgrim (PhiPi) · 2017-07-17 05:15

But the OP's objective was to not use up hub space with the table, which the long $00_00_00_00[64] will do. An option would be to write a short piece of code at the beginning of the cog that relocates the actual program that immediately follows it to location 64, then jumps to it. The relocation code should start with an org 0; the code to be relocated, an org 64, so its internal references work out right at the new location.

-Phil

Ariba · 2017-07-17 07:07

What Phil says, he just posted while I prepared the code for my post ;-)
Here is a code snippet that shows the suggestion (not tested):

DAT
		org	0
IncDS		long	$201			'Increment D and S field
ccntr		mov	ccntr,#496-64		'number of longs to move
:moveup		mov	495,437			'start with top addresses
		sub	:moveup,IncDS		'next 1 address lower
		djnz	ccntr,#:moveup
		jmp	#64			'jump to start of moved code

		org	64
		<your code follows here>  	'will be placed at addr 64 after moveup executed

So you can reduce the overhead from 64 to 6 longs.

Andy

godzich · 2017-07-17 09:20

Hi all,

Thanks for all the replies. Phil and Ariba - you nailed it! I don't want to lose 64 HUB longs just for the sake of making room for an initially empty table in a COG.

Your idea of moving the code after it has started is genius. And then next I overwrite the code snippet that moved everything with my run-time generated table. But what happens to compile-time resolved absolute addresses and references? Will this really work? Oh, I see (thanks Phil for pointing it out): the org 64 directs the compiler to resolve all references properly, so that when this relocated code executes from 64 upwards, all references are correct.

This technique also means (?!!) that I cannot fill the COG completely with code - I will lose 64 longs at the end of the COG code space - and that is a no-go: my algorithm needs about all 496 available longs... Is this assumption correct?

So - to return to the original question: isn't there any trick using RES that would work??? How RES really works is still a little dim. Such a waste to only be able to use it at the END of your code!!??

The calling COG that starts my PASM COG uses a label "Start" ( cognew(@Start, 0) ). Does that indicate where execution starts? Or is it just used as a reference for the compiler to keep it on track? Does execution always start at 0?

As always, grateful for your comments and support with this very different but interesting processor. Still learning...

Cheers,

Christian

godzich · 2017-07-17 09:30

Hmm... probably the only room I lose is the instructions needed to transfer my code....

Christian

Cluso99 · 2017-07-17 10:05

Sorry I missed the part about saving the hub space.

Ariba's concept will work for you. Following the bit of code that moves the 496-64 longs up, calculate the table and put it into the cog, overwriting the startup code. You will need to be careful not to overwrite the code that is executing. You might have to put a tiny bit of code at the end of your cog that will be used as data space once everything is ready.

If you don't have enough space, then ask as I can help you get a few extra longs using the shadow ram in the register space. It's tricky but can be done if there is no other way.
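
As a rough sketch of that step, the fragment below fills cog addresses 0..63 from code that is already running at address 64 or above. It assumes it has been placed there by a relocation stub like the one earlier in the thread. The labels `entry`, `count`, `value` and `d_inc`, and the `add value, #3` computation, are hypothetical placeholders for the real table generator.

```spin
DAT
              org     64
entry         mov     count, #64        ' 64 table entries to generate
              movd    :store, #0        ' destination starts at cog address 0
              mov     value, #0         ' hypothetical generator state
:fill         add     value, #3         ' placeholder for the real computation
:store        mov     0-0, value        ' write one entry (D field is patched at run time)
              add     :store, d_inc     ' point the D field at the next long
              djnz    count, #:fill     ' repeat for all 64 entries
              ' ... main program continues here; the table is at 0..63 ...

d_inc         long    1 << 9            ' adds 1 to the D field (bits 17..9)
count         res     1
value         res     1
              fit     496
```

Because the loop runs from address 64 upwards, overwriting 0..63 cannot clobber the instructions that are executing. At least one instruction sits between each modification of `:store` and its execution, which the cog's instruction pipeline requires for self-modifying code.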

Ariba · 2017-07-17 10:38

You cannot have a 64-long table and 496 longs of code. The cog RAM has only 496 usable longs, which are shared by code, data and register variables.

The assembler builds a cog image in hub RAM. A cognew loads this image into a cog and starts execution at address 0. Cognew always loads 496 longs from the hub address you give in the second parameter. Normally you don't have code that spans the full 496 longs, so it additionally loads whatever is in hub RAM behind the cog image (mostly hub data or Spin code).

RES lets you make the cog image smaller if you have variables that don't need to be initialised. RES increments the cog address for the assembler, but does not reserve memory in hub RAM. This can only work at the end, not at the beginning. If you use it at the beginning of the image, the code that follows is assembled for higher cog addresses than the ones it is actually loaded at, so every reference in the image is displaced.
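
A small sketch of the usual layout, with hypothetical labels (`entry`, `limit`, `tmp`, `scratch`), may make the address bookkeeping clearer. The hub image here is only 4 longs, while the assembler hands out cog addresses 0..12.

```spin
DAT
              org     0
entry         mov     tmp, limit        ' cog address 0
              add     tmp, #1           ' cog address 1
              jmp     #entry            ' cog address 2

limit         long    1_000             ' cog address 3: initialised, occupies one hub long
tmp           res     1                 ' cog address 4: no hub long emitted
scratch       res     8                 ' cog addresses 5..12: still no hub longs
              fit     496
```

The RES locations start out holding whatever followed the image in hub RAM, so the code has to write them before reading them.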

If you want to save hub RAM, you can use the space of the cog image for other things, like a FIFO or sector buffer, after you have started the cog. The hub image is then no longer needed, as long as you don't want to start the same code in another cog, or stop and start the cog again.

Andy

Dave Hein · 2017-07-17 12:47

godzich, I now understand that your primary goal is saving hub memory. Have you tried re-using the cog image that's in hub memory for buffers? Once a cog is started the image in hub RAM is no longer needed, unless of course you need to re-start cogs later on. Also, are you using BST to compile your code? It has options for reducing code size, such as removing unused methods. I think OpenSpin does some of this, but I'm not sure if it does as much as BST.
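
A minimal sketch of reusing the image as a buffer, under the assumption that the cog is never restarted from it. The 1 ms pause, the 16-long image size and the label `entry` are illustrative choices, not taken from the thread.

```spin
PUB Main | i
  cognew(@entry, 0)                     ' cog loads 496 longs starting at @entry
  waitcnt(clkfreq / 1000 + cnt)         ' generous pause so the load has finished
  repeat i from 0 to 15
    long[@entry][i] := 0                ' the image's hub longs now serve as a buffer

DAT
              org     0
entry         jmp     #entry            ' placeholder program: spins forever
              long    0[15]             ' padding so the image spans 16 hub longs
```

Once the buffer has been written, the original image is gone, so this only suits code that is launched once.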

godzich · 2017-07-17 13:23

Hi,

@ Andy and Phil: thank you a zillion - works like a charm

I learned at the same time a neat trick about how to optimize code and run efficient loops using "local" count variables in a clever way (ccntr)! Those variables (ccntr and IncDS) must naturally be defined and referenced "locally" and cannot be variables that are defined after the org 64

Thanks for saving me 58 longs in HUB! Very helpful. I assume this is the only way to reserve n longs at the beginning of a cog while preserving HUB RAM at the same time. Obviously the HUB image can later be overwritten, but I need to restart or launch several COGs running this same code, so that is not an option in this case.

It is clear that the total size of all code, variables and constants can be at most 496 longs.

@Cluso99 - it would also be interesting to learn more about the shadow ram

Cheers,

Christian

Cluso99 · 2017-07-17 15:01

Each of the 16 special registers also has COG RAM that is in parallel to these registers. When you write to these registers, you also write a copy to the shadow cog RAM. So if you write to the read-only PAR register, it also writes this value to the shadow RAM at $1F0. Presume you wrote the value of a JMP $000 instruction there. If you executed an instruction at $1F0 (by jumping to it, or falling into it from $1EF), the instruction would be fetched from the shadow RAM, not from the PAR register. The first 4 registers can be used in this way. I use them to run an LMM loop to trace both the Interpreter and PASM code in my zero footprint debugger. It's not easy but it works. You can also use DIRB and OUTB as variables since Port B is not implemented in the Prop. INB can be used as zero as it's a read-only register.

Kuroneko is the specialist for tricks using the counters FRQA/B and PHSA/B.

frank freedman · 2017-07-17 16:20

Hm, thought relocating the code would be more involved. Nice. I did go on the assumption that the hub space could be recovered once the cog was loaded. What I found odd in both proptool and bst was that wherever I set the label in the calling coginit/cognew, the code in the cog would exist from that point. So if I changed the label from start in my modified blink, the first 64 slots would be gone as well as any code that would have been loaded before the label in the pasm code. Saw this behavior using Gear to simulate since the blink program stopped when the coginit used a label that was not at the first long in cog ram.

Does this mean cog execution always starts at zero, and that the compiler will generate code starting from the label in the coginit/cognew call, discarding any code before the label?

msrobots · 2017-07-17 19:50

Yes.

coginit/cognew load the code from the given address into the COG and then always start at COG address 0

Enjoy!

Mike
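
In practice this means the label passed to cognew/coginit should be the first long assembled under org 0. A minimal sketch with hypothetical labels `entry` and `pin`:

```spin
PUB Main
  cognew(@entry, 0)                     ' hub address of the first long of the image

DAT
              org     0                 ' assemble for cog address 0 ...
entry         mov     dira, pin         ' ... starting at the same label cognew is given
:loop         xor     outa, pin         ' toggle P16 as fast as possible
              jmp     #:loop
pin           long    |< 16             ' pin mask for P16
```

If the label passed to cognew sat 64 longs after the org 0, the code would still be loaded starting at cog address 0, but its references would be assembled for addresses 64 and up. That matches the stalled blink program described above.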

Phil Pilgrim (PhiPi) · 2017-07-17 20:06

I was just looking at Andy's very nice code again and might suggest a couple changes for clarity:

DAT
		org	0
:moveup		mov	495,437			'start with top addresses
		sub	:moveup,IncDS		'next 1 address lower
		djnz	ccntr,#:moveup
		jmp	#64			'jump to start of moved code

IncDS		long	$201			'Increment D and S field
ccntr		long	496-64			'number of longs to move

		org	64
		<your code follows here>  	'will be placed at addr 64 after moveup executed

This way, it's less "tricky" looking and still accomplishes the task in six longs.

-Phil

pjv · 2017-07-17 22:01

Phil;

But in "relocating" the code, what happens when the S field contains an immediate constant (#), and does not want to be adjusted? I suspect a few more lines of code are required to take care of such issues.

Cheers,

Peter (pjv)

Phil Pilgrim (PhiPi) · 2017-07-17 22:17

Peter,

The org 64 will take care of any such issues. The code will assemble as if it were already relocated at the new address. None of the code that's being moved gets altered in any way.

-Phil

pjv · 2017-07-17 22:52

Sorry Phil, my bad;

I did not read your code precisely enough.... I thought you were adjusting code assembled at org 0 , but on closer inspection of the line

sub :moveup,IncDS 'next 1 address lower

I now see it simply adjusts the addresses of the move pointers

Cheers,

Peter (pjv)
