[POC] reverse overlay loader aka cog-to-hub transfer
kuroneko
OK, I didn't have a rainy day but at least an odd idea. Turns out it actually works and can be used to transfer cog memory to hub RAM without penalty. The following example transfers 64 longs.
Enjoy!
''
'' cog to hub transfer
''
''        Author: Marko Lukat
'' Last modified: 2011/02/18
''       Version: 0.1
''
'' acknowledgements
'' - code based on work done by Phil Pilgrim (PhiPi) and Ray Rodrick (Cluso99)
''
CON
  _clkmode = XTAL1|PLL16X
  _xinfreq = 5_000_000

VAR
  long  guard_before, storage[64], guard_after

PUB null | n

  cognew(@entry, @storage{0})
  waitcnt(clkfreq + cnt)

  dira[16..23] := outa[16..23] := -1                    ' success (preset)

  if guard_before or guard_after
    outa[16..23] := $81                                 ' transfer failed
  else
    repeat n from 0 to 63
      if storage[n] <> $DEADBEEF
        outa[16..23] := $81                             ' transfer failed
        quit

  waitpne(0, 0, 0)

DAT             org     0

entry           movd    xfer_copy7, #data + 63          ' -4  move cog address into wrlong instruction
                movd    xfer_copy1, #data + 62          ' +0  move cog address into wrlong instruction

                movi    ctra, #%0_11111_000             ' +4  LOGIC always
'{optional}     long    0[3] {3 x nop}                  ' +8  adjustment
                mov     frqa, par                       ' +4  frqa gets added twice between modifying
                shr     frqa, #1                        ' +8  shadow[phsa] and wrlong

                mov     phsa, #256 - 1                  ' -4  hub byte count (8n + 7)

xfer_copy7      wrlong  0-0, phsa                       ' +0 = copy long from cog to hub
                sub     $-1, dst2                       ' +8  decrement cog address by 2
                sub     phsa, #7 wz                     ' -4  decrement hub length by 1 long (prev by 1, now by 7)
xfer_copy1      wrlong  0-0, phsa                       ' +0 = copy long from cog to hub
                sub     $-1, dst2                       ' +8  decrement cog address by 2
        if_nz   djnz    phsa, #xfer_copy7               ' -4  decrement hub length by 1 long (prev by 7, now by 1)

                cogid   cnt                             '
                cogstop cnt                             ' sayonara ...

' initialised data and/or presets

dst2            long    2 << 9
data            long    $DEADBEEF[64]

' uninitialised data and/or temporaries

                fit

DAT

Limitations
- transfers have to be in 2n long quantities (i.e. an even number of longs)
- consumes a counter
- [thread=141015]Fastes possible memory transfer[/thread] [sic]
- [thread=129701]PHSA: A question for kuroneko & an idea for storing data in unused cog ram (from OBC)[/thread]
- [thread=104167]Assembly Oververlay Loader for Cog FAST (renamed & released)[/thread] [sic]
- [thread=118012]Quick Cog-to-Hub transfer[/thread]
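To make the "2n longs" limitation a bit more concrete, here is a minimal Spin sketch of how the PHSA preset from the listing (`#256 - 1` for 64 longs) would scale with the transfer size. The names `phsa_preset` and `MAX_LONGS` are made up for illustration, and the upper bound of 128 is an assumption that only holds as long as the preset is loaded as a 9-bit immediate, as the listing does. The two `movd` operands would presumably follow the same pattern (`#data + count - 1` and `#data + count - 2`); this is inferred from the 63/62 pair above, not tested with other sizes.

```spin
CON
  MAX_LONGS = 128                       ' assumption: 4 * 128 - 1 = 511 is the largest 9-bit immediate

PUB phsa_preset(count) : preset
'' Returns the initial PHSA value for a transfer of count longs,
'' or -1 if count does not fit the scheme (odd, too small, too large).
'' Hypothetical helper, not part of the original object.
  if count < 2 or count > MAX_LONGS or (count & 1)
    return -1
  return 4 * count - 1                  ' hub byte count - 1, always of the form 8n + 7
```

For the 64-long example this gives 255, which matches the `#256 - 1` in the listing.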
Comments
FRQA = par >> 1 (halved because FRQA gets added twice between modifying shadow[PHSA] and the wrlong; the hub address in par is long-aligned, i.e. the 14-bit value is implicitly << 2)
PHSA = 255 originally = (4 * 64 longs) -1
we write the 2nd last long
PHSA = 255-7 wz (because we access the shadow cog RAM, it holds the value we wrote last, not the actual current PHSA value)
we write the last long
PHSA = 255-7-1 and provided z not set, repeat loop
The zero flag is necessary as I needed the lower 2 bits high (to get the sub #7/djnz going). Simply adding/or'ing them would have left me with 3 at the end of the transfer which can't be used as an exit condition. So -1 worked just as well (%--11 and usable exit condition).
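The walkthrough above can be replayed in plain Spin, without the counter, to see which hub address each wrlong ends up using and why the Z flag makes a clean exit. This is only a model of the arithmetic: `trace`, `base` and `log` are hypothetical names, `base` stands in for the long-aligned address passed via par, and the `& !3` mimics wrlong ignoring the lower 2 address bits. Timing (the hub window alignment noted in the listing's comments) is not modelled at all.

```spin
PUB trace(base, log) : count | frqa, shadow
'' base : long-aligned hub address, as it would arrive in par (hypothetical parameter)
'' log  : address of a 64-long array that receives the hub address of each write
'' Returns the number of writes recorded.
  frqa   := base >> 1                                   ' mov frqa, par / shr frqa, #1
  shadow := 256 - 1                                     ' mov phsa, #256 - 1
  repeat
    long[log][count++] := (shadow + 2 * frqa) & !3      ' xfer_copy7: frqa has been added twice
    shadow -= 7                                         ' sub phsa, #7 wz
    long[log][count++] := (shadow + 2 * frqa) & !3      ' xfer_copy1
    if shadow == 0                                      ' Z set by the sub: djnz is skipped
      quit
    shadow--                                            ' if_nz djnz phsa, #xfer_copy7
```

The first entry comes out as base + 252 (the last long in the block) and the last entry as base, 64 writes in total. The shadow value is 8n + 7 at the top of every pass, so its lower 2 bits are always set there and the subtraction of 7 lands exactly on 0 for the final pair.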
Just noticed a possible misunderstanding regarding last long. You probably meant the last one written whereas I refer to the last one in the block.