Another use for djnz
Alsowolfman
Posts: 65
There have been a number of posts where people use djnz to decrement a pointer, and I thought I would try taking this a little further (sorry to anyone who may have come up with this before me). One can use the line:
djnz :loop, #:loop
to jump to :loop and post-decrement the source field of the instruction at :loop. The following two instructions find whether an array in cog memory contains a value (a):
:loop cmp a, 0-0 wc, wz
if_b djnz :loop, #:loop
This first executes the compare between a and the register whose address is loaded into 0-0, then decrements that address (the decrement takes effect afterwards, due to pipelining), and then executes the compare between a and the register at the next lowest address.
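To make the pipeline effect concrete, here is a small Python model of the two-instruction loop. It is a sketch, not PASM and not cycle-accurate. It assumes, as described above, that the cmp which runs right after each djnz was fetched before the decrement was written back. The names search_descending, prime and cog are made up for the illustration, and the memory layout follows the demo code in this post (sentinel 0 at address 3, values 1..9 at addresses 4..12). Because the loop condition is if_b, the model stops at the first element that a is not below.

```python
SRC_MASK = 0x1FF  # the low 9 bits of a PASM instruction hold the source address


def search_descending(cog, start_addr, a, prime=True):
    """Model of   :loop  cmp a, 0-0 wc,wz  /  if_b djnz :loop, #:loop.

    cog        -- list of longs, indexed by cog address (hypothetical layout)
    start_addr -- source address placed in the cmp by movs
    a          -- value being searched for
    prime      -- model the extra djnz that runs before the loop is entered

    Returns (visited, stored): the addresses the cmp actually read, and the
    source field left in the instruction at :loop when the loop exits.
    """
    stored = start_addr   # source field as stored in cog RAM
    fetched = start_addr  # source field of the copy the cog has already fetched
    if prime:
        # The priming djnz decrements the stored copy, but the cmp that runs
        # next was fetched before that write-back, so it still reads start_addr.
        stored = (stored - 1) & SRC_MASK
    visited = []
    while True:
        visited.append(fetched)
        if not a < cog[fetched]:   # if_b fails: a is not below the element
            break
        # djnz taken: the next cmp is the copy fetched before this decrement lands
        fetched = stored
        stored = (stored - 1) & SRC_MASK
    return visited, stored


# Layout following the demo: 3 instructions, sentinel 0, then 1..9.
cog = [0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

visited, stored = search_descending(cog, start_addr=12, a=5)
print(visited, stored + 1)   # addresses compared, corrected match address

visited, stored = search_descending(cog, start_addr=12, a=5, prime=False)
print(visited, stored + 1)   # same search without the priming djnz
```

In this model the primed run reads each address once, while the unprimed run reads the first element twice. In both runs the stored source field ends one below the matching address, which is why one has to be added afterwards.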
There are not many situations where this is beneficial, but I am using it in a cog that has 4 nested loops that each run up to 80 times, so a single extra instruction can add seconds to the completion time.
Some things that make this harder to use are:
- the array must be loaded with the first data point at the highest address
- once the loop has completed, one must read the address back with the movs instruction and add one to find the address that contained the matching data.
Here is some working demo code that turns pin 5 high to demonstrate that it works:
CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

PUB main
  cognew(@work, 0)
  repeat

DAT
              org     0
work          movs    :loop, #:d              ' this gives the address after the main array
              nop
              djnz    :loop, #:loop           ' this jumps past the data; it is necessary to start with a djnz so the first cycle is not repeated twice
              long    0                       ' this fulfills the condition in :loop, and prevents it from doing strange things
              long    1,2,3,4,5,6,7,8
:d            long    9
:loop         cmp     a, 0-0          wc, wz
        if_b  djnz    :loop, #:loop           ' this works as a jump and post-decrement of the source register
              movs    b, :loop
              add     b, #1                   ' this corrects for the post-decrement of :loop
              cmp     b, #3           wz, wc  ' these two lines verify that it has not stopped due to the zero at the end
        if_e  jmp     #noend
              movs    :loop2, b
              nop
:loop2        mov     c, 0-0                  ' this retrieves the number from the array
              mov     b, #1
              shl     b, c
              mov     dira, b
              mov     outa, b
noend         jmp     #noend                  ' prevents execution of further data
a             long    5
b             long    0
c             long    0
I hope this helps someone, or is at least interesting.
Post Edited (Alsowolfman) : 9/16/2009 9:25:21 PM GMT
Comments
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nyamekye,
In your case, where the two labels are the same, you have to add an extra step to handle the pipeline delay.
The question is whether reverse indexing your data is worth 4 bytes & 4 cycles.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Composite NTSC sprite driver: Forum
NTSC & PAL driver templates: ObEx Forum
OnePinTVText driver: ObEx Forum
For example, in ZiCog we write data to the SD card. I suspect that quite often the write will be to the same sector that was just read. In any case, we are blocking the sectors, so we have to re-read the block before writing. If we compare the 128 bytes that we are writing with what we just read, we can determine whether a write is required. We only have to compare until the first byte that differs. In fact, we can compare 4 bytes at a time by using longs. Food for thought.
BTW a similar technique is used in my overlay loader which came from Phil's??? original work in reverse LMM.
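The compare-before-write idea above could look roughly like the following Python sketch. It is only an illustration of the logic, not ZiCog code: the function name needs_write is hypothetical, and the 128-byte block size is taken from the post. The block is compared one long (4 bytes) at a time and the loop stops at the first difference.

```python
import struct


def needs_write(old_block: bytes, new_block: bytes) -> bool:
    """Return True if new_block differs from old_block, comparing 4 bytes at a time."""
    if len(old_block) != len(new_block) or len(old_block) % 4:
        raise ValueError("blocks must be the same length and a multiple of 4 bytes")
    count = len(old_block) // 4
    # View both blocks as little-endian 32-bit longs.
    old_longs = struct.unpack(f"<{count}I", old_block)
    new_longs = struct.unpack(f"<{count}I", new_block)
    for old, new in zip(old_longs, new_longs):
        if old != new:
            return True   # first inequality: a write is required, stop comparing
    return False          # identical: the write can be skipped


just_read = bytes(range(128))
to_write = bytearray(just_read)
to_write[127] ^= 1        # change a single byte in the last long
print(needs_write(just_read, just_read), needs_write(just_read, bytes(to_write)))
```

In cog memory the same loop would walk two arrays of 32 longs, which is the case the pointer tricks in this thread are aimed at.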
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Links to other interesting threads:
· Home of the MultiBladeProps: TriBlade, RamBlade, RetroBlade, TwinBlade, SixBlade, website
· Single Board Computer: 3 Propeller ICs and a TriBladeProp board (ZiCog Z80 Emulator)
· Prop Tools under Development or Completed (Index)
· Emulators: Micros eg Altair, and Terminals eg VT100 (Index) ZiCog (Z80), MoCog (6809)
· Search the Propeller forums (uses advanced Google search)
My cruising website is: www.bluemagic.biz  MultiBladeProp is: www.bluemagic.biz/cluso.htm
ok, i posted it to tricks and traps, good idea.
ericball,
Yeah, it has a lot of shortcomings, but it is the fastest way that I have found to search a large array in cog memory.
cluso99,
It sounds like you are comparing the same positions in each array, a[0] to b[0], and a to b..., so I think it would be best to use the add %1_000000001 trick instead. The djnz trick is more useful if you are trying to find whether any two longs in any part of the two arrays meet some criterion, i.e. a[0] to b[0], a[0] to b... a to b[0]... To use the djnz trick for your case you would have to change the destination register every loop as well, so it would take 3 instructions and be very confusing. The add %1_000000001 trick would only take 3 instructions, and you would not have to store everything backwards.
Yeah, this is based on the reverse LMM method. I was trying to give credit in the first line, but I guess I should have been more explicit. I started from the reverse LMM idea of using djnz to decrement a pointer, and instead used it to decrement the source register directly.
***The add %1_000000001 trick, in case people have not seen it (I did not come up with this):
:loop cmp a, b wc, wz
add :loop, c
if_e jmp #:loop
c long %1_000000001
This adds one to both the source and destination fields of the instruction at :loop in one action.
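The arithmetic behind that constant can be checked with a short Python sketch. It relies only on the Propeller instruction layout, where the source field is bits 8..0 and the destination field is bits 17..9. The helper names fields and advance are made up here, and the upper bits of the example word are a placeholder, not a real cmp encoding.

```python
FIELD_MASK = 0x1FF               # source and destination fields are 9 bits each
DEST_SHIFT = 9                   # destination field sits just above the source field
BOTH_PLUS_ONE = 0b1_000000001    # bit 9 and bit 0 set: +1 destination, +1 source


def fields(instr):
    """Return (destination, source) fields of a 32-bit instruction word."""
    return (instr >> DEST_SHIFT) & FIELD_MASK, instr & FIELD_MASK


def advance(instr):
    """Model of 'add :loop, c' with c = %1_000000001."""
    # Caveat: if the source field is already 0x1FF, the carry spills into
    # the destination field.
    return (instr + BOTH_PLUS_ONE) & 0xFFFFFFFF


upper_bits = 0x2A << 18          # placeholder for opcode/flag/condition bits
instr = upper_bits | (0x010 << DEST_SHIFT) | 0x020
print([hex(f) for f in fields(advance(instr))])
```

A single add therefore steps both pointers forward, as long as neither 9-bit field overflows.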
One of the problems ericball mentioned is in these two lines from your first post:
This will not work without an instruction before cmp. You have to fill the pipe.
Edit: What was I thinking? Another broken fragment removed :)
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Steve
Propeller Tools
- Propalyzer: Propeller PC Logic Analyzer
- BMA: An on-chip PASM Debugger
- SPUD: Spin Source Level Debugger
Post Edited (jazzed) : 9/18/2009 6:33:33 PM GMT
        CURRENT instruction          LAST/NEXT instruction
T=-1:                                Fetch "source" for LAST operation
T=0:    Fetch instruction            Perform LAST operation
T=1:    Decode instruction           Store result back into "dest"
T=2:    Fetch "dest" operand
T=3:    Fetch "source" operand
T=4:    Perform operation            Fetch NEXT instruction
T=5:    Store result into "dest"     Decode NEXT instruction
T=6:                                 Fetch "dest" for NEXT instruction
The first time the cmp executes, it uses the source address that is originally there; while it is performing the cmp, the new source address is written. When it loops back to the cmp the second time, it uses the address put there by the first djnz, and while it is performing that cmp, the address changes again...
This does work; test the code, it should pull pin 5 high.
Granted, the cmp instruction source 0-0 at :loop is post decremented. Why is #:loop in the djnz instruction decremented before the jump happens?
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Steve
Propeller Tools
The djnz loop is no different (except that it's a loop) from the following which will always throw a flag for me.
In this case dst will not get the content pointed to by the ndx value unless there is something in the pipe.
So, you are relying on the previous value of the source in "cmp a, source wc,wz" to be the old value. Mmm, OK.
Looks like a good trick, and it's nice to have a fresh tool in the box, but I would have to put a big fat comment
on it to keep from getting confused.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Steve
Propeller Tools