PASM Cycle Timing
hippy
Do these two sequences have the same cycle timing between Start and IsZero/NotZero, or does the IF_Z make Start to NotZero take fewer cycles in the second example?
:Start          mov     acc,#1
                tjz     acc,#:IsZero
:NotZero

:Start          mov     acc,#1 WZ
        IF_Z    jmp     #:IsZero
:NotZero

Comments
-Phil
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
'Still some PropSTICK Kit bare PCBs left!
' debug
                mov     t1, par
                add     t1, #8
                mov     t_in, cnt
                mov     cr_temp,#1 wz
                'tjz    cr_temp,#:IsZero
        IF_Z    jmp     #:IsZero
:NotZero
                mov     t_last, cnt
:IsZero
                subs    t_last, t_in
                abs     t_last, t_last
                wrlong  t_last, t1
                add     t1, #4
:hard_stop
                jmp     #:hard_stop
My reported values were 16 and 12 for the TJZ and IF_Z, respectively. (note: I would comment out one line for each version...this code is for the IF_Z version)
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
lonesock
Piranha are people too.
I don't have hardware set up to test it and the manual seemed unclear; however, I have now found this: "When an instruction's condition evaluates to FALSE, the instruction dynamically becomes a NOP, elapsing 4 clock cycles but affecting no flags or registers" (page 368).
It seems then that "IF_Z JMP" is always faster than "TJZ", only ever taking 4 cycles.
@ lonesock : Thanks for taking the time to test it.
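The arithmetic behind the measured 16 and 12 can be sketched as a small tally. This is only a model, not Propeller code: it assumes 4 clocks for an ordinary instruction, 4 clocks for an instruction whose condition is false (per the manual quote above), 4 clocks for a TJZ that jumps and 8 for one that falls through, and it assumes the two cnt reads are sampled at the same point within their respective mov instructions. The function names are made up for illustration.

```python
# Assumed Propeller 1 clock costs (check them against your copy of the manual).
CLOCKS_NORMAL = 4        # mov, jmp and most other cog instructions
CLOCKS_SKIPPED = 4       # condition false: the instruction acts as a NOP
CLOCKS_TJZ_JUMP = 4      # tjz when the jump is taken
CLOCKS_TJZ_NO_JUMP = 8   # tjz when it falls through


def clocks_tjz(value):
    """Clocks from 'mov t_in, cnt' until the instruction after the tjz starts."""
    clocks = CLOCKS_NORMAL                  # mov t_in, cnt
    clocks += CLOCKS_NORMAL                 # mov cr_temp, #value
    clocks += CLOCKS_TJZ_JUMP if value == 0 else CLOCKS_TJZ_NO_JUMP
    return clocks


def clocks_if_z_jmp(value):
    """Same span, but using 'mov ... wz' followed by 'IF_Z jmp'."""
    clocks = CLOCKS_NORMAL                  # mov t_in, cnt
    clocks += CLOCKS_NORMAL                 # mov cr_temp, #value wz
    clocks += CLOCKS_NORMAL if value == 0 else CLOCKS_SKIPPED
    return clocks


if __name__ == "__main__":
    for value in (1, 0):
        print(value, clocks_tjz(value), clocks_if_z_jmp(value))
```

Under these assumptions the model gives 16 and 12 for a non-zero value, matching lonesock's figures, and the two forms cost the same when the jump is actually taken.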
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Sorry, I misread the intent of your question. But lonesock's answer prodded me to take another look at the pipeline. Paul Baker explained it best:
Here are some observations relative to your question:
1. Since the status bits get written during the R stage, they are not yet ready before the next instruction is decoded. So the decision whether to execute the next instruction has to be deferred at least to the S stage, where the source (i.e. the destination of a jump) is fetched.
2. For JMP-type instructions, the destination operand has to be loaded into the PC before the I stage. So the decision whether to execute has to occur before that. I guess this could occur at either the S or D stage. Maybe S isn't even loaded if a jump isn't taken, but I suspect that it is and that the PC is loaded during D.
3. Once D passes, and the PC is loaded for the next instruction, the die is cast. If the E stage determines that the jump shouldn't be taken after all (e.g. TJZ, DJNZ), the pipeline has to be flushed and PC reloaded with the address of the next instruction, which entails an extra four clocks.
It would appear that the earlier decision implied by the IF_NZ is what makes the shorter execution time possible in a no-jump situation.
At the risk of changing the subject (in order not to copy Paul's quote to yet another thread), how many times have we done something like this:
                movs    nxt,reg
                nop
nxt             oper    dest,#0-0
when we could've done this:
                mov     src,reg
nxt             oper    dest,src
                ...
src             long    0-0
According to the pipeline, the NOP is unnecessary in the second example. Of course, similar savings are not possible if it wasn't an immediate operand to begin with.
-Phil
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
This arose while writing yet another Virtual Machine which handles indirection ...
                mov     acc,src
                test    opc,#%0101 WZ
        IF_NZ   rdlong  acc,acc
                test    opc,#%1000 WZ
        IF_NZ   rdlong  acc,acc
The question which came to mind was: how fast is this when 'opc' is zero? Do the 'rdlong' instructions all become 4 cycles, so it's as fast as can be, or will each wait for hub access and slow down execution even though they effectively do nothing?
It then struck me that my cycle counting may have 'gone to pot' anyway because of conditionals with 'jmp'.
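If the manual's statement holds for hub instructions too (a false condition turns any instruction into a 4-clock NOP), the indirection sequence can be tallied roughly as below. This is a sketch under assumptions, not a measurement: HUB_MIN and HUB_MAX are placeholder bounds for an executed rdlong (8 and 23 are the figures I recall from one manual revision; substitute the values from yours), and the upper bound is loose because it ignores the hub window alignment between two back-to-back reads.

```python
# Assumed clock costs; HUB_MIN/HUB_MAX are placeholders, not verified figures.
CLOCKS_NORMAL = 4
CLOCKS_SKIPPED = 4
HUB_MIN, HUB_MAX = 8, 23


def indirection_clocks(opc):
    """Return loose (min, max) clocks for the mov/test/rdlong/test/rdlong run."""
    lo = hi = CLOCKS_NORMAL                 # mov acc,src
    for mask in (0b0101, 0b1000):
        lo += CLOCKS_NORMAL                 # test opc,#mask WZ
        hi += CLOCKS_NORMAL
        if opc & mask:                      # IF_NZ rdlong executes
            lo += HUB_MIN
            hi += HUB_MAX
        else:                               # IF_NZ rdlong is skipped
            lo += CLOCKS_SKIPPED
            hi += CLOCKS_SKIPPED
    return lo, hi


if __name__ == "__main__":
    for opc in (0b0000, 0b0001, 0b1001):
        print(bin(opc), indirection_clocks(opc))
```

With 'opc' at zero both reads are skipped and the model gives a fixed 20 clocks, with no hub wait at all; only the reads that actually execute add the variable hub delay.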