PASM Cycle Timing
hippy
Do these two snippets have the same cycle timing between Start and IsZero/NotZero, or does the IF_Z make Start to NotZero take fewer cycles in the second example?
:Start        mov     acc,#1
              tjz     acc,#:IsZero
:NotZero

:Start        mov     acc,#1 WZ
        IF_Z  jmp     #:IsZero
:NotZero
Comments
-Phil
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
'Still some PropSTICK Kit bare PCBs left!
My reported values were 16 and 12 for the TJZ and IF_Z, respectively. (note: I would comment out one line for each version...this code is for the IF_Z version)
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
lonesock
Piranha are people too.
I don't have hardware set up to test it and the manual seemed unclear; however, I have now found this: "When an instruction’s condition evaluates to FALSE, the instruction dynamically becomes a NOP, elapsing 4 clock cycles but affecting no flags or registers" (page 368).
It seems then that "IF_Z JMP" is never slower than "TJZ": it only ever takes 4 cycles, whether or not the jump is taken, whereas "TJZ" costs more when it falls through.
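For anyone tallying this by hand, here is a minimal sketch of the arithmetic in Python (a model only, not PASM and not run on hardware). It assumes the figures quoted in this thread plus the manual's "4 or 8" clock cost for TJZ. The function names are made up for illustration, and the 4-clock measurement overhead is a guess, chosen only because it would reconcile the totals with lonesock's reported 16 and 12.

```python
# Cycle costs assumed from the thread and the Propeller manual:
# an ordinary instruction takes 4 clocks, a conditional instruction whose
# condition is false becomes a 4-clock NOP, and TJZ takes 4 clocks when it
# jumps but 8 when it falls through.
BASE = 4

# ASSUMPTION: the timing harness adds one extra 4-clock instruction.
MEASUREMENT_OVERHEAD = 4


def tjz_path(acc):
    """Clocks from :Start for 'mov acc,#n' followed by 'tjz acc,#:IsZero'."""
    mov = BASE
    tjz = BASE if acc == 0 else 2 * BASE  # falling through costs 4 extra
    return mov + tjz


def if_z_jmp_path(acc):
    """Clocks from :Start for 'mov acc,#n WZ' followed by 'IF_Z jmp #:IsZero'."""
    mov = BASE
    jmp = BASE  # 4 clocks whether it jumps (acc == 0) or becomes a NOP
    return mov + jmp


for acc in (0, 1):
    print(acc, tjz_path(acc), if_z_jmp_path(acc))
# prints "0 8 8" then "1 12 8"
```

With acc = 1, as in the original question, the model gives 12 clocks for the TJZ version and 8 for the IF_Z version; adding the assumed 4-clock overhead yields 16 and 12.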
@lonesock: Thanks for taking the time to test it.
Sorry, I misread the intent of your question. But lonesock's answer prodded me to take another look at the pipeline. Paul Baker explained it best:
Here are some observations relative to your question:
1. Since the status bits get written during the R stage, they are not yet ready before the next instruction is decoded. So the decision whether to execute the next instruction has to be deferred at least to the S stage, where the source (i.e. the destination of a jump) is fetched.
2. For JMP-type instructions, the destination operand has to be loaded into the PC before the I stage. So the decision whether to execute has to occur before that. I guess this could occur at either the S or D stage. Maybe S isn't even loaded if a jump isn't taken, but I suspect that it is and that the PC is loaded during D.
3. Once D passes, and the PC is loaded for the next instruction, the die is cast. If the E stage determines that the jump shouldn't be taken after all (e.g. TJZ, DJNZ), the pipeline has to be flushed and PC reloaded with the address of the next instruction, which entails an extra four clocks.
It would appear that the earlier decision implied by the IF_NZ is what makes the shorter execution time possible in a no-jump situation.
At the risk of changing the subject (in order not to copy Paul's quote to yet another thread), how many times have we done something like this:
when we could've done this:
According to the pipeline, the NOP is unnecessary in the second example. Of course, similar savings are not possible if it wasn't an immediate operand to begin with.
-Phil
This arose while writing yet another Virtual Machine which handles indirection ...
The question which came to mind was: how fast is this when 'opc' is zero? Do the 'rdlong' instructions all become 4 cycles, so it's as fast as can be, or will each wait for hub access and slow down execution even though they effectively do nothing?
It then struck me that my cycle counting may have 'gone to pot' anyway because of conditionals with 'jmp'.
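On the 'rdlong' question, taking the manual's statement quoted earlier at face value (a false condition turns any instruction into a 4-clock NOP), a skipped 'rdlong' should not wait for the hub. Below is a rough Python sketch of that tally, not a measurement. The function name is hypothetical, it assumes every 'rdlong' in the run carries the same condition, and the 16-clock figure for an executed 'rdlong' is only a placeholder, since the real cost depends on hub-window alignment.

```python
# Clocks for any instruction whose condition is false (manual, page 368).
SKIPPED = 4


def rdlong_run_cycles(num_rdlongs, condition_true, hub_cycles_each):
    """Clocks spent in a run of identically conditioned rdlong instructions.

    hub_cycles_each is a caller-supplied placeholder for the cost of one
    executed rdlong; it is an assumption, not a documented constant.
    """
    per_instruction = hub_cycles_each if condition_true else SKIPPED
    return num_rdlongs * per_instruction


# Three conditional rdlongs with the condition false, e.g. 'opc' is zero.
print(rdlong_run_cycles(3, False, hub_cycles_each=16))  # prints 12
# The same three when the condition is true, using the placeholder cost.
print(rdlong_run_cycles(3, True, hub_cycles_each=16))   # prints 48
```

If that reading is right, the zero case costs a flat 4 clocks per skipped instruction regardless of where the hub window happens to be.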