[resolved][puzzle] singularity
kuroneko
Posts: 3,623
Disclaimer: This is not a puzzle in the sense of the one(s) previously posted. I know the result but no information available to me can explain the behaviour.
The special register area (SRA) can be used for storing data and - if done carefully - can be used for code as well (if you don't mind leaving dira alone and don't require counters and/or video). The odd thing here is that while vscl ($1FF) can be written to with any value (and you'll read back the same value) it is completely ignored for code execution.
In the code fragment below I fill the SRA (excluding dira) with an ordinary add instruction (keeps counters and video off). So the expected result is $8F. However, all I get is $8E. This can be narrowed down to code not being executed at address $1FF.
Question is why? My guess would be that it somehow involves handling the address wrap-around (execution continues at $000). Sampling cnt before the jump to SRA and after the jmpret as a result of the wrap-around shows that 18 4-cycle instructions are executed (14x add, 1x nop (a), 1x nop (b), 2x jmp[ret]).
Thoughts?
Solution: It turns out that you can in fact execute code at $1FF provided you don't go there by normal means (jmp & Co, arriving via $1FE). All it takes is to pretend that you're not actually at $1FF while the instruction is executed. The opposite is also true, any $xxx:$000 phase jump will only execute the instruction at $000 because the PC is set to $1FF while we are at $xxx, meaning the first target is ignored.
DAT             org     0

start           jmpret  $, #setup
                wrlong  report, par
                cogid   cnt
                cogstop cnt

setup           mov     $1F0, inst
                mov     $1F1, inst
                mov     $1F2, inst
                mov     $1F3, inst
                mov     $1F4, inst
                mov     $1F5, inst
                mov     $1F6, #0                ' avoid dira
                mov     $1F7, inst
                mov     $1F8, inst
                mov     $1F9, inst
                mov     $1FA, inst
                mov     $1FB, inst
                mov     $1FC, inst
                mov     $1FD, inst
                mov     $1FE, inst
                mov     $1FF, inst
                jmp     #par

inst            add     report, #1

report          long    %1000_0000

                fit
(a) dira being left alone
(b) add at $1FF, isn't executed but consumes 4 cycles
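For anyone who wants to replay the bookkeeping on the host side, here is a minimal sketch. It is Python rather than PASM and does not emulate a cog; it only counts slots, taking as given the two observations above (dira holds 0 and acts as a nop, and the add at $1FF is skipped but still costs 4 cycles). The name run_sra and its parameters are made up for this illustration.

```python
SRA_START, SRA_END, DIRA = 0x1F0, 0x1FF, 0x1F6

def run_sra(abort_at_1ff=True, report=0x80):
    """Count what happens while execution walks $1F0..$1FF once."""
    slots = 0
    for addr in range(SRA_START, SRA_END + 1):
        slots += 1                      # every slot costs one 4-cycle instruction
        if addr == DIRA:
            continue                    # dira was set to 0, counted as a nop
        if addr == SRA_END and abort_at_1ff:
            continue                    # observed: the add at $1FF has no effect
        report += 1                     # add report, #1
    return report, slots

report, slots = run_sra()
total = slots + 2                       # plus the jmp into the SRA and the jmpret at $000
print(hex(report), total, total * 4)    # value, instructions, clock cycles
```

With abort_at_1ff=False the same walk gives the $8F that was expected in the first place.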
Post Edited (kuroneko) : 12/16/2009 11:32:23 PM GMT
Comments
Maybe this answer is too simple? ... Edit: Oooops. Wrong question. That's some weird behavior.
Post Edited (jazzed) : 12/15/2009 5:46:30 AM GMT
When doing a cognew, the first instruction cycle might be a 'do nothing' cycle whose only job is to fetch the first instruction, which is then executed in the second cycle. So cognew would set the PC to $1FF and let that cycle do nothing but the wrap-around? That was at least my 'working assumption' when writing my Propeller simulator.
Maybe it's only the instruction fetch that has this kind of problem (i.e. it is hard-coded). I guess $1FF can still be used as a shadow register, can't it?
Post Edited (MagIO2) : 12/15/2009 2:26:31 PM GMT
I used the first 4 ($1F0-$1F3) to run an LMM program loop for my spin & pasm zero footprint debugger.
However, currently I do not have time to investigate further.
I re-read your findings. All I can suggest is that the wrap-around is detected and causes an abort of the current instruction at $1FF before the writeback phase. Execution then begins at $000. I guess the pipeline flush, which would cause a refetch in the I phase, does not occur (going by your findings with the cnt).
Post Edited (Cluso99) : 12/15/2009 3:13:47 PM GMT
After loading 512 longs into COG RAM the COG has to start execution. The goal is to run the program from the beginning of COG RAM. But as the Propeller uses pipelining, and the first stage of the pipeline is NOT the fetch of an instruction, this can't be done in a straightforward way by setting the PC to $000: with the PC at $000, the first instruction fetched would be the one at $001. How to solve it: set the address to $1FF during start of a COG.
The problem we still have is 'What happens in the first instruction cycle?', because there should be no extra logic (meaning wasted die space), or at least it should be as simple as possible. Solution: run the first cycle empty, except for the part that fetches the next instruction and increments the PC. So I believe that the address $1FF simply drives a signal line low (by a NAND of all address bits; NANDs are easy). This signal tells every stage, apart from the parts that fetch the next instruction and increment the PC, to do nothing.
I'd bet that the register that holds the current instruction is not initialized when starting a COG. So what's in there is simply undefined and needs to be skipped.
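To make the guess above concrete, here is a toy model of it. It is Python, not a description of the real silicon: the function run, the fetch-then-execute ordering and the string "instructions" are all assumptions chosen only to mirror the idea that a PC of $1FF disables everything except fetch and increment.

```python
TOP = 0x1FF  # all nine address bits set

def run(cog_ram, steps):
    """Step the toy pipeline; return the (pc, instruction) pairs actually executed."""
    pc, ir = TOP, None          # start-up state: PC at $1FF, instruction register undefined
    executed = []
    for _ in range(steps):
        if pc != TOP:           # the guessed "all bits set" line disables execution
            executed.append((pc, ir))
        pc = (pc + 1) & TOP     # fetch and increment keep running regardless
        ir = cog_ram[pc]
    return executed

ram = [f"insn@{a:03X}" for a in range(512)]
print(run(ram, 3))              # the first thing executed is the instruction at $000
```

In this model the same rule that skips the undefined start-up cycle also skips whatever sits at $1FF each time the PC wraps, which would match the missing add in the puzzle.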
Post Edited (MagIO2) : 12/15/2009 9:08:48 PM GMT
Ray, you wondered what a phase jump is good for. I had an idea this morning. What if I don't let execution continue (wrap) but get out of there immediately? So I came up with this:
If you activate the jmp #$1FF at (setup + 1) then you get old-style behaviour. Reported value is $C0. Pulling a $1FF:$002 phase jump I get $81 as the result. Very nice!
Which clears up two things IMO:
- execution at $1FF is in fact possible (result %x0xx_xxx1)
- a PC wrap will abort the current instruction (most likely to aid cog startup as MagIO2 pointed out)
Does that make sense?
Another interesting observation: if I place an immediate jump at $1FF then old-style execution will still have this instruction aborted, while a phase jump allows the jump to proceed (even though the phase jump itself is interrupted/aborted).
Post Edited (kuroneko) : 12/16/2009 3:24:19 AM GMT
Post Edited (kuroneko) : 12/16/2009 11:18:08 PM GMT
- Conditional Execution Encoding - initial claim
- Conditional Execution Encoding - transfer
It now dawns on me why you asked the question. Just having a fetch from #0 isn't that special, the actual condition is that the PC must be $1FF (which would normally lead to a fetch from #0). Serves me right for being on-line with a headache. But you're right, a jmp #0 will obviously involve a fetch from #0. Thanks for spotting that.
This puzzle is where it all started. What is a phase jump (thread 118159)? Nothing sinister, just a jump using a phase register (phsx) as jump target. This in itself (using a register as opposed to an immediate value) is nothing special until you activate the counter. You can do fun things like executing the two adds but not the sub (base = target - 2, frqx = 1).
So to sort out the mess from the last couple of days, any instruction where the PC is $1FF is aborted (nop). This is true for location $1FF and - because it becomes a nop - the next fetch is from #0. It's easy enough to check. Just put some evil instruction there and see how it is ignored (make sure you catch re-entry at #0).
How can this be simulated (as vscl isn't a nice place to be in)? We simply have to make the cog think that it's at $1FF when it's not.
- live registers like phsx are sampled during the e-phase (i.e. after the D-phase, SDeR)
- this means frqx has been added twice to the value phsx held at the start of the jmp phsx (call that value base)
- in other words, the jump target is base + 2*frqx
- the new PC is written during the R-phase, but as the counter is still active that will be base + 3*frqx
- this new PC is what is seen when the instruction at target is executed, and if it is $1FF the instruction is aborted
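Taking the bullets at face value, the address arithmetic can be tried out with a few lines of Python. This is only a calculator for the two expressions above; the function name phase_jump and the masking to 9 bits (on the assumption that only the low nine bits of the register value matter as a cog address) are mine, and the second pair of values was worked out by hand for illustration rather than taken from the thread.

```python
ADDR_MASK = 0x1FF  # cog addresses are 9 bits wide (assumed to be all that matters here)

def phase_jump(base, frqx):
    """Return (target, pc_seen, aborted) for a jmp phsx, following the bullets above."""
    target = (base + 2 * frqx) & ADDR_MASK    # phsx as sampled by the jmp
    pc_seen = (base + 3 * frqx) & ADDR_MASK   # PC written one accumulation later
    return target, pc_seen, pc_seen == ADDR_MASK

# The earlier example: base = target - 2, frqx = 1, with target = $010.
print(phase_jump(0x010 - 2, 1))
# A pair that also lands on $010 but leaves the PC reading $1FF.
print(phase_jump(0x032, 0x1EF))
```

The first call reports a PC that is not $1FF, so nothing is aborted; the second reports the same target with the PC at $1FF, which is the situation described in the last bullet.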
This gives us two equations with two variables. So for small enough target addresses we end up with this code sequence: