PHSA: A question for Kuroneko & an idea for storing data in unused cog ram (from OBC)
Cluso99
Posts: 18,069
OBC raised some interesting ideas and it seems RossH is using a similar idea in Catalina for temporary storage. Along those lines, here are a few ideas...
I have a fast overlay program which would work (ideas from others put together, as usual on this great forum). It unravels backwards, which is fine for transferring to the cog ram. However, this will not work for moving back to hub ram.
So I got to thinking (and not sure if anyone has thought about it or done it)... And yes, thinking is dangerous at my age
So, we need a loop like this....
loop    rdlong  *-*, PHSA       ' becomes wrlong for storing back to hub (easy, just set the NR bit appropriately)
        add     loop, H0200     ' inc to next cog address
        djnz    count, #loop
My assumptions are...
1. I believe we should be able to have PHSA increment by 4 every 16 clocks.
2. I believe we should be able to have PHSA incremented after each "loop" instruction has executed.
3. I believe we can use the PHSA this way (get the value of PHSA and not the shadow ram)
Are these correct assumptions ??? Has anyone done it ???
Now, if this works, I can probably get this to run in a zero footprint mode, and possibly grab another 4 longs in shadow space as well. This would give a cool 2,000 bytes exactly. Nice for a part of a screen buffer !!!
Currently no time to check this out. If I don't write it somewhere I will most likely forget.
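For anyone following along, here is a rough Python model (not PASM, and not tested on hardware) of what the loop in the post above does to its own instruction word. It assumes the standard Propeller 1 instruction layout (source in bits 8..0, destination in bits 17..9, R bit at bit 23) and that H0200 holds $200; the helper names are made up for illustration.

```python
# Python model of the self-modifying rdlong/wrlong loop; illustrative only.
D_SHIFT = 9          # destination field occupies bits 17..9
R_BIT = 1 << 23      # "write result" bit; clearing it gives the NR form
H0200 = 0x200        # assumed value of the H0200 constant: 1 << 9
PHSA_ADDR = 0x1FC    # cog address of the PHSA register


def encode(opcode, zcri, cond, d, s):
    """Pack the fields of a Propeller 1 instruction into a 32-bit word."""
    return (opcode << 26) | (zcri << 22) | (cond << 18) | (d << D_SHIFT) | s


def dest(instr):
    """Extract the 9-bit destination (cog address) field."""
    return (instr >> D_SHIFT) & 0x1FF


# rdlong 0, PHSA with the "always" condition; wrlong is the same word with R cleared.
RDLONG = encode(0b000010, 0b0010, 0b1111, 0, PHSA_ADDR)
WRLONG = RDLONG & ~R_BIT

# Each "add loop, H0200" bumps the destination field by one cog register.
instr = RDLONG
touched = []
for _ in range(4):
    touched.append(dest(instr))
    instr += H0200

print(hex(RDLONG), hex(WRLONG), touched)
```

The point is only that adding $200 walks the destination one cog register at a time, and that rdlong and wrlong differ by a single bit, so the same loop can serve both directions.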
Comments
Here is the thread in question [thread=118012]Quick Cog-to-Hub transfer[/thread].
Just re-looking at the counters to see how the PLL and the clocks are tied together. Any simple counter explanations anywhere (while I look)?
I have just been looking at the shadow ram and the use of the registers, etc. I am convinced I can get this side to work.
Define simple. What you see is what you get. I always found the data sheet and the counter app note sufficient. Sometimes a counter is just a counter
So the only way we can run the counter without outputting to pin(s) is PLL internal (video mode) %00001 and logic always %11111.
Both these add FRQA to PHSA on each clock cycle.
The PLL is on the output of PHSA bit 31, so it is no help to us.
We will not have a problem of counter jitter because we ignore the bottom 2 bits when accessing hub by longs.
So the only method found so far is the staging of every 4th long, then returning 3 more times, each time offsetting 1 more long.
Is this correct?
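To make the "every 4th long" idea concrete, here is a small Python model (a sketch, not PASM). It assumes FRQA = 1 so PHSA advances by 1 per clock, that the loop stays locked to one hub access every 16 clocks, and that the buffer starts at hub address 0; the function name is hypothetical.

```python
# Python model of staging every 4th long in four passes; illustrative only.
HUB_WINDOW = 16              # clocks between hub access slots for one cog
LONG_MASK = 0xFFFFFFFC       # hub long accesses ignore the bottom 2 address bits


def pass_longs(start_long, n_longs, frqa=1):
    """Return the long indices one pass touches, starting at start_long."""
    phsa = start_long * 4
    hit = []
    while phsa // 4 < n_longs:
        hit.append((phsa & LONG_MASK) // 4)
        phsa = (phsa + frqa * HUB_WINDOW) & 0xFFFFFFFF
    return hit


N = 16
passes = [pass_longs(k, N) for k in range(4)]
covered = sorted(i for p in passes for i in p)

# Up to 3 clocks of jitter lands on the same long because of the mask.
jitter_ok = all(((40 + j) & LONG_MASK) == 40 for j in range(4))

print(passes, covered, jitter_ok)
```

Under those assumptions each pass hits longs k, k+4, k+8, ... and the four passes together cover every long exactly once.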
Wish those PortB pins were inside!
As for not outputting stuff, just clear dira if you don't want to be seen
What are you saying here? Does this mean it will work but not output to the outside world. i.e. provided I am not using the eeprom, I could use this pin with DIRA as inputs??? Maybe a solution
If I have an input pin (such as the eeprom SDA when not using the eeprom) and I output counter A in /16 mode (but disable the DIRA bit so no effective output goes out the pin) and use that to enable the accumulation of FRQB into PHSB in CTRB, does this still work? If so, what do I get if I happen to read this pin with INA - do I see the outside world's pin value or the CTRA pin's value???
Yes, it pays to always keep someone around who doesn't know what can't be done.
They ask for things that people know can't be done, so they are never tried. (In this case, I'm that guy.)
...continues to watch the thread...
I can only add to that --- Read my FOOTER
For every eight clocks, you want phsa to increment by $408. So set frqa to $81. Then it's just a bunch of fiddling with addresses, clock counts, and pipeline stuff to figure out what to initialize phsa to before jmping to it.
-Phil
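The arithmetic behind the $81 figure can be checked with a few lines of Python (a model, not PASM). It assumes the counter adds FRQA to PHSA on every clock, as stated earlier in the thread. Splitting $408 into instruction fields is one reading of it, assuming PHSA holds the instruction word being executed (source in bits 8..0, destination in bits 17..9).

```python
# Python check of the FRQA = $81 arithmetic; illustrative only.
FRQA = 0x81


def phsa_after(clocks, start=0, frqa=FRQA):
    """PHSA value after a number of clocks, adding frqa each clock (32-bit wrap)."""
    return (start + clocks * frqa) & 0xFFFFFFFF


step = phsa_after(8) - phsa_after(0)   # growth over eight clocks
d_step = (step >> 9) & 0x1FF           # change seen in the destination field
s_step = step & 0x1FF                  # change seen in the source field

print(hex(step), d_step, s_step)
```

Read that way, every eight clocks the destination field moves on by 2 and the source field by 8. A carry out of the source field would spill into the destination field, which is part of the fiddling with the initial PHSA value.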
Yes, the transfer loop will be in the SFRs. Just not sure which ones yet. If at all possible, I want to reserve the first 4 to make 500 longs total (=2000 bytes). I will do a mix of LMM style here just like I did with my zero footprint debugger, but the copy loop will be SFR resident when run. I have been looking at the SFRs to see what I can use.
It's a real shame we didn't have the internal silicon for the PortB pins. I keep drooling at what we could do! (Never happy are we). Or even be able to feed the PLL output back into the second counter without using a pin - just a single gate enabled by an unused config bit. Isn't hindsight wonderful.
You know, I have never made such comments about any other micro I have used. Why? Because there were so many things it wasn't worth thinking about!!! It never ceases to amaze me just how much we can do with this prop. And it's first silicon. Just gotta say Chip... Congratulations!!!
There will still be 16 clocks between each rdlong/wrlong executed, as lonesock pointed out.
-Phil
Now, of course, none of this was supposed to be discovered. But we are a nosey bunch and we try to squeeze every last ounce out of this great chip, so we are always looking for other things to do. I think that is the elegance of Chip's design. It just does so much. If we had say an XYZ chip, we would be spending all our time using different XYZ variations, so we would never try the oddball things.
Here is a suggestion... We want some extra wires in this chip. Parallax has this great machine that can cut and add wires etc to a die. So, we should work out a modification and get Parallax to build us a special die, just for us fanatics !!! ..... OUCH.. I can hear Ken from here in Oz - sorry Ken
I have said before, I started doing the instruction functions of a cog on a Xilinx Spartan 3A FPGA. What I found when doing the emulation is that Chip is/was so smart. He has re-used lots of gates to make this regular instruction set do amazing things with a humongous saving in gates. Like a number of people here, I would like to see the schematics of the counters.
I think I have a grasp of the shadow ram. The I phase fetches the instruction, and that comes from the shadow ram. The S & D phases depend on the specific registers. You cannot use the D if you are fetching some registers, as in read/modify/write, as the fetch phase of D will return the shadow ram. The R phase varies with the register as some are read only, but it always writes to the shadow ram as well.
The archive contains a demo, the PASM section can be used from any calling language.
You have to pick speed or size.
Even though it misses the hub window and so takes 32 clocks per long, you do get an extra 2K in hub, and n x 2KB if you have n free cogs.
It is a pity we do not have a waitlock command, as we could use that to cut power when not required.
Update: Latest version (unreleased, counter based) offers full speed (clkfreq/4 B/s) and 486 longs available.
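For a feel of the numbers quoted in this thread, here is a short Python calculation (a sketch only). The 80 MHz clock is an assumption for illustration; the 16 and 32 clocks per long and the 500 and 486 long counts are the figures mentioned above.

```python
# Python arithmetic for the transfer rates and buffer sizes; illustrative only.
BYTES_PER_LONG = 4
CLKFREQ = 80_000_000   # assumed system clock, not stated in the thread


def bytes_per_second(clkfreq, clocks_per_long):
    """Transfer rate when one long moves every clocks_per_long clocks."""
    return clkfreq * BYTES_PER_LONG // clocks_per_long


full_speed = bytes_per_second(CLKFREQ, 16)   # one long per hub window
half_speed = bytes_per_second(CLKFREQ, 32)   # missing every other hub window

sizes = {longs: longs * BYTES_PER_LONG for longs in (500, 486)}

print(full_speed, half_speed, sizes)
```

At 16 clocks per long the rate works out to clkfreq/4 bytes per second, matching the "full speed" figure, and 32 clocks per long halves it.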