COGINIT's Dest Field [PASM]
Vega256
Posts: 197
Hey guys,
A question about the PASM version of COGINIT.
The Prop Manual version 1.2 says that the destination register of COGINIT is a 32-bit field which is supposed to hold PAR, start up address, and other information about the start-up cog.
I thought that the destination and source fields of PASM only allowed for 9-bit values. What am I missing?
Comments
Just to clarify Mike's comment, the 9-bit destination field contains the cog address of a register that, in turn, contains the PAR, code address, and cog ID.
-Phil
Yes. Read the other way, you would only be able to reference a 9-bit HUB RAM address, or pass a parameter that's 9 bits.
The key here is to understand what the # sign (octothorpe) does.
9 bits are specified either way. When there is no # sign, those 9 bits point to the COG memory address that contains the 32 bit value. When there is a # those 9 bits are the value to be used directly.
jmp #somewhere means jump to the address of somewhere.
jmp somewhere means jump to the address stored at somewhere.
How come # means the complete opposite in the case of jmp?
That is what I get for posting late
Re: JMP
Think of what gets loaded into the program counter. In both cases it's an address, right? The only discussion is where the address value comes from. Values put into the program counter are used as addresses. Values used in an add instruction, for example, are either just values or addresses to values.
Compare the two:
JMP #5
JMP 5
In both cases, the instruction contains the value 5. In the first case, the "#" means the value 5 is loaded directly into the program counter. In the second case, the value 5 is the address of the COG long that contains the value to be loaded into the program counter. Say COG long 5 contains 10. These would then be the same:
JMP #10
JMP 5
Of course, we use labels for all of that, but I find it easier to just plug numbers in sometimes.
Now the ADD:
ADD 10, #5
ADD 10, 5
Say COG long 5 contains 10 just as it did last time. In both cases, the value 5 is present in the COG long that contains the ADD instruction. In the first case, the value 5 is to be added directly to the value contained in COG long 10.
Say COG long 10 contains 3. After the addition, COG long 10 would contain 8, in the first case.
In the second case, the value 5 is the address of the value to be added to the contents of COG long 10. After the addition, the COG long 10 would contain 13.
It really isn't any different. In both cases, a value is either used directly, or as an address to a value. The difference lies in the program counter treating values it sees as addresses, because that is what it does.
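The JMP and ADD cases above can be checked with a toy model. This is only a sketch in Python, not real PASM tooling: the `Cog` class and its method names are made up for illustration, and the `immediate` argument stands in for the "#" in the source field.

```python
class Cog:
    """Toy model of a cog: 512 longs of cog RAM plus a program counter."""

    def __init__(self):
        self.ram = [0] * 512
        self.pc = 0

    def source(self, s, immediate):
        # The instruction always carries a 9-bit source field.
        s &= 0x1FF
        # With "#", the 9 bits are the value; without, they address a cog long.
        return s if immediate else self.ram[s]

    def jmp(self, s, immediate=False):
        # Whatever the source resolves to is loaded into the program counter.
        self.pc = self.source(s, immediate) & 0x1FF

    def add(self, d, s, immediate=False):
        # The destination field is always the address of a cog long.
        d &= 0x1FF
        self.ram[d] = (self.ram[d] + self.source(s, immediate)) & 0xFFFFFFFF


cog = Cog()
cog.ram[5] = 10   # COG long 5 contains 10
cog.ram[10] = 3   # COG long 10 contains 3
cog.add(10, 5, immediate=True)   # models ADD 10, #5
cog.jmp(5)                       # models JMP 5
```

In this model the only thing "#" changes is what `source` returns; JMP and ADD then use that result in their own way, which is the point made above.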
It doesn't at all. Consider this:
The 'jmp' opcode really means 'move into the pc' (jmp/call/ret are the only instructions that can access the pc directly). You also have to realize that each instruction lives in a cog register and that labels are just names for cog registers. The jmp instruction updates the pc; the instruction-execution unit then reads the instruction from cog RAM at the cog address held in the pc.
Yeah, that's right; I just got confused and second guessed myself. I've done asm for the Z80 and x86 architectures, so the program counter and how an instruction jump works isn't new to me. The notation is the complete opposite from my other assemblers. If I try to assemble something like this in Z80 asm,
JP 10
My assembler assumes that I mean literal 10, so if someLabel is at address 10, I can just put
JP someLabel
and it assumes the address at someLabel (which is 10). I don't need '#' anywhere; value means #value.
But back to the case of coginit. The destination field is really just the location where the 32-bit value is, like Phil said. So then, Should be
Without the #?
With other MCUs you are used to having R0 to R15, etc.
Think of the Prop as having R0 to R511.
You would never type #R15 on another MCU either, only #nn when you want an immediate value.
That the Prop can actually hold cog code in its R registers is a little harder to comprehend.
Also, regarding bit 3 of the bitfield, what does it mean to restart a cog? Does it just send the specified cog to some other place in hub ram?
From the manual:
If the third field bit is set (1), the Hub will start the next available (lowest-numbered inactive)
cog and return that cog’s ID in Destination (if the WR effect is specified).
If the third field bit is clear (0), the Hub will start or restart the cog identified by Destination’s
fourth field, bits 2:0.
"restart" just means that you have given coginit the id of a cog and set that bit 3 to zero and that the specified cog is already running something. It just stops that something and loads up your new code.
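The start/restart rule quoted from the manual can be sketched as a small selection function. This is a Python model for illustration only: `coginit_target` and the `active` list are hypothetical names, and returning `None` when every cog is busy is a simplification of what the hardware reports.

```python
def coginit_target(active, new, cog_id=0):
    """Pick which cog a COGINIT would (re)start.

    active: list of 8 booleans, True where that cog is already running.
    new:    models bit 3 of the destination value.
    cog_id: models bits 2:0 of the destination value.
    """
    if new:
        # Bit 3 set: take the lowest-numbered inactive cog.
        for index, running in enumerate(active):
            if not running:
                return index
        return None  # no free cog (simplified; not the hardware's encoding)
    # Bit 3 clear: use the given ID, whether or not that cog is running.
    return cog_id & 0b111
```

With bit 3 clear and the chosen cog already running, the result is the "restart" case: the same cog, stopped and loaded with new code.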
How come the second bit-field for coginit is the upper 14 bits of a 16-bit address? I thought hub ram was only 32k large. What are the first two bits set to?
That said, coginit only allows for 4n addresses (par and code base, 14+14+1+3), i.e. the lower two bits are always cut off.
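To make the 14+14+1+3 layout concrete, here is a minimal sketch in Python of how the 32-bit destination value could be assembled, assuming the field positions given in the manual (PAR in bits 31:18, code address in bits 17:4, the "new" flag in bit 3, cog ID in bits 2:0). The function names are hypothetical.

```python
def coginit_value(par_addr, code_addr, cog_id=0, new=True):
    """Pack hub byte addresses into a COGINIT destination value."""
    # Both addresses must be long-aligned, so the low two bits are dropped
    # and only bits 15:2 (14 bits) of each are stored.
    par_field = (par_addr >> 2) & 0x3FFF
    code_field = (code_addr >> 2) & 0x3FFF
    return (par_field << 18) | (code_field << 4) | (int(new) << 3) | (cog_id & 0b111)


def coginit_fields(value):
    """Unpack a COGINIT destination value back into byte addresses and flags."""
    return {
        "par_addr": ((value >> 18) & 0x3FFF) << 2,
        "code_addr": ((value >> 4) & 0x3FFF) << 2,
        "new": bool(value & 0b1000),
        "cog_id": value & 0b111,
    }
```

For example, `coginit_value(0x7000, 0x0010)` gives `0x70000048`. An address that is not a multiple of 4 loses its low two bits in the round trip, which is the "4n addresses only" restriction.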
I wrote a graphics driver in ASM which happened to be too big for a cog to handle. I thought that one way to solve this problem, aside from shrinking the code, was to split it into two pieces. I would then run the first part of the code with a cog, and when that same cog reaches the end of the first piece, restart it at the address of the second piece using coginit. This way, I have a continuous piece of code? Maybe I am over-complicating this. Do you guys see a different solution?
Code chaining like this should work but there will be a delay as the new code is loaded (I think this is 100us or so, certainly that's the fastest hub memory can be loaded up into the cog). You'll have to keep your live state in hub RAM of course.
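The "100us or so" figure can be sanity-checked with a rough estimate. This Python sketch assumes an 80 MHz system clock, about 512 longs loaded, and one long per 16-clock hub window; all three numbers are assumptions for the estimate, not values taken from this thread.

```python
def cog_load_time_us(clock_hz=80_000_000, longs=512, clocks_per_long=16):
    """Rough time to copy a cog image from hub RAM, in microseconds."""
    total_clocks = longs * clocks_per_long
    return total_clocks / clock_hz * 1e6
```

Under those assumptions the estimate comes out a little over 100 microseconds per chained load, which is the cost paid each time the cog switches from one piece of code to the next.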
IMHO, splitting it into two pieces is the right idea. Instead of serializing things, it is worth it to think about how to get the cogs to perform the task in parallel, so that the two COGS are just running, using some signal to communicate the processing tasks.
What are the driver tasks? We can discuss higher level structure in an attempt to find a good division of labor. Or, perhaps driver data is in the COG, and it can be placed in the HUB to free space.
I'm curious about the driver tasks myself. Good solutions will be more obvious.
Algorithmically, what's going on is...
- Driver reads tile data
- COG puts tile data on a scanline buffer for that particular scanline
- Driver reads sprite data
- COG puts sprite data on the scanline buffer that particular scanline
- TV Driver renders the line from the scanline buffer
- Repeat for the next scanline
All of this is in COG RAM; however, multiple COGs are doing this, and each does its own line. That's why I don't think putting the entire driver in the HUB would work: each COG needs its own copy of the driver. Would it work if I put all of it in the HUB? If you want to check the drivers out, they are here
http://forums.parallax.com/showthread.php?140874-Graphics-Driver-Improvement-and-Optimization.
-Phil
The tile sprite driver in my signature has some great rotating scan line buffer code that Bangers and I did.
It works like this:
One cog does the signal. It sets up the buffers, the pointers to things, and the cog signal locations. It can then start the graphics cogs.
There would be one buffer per graphics cog, plus one for the current scan line, two minimum.
Right away, the graphics cogs all render their scan lines and wait. When the signal cog begins to draw one to the screen, that buffer becomes the current buffer, signaling the graphics cog that built it to move to the next available buffer in the circular set of buffers, and it all continues to draw all the scan lines.
If you only use one graphics cog, it only has one scan line of time to work. That will get tiles and a few sprites. If you use two or four or more, they have x scan lines to work meaning more robust graphics on each scan line.
All graphics cogs are the same, meaning only one image in the hub, as well as overwriting it for buffers and such after the driver is running.
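One way to picture the rotating buffers is as simple modular bookkeeping. This Python sketch is an assumed scheme for illustration, not the actual driver's code: graphics cogs take scan lines round-robin, and there is one buffer per graphics cog plus one for the line being displayed. All names are hypothetical.

```python
def cog_for_line(line, n_cogs):
    """Graphics cog that renders this scan line (round-robin assignment)."""
    return line % n_cogs


def buffer_for_line(line, n_cogs):
    """Buffer this scan line is rendered into: one per cog, plus one."""
    return line % (n_cogs + 1)


def buffers_in_flight(display_line, n_cogs):
    """Buffers for the line on screen and the n_cogs lines being prepared."""
    return [buffer_for_line(display_line + k, n_cogs) for k in range(n_cogs + 1)]
```

In this model the displayed line and the lines being prepared never share a buffer, and each graphics cog gets `n_cogs` scan lines of time per line it renders.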
Edit: Well, that driver could be modded to your hardware. Have to see what that is.
Or, the techniques can be applied to the one you've got cooking too.
I really only referenced it, because the buffers and graphics COG interaction in that one are well aligned with what you want to do.