evanh, could you explain in more detail what you're suggesting? The ORG directive currently tells the assembler to go into cog mode, and it sets the starting cog address. What else should it do?
I'm fine with ORG, that was just a passing remark. It's LOC that needs the work.
EDIT: I'd call it a base address rather than start address. "Start" might be mistaken for start of execution.
EDIT2: Hmm, base is wrong too, ORG is not a relative thing at all. Section origin then.
OK, I was wrong. I ran this under spinsim, and I got PA=$1020 and PB=$1030. This may be correct, or p2asm may be wrong, or spinsim might be wrong. I'll have to check the binary with PNut's binary.
p2asm looks OK, it produces the same thing as fastspin does, and when I run the result on the FPGA both pa and pb have the same value. I've attached the code I used for testing: foo.bas is the original source, foo.spin2 is the raw PASM produced by fastspin, foo.lst is the listing file that p2asm produces when it compiles foo.spin2. The output is:
which is correct (1064 = $428, which is where the label ends up in memory).
Note that there's a bug in fastspin 3.9.10 such that it cannot handle @ and \ in inline assembly. That's fixed in the current github sources, so you'll need to use those if you want to regenerate foo.spin2.
orgh $400
loc pa, #@label ' relative addressing
loc pb, #\@label ' absolute addressing
cogstop #0
label
shows PA = $8 PB = $40c
00400- 08 00 90 FE 0C 04 A0 FE 03 00 64 FD 00 00 00 00 '..........d.....'
I think you're confusing the instruction encoding with what is actually put in the register when the instruction executes. If you execute the "relative addressing" version of the loc instruction, the PC after the instruction ($404) is added to the offset ($8 in this case) to get the final value of $40c. In other words, at run time PA and PB will end up with the same value of $40c in them when the two instructions execute.
(Try it!)
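For anyone who wants to check the arithmetic without hardware, here is a minimal Python sketch that decodes the hex dump quoted above and models what each LOC leaves in its register. The field layout (bit 20 as the relative flag, bits 19..0 as the address or offset) is my reading of the encodings quoted in this thread, and `loc_result` is a hypothetical helper name, not part of any assembler. It only models hubexec, where addresses count bytes.

```python
import struct

# Bytes at hub address $400, copied from the hex dump in the post above.
DUMP = bytes.fromhex("080090FE0C04A0FE030064FD")

def loc_result(instr, addr):
    """Model the value a LOC at hub address `addr` puts into PA/PB.

    Assumption: bit 20 = relative flag, bits 19..0 = address/offset,
    hubexec only (byte addressing).
    """
    field = instr & 0xFFFFF
    if instr & (1 << 20):                    # PC-relative form (#@label)
        if field & 0x80000:                  # sign-extend the 20-bit offset
            field -= 0x100000
        return (addr + 4 + field) & 0xFFFFF  # next PC + offset
    return field                             # absolute form (#\@label)

# The dump is little-endian, so unpack three 32-bit longs.
loc_pa, loc_pb, cogstop = struct.unpack("<3I", DUMP)

pa = loc_result(loc_pa, 0x400)  # first instruction sits at $400
pb = loc_result(loc_pb, 0x404)  # second instruction sits at $404
print(hex(loc_pa), hex(loc_pb), hex(pa), hex(pb))
```

The $8 seen in the listing is only the offset field of the first instruction. The value that ends up in PA is that offset plus the next PC.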
WHAT ?!?!
Look at foo.spin2 and/or foo.lst that I posted a few pages back (that's foo.bas converted to PASM2 by fastspin). The relevant instructions are:
'
' sub getlabelvals()
00408 _getlabelvals
' asm
00408 fe90001c loc pa, #@label
0040c fea00428 loc pb, #\@label
00410 f6006df6 mov _var_00, pa
00414 f6006ff7 mov _var_01, pb
' paval = x
00418 fc606c2b wrlong _var_00, objptr
' pbval = y
0041c f1045604 add objptr, #4
00420 fc606e2b wrlong _var_01, objptr
00424 f1845604 sub objptr, #4
' label:
00428 label
00428 _getlabelvals_ret
00428 fd64002e reta
Note that the first loc is encoded as $fe90001c (relative addressing) whereas the second loc is encoded as $fea00428 (absolute addressing). At runtime they both put $428 into the respective registers, as is proven by the program output.
The reason is simple: the PC relative "loc" instruction adds the next PC (PC+4) to the offset to get the value to put into the register, just like a relative "jmp" adds the next PC to the offset to get the new PC. So the first loc, at address $408, adds $40c to the offset $1c to get the final value $428.
Note that it isn't *just* the offset that is different in the two loc encodings, there's actually a bit in the instruction that says whether the offset is absolute or relative.
You should be able to assemble and run foo.spin2 with PNut to verify this. Actually maybe not, it may use @@@, so you may have to use fastspin or p2asm. But all 3 assemblers agree about the encoding of the LOC instructions, so this isn't some quirk of fastspin or p2asm, it's the way the hardware works.
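Going the other way, the two encodings in the listing can be reproduced with a few lines of Python. This is only a sketch under assumptions: `encode_loc` is a hypothetical helper, the base value $FE800000 (always-execute condition, PA, absolute) and the register field at bits 22..21 are inferred from the encodings quoted in this thread, and only hubexec byte addresses are handled.

```python
LOC_BASE = 0xFE800000            # assumed: cond = always, LOC opcode, PA, absolute
REG_FIELD = {"pa": 0, "pb": 1}   # assumed register select, bits 22..21

def encode_loc(reg, target, pc=None):
    """Encode LOC reg,#target for hubexec.

    pc=None  -> absolute form (#\\@label)
    pc=<addr of the LOC itself> -> PC-relative form (#@label)
    """
    word = LOC_BASE | (REG_FIELD[reg] << 21)
    if pc is None:
        return word | (target & 0xFFFFF)
    offset = target - (pc + 4)                     # relative to the next PC
    return word | (1 << 20) | (offset & 0xFFFFF)   # bit 20 marks relative

rel = encode_loc("pa", 0x428, pc=0x408)  # loc pa, #@label at $408
absolute = encode_loc("pb", 0x428)       # loc pb, #\@label
print(hex(rel), hex(absolute))
```

With the label at $428 this gives the same two longs that appear in foo.lst at $408 and $40c.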
Oh, oops, I've not been examining the final register content ... and I was convinced I was too, damn ...
I get $40C for both cases when running on the FPGA. However, spinsim seems to be confused. It produces $1020 and $1030. It's shifting the value up by 2 bits, which means it must think it's in the COG mode.
If I move the routine to a location other than $400 I get the correct value in the relative mode, but an incorrect value in the absolute mode. This kind of shows the value of having position-independent code. It appears that my linker isn't adjusting the address for the absolute mode. That doesn't surprise me, since I don't recall handling relocation for the LOC command.
I'm going to take the warning print out for the LOC command.
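The relocation effect described above can be shown with a small Python model. This is not the linker's or spinsim's code, just a sketch under assumptions: `loc_value` is a hypothetical helper, bit 20 is taken to be the relative flag, and the two encodings are the ones from the $400 hex dump earlier in the thread. The same two longs are "loaded" at $400 and then at $800 without any relocation fix-up.

```python
def loc_value(instr, addr):
    """Value a hubexec LOC at `addr` produces (assumed layout: bit 20 = relative)."""
    field = instr & 0xFFFFF
    if instr & (1 << 20):
        if field & 0x80000:                  # sign-extend 20-bit offset
            field -= 0x100000
        return (addr + 4 + field) & 0xFFFFF
    return field

REL = 0xFE900008   # loc pa, #@label   assembled at $400
ABS = 0xFEA0040C   # loc pb, #\@label  assembled at $404

results = {}
for base in (0x400, 0x800):
    label = base + 0xC                       # label follows the three instructions
    pa = loc_value(REL, base)
    pb = loc_value(ABS, base + 4)
    results[base] = (pa == label, pb == label)
    print(hex(base), hex(label), hex(pa), hex(pb))
```

At $800 the relative form still lands on the label, while the unpatched absolute form keeps pointing at $40c, which is why the absolute encoding needs a relocation entry.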
Apologies on the PC-relative complaint. I was way off there.
There is still the bug in Pnut though. It is in the cogexec LOC instruction encoding for PC-relative encoding below absolute $400. I guess that's where Cluso came unstuck and got me digging.
Here's another one:
I've just been experimenting with building some diagnostic code and discovered it would be nice to know if the caller code was from cogexec or hubexec. A third status bit in the stacked address maybe.
In this case I'm wanting a subroutine to extract the encoding of the instruction prior to the call. If I don't know whether the caller was in cogexec at the time or not then I can't calculate the relative address from the call stack.
EDIT: Ah, forgot that code can't execute below $400 in hubRAM. That should be enough ...
EDIT2: And working source code:
pop char 'grab caller address
push char 'restack it
cmp char, ##$400 wcz 'test if caller was cogexec or hubexec, C = borrow of (D - S)
if_c sub char, #2 'was cogexec
if_c alts char 'MOV indirection - get register content of register number in "char"
if_c mov pa, 0-0
if_nc sub char, #8 'was hubexec
if_nc rdlong pa, char
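The decision logic in that snippet can be sketched in Python as follows. Everything here is illustrative: `instruction_before_call` is a hypothetical name, the cog registers and hub memory are plain dictionaries, and the return address is assumed to be just the 20-bit PC part of the stacked value (any flag bits already stripped).

```python
def instruction_before_call(ret_addr, cog_regs, hub_longs):
    """Return the encoding of the instruction just before the CALL.

    ret_addr  : return address popped from the stack (20-bit PC only, assumed)
    cog_regs  : dict mapping cog register number -> long
    hub_longs : dict mapping hub byte address -> long
    """
    if ret_addr < 0x400:
        # Caller was cogexec: addresses count longs, so the CALL is at
        # ret_addr - 1 and the instruction before it at ret_addr - 2.
        return cog_regs[ret_addr - 2]
    # Caller was hubexec: addresses count bytes, so step back two longs.
    return hub_longs[ret_addr - 8]

# Toy memories: a marker instruction, then the CALL, then the return point.
cog = {0x10: 0xAAAA0001, 0x11: 0xFDB00000}
hub = {0x1000: 0xBBBB0002, 0x1004: 0xFDB00000}

from_cog = instruction_before_call(0x12, cog, hub)
from_hub = instruction_before_call(0x1008, cog, hub)
print(hex(from_cog), hex(from_hub))
```

The $400 test works as a mode check only because hubexec cannot run below $400, as noted in the EDIT above.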
Wow, that detail is needed. Pnut is making a mess of the PC-relative LOC encodings. Only six of the twelve PC-relative combinations above are correct. Even two of the hubexec encodings ($fe800110 is an absolute encoding) are wrong, because they use absolute encoding below $400 in hubRAM where it should still be PC-relative.
Or is that case intentional because hubexec can't go there?
LOC is also usable to obtain the hub address of a table, which can reside below or above hub $400.
That is one of the problems I found - the hard way as I wasted a whole day trying to find a bug in my program.
Chip,
I've bumped into a design flaw/bug in lutRAM sharing! RDLUT data, or address, is being garbaged if the sharing cog WRLUTs to the same address on the same sysclock.
In my case, an instruction stall would also mess me up but it would have to be a number of clocks to produce the result I'm getting.
PS: I'm very certain. Testing is on P123 board with v32i image loaded.
Do you mean RDLUT from COGn, occurring on the same address, and on the same sysclk as the WRLUT from COGm, is corrupted ?
ie it is neither the old value, nor the new value ?
Do you have some test code that reproduces this ?
Definitely not the new value. I don't think an old value could upset things the way it has because it will be the same every time and the importance of the data is metronomic ...
Comments
That really isn't true. One of the things Chip made explicit early on was that data can be intermingled with code.
I still don't quite see the danger of using LOC with a relative address, at least one above $400. After this code: PA and PB should have the same value. Am I missing something?
Edit: typo
but this code shows PA = $40C and PB = $40C shows
Edit: Pnut switches to absolute because the ORG directive causes a domain crossing.
Do you mean RDLUT from COGn, occurring on the same address, and on the same sysclk as the WRLUT from COGm, is corrupted ?
ie it is neither the old value, nor the new value ?
Do you have some test code that reproduces this ?
It's messy and non-specific.