waitcnt instruction unexpectedly altering memory location - known bug?
borisg
Posts: 39
Sorry if this has been mentioned before, but I couldn't find an instance of it when searching the forum yesterday. I ran into this aberrant behavior while I was up most of the night trying to debug what I thought was simple PASM code to create a 1 MHz clock on the Prop. Because this runs very close to the limits of a cog, I decided to dedicate a cog to nothing except updating a pair of 64-bit hub data areas every microsecond. (The reason for using two of them was to indicate which one was being updated and which one was free to read, or so the plan was until the program refused to work.) Stripped down to its essentials, the PASM code is as follows:
                  mov     Temp, #80
                  add     Temp, cnt
                  add     Temp, #80             ' shouldn't be needed but increased reliability
loop              waitcnt Temp, #80
                  add     ClockLow, #1 wc       ' update ClockLow and see if carry
        if_c      add     ClockHigh, #1
                  cmp     RegSet, #0 wz         ' determine which Hub RAM address set to use
        if_nz     jmp     #RegSet1
                  wrlong  ClockLow, #$100
        if_c      wrlong  ClockHigh, #$104      ' write to first hub clock location
                  jmp     #ExitCode
RegSet1
                  wrlong  ClockLow, #$108
        if_c      wrlong  ClockHigh, #$10C      ' move clock to alternate register set
ExitCode
                  xor     RegSet, #1
                  jmp     #loop                 ' and repeat ad-infinitum
Omitted is the setup code which loads Temp with cnt+80.
The initial setup of the data area for this code was:
HubBase         long
IOPin           long
IOMask          long
RegSet          long    0
Temp            long
ClockLow        long    0
ClockHigh       long    0
HubClockLow1    long
HubClockHigh1   long
HubClockLow2    long
HubClockHigh2   long
What puzzled me was that ClockLow in hub RAM showed what looked like cnt output: it was incrementing far too fast and not overflowing. Adding in strategic xor outa 1 statements and hooking up a logic analyzer demonstrated that the code was indeed generating pulses at a 1 MHz rate. Given that I've been hacking meatware for the last few months and not hardware, I assumed the mistake was somewhere in my code, and had a very frustrating night of debugging in which I took out every fancy feature I had inserted (the code was originally supposed to create the clock at a programmatically specified memory location in hub RAM). I generally allocate the first 512 bytes of hub RAM for my personal use, hence the absolute addressing in the above code.
Following a call from my girlfriend, who thought I'd had enough human-machine interaction for the week, it suddenly occurred to me as I was preparing to leave that perhaps the waitcnt instruction was contaminating the memory location after Temp. I inserted a dummy variable between Temp and ClockLow, and suddenly my 1 MHz counter was working! I then replaced one of the wrlong instructions to output the value of my spacer variable, Dummy1, to hub RAM, and sure enough, there was a clear, offset version of cnt.
Normally, waitcnt is specified as being written as:
waitcnt Target, Delta
which would require two longs in cog RAM (one each for Target and Delta) when Delta > 511, since an immediate source operand is limited to 9 bits. In my case, Delta = 80, so I thought I could get away with using a literal value (the Propeller documentation states this is permissible). Hence, my waitcnt instruction was written as:
waitcnt Temp, #80
I'll get around to posting my code demonstrating this later as the late night of debugging resulted in some very ugly code with all of the various hacks I tried to coerce appropriate behavior from a very recalcitrant cog.
NOTE: This 1 MHz counter is very close to the limits of what the Prop can do and, depending on the vagaries of hub access delay, one has as few as 8-12 clock cycles that can be added to the loop (at 80 MHz clock speed).
It may be that I've done something stupid elsewhere in the code, but aside from mindless variable setup that has been omitted, the guts of the routine are given above.
Anyone else noticed this?
Comments
If you want to use hardcoded hub addresses you should use something at the high end of memory, such as $7f00 instead of $100. Or better yet, use the PAR register to pass the starting address of your HubClockxxx locations.
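A minimal sketch of the PAR approach, assuming a Spin method named Start, a hub array named ClockData, and a cog register named HubBase (all illustrative names, not taken from the original program). The address given as the second argument to cognew arrives in the cog's PAR register, so the PASM code needs no hardcoded hub address:

```spin
VAR
  long  ClockData[4]                        ' hypothetical hub buffer: Low1, High1, Low2, High2

PUB Start
  cognew(@Entry, @ClockData)                ' second argument shows up in PAR

DAT
              org     0
Entry         mov     HubBase, par          ' hub address of ClockData[0]
              wrlong  ClockLow, HubBase     ' store low long at ClockData[0]
              add     HubBase, #4           ' step to the next long
              wrlong  ClockHigh, HubBase    ' store high long at ClockData[1]
Halt          jmp     #Halt                 ' park the cog (sketch only)

ClockLow      long    0
ClockHigh     long    0
HubBase       long    0
```

PAR only carries a long-aligned address, which is why the buffer is declared as longs in VAR.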
EDIT: Oh, and post your code between "code" and "/code" tags or you won't get any help from Phil.
If you use long without a value, no cog register is allocated for that label. You need to write long 0.
Long without a value is used to force long alignment of the data that follows (which may also be bytes or words).
So with the above setup, Temp and ClockLow, for example, access the same cog register.
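To illustrate, here is how the data area from the first post might look with every register given an explicit initial value. This is a sketch of the fix rather than the poster's final code; each label now gets its own cog register:

```spin
HubBase         long    0
IOPin           long    0
IOMask          long    0
RegSet          long    0
Temp            long    0       ' no longer shares an address with ClockLow
ClockLow        long    0
ClockHigh       long    0
HubClockLow1    long    0
HubClockHigh1   long    0
HubClockLow2    long    0
HubClockHigh2   long    0
```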
Andy
And yes, I did belatedly RTFM, but it wasn't stated in BOLD letters that if one writes:
Varname LONG
Varname2 LONG 0
then the addresses of Varname and Varname2 are identical. Hence the seeming hardware failure: my variable Temp, which held the parameter for the waitcnt instruction, had the same address as my variable ClockLow, which I naively assumed was a separate variable. When I put in the variable Dummy1, it was as:
Dummy1 LONG 0
So, by initializing all of my variables to 0, suddenly the program works. Hard to believe that it's taken me this long to discover this idiosyncrasy of PASM. In any other language I program in, defining a variable is sufficient to reserve memory for it, and the behavior of LONG differs between VAR and DAT blocks. I guess I now know why the mysterious res keyword was added to the language.
Just looked back at a few of my programs that I was unable to get working despite hand-executing the code: same problem, use of the LONG keyword to define variables that I assumed had storage allocated for them. Things do get interesting when almost all of one's variables share the same address. I guess the moral of this little episode is that even if one has been programming for 44 years, a quick glance through the manual might be in order before starting to code on a new CPU, instead of just looking for the list of registers and opcodes and starting to bang out code.
You could also check the GNU assembler (gas) that comes with GCC, as that is a little more 'normal'.
Another moral is that PASM really needs a cleanup pass, to implement some safe defaults and align it more with conventions.
IIRC the PASM assembler was coded in x86 assembly, which is a less than ideal choice for code support.
Varname LONG
Varname2 LONG 0
You really think you discovered that by yourself? LONG is used to reserve initialized long data in cog memory. This has many more use cases than only defining variables. So what do you expect from the compiler if you reserve an initialized long and don't say what value it should be initialized to?
Perhaps one would expect an error message, but as I said before, LONG alone is a valid directive to force alignment.
RES is not mysterious; it is there for reserving uninitialized long data space, mainly for variables or buffers. The advantage of RES is that it does not take up unnecessary space in hub memory (in the PASM image that gets loaded to the cog later).
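As a rough sketch of how that looks in practice (the labels are borrowed from the first post, but the layout is only an illustration), RES lines go at the very end of the cog image, after all instructions and initialized data, so that the cog addresses of the earlier labels still match the image that is loaded:

```spin
DAT
              org     0
Entry         mov     Temp, cnt
              add     Temp, #80
Loop          waitcnt Temp, #80             ' wait for cnt = Temp, then Temp += 80
              add     ClockLow, #1
              jmp     #Loop

ClockLow      long    0                     ' initialized data: part of the hub image

Temp          res     1                     ' uninitialized: own cog register, no hub image space
```

Since a RES register is not initialized from the image, the code has to write Temp before reading it, which the first mov does here.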
Andy
But yes, a close inspection of the manual is a good idea as PASM does have some interesting quirks. For example what sequence of bytes does the following statement in a DAT section produce in memory?
Figuring out what the @ operator does will drive you to drink!
Edit: Actually I don't recall that the result of using the plus operator on strings in DAT is spelled out in the manual.
Andy