Propeller Tricks & Traps (Last update 21 June 2007)

Phil Pilgrim (PhiPi) · 2006-03-09 07:07

Hi All,

I've begun a little document called "Propeller Tricks and Traps". This is meant to be an unofficial, open-ended, never-to-be-finished compendium of hints and gotchas discovered while learning how to use and program the Propeller. At some point, it may even become somewhat organized; although the theme will always be "rough and ready", rather than "polished and perfect". And I welcome corrections, clarifications, and contributions from any and all readers. They will be included (or not) under the terms explained in the document's introductory page. (Alliteration is not a requirement, by the way.)

As updates are made, I will upload them to this selfsame posting, replacing the current version, but with an update notation at the bottom of the post. So, for now, here's the first installment.

Happy reading!
-Phil

Update: Removed a reference to immediate operands being sign extended from bit 8. 'Tain't so. Thanks to Chip for catching this one!
Update (10Mar06): Added some additional stuff...
Update (24Mar06): Added three new tricks and three new traps. Thanks to all the contributors!
Update (02Jun06): Added two new tricks and one new trap.
Update (19Jun07): Corrected an error on page 5 in the state machine example.
Update (20Jun07): Made additional changes to state machine example to tighten code and better illustrate the principle.
Update (21Jun07): Modified the sign-extend example to include Spin's pre-defined operators.
Update (28Sep07): Corrected an error on page 8 in the PAR trap.

Post Edited (Phil Pilgrim (PhiPi)) : 9/28/2007 7:32:52 PM GMT

Paul Baker · 2006-03-09 15:53

Here's a trap I found: wrbyte, wrword, and wrlong use the format "wr(byte/word/long) data, destination", the opposite of the standard "instruction destination, source". Feel free to add this to your document if you want.

Another one I haven't fully verified: if you want to TEST ina, ina must be the source and the value it's tested against must be the destination. I wrote code the other way and it didn't work until I swapped the two.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
1+1=10

Post Edited (Paul Baker) : 3/9/2006 4:07:36 PM GMT

Phil Pilgrim (PhiPi) · 2006-03-09 17:59

Hi Paul,

You're right about the ina thing, and it's already in there -- top of page 6. Great call on wrlong, etc. I'll add that one for sure!

Thanks!
Phil

Paul Baker · 2006-03-09 18:59

Ah sorry, I guess I missed that when scanning the document.

This isn't a trap, but it hung me up for a while. I had code that went like this:

 
        test Flags, cond      wz             'test if a flag is set
  if_z  call Sub1                            'call the first routine if flag wasn't set
  if_nz call Sub2                            'call the second routine if flag was set

It took me a while to figure out why the program wasn't working as expected, until I realized that Sub1 affected the Z flag, meaning Sub2 was also being called after returning from Sub1. This isn't a trap specific to the Propeller, but since conditional execution is new to me (instead of jumping based on flags, as is done with the SX), this was certainly a "gotcha".
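A rough Python model of that gotcha, as a sketch rather than real Propeller code. The function name `dispatch` and the `sub1_leaves_z` parameter are made up for illustration; the only thing taken from the snippet above is that the second conditional sees whatever Z flag Sub1 left behind.

```python
def dispatch(flags, mask, sub1_leaves_z):
    """Model: test Flags, cond wz / if_z call Sub1 / if_nz call Sub2."""
    called = []
    z = (flags & mask) == 0      # TEST sets Z when the AND result is zero
    if z:                        # if_z call Sub1
        called.append("Sub1")
        z = sub1_leaves_z        # Sub1 overwrites Z before it returns
    if not z:                    # if_nz call Sub2 sees the *current* Z
        called.append("Sub2")
    return called

print(dispatch(0b0000, 0b0001, sub1_leaves_z=False))  # ['Sub1', 'Sub2']
print(dispatch(0b0000, 0b0001, sub1_leaves_z=True))   # ['Sub1']
print(dispatch(0b0001, 0b0001, sub1_leaves_z=False))  # ['Sub2']
```

The first call is the failing case: the flag was clear, Sub1 ran, and because it left Z clear, Sub2 ran as well.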

PS thanks for creating this document, it has already saved my butt. I just wrote some routines using indirection, but I did not place instructions between the movd and mov (haven't started testing the code yet). Now I know in advance this can't be done.


Post Edited (Paul Baker) : 3/9/2006 7:03:50 PM GMT

AndreL · 2006-03-10 08:18

Add this one:

When using constants for immediate-mode moves, e.g.

mov dest, #value

value can ONLY be 9 bits (0-511)!

People forget that even though this is a 32-bit processor, there is only a 9-bit immediate. If you want a larger number, use the concept of a "literal pool" and define a memory location with the value in it:

mov dest, value
..

value long $12345679

Notice "value" has NO # symbol now, since it's a memory location (0-511).
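As a quick illustration of the 9-bit limit, here is a small Python check (not Propeller code). The helper name `fits_immediate` is hypothetical; it only encodes the 0-511 range stated above.

```python
def fits_immediate(value):
    """True if value fits the 9-bit source field used by '#value'."""
    return 0 <= value <= 0x1FF

for v in (0, 511, 512, 0x12345679):
    if fits_immediate(v):
        how = "mov dest, #value"
    else:
        how = "mov dest, value   (with 'value long ...' in the literal pool)"
    print(f"${v:X}: {how}")
```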

Andre'

Beau Schwabe · 2006-03-10 16:05

Andre'

I'll further add, if you are reserving variables ' {value} res ', they MUST be placed AFTER the ' {value} long ' definitions.

In the attached image, the line that reads...

Switch res 1

...if placed before the ' {value} long ' definitions, it will cause unwanted results.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe

IC Layout Engineer
Parallax, Inc.

Post Edited (Beau Schwabe (Parallax)) : 3/10/2006 4:09:02 PM GMT

Phil Pilgrim (PhiPi) · 2006-03-10 16:58

Andr

Tracy Allen · 2006-03-10 21:04

Trap: Failure to account for the clock speed can lead to program lockup, especially with Spin code.
For example, the following snippet out of the attached serialDemo.spin will lock up if the baud rate is set too high for the selected clock frequency. Here is the offending code:

t := cnt
repeat 10                  ' 10 bits
    waitcnt(t += bitclocks)      ' time for next bit edge, += is short for (t := t + bitclocks)
    outa[tx] := (b >>= 1) & 1    ' get the next bit

The trap is that the value of the system counter, cnt, is captured in the variable t before entering the loop. The idea is that serial bits will be sent out at regular times: t+bitclocks, t+2*bitclocks, t+3*bitclocks, etc. However, the Spin code in the line following the waitcnt takes a fixed number of clock cycles to execute, and that can be longer than bitclocks. If that happens, cnt will already have passed t+bitclocks when it hits the waitcnt instruction, and it will have to wait there for the whole 32-bit counter to roll around to that value again (about 54 seconds at 80 MHz), for each bit. This limits the baud rate to 19200 for an 80 MHz clock (5 MHz xtal + pll16x), or 1200 baud with a 5 MHz xtal alone. Of course asm is faster, but this is still a trap for fast actions that use repeated waitcnt. When I asked Jeff about this, he was able to estimate quickly in his head how long the Spin code would take, and therefore the maximum baud rate for that code.
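The wraparound can be sketched with a little 32-bit arithmetic in Python. This is only a model of the counter comparison, not Spin; the function names and the 5000-clock loop overhead are made-up values for illustration, not measurements.

```python
MASK32 = 0xFFFFFFFF

def bit_clocks(clkfreq, baud):
    """System clocks per serial bit."""
    return clkfreq // baud

def clocks_until_match(cnt_now, target):
    """Clocks a waitcnt would stall before the 32-bit counter equals target."""
    return (target - cnt_now) & MASK32

CLKFREQ = 80_000_000
t = 1000                                   # counter value captured before the loop
per_bit = bit_clocks(CLKFREQ, 115_200)     # 694 clocks per bit
target = t + per_bit

# Case 1: we reach waitcnt before the target time, so the wait is short.
print(clocks_until_match(t, target))       # 694

# Case 2: hypothetical overhead of 5000 clocks, so the target is already past.
late = clocks_until_match(t + 5000, target)
print(late, late / CLKFREQ)                # nearly 2**32 clocks, roughly 53.7 s
```

In the second case the code has not crashed; it is waiting for the counter to come all the way around.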

Tip: If Spin code locks up, look at the timing.

Trap: Code does not run as expected when using HyperTerminal.

Tip: As with the BASIC Stamp, the DTR line resets the Propeller, and when reset, the code stored in EEPROM will be loaded, replacing whatever code was previously in the Propeller RAM. HyperTerminal brings DTR high when it connects and brings DTR low when it disconnects. With the FTDI programming adapter at least, the Propeller resets when DTR goes from high to low. If you use the DEBUG window in the Stamp IDE, DTR is not automatically set, and you can manipulate it manually through the DTR checkbox to see the effect of reset on the Propeller. The reset line on the Propeller demo board does not have capacitors. It is connected straight through to the rst\ line on the programming adapter.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com

Phil Pilgrim (PhiPi) · 2006-03-10 23:29

Hi Tracy,

I just uploaded the latest additions before seeing your post. I'll catch yours in the next upload.

Thanks!
Phil

Paul Baker · 2006-03-14 03:38

New trap: The RES keyword must occur after all code and data within a cog. Chip provided feedback on my DScope code, saying that RES before code only advances the pointer but doesn't actually create the space. From what I can surmise, RES is meant to be used in conjunction with FIT to ensure that the reserved space is accounted for but still fits within the limits of cog memory.
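One way to picture why RES has to come last is a toy address calculator in Python. It assumes, as described above, that RES advances the assembler's cog address without emitting anything into the image that gets loaded. The `layout` function and the item tuples are hypothetical.

```python
def layout(items):
    """items: list of (label, kind, count), kind being 'long' or 'res'.

    Returns {label: (address the assembler assigns,
                     address where the data really lands after loading)}.
    The second entry is None for res, which emits no data.
    """
    symbol_addr = 0    # assembler's cog address counter (long and res advance it)
    image_addr = 0     # position in the loaded image (only long advances it)
    out = {}
    for label, kind, count in items:
        if kind == "long":
            out[label] = (symbol_addr, image_addr)
            image_addr += count
        else:
            out[label] = (symbol_addr, None)
        symbol_addr += count
    return out

bad = [("Switch", "res", 1), ("value", "long", 1)]
good = [("value", "long", 1), ("Switch", "res", 1)]
print(layout(bad))    # value is addressed at 1 but its data lands at 0
print(layout(good))   # value is addressed at 0 and its data lands at 0
```

With RES first, every label after it points one register beyond where its data was loaded.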


Phil Pilgrim (PhiPi) · 2006-03-14 19:27

Quite by accident, I stumbled upon a data block constructor that doesn't seem to be documented anywhere (yet). If it's intended to be a standard, documented feature, it probably doesn't need to go in PT&T, but I'm not sure what its real status is.

By now, we're all familiar with this constructor which defines an array that's duplicated for each instance of an object:

CON
  MySize = 64
VAR
  LONG    MyVar[MySize]

But what I couldn't find documented anywhere is this constructor:

CON
  MySize = 64
DAT
  MyVar    LONG    $AA55[MySize]

This defines a block of MySize LONGs in cog memory, initialized to $AA55 (assuming it's part of an assembler program that gets its own cog). But, perhaps more interestingly, it also defines a single instance of an array in hub RAM, initialized to $AA55, that can be shared by multiple instances of the defining object -- or anyone else, for that matter, if they know its address.

If someone can point out a reference to this notation in the docs, I'd be grateful. Until then, I suppose it's a trick.

Thanks,
Phil

Paul Baker · 2006-03-16 19:10

I finally have a trick to share instead of just traps: I've figured out an alternate way of using pointers in assembly. It is geared to sequential pointer accessing; here is an example of its use:

...
:loop    mov  :buffer, ina
         add  :loop, :d_inc
         djnz :i, #:loop
...
:i       LONG BufferSize
:d_inc   LONG $0000_0200             '1<<9 (least significant bit of destination field)
:buffer                              'array of longs for buffer space are here

The nice thing about this method is that you don't need a variable to hold the pointer: the mov instruction starts off with the initial pointer, which is incremented with each iteration of the loop. The same can be done for the source field by doing an "add target, #1", and pointer decrements would use a sub instruction.
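To see what the add does to the instruction word, here is a Python model of the two 9-bit operand fields (destination in bits 17..9, source in bits 8..0). It is a sketch, not assembler: the helper names, the upper bits in `UPPER`, and the buffer address $100 are assumptions for illustration.

```python
FIELD = 0x1FF      # each operand field is 9 bits wide
D_SHIFT = 9        # destination field sits just above the source field

def fields(instr):
    """Return (destination, source) operand fields of a 32-bit instruction."""
    return (instr >> D_SHIFT) & FIELD, instr & FIELD

def add_to_instruction(instr, increment):
    """Model 'add :loop, :d_inc': a plain 32-bit add applied to the opcode."""
    return (instr + increment) & 0xFFFFFFFF

UPPER = 0xA0BC0000                           # placeholder upper bits (assumed)
mov = UPPER | (0x100 << D_SHIFT) | 0x1F2     # dest = hypothetical buffer at $100, src = $1F2

for _ in range(3):
    print([hex(f) for f in fields(mov)])     # dest climbs by 1, src stays put
    mov = add_to_instruction(mov, 0x0000_0200)

both = add_to_instruction((0x100 << D_SHIFT) | 0x010, 0x0000_0204)
print([hex(f) for f in fields(both)])        # dest + 1 and src + 4 in one add
```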

Now here's the really nifty trick: say you have filled a buffer inside the cog and you want to flush the data to hub memory. You can adjust the source and destination pointers in the same instruction like so:

...
          mov    :hbptr, par
:wrloop   wrlong :buffer, :hbptr
          add    :wrloop, :s_d_inc
          djnz   :i, #:wrloop
...
:i        LONG   BufferSize
:s_d_inc  LONG   $0000_0204      'increment cog pntr by 1 (destination field) hub pntr by 4 (a long) (source field)
:hbptr    LONG   $0000_0000
:buffer                          'array of longs for buffer space

Since the modification to the pointers is done in a post fashion, the djnz serves as the required one-instruction buffer between modification and execution of the instruction (no need for nops!). Not shown when combining the two code snippets above is that :i must be reinitialized to BufferSize, since it will be 0 after leaving the first snippet's loop.

P.S. The second code snippet really makes excellent use of hub operations: since two assembly instructions can be performed between hub operations and still stay in the fastest sync possible, the flushing operation will flush the data in the fastest possible time. If you used separate movs and movd operations, the loop would miss its next hub operation window.


Post Edited (Paul Baker) : 3/16/2006 7:33:39 PM GMT

Phil Pilgrim (PhiPi) · 2006-03-16 19:29

Hey Paul,

That's a really neat trick! I especially like the second one. Of course, it's necessary to re-initialize the counter and embedded pointer(s) each time you use them, and make sure that no carries can occur out of their respective fields in the mov and wrlong instructions.
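The carry caveat can be checked with a few lines of Python. This is a hedged sketch that treats the instruction as two adjacent 9-bit fields; `safe_to_add` is a hypothetical helper, not part of any Propeller tool.

```python
FIELD = 0x1FF

def safe_to_add(instr, inc):
    """True if adding inc keeps both 9-bit operand fields from carrying out."""
    src_sum = (instr & FIELD) + (inc & FIELD)
    dst_sum = ((instr >> 9) & FIELD) + ((inc >> 9) & FIELD)
    return src_sum <= FIELD and dst_sum <= FIELD

print(safe_to_add((0x100 << 9) | 0x010, 0x204))   # True: both fields have room
print(safe_to_add((0x100 << 9) | 0x1FE, 0x204))   # False: source carries into destination
print(safe_to_add((0x1FF << 9) | 0x000, 0x200))   # False: destination carries into bit 18
```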

-Phil

Paul Baker · 2006-03-16 19:38

Right, I neglected to include those caveats because my code runs once (no re-initialization of the mov instruction required) and I'm dealing with buffers in cog space, so the pointer will never exceed $1F0 and no carry into another field will happen.

Oh wait a second, I really boffed up the second example; the trick doesn't work as expected. Because the contents of :hbptr are used, the code as given will cause the hub address to be incremented to the contents of buffer[2], the contents of buffer[6], ...

Sorry about that, but it would work for cog buffer to cog buffer transfers.

Grrr, I should double-check my ideas before posting them. I did it for the first example, but in my glee to take it to its (at the time) logical conclusion, I didn't apply as much scrutiny to the second example.


Post Edited (Paul Baker) : 3/16/2006 7:49:06 PM GMT

Phil Pilgrim (PhiPi) · 2006-03-16 20:03

Hi Paul,

So just change it to:

          movd :wrloop,#buffer
          movs :wrloop,par
          nop
:wrloop   wrlong 0-0, #0-0
          ..

I think that would fix it. The "#" refers to a bit in the instruction field, so it won't be affected by the movs. I do think it's important to specially designate the target of code modification somehow, so it's apparent what's being done to it -- even if it requires an extra instruction. My preference is "0-0".

-Phil

Paul Baker · 2006-03-16 20:10

The problem is that with wrlong the hub address is in the source, so the method you just wrote limits the addressable space in the hub to the first 512 locations. An aside question: can you guarantee the run-time location of variables in hub memory? How would this be done when many objects can be loaded in through the hierarchy? Does the top-level object always get loaded first? What if your object is intended to be loaded into a lower portion of the hierarchy (as my object is)? Hmmm, I think maybe Chip and Jeff are the only people that can answer this.


Phil Pilgrim (PhiPi) · 2006-03-16 20:43

Yup, you're right. 'Looks like two different entities will have to get incremented after all.

-P.

cgracey · 2006-03-16 22:18

You can't pick WHERE variables are going to be, as the compiler allocates them automatically. Chances are, your code will displace them beyond the first 512 bytes, anyway. Best to just use a pointer without caveats. I like how you guys are figuring out slick ways to do things. I'm expecting to be surprised soon by something I never thought of.

Paul Baker said...
The problem is that with wrlong the hub address is in the source, so the method you just wrote limits the addressable space in the hub to the first 512 locations. An aside question: can you guarantee the run-time location of variables in hub memory? How would this be done when many objects can be loaded in through the hierarchy? Does the top-level object always get loaded first? What if your object is intended to be loaded into a lower portion of the hierarchy (as my object is)? Hmmm, I think maybe Chip and Jeff are the only people that can answer this.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Chip Gracey
Parallax, Inc.

Paul Baker · 2006-03-16 22:53

Chip Gracey said...
I'm expecting to be surprised soon by something I never thought of.

That's a difficult expectation to live up to, coming from you :). I know Andre did so, but I'm not his caliber of programmer.


Phil Pilgrim (PhiPi) · 2006-03-21 19:17

Hi Beau,

I don't have anything hooked up to try this, but I think the problem may be in the repeat loop. By putting the while at the beginning of the loop, you're testing Sync once before it's even assigned. Since local variables don't get initialized, there's no telling what its value might be at that point. Try doing it this way:

  repeat
    Sync := PULSIN_CLK(RXPin, 1)
  while Sync <= 8000

-Phil

Beau Schwabe · 2006-03-21 20:17

Phil,

Your code still reports '0' when trying to display the value of Sync.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe

IC Layout Engineer
Parallax, Inc.

Phil Pilgrim (PhiPi) · 2006-03-21 20:52

Hi Beau,

I just figured it out, and this is a real trap for experienced programmers! The operator for "less than or equal" in Spin is =<, not <=. This has bitten me several times, and I still didn't see it!

-Phil

Paul Baker · 2006-03-21 21:24

Oh yeah, you're right, Phil: <= is the assignment operator for <. So the expression was being evaluated as Sync := Sync < 8000.


Phil Pilgrim (PhiPi) · 2006-03-21 23:21

Hi Beau,

The summary you've reproduced is a little ambiguous, because "<=" and "<" are not the same operator. As Paul points out, a <= b is the same as a := a < b. If you look on page 164 of the manual, you'll see what we're referring to. Of course, that still doesn't explain why Sync is coming up zero, if you used "=<".

-Phil

Paul Baker · 2006-03-22 03:01

Even X <= 8000 doesn't evaluate correctly. If X is 0, X < 8000 is true and X should be -1 or 32 bits of 1.

The only difference between the two is when the two terms are equal (and =< doesn't modify the left term).
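The operator mix-up is easy to model in Python. This is a sketch of the semantics described in this thread (true is -1, false is 0, and Spin's a <= b means a := a < b); the function names are invented for the illustration.

```python
TRUE, FALSE = -1, 0

def lt(a, b):
    """Spin's  a < b"""
    return TRUE if a < b else FALSE

def le(a, b):
    """Spin's  a =< b  (less than or equal)"""
    return TRUE if a <= b else FALSE

def lt_assign(a, b):
    """Spin's  a <= b, meaning a := a < b; returns the new value of a."""
    return lt(a, b)

for i in (0, 7999, 8000, 8001):
    print(i, lt(i, 8000), le(i, 8000), lt_assign(i, 8000))
```

Only the row for 8000 differs between the "<" and "=<" columns, and the last column shows the left-hand variable being overwritten with -1 or 0.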


Post Edited (Paul Baker) : 3/22/2006 3:40:52 AM GMT

Phil Pilgrim (PhiPi) · 2006-03-22 04:42

Hey guys,

We must be using different versions of Spin, 'cuz I can't get it to fail (or else I just don't understand the problem):

CON

  _clkmode      = xtal1 + pll16x
  _xinfreq      = 5_000_000

VAR

  long  i, j

OBJ

  print : "print"

PUB start

  print.start(0, 0, -9600)
  print.cls

  i := 0
  print.f4(string("i = %4.1d:  i < 8000 = %2.1d,  i =< 8000 = %2.1d,  i == 8000 = %2.1d\r"), i, i < 8000, i =< 8000, i == 8000)
  print.f4(string("i = %4.1d:  i > 8000 = %2.1d,  i => 8000 = %2.1d,  i <> 8000 = %2.1d\r\r"), i, i > 8000, i => 8000, i <> 8000)

  repeat i from 7999 to 8001
    print.f4(string("i = %d:  i < 8000 = %2.1d,  i =< 8000 = %2.1d,  i == 8000 = %2.1d\r"), i, i < 8000, i =< 8000, i == 8000)
    print.f4(string("i = %d:  i > 8000 = %2.1d,  i => 8000 = %2.1d,  i <> 8000 = %2.1d\r\r"), i, i > 8000, i => 8000, i <> 8000)

  i := 0
  i <= 8000
  print.f2(string("i = %4.1d:  i <= 8000:  i = %2.1d\r"), 0, i)

  repeat j from 7999 to 8001
    i := j
    i <= 8000
    print.f2(string("i = %d:  i <= 8000:  i = %2.1d\r"), j, i)

This prints:

i =    0:  i < 8000 = -1,  i =< 8000 = -1,  i == 8000 =  0
i =    0:  i > 8000 =  0,  i => 8000 =  0,  i <> 8000 = -1

i = 7999:  i < 8000 = -1,  i =< 8000 = -1,  i == 8000 =  0
i = 7999:  i > 8000 =  0,  i => 8000 =  0,  i <> 8000 = -1

i = 8000:  i < 8000 =  0,  i =< 8000 = -1,  i == 8000 = -1
i = 8000:  i > 8000 =  0,  i => 8000 = -1,  i <> 8000 =  0

i = 8001:  i < 8000 =  0,  i =< 8000 =  0,  i == 8000 =  0
i = 8001:  i > 8000 = -1,  i => 8000 = -1,  i <> 8000 = -1

i =    0:  i <= 8000:  i = -1
i = 7999:  i <= 8000:  i = -1
i = 8000:  i <= 8000:  i =  0
i = 8001:  i <= 8000:  i =  0

It looks right to me.

-Phil

Phil Pilgrim (PhiPi) · 2006-03-22 05:47

It does? Well, the part about not exiting I got, since the Count++ always bumps the -1 from the compare back to zero before the next compare. But for the other cases, I get this instead:

  "=<" yields pattern, and "<" yields pattern - 1.

BUT, in the process, I learned something new about Spin:

  outa[16..23] displays the pattern backwards, and
  outa[23..16] displays it forward!
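Here is a Python sketch of how I understand the pin-range ordering: the first pin named in the range receives the most significant bit of the value, so swapping the endpoints reverses the pattern. `write_pins` is a hypothetical stand-in for an assignment to outa[first..last], and the mapping itself is my reading of the behavior described above, so treat it as an assumption.

```python
def write_pins(outa, first, last, value):
    """Model outa[first..last] := value; returns the new 32-bit outa."""
    n = abs(first - last) + 1
    step = 1 if first > last else -1
    for k in range(n):                 # k is the bit index in value, LSB first
        pin = last + step * k          # the LSB always lands on the 'last' pin
        bit = (value >> k) & 1
        outa = (outa & ~(1 << pin)) | (bit << pin)
    return outa

pattern = 0b11010000
print(f"{write_pins(0, 23, 16, pattern) >> 16:08b}")   # 11010000 (forward)
print(f"{write_pins(0, 16, 23, pattern) >> 16:08b}")   # 00001011 (reversed)
```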

What the ... ? I had no idea Spin would do that -- or even that a reverse ellipsis was allowed! That makes my day! (I really do need to print out the manual and read it cover to cover.)

Thanks, Beau!

-Phil

Jon Williams · 2006-03-29 02:54

Chip showed me a neat trick today.

The DS1620 returns its sign in bit8 of the 9-bit temperature, so using the sign extension operators (~~X or ~X) is not possible. I started with this:

  if (tempc & $00_00_01_00)
    tempc |= $FF_FF_FF_00

Here's the trick that Chip showed me that lets you extend the sign no matter what position it's in:

  tempc := tempc << 23 ~> 23

The value 23 is used for the sign bit in bit8 (31 - 8). The first shift moves the sign bit to bit31 of tempc; the second shifts the value back while padding new bits to the "left" of the value with the sign bit.
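The same shift-left, arithmetic-shift-right idea can be tried out in Python, with the 32-bit register width emulated by hand. This is a sketch; `sign_extend` is a name made up here, and the sample readings are arbitrary 9-bit values.

```python
def sign_extend(value, sign_bit):
    """Extend the sign found at bit 'sign_bit' across a 32-bit signed result."""
    shift = 31 - sign_bit
    x = (value << shift) & 0xFFFFFFFF    # like '<<' in a 32-bit register
    if x & 0x80000000:                   # reinterpret bit 31 as the sign
        x -= 1 << 32
    return x >> shift                    # Python's >> is arithmetic, like '~>'

print(sign_extend(0x0FF, 8))   # 255  (bit 8 clear, value unchanged)
print(sign_extend(0x1FF, 8))   # -1
print(sign_extend(0x1CE, 8))   # -50
```

The shift count is 31 minus the sign bit's position, so 23 for bit 8, matching the Spin line above.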

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Jon Williams
Applications Engineer, Parallax

Post Edited (Jon Williams (Parallax)) : 3/29/2006 7:53:34 AM GMT

Phil Pilgrim (PhiPi) · 2006-03-29 07:38

That's a neat trick! I'll include it in the next update.

-Phil

pjv · 2006-04-10 18:21

Hi All;

This tidbit is probably not useful or interesting to most, but just in case.......

I've been messing with my favourite...... high speed stuff, and I think I found a trap in high speed pulse detection.

The "waitpeq" (and presumably the "waitpne") instructions will not detect pulses as short as one clock cycle.

When running at 80 MHz, this means "waitpeq" will not respond to pulses as short as 12.5 nanoseconds.

My tests are with synchronous clocks, and I have not tested this with different clock phases as that is rather difficult for me to achieve.

Just thought some of y'all might like to know!

Cheers,

Peter (pjv)

SteveW · 2006-04-10 18:36

>my tests are with synchronous clocks, and I have not tested this with different clock phases as that is rather difficult for me to achieve.

Well, sort of... By the time you're playing with nanoseconds, light is conveniently slow. Say, 2*10^8m/s for a random bit of coax.
200 metres per microsecond, 0.2 metres per nanosecond. With 2.5 metres of cable that you're prepared to chop up (and some suitable termination resistors to keep things under control), the world's your oyster, just pick a phase. Go on, abuse those latches. See if you can goad them into metastability, too, while you're in there :)
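The arithmetic behind those numbers, as a small Python sketch. The 2*10^8 m/s figure is the ballpark velocity quoted above, not a measured value for any particular cable, and the function names are made up.

```python
def clock_period_ns(clkfreq_hz):
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / clkfreq_hz

def cable_length_m(delay_ns, velocity_m_per_s=2e8):
    """Cable length that delays a signal by delay_ns at the given velocity."""
    return velocity_m_per_s * delay_ns * 1e-9

period = clock_period_ns(80_000_000)
print(f"{period:.1f} ns per clock")                       # 12.5 ns per clock
print(f"{cable_length_m(period):.2f} m per full clock")   # 2.50 m per full clock
print(f"{cable_length_m(period / 4):.3f} m per quarter clock")
```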

Steve
