Strange assembly issue I can't wrap my brain around. — Parallax Forums

jhh Posts: 28
edited 2011-04-12 12:35 in Propeller 1
I'm having a strange issue that I hope someone can shed some light on. I'm not sure whether I'm dealing with a bug in my code, a bug in bst, or some erratum I'm unaware of.

The story goes like this: In the course of optimizing some code, I found something I'd done caused the code to no longer work as before. This was really odd, because I was removing something no longer used, and not even in the area of code that now seemed to be failing. But I could toggle my code back and forth and see that it was failing.

Something prompted me to try replacing that code with a NOP instead of removing it. It worked again. This was odd. It's not a timing-related matter since, again, the code in question isn't even the code that started to fail. I then wasted some time looking into it from the perspective of "I must be having alignment issues", but then realized that this can't be, because putting a NOP in or taking it out only ever adds or removes whole longs.

So then I tried moving the NOP around some. Moving it a short distance caused no problems, so I thought I might be able to use that to zero in on WHERE the issue really was. While it didn't help much in that regard, I found there was a clear "wall" that I could not move it past. I cannot for the life of me explain why it should affect anything in the vicinity of this code, which is some pretty simple bit testing/fiddling:
                        test    i2cSDAMask,ina     wc
                        nop         ' NOP(s) like to live here.
                        andn    outa,i2cSCLMask
                        or      dira,i2cSCLMask
                        or      dira,i2cSDAMask

I thought I'd put it to one side, since one long wasn't going to make or break me, and set about optimizing some more, but found I'd hit a different kind of wall: I had to replace anything I removed, one for one, with a NOP for the code to keep working.

But I can ALWAYS move that NOP outside the area of affected code path in the test case, which is just completely inexplicable to me. I got to the point there were 3 NOPs in a group, and moving any one of them ahead, or commenting any of them out, would cause the code to fail again.

When I got to 4, I found I could move them past that "wall" and found a new one just short of the end of the code in a DAT variable area.

The code I am seeing the effect in is attached here. It only runs on a C3 board by design right now.

It's a simple demo of SD: use option 1 first to mount the card, then 2 to get a directory of the root or 3 to hex dump sector zero. When the problem happens you will see "FileSystem Corrupted" as the mount command runs, and if you dump sector 0 at that point you will see much different output than when it mounts cleanly. As attached, it's in the working state. Mess with the NOPs and you will see what I am seeing here.

It's like the code wants to be a minimum size at those two points and refuses to go lower without failing.

The ONLY difference is a NOP or where it lives, and the spot where it must live has NOTHING to do, code-path wise, with the part of the demo that fails: the failing code path lives entirely BEFORE the "critical NOP-padding" point.

Anyone who could look and provide insights would do wonders for what remaining hair I have left. :)

Thanks for listening.

Added notes:

- this is code I wasn't ready to release yet, so it's a bit messy re: comments right now.
- the top level object is the c3io_demo_fatengine_sterm.spin
- unlike the real fatengine, the sdhc/fat32 support in this may be broken still, use a non SDHC card with FAT16 if trying it out.

c3io-nop-oddness.zip

Comments

  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2011-04-09 20:50
    attachment.php?attachmentid=78421&d=1297987572
  • Heater. Posts: 21,230
    edited 2011-04-09 21:12
    My bet is that you have a missing # somewhere, or one where it should not be.

    With that common mistake you can end up storing data into your code or jumping to some unexpected addresses.
    It's in the nature of things that sometimes, by chance, this can look like it works properly.
    But then, as you add and remove code, different places get overwritten or different addresses get jumped to, and weird failures start to show up.
    By adding and removing NOPs you are at least moving code around.
  • jhh Posts: 28
    edited 2011-04-09 21:41
    Thanks. Now you mention it, that seems pretty likely, as I had some muddied understanding around the # prefix early on. I'll review for that...

    Update: Yes, while I haven't quite nailed it all yet, I found one case, and fixing it has already reduced how many NOPs I am using... (or certainly changed the dynamics of the issue)
  • jhh Posts: 28
    edited 2011-04-10 19:33
    I think I've found all the bad hash symbol references that may exist, and squashed those. To confirm my understanding, hash symbols are used in assembly for:

    - referring to things in the CON block (what is the meaning of NOT using a hash symbol to refer to those in assembly?)
    - referring to the ADDRESS of a label (vs. the contents of the address at that label.)

    The last suspect I have is that it occurs to me I have a routine doing an RDLONG out of a buffer that is BYTE oriented from the SPIN routine... which has me again wondering if I have an alignment issue?

    What *is* the effect of doing long operations on a non-aligned hub address? Does it simply mask the lower 2 bits of the address and use that, or does it actually transfer from an unaligned area in some way?
  • potatohead Posts: 10,261
    edited 2011-04-10 20:19
    The octothorpe (yes, that's the name of the thing) means immediate, or direct addressing, where the value contained in the instruction is the argument, NOT a pointer to the memory address containing the value used as the argument.

    jmp 5 means jump to the location given by the value stored in memory location 5. Say that value is 10: the jump would go to location 10. This is an indirect jump. Most addressing on the Propeller is indirect like this! It is a memory-to-memory design, which makes things easy, particularly with the nice COG addressing in play. But there are a few cases, like jmp and self-modifying code, where one really can get caught on that octothorpe. (Happens to me all the time, and it's the first thing I look for.)

    jmp #5 means jump directly to memory location 5, where the value in the instruction itself is the argument.

    All HUB operations are aligned. For a long, the lower two bits are always zero, for a word, the lowest bit is zero, and bytes are atomic, all bits significant.
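
The rules described above can be sketched as a toy model. This is Python rather than PASM, it is not an emulator, and the function names (`source_value`, `jmp_target`, `hub_address`) are made up for illustration. It assumes the Propeller 1's 512-long cog RAM, a 9-bit source field, and a 16-bit hub address space.

```python
# Toy model of Propeller 1 operand handling; not an emulator.
# All names here are hypothetical and exist only for this illustration.
COG_SIZE = 512  # cog RAM holds 512 longs


def source_value(cog, src, immediate):
    """Value an instruction sees for its source field.

    immediate=True models '#src': the 9-bit literal itself.
    immediate=False models 'src': the contents of cog register src.
    """
    src &= 0x1FF
    return src if immediate else cog[src]


def jmp_target(cog, src, immediate):
    """Cog address a jmp lands on (only the low 9 bits are used)."""
    return source_value(cog, src, immediate) & 0x1FF


def hub_address(addr, size):
    """Effective hub address for rdbyte/rdword/rdlong (size 1, 2 or 4).

    The low address bits are ignored so the access is always aligned.
    """
    return addr & ~(size - 1) & 0xFFFF


cog = [0] * COG_SIZE
cog[5] = 10  # register 5 holds the value 10, as in the example above

print(jmp_target(cog, 5, immediate=True))   # models: jmp #5
print(jmp_target(cog, 5, immediate=False))  # models: jmp 5
print(hex(hub_address(0x1003, 4)))          # models: rdlong from an unaligned address
```

In this model `jmp #5` lands on 5, `jmp 5` lands on 10, and a long read from $1003 is quietly served from $1000. So a byte buffer handed over from Spin has to start on a long boundary if it is going to be read with rdlong.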
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2011-04-10 21:06
    potatohead wrote:
    The octothorpe (yes, that's the name of the thing) ...
    One of my customers/suppliers/clients has a PBX system that must've been obtained from the UK. One of the menu selections is prefaced (in American English) by:
    "If you don't know your party's three-digit extension, please press the pound sign for a company directory."

    But once you've done that and punched in the person's name, a nice lady's voice, with a very British accent comes on, and says:
    "You've entered John Doe at extension 123. If this selection is correct, please press the squayah sign."

    The what? It took me forever to realize that the "squayah sign" was the "square sign", i.e. the octothorpe.

    -Phil
  • jhh Posts: 28
    edited 2011-04-12 12:22
    At long last, I found it last night....
    :writeReadRepeat        cmp     ioCmd, #ioSpiRead   wz ' figure out what to send
                     if_z   mov     localData, $FF        ' for read-only, send a $FF
                     if_nz  rdbyte  localData,ioBufAdr     ' for write ops, send the data in buffer
    

    /facepalm

    And now that I've been through this I can see why I had to pad NOPs, and why the "wall" where I couldn't move them was where it was. (Just before address $FF...)

    Thanks to everyone who piped up with guidance on this...
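
Why the NOP padding mattered can be illustrated with a small model, again in Python rather than PASM. Everything in it is hypothetical: the 300-long "program" is invented data, not the code from the attachment, and `build_image` and `mov_source` are names made up for this sketch. The only facts it relies on are that `mov localData, $FF` without the hash reads cog register $FF, and that padding ahead of that address shifts which long ends up there.

```python
# Toy model: what 'mov localData, $FF' picks up with and without the '#'.
# Hypothetical data and names; this is not the code from the attachment.
NOP = 0x00000000  # stand-in for the long a nop occupies


def build_image(padding_nops, code_longs):
    """Lay out the code longs after some NOP padding, filling 512 cog longs."""
    image = [NOP] * padding_nops + list(code_longs)
    return image + [0] * (512 - len(image))


def mov_source(image, src, immediate):
    """'#src' yields the literal; 'src' yields the contents of cog[src]."""
    src &= 0x1FF
    return src if immediate else image[src]


# Invented program: 300 longs, each tagged with its original position.
code = [0xA0000000 + i for i in range(300)]

for nops in (0, 1, 2):
    image = build_image(nops, code)
    buggy = mov_source(image, 0xFF, immediate=False)  # mov localData, $FF
    fixed = mov_source(image, 0xFF, immediate=True)   # mov localData, #$FF
    print(nops, hex(buggy), hex(fixed))
```

With the hash missing, the value loaded changes every time a long is added or removed ahead of address $FF, while the immediate form always yields $FF. In PASM the fix is the one-character change to `mov localData, #$FF`.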
  • Heater. Posts: 21,230
    edited 2011-04-12 12:35
    Well done.

    Those hash errors are a pain. As you describe, you have a piece of code that appears to work, you add some new code, it fails. Naturally you are convinced the problem is in the new code not the "working" old code. So you keep on "not looking".

    If you use BST instead of the Propeller Tool, it issues warnings when you compile code with missing hashes. Of course, sometimes you don't want the hash, but it's a good reminder anyway. It's wise to comment the places where you really don't want a hash.