Strange problem with SX48 at 20Mhz — Parallax Forums


paolo Posts: 17
edited 2009-01-30 14:01 in General Discussion
Hello,

I have a very strange problem running the SX48BD at 20 MHz.
I'll try to explain what's happening even though my English is not perfect.

In the past, I had a similar problem, and I thought it was due to the hardware (e.g. clock path, Vcc noise, etc.).
Now I'm having the same problem again with a completely new board, and I cannot figure out what is causing it.
The problem is that the SX/B program "crashes" when it calls a specific subroutine from a specific address. If I insert some NOPs somewhere in the code, the program starts to work fine. The problem also disappears when I use a 50 MHz clock instead of 20 MHz.

I've written some sample code that simply calls a few subroutines located at the same memory addresses as in my original program. To do that, I've used the ADDRESS directive.
The program has a SUB_1 routine that prints "OK" on an RS232 line, and a SUB_2 routine that simply calls SUB_1.
If I use the debugger and execute the program step by step, it works as expected. If I run the program at full speed, it doesn't work.

For instance, if I set a breakpoint at the instruction "temp1 = 1" in SUB_2, when the breakpoint is hit I can see that the message "OK" has been sent over the serial port as expected.
If I then move the breakpoint to the instruction "temp1 = 2" and click RUN again, the breakpoint is never hit.
If I reset the program and set the breakpoint at the instruction "temp1 = 3" in the main program, the breakpoint is never hit.
If, instead, I set the breakpoint at the instruction "goto StopHere", the breakpoint is hit (I don't know how), but the content of the register temp1 is 1 and not 3 as expected.

Note that if I change the directive "ADDRESS $83E" to "ADDRESS $83D", the program works as expected.
Changing the clock frequency from 20 MHz to 50 MHz also makes the program work fine.

Could anybody try this code on the SX48 prototype board? If it works fine there, then the problem is indeed due to my hardware.

Thank you.
Paolo.

Here is the code:

DEVICE SX48, OSCHS2
FREQ 20_000_000

SIn var rb.3 'RS232 RX
SOut var rb.5 'RS232 TX

temp1 var byte

PROGRAM Start
ADDRESS $43
SUB_1 SUB 0
SUB_2 SUB 0
TX_BYTE SUB 1

Start:
SUB_1
SUB_2
temp1 = 3
StopHere:
goto StopHere

ADDRESS $83E
SUB SUB_2
temp1 = 1
SUB_1
temp1 = 2
ENDSUB

SUB SUB_1
TX_BYTE "O"
TX_BYTE "K"
ENDSUB

SUB TX_BYTE
temp1 = __PARAM1
SEROUT SOut, T9600, temp1
ENDSUB

Post Edited (paolo) : 1/22/2009 11:55:36 AM GMT

Comments

  • Bean Posts: 8,129
    edited 2009-01-22 12:22
    Paolo,
    Try changing OSCHS2 to something lower (HS1 or XT2).
    See if that fixes it.

    I will see if I can find an SX48 protoboard and try it when I get a chance.

    Bean.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    ·The next time you need a hero don't look up in the sky...Look in the mirror.


  • paolo Posts: 17
    edited 2009-01-22 12:49
    Hi Bean,
    I've tried but with no luck.
    Note that the problem is present whether I use a ceramic resonator or the SX-Key as the clock source.

    Paolo
  • Bean Posts: 8,129
    edited 2009-01-22 16:12
    Paolo,
    I tried it, and it does what you say.

    Very strange. I'm thinking it has to do with the debugger. I took the SRC code generated at 50MHz, and just changed the frequency and it wouldn't work.

    Hopefully Peter will chime in if he has any thoughts.

    Bean.

  • PJMonty Posts: 983
    edited 2009-01-22 16:59
    Bean,

    If the problem occurs when the board is run with either a ceramic resonator or the SX-Key generating the clock, how can it be a problem with the debugger?

    Thanks,
    PeterM
  • paolo Posts: 17
    edited 2009-01-22 17:10
    Well, the good news is that my new four-layer PCB is OK :)

    At first I also thought that the problem was due to the debugger, but the program "crashes" even without the SX-Key!
    I suspect that the chip doesn't like frequencies lower than 40 or 50 MHz!
  • Bean Posts: 8,129
    edited 2009-01-22 18:12
    Okay, I'm stumped...

    Here is the simplest program with which I can still cause the problem.

    Help anyone ???

    The problem seems to be caused by CALLs located at specific address locations? But I don't know why the frequency makes a difference, though. Very weird...
    ' PROBLEM PROGRAM
    ' If you use RUN RB.0 never goes high
    ' IF you use DEBUG->RUN it never breaks
    ' IF you use DEBUG->WALK it does break, and RB.0 goes high
    ' In the comments below "working" means that it does break and RB.0 does go high when using DEBUG->RUN
    '
    DEVICE SX48, OSCXT1
    FREQ 32_000_000 ' 32 doesn't work, 33 works sometimes by hitting RESET then RUN; 34 works always
     
    PROGRAM Start NOSTARTUP
     
    ADDRESS $43
    
    ' If you re-arrange the order of the SUB definitions, it starts working ???
    SUB_1   SUB 0 
    SUB_2   SUB 0
    TX_BYTE SUB 0
     
    Start: 
      LOW RB.0
      SUB_2
    END
     
    ADDRESS $840
     
    SUB SUB_2
      SUB_1
      BREAK
      RB.0 = 1
    ENDSUB
     
    SUB SUB_1
      \NOP   ' If you remove this instruction, it starts working ???
      TX_BYTE
    ENDSUB
     
    SUB TX_BYTE
    ENDSUB
    


    Bean.

  • pjv Posts: 1,903
    edited 2009-01-23 04:12
    WOW, this IS bizarre!

    I confirm the findings, and can add some more observations.

    The problem is similar with the serial (F series) key and IDE V3.2.3 as with the new USB Key and 3.3.92 Beta.

    Relocating the ADDRESS $840 anywhere to $83F continues to make it crash. Beyond that or below that it seems fine.

    Changing port bits makes no difference.

    Driven with the USB Key, all oscillator amplifier settings LP through HS3 behave similarly. Strangely enough, the waveform of OSC1 does not change appreciably through this range of settings, whereas I had expected it to. Running a spectrum analysis, the spectrum and "birdies" also do not change.

    I suspected the classic "consecutive I/O" read after write problem because SX/B does a final port write in a LOW or HIGH instruction, but that seems not the case as I pulled the port bit pretty hard in each direction, and the behaviour remained the same.

    I also noted that the new USB key is VERY lumpy in the frequencies it can generate. Stepping the synthesizer in 0.01 MHz increments in the RUN menu, the frequency does not change for many steps, and then it suddenly jumps a bunch. For example, stepping from 14.33 MHz to 14.34 MHz, the measured frequency changed from 14.0 to 14.6 MHz. That's about 4%, which is pretty ugly, as we know that the SX clock will mess up when a sudden jerk is applied to the osc input. Also, the old key's waveform observed out of OSC1 is much, much quieter.

    Anyhow, enough ranting... I hope someone can figure out what is going on here. Very disconcerting for those who thought the silicon was rock solid.

    And maybe it is, but a good explanation would be great.

    Will spend more time on this later.

    Cheers,

    Peter (pjv)
  • paolo Posts: 17
    edited 2009-01-23 09:11
    Hi,

    I've been living with this problem for almost a year (see my post "jump to the wrong page" http://forums.parallax.com/forums/default.aspx?f=7&m=257881).
    After spending days (and nights) trying to find the bug in my program, I came to the conclusion that my PCB layout was causing this strange behavior, so I designed a new PCB with separate ground and power plane layers and lots of decoupling capacitors.

    From my experience, I can say that the problem is caused by jumps across page boundaries. This is very common in SX/B, since all the subroutines are defined in the first page.
    My workaround has been to change the location of the calling (or called) subroutine.

    Now I'm just curious to know the reason for this behavior.

    Thanks,
    Paolo
  • Bean Posts: 8,129
    edited 2009-01-23 12:39
    PJMonty,
    I didn't realize the problem occurred when used stand-alone. I thought it only happened when debugging.

    PJV,
    I'm glad someone else was able to duplicate the problem.

    It must be pretty hard to get it to happen, because I don't recall ever running into this. But it would be nice if someone who knows the internals could take a peek and see if there is a solution, or at least a method to avoid this. It kind of looks like a subroutine stack overflow, but it uses only 3 levels, so I don't see how that would be it.


    Bean.

  • Bean Posts: 8,129
    edited 2009-01-23 14:19
    Okay here is another program that more clearly shows the problem.

    ' PROBLEM PROGRAM 2
    ' PAGE instruction does not set STATUS.PAx bits when certain addresses are used, and ONLY at lower frequencies ???
    '   BUT if you use DEBUG->WALK the code works every time.
    '
    ' Set FREQ to 30_000_000. 
    '   Use DEBUG->RUN. When breakpoint is hit you will see that the PAx bits in STATUS are NOT set correctly. 
    '   If you use STEP, you will see that execution is at address $000.
    '   Hit RESET, then WALK, now at breakpoint the PAx bits ARE set correctly.
    '   If you use STEP, you will see that execution is at address $600.
    '
    ' Set FREQ to 35_000_000.
    '   Use DEBUG->RUN. When breakpoint is hit you will see that the PAx bits in STATUS ARE set correctly.
    '
    ' Set FREQ to 30_000_000
    '    Use RUN->RUN to program SX48. Note that RB.1 does NOT go high when reset button on SX board is pressed.
    '    Use RUN->CLOCK to change the clock to 35MHz. Note that RB.1 DOES go high when reset button on SX board is pressed.
    '      The problem is NOT in the debugger. It happens in run mode too.
    '
    DEVICE SX48, OSCXT1
    FREQ 30_000_000 ' 30 and below shows problem; 35 and above problem is not there
    
    PROGRAM Start NOSTARTUP
    
    ADDRESS $43
    
    BREAK
    SUB_1   SUB 0 ' This sub is at address $600, but jumps to $000.
                  ' Even though a PAGE instruction is there the PAx bits don't get set ???
                  ' If you use DEBUG->WALK, or a clock above 35MHz, it works as it should
    
    SUB_2   SUB 0 
    
    Start: 
      LOW RB
      SUB_2
    END
    
    ADDRESS $600
    
    SUB SUB_1
      RB.1 = 1
    ENDSUB
    
    ADDRESS $83F
    
    SUB SUB_2
      RB.0 = 1
      SUB_1
      RB.2 = 1
    ENDSUB
    



    Comments welcome...

    Bean.


  • Peter Verkaik Posts: 3,956
    edited 2009-01-23 15:48
    Bean,
    The BREAK before
    SUB_1 SUB 0
    is located at address $43, as is the code for PAGE instruction from
    JMP @SUB_1
        68  =00000043         BREAK                          ;BREAK
        69                  
        70  =00000043       __SUB_1:                         ;SUB_1   SUB 0 ' This sub is at address $600, but jumps to $000.
        71  0043  0013        JMP @SUB_1                    
            0044  0A00
    

    Doesn't BREAK need a storage location?
    (i.e., shouldn't the JMP @SUB_1 be at $44?)
    If BREAK overwrites the PAGE, the jump to $600 could become a jump to $000.

    regards peter
  • pjv Posts: 1,903
    edited 2009-01-23 16:27
    @Peter V.

    I believe BREAK has its own internal register and does not consume a spot in memory. It just does a continuous 12-bit address compare with the program counter, and halts execution on a match.

    Cheers,

    Peter (pjv)
  • Bean Posts: 8,129
    edited 2009-01-23 16:29
    Here is another program. It shows that what's happening is that the first instruction is not being executed properly.

    DEVICE SX48, OSCXT1
    FREQ 30_000_000 
    RESET Start
     
    Start:
      MOV W,#$1F                    
      MOV M,W
      MOV !RB,#0                    
      CLR RB                        
      CALL @__SUB_2
      ; If everything went right, RB.0, RB.1, RB.2 and RB.7 should be high
      JMP @$
     
    ORG $43
    __SUB_1:                        
      BREAK
      ; It seems this first instruction is not executed properly
      SETB RB.7  ; If this comes BEFORE "PAGE SUB_1", then RB.7 will NOT get set, but execution will flow as it should.
      PAGE SUB_1 ; If this comes BEFORE "SETB RB.7", then RB.7 WILL get set, but execution will NOT flow as it should
      JMP SUB_1                    
     
    __SUB_2:
      JMP @SUB_2                    
    
    ORG $600
    
    SUB_1:
      SETB RB.1
      RETP
                                    
    ORG $83F
    SUB_2:
      SETB RB.0
      CALL @__SUB_1
      SETB RB.2
      RETP
    

    Bean


  • paolo Posts: 17
    edited 2009-01-23 17:29
    Bean, this can explain why the PAGE instruction is not executed after an SX/B function or subroutine call. In fact, the PAGE instruction is always the first instruction of the subroutine in the SX/B compiled code.
    This is probably due to the pre-fetch cycles of the internal multi-stage pipeline, but only someone who knows the internal architecture of the chip can give us an answer.

    Paolo.
  • Bean Posts: 8,129
    edited 2009-01-23 17:35
    Paolo,
    Yes, I'm thinking that I will put NOPs as the first instruction for defining SUBs and FUNCs to get around this problem.

    So that "MySub SUB 0" will generate:

    __MySub:
    NOP
    JMP @MySub

    I think that will solve the problem, but will use 1 extra instruction per SUB/FUNC declaration. A small price to pay in my opinion.

    Bean.

  • RobotWorkshop Posts: 2,307
    edited 2009-01-23 18:10
    Is this just going to be rolled into the latest SX/B 2.0 or will there be an interim upgrade to address this in the current build?

    Robert
  • Peter Verkaik Posts: 3,956
    edited 2009-01-23 18:21
    Bean,

    If this can only happen when using BREAK, why not insert a NOP after BREAK?

    regards peter
  • Bean Posts: 8,129
    edited 2009-01-23 18:57
    Okay the problem seems to happen anytime you do a JMP or CALL from an address above $7FF directly to an address that is $7FE less than where you jumped from.
    For example if you jump from $87E to $080, the instruction at $080 is not executed. I'm sure this has something to do with the way the pipeline is cleared during a jump/call.
    Here is yet another program that shows the problem.

    DEVICE SX48, OSCXT1
    FREQ 30_000_000 
     
    RESET Start
     
    Start:
      MOV W,#$00
      JMP @DoIt
     
    ORG $080 ; You can change this to whatever address you want.
    FirstPage:
      BREAK
      ; It seems this first instruction is not executed properly
      MOV W,#$55
      XOR W,#$FF
      JMP @$
     
                                    
    ORG FirstPage + $7FD
    DoIt:
      PAGE FirstPage
      JMP FirstPage
    

    Bean
    P.S. Peter the problem happens during normal run too. Not just when debugging or when using BREAK.
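
The $7FE distance described in this post can be checked against the failing programs posted earlier in the thread. The following is a minimal Python sketch, not part of the original posts. It assumes that `CALL @label` and `JMP @label` each assemble to a one-word PAGE followed by a one-word CALL or JMP, which matches the listing above where `JMP @SUB_1` occupies $043 and $044. The helper name `far_jump_address` is hypothetical.

```python
BACKWARD_OFFSET = 0x7FE  # distance reported in the thread for the failing backward jump


def far_jump_address(org: int, words_before: int) -> int:
    """Address of the JMP/CALL word of a `JMP @label` / `CALL @label` pair.

    org          -- address given to ORG/ADDRESS for the routine
    words_before -- instruction words that precede the PAGE word in the routine
    Assumes one word for PAGE, immediately followed by the JMP/CALL word.
    """
    page_address = org + words_before
    return page_address + 1


# Bean's assembly test: ORG $83F, one SETB, then CALL @__SUB_1 (stub at $043).
call_address = far_jump_address(0x83F, words_before=1)
print(hex(call_address), hex(call_address - BACKWARD_OFFSET))

# Bean's minimal test: ORG FirstPage + $7FD with FirstPage at $080, PAGE comes first.
jmp_address = far_jump_address(0x080 + 0x7FD, words_before=0)
print(hex(jmp_address), hex(jmp_address - BACKWARD_OFFSET))
```

Under these assumptions, the computed targets are $043 and $080, the addresses whose first instruction was reported as not executing in those two tests.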


  • Bean Posts: 8,129
    edited 2009-01-23 19:15
    Same problem happens going the other way too.
    DEVICE SX48, OSCXT1
    FREQ 30_000_000 
     
    FarPage = $880 ; You can change this to whatever address you want.
    RESET Start
     
    Start:
      MOV W,#$00
      JMP @DoIt
     
    ORG FarPage - $803
    DoIt:
      PAGE FarPage
      JMP FarPage
    
    ORG FarPage
    FarPage:
      BREAK
      MOV W,#$55   ; It seems this first instruction is not executed properly
      XOR W,#$FF
      JMP @$
    

    If my results are correct, the problem is pretty unlikely to happen, as the location jumped from must be exactly the right number of addresses away from the location being jumped to.

    I don't think adding the NOP would be worth it. The problem could happen at ANY jump or call, although it would be a rare occurrence.

    PJMonty, maybe you could add a catch for this in the SX-Key assembler? Throw an error or warning on any CALL @label or JMP @label where the @label address has the exact offset that causes the problem.

    Bean.






    Post Edited (Bean (Hitt Consulting)) : 1/23/2009 7:25:06 PM GMT
  • PJMonty Posts: 983
    edited 2009-01-23 20:37
    Bean,

    Possibly, but it would need to be approved and paid for by Parallax. Just so I'm clear, you're saying the following:

    If (current address > $0800) and (@label address == (current address - $07FE)) then
      warn user
    



    I only say "possibly" since there are many things about the design of SASM (I didn't create it, I just maintain it) that often make "simple" changes deceptively hard.

    Thanks,
    PeterM
  • Bean Posts: 8,129
    edited 2009-01-23 21:31
    Peter,

    If (@label address == (current address - $07FE)) or (@label address == (current address + $0803)) Then
      warn user
    

    I don't think it is a really big deal, but I think users should be aware that it CAN happen.
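
The warning condition discussed above could be sketched as follows. This is an illustrative Python sketch, not SASM code, and the function names and the (page_address, target) input format are hypothetical. It assumes the backward offset ($7FE) is measured from the JMP/CALL word, as in the "$87E to $080" example, and the forward offset ($803) from the PAGE word that precedes it, as in the `ORG FarPage - $803` test. The thread does not state which base address the pseudocode intends, and both offsets are empirical values from these tests rather than documented chip behavior.

```python
BACKWARD_OFFSET = 0x7FE  # target is this far below the JMP/CALL word
FORWARD_OFFSET = 0x803   # target is this far above the preceding PAGE word


def is_risky_far_jump(page_address: int, target: int) -> bool:
    """Return True if a `JMP @label` / `CALL @label` matches the reported offsets.

    page_address -- address of the PAGE word; the JMP/CALL word is assumed
                    to sit at page_address + 1
    target       -- address of the label being jumped to or called
    """
    jump_address = page_address + 1
    return (target == jump_address - BACKWARD_OFFSET
            or target == page_address + FORWARD_OFFSET)


def find_risky_far_jumps(far_jumps):
    """Filter an iterable of (page_address, target) pairs down to the risky ones."""
    return [(page, target) for page, target in far_jumps
            if is_risky_far_jump(page, target)]


# The two test programs from the thread, plus one harmless jump.
far_jumps = [(0x87D, 0x080), (0x07D, 0x880), (0x100, 0x600)]
for page, target in find_risky_far_jumps(far_jumps):
    print(f"warning: far jump at ${page + 1:03X} to ${target:03X}")
```

With these inputs, the sketch flags the jumps at $87E and $07E and leaves the third one alone. A real check inside the assembler would need to run after all label addresses are resolved.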

    P.S. I still get the error "List index out of bounds (0)" on both of my machines when I try to print from version 3.2.92h. Just so you know.

    Bean.



  • PJMonty Posts: 983
    edited 2009-01-24 00:59
    Bean,

    A couple of posts back you wrote:

    ...anytime you do a JMP or CALL from an address above $7FF...

    This isn't referenced in your conditions. Is doing this from an address above $7FF part of the issue or not? If so, then shouldn't it be:

    if (current address > $07FF) then
      If (@label address == (current address - $07FE)) or (@label address == (current address + $0803)) Then
        warn user
    



    Thanks,
    PeterM
  • Bean Posts: 8,129
    edited 2009-01-24 02:44
    Peter,
    That was before I realized the problem occurs in either direction.

    Bean.

  • paolo Posts: 17
    edited 2009-01-24 17:28
    Bean,
    some posts back you wrote:
    If my results are correct, the problem is pretty unlikely to happen.
    Well, in this case it seems that I'm completely "out of luck"!
    In the last year I've written three different firmware programs for three different industrial applications (robotics, disc autoloaders and so on). Each program uses almost all the available flash memory, and this is probably why I often ran into this problem.

    I think that adding a catch in the SX-Assembler is a good solution, but up to now we have only empirical information about the possible workaround. We cannot be sure that only a jump of $7FF words can cause the error.
    I think that somebody at Parallax should analyze the hardware of the chip, find the origin of the problem and publish an official document that says something like: "...If you use a clock frequency lower than 35MHz, you cannot jump to a memory location nnn words above or below the location of the jump instruction..."

    I like the SX-48; it is an excellent processor (I'm going to buy another 300 pcs. for my latest project), but a deterministic approach is important in order to use this chip in the most reliable way possible.

    Thanks,
    Paolo

    First Murphy's law: "If anything can go wrong, it will"

    Post Edited (paolo) : 1/24/2009 6:59:38 PM GMT
  • RickB Posts: 395
    edited 2009-01-29 06:34
    Is this being looked at by the Gurus at Parallax?

    Rick
  • william chan Posts: 1,326
    edited 2009-01-29 07:32
    Does this problem also occur on SX20 and SX28?

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    www.fd.com.my
    www.mercedes.com.my
  • PJMonty Posts: 983
    edited 2009-01-29 19:34
    William,

    Good question. Try it out and let us know what you find.

    Thanks,
    PeterM
  • Zoot Posts: 2,227
    edited 2009-01-30 13:48
    Bean said...
    Okay the problem seems to happen anytime you do a JMP or CALL from an address above $7FF directly to an address that is $7FE less than where you jumped from.

    But on an SX20/28 you can't have an instruction at an address past $7FE, and if you wanted a PAGE/JMP at the highest registers, those instructions would have to be at $7FD and $7FE.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    When the going gets weird, the weird turn pro. -- HST

    1uffakind.com/robots/povBitMapBuilder.php
    1uffakind.com/robots/resistorLadder.php
  • Bean Posts: 8,129
    edited 2009-01-30 14:01
    Yeah, the problem cannot occur on the SX20/28 because the address space isn't big enough.

    Bean.
