Self modifying code spacer instruction requirement (SETS/SETD) — Parallax Forums


ozpropdev Posts: 2,793
edited 2015-11-19 01:31 in Propeller 2
Formerly titled: FPGA platforms exhibit different behaviour (SETS issue)
Hi All

I've been playing around with some video stuff and have found some weird differences
between the various FPGA builds.

I have built an NTSC driver for the Nano that has minimal impact on HUB usage.
The font data is stored in cog, so all that is required is a ~800-byte text buffer.

This driver works differently on the Nano/DE2 platforms compared to the P123-A7 platform.

On the Nano/DE2 build: (see ntsc_nano.jpg)
The characters are displayed in the incorrect position (offset by 1 - see picture).
The first character displayed is actually the last character in the line buffer.
There is also a random artifact appearing on the first character.
Otherwise the picture is stable and clear.

On the P123-A7 build: (see ntsc_p123.jpg)
The characters are displayed in the correct position but several positions are
displaying the wrong characters.
The image has lots of random artifacts, which can be cleaned up by inserting a single NOP
in the appropriate spot (see code).

I'm guessing there might be some compile/timing differences between Cyclone IV and Cyclone V.
I suspected my code (and still do) but getting two totally different results on different
platforms has got the old brain foggy again.

Any ideas?

P.S. You may notice a row of '*' and '?' missing.
These are flashing characters and I mistimed taking the picture. :)

[Attached: two images, 1056 x 592 each]

Comments

  • LoopyByteloose Posts: 12,537
    edited 2015-11-15 10:57
    Quartus II is a rather huge application. And the Cyclone IV and Cyclone V do indeed have generational differences that may explain some of this.

    At the core of the question is "What exactly is Altera providing?". Altera is unlikely to enter into any discussion unless you are a business and have at least the potential to buy a licensed version of Quartus II for about $3000 USD.

    So the community of free users is pretty much left in the dark and can only speculate.

    So, it pretty much leaves Parallax to work it out and communicate with Altera. We might come up with a satisfactory explanation here, but I wouldn't waste my time trying to engage Altera in a reply.
  • Rayman Posts: 14,768
    I'd wonder if it's the DACs... Maybe different s (scale) values would get you the same output?
  • Loopy
    I never expected to get Altera involved at all!
    I've always found the Parallax community way more helpful. :)
  • @Rayman
    Incorrect signal levels still don't explain random bytes though. :(
  • rjo__ Posts: 2,114
    I'm getting a correct display on P123-A7
    by inserting an extra NOP in .nxtc... so two NOPs there...
    above in .loop1, I can remove the NOP... leave it as it is or have two NOPs and display is still good.
  • rjo__ Posts: 2,114
    [Attached image, 1920 x 1080]
  • I can play with this today. I have the different boards to test on.

    Really need to identify problems like this so we chase after the correct bugs in the FPGA image.
  • rjo__ Posts: 2,114
    edited 2015-11-15 13:58
    So... it looks like the issue is in the rep block?

    SETS?
    mov pixels, 0-0?

    I'm guessing SETS inside the REP is being delayed.
  • rjo__ Posts: 2,114
    I seem to remember that there is a difference in how the cyclone V handles Verilog immediate assignments... In the IV, if you make conflicting assignments, the last one wins and there is nothing to flag the conflict. On the V, it throws a message or error... can't remember which.

  • Rayman Posts: 14,768
    Are there some instructions that aren't supposed to be in rep loop?
  • rjo__ Posts: 2,114
    I think Chip posted something about that... can't exactly remember... but that could be the answer.
  • rjo__ Posts: 2,114
    I only have the P123-A7 available. I am really interested in the fact that the behavior seems to vary depending upon the cyclone variant... and whether this points to something in the Verilog handling by Quartus. I know that I had code that wasn't good form and it got pointed at by Quartus when I ran it for the A7... the problem is that I can't remember exactly what happened:)
  • Wow! My DE2 output looks identical to your DE0 output.

    My P123 is garbage. It has artifacts everywhere. The blinking dots and ?'s are good but everything else is illegible. One NOP or 2 NOP makes no difference. In fact, my most stable P123 display is without any NOP.

    For reference, Potatohead's NTSC text driver is rock solid on both DE2 and P123.

    Nothing jumps out at me yet, but it's kind of scary that things are that different!
  • Both my Nano and DE2 produce the same output and the P123-A7 is different (worse).
    I had earlier suspected the REP block but changing it to a DJNZ loop made no difference.
    I will keep chipping away at it, pardon the pun :)

  • Eureka!
    It appears the problem is that SETS requires two NOPs of spacing, not one!
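    		sets	.get_pixels,asx		'write asx into the S field of .get_pixels
    		nop				'spacer 1
    		nop				'spacer 2
    .get_pixels	mov	pixels,0-0		'0-0 is the placeholder patched by SETS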
    

    All tests so far on all three platforms now are correct and stable.
    Will test on some more monitors later tonight. :)

  • Heater. Posts: 21,230
    Whilst that extra NOP is a solution to get your code working, I would say that it is not the solution but a workaround.

    We still don't know the root cause of the different behaviour between Nano/DE2 and P123-A7 builds.

    Such a thing is very suspicious and we should get to the bottom of it. I guess only Chip is in a position to do that. Hope this is reported as a bug somewhere noticeable.
  • jmg Posts: 15,175
    ozpropdev wrote: »
    Eureka!
    It appears the problem is that SETS requires two NOPs of spacing, not one!
    ...
    All tests so far on all three platforms now are correct and stable.
    So both sets of output change, and converge on a third, correct display?
    I suppose some marginal timing (multicycle or race?) could give different ultimate failures on differing processes.
    If this passed all timing checks, there would seem to be good reason to be nervous that more may be hiding...
  • ozpropdev Posts: 2,793
    edited 2015-11-16 13:26
    Using ALTI instead of SETS works fine on Nano/DE2-115 and P123-A7.
    		alti	asx,#%000_000_100		
    .get_pixels	mov	pixels,0-0
    
    Hope that gives you a clue Chip. :)
  • SETS and SETD modify a field of any instruction, but if that instruction is already in the pipeline, the change will miss it.

    The instructions ALTI, ALTR, ALTD and ALTS modify the NEXT instruction. This is internal to the pipeline stages.
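
To picture the point above, here is a toy Python model. It is only a sketch under stated assumptions, not P2 hardware behaviour: it assumes each instruction is read from memory a fixed number of instruction slots before it executes, and that a SETS write lands at the end of its own slot. The names `run`, `patched_value` and `fetch_lead` are made up for illustration, and the real fetch lead is not established in this thread.

```python
def run(program, fetch_lead):
    """Toy model: the instruction executed in slot i was read from memory
    at the start of slot i - fetch_lead, before that slot's write lands.
    fetch_lead is a hypothetical parameter, not a documented P2 figure."""
    mem = list(program)              # instruction memory, one tuple per instruction
    fetched = {}                     # copies already sitting in the pipeline
    regs = {}
    n = len(mem)

    # Instructions whose fetch time falls before slot 0 are read up front.
    for i in range(min(fetch_lead, n)):
        fetched[i] = mem[i]

    for slot in range(n):
        # Fetch the instruction that will execute fetch_lead slots from now.
        ahead = slot + fetch_lead
        if ahead < n:
            fetched[ahead] = mem[ahead]

        op = fetched[slot]           # execute the pipeline copy, not memory
        if op[0] == "sets":          # ("sets", target_index, new_source)
            _, target, new_src = op
            name, dest, _old_src = mem[target]
            mem[target] = (name, dest, new_src)
        elif op[0] == "mov":         # ("mov", dest_name, source_value)
            regs[op[1]] = op[2]
        # ("nop",) does nothing

    return regs


def patched_value(spacers, fetch_lead):
    """Build: sets / <spacers x nop> / mov pixels,0-0 and return pixels."""
    target = 1 + spacers
    program = [("sets", target, 0x41)] + [("nop",)] * spacers + [("mov", "pixels", 0)]
    return run(program, fetch_lead)["pixels"]
```

In this model a patch is seen only when the number of spacers is at least `fetch_lead`. With `fetch_lead = 2`, one spacer executes the stale `0-0` source and two spacers pick up the patched value, which matches the symptom reported here. An ALTx-style modification would not appear in this model at all, since it is applied inside the pipeline rather than written back to memory.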
  • Rayman Posts: 14,768
    ALTI is a horrible name for that instruction...
    Why isn't it called "ALTDS" ?

  • cgracey Posts: 14,209
    Rayman wrote: »
    ALTI is a horrible name for that instruction...
    Why isn't it called "ALTDS" ?

    Because it can modify the whole instruction, not just D and S.
  • cgracey Posts: 14,209
    So, is it definite that two NOPs after SETS does the trick and takes all bad behavior away?
  • Rayman Posts: 14,768
    edited 2015-11-16 23:46
    I was looking at Jakacki's explanation for ALTI and didn't see that...
    Actually, I don't see how ALTI could modify the whole instruction.
    Guess I have to read the docs again.

    I still don't like the name though.

    Unless "SETI" also modifies I, D and S that is...
  • So from this we arrive at:
    SELF MODIFYING CODE RULE #1 - Two spacer instructions are required between a SETS/SETD and the instruction it modifies, so that the pipeline fetches the modified instruction.
    And there are distinct differences in timing between Cyclone IV and V.




  • ALTISD, ALTDSI, ALTINS?

    ALTINST?

    Just typing these out shows a little ambiguity or difficulty in reading to me.

    I personally like ALTISD the best.

    Alter Instruction, Source, Destination.
  • ozpropdev Posts: 2,793
    edited 2015-11-17 00:00
    Thinking some more....
    If the pipeline is 2-stage, then surely one spacer instruction is enough?
    The results from the P123-A7 (Cyclone V) were ~95% correct with one spacer.
    I feel there may still be something else going on here.
    Perhaps Chip can shed some light on this.

  • ozpropdev Posts: 2,793
    edited 2015-11-17 03:34
    cgracey wrote: »
    So, is it definite that two NOPs after SETS does the trick and takes all bad behavior away?
    That's correct Chip


  • Yeah, like a signal arriving just a little late...
  • Ariba Posts: 2,690
    edited 2015-11-17 00:06
    You can use ALTS here:
    		alts	asx,#0		
    .get_pixels	mov	pixels,0-0
    
    Perhaps ALTI can be renamed to ALTNEXT or ALTNXT, so it does not suggest a specific field.

    Regarding differences between Cyclone 4 and 5:
    Maybe the dual-ported RAM behaves differently on read-during-write. When the same address is written and read at the same time, some FPGAs return the new value being written and some return the old value.

    Andy
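
The read-during-write idea above can be pictured with a small Python sketch. The class name `DualPortRam`, the mode labels `old_data` / `new_data` and the single `clock` method are illustrative assumptions, not vendor primitives, and nothing in this thread establishes which behaviour either FPGA build actually has.

```python
class DualPortRam:
    """Toy dual-port RAM: one write port and one read port per clock.
    read_during_write selects what the read port returns when both
    ports hit the same address in the same clock (hypothetical labels)."""

    def __init__(self, size, read_during_write):
        if read_during_write not in ("old_data", "new_data"):
            raise ValueError("read_during_write must be 'old_data' or 'new_data'")
        self.cells = [0] * size
        self.mode = read_during_write

    def clock(self, write_addr, write_data, read_addr):
        """Perform one write and one read in the same clock.
        Returns the value seen on the read port."""
        old = self.cells[read_addr]          # contents before this clock's write
        self.cells[write_addr] = write_data  # the write always lands
        if read_addr == write_addr and self.mode == "new_data":
            return write_data                # write-through: reader sees new value
        return old                           # otherwise reader sees prior contents
```

If instruction fetch and a SETS write collide on the same cog address, an `old_data` memory hands the pipeline the unpatched instruction while a `new_data` memory hands it the patched one, so the same source could need a different number of spacers on each build. That is only one possible explanation, offered here as a speculation to be checked.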
  • That is a significant difference!