Preditor - Speed Help - PASM to the Rescue!!!

CassLanCassLan Posts: 586
edited 2010-01-03 01:51 in Propeller 1
This is the routine that fills the DisplayBuffer with exactly what should appear in the text editing window (i.e. whitespace is stored as " ").
It takes the address of the start of a line, as well as the horizontal offset, and fills buffer[].
It works well but is slow. Granted, this does the whole editing window, while it probably only needs to do one line at a time while editing, but I thought I would ask some people who are more experienced than myself (most of you) if you see anything that would shave some clocks off this.
PUB DisplayBufferFill(addr,horizoffset)
' Fills the Display Buffer for the editing window as it will display on screen
  counter :=  0           'index of Buffer array
  counter2 := 0           'index of DisplayBuffer
  counter4 := 0           'current cursor location on that line
  counter5 := 0           'horizoffset counter
  counter6 := 0           'horizontal EOL Flag
'
' Lets start by determining what we COULD be running into
' With Horizontal Offset of 0
' 1) A single Character..followed by more characters
' 2) A single Character..followed by an EOL (Carriage Return)
' 3) An EOL..followed by more characters
' 4) An EOL..followed by another EOL
'
' With a Horizontal Offset of some value
' 5) Skipping over Characters..to Display a single character followed by more characters
' 6) Skipping over Characters and EOL .. to Display nothing on that line
' 7) Skipping over EOL..to display nothing on that line
'
  repeat until counter2 == DisplayBufferSize             ' do this until we have filled out display buffer
     counter6 := 0  'reset horiz EOL Flag
     If HorizOffset <> 0 and counter4 == 0
        'we have a horizoffset value to consider and we are at the beginning of the line
        'We are skipping over characters..we need to check what they are
        counter5:=0
        repeat until counter5 == horizoffset ' do this for every horizontal offset value (character we are skipping)
           case sdcard.vbpeek(0,addr+counter)
              32..126:             'we are skipping standard characters
                 counter++         'increment buffer index
                 counter5++        'increment horizoffset counter
              13:                  ' we have come across an EOL before we are displaying any chars on this line
                 if sdcard.vbpeek(0,addr+counter+1) == 10          'quick check to see if we have a linefeed (most likely)
                    counter++                        ' if we do increment the Buffer index to skip over this we are now lined up on it
                 repeat ((DisplayWidth-2)-counter4)  ' Fill the DisplayBuffer with spaces until the end of the display on that line
                    buffer[counter2]:=32
                    counter2++                       ' increment the DisplayBuffer index as we insert spaces
                 counter++   'increment the Buffer index (past the linefeed/EOL)
                 counter4:= 0'reset the CharactersOnLine counter (should,have been 0 anyway)
                 'we should exit this loop at this point
                 counter5 := horizoffset             ' we will no longer display characters on this line
                 counter6 := 1        'set this flag to skip the line char render since we are at a new line now
              other:      ' for odd chars, we will treat just like standards for now
                 counter++         'increment buffer index
                 counter5++        'increment horizoffset counter
                                  
     If counter6 == 0
        case sdcard.vbpeek(0,addr+counter)
           32..126:             'standard character
              buffer[counter2] :=  sdcard.vbpeek(0,addr+counter) 'place this character value in the Display Buffer
              counter++   'increment the Buffer index
              counter2++  'increment the DisplayBuffer index
              counter4++  'increment the CharacterOnLine counter
           149:             'character code 149: substitute display code 15
              buffer[counter2] :=  15  'place this character value in the Display Buffer
              counter++   'increment the Buffer index
              counter2++  'increment the DisplayBuffer index
              counter4++  'increment the CharacterOnLine counter
           13:
              if sdcard.vbpeek(0,addr+counter+1) == 10          'quick check to see if we have a linefeed (most likely)
                 counter++                        ' if we do increment the Buffer index to skip over this we are now lined up on it
              repeat ((DisplayWidth-2)-counter4)  ' Fill the DisplayBuffer with spaces until the end of the display on that line
                 buffer[counter2] := 32
                 counter2++                       ' increment the DisplayBuffer index as we insert spaces
              counter++   'increment the Buffer Index (now past the LineFeed onto next character)
              counter4:=0 'reset our CharactersOnLine counter back to 0..we have ended this line
           other:          ' Just to catch some unusual chars..we may end up getting them, we will treat as a standard char with a funny display to catch it
              buffer[counter2] := 127  'placeholder glyph so unexpected characters stand out
              counter++   'increment the Buffer index
              counter2++  'increment the DisplayBuffer index
              counter4++  'increment the CharacterOnLine counter
        ' Now we need to see if we just put the last character on a line that will fit           
        if counter4 == (DisplayWidth-2)
           repeat until sdcard.vbpeek(0,addr+counter) == 13    ' we search until we find an EOL in the Buffer, the displaybuffer is already lined up for the next line
              counter++        
           if sdcard.vbpeek(0,addr+counter+1) == 10          'quick check to see if we have a linefeed (most likely)
              counter++                        'if we do increment the Buffer index to skip over this we are now lined up on it
           counter++   'increment the Buffer index
           counter4:=0 'reset the CharacterOnLine counter


▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔


NYC Area Prop Club

Prop Forum Search (Via Google)



Post Edited (CassLan) : 1/2/2010 11:23:13 PM GMT

Comments

  • Cluso99Cluso99 Posts: 18,069
    edited 2009-12-27 16:03
    Casslan: From my understanding of the implementation of spin, expressions like this can be sped up...
        repeat ((DisplayWidth-2)-counter4)  ' Fill the DisplayBuffer with spaces until the end of the display on that line
          buffer[counter2]:=32
          counter2++                       ' increment the DisplayBuffer index as we insert spaces
    
     
    -----------------------------------------------------
     
        tmp := (DisplayWidth-2)-counter4   ' faster to preevaluate ???? (unsure)
        repeat tmp                         ' Fill the DisplayBuffer with spaces until the end of the display on that line
          buffer[counter2++]:=32           ' postincrement
    
    
    


    There are a few places where the optimisation (above) on buffer[counter2++] := xxxx can be used. There is a specific optimisation (shortcut) in the spin interpreter that does not use an extra instruction to do pre & post incrementing, although it does depend on how it is being used.

    Since you are doing a sdcard.vbpeek quite often, this could also be a place to look.





    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Links to other interesting threads:

    · Home of the MultiBladeProps: TriBlade,·RamBlade,·SixBlade, website
    · Single Board Computer:·3 Propeller ICs·and a·TriBladeProp board (ZiCog Z80 Emulator)
    · Prop Tools under Development or Completed (Index)
    · Emulators: CPUs Z80 etc; Micros Altair etc;· Terminals·VT100 etc; (Index) ZiCog (Z80) , MoCog (6809)
    · Search the Propeller forums·(uses advanced Google search)
    My cruising website is: ·www.bluemagic.biz·· MultiBladeProp is: www.bluemagic.biz/cluso.htm
  • rokickirokicki Posts: 1,000
    edited 2009-12-27 19:07
    Based on the way you are using virtual memory, and the knowledge that the SD card always moves chunks of 512 bytes in and out,
    I think you should rearrange this code to take that into account. That is, when "skipping over" anything or "scanning" for anything
    or "copying" anything, you are just going character by character forward; keep a pointer to a 512-byte buffer and a count of how
    many characters "remain" in the buffer to be consumed, and this way you can eliminate more than 99% of the vbpeek() calls.
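The buffering idea above can be sketched in Spin roughly as follows. This is only an illustration under stated assumptions: `OpenAt`, `NextChar` and `LoadBlock` are hypothetical names, and `LoadBlock` is a placeholder that must be replaced with whatever block read the SD/vMem driver really provides (the thread does not show one).

```spin
CON
  BLOCK_SIZE = 512                 ' SD sector size mentioned above

VAR
  long nextAddr                    ' virtual address of the next block to load
  long blkPtr                      ' hub address of the next unread byte in blk
  long remain                      ' unread bytes left in blk
  byte blk[BLOCK_SIZE]             ' local copy of the current block

PUB OpenAt(addr)
  ' Start reading at virtual address addr; the first NextChar call loads a block
  nextAddr := addr
  remain := 0

PUB NextChar : c
  ' Return the next character, touching the driver only once per BLOCK_SIZE bytes
  if remain == 0
    LoadBlock(nextAddr, @blk)
    nextAddr += BLOCK_SIZE
    blkPtr := @blk
    remain := BLOCK_SIZE
  c := byte[blkPtr++]
  remain--

PRI LoadBlock(addr, dest)
  ' HYPOTHETICAL placeholder: copy BLOCK_SIZE bytes starting at virtual
  ' address addr into hub RAM at dest using the real driver's block read.
  ' Here it only fills with spaces so that the sketch compiles on its own.
  bytefill(dest, 32, BLOCK_SIZE)
```

With this shape, every `sdcard.vbpeek(0,addr+counter)` in the fill routine would become a `NextChar` call (or a direct look at `byte[blkPtr]`), so the driver is called once per 512 characters instead of once per character.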
  • localrogerlocalroger Posts: 3,452
    edited 2009-12-27 23:02
    To expand on what rokicki said ... this is a function I've coded at least 4 times since the first time I did it for a C64. The way you did it has the advantage of being simple and easy to debug with small code size, but it is very slow.

    If you have the RAM, the way I'd approach it is to have a buffer large enough to hold the maximum number of characters you can ever display on the screen (or, if you don't have that much RAM, on a line, though that will slow things down and complicate them a bit). Have a function that uses block accesses to load the buffer, then scan the buffer to draw the screen with word wrap and all that.

    On the Prop I'd also concentrate on using my control logic to identify and locate words, but once I'd identified and located a word I'd use bytemove (which is way faster than byte-by-byte access in Spin) to actually stuff it into the video buffer.
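A minimal sketch of the bytemove approach, assuming the text has already been loaded into hub RAM. `CopyRun` is a hypothetical helper name, and for simplicity it copies a whole run of printable characters rather than a single word.

```spin
PUB CopyRun(src, dst, maxLen) : n
  ' Count the printable characters (32..126) starting at hub address src,
  ' stopping at the first other byte (for example CR = 13) or at maxLen.
  n := 0
  repeat while n < maxLen and byte[src + n] => 32 and byte[src + n] =< 126
    n++
  ' Copy the whole run into the display buffer in one call
  bytemove(dst, src, n)
```

The scan is still a Spin loop, but the per-character store and the index bookkeeping are replaced by one bytemove. The return value tells the caller how far to advance both the source pointer and the display index.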
  • CassLanCassLan Posts: 586
    edited 2009-12-28 04:04
    Cluso:
    I was thinking that the pre-evaluation might be helpful, and the inline ++ as well. It's easier for me to read the way it is now, but I will try that and see what kind of benefit it yields.

    rokicki:
    Hmmmm, I like what you're saying. I don't think, though, that every vbpeek call transfers 512 bytes from the card. I believe there is some logic in that driver that keeps its own sector-sized buffer, which is checked in case what you're looking for is in the sector that's currently in memory. It's essentially what you're talking about; I will ask MagIO about that.

    localroger:
    Gotcha. I will see about that approach.

    Thanks for the input, I need to do some timing tests, but I'm pretty sure the major delay is in the displaying.
    Will keep you posted!

    Rick

  • Cluso99Cluso99 Posts: 18,069
    edited 2009-12-28 04:13
    CassLan: I don't like unreadable code either, but sometimes it is a must, so good comments are required (like post increment)

  • MagIO2MagIO2 Posts: 2,243
    edited 2009-12-28 14:13
    You already mentioned the best improvement: Only update what needs to be updated.

    And yes, you are right. The vMem has a 512 byte buffer per slot and only reloads if necessary. I implemented the slots with your editor in mind and suggest using 1 slot for RHB, 1 slot for LHB and 1 slot for filling the display buffer. (The other one can then be used for copy and paste).

    I guess your lines can be longer than the number of characters of a row?
    So you can have up to 3 sectors of data displayed on your screen.

    I'll have a closer look at your code this evening.
  • CassLanCassLan Posts: 586
    edited 2009-12-29 02:47
    I shaved the display portion (for an entire editing window) from 12 million clocks to 7 million. I'm pretty sure that updating only the minimum amount of data AND display is all that still needs to be done. I will keep you all posted.

    Thanks,

    Rick

  • Phil Pilgrim (PhiPi)Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2009-12-29 04:38
        repeat ((DisplayWidth-2)-counter4)  ' Fill the DisplayBuffer with spaces until the end of the display on that line
          buffer[counter2]:=32
          counter2++                       ' increment the DisplayBuffer index as we insert spaces
    
    
    


    is equivalent to

        bytefill(@buffer[counter2], 32, DisplayWidth - 2 - counter4)
        counter2 += DisplayWidth - 2 - counter4 'This line may not be necessary, depending on what the subsequent code expects.
    
    
    


    The latter should be a lot faster.

    -Phil
  • CassLanCassLan Posts: 586
    edited 2009-12-29 13:08
    Of course Phil!!!
    At first I was using the vMem as the display buffer, which does not have a bytefill function, then I switched it to use RAM and missed that.

    Thanks!

  • CassLanCassLan Posts: 586
    edited 2009-12-31 14:46
    I have made progress and wanted to share some of the re-implementations I have gone through along with the results:

    The goal is to read the pre-filled GAP buffer, which holds the file contents as-is except that they are split into two halves.

    When I initially started this thread I was doing the following:
    A) Filling a byte Display Buffer, sized to fit the editing window, with the contents of the GAP buffer, taking into account EOLs, lines longer than the screen, etc.
    B) Taking that Display Buffer and feeding it to the display driver one character at a time
    The results:
    A ~ 19mil clocks, B ~ 12mil clocks, total ~ 31mil
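The thread does not say how these clock counts were taken. One common way on the Propeller is to read the system counter `cnt` before and after the code under test, sketched below; `TimeFill` and `FillDisplay` are hypothetical names, and `FillDisplay` is only a stand-in for the routine being measured.

```spin
PUB TimeFill : elapsed | t0
  ' Return how many system clocks FillDisplay takes
  t0 := cnt                        ' system counter before
  FillDisplay
  elapsed := cnt - t0              ' 32-bit subtraction, still correct if cnt wraps once

PRI FillDisplay
  ' Placeholder for the code being timed
  waitcnt(10_000 + cnt)
```

The result includes a small fixed overhead from the Spin interpreter, and it reads as a positive number only while the measured code takes fewer than 2^31 clocks, which is far above the figures quoted here.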

    Then I tweaked B, feeding lines instead of characters...B ~ 7mil, total ~ 26mil
    I further tweaked B, opting to directly edit the Screen Buffer memory instead of using the Display Driver functions...B ~ 3.5mil, total ~ 22.5mil

    At this point it occurred to me to be done with the usage of my own Display Buffer, and have (A) just fill the Display Driver's screen memory directly...A ~ 23mil, B ~ 0, total ~ 23mil
    The performance increase was really none, except that I freed up usage of my general purpose buffer, which I was keeping pretty small until I wanted to use it as a display buffer.
    So now I was able to shrink that back down to 256 bytes from 1240..almost 1kB of RAM freed!

    But I really wasn't happy with the results..because at this point the screen wouldn't actually update very fast, AND you could see it writing out the chars instead of it appearing to just be the whole screen updating at once.

    So, now that I had a free general purpose buffer at my disposal, and since it was suggested (thanks rokicki and localroger), I used that as the input for (A) instead of the direct calls to MagIO's vMem functions, refilling the buffer as needed.
    And the results are: 10mil clocks!!!!

    So in short I started at 31mil...and ended up at 10mil + 984 bytes lighter!!!

    Thanks for everyone's suggestions, I have a feeling I can get this down to 6mil... :)

    Rick

  • BaggersBaggers Posts: 3,019
    edited 2009-12-31 15:55
    Nice going Rick, keep up the great work :)

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    http://www.propgfx.co.uk/forum/·home of the PropGFX Lite

    ·
  • CassLanCassLan Posts: 586
    edited 2010-01-02 23:20
    I'm not sure why it took me so long to try it, but I wrote my first PASM code today to help with the speed, and it brought the parsing/display speed from ~10 mil clocks (after sweating for days with it) to an astounding ....

    ~440,000 clocks!!! That's over 180 possible refreshes per second!!!

    And that includes reading all the data for a whole screen from the GAP buffer (SD card via FSRW and MagIO's vMem functions), parsing/finding end of lines..horizontal offsetting..etc!!!
    It does everything that my Spin code did, just A LOT faster.

    The cog starts with the program, and when it sees a certain variable in main RAM equal to 1, it knows that the buffer in main RAM is ready for it and it just rips through it :)

    Then when it's done it sets a separate variable in main RAM so that Spin can know that it's done :)
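The actual PASM was not posted, so the following is only a sketch of the two-flag handshake described above. The names `ready`, `done`, `Start` and `Render` are assumptions, and the parsing work itself is left as a comment.

```spin
VAR
  long ready                       ' Spin sets this to 1 when the buffer is filled
  long done                        ' PASM sets this to 1 when it has finished
                                   ' (done must directly follow ready in memory)

PUB Start
  ready := 0
  done := 0
  cognew(@entry, @ready)           ' PAR will hold the hub address of ready

PUB Render
  ' Hand the filled buffer to the PASM cog and wait for it to finish
  done := 0
  ready := 1
  repeat until done == 1

DAT
              org     0
entry         mov     readyPtr, par           ' address of ready
              mov     donePtr, par
              add     donePtr, #4             ' address of done (next long)
wait          rdlong  tmp, readyPtr  wz       ' Z is set while ready == 0
        if_z  jmp     #wait
              wrlong  zero, readyPtr          ' acknowledge the request
              ' ... parse the buffer and write the screen memory here ...
              wrlong  one, donePtr            ' tell Spin the work is finished
              jmp     #wait

zero          long    0
one           long    1
readyPtr      res     1
donePtr       res     1
tmp           res     1
              fit     496
```

Clearing `done` before setting `ready` means the Spin side cannot mistake a finish flag left over from the previous frame for the current one.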

    I have a feeling that screen display/refresh speed will no longer be an issue.
    I also have a sneaking suspicion that this code can be made even faster, but I'm happy with the speed as is.

    I have to thank Parallax for the awesome PASM Webinar: http://www.parallax.com/Portals/0/Downloads/mm/video/Webinar/2009-12-10-Webinar-[Full].mp4
    Which kind of gave me a push to do it.
    And it would NOT have been possible for me to write this without Ariba's Propeller Assembler Source-code Debugger: http://propeller.wikispaces.com/PASD
    It's a nice, lightweight, free tool, the instructions are good, and I was able to use it to help me within 30 seconds of downloading it. Thanks Ariba!!

    Well, let's see now,
    Storage is not a problem, Speed is not a problem, Memory Management is not a problem...I guess I have no more excuses guys :)

    Till later,

    Rick







  • Oldbitcollector (Jeff)Oldbitcollector (Jeff) Posts: 8,091
    edited 2010-01-03 01:51
    Sounds Great Rick!!!!! Looking forward to Preditor!

    OBC

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    New to the Propeller?

    Visit the: The Propeller Pages @ Warranty Void.
  • mparkmpark Posts: 1,305
    edited 2010-01-03 01:51
    Kudos, Rick! That's great news.