P2 Experiments- P1 source to P2 Fastspin — Parallax Forums

So I've played with TaqOZ a bit and now I want to try converting P1 source to P2. I ran my touchscreen desktop app through fastspin and got what looks like complete ASM output. I haven't gone through all of it yet, but it looks like it should run. Are there any known gotchas I should watch out for? I'm finishing up new hardware now and waiting on a couple of final things. Progress has been slow over the holidays, but I'm getting ready for another big push after the new year.

Thanks for any advice!

Comments

  • whicker Posts: 749
    edited 2018-12-31 06:29
    Here is what I'm encountering with Fastspin:

    I am using Windows 7.

    1) I heard about issues with loadp2.exe so I grabbed and compiled the version from here:
    https://github.com/davehein/p2gcc

    I placed the newly compiled loadp2.exe version 0.007 into the /bin folder of spin2gui


    2) When running spin2gui, I had to open the menu labeled "Commands" and select "Configure Commands..."
    This brings up a window called "Executable Paths"
    I pressed the [P2 Defaults] button, but had to change the text in "Run Command" to this path:
    cmd.exe /c start %D/bin/loadp2 %B -CHIP -t -v
    
    Press OK to close the configure commands window.

    3) I opened up the sample file blink2.bas. Changed the first lines to this, and saved:
    const LED1 = 56
    const LED2 = 57
    

    4) Pressed the "Compile & Run" button.
    Got this in the terminal, because I'm using -v for loadp2 (verbose mode):
    Searching serial ports for a P2
    P2 version A found on serial port com9
    Setting clock_mode to 10c1f08
    Setting user_baud to 115200
    Loading C:/Propeller/spin2gui/samples/blink2.binary - 1440 bytes
    C:/Propeller/spin2gui/samples/blink2.binary loaded
    [ Entering terminal mode.  Press ESC to exit. ]
    

    5) Now I'm stuck. It looks like fastspin inserts a hubset #255 into the asm code. From various other comments, this isn't going to work on the P2-EVAL board.

    Here is the beginning of the .lst file it generates for blink2.bas, with me having changed the pin numbers of led1 and led2.
    00000                 | con
    00000                 | 	led1 = 56
    00000                 | 	led2 = 57
    00000                 | dat
    00000 000             | 	org	0
    00000 000 
    00000 000             | entry
    00000 000 00F00FF2    | 	cmp	ptra, #0 wz
    00004 001 1800905D    |  if_ne	jmp	#spininit
    00008 002 30F003F6    | 	mov	ptra, objptr
    0000c 003 00F007F1    | 	add	ptra, #0
    00010 004 00FE65FD    | 	hubset	#255
    00014 005 D404C0FD    | 	calla	#@_program
    

  • twm47099 Posts: 867
    edited 2018-12-31 07:13
    I rename the .p2asm file to .spin2, open the .spin2 file, change the hubset #255 to hubset #0, and then press Compile & Run (on the .spin2 file). That runs the P2-ES on RCFAST (about 20 MHz). There are changes you can make to the .spin2 file to change the clock speed:
    https://forums.parallax.com/discussion/comment/1459371/#Comment_1459371
  • whicker Posts: 749
    edited 2018-12-31 07:52
    Thanks @twm47099 !!!

    When changing the .p2asm file to .spin2 as you described, and adding the _CLOCKFREQ constants from that message, I think I had to go down further below to set the clock frequency as well in the COG_BSS_START section, if present?
    | COG_BSS_START
    | 	fit	496
    | 	orgh	$400
    | 	'SEEMS TO MAKE A DIFFERENCE? First long is CPU Clock?
    | 	long	_CLOCKFREQ	'was long 80000000
    | 	...
    | hubentry
    
  • So @cheezus, a gotcha is it looks like fastspin is not currently generating code for setting the CPU clock frequency correctly.
  • Try commenting out the "hubset #$ff". By default, loadp2 will set up the system clock to 80 MHz. If you want the user program to set up the clock, then you have to specify the -SINGLE option to loadp2.
  • There's a new fastspin / spin2gui release that should work much better. It no longer sets the clock automatically; you'll have to either have loadp2 do that or insert an explicit clkset(mode, freq) at the beginning of your Spin or BASIC code.
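
    As a rough sketch of the second option, a Spin program could start like the one below. The two constants are assumptions, not values from the release notes: they simply reuse the clock_mode ($010c1f08) that loadp2 -v printed earlier in this thread and the 80 MHz default mentioned above. The names _SETFREQ and _CLOCKFREQ are only a convention. Check both values against your own board before relying on them.

    ```spin
    CON
      _SETFREQ   = $010c1f08       ' assumed mode: the clock_mode value printed by loadp2 -v above
      _CLOCKFREQ = 80_000_000      ' assumed frequency: the 80 MHz default that loadp2 sets up

    PUB main
      clkset(_SETFREQ, _CLOCKFREQ) ' set the clock before any timing-dependent code runs
      repeat
        waitcnt(cnt + clkfreq)     ' roughly one second per pass if the values above match the hardware
    ```

    If the delays come out wrong by a constant factor, the mode and frequency probably do not agree with each other.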
  • Thanks guys! I've been working on getting the new hardware ready, turns out I'm short on pin headers :| I'll have to figure out something to keep me busy while I wait.


    I'm really impressed with fastspin, I need to clean up the project so it will compile for P1. I'm still not clear on the translation of DAT sections. It LOOKS like they have been parsed to P2asm correctly (of course there may be issues of timing or something?) but it can't be THAT simple can it?
  • potatohead Posts: 10,261
    edited 2018-12-31 23:19
    Why not?

    That is what normally happens. Assembly boils down to just being a set of directives to assemble a binary file. It may or may not be instructions. Could just be data.
  • cheezus wrote: »
    I'm really impressed with fastspin, I need to clean up the project so it will compile for P1. I'm still not clear on the translation of DAT sections. It LOOKS like they have been parsed to P2asm correctly (of course there may be issues of timing or something?) but it can't be THAT simple can it?

    I'm glad it's working for you. fastspin doesn't try to do any translation of P1 asm to P2 asm, so you're probably just getting lucky if your DAT section has much legacy P1 assembly code in it. The P2 instruction set is "mostly" a superset of P1, and I have kept a few P1 assembly conventions (e.g. fastspin accepts "wc,wz" as well as "wcz"), so some P1 asm will compile OK. But in many cases you'll need to put an "#ifdef __P2__" in the DAT section with the P2 specific assembly.

    The Spin code (outside of DAT) should mostly translate fine, so the more you can put in Spin the better. Performance shouldn't be an issue at all: fastspin compiles to hubexec code on P2, so your P2 compiled Spin code will probably run faster than hand-crafted P1 assembly.
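
    A minimal sketch of what the "#ifdef __P2__" in a DAT section might look like is below. The pin numbers and label names are hypothetical (56 is just the LED pin used earlier in the thread), and it assumes the code is launched with cognew as on P1.

    ```spin
    CON
      P1_LED = 16                   ' hypothetical P1 pin
      P2_LED = 56                   ' P2 pin, as used for LED1 earlier in the thread

    PUB start
      cognew(@entry, 0)             ' run the assembly below in a new cog

    DAT
            org     0
    entry
    #ifdef __P2__
            drvh    #P2_LED         ' P2 only: make the pin an output and drive it high
    #else
            or      dira, ledmask   ' P1: make the pin an output...
            or      outa, ledmask   ' ...and drive it high
    #endif
    done    jmp     #done           ' park the cog; this line assembles for both chips
    ledmask long    |< P1_LED       ' only referenced by the P1 branch
    ```

    Only the pin handling differs between the two branches here; the jump and the data declaration are shared, which is the "mostly a superset" case.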
  • So having spent some more time writing spin code, compiling through fastspin and looking at the output I'm even more impressed than I was initially. I've still been using the proptool for the bulk of editing but I think it's time for me to look at some of the other editors. I've always used proptool and it's just that comfortable place to develop for the p1. There's a lot of nice offerings, I've just got to be willing to get out of my comfort zone.

    The one thing that I'm looking for in a toolchain is conditional compilation. If I understand correctly fastspin has a preprocessor, with built in directives?

    I still haven't tried to run any code on the P2 yet, I really wanted to get something working on the p1 first. I finally got the Spin p1 version working and before going to bed I thought I'd try fastspin and... well it sure lives up to its name. It went from some 10s of seconds to draw a 320x240 blank screen to at least a refresh a second? This is with no work to optimize the spin code for speed. I'd LIKE to create an object that works for P1 and P2 and it seems doable but there's a lot of gotchas with the current codebase.

    Is there a "fastspin" manual I'm missing?

    Thanks as always to all the forum members here, you are awesome!
  • cheezus wrote: »
    I've still been using the proptool for the bulk of editing but I think it's time for me to look at some of the other editors. I've always used proptool and it's just that comfortable place to develop for the p1.
    It's definitely worth checking out @Rayman 's SpinEdit, it's got an interface very similar to proptool but can use openspin or fastspin to do the compiling.
    The one thing that I'm looking for in a toolchain is conditional compilation. If I understand correctly fastspin has a preprocessor, with built in directives?

    Yes. See the docs/spin.md file in the fastspin or spin2gui release. It's a basic subset of the C preprocessor, with #define, #ifdef / #else / #endif, and #include. There are some predefined symbols, like __P2__ if compiling for the P2 (again, documented in the spin.md file; that's a GitHub mark-down file, so readable as plain text).

  • I've been converting the markdown (.md) files to PDF for my own use when I grab a new release.
    It makes for an easier read.
  • cheezus Posts: 298
    edited 2019-02-09 06:02
    Okay, I did it again! Still working on P1 source..

    Seems fastspin doesn't like the copy of fsrw I've been using, giving me an error that's got me scratching my head right now.
    "fsrw_Ray.Spin(568) error: internal error : expected statement list, got 87"

    So I go looking for an 87 in the code and can't find a dec, hex or binary... But line 568 is a blank line so I thought maybe something in the last function isn't properly formatted or something. I'm sure it's something simple really.
    pub nextfile(fbuf) | i, t, at, lns
    {{
    '   Find the next file in the root directory and extract its
    '   (8.3) name into fbuf.  Fbuf must be sized to hold at least
    '   13 characters (8 + 1 + 3 + 1).  If there is no next file,
    '   -1 will be returned.  If there is, 0 will be returned.
    }}
       repeat
          if (bufat => bufend)
             t := pfillbuf
             if (t < 0)
                return t
             if (((floc >> SECTORSHIFT) & ((1 << clustershift) - 1)) == 0)
                fclust++
          at := @buf + bufat
          if (byte[at] == 0)
             return -1
          bufat += DIRSIZE
          if (byte[at] <> $e5 and (byte[at][$0b] & $18) == 0)
             lns := fbuf
             repeat i from 0 to 10
                byte[fbuf] := byte[at][i]
                fbuf++
                if (byte[at][i] <> " ")
                   lns := fbuf
                if (i == 7 or i == 10)
                   fbuf := lns
                   if (i == 7)
                      byte[fbuf] := "."
                      fbuf++
             byte[fbuf] := 0
             fsize2:= brlong(at+$1c)
             return 0
                 
    

    I was finally able to trim down Touch.Spin enough to compile the desktop app. I'm still rewriting the ASM driver in Spin... Got a long way to go and starting to wonder if I should just bypass this step and jump to P2asm. I wish I had the time and "resources" to devote to serious development... But, almost 3 year old twin boys who think playing with wires and cords is almost as amazing as beating each other with plastic baseball bats; a cat that thinks jumper wires are the best thing since hair ties and feathers; as well as preparing for a move all slow the process.


    btw, I still haven't rtfm :P

    *edit, added sources i'm trying to build, duh
  • An "internal error" is pretty much always a bug in fastspin: it indicates that some sanity check that I added to the code failed.

    In this case it looks like fastspin messed up compiling the line:
      \sdspi.readblock(0, @buf)
    
    in the mount(basepin) function in fsrw_Ray.spin. I'll try to sort out what's wrong, but for now probably removing the \ will get it going.


  • Thanks as always for taking a look at that.

    One of the things I noticed after I posted was touch.spin compiled program size comes out to 35580 bytes? @msrobots mentioned running out of memory and this is entirely possible. I've been trying to shave longs by moving all large constants to a DAT block. Not sure if this is the best way to go...
  • cheezus wrote: »
    Thanks as always for taking a look at that.

    One of the things I noticed after I posted was touch.spin compiled program size comes out to 35580 bytes? @msrobots mentioned running out of memory and this is entirely possible. I've been trying to shave longs by moving all large constants to a DAT block. Not sure if this is the best way to go...

    Yes, unfortunately fastspin binaries are quite a bit bigger than regular Spin ones, since they compile to PASM rather than to compressed bytecode.

    Moving large constants to a DAT block probably won't make much difference; the compiler tries to keep only one copy of each large immediate constant. But it won't hurt either, and maybe you'll notice some things the compiler missed. I kind of doubt you'll get 3K back though :(.

    I have some ideas on how to allow fastspin to compress instructions (like PropGCC's CMM mode), but that isn't going to happen in the near future.
  • ersmith wrote: »
    Yes, unfortunately fastspin binaries are quite a bit bigger than regular Spin ones, since they compile to PASM rather than to compressed bytecode.

    Moving large constants to a DAT block probably won't make much difference; the compiler tries to keep only one copy of each large immediate constant. But it won't hurt either, and maybe you'll notice some things the compiler missed. I kind of doubt you'll get 3K back though :(.

    I have some ideas on how to allow fastspin to compress instructions (like PropGCC's CMM mode), but that isn't going to happen in the near future.

    When I started playing with this I wasn't even sure it would work but now I'm starting to see potential. Part of the reason for these experiments is I'm trying to understand ways to optimize as well as restructure things. As it stands now, there's a lot of code that only runs once, like display inits. This is one place I've always thought things could be improved. I've thought about putting the init code into CSV files on the SD and just parsing through a file for display init, but it really seems best to have the display init code in EEPROM so that if there's something wrong with the SD I can still init the display and print debugging info.

    For a while I used a small program to init the display, start the SD and chain into another program that actually loaded the desktop. From here each program could be chained to and when finished kicks back to the EEPROM. This would check the SRAM to see if the desktop attributes were still in ram and if so skip loading the desktop and just display the pre-loaded desktop. The return to desktop from a running app is instant, which is nice because loading an 800x480 image from SD seems like it takes forever!

    There were several problems with doing things this way but the biggest annoyance, other than loading the desktop for the first time, was switching between displays. The previous hardware design didn't use the LCD /RD and simply pulled up to 3v3. The new hardware should allow me to determine which controller is plugged in, instead of changing a line in the source code and recompiling the entire package. The biggest problem I'm having is figuring out how to handle 2 different resolutions, as for a long time the 320x240 "package" was completely separate from the 800x480.

    Getting back the 3k is going to be tricky but with 1mw of SRAM at least I have some place to stuff data between programs run off SD. It's very interesting looking at the Spin code and then seeing the PASM output. I haven't really noticed your compiler "slacking off" anywhere, and moving the large constants is just to satisfy the FIT directive. I think I may have made some mistakes and need to double check pointer vs register. I'm also thinking that display init data could be overwritten for buffers if I could wrap my head around that mess.
    
        sdbuffer                    byte 0[1024]                   ' 1024 byte buffer for sd card interface and also for sending one row 320 x3 = 960 bytes max picture size 
        buffer2                     byte 0[512]                    ' 512 general purpose hub buffer 
        rambuffer                   word 0[1024]                    ' 512 byte (256 word) buffer easier for using with words, needed to double this
    

    I was having trouble with some piece of code overflowing the rambuffer so that's 4x larger than it needs to be. There's plenty of places where this code can be optimized, I'm just trying to understand the best ways to optimize for P1 and looking for ideas in the P2 rewrite.
  • If you're running out of COG memory (and yes, having lots of large constants will do that :( ), have you tried disabling the code cache with --fcache=0? That'll reclaim a fair chunk of COG memory pretty easily, albeit with a bit of a hit to performance.
  • I'm finally getting to "the driver rewrite" while simultaneously bringing up the new hardware. Now that things are starting to show signs of life I need some help figuring out the right way to "segment" things a bit. The one obvious thing that stood out while working on the test harness was the display initialization code. With the inits for 3 displays the compiler was generating a LOT of big numbers and eating into the cog. I can also see how using the new pin instructions could save some cog memory too, but at the expense of backwards compatibility.
    CON ''PINS
      LCD_RS  = |< 16       '
      LCD_CS  = |< 20       ' p5 -1 -d_en
      
      ''LCD_RD  = |< 18     ' p6 -0 -d_r
      ''LCD_WR  = |< 19     ' p6 -1_d_W
      ''LCD_RST = |< 21     ' GP3 & P10
        
      LCD_PINS = LCD_CS 
      LCD_DIRS = LCD_PINS | LCD_RS | $FFFF
      
        D_RST = 10
      
        SPI_A0 = 10   
        SPI_A1  =11
    
        SD_DO   =   12
        SD_CLK  =   13
        SD_DI   =   14
        SD_CS   =   15
        
        GROUP_EN    = 21
        CLOCK_PIN   = 20 
      
    CON ''OTHER STUFFS
      _1ms      = 1_000_000 /     1_000                     ' Divisor for 1 ms                                                                      
    
      'define latch bits
        #1, _ramEnable, _ram_rW, _flash_rW, _dispEnable, _disp_rW, _flashEnable     '' version 1, UGLY
        _group0, _group1, _group2, _group3                                          '' group bits
        
    ''OBJ   disp '' display object - SSD1963,SSD1289 specific - todo
    
    VAR  
        word curx, cury, ScreenWidth, ScreenHeight
        word BackFontColor, FontHeight
        byte DisplayMode, orientation                                                                                                                          
        byte rambuffer[512]
    
        'cog stuffs
        byte latchvalue, spiselect
    
                                                              
    PUB Testing | i    
       
       clkset(_SETFREQ, _CLOCKFREQ)
       waitcnt(cnt+clkfreq*2)
        
      Reset_Display
      Start_SSD1963
    
       'Draw(0, 0, 479, 799)
       REPEAT   
         Draw(0, 0, 479, 799)
         repeat (480 * 800)
           Pixel(0)  
    
         Draw(0, 0, 479, 799)
         repeat (480 * 800) 
           Pixel($ffff)
    
     repeat
       pause1ms(1000)
    
    PUB pause1ms(period) | clkcycles                        '' Pause execution for period (in units of 1 ms).
      clkcycles := ((clkfreq / _1ms * period) - 4296) #> 381     ' Calculate 1 ms time unit
      waitcnt(clkcycles + cnt)                                   ' Wait for designated time   
    
    ''*********************************************************************************************************************************************************
    
    ' ***************** Start display routines *************************
    
    
    PUB Draw(x1, y1, x2, y2)                                '' sets the pixel to x1,y1 and then fills the next (x2-x1)*(y2-y1) pixels
    { 
       ifnot orientation                                ' landscape mode so swap x and y for 1963
            result :=x1                                       ' swap x1 and y1
            x1 := y1
            y1 := result
            result := x2                                      ' swap x2 and y2
            x2 :=y2
            y2 := result
    }
       DisplayEnable                                    ' enable one or both displays
    
        ''  ssd1963
         Lcd_Write_Com($002B)
         Lcd_Write_Data(x1>>8)
         Lcd_Write_Data(x1&$ff)
         Lcd_Write_Data(x2>>8)
         Lcd_Write_Data(x2&$ff)
            
         Lcd_Write_Com($002A)
         Lcd_Write_Data(y1>>8)
         Lcd_Write_Data(y1&$ff)
         Lcd_Write_Data(y2>>8)
         Lcd_Write_Data(y2&$ff)    
    
         Lcd_Write_Com($002c)
    
    {    ''  ssd1289
          Displaycmd($0044,(x2<<8)+x1)           
          Displaycmd($0045,y1)
          Displaycmd($0046,y2)
          Displaycmd($004e,x1)
          Displaycmd($004f,y1)
          Lcd_Write_Com($0022)
    }      
       LCD_RS_High
    
    PUB Pixel(pixelcolor)                                   '' send out a pixel
         Lcd_Write_Fast_Data(pixelcolor)   ' need to set RS high at beginning of this group (ie call Draw first)
         ' it is more efficient to send one Draw command and then lots of pixels than to send individual pixels
         ' but of course, you can send draw x,y,x,y where x and y are the same and then send one pixel
    
    
     ' ***********************************************************
     
    '' DISPLAY SETTINGS
       
    PRI DisplayEnable
        latchvalue := (latchvalue & $FC) | _group2 | _dispEnable
        SetLatch( latchvalue )
        OUTA |= LCD_PINS                                 ' set /cs high
        DIRA |= LCD_DIRS                                 ' enable these pins for output
    
    PRI SpinHubToDisplay(hubaddress,number)| i
    '    DisplayEnable
        repeat i from 0 to number -1
          pixel(long[hubaddress][i])
    
    PRI DisplayCmd(c,d) ' instruction in one method
        Lcd_Write_Com(c) ' send out a word
        Lcd_Write_Data(d)
    
    PRI Lcd_Write_Com(LCDlong)
        LCD_RS_low                               ' can do rs first then cs - better for latch board
        LCD_Writ_Bus(LCDlong)
     
    PRI Lcd_Write_Data(LCDlong)
        LCD_RS_High                              ' can do rs first then cs
        LCD_Writ_Bus(LCDlong)
    
    PRI Lcd_Write_Fast_Data(LCDlong)            ' write RS elsewhere then skip the RS above as this is a latch output and takes longer
        LCD_Writ_Bus(LCDlong)
    
    PRI LCD_Writ_Bus(LCDLong)
        LCDLong &= $0000_FFFF
        OUTA &= %11111111_11111111_00000000_00000000        ' set P0-P15 to zero ready to OR
        OUTA |= LCDLong                                     ' merge with the word to output
        LCD_CS_Low
        LCD_CS_High    
    
    PRI LCD_CS_Low
        OUTA &= !LCD_CS
       
    PRI LCD_CS_High                                         
        OUTA |= LCD_CS
    
    PRI LCD_RS_Low                                          
        OUTA &= !LCD_RS
    
    PRI LCD_RS_High            
        OUTA |= LCD_RS
    
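    PRI SetLatch(value)
        OUTA |= |< GROUP_EN                               ' GROUP_EN and CLOCK_PIN are pin numbers, so convert them to masks
        OUTA |= |< CLOCK_PIN
        OUTA &= !$FF
        value &= $FF
        OUTA |= value
        OUTA &= !(|< CLOCK_PIN)                           ' pulse the clock low, then high
        OUTA |= |< CLOCK_PIN
        OUTA &= !(|< GROUP_EN)
    
    PRI SetCounter(address) | olv
        olv := latchvalue
        SetLatch($0)
        OUTA &= !(|< CLOCK_PIN)                           ' clock low while the address is set up
        OUTA &= !$F_FFFF
        address &= $F_FFFF
        OUTA |= address
        OUTA |= |< CLOCK_PIN
        SetLatch(olv)
                
    PRI Reset_Display |   olv 
        olv := latchvalue
        SetLatch(_group3)
        OUTA &= !(|< D_RST)                               ' D_RST is a pin number: drive reset low
        pause1ms(10) 
        OUTA |= |< D_RST                                  ' release reset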
        SetLatch(  olv )
        
    DAT
       VDP        long  479     ' v pixels
       HDP        long  799     ' h pixels
       HT         long  928
       HPS        long  46
       LPS        long  15
       VT         long  525
       VPS        long  16
       FPS        long  8
       HPW        long  48
       VPW        long  16
    
    PRI Start_SSD1963
        
        Lcd_Write_Com ($00E2)                             ' //PLL multiplier, set PLL clock to 120M
        Lcd_Write_Data($0023)                             ' //N=0x36 for 6.5M, 0x23 for 10M crystal ' $21??
        Lcd_Write_Data($0002)                             '
        Lcd_Write_Data($0054)                             ' dummy byte?  54??
        Lcd_Write_Com ($00E0)                             ' // PLL enable
        Lcd_Write_Data($0001)                             ' set pll             '
        pause1ms(1)                                      '
        Lcd_Write_Com ($00E0)                             ' pll                         '
        Lcd_Write_Data($0003)                             ' lock
        pause1ms(5)
        Lcd_Write_Com ($0001)                             ' // software reset
        pause1ms(5)
        Lcd_Write_Com ($00E6)                             ' //PLL setting for PCLK, depends on resolution
        Lcd_Write_Data($0003)                             '
        Lcd_Write_Data($00ff)                             '
        Lcd_Write_Data($00ff)                             '
        Lcd_Write_Com ($00B0)                              ' //LCD SPECIFICATION
        Lcd_Write_Data($0000)                             '
        Lcd_Write_Data($0000)                             '
        Lcd_Write_Data((HDP>>8)&$00FF)                    ' //Set HDP 'hps??
        Lcd_Write_Data(HDP&$00FF)
        Lcd_Write_Data((VDP>>8)&$00FF)                    ' //Set VDP 'vps??
        Lcd_Write_Data(VDP&$00FF)
        Lcd_Write_Data($0000)
        Lcd_Write_Com ($00B4)                             ' //HSYNC
        Lcd_Write_Data((HT>>8)&$00FF)                     ' //Set HT
        Lcd_Write_Data(HT&$00FF)
        Lcd_Write_Data((HPS>>8)&$00FF)                    ' //Set HPS
        Lcd_Write_Data(HPS&$00FF)
        Lcd_Write_Data(HPW)                               ' //Set HPW
        Lcd_Write_Data((LPS>>8)&$00FF)                    ' //Set LPS
        Lcd_Write_Data(LPS&$00FF)                         '
        Lcd_Write_Data($0000)                             '
        Lcd_Write_Com ($00B6)                             ' //VSYNC
        Lcd_Write_Data((VT>>8)&$00FF)                     ' //Set VT
        Lcd_Write_Data(VT&$00FF)
        Lcd_Write_Data((VPS>>8)&$00FF)                    ' //Set VPS
        Lcd_Write_Data(VPS&$00FF)
        Lcd_Write_Data(VPW)                               ' //Set VPW
        Lcd_Write_Data((FPS>>8)&$00FF)                    ' //Set FPS
        Lcd_Write_Data(FPS&$00FF)                         '
        Lcd_Write_Com ($00BA)                             '
        Lcd_Write_Data($0005)                             ' //GPIO[3:0] out 1
        Lcd_Write_Com ($00B8)                             '
        Lcd_Write_Data($0007)                             ' //GPIO3=input, GPIO[2:0]=output
        Lcd_Write_Data($0001)                             ' //GPIO0 normal
        Lcd_Write_Com ($0036)                              ' //rotation
        Lcd_Write_Data($0021)                              ' 3 is 180    21??
        Lcd_Write_Com ($00F0)                             ' //pixel data interface
        Lcd_Write_Data($0003)                             ' 16 bit / 565
        pause1ms(5)
        Lcd_Write_Com ($0029)                              ' //display on
        Lcd_Write_Com ($00d0)                              ' //dynamic backlight
        Lcd_Write_Data($000d)
        Lcd_Write_Com($002c)
        LCD_RS_High
        
    PRI Start_SSD1289                                       ' based on C driver   
        Displaycmd($0000,$0001)  'Turn Oscillator on                                POR-$0000
        Displaycmd($0003,$A8A4)  'Power control (1)                                 POR-$6664
        Displaycmd($000C,$0000)  'Power control (2)                                 POR- ?
        Displaycmd($000D,$080C)  'Power control (3)                                 POR-$0009
        Displaycmd($000E,$2B00)  'Power control (4)                                 POR-$3200
        Displaycmd($001E,$00B0)  'Power control (5)                                 POR-$0029
        Displaycmd($0001,$2B3F)  'Driver output control,    *landscape?? $6B3F*     POR-$433F
        Displaycmd($0002,$0600)  'LCD drive AC control                              POR-$0400
        Displaycmd($0010,$0000)  'Sleep Mode                sleep mode off          POR-$0001
        Displaycmd($0011,$6070)  'Entry Mode,               *landscape? $4030*      POR-$6830
        Displaycmd($0006,$0000)  'Compare Register (2)                              POR-$0000
        Displaycmd($0016,$EF1C)  'Horizontal Porch                                  POR-$EFC1
        Displaycmd($0017,$0003)  'Vertical Porch                                    POR-$0003
        Displaycmd($0007,$0233)  'Display Control    '0033?                               POR-$0000
        Displaycmd($000B,$0000)  'Frame cycle Control              'd308                 POR-$D308
        Displaycmd($000F,$0000)  'Gate Scan start position                          POR-$0000
        Displaycmd($0041,$0000)  'Vertical Scroll Control (1)                       POR-$0000
        Displaycmd($0042,$0000)  'Vertical Scroll Control (2)                       POR-$0000
        Displaycmd($0048,$0000)  'First Window Start                                POR-$0000
        Displaycmd($0049,$013F)  'First Window End                                  POR-$013F
        Displaycmd($004A,$0000)  'Second Window Start                               POR-$0000
        Displaycmd($004B,$0000)  'Second Window End                                 POR-$013F
        Displaycmd($0044,$EF00)  'Horizontal Ram Address Position                   POR-$EF00
        Displaycmd($0045,$0000)  'Vertical Ram Address Start                        POR-$0000
        Displaycmd($0046,$013F)  'Vertical Ram Address End                          POR-$013F
    
        Displaycmd($0030,$0707)
        Displaycmd($0031,$0204)
        Displaycmd($0032,$0204)
        Displaycmd($0033,$0502)
        Displaycmd($0034,$0507)
        Displaycmd($0035,$0204)
        Displaycmd($0036,$0204)
        Displaycmd($0037,$0502)
        Displaycmd($003A,$0302)
        Displaycmd($003B,$0302)
        Displaycmd($0023,$0000)   'RAM write data mask (1)                           POR-$0000
        Displaycmd($0024,$0000)   'RAM write data mask (2)                           POR-$0000
        Displaycmd($0025,$8000)   'not in datasheet?
        Displaycmd($004f,$0000)   'RAM Y address counter                             POR-$0000
        Displaycmd($004e,$0000)   'RAM X address counter                             POR-$0000
        Lcd_Write_Com($0022)
    
    

    At this point things look pretty simple: put all the display-specific stuff in a file. But every time I try to break things down it seems too complicated. In this example, LCD_RS is controlled by its own pin, but on the previous hardware (do I really have to support it? probably not?) it's controlled by a latch. Reset_Display, SetCounter, and SetLatch are new and totally untested.

    Does anyone have tips for the best way to manage a HUGE program, that's meant to run across multiple hardware? There's already quite a few pieces to this puzzle and it seems like I'm either going to end up with lots of little files, or one REALLY long one. Ifdefs really add to character count and I still don't quite get the syntax.

    As always, any advice appreciated!
  • ersmith Posts: 6,088
    edited 2019-04-03 17:18
    If you're looking for portability between P1 and P2 I would suggest abstracting the pin manipulation. fastspin already has dirxxx_(pin) and drvxxx_(pin) for both P1 and P2, so you could use these. For setting multiple groups of pins I'd write an object for sending data across the pins, which can use OUTA on P1 and OUTA/OUTB on P2 (and perhaps use the extended pin instructions on the next version of P2).

    In general I would try to factor out common code between your hardware into one object, and have the hardware specific parts in other objects (something like "common.spin", "driver1.spin", and "driver2.spin"). That's how I did my VGA driver: there's a common file that has all the text output routines, separate modules for setting up different modes (1024x768, 800x600, and so on), and then another common low level VGA driver that actually bangs the bits on the hardware.

    #ifdef is a really powerful helper. I regularly do things like starting files with:
    #define DEBUG
    #define FEATURE1
    #define FEATURE2
    
    and then within the code having bits like:
    PUB myfunc(arg)
    #ifdef DEBUG
       ser.printf("reached myfunc, arg=%d\n", arg)
    #endif
    
    These help with debugging and with turning features on and off during development. For example, when I'm done debugging I just comment out the "#define DEBUG" line, and the debug code is no longer compiled.

    For big things like switching hardware or processors, I would suggest that ifdef be used in large chunks, ideally at the highest level to control including whole objects rather than inside of functions. So don't write:
    PUB do_output
    #ifdef __P2__  
       p2 output code
    #else
       p1 output code
    #endif
    
    PUB do_input
    #ifdef __P2__
      p2 input code
    #else
      p1 input code
    #endif
    
    That leads to a rat's nest of #ifdefs and is hard to read (in my opinion). Instead I would do something like:
    OBJ
    #ifdef __P2__
       p: "p2_routines.spin"
    #else
       p: "p1_routines.spin"
    #endif
    
    and then use "p.do_output", "p.do_input", etc.
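
    As a rough sketch of what one of those files might hold (the file contents, the pin numbers, and the 8-bit bus are assumptions for illustration, not code from this project), p1_routines.spin could look something like:

    ```spin
    ' p1_routines.spin -- hypothetical contents, P1 flavour
    CON
      DATA_LO = 0                        ' placeholder: lowest pin of an 8-bit bus
      DATA_HI = 7                        ' placeholder: highest pin of that bus

    PUB do_output(val)
      dira[DATA_HI..DATA_LO]~~           ' make the bus pins outputs
      outa[DATA_HI..DATA_LO] := val      ' drive the low 8 bits of val

    PUB do_input : val
      dira[DATA_HI..DATA_LO]~            ' make the bus pins inputs
      val := ina[DATA_HI..DATA_LO]       ' read the bus
    ```

    p2_routines.spin would then expose the same method names with P2-specific bodies, so the calling code only ever sees p.do_output and p.do_input.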

    In general I would definitely urge abstracting things and writing lots of little functions over writing large complicated functions. The smaller the pieces you break things up into, the easier the code will be to read and the less likely it is that bugs will creep in. Don't worry about creating even tiny functions -- in fact fastspin will happily inline any small functions (ones that are just a few lines), and at optimization level -O2 it will inline any function that's only used once, so it's OK to write your code in a very modular fashion.

    The other tip I would have is to try to make things data driven where you're doing the same thing over and over but with different parameters. So for example your last function consists basically of a long list of things like:
    PRI Start_SSD1289                                       ' based on C driver   
        Displaycmd($0000,$0001)  'Turn Oscillator on                                POR-$0000
        Displaycmd($0003,$A8A4)  'Power control (1)                                 POR-$6664
        Displaycmd($000C,$0000)  'Power control (2)                                 POR- ?
        Displaycmd($000D,$080C)  'Power control (3)                                 POR-$0009
        Displaycmd($000E,$2B00)  'Power control (4)                                 POR-$3200
        Displaycmd($001E,$00B0)  'Power control (5)                                 POR-$0029
        Displaycmd($0001,$2B3F)  'Driver output control,    *landscape?? $6B3F*     POR-$433F
        Displaycmd($0002,$0600)  'LCD drive AC control                              POR-$0400
        Displaycmd($0010,$0000)  'Sleep Mode                sleep mode off          POR-$0001
        Displaycmd($0011,$6070)  'Entry Mode,               *landscape? $4030*      POR-$6830
        ...
    
    I'd change this into a display list instead:
    DAT
    ssd1289_startlist
        word $0000, $0001 ' Turn Oscillator on
        word $0003, $A8A4 ' Power control (1)
        word $000C, $0000 ' Power control (2)
        ...
        word $FFFF, $FFFF ' end of list; some values that should never appear
    
    PUB Displaylist(ptr) | reg, val
       repeat
         reg := word[ptr]
         val := word[ptr+2]
         if reg == $FFFF and val == $FFFF
            return
         Displaycmd(reg, val)
         ptr += 4   ' advance to the next register/value pair
    
    PRI Start_SSD1289
        Displaylist(@ssd1289_startlist)
    
  • Thanks @ersmith for those tips. I've wanted to rewrite the display inits for a while but I always end up stuck. I had not considered using a list value that would never appear as a terminator, I'll have to see if I can make that work somehow. I had the idea of using a list but could never figure out how to make it work with the SSD1963's multiple data per command.
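
    For a controller like the SSD1963, where one command can take several data bytes, one option is to store a count after each command and stop at an end label rather than at a terminator value. This is only a sketch: Lcd_Write_Data is a hypothetical helper, Displaylist8 and Start_SSD1963 are made-up names, and the byte values are placeholders rather than a tested init sequence.

    ```spin
    DAT
    ssd1963_startlist
        byte $E2, 3, $23, $02, $54     ' command, parameter count, parameters (placeholder values)
        byte $E0, 1, $01               ' command with one parameter (placeholder values)
        byte $01, 0                    ' command with no parameters
    ssd1963_endlist

    PUB Displaylist8(ptr, endptr) | n
      repeat while ptr < endptr
        Lcd_Write_Com(byte[ptr++])     ' command byte
        n := byte[ptr++]               ' number of parameter bytes that follow
        repeat n                       ' skipped entirely when n is 0
          Lcd_Write_Data(byte[ptr++])  ' hypothetical data-write method

    PRI Start_SSD1963
      Displaylist8(@ssd1963_startlist, @ssd1963_endlist)
    ```

    Using an end label means no command or data value has to be reserved as a marker.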


    I've made a lot of progress and am hopefully close to having something to show. I did a FB live video of the 'cardboard box notUnIpod' messing with the fonts. Now that I have a working FSRW I've been testing, I'm trying to get Ymodem working and getting close. Right now I'm banging my head against a wall because numbers.spin doesn't seem to work right. I took a quick look and can't see what would break toStr. I tried hijacking the number functions from std_text_routines.spinh but it's ugly...


    Could use some sage advice since I'm a bit stuck and have several drivers that are getting close but still need to be debugged!
        if packet==0
          'Send filename and length
          i:=strsize(@fbuf)
          bytemove(@pdata,@fbuf,i+1)      
          p:=num.tostr(size,num#DEC)     '' is fat.fsize
          j:=strsize(p)
          bytemove(@pdata+i+1,p,j+1)
          repeat k from i+1+j+1 to 128
            pdata[k]:=0
    
    ...
        ser.decuns(size, 14)  '' works for printing
          ser.str(num.tostr(size,num#DSDEC14)) ' I do this kind of thing a lot :(
    ..
    

    I had to fiddle with FSRW to make it return a file's size while walking through the directory and once I got that working the only thing missing (other than CHAIN) is damn ToStr. I think numbers.spin would be an ESSENTIAL have, as well as some string stuffs. I admit to being super lazy in this regard, but that's the point of libraries right??

  • I wasn't aware that numbers.spin didn't work. I don't have a lot of time to debug right now but I'll try to take a look at it soon. All plain spin code is supposed to work with fastspin (modulo size and speed issues), so if any doesn't it's a bug. It'd be great if you could narrow down what part of tostr() is failing. My guess is that there may be some bug in fastspin's operator precedence, but that's just a wild guess.

  • cheezus Posts: 298
    edited 2019-08-09 02:50
    ersmith wrote: »
    I wasn't aware that numbers.spin didn't work. I don't have a lot of time to debug right now but I'll try to take a look at it soon. All plain spin code is supposed to work with fastspin (modulo size and speed issues), so if any doesn't it's a bug. It'd be great if you could narrow down what part of tostr() is failing. My guess is that there may be some bug in fastspin's operator precedence, but that's just a wild guess.

    I will try to narrow some specifics down, it seems that fromStr was working although I didn't realize I was using it at the time. I have noticed some bugs with operator precedence and will keep that in mind, as I have recently. I'm very liberal with my bracketing, mostly for human readability but libraries... I haven't played with optimization settings either so that could help narrow it down. I also noticed a bug with return values and will try to get a detail of what's working and what's not.

    I've got a ton of code to debug myself and half the time when I find what could be a bug, it's from debugging my buggy code. Separating the two can be difficult in the moment and I rarely get a chance to go back and reproduce the bug, sorry.

    One thing I'm noticing is std_text_routines.spinh has some interesting code that works as an include (pretty sure) but not directly.
      repeat
        if (val < 0)
          ' synthesize unsigned division from signed
          ' basically shift val right by 2 to make it positive
          ' then adjust the result afterwards by the bit we
          ' shifted out
          r1 := val&1  ' capture low bit
          q1 := val>>1 ' divide val by 2
          digit := r1 + 2*(q1 // base)
          val := 2*(q1 / base)
          if (digit => base)
            val++
        digit -= base
     '   else   '' compiler whines about indention because of line directly above
        if (val >= 0) '' but this seems to work okay
          digit := val // base
          val := val / base
    
        if (digit => 0 and digit =< 9)
           digit += "0"
        else
           digit := (digit - 10) + "A"
        buf[i++] := digit
        --digitsNeeded
      while (val <> 0 or digitsNeeded > 0) and (i < 32)
      if (signflag > 1)
        tx(signflag)
        
      '' now print the digits in reverse order
      repeat while (i > 0)
        tx(buf[--i])
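
    A guess at the cause, assuming the logic is meant to be a plain if/else: in the listing above "digit -= base" sits at the same indentation as "if (val < 0)", which closes that block, so a following "else" has nothing to pair with. Logically the subtraction seems to belong inside "if (digit => base)". Note also that in standard P1 Spin ">=" is the assignment form of ">" (the comparison is "=>"), so "if (val >= 0)" may not be doing what it appears to. Under those assumptions, this fragment of the loop body would nest like so:

    ```spin
        if (val < 0)
          ' unsigned divide built from signed operations
          r1 := val & 1            ' capture low bit
          q1 := val >> 1           ' logical shift right by 1 makes it non-negative
          digit := r1 + 2*(q1 // base)
          val := 2*(q1 / base)
          if (digit => base)       ' remainder overflowed past one digit
            val++
            digit -= base          ' adjust only when that carry happened
        else
          digit := val // base
          val := val / base
    ```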
    
    


    I haven't tested this yet, but here's what I was thinking of as a workaround for now:
    Pri NumToStr(val, base, digitsNeeded) | i, digit, r1, q1
    
      '' make sure we will not overflow our buffer
      if (digitsNeeded > 32)
        digitsNeeded := 32
    
      '' accumulate the digits
      i := 0
      buf[i++] := 0     'z term string chz
      repeat
        if (val < 0)
          ' synthesize unsigned division from signed
          ' basically shift val right by 2 to make it positive
          ' then adjust the result afterwards by the bit we
          ' shifted out
          r1 := val&1  ' capture low bit
          q1 := val>>1 ' divide val by 2
          digit := r1 + 2*(q1 // base)
          val := 2*(q1 / base)
          if (digit => base)
            val++
        digit -= base
        'else        ' no like
        if (val >= 0)
          digit := val // base
          val := val / base
        if (digit => 0 and digit =< 9)
           digit += "0"
        else
           digit := (digit - 10) + "A"
        
        buf[i++] := digit
        --digitsNeeded
      while (val <> 0 or digitsNeeded > 0) and (i < 32)     
      return @buf  + i ' return buffer address + ptr to most sig char
    

    I've almost got ymodem working and that will help with testing significantly. I was sending small files okay but I think the serial port on my laptop is causing a problem with the longer transfers. 2 steps forward, 2 lightyears to go.
  • cheezus Posts: 298
    edited 2019-08-09 07:13
    I decided to try Kye's String Engine and it almost solved my tostring problem. Right now I'm really starting to wonder where I'm going wrong. I think the code below SHOULD figure the number of digits when places = 0 but I can't seem to make it work right!?! I guess I'll start trying to make sense of this but it seems like repeat is exiting the loop because I can never seem to get more than one digit!!
    PUB NumToStr(value, places) | p,  n
        if places > 0
            p := places
        else
            p :=1
            n := value   
            repeat while n > 0 
                n := n /  10    
                if n > 0           
                    p += 1                            
        return  str.integerToDecimal(value, p) +1
    

    I've tried a bunch of different ways of doing this but it's as if p never increments? I don't get it!

    *edit

    ughhhh, it's a sign problem... I think I have a fix..

    *aedit -

    Still not sure why numbers.spin isn't working but it's been broken for quite some time now iirc. I actually wanted to use Kye's string engine in Ymodem for a while because I think it would simplify handling user input. I'm making progress again though and can deal with the signed / unsigned issue since the only place it showed up is in the final total of the directory. I might need to change this to display total kb on disk instead of bytes but shouldn't break too much, other than the transfers of very large files that are probably impractical to transfer this way anyway.
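
    For what it's worth, the single-digit symptom fits the sign explanation: with "repeat while n > 0" a negative value never enters the loop, so p stays at 1. A minimal sketch of a digit counter that tolerates negative inputs (CountDigits is a hypothetical helper, not part of Kye's string engine):

    ```spin
    PRI CountDigits(value) : p | n
      p := 1                     ' every number has at least one digit
      n := ||value               ' absolute value, so negatives are counted too
      repeat while n => 10       ' "=>" is the greater-or-equal comparison
        n /= 10
        p++
    ```

    This still treats value as signed, so a byte total above $7FFF_FFFF would be counted as the negative number it wraps to.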
  • I think Numbers.spin is relying on the order of evaluation of side-effects like ++; that is, expressions like:
        byte[@BCX0][Digits++ >> 1] += ||(Num // Base) << (4 * Digits&1)
    
    seem to come out differently in standard Spin and fastspin (I think fastspin is evaluating the Digits++ on the left hand side before the Digits on the right hand side, but regular Spin does it the other way around). I'm going to have to think carefully about that one, but for now I suggest avoiding expressions that are that tricky (personally I find it confusing to read anyway!). Thanks for finding this!
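
    One way to sidestep the ambiguity is to give the side effect its own statement. A sketch, assuming the original Spin behavior described above (the right-hand side sees Digits before the increment); nib is a hypothetical extra local:

    ```spin
        nib := ||(Num // Base) << (4 * (Digits & 1))   ' digit shifted into the low or high nibble
        byte[@BCX0][Digits >> 1] += nib                ' two digits are packed per byte
        Digits++                                       ' increment last, in its own statement
    ```

    Written this way the result should not depend on which side of the assignment a compiler evaluates first.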

    std_text_routines.spinh isn't intended to work stand-alone, but rather converts any object with a "tx" method into one that also has "dec", "hex", and so on. A simple kind of decimal to string capability can be built with this by creating an object where "tx" just stores a character into a string:
    con
      BUFSIZ = 32
      
    var
       byte sbuf[BUFSIZ + 1] ' buffer for output
       long i ' output index
    
    ' tx copies the character to the buffer
    pub tx(c)
      if i < BUFSIZ
        sbuf[i++] := c
    
    ' fetch zero terminates the buffer, resets the index, and returns a pointer
    ' to the buffer
    pub fetch
      sbuf[i] := 0
      i := 0
      return @sbuf
    
    #include "spin/std_text_routines.spinh"
    
    ' convert integer to decimal string
    pub todec(n)
      dec(n)
      return fetch
    
    ' convert integer to hex string with w digits (default 8)
    pub tohex(n, w=8)
      hex(n, w)
      return fetch
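
    A possible way to call it, as a sketch only: "numstr" is a hypothetical file name for the object above, and the serial object name is a placeholder for whatever object with str and tx methods is already in use (it is assumed to have been started elsewhere).

    ```spin
    OBJ
      ser : "FullDuplexSerial"       ' placeholder: any serial object with str and tx
      cvt : "numstr"                 ' hypothetical file name for the object above

    PUB ShowSize(size)
      ser.str(cvt.todec(size))       ' print size as a decimal string
      ser.tx(" ")
      ser.str(cvt.tohex(size, 4))    ' print size as 4 hex digits
    ```

    Since every conversion shares the one buffer, each returned string has to be used or copied before the next todec or tohex call.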
    
  • cheezus Posts: 298
    edited 2019-08-10 22:41
    ersmith wrote: »
    I think Numbers.spin is relying on the order of evaluation of side-effects like ++; that is, expressions like:
        byte[@BCX0][Digits++ >> 1] += ||(Num // Base) << (4 * Digits&1)
    
    seem to come out differently in standard Spin and fastspin (I think fastspin is evaluating the Digits++ on the left hand side before the Digits on the right hand side, but regular Spin does it the other way around). I'm going to have to think carefully about that one, but for now I suggest avoiding expressions that are that tricky (personally I find it confusing to read anyway!). Thanks for finding this!


    This is one that caught me up several times. One of the things I hate about "black box" object usage is really hard to understand code like this. When things break it becomes nearly impossible to debug. Kye's string engine works great, although it only handles signed decimal. I'm sure there's a few other formats that may get tricky, I'll have to see when I get there. Right now it looks a little ugly but it's working.
    PUB NumToStr(value, places) | p,  n
        if places == 0
            p := 3
            n := value   
            repeat while (n /= 10) > 10   
                if n > 10 
                    p ++                                                                
        else
            p := places 
            
        return  str.integerToDecimal(value, p-1) +1
    


    Now I'm to the place of testing the memory chips and was hoping to use the old XMM cache test to verify but now I have to wrap my head around not having movd / movs.
            movs    :ld, line
            movd    :st, line
    


    I'm happy that things are progressing and everything seems to work good so far.

    SRW_CHZrc3 works really well and the smartpin spi code seems very stable. There's a lot of possible optimizations but I think the next thing to work on is LUT sharing. I've got a build of Ymodem that can only send to the SD card, receive is broken but I think it has something to do with the smartpin serial?? Haven't really debugged this yet but the one nice thing I did notice is the limiting factor now is the serial, with transfer speed constant over any sysclock setting (as long as it's able to keep up with the baud). I've also been running 460800 baud, and had a few tests complete at 921600 baud, although I think self heating and sysclock > 240 MHz is a problem right now.

    I'm also able to turn my tft lcd on and display some text. Started working on cog code for that, although SRAM is the next thing I need to verify. I've got some SPI code that should read the touch adc but have not even tried testing yet. Things are progressing though and it's amazing how fast this chip is!
  • Just a note: I think I've figured out why Numbers.spin doesn't work right in fastspin, and I have checked workarounds into github (so it should work in the next fastspin release). There were two issues:

    (1) As noted above, the order of evaluation of things like x[Digit++] |= Digit was different between openspin and fastspin; fastspin followed GCC's convention (the Digit++ on the LHS got evaluated before the Digit on the RHS) but openspin did it the other way around.

    (2) Numbers.spin relies on "x/0" to equal 0, whereas fastspin's divide routine returned -1 for this.

    I think relying on the value of division by 0 is particularly ugly, but I'll change fastspin to match the original Spin since perhaps other code relies on this too :(.
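
    Code that has to run under both compilers can avoid depending on either result by testing the divisor first. A trivial sketch (SafeDiv is a hypothetical helper, and returning 0 is just one possible policy):

    ```spin
    PRI SafeDiv(a, b)
      if b == 0
        return 0                 ' explicit choice instead of whatever x/0 yields
      return a / b
    ```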
  • -1 is TRUE in SPIN, anything else is false. That was done to make booleans in conditional expressions easier. Just FYI.

  • potatohead wrote: »
    -1 is TRUE in SPIN, anything else is false. That was done to make booleans in conditional expressions easier. Just FYI.

    Thanks. fastspin has always done this for Spin code, and so that wasn't an issue. I suppose one could argue that making "x/0" be 0 ties in to it being "false", but that's tenuous, since valid division like "1/2" also produces 0. Anyway, since division by 0 is undefined there's no particular harm in making the result 0 like Spin does, although many processors do return $FFFFFFFF (the maximum unsigned 32-bit integer) for that case, as it's the natural output of the division algorithm when used with a divisor of 0.
  • cheezus Posts: 298
    edited 2019-12-07 05:30
    I think I've either found a bug in fastspin or maybe I'm just doing something wrong. Lookupz seems to be misbehaving as some working P1 spin doesn't seem to work on the P2. Here's the offending usage.
      value <<= (8 - digits) << 2
      repeat digits
        curx := TextChar(font,lookupz((value <-= 4) & $F : "0".."9", "A".."F"),Curx,Cury)
    

    compiles to
    _texthex
    	mov	COUNT_, #17
    	calla	#pushregs_
    	mov	local01, arg01
    	mov	local02, arg02
    	mov	local03, arg03
    	mov	local04, local02
    	mov	local05, #8
    	sub	local05, local03
    	shl	local05, #2
    	shl	local04, local05
    	mov	local02, local04
    	mov	local06, local03
    LR__0045
    	cmp	local06, #0 wz
     if_e	jmp	#LR__0046
    	mov	local04, local01
    	mov	local07, local02
    	rol	local07, #4
    	mov	local02, local07
    	mov	local05, local02
    	and	local05, #15
    	mov	local08, local05
    	mov	local09, #0
    	add	ptr__dat__, ##5324
    	mov	local10, ptr__dat__
    	sub	ptr__dat__, ##5324
    	mov	local11, local10
    	mov	local12, #16
    	mov	arg01, local08
    	mov	arg02, local09
    	mov	arg03, local11
    	mov	arg04, local12
    	calla	#__system___lookup
    	mov	local13, result1
    	mov	local14, local13
    	add	objptr, ##3324
    	rdword	local15, objptr
    	sub	objptr, ##3324
    	add	objptr, ##3326
    	rdword	local16, objptr
    	sub	objptr, ##3326
    	mov	arg01, local04
    	mov	arg02, local14
    	mov	arg03, local15
    	mov	arg04, local16
    	calla	#_textchar
    	mov	local17, result1
    	add	objptr, ##3324
    	wrword	local17, objptr
    	sub	objptr, ##3324
    	mov	local04, local06
    	sub	local04, #1
    	mov	local06, local04
    	jmp	#LR__0045
    LR__0046
    	mov	ptra, fp
    	calla	#popregs_
    _texthex_ret
    	reta
    

    That's broken my hex to ascii method currently.
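
    Until the lookupz question is settled, one possible workaround (untested on P2; the TextHex signature is guessed from the listing above, and hexdigits is a made-up table name) is to index a DAT table and do the rotate in its own statement:

    ```spin
    DAT
    hexdigits   byte "0123456789ABCDEF"

    PUB TextHex(font, value, digits) | d
      value <<= (8 - digits) << 2        ' move the wanted digits to the top
      repeat digits
        value <-= 4                      ' rotate the next nibble into the low bits
        d := value & $F
        Curx := TextChar(font, byte[@hexdigits][d], Curx, Cury)
    ```

    If this version prints correctly, that would at least narrow the problem down to lookupz or to the assignment inside its index expression.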

    I'm also having problems with something like this.
    CON
        SPI_A0  =  16  
    …
        DIR := $0003_0000 
        OUT :=  ((spi_select & %11) << SPI_A0)
    

    Again, I haven't had a chance to track this down and have OUT hard-coded to 0 for now. I tried a few different things and not quite sure what's going on.
    Solved with a small wait; it seems to be a race condition with the hardware? Need to track that down later when there's no ribbon cable.

    I've got a LOT of bugs in my own code to find still, I'm sure. And I still have a long way to go, but my project is starting to show signs of life. Hopefully @ersmith has some answers.