P2 Experiments- P1 source to P2 Fastspin

cheezus · 2018-12-31 05:42

So I've played with TAQOZ a bit and now I want to try converting P1 source to P2. I ran my touchscreen desktop app through fastspin and got what looks like a complete ASM output. I haven't gone through all of it yet but it looks like it should run. Are there any known gotchas I should watch out for? I'm finishing up new hardware now, waiting on a couple of final things. Progress has been slow over the holidays but I'm getting ready for another big push after the new year.

Thanks for any advice!

whicker · 2018-12-31 06:18

Here is what I'm encountering with Fastspin:

I am using Windows 7.

1) I heard about issues with loadp2.exe so I grabbed and compiled the version from here:
https://github.com/davehein/p2gcc

I placed the newly compiled loadp2.exe version 0.007 into the /bin folder of spin2gui

2) When running spin2gui, I had to open the menu labeled "Commands" and select "Configure Commands..."
This brings up a window called "Executable Paths"
I pressed the [P2 Defaults] button, but had to change the text in "Run Command" to this path:

cmd.exe /c start %D/bin/loadp2 %B -CHIP -t -v

Press OK to close the configure commands window.

3) I opened up the sample file blink2.bas. Changed the first lines to this, and saved:

const LED1 = 56
const LED2 = 57

4) Pressed the "Compile & Run" button.
Got this in the terminal, because I'm using -v for loadp2 (verbose mode):

Searching serial ports for a P2
P2 version A found on serial port com9
Setting clock_mode to 10c1f08
Setting user_baud to 115200
Loading C:/Propeller/spin2gui/samples/blink2.binary - 1440 bytes
C:/Propeller/spin2gui/samples/blink2.binary loaded
[ Entering terminal mode.  Press ESC to exit. ]

5) Now I'm stuck. It looks like fastspin inserts a hubset #255 into the asm code. From various other comments, this isn't going to work on the P2-EVAL board.

Here is the beginning of the .lst file it generates for blink2.bas, with me having changed the pin numbers of led1 and led2.

00000                 | con
00000                 | 	led1 = 56
00000                 | 	led2 = 57
00000                 | dat
00000 000             | 	org	0
00000 000 
00000 000             | entry
00000 000 00F00FF2    | 	cmp	ptra, #0 wz
00004 001 1800905D    |  if_ne	jmp	#spininit
00008 002 30F003F6    | 	mov	ptra, objptr
0000c 003 00F007F1    | 	add	ptra, #0
00010 004 00FE65FD    | 	hubset	#255
00014 005 D404C0FD    | 	calla	#@_program

twm47099 · 2018-12-31 07:12

I rename the .p2asm file to .spin2, open the .spin2 file, change the hubset #255 to hubset #0, and then press Compile & Run on the .spin2 file. That will run the P2-ES using RCFAST (20MHz). There are changes you can make to the .spin2 file to change the clock speed:
https://forums.parallax.com/discussion/comment/1459371/#Comment_1459371

whicker · 2018-12-31 07:51

Thanks @twm47099 !!!

When changing the .p2asm file to .spin2 as you described, and adding the _CLOCKFREQ constants from that message, I think I had to go down further below to set the clock frequency as well in the COG_BSS_START section, if present?

| COG_BSS_START
| 	fit	496
| 	orgh	$400
| 	'SEEMS TO MAKE A DIFFERENCE? First long is CPU Clock?
| 	long	_CLOCKFREQ	'was long 80000000
| 	...
| hubentry

whicker · 2018-12-31 08:11

So @cheezus, a gotcha is it looks like fastspin is not currently generating code for setting the CPU clock frequency correctly.

Dave Hein · 2018-12-31 11:02

Try commenting out the "hubset #$ff". By default, loadp2 will set up the system clock to 80 MHz. If you want the user program to set up the clock, then you have to specify the -SINGLE option to loadp2.

ersmith · 2018-12-31 21:15

There's a new fastspin / spin2gui release that should work much better. It no longer sets the clock automatically; you'll have to either have loadp2 do that or insert an explicit clkset(mode, freq) at the beginning of your Spin or BASIC code.
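
As a rough sketch of what that explicit call might look like in Spin: the constant names here are only illustrative, and the mode value is simply the one loadp2 printed in the log earlier in the thread, assumed to go with its 80 MHz default. Check both against your own board before relying on them.

```spin
CON
  _CLOCKFREQ = 80_000_000       ' assumed target system clock in Hz
  _SETFREQ   = $010c1f08        ' assumed clock mode (value taken from the loadp2 log above)

PUB main
  clkset(_SETFREQ, _CLOCKFREQ)  ' set the clock first, before anything that depends on timing
  repeat                        ' rest of the program goes here
```

If loadp2 is already setting the clock for you, the clkset line can simply be left out.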

cheezus · 2018-12-31 22:49

Thanks guys! I've been working on getting the new hardware ready, turns out I'm short on pin headers.

I'll have to figure out something to keep me busy while I wait.

I'm really impressed with fastspin, I need to clean up the project so it will compile for P1. I'm still not clear on the translation of DAT sections. It LOOKS like they have been parsed to P2asm correctly (of course there may be issues of timing or something?) but it can't be THAT simple can it?

potatohead · 2018-12-31 23:16

Why not?

That is what normally happens. Assembly boils down to just being a set of directives to assemble a binary file. It may or may not be instructions. Could just be data.

ersmith · 2019-01-01 06:09

cheezus wrote: »

I'm really impressed with fastspin, I need to clean up the project so it will compile for P1. I'm still not clear on the translation of DAT sections. It LOOKS like they have been parsed to P2asm correctly (of course there may be issues of timing or something?) but it can't be THAT simple can it?

I'm glad it's working for you. fastspin doesn't try to do any translation of P1 asm to P2 asm, so you're probably just getting lucky if your DAT section has much legacy P1 assembly code in it. The P2 instruction set is "mostly" a superset of P1, and I have kept a few P1 assembly conventions (e.g. fastspin accepts "wc,wz" as well as "wcz"), so some P1 asm will compile OK. But in many cases you'll need to put an "#ifdef __P2__" in the DAT section with the P2 specific assembly.
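
A minimal sketch of that kind of split, assuming an LED on pin 56 for the P2 (as in the blink example earlier in the thread) and a hypothetical LED on pin 16 for the P1. The labels and the cognew call are only for illustration, and the loops toggle far too fast to see without a delay:

```spin
PUB start
  cognew(@entry, 0)             ' launch the assembly below in a new cog

DAT
        org     0
entry
#ifdef __P2__
        drvnot  #56             ' P2: drive pin 56 and toggle it in one instruction
        jmp     #entry
#else
        or      dira, pinmask   ' P1: make the pin an output
loop    xor     outa, pinmask   ' P1: toggle it
        jmp     #loop
pinmask long    1 << 16         ' hypothetical P1 LED pin
#endif
```

Only one of the two branches is ever assembled, so each side can use instructions the other chip doesn't have.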

The Spin code (outside of DAT) should mostly translate fine, so the more you can put in Spin the better. Performance shouldn't be an issue at all, fastspin is compiling to hubexec code on P2, so your P2 compiled Spin code will probably run faster than hand-crafted P1 assembly.

cheezus · 2019-01-18 03:11

So having spent some more time writing spin code, compiling through fastspin and looking at the output I'm even more impressed than I was initially. I've still been using the proptool for the bulk of editing but I think it's time for me to look at some of the other editors. I've always used proptool and it's just that comfortable place to develop for the p1. There's a lot of nice offerings, I've just got to be willing to get out of my comfort zone.

The one thing that I'm looking for in a toolchain is conditional compilation. If I understand correctly fastspin has a preprocessor, with built in directives?

I still haven't tried to run any code on the P2 yet, I really wanted to get something working on the p1 first. I finally got the Spin p1 version working and before going to bed I thought I'd try fastspin and... well it sure lives up to its name. It went from some 10s of seconds to draw a 320x240 blank screen to at least a refresh a second? This is with no work to optimize the spin code for speed. I'd LIKE to create an object that works for P1 and P2 and it seems doable but there's a lot of gotchas with the current codebase.

Is there a "fastspin" manual I'm missing?

Thanks as always to all the forum members here, you are awesome!

ersmith · 2019-01-18 17:15

cheezus wrote: »

I've still been using the proptool for the bulk of editing but I think it's time for me to look at some of the other editors. I've always used proptool and it's just that comfortable place to develop for the p1.

It's definitely worth checking out @Rayman 's SpinEdit, it's got an interface very similar to proptool but can use openspin or fastspin to do the compiling.

The one thing that I'm looking for in a toolchain is conditional compilation. If I understand correctly fastspin has a preprocessor, with built in directives?

Yes. See the docs/spin.md file in the fastspin or spin2gui release. It's a basic subset of the C preprocessor, with #define, #ifdef / #else / #endif, and #include. There are some predefined symbols, like __P2__ if compiling for the P2 (again, documented in the spin.md file; that's a GitHub mark-down file, so readable as plain text).

whicker · 2019-01-19 05:28

I've been converting the markdown (.md) files to PDF for my own use when I grab a new release. It makes for an easier read.

cheezus · 2019-02-09 06:00

Okay, I did it again! Still working on P1 source..

Seems fastspin doesn't like the copy of fsrw I've been using, giving me an error that's got me scratching my head right now.
"fsrw_Ray.Spin(568) error: internal error : expected statement list, got 87"

So I go looking for an 87 in the code and can't find a dec, hex or binary... But line 568 is a blank line so I thought maybe something in the last function isn't properly formatted or something. I'm sure it's something simple really.

pub nextfile(fbuf) | i, t, at, lns
{{
'   Find the next file in the root directory and extract its
'   (8.3) name into fbuf.  Fbuf must be sized to hold at least
'   13 characters (8 + 1 + 3 + 1).  If there is no next file,
'   -1 will be returned.  If there is, 0 will be returned.
}}
   repeat
      if (bufat => bufend)
         t := pfillbuf
         if (t < 0)
            return t
         if (((floc >> SECTORSHIFT) & ((1 << clustershift) - 1)) == 0)
            fclust++
      at := @buf + bufat
      if (byte[at] == 0)
         return -1
      bufat += DIRSIZE
      if (byte[at] <> $e5 and (byte[at][$0b] & $18) == 0)
         lns := fbuf
         repeat i from 0 to 10
            byte[fbuf] := byte[at][i]
            fbuf++
            if (byte[at][i] <> " ")
               lns := fbuf
            if (i == 7 or i == 10)
               fbuf := lns
               if (i == 7)
                  byte[fbuf] := "."
                  fbuf++
         byte[fbuf] := 0
         fsize2:= brlong(at+$1c)
         return 0

I was finally able to trim down Touch.Spin enough to compile the desktop app. I'm still rewriting the ASM driver in Spin... Got a long way to go and starting to wonder if I should just bypass this step and jump to P2asm. I wish I had the time and "resources" to devote to serious development... But, almost 3 year old twin boys who think playing with wires and cords is almost as amazing as beating each other with plastic baseball bats; a cat that thinks jumper wires are the best thing since hair ties and feathers; as well as preparing for a move all slow the process.

btw, I still haven't rtfm :P

*edit, added sources i'm trying to build, duh

ersmith · 2019-02-09 13:16

An "internal error" is pretty much always a bug in fastspin: it indicates that some sanity check that I added to the code failed.

In this case it looks like fastspin messed up compiling the line:

  \sdspi.readblock(0, @buf)

in the mount(basepin) function in fsrw_Ray.spin. I'll try to sort out what's wrong, but for now probably removing the \ will get it going.

cheezus · 2019-02-09 17:38

Thanks as always for taking a look at that.

One of the things I noticed after I posted was touch.spin compiled program size comes out to 35580 bytes? @msrobots mentioned running out of memory and this is entirely possible. I've been trying to shave longs by moving all large constants to a DAT block. Not sure if this is the best way to go...

ersmith · 2019-02-10 13:15

cheezus wrote: »

Thanks as always for taking a look at that.

One of the things I noticed after I posted was touch.spin compiled program size comes out to 35580 bytes? @msrobots mentioned running out of memory and this is entirely possible. I've been trying to shave longs by moving all large constants to a DAT block. Not sure if this is the best way to go...

Yes, unfortunately fastspin binaries are quite a bit bigger than regular Spin ones, since they compile to PASM rather than to compressed bytecode.

Moving large constants to a DAT block probably won't make much difference; the compiler tries to keep only one copy of each large immediate constant. But it won't hurt either, and maybe you'll notice some things the compiler missed. I kind of doubt you'll get 3K back though.

I have some ideas on how to allow fastspin to compress instructions (like PropGCC's CMM mode), but that isn't going to happen in the near future.

cheezus · 2019-02-10 22:59

ersmith wrote: »

Yes, unfortunately fastspin binaries are quite a bit bigger than regular Spin ones, since they compile to PASM rather than to compressed bytecode.

Moving large constants to a DAT block probably won't make much difference; the compiler tries to keep only one copy of each large immediate constant. But it won't hurt either, and maybe you'll notice some things the compiler missed. I kind of doubt you'll get 3K back though.

I have some ideas on how to allow fastspin to compress instructions (like PropGCC's CMM mode), but that isn't going to happen in the near future.

When I started playing with this I wasn't even sure it would work but now I'm starting to see potential. Part of the reason for these experiments is I'm trying to understand ways to optimize as well as restructure things. As it stands now, there's a lot of code that only runs once, like display inits. This is one place I've always thought things could be improved. I've thought about putting the init code into CSV files on the SD and just parsing through a file for display init, but it really seems best to have the display init code in EEPROM so that if there's something wrong with the SD I can still init the display and print debugging info.

For a while I used a small program to init the display, start the SD and chain into another program that actually loaded the desktop. From here each program could be chained to and when finished kicks back to the EEPROM. This would check the SRAM to see if the desktop attributes were still in ram and if so skip loading the desktop and just display the pre-loaded desktop. The return to desktop from a running app is instant, which is nice because loading an 800x480 image from SD seems like it takes forever!

There were several problems with doing things this way but the biggest annoyance, other than loading the desktop for the first time, was switching between displays. The previous hardware design didn't use the LCD /RD and simply pulled up to 3v3. The new hardware should allow me to determine which controller is plugged in, instead of changing a line in the source code and recompiling the entire package. The biggest problem I'm having is figuring out how to handle 2 different resolutions, as for a long time the 320x240 "package" was completely separate from the 800x480.

Getting back the 3k is going to be tricky but with 1mw of SRAM at least I have some place to stuff data between programs run off SD. It's very interesting looking at the spin code and then seeing the pasm output. I haven't really noticed your compiler "slacking off" anywhere and the moving large constants is just to satisfy the FIT directive. I think I may have made some mistakes and need to double check pointer vs register. I'm also thinking that display init data could be overwritten for buffers if I could wrap my head around that mess.


    sdbuffer                    byte 0[1024]                   ' 1024 byte buffer for sd card interface and also for sending one row 320 x3 = 960 bytes max picture size 
    buffer2                     byte 0[512]                    ' 512 general purpose hub buffer 
    rambuffer                   word 0[1024]                    ' 512 byte (256 word) buffer easier for using with words, needed to double this

I was having trouble with some piece of code overflowing the rambuffer so that's 4x larger than it needs to be. There's plenty of places where this code can be optimized, I'm just trying to understand the best ways to optimize for P1 and looking for ideas in the P2 rewrite.

ersmith · 2019-02-11 00:18

If you're running out of COG memory (and yes, having lots of large constants will do that), have you tried disabling the code cache with --fcache=0? That'll reclaim a fair chunk of COG memory pretty easily, albeit with a bit of a hit to performance.

cheezus · 2019-04-03 11:53

I'm finally getting to "the driver rewrite" while simultaneously bringing up the new hardware. Now that things are starting to show signs of life I need some help figuring out the right way to "segment" things a bit. The one obvious thing that stood out while working on the test harness was the display initialization code. With the inits for 3 displays the compiler was generating a LOT of big numbers and eating into the cog. I can also see how using the new pin instructions could save some cog memory too, but at the expense of backwards compatibility.

CON ''PINS
  LCD_RS  = |< 16       '
  LCD_CS  = |< 20       ' p5 -1 -d_en
  
  ''LCD_RD  = |< 18     ' p6 -0 -d_r
  ''LCD_WR  = |< 19     ' p6 -1_d_W
  ''LCD_RST = |< 21     ' GP3 & P10
    
  LCD_PINS = LCD_CS 
  LCD_DIRS = LCD_PINS | LCD_RS | $FFFF
  
    D_RST = 10
  
    SPI_A0 = 10   
    SPI_A1  =11

    SD_DO   =   12
    SD_CLK  =   13
    SD_DI   =   14
    SD_CS   =   15
    
    GROUP_EN    = 21
    CLOCK_PIN   = 20 
  
CON ''OTHER STUFFS
  _1ms      = 1_000_000 /     1_000                     ' Divisor for 1 ms                                                                      

  'define latch bits
    #1, _ramEnable, _ram_rW, _flash_rW, _dispEnable, _disp_rW, _flashEnable     '' version 1, UGLY
    _group0, _group1, _group2, _group3                                          '' group bits
    
''OBJ   disp '' display object - SSD1963,SSD1289 specific - todo

VAR  
    word curx, cury, ScreenWidth, ScreenHeight
    word BackFontColor, FontHeight
    byte DisplayMode, orientation                                                                                                                          
    byte rambuffer[512]

    'cog stuffs
    byte latchvalue, spiselect

                                                          
PUB Testing | i    
   
   clkset(_SETFREQ, _CLOCKFREQ)
   waitcnt(cnt+clkfreq*2)
    
  Reset_Display
  Start_SSD1963

   'Draw(0, 0, 479, 799)
   REPEAT   
     Draw(0, 0, 479, 799)
     repeat (480 * 800)
       Pixel(0)  

     Draw(0, 0, 479, 799)
     repeat (480 * 800) 
       Pixel($ffff)

 repeat
   pause1ms(1000)

PUB pause1ms(period) | clkcycles                        '' Pause execution for period (in units of 1 ms).
  clkcycles := ((clkfreq / _1ms * period) - 4296) #> 381     ' Calculate 1 ms time unit
  waitcnt(clkcycles + cnt)                                   ' Wait for designated time   

''*********************************************************************************************************************************************************

' ***************** Start display routines *************************


PUB Draw(x1, y1, x2, y2)                                '' sets the pixel to x1,y1 and then fills the next (x2-x1)*(y2-y1) pixels
{ 
   ifnot orientation                                ' landscape mode so swap x and y for 1963
        result :=x1                                       ' swap x1 and y1
        x1 := y1
        y1 := result
        result := x2                                      ' swap x2 and y2
        x2 :=y2
        y2 := result
}
   DisplayEnable                                    ' enable one or both displays

    ''  ssd1963
     Lcd_Write_Com($002B)
     Lcd_Write_Data(x1>>8)
     Lcd_Write_Data(x1&$ff)
     Lcd_Write_Data(x2>>8)
     Lcd_Write_Data(x2&$ff)
        
     Lcd_Write_Com($002A)
     Lcd_Write_Data(y1>>8)
     Lcd_Write_Data(y1&$ff)
     Lcd_Write_Data(y2>>8)
     Lcd_Write_Data(y2&$ff)    

     Lcd_Write_Com($002c)

{    ''  ssd1289
      Displaycmd($0044,(x2<<8)+x1)           
      Displaycmd($0045,y1)
      Displaycmd($0046,y2)
      Displaycmd($004e,x1)
      Displaycmd($004f,y1)
      Lcd_Write_Com($0022)
}      
   LCD_RS_High

PUB Pixel(pixelcolor)                                   '' send out a pixel
     Lcd_Write_Fast_Data(pixelcolor)   ' need to set RS high at beginning of this group (ie call Draw first)
     ' it is more efficent to send one Draw command then lots of pixels than sending individual pixels
     ' but of course, you can send draw x,y,x,y where x and y are the same and then send one pixel


 ' ***********************************************************
 
'' DISPLAY SETTINGS
   
PRI DisplayEnable
    latchvalue := ((latchvalue & $FC) | _group2) | _dispEnable
    SetLatch( latchvalue )
    OUTA |= LCD_PINS                                 ' set /cs high
    DIRA |= LCD_DIRS                                 ' enable these pins for output

PRI SpinHubToDisplay(hubaddress,number)| i
'    DisplayEnable
    repeat i from 0 to number -1
      pixel(long[hubaddress][i])

PRI DisplayCmd(c,d) ' instruction in one method
    Lcd_Write_Com(c) ' send out a word
    Lcd_Write_Data(d)

PRI Lcd_Write_Com(LCDlong)
    LCD_RS_low                               ' can do rs first then cs - better for latch board
    LCD_Writ_Bus(LCDlong)
 
PRI Lcd_Write_Data(LCDlong)
    LCD_RS_High                              ' can do rs first then cs
    LCD_Writ_Bus(LCDlong)

PRI Lcd_Write_Fast_Data(LCDlong)            ' write RS elsewhere then skip the RS above as this is a latch output and takes longer
    LCD_Writ_Bus(LCDlong)

PRI LCD_Writ_Bus(LCDLong)
    LCDLong &= $0000_FFFF
    OUTA &= %11111111_11111111_00000000_00000000        ' set P0-P15 to zero ready to OR
    OUTA |= LCDLong                                     ' merge with the word to output
    LCD_CS_Low
    LCD_CS_High    

PRI LCD_CS_Low
    OUTA &= !LCD_CS
   
PRI LCD_CS_High                                         
    OUTA |= LCD_CS

PRI LCD_RS_Low                                          
    OUTA &= !LCD_RS

PRI LCD_RS_High            
    OUTA |= LCD_RS

PRI SetLatch(value)
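    OUTA |= (|< GROUP_EN)                               ' GROUP_EN and CLOCK_PIN are pin numbers, so convert them to bit masks
    OUTA |= (|< CLOCK_PIN)
    OUTA &= !$FF
    value &= $FF
    OUTA |= value
    OUTA &= !(|< CLOCK_PIN)                             ' pulse the clock low, then high again
    OUTA |= (|< CLOCK_PIN)
    OUTA &= !(|< GROUP_EN)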

PRI SetCounter(address) | olv
    olv := latchvalue
    SetLatch($0)
    OUTA &= !(|< CLOCK_PIN)                             ' CLOCK_PIN is a pin number, so convert it to a bit mask
    OUTA &= !$F_FFFF
    address &= $F_FFFF
    OUTA |= address
    OUTA |= (|< CLOCK_PIN)
    SetLatch(olv)
            
PRI Reset_Display |   olv 
    olv := latchvalue
    SetLatch(_group3)
    OUTA &= !(|< D_RST)                                 ' D_RST is a pin number, so convert it to a bit mask
    pause1ms(10) 
    OUTA |= (|< D_RST)
    SetLatch(  olv )
    
DAT
   VDP        long  479     ' v pixels
   HDP        long  799     ' h pixels
   HT         long  928
   HPS        long  46
   LPS        long  15
   VT         long  525
   VPS        long  16
   FPS        long  8
   HPW        long  48
   VPW        long  16

PRI Start_SSD1963
    
    Lcd_Write_Com ($00E2)                             ' //PLL multiplier, set PLL clock to 120M
    Lcd_Write_Data($0023)                             ' //N=0x36 for 6.5M, 0x23 for 10M crystal ' $21??
    Lcd_Write_Data($0002)                             '
    Lcd_Write_Data($0054)                             ' dummy byte?  54??
    Lcd_Write_Com ($00E0)                             ' // PLL enable
    Lcd_Write_Data($0001)                             ' set pll             '
    pause1ms(1)                                      '
    Lcd_Write_Com ($00E0)                             ' pll                         '
    Lcd_Write_Data($0003)                             ' lock
    pause1ms(5)
    Lcd_Write_Com ($0001)                             ' // software reset
    pause1ms(5)
    Lcd_Write_Com ($00E6)                             ' //PLL setting for PCLK, depends on resolution
    Lcd_Write_Data($0003)                             '
    Lcd_Write_Data($00ff)                             '
    Lcd_Write_Data($00ff)                             '
    Lcd_Write_Com ($00B0)                              ' //LCD SPECIFICATION
    Lcd_Write_Data($0000)                             '
    Lcd_Write_Data($0000)                             '
    Lcd_Write_Data((HDP>>8)&$00FF)                    ' //Set HDP 'hps??
    Lcd_Write_Data(HDP&$00FF)
    Lcd_Write_Data((VDP>>8)&$00FF)                    ' //Set VDP 'vps??
    Lcd_Write_Data(VDP&$00FF)
    Lcd_Write_Data($0000)
    Lcd_Write_Com ($00B4)                             ' //HSYNC
    Lcd_Write_Data((HT>>8)&$00FF)                     ' //Set HT
    Lcd_Write_Data(HT&$00FF)
    Lcd_Write_Data((HPS>>8)&$00FF)                    ' //Set HPS
    Lcd_Write_Data(HPS&$00FF)
    Lcd_Write_Data(HPW)                               ' //Set HPW
    Lcd_Write_Data((LPS>>8)&$00FF)                    ' //Set HPS
    Lcd_Write_Data(LPS&$00FF)                         '
    Lcd_Write_Data($0000)                             '
    Lcd_Write_Com ($00B6)                             ' //VSYNC
    Lcd_Write_Data((VT>>8)&$00FF)                     ' //Set VT
    Lcd_Write_Data(VT&$00FF)
    Lcd_Write_Data((VPS>>8)&$00FF)                    ' //Set VPS
    Lcd_Write_Data(VPS&$00FF)
    Lcd_Write_Data(VPW)                               ' //Set VPW
    Lcd_Write_Data((FPS>>8)&$00FF)                    ' //Set FPS
    Lcd_Write_Data(FPS&$00FF)                         '
    Lcd_Write_Com ($00BA)                             '
    Lcd_Write_Data($0005)                             ' //GPIO[3:0] out 1
    Lcd_Write_Com ($00B8)                             '
    Lcd_Write_Data($0007)                             ' //GPIO3=input, GPIO[2:0]=output
    Lcd_Write_Data($0001)                             ' //GPIO0 normal
    Lcd_Write_Com ($0036)                              ' //rotation
    Lcd_Write_Data($0021)                              ' 3 is 180    21??
    Lcd_Write_Com ($00F0)                             ' //pixel data interface
    Lcd_Write_Data($0003)                             ' 16 bit / 565
    pause1ms(5)
    Lcd_Write_Com ($0029)                              ' //display on
    Lcd_Write_Com ($00d0)                              ' //dynamic backlight
    Lcd_Write_Data($000d)
    Lcd_Write_Com($002c)
    LCD_RS_High
    
PRI Start_SSD1289                                       ' based on C driver   
    Displaycmd($0000,$0001)  'Turn Oscillator on                                POR-$0000
    Displaycmd($0003,$A8A4)  'Power control (1)                                 POR-$6664
    Displaycmd($000C,$0000)  'Power control (2)                                 POR- ?
    Displaycmd($000D,$080C)  'Power control (3)                                 POR-$0009
    Displaycmd($000E,$2B00)  'Power control (4)                                 POR-$3200
    Displaycmd($001E,$00B0)  'Power control (5)                                 POR-$0029
    Displaycmd($0001,$2B3F)  'Driver output control,    *landscape?? $6B3F*     POR-$433F
    Displaycmd($0002,$0600)  'LCD drive AC control                              POR-$0400
    Displaycmd($0010,$0000)  'Sleep Mode                sleep mode off          POR-$0001
    Displaycmd($0011,$6070)  'Entry Mode,               *landscape? $4030*      POR-$6830
    Displaycmd($0006,$0000)  'Compare Register (2)                              POR-$0000
    Displaycmd($0016,$EF1C)  'Horizontal Porch                                  POR-$EFC1
    Displaycmd($0017,$0003)  'Vertical Porch                                    POR-$0003
    Displaycmd($0007,$0233)  'Display Control    '0033?                               POR-$0000
    Displaycmd($000B,$0000)  'Frame cycle Control              'd308                 POR-$D308
    Displaycmd($000F,$0000)  'Gate Scan start position                          POR-$0000
    Displaycmd($0041,$0000)  'Vertical Scroll Control (1)                       POR-$0000
    Displaycmd($0042,$0000)  'Vertical Scroll Control (2)                       POR-$0000
    Displaycmd($0048,$0000)  'First Window Start                                POR-$0000
    Displaycmd($0049,$013F)  'First Window End                                  POR-$013F
    Displaycmd($004A,$0000)  'Second Window Start                               POR-$0000
    Displaycmd($004B,$0000)  'Second Window End                                 POR-$013F
    Displaycmd($0044,$EF00)  'Horizontal Ram Address Postion                    POR-$EF00
    Displaycmd($0045,$0000)  'Vertical Ram Address Start                        POR-$0000
    Displaycmd($0046,$013F)  'Vertical Ram Address End                          POR-$013F

    Displaycmd($0030,$0707)
    Displaycmd($0031,$0204)
    Displaycmd($0032,$0204)
    Displaycmd($0033,$0502)
    Displaycmd($0034,$0507)
    Displaycmd($0035,$0204)
    Displaycmd($0036,$0204)
    Displaycmd($0037,$0502)
    Displaycmd($003A,$0302)
    Displaycmd($003B,$0302)
    Displaycmd($0023,$0000)   'RAM write data mask (1)                           POR-$0000
    Displaycmd($0024,$0000)   'RAM write data mask (2)                           POR-$0000
    Displaycmd($0025,$8000)   'not in datasheet?
    Displaycmd($004f,$0000)   'RAM Y address counter                             POR-$0000
    Displaycmd($004e,$0000)   'RAM X address counter                             POR-$0000
    Lcd_Write_Com($0022)

At this point things look pretty simple: put all the display specific stuff in a file. But every time I try to break things down it seems too complicated. In this example, LCD_RS is controlled by its pin, but on the previous hardware (do I really have to support it? probably not?) it's controlled by a latch. Reset_Display, SetCounter, and SetLatch are new and totally untested.

Does anyone have tips for the best way to manage a HUGE program, that's meant to run across multiple hardware? There's already quite a few pieces to this puzzle and it seems like I'm either going to end up with lots of little files, or one REALLY long one. Ifdefs really add to character count and I still don't quite get the syntax.

As always, any advice appreciated!

ersmith · 2019-04-03 17:17

If you're looking for portability between P1 and P2 I would suggest abstracting the pin manipulation. fastspin already has dirxxx_(pin) and drvxxx_(pin) for both P1 and P2, so you could use these. For setting multiple groups of pins I'd write an object for sending data across the pins, which can use OUTA on P1 and OUTA/OUTB on P2 (and perhaps use the extended pin instructions on the next version of P2).
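
As a rough illustration of that kind of object, here is a minimal P1-only sketch. The file name lcdbus_p1.spin and the method names are made up for this example, and it assumes the 16 data lines sit on P0..P15 as in the code posted above:

```spin
' lcdbus_p1.spin (hypothetical name): P1 version of a 16-bit bus writer
PUB start
  dira[15..0]~~                 ' make P0..P15 outputs

PUB write16(value)
  outa[15..0] := value          ' replace only the low 16 pins, leave the rest of OUTA alone
```

A P2 flavour of the same object would keep the start/write16 interface and change only the bodies, so display code that calls something like bus.write16(x) would not need to change.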

In general I would try to factor out common code between your hardware into one object, and have the hardware specific parts in other objects (something like "common.spin", "driver1.spin", and "driver2.spin"). That's how I did my VGA driver: there's a common file that has all the text output routines, separate modules for setting up different modes (1024x768, 800x600, and so on) and then another common low level VGA driver that actually bangs the bits on the hardware.

#ifdef is a really powerful helper. I regularly do things like starting files with:

#define DEBUG
#define FEATURE1
#define FEATURE2

and then within the code having bits like:

PUB myfunc(arg)
#ifdef DEBUG
   ser.printf("reached myfunc, arg=%d\n", arg)
#endif

I do this to assist with debugging or to easily turn features on and off during development. For example, when I'm done debugging I just comment out the "#define DEBUG" line, and the debug code is no longer compiled.

For big things like switching hardware or processors, I would suggest that ifdef be used in large chunks, ideally at the highest level to control including whole objects rather than inside of functions. So don't write:

PUB do_output
#ifdef __P2__  
   p2 output code
#else
   p1 output code
#endif

PUB do_input
#ifdef __P2__
  p2 input code
#else
  p1 input code
#endif

That leads to a rat's nest of #ifdefs and is hard to read (in my opinion). Instead I would do something like:

OBJ
#ifdef __P2__
   p: "p2_routines.spin"
#else
   p: "p1_routines.spin"
#endif

and then use "p.do_output", "p.do_input", etc.

In general I would definitely urge abstracting things and writing lots of little functions over writing large complicated functions. The smaller the pieces you break things up into, the easier the code will be to read and the less likely it is that bugs will creep in. Don't worry about creating even tiny functions -- in fact fastspin will happily inline any small functions (ones that are just a few lines), and at optimization level -O2 it will inline any function that's only used once, so it's OK to write your code in a very modular fashion.

The other tip I would have is to try to make things data driven where you're doing the same thing over and over but with different parameters. So for example your last function consists basically of a long list of things like:

PRI Start_SSD1289                                       ' based on C driver   
    Displaycmd($0000,$0001)  'Turn Oscillator on                                POR-$0000
    Displaycmd($0003,$A8A4)  'Power control (1)                                 POR-$6664
    Displaycmd($000C,$0000)  'Power control (2)                                 POR- ?
    Displaycmd($000D,$080C)  'Power control (3)                                 POR-$0009
    Displaycmd($000E,$2B00)  'Power control (4)                                 POR-$3200
    Displaycmd($001E,$00B0)  'Power control (5)                                 POR-$0029
    Displaycmd($0001,$2B3F)  'Driver output control,    *landscape?? $6B3F*     POR-$433F
    Displaycmd($0002,$0600)  'LCD drive AC control                              POR-$0400
    Displaycmd($0010,$0000)  'Sleep Mode                sleep mode off          POR-$0001
    Displaycmd($0011,$6070)  'Entry Mode,               *landscape? $4030*      POR-$6830
    ...

I'd change this into a display list instead:

DAT
ssd1289_startlist
    word $0000, $0001 ' Turn Oscillator on
    word $0003, $A8A4 ' Power control (1)
    word $000C, $0000 ' Power control (2)
    ...
    word $FFFF, $FFFF ' end of list; some values that should never appear

PUB Displaylist(ptr) | reg, val
   repeat
     reg := word[ptr]
     val := word[ptr+2]
     if reg == $FFFF and val == $FFFF
        return
     Displaycmd(reg, val)
     ptr += 4   ' advance to the next register/value pair (two words)

PRI Start_SSD1289
    Displaylist(@ssd1289_startlist)
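
If you want to check the table-walking logic off-chip first, here is a rough Python model of the same idea. The names (run_display_list, display_cmd, END) are made up for illustration, and the table holds only the three entries shown above. It is a sketch of the technique, not the driver.

```python
# Python model of a sentinel-terminated register/value list.
END = 0xFFFF  # sentinel; assumed never to appear as a real register/value pair

SSD1289_STARTLIST = [
    0x0000, 0x0001,  # Turn Oscillator on
    0x0003, 0xA8A4,  # Power control (1)
    0x000C, 0x0000,  # Power control (2)
    END, END,        # end of list
]

def run_display_list(words, display_cmd):
    """Send (reg, val) pairs until the sentinel pair; return how many were sent."""
    i = 0
    count = 0
    while True:
        reg, val = words[i], words[i + 1]
        if reg == END and val == END:
            return count
        display_cmd(reg, val)
        i += 2          # step past the pair that was just sent
        count += 1

# Record the calls instead of talking to hardware.
sent = []
n = run_display_list(SSD1289_STARTLIST, lambda r, v: sent.append((r, v)))
```

The point to notice is that the index has to advance by one full pair on every pass, otherwise the loop keeps sending the first entry forever.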

cheezus · 2019-08-09 01:22

Thanks @ersmith for those tips. I've wanted to rewrite the display inits for a while but I always end up stuck. I had not considered using a list value that would never appear as a terminator, I'll have to see if I can make that work somehow. I had the idea of using a list but could never figure out how to make it work with the SSD1963's multiple data per command.
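
One possible way to handle several data bytes per command is to store a count right after each command byte. The Python model below is only a sketch of that layout: the record format (command, count, then that many data bytes), the function names, and the command values are all placeholders invented for illustration, not real SSD1963 commands. Here the end of the table is found from its length rather than from a terminator value.

```python
# Python model of a command list with a variable number of data bytes.
# Assumed record layout: cmd, count, then `count` data bytes.
INIT_LIST = bytes([
    0x10, 0,                      # placeholder command, no data
    0x20, 3, 0x01, 0x02, 0x03,    # placeholder command, three data bytes
    0x30, 1, 0xAA,                # placeholder command, one data byte
])

def run_command_list(table, write_cmd, write_data):
    """Walk the records in `table`; return the number of commands sent."""
    i = 0
    records = 0
    while i < len(table):
        cmd, count = table[i], table[i + 1]
        write_cmd(cmd)
        for b in table[i + 2 : i + 2 + count]:
            write_data(b)
        i += 2 + count            # skip cmd, count, and the data bytes
        records += 1
    return records

# Record the calls instead of talking to hardware.
log = []
records = run_command_list(INIT_LIST,
                           lambda c: log.append(("cmd", c)),
                           lambda d: log.append(("data", d)))
```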

I've made a lot of progress and am hopefully close to having something to show. I did a FB live video of the 'cardboard box notUnIpod' messing with the fonts. Now that I have a working FSRW I've been testing, I'm trying to get Ymodem working and getting close. Right now I'm banging my head against a wall because numbers.spin doesn't seem to work right. I took a quick look and can't see what would break toStr. I tried hijacking the number functions from std_text_routines.spinh but it's ugly...

Could use some sage advice since I'm a bit stuck and have several drivers that are getting close but still need to be debugged!

    if packet==0
      'Send filename and length
      i:=strsize(@fbuf)
      bytemove(@pdata,@fbuf,i+1)      
      p:=num.tostr(size,num#DEC)     '' is fat.fsize
      j:=strsize(p)
      bytemove(@pdata+i+1,p,j+1)
      repeat k from i+1+j+1 to 128
        pdata[k]:=0

...
    ser.decuns(size, 14)  '' works for printing
      ser.str(num.tostr(size,num#DSDEC14)) ' I do this kind of thing a lot :(
..

I had to fiddle with FSRW to make it return a file's size while walking through the directory and once I got that working the only thing missing (other than CHAIN) is damn ToStr. I think numbers.spin would be an ESSENTIAL have, as well as some string stuffs. I admit to being super lazy in this regard, but that's the point of libraries right??

ersmith · 2019-08-09 01:47

I wasn't aware that numbers.spin didn't work. I don't have a lot of time to debug right now but I'll try to take a look at it soon. All plain spin code is supposed to work with fastspin (modulo size and speed issues), so if any doesn't it's a bug. It'd be great if you could narrow down what part of tostr() is failing. My guess is that there may be some bug in fastspin's operator precedence, but that's just a wild guess.

cheezus · 2019-08-09 02:32

ersmith wrote: »

I wasn't aware that numbers.spin didn't work. I don't have a lot of time to debug right now but I'll try to take a look at it soon. All plain spin code is supposed to work with fastspin (modulo size and speed issues), so if any doesn't it's a bug. It'd be great if you could narrow down what part of tostr() is failing. My guess is that there may be some bug in fastspin's operator precedence, but that's just a wild guess.

I will try to narrow some specifics down. It seems that fromStr was working, although I didn't realize I was using it at the time. I have noticed some bugs with operator precedence recently and will keep that in mind. I'm very liberal with my bracketing, mostly for human readability, but libraries... I haven't played with optimization settings either, so that could help narrow it down. I also noticed a bug with return values and will try to get the details of what's working and what's not.

I've got a ton of code to debug myself and half the time when I find what could be a bug, it's from debugging my buggy code. Separating the two can be difficult in the moment and I rarely get a chance to go back and reproduce the bug, sorry.

One thing I'm noticing is std_text_routines.spinh has some interesting code that works as an include (pretty sure) but not directly.

  repeat
    if (val < 0)
      ' synthesize unsigned division from signed
      ' basically shift val right by 2 to make it positive
      ' then adjust the result afterwards by the bit we
      ' shifted out
      r1 := val&1  ' capture low bit
      q1 := val>>1 ' divide val by 2
      digit := r1 + 2*(q1 // base)
      val := 2*(q1 / base)
      if (digit => base)
        val++
    digit -= base
 '   else   '' compiler whines about indention because of line directly above
    if (val >= 0) '' but this seems to work okay
      digit := val // base
      val := val / base

    if (digit => 0 and digit =< 9)
       digit += "0"
    else
       digit := (digit - 10) + "A"
    buf[i++] := digit
    --digitsNeeded
  while (val <> 0 or digitsNeeded > 0) and (i < 32)
  if (signflag > 1)
    tx(signflag)
    
  '' now print the digits in reverse order
  repeat while (i > 0)
    tx(buf[--i])

I haven't tested this yet, but here's what I was thinking of as a workaround for now:

Pri NumToStr(val, base, digitsNeeded) | i, digit, r1, q1

  '' make sure we will not overflow our buffer
  if (digitsNeeded > 32)
    digitsNeeded := 32

  '' accumulate the digits
  i := 0
  buf[i++] := 0     'z term string chz
  repeat
    if (val < 0)
      ' synthesize unsigned division from signed
      ' basically shift val right by 2 to make it positive
      ' then adjust the result afterwards by the bit we
      ' shifted out
      r1 := val&1  ' capture low bit
      q1 := val>>1 ' divide val by 2
      digit := r1 + 2*(q1 // base)
      val := 2*(q1 / base)
      if (digit => base)
        val++
    digit -= base
    'else        ' no like
    if (val >= 0)
      digit := val // base
      val := val / base
    if (digit => 0 and digit =< 9)
       digit += "0"
    else
       digit := (digit - 10) + "A"
    
    buf[i++] := digit
    --digitsNeeded
  while (val <> 0 or digitsNeeded > 0) and (i < 32)     
  return @buf  + i ' return buffer address + ptr to most sig char
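
For reference, the divide trick in the comments above (halve the value with a logical shift, divide, then repair the result with the bit that was shifted out) can be modelled off-chip. This is a Python sketch under my own assumptions: the function names are invented, values are masked to 32 bits by hand, and the digit table only covers bases up to 16. In this model the "digit -= base" correction sits inside the carry branch, together with the quotient increment.

```python
MASK32 = 0xFFFFFFFF

def udivmod_via_signed(val, base):
    """Unsigned 32-bit divide/remainder using only steps that are safe
    when the underlying divide is signed."""
    val &= MASK32
    if val & 0x80000000:            # would read as negative in a signed long
        r1 = val & 1                # capture low bit
        q1 = val >> 1               # logical shift: now fits in 31 bits
        digit = r1 + 2 * (q1 % base)
        quot = 2 * (q1 // base)
        if digit >= base:           # remainder overflowed: carry into quotient
            quot += 1
            digit -= base
        return quot, digit
    return val // base, val % base

def num_to_str(val, base=10, digits_needed=1):
    """Build the digit string; digits come out least significant first."""
    chars = []
    while True:
        val, digit = udivmod_via_signed(val, base)
        chars.append("0123456789ABCDEF"[digit])
        digits_needed -= 1
        if not ((val != 0 or digits_needed > 0) and len(chars) < 32):
            break
    return "".join(reversed(chars))  # reverse so the most significant is first
```

Note that the digits are generated in reverse, so a string built this way has to be reversed (or filled from the end of the buffer) before it is handed to anything that prints forward.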

I've almost got ymodem working and that will help with testing significantly. I was sending small files okay but I think the serial port on my laptop is causing a problem with the longer transfers. 2 steps forward, 2 lightyears to go.

cheezus · 2019-08-09 06:07

I decided to try Kye's String Engine and it almost solved my tostring problem. Right now I'm really starting to wonder where I'm going wrong. I think the code below SHOULD figure the number of digits when places = 0 but I can't seem to make it work right!?! I guess I'll start trying to make sense of this, but it seems like repeat is exiting the loop because I can never seem to get more than one digit!!

PUB NumToStr(value, places) | p,  n
    if places > 0
        p := places
    else
        p :=1
        n := value   
        repeat while n > 0 
            n := n /  10    
            if n > 0           
                p += 1                            
    return  str.integerToDecimal(value, p) +1

I've tried a bunch of different ways of doing this but it's as if p never increments? I don't get it!

*edit

ughhhh, it's a sign problem... I think I have a fix..

*edit -

Still not sure why numbers.spin isn't working but it's been broken for quite some time now iirc. I actually wanted to use Kye's string engine in Ymodem for a while because I think it would simplify handling user input. I'm making progress again though and can deal with the signed / unsigned issue since the only place it showed up is in the final total of the directory. I might need to change this to display total kb on disk instead of bytes but shouldn't break too much, other than the transfers of very large files that are probably impractical to transfer this way anyway..
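
A small Python model of what a sign problem like this can look like, assuming the trouble is that totals with bit 31 set read as negative in a signed 32-bit long. The function names are invented for the sketch. The first counter mirrors the "repeat while n > 0" loop above; the second treats the value as unsigned.

```python
def as_signed32(x):
    """Reinterpret the low 32 bits of x as a signed long."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x

def count_digits_signed(value):
    """Mirrors the loop in the post: n is a signed 32-bit long."""
    p = 1
    n = as_signed32(value)
    while n > 0:
        n //= 10
        if n > 0:
            p += 1
    return p

def count_digits_unsigned(value):
    """Same count, but with the value treated as unsigned."""
    p = 1
    n = value & 0xFFFFFFFF
    while n >= 10:
        n //= 10
        p += 1
    return p
```

With a total of 3,000,000,000 bytes the signed version never enters the loop and reports one digit, while the unsigned version reports ten.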

ersmith · 2019-08-09 12:55

I think Numbers.spin is relying on the order of evaluation of side-effects like ++; that is, expressions like:

    byte[@BCX0][Digits++ >> 1] += ||(Num // Base) << (4 * Digits&1)

seem to come out differently in standard Spin and fastspin (I think fastspin is evaluating the Digits++ on the left hand side before the Digits on the right hand side, but regular Spin does it the other way around). I'm going to have to think carefully about that one, but for now I suggest avoiding expressions that are that tricky (personally I find it confusing to read anyway!). Thanks for finding this!
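
To make the difference concrete, here is a Python model of that one statement with the two evaluation orders written out as separate steps. The surrounding loop and the names are my own invention for the sketch (this is not the actual loop in Numbers.spin), and it only handles non-negative numbers.

```python
def store_digit(bcx, digits, num, base, lhs_first):
    """Model of: byte[@BCX0][Digits++ >> 1] += ||(Num // Base) << (4 * Digits&1)
    Returns the updated Digits."""
    digit = abs(num % base)            # ||(Num // Base), for non-negative Num
    if lhs_first:
        index = digits >> 1            # index uses the old Digits ...
        digits += 1                    # ... the post-increment happens ...
        shift = 4 * (digits & 1)       # ... and the right side sees the NEW Digits
    else:
        shift = 4 * (digits & 1)       # right side sees the OLD Digits
        index = digits >> 1
        digits += 1
    bcx[index] = (bcx[index] + (digit << shift)) & 0xFF
    return digits

def pack(num, base, lhs_first):
    """Pack the digits of num two per byte, lowest digit first."""
    bcx = bytearray(4)
    digits = 0
    while True:
        digits = store_digit(bcx, digits, num, base, lhs_first)
        num //= base
        if num == 0:
            return bytes(bcx)
```

For 1234 in base 10 the right-side-first order packs the bytes as $34, $12, while the left-side-first order swaps the nibbles and gives $43, $21. Splitting the statement into separate steps like this makes the result the same under either compiler.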

std_text_routines.spinh isn't intended to work stand-alone, but rather converts any object with a "tx" method into one that also has "dec", "hex", and so on. A simple kind of decimal to string capability can be built with this by creating an object where "tx" just stores a character into a string:

con
  BUFSIZ = 32
  
var
   byte sbuf[BUFSIZ + 1] ' buffer for output
   long i ' output index

' tx copies the character to the buffer
pub tx(c)
  if i < BUFSIZ
    sbuf[i++] := c

' fetch zero terminates the buffer, resets the index, and returns a pointer
' to the buffer
pub fetch
  sbuf[i] := 0
  i := 0
  return @sbuf

#include "spin/std_text_routines.spinh"

' convert integer to decimal string
pub todec(n)
  dec(n)
  return fetch

' convert integer to hex string with w digits (default 8)
pub tohex(n, w=8)
  hex(n, w)
  return fetch

cheezus · 2019-08-10 22:27

ersmith wrote: »
I think Numbers.spin is relying on the order of evaluation of side-effects like ++; that is, expressions like:
    byte[@BCX0][Digits++ >> 1] += ||(Num // Base) << (4 * Digits&1)
seem to come out differently in standard Spin and fastspin (I think fastspin is evaluating the Digits++ on the left hand side before the Digits on the right hand side, but regular Spin does it the other way around). I'm going to have to think carefully about that one, but for now I suggest avoiding expressions that are that tricky (personally I find it confusing to read anyway!). Thanks for finding this!

This is one that caught me up several times. One of the things I hate about "black box" object usage is really hard to understand code like this. When things break it becomes nearly impossible to debug. Kye's string engine works great, although it only handles signed decimal. I'm sure there's a few other formats that may get tricky, I'll have to see when I get there. Right now it looks a little ugly but it's working.

PUB NumToStr(value, places) | p,  n
    if places == 0
        p := 3
        n := value   
        repeat while (n /= 10) > 10   
            if n > 10 
                p ++                                                                
    else
        p := places 
        
    return  str.integerToDecimal(value, p-1) +1

Now I'm to the place of testing the memory chips and was hoping to use the old XMM cache test to verify but now I have to wrap my head around not having movd / movs.

        movs    :ld, line
        movd    :st, line

I'm happy that things are progressing and everything seems to work good so far.

SRW_CHZrc3 works really well and the smartpin SPI code seems very stable. There's a lot of possible optimizations but I think the next thing to work on is LUT sharing. I've got a build of Ymodem that can only send to the SD card; receive is broken but I think it has something to do with the smartpin serial?? Haven't really debugged this yet but the one nice thing I did notice is the limiting factor now is the serial, with transfer speed constant over any sysclock setting (as long as it's able to keep up with the baud). I've also been running 460800 baud, and had a few tests complete at 921600 baud, although I think self heating and sysclock >240 MHz is a problem right now.

I'm also able to turn my tft lcd on and display some text. Started working on cog code for that, although SRAM is the next thing I need to verify. I've got some SPI code that should read the touch adc but have not even tried testing yet. Things are progressing though and it's amazing how fast this chip is!

ersmith · 2019-08-13 12:48

Just a note: I think I've figured out why Numbers.spin doesn't work right in fastspin, and I have checked workarounds into github (so it should work in the next fastspin release). There were two issues:

(1) As noted above, the order of evaluation of things like x[Digit++] |= Digit was different between openspin and fastspin; fastspin followed GCC's convention (the Digit++ on the LHS got evaluated before the Digit on the RHS) but openspin did it the other way around.

(2) Numbers.spin relies on "x/0" to equal 0, whereas fastspin's divide routine returned -1 for this.

I think relying on the value of division by 0 is particularly ugly, but I'll change fastspin to match the original Spin since perhaps other code relies on this too.

potatohead · 2019-08-13 17:57

-1 is TRUE in Spin and 0 is FALSE (any non-zero value also tests as true in a conditional). That was done to make booleans in conditional expressions easier. Just FYI.

ersmith · 2019-08-13 18:32

potatohead wrote: »

-1 is TRUE in Spin and 0 is FALSE (any non-zero value also tests as true in a conditional). That was done to make booleans in conditional expressions easier. Just FYI.

Thanks. fastspin has always done this for Spin code, and so that wasn't an issue. I suppose one could argue that making "x/0" be 0 ties in to it being "false", but that's tenuous, since valid division like "1/2" also produces 0. Anyway, since division by 0 is undefined there's no particular harm in making the result 0 like Spin does, although many processors do return $FFFFFFFF (the maximum unsigned 32 bit integer, which is -1 when read as signed) for that case, as it's the natural output of the division algorithm when used with a divisor of 0.
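
To see why all ones falls out naturally, here is a textbook shift-and-subtract (restoring) divide in Python with no divide-by-zero check. This is a generic sketch, not fastspin's actual routine and not the P2 hardware divider; the function name is made up.

```python
def restoring_divide(n, d, bits=32):
    """Plain shift-and-subtract unsigned division, no divide-by-zero check."""
    q = 0
    rem = 0
    for i in range(bits - 1, -1, -1):
        rem = (rem << 1) | ((n >> i) & 1)   # bring down the next dividend bit
        if rem >= d:                        # always true when d == 0
            rem -= d
            q |= 1 << i                     # so every quotient bit gets set
    return q, rem
```

With d = 0 the comparison succeeds on every step, so all 32 quotient bits are set and the remainder ends up equal to the dividend.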

cheezus · 2019-12-06 03:13

I think I've either found a bug in fastspin or maybe I'm just doing something wrong. Lookupz seems to be misbehaving as some working P1 spin doesn't seem to work on the P2. Here's the offending usage.

  value <<= (8 - digits) << 2
  repeat digits
    curx := TextChar(font,lookupz((value <-= 4) & $F : "0".."9", "A".."F"),Curx,Cury)

compiles to

_texthex
	mov	COUNT_, #17
	calla	#pushregs_
	mov	local01, arg01
	mov	local02, arg02
	mov	local03, arg03
	mov	local04, local02
	mov	local05, #8
	sub	local05, local03
	shl	local05, #2
	shl	local04, local05
	mov	local02, local04
	mov	local06, local03
LR__0045
	cmp	local06, #0 wz
 if_e	jmp	#LR__0046
	mov	local04, local01
	mov	local07, local02
	rol	local07, #4
	mov	local02, local07
	mov	local05, local02
	and	local05, #15
	mov	local08, local05
	mov	local09, #0
	add	ptr__dat__, ##5324
	mov	local10, ptr__dat__
	sub	ptr__dat__, ##5324
	mov	local11, local10
	mov	local12, #16
	mov	arg01, local08
	mov	arg02, local09
	mov	arg03, local11
	mov	arg04, local12
	calla	#__system___lookup
	mov	local13, result1
	mov	local14, local13
	add	objptr, ##3324
	rdword	local15, objptr
	sub	objptr, ##3324
	add	objptr, ##3326
	rdword	local16, objptr
	sub	objptr, ##3326
	mov	arg01, local04
	mov	arg02, local14
	mov	arg03, local15
	mov	arg04, local16
	calla	#_textchar
	mov	local17, result1
	add	objptr, ##3324
	wrword	local17, objptr
	sub	objptr, ##3324
	mov	local04, local06
	sub	local04, #1
	mov	local06, local04
	jmp	#LR__0045
LR__0046
	mov	ptra, fp
	calla	#popregs_
_texthex_ret
	reta

That's broken my hex to ascii method currently.
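
For comparing against whatever comes out on the P2, here is a Python model of what I expect the Spin loop to produce. It is a sketch with invented names: rol32 stands in for the rotate-left operator and a 16 character table stands in for the lookupz list.

```python
def rol32(x, n):
    """Rotate a 32-bit value left by n bits (0 < n < 32)."""
    x &= 0xFFFFFFFF
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def hex_digits(value, digits):
    """Model of the loop: pre-shift, then rotate left 4 and look up each nibble."""
    table = "0123456789ABCDEF"               # lookupz(... : "0".."9", "A".."F")
    value = (value << ((8 - digits) << 2)) & 0xFFFFFFFF
    out = []
    for _ in range(digits):
        value = rol32(value, 4)              # value <-= 4
        out.append(table[value & 0xF])       # zero-based lookup
    return "".join(out)
```

If the P2 build prints something other than "BEEF" for hex_digits($BEEF, 4), then either the rotate or the lookup is going wrong.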

I'm also having problems with something like this.

CON
    SPI_A0  =  16  
…
    DIR := $0003_0000 
    OUT :=  ((spi_select & %11) << SPI_A0)

Again, I haven't had a chance to track this down and have OUT hard-coded to 0 for now. I tried a few different things and am not quite sure what's going on.
Edit: solved with a small wait; it seems to be a race condition with the hardware? I need to track that down later when there's no ribbon cable.


I've got a LOT of bugs in my own code to find still, I'm sure. And I still have a long way to go, but my project is starting to show signs of life. Hopefully @ersmith has some answers.
