C Expressive Enough for Idiomatic PASM? - Page 4 — Parallax Forums

Comments

  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2011-08-21 13:34
    Kevin Wood wrote:
    Wouldn't the "same" code in say 3 cogs just r/w to a single array?
    Consider the case of serial I/O drivers, each with its own buffer, or multiple FIR filters.

    -Phil
  • Dr_Acula Posts: 5,484
    edited 2011-09-06 17:51
    I have been thinking some more about this topic. Rather than try to compile all of C (or any other language) into pasm I am wondering if one could consider a more simplified version of a higher level language.

    One big problem with any assembly is 'spaghetti' code that jumps all over the place, so one goal would be to hide all jumps (GOTO is evil, right?). That means that jumps are hidden inside higher level structures like If, Switch, Repeat.

    I've been thinking of a 'bottom up' approach. Think of a commonly used pasm structure, and then replicate it in a higher level language.
    eg Conditional branching.

    cmp variable1, variable2 wz
    if_z jmp #mylabel

    That translates into four types of If statement - equals, not equals, greater than and less than. For the sake of simplicity, we can leave out "greater than or equal" and "less than or equal".

    With those four structures we can translate that to higher level languages. Because the pasm is the same, one can think about writing a pre-compiler to do three languages at once - C, Spin and Basic

    So in Basic
    If variable1 = variable2 then
       some code
    EndIf
    

    in C
    if (variable1 == variable2)
      {
       some code;
      }
    

    and in Spin
    if (variable1 == variable2)
       some code
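
A rough sketch of how a precompiler might emit the pasm for those four If types. Everything here is an assumption for illustration: the names cmp_op, skip_condition and emit_if are made up, and the operands are treated as unsigned. It relies on cmp setting Z when the two values are equal and C when the first is below the second, and it emits the jump that skips the body, so the condition is the inverse of the test.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The four comparisons from the post (unsigned operands). */
typedef enum { CMP_EQ, CMP_NE, CMP_GT, CMP_LT } cmp_op;

/* Condition prefix for the jump that SKIPS the body (inverse of the test).
   After "cmp a, b wz, wc": Z is set when a == b, C is set when a < b. */
static const char *skip_condition(cmp_op op)
{
    switch (op) {
    case CMP_EQ: return "if_ne";   /* skip when a != b */
    case CMP_NE: return "if_e";    /* skip when a == b */
    case CMP_GT: return "if_be";   /* skip when a <= b */
    case CMP_LT: return "if_ae";   /* skip when a >= b */
    }
    return "";
}

/* Write the two-instruction header of an If block into out. */
static void emit_if(char *out, size_t size, const char *a, const char *b,
                    cmp_op op, const char *end_label)
{
    assert(out != NULL && size > 0);
    (void)snprintf(out, size, "cmp %s, %s wz, wc\n%s jmp #%s\n",
                   a, b, skip_condition(op), end_label);
}
```

Calling emit_if(buf, sizeof buf, "variable1", "variable2", CMP_EQ, "endif_1") fills buf with the cmp line followed by "if_ne jmp #endif_1". The body of the If would follow, then the endif_1 label.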
    

    We could call this new language Spin_C_Basic or something? Write in whatever syntax you prefer.

    I'm thinking of this hybrid language because not everything can be done in every language.

    Basic does not support >> or << for instance.
    C does not support binary number literals
    and Spin may not be for every programmer

    For/Do/Repeat/While loops translate fairly easily into pasm without needing an optimiser, because each one maps directly onto the most optimised hand-written pattern.

    One structure worth considering is the DJNZ loop.
    Normally with a loop you would set a variable, then run some code, then subtract or add 1, then test if it has reached the end and jump to the beginning if not.
    That takes three pasm instructions but if you use DJNZ you can get that down to two instructions.
    The equivalent in a higher level language is the 'repeat' in Spin, where you do not really care what the variable is within the code - you just want to do something n times. I am not sure if such a structure exists within C or Basic. In C the closest form is for(i=n;i>0;i--), but nothing in the syntax makes it explicit that this form will be converted to a DJNZ loop while a for loop with i++ will not. So this might be an argument for using Spin's 'repeat' and calling this a hybrid language rather than a pure subset of a language.
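
For what it is worth, the C shape that lines up most closely with DJNZ is a do/while with a pre-decrement test, since DJNZ runs the body first and then decrements and jumps while the counter is non-zero. A small sketch (the function names are made up for illustration, and n is assumed to be at least 1):

```c
#include <assert.h>

/* Count passes through a DJNZ-shaped loop: body first, then decrement and
   repeat while the counter is still non-zero. Requires n >= 1. */
static unsigned long djnz_passes(unsigned long n)
{
    unsigned long passes = 0;

    assert(n > 0);
    do {
        passes++;              /* the loop body would go here */
    } while (--n != 0);        /* in pasm: djnz n, #loop */
    return passes;
}

/* The same count with an up-counting loop, which needs a separate
   add, compare and conditional jump in pasm. */
static unsigned long upcount_passes(unsigned long n)
{
    unsigned long passes = 0;
    unsigned long i;

    for (i = 0; i < n; i++)
        passes++;
    return passes;
}
```

Both functions return n for any n of 1 or more, so a precompiler could treat a loop whose counter is never read inside the body as a candidate for the DJNZ form.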

    That could all be debatable of course!

    What I am thinking is that if you can create a structure with no goto statements and which runs just as fast as the best pasm code, it makes the code much more readable. The odd pasm command in amongst that structure is now not so bad - eg instructions that really have no high level language equivalent like rdlong and wrlong. So just put them in.

    There was an interesting example on the first post about maths
    a = b+c

    Now in pasm that is really two instructions. You want b and c to be unchanged at the end, and the variable a is the one changing, so this leads to
    a = b
    a = a+c

    which translates to two pasm instructions (mov a,b then add a,c).
    More complex things like a = b+c+d on one line might be best avoided though as it may lead to different pasm code. Maybe there is a 'perfect' optimisation but one gets into shuffling railcar algorithms.
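
As a sketch of that one-operation-per-line rule, here is how a precompiler might lower "dst = lhs + rhs" into two-operand pasm. The function name emit_add is hypothetical. It also covers the two cases where the destination is one of the operands, where a blind mov would destroy a value that is still needed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Lower "dst = lhs + rhs" to two-operand form. Only dst is modified.
   Returns the number of characters snprintf wanted to write. */
static int emit_add(char *out, size_t size,
                    const char *dst, const char *lhs, const char *rhs)
{
    assert(out != NULL && size > 0);
    if (strcmp(dst, lhs) == 0)      /* a = a + c  ->  add a, c */
        return snprintf(out, size, "add %s, %s\n", dst, rhs);
    if (strcmp(dst, rhs) == 0)      /* a = b + a  ->  add a, b (addition commutes) */
        return snprintf(out, size, "add %s, %s\n", dst, lhs);
    /* general case: copy the first operand, then add the second */
    return snprintf(out, size, "mov %s, %s\nadd %s, %s\n", dst, lhs, dst, rhs);
}
```

emit_add(buf, sizeof buf, "a", "b", "c") produces "mov a, b" followed by "add a, c".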

    Assembly is still 'close to the metal', so I'm thinking of keeping math to just one instruction per line.

    Another thing I was thinking about was the way pasm puts all the variable declarations at the end of the program, whereas higher level languages declare variables at the beginning. A precompiler could simply move them all to the end, but for Spin/C/Basic programmers it will be one less thing that is new and unfamiliar about pasm.

    Then there is the issue of variables and constants. This one gets me every time as I keep needing to remind myself that the source field of a pasm instruction is only 9 bits wide, so 511 is the largest literal value you can write with #.

    mov variable,#511 is allowed
    mov variable,#512 is not

    and if you leave out that # the program won't work either.

    These things can be picked up by a precompiler, and if you try to put in a constant more than 511, it could suggest that you create a new long to store the variable/constant, or recycle another long that already holds this value, or, if the number is a small value (511 or less) times a power of two, use the trick of setting the small number and then bitshifting it. You might also declare these as 'constants' rather than as variables, and then if you accidentally wrote some code that changed the value, the precompiler could warn that this is a constant, not a variable.

    In summary, I think pasm can be made a lot more legible by adding a structure that is more like a higher level language, removing all jumps, removing the need to know the difference between :mylabel and label, removing the need for adding # before labels and some values, and adding in some common error checking.

    I'm going to have a go at writing a precompiler and see how it works out.
  • jazzed Posts: 11,803
    edited 2011-09-06 18:28
    C works well enough for most people when it is available.
    GCC does a very good job generating code that runs in a COG today.

    PASM can be used when C is not good enough.
    Many prefer PASM, so any tool should support that.

    If your new tool is good enough I'm sure people will use it.
    Maybe it is worth a new topic in the Propeller forum?
  • Heater. Posts: 21,230
    edited 2011-09-06 22:42
    There are, and will continue to be, a multitude of programming languages. All designed to address some issue or other. Or just because people seem to have a fascination with creating such things.
    The fundamental idea with a language like C is to be able to express an algorithm in such a way that it can easily be moved from one processor architecture to another. In that light, creating a language that is heavily dependent on a particular processor's instruction set serves no purpose. Might as well use the available assemblers.
  • Dr_Acula Posts: 5,484
    edited 2011-09-07 07:07
    The aim is to write some C code that:
    1) Passes through a compilation with a standard C compiler with no errors (even though it won't produce anything that works) AND
    2) can be passed through a precompiler that can convert it to pasm that works

    The test of 2 is to take some display code and see it run on a display.

    I'd like to edit my post above.

    First, the C language CAN distinguish between constants and variables. Just use 'const' before each declaration. This will be brilliant because it will enable error generation if you ever try to change a value that has been declared a constant. So that is one thing that the current pasm compilers can't do.

    Next, all variables are 'unsigned long'

    Also, all variables are global. But many subroutines only work on a few variables, so it seems to me neater to pass those variables to a function as you can then see which variables that function needed. The precompiler would ignore these passed variables as they are global.

    Next, C uses 'return' from functions, so it could be useful to spend one long for a generic variable 'returnvalue'

    There are a number of keywords in pasm that don't mean anything in C, like par and dira. We can declare these as fake variables but they don't use any longs.

    There are also a number of pasm commands that can be declared as fake functions, like rdlong().
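
One way the fake variables and fake functions could look on the PC side, purely as a sketch: the hub array, hub_read_long and hub_write_long are stand-ins invented here so the source compiles and can be stepped through with an ordinary C compiler. The precompiler would ignore them and emit a real rdlong instead. Hub longs are treated as little-endian, and this sketch insists on long-aligned addresses.

```c
#include <assert.h>

/* Cog special registers as ordinary globals: they cost no longs in pasm. */
unsigned long par, outa, dira, ina;

/* Simulated 32 KB hub RAM (host-side only). */
unsigned char hub[32 * 1024];

/* Read one little-endian long from the simulated hub. */
unsigned long hub_read_long(unsigned long addr)
{
    assert(addr % 4 == 0 && addr + 4 <= sizeof hub);
    return (unsigned long)hub[addr]
         | ((unsigned long)hub[addr + 1] << 8)
         | ((unsigned long)hub[addr + 2] << 16)
         | ((unsigned long)hub[addr + 3] << 24);
}

/* Write one little-endian long to the simulated hub (test helper). */
void hub_write_long(unsigned long addr, unsigned long value)
{
    assert(addr % 4 == 0 && addr + 4 <= sizeof hub);
    hub[addr]     = (unsigned char)(value & 0xFFu);
    hub[addr + 1] = (unsigned char)((value >> 8) & 0xFFu);
    hub[addr + 2] = (unsigned char)((value >> 16) & 0xFFu);
    hub[addr + 3] = (unsigned char)((value >> 24) & 0xFFu);
}

/* Fake function in pasm operand order: rdlong dest, addr */
#define rdlong(dest, addr) ((dest) = hub_read_long(addr))
```

Making rdlong a macro rather than a real function lets it assign to the destination variable, which a plain C function taking its arguments by value could not do.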

    Finally, with respect to binary numbers: they don't exist in C, but Leor Zolman of BDSC fame (good ol' CP/M) posted a cunning solution using macros at http://www.velocityreviews.com/forums/t318127-using-binary-numbers-in-c.html so we could use that.
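
For reference, one widely circulated macro trick for binary constants looks like the sketch below. This is not necessarily the exact code in the linked post. The macro names B8 and B32 and the example constants are assumptions for illustration. The digits are pasted into a hex literal, then one bit is picked out of each nibble.

```c
#include <assert.h>

/* Turn the digit string into a hex literal, e.g. 01010101 -> 0x01010101UL */
#define HEX__(n) 0x##n##UL

/* Each hex nibble holds one binary digit: collect them into one byte */
#define B8__(x) ( (((x) & 0x0000000FUL) ? 0x01 : 0) \
                | (((x) & 0x000000F0UL) ? 0x02 : 0) \
                | (((x) & 0x00000F00UL) ? 0x04 : 0) \
                | (((x) & 0x0000F000UL) ? 0x08 : 0) \
                | (((x) & 0x000F0000UL) ? 0x10 : 0) \
                | (((x) & 0x00F00000UL) ? 0x20 : 0) \
                | (((x) & 0x0F000000UL) ? 0x40 : 0) \
                | (((x) & 0xF0000000UL) ? 0x80 : 0) )

/* 8-bit binary constant, written as exactly eight 0/1 digits */
#define B8(d) ((unsigned char)B8__(HEX__(d)))

/* 32-bit binary constant, written as four groups of eight digits, MSB first */
#define B32(b3, b2, b1, b0) \
    ( ((unsigned long)B8(b3) << 24) | ((unsigned long)B8(b2) << 16) \
    | ((unsigned long)B8(b1) << 8)  |  (unsigned long)B8(b0) )

/* Example: two of the pin masks from the ram driver */
const unsigned long oewrrd_bits   = B32(00000000, 11110000, 00000000, 00000000);
const unsigned long latchlow_bits = B32(11111111, 11101111, 11111111, 11111111);
```

The commas take the place of the underscores in the Spin notation %00000000_11110000_00000000_00000000, so the constants stay readable and a precompiler could map one form to the other mechanically.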

    So, having set this up, time to convert some pasm to C. E&OE with my C programming - there are bound to be some semicolons left out and hopefully there are no syntax errors. I'll find them anyway when I run this through a real C compiler. In general, take some spaghetti code and write it such that there are no GOTO's and no jumps. Can it be done? yes, I think it can, and I think it is more easily readable with the indents.

    And by reverse engineering pasm into a higher level language, the code should end up the same. That is the aim, anyway!

    Original code (from my 256x224 line full color TV driver)
    ' Ram driver
    
    VAR
    
      long  cog
    
    PUB Start(ramptr) : okay
      stop
      okay := cog := cognew(@entry, ramptr) + 1  
    
    PUB stop
    
    '' Stop ram driver - frees a cog
    
      if cog
        cogstop(cog~ - 1)   
    
    DAT
    
    '*******************************
    '* Assembly language Ram driver *
    '*******************************
    
    ' check the line number. If it has changed, then start reading the next line in and put @ screen
    ' if it has not changed then repeat the loop
    ' if it is the last line then restart the ram counter variable
    
    ' currentline counts up from 0 to tiles*16
    ' the video driver recycles the same 8*256 bytes in the screen buffer
    ' first, wait for currentline to equal zero and when it does, start reading in ram to @screen
    ' check the currentline. Shift right by 2 and mask-and with 1. 0-3 is 0, 4-7 is 1, 8-11 is 0, 12-15 is 1 etc
    
    ' Ram format. Each line is up to 512 pixels long. 256 is the standard video driver but over 300 does work
    ' so each line is 9 bits
    ' Then there are 4 lines in a group. Read these all in from ram in one subroutine call
    ' So that is 11 bits.
    ' And then these are in two groups of 4. So that is 12 bits.
    ' So - once 8 lines have been read in (12 bits = 4096 bytes), then change the latch
    ' There are 256/8 = 32 groups of 8 lines so that is 5 bits of the latch that are used
    ' and the 512k ram chip can store 4 frames (if needed)
    ' P0-7 is the data
    ' P8-P19 is the address for 8 lines (12 bits)
    ' P20 controls the latch enable
    ' P21 is /rd, P22 is /wr and P23 is /oe (chip select)
    ' ** There are pullups on P20 to P23 as the ram chip needs to retain data when transferring from spin to cog **
    
    
    
    
            org
    
    entry
                            mov             t1,par
                            mov             status,t1
                            add             t1,#4
                            rdlong          _nextline,t1               ' line number in hub (updated by graphics cog) use with rdlong t1,_nextline
                            add             t1,#4
                            rdlong          screen_buf,t1              ' screen buffer location
                            add             t1,#4
                            rdlong          pixels_x,t1                 ' pixels per line
                            add             t1,#4
                            rdlong          serial_buffer,t1            ' serial buffer for data coming in
    
                            mov             ramline,screen_buf
                            add             ramline,topram              ' location of the temp 256 byte buffer that is rewritten each alternate line
                            mov             serial_buf_last,serial_buffer
                            add             serial_buf_last,#256        ' last entry in the serial buffer            
                            mov             outa,oewrrd                 ' oe/wr/rd/latch all high  %00000000_11110000_00000000_00000000         
                            mov             dira,dira_pins              ' enable these pins for output  %00000000_11111111_11111111_00000000
    
    ' in 'spaghetti' code with jumps all over the place - but in a way that maximises speed (this works better than djnz loops)
    ' reset everything
    ' wait for zero line
    ' label1
    ' read line of data
    ' add 2 to local line counter
    ' if locallinecounter>=224 then back to beginning
    ' add 1 to blockcounter
    ' if blockcounter=16 then reset blockcounter and increment latchcounter
    ' wait for current_line to equal localline (will be 2,4,6,8 etc)
    ' goto label1
    
    ' *** don't forget # after all jmp and call instructions!!! ****
    
    main_loop_1             mov             sram_address,#0                 ' start reading ram at address #0
                            mov             localcounter,#0                 ' my local line counter
                            mov             blockcounter,#0                 ' block is 256 bytes, 16 blocks per latch
                            mov             latchcounter,#0                 ' block is 4096 bytes reset block counter
                            call            #selectlatch                    ' latch to zero
    main_loop_2             mov             hubaddr,ramline                 ' setup for rdblock
                            mov             len,pixels_x                    ' setup for rdblock
    main_loop_3             rdlong          current_line,_nextline          ' read the line that is being displayed
                            cmp             current_line,localcounter wz    ' equal to localcounter?
                  if_nz     jmp             #main_loop_3                    ' loop until it is
                            call            #rdblock                        ' read in 256 bytes
                            add             sram_address,#256               ' next 256 bytes
                            add             localcounter,#2                 ' add 2 get ready for next line
                            cmp             localcounter,#224 wc            ' end of the screen = 224
    '              if_nc     jmp             #main_loop_1                    ' restart everything
                  if_nc     jmp             #process_serial   ' any new pixels?  
                            add             blockcounter,#1                 ' add 1 to block counter
                            cmp             blockcounter,#16  wz            ' time for new block?
                  if_nz     jmp             #main_loop_2                    ' skip if still on current block
                            mov             sram_address,#0                 ' reset the ram address
                            mov             blockcounter,#0                 ' reset blockcounter
                            add             latchcounter,#1                 ' add 1 to latchcounter
                            call            #selectlatch                    ' latch to latchcounter
                            jmp             #main_loop_2
    
    
    ' 10k pullups on /oe, /rd and /wr and latch and don't put leds on these three pins
    
    'Read a "len" byte block given by "sram_address" into hub at "hubaddr"  - Cluso99's ram driver
    ' preserves sram_address
    rdblock               
                            mov             t1,sram_address         ' store sram address
                            shl             t1,#8                   ' shift left so in the right position
                            or              t1,oerd                 ' %00000000_01010000_00000000_00000000 ' /oe and /rd low 
                            ' outa pre-filled with the address shifted over by 8 and the /oe and /rd low
                            mov             outa,t1                        ' send it out
                            'nop                             ' cluso's driver has a nop but it seems ok without one    
    rdloop                  mov             t2, ina               ' read byte from SRAM \ ignores upper bits
                            wrbyte          t2, hubaddr           ' copy byte to hub    /
                            add             hubaddr, #1             ' inc hub pointer
                            add             outa, #(1 << 8)         ' inc sram address
                            djnz            len, #rdloop            ' loop for xxx bytes
                            mov             outa,oewrrd             ' oe,wr and rd and latch all high
    rdblock_ret             ret
    
    
    ' pass latchcounter
    selectlatch             mov              dira,blockmask            ' enable different pins to reading, ie P0-7 are outputs now
                            mov              t1,latchcounter           ' get the latch counter
                            and              t1,#%01111111              'mask off high bit as 512k/4096/16 is 128
                            mov              outa,t1                    ' output blockcount byte
                            and              outa,latchlow              ' set latch low
                            'nop                                                       ' delay not needed
                            'nop
                            mov              dira,dira_pins             ' enable these pins for output  %00000000_11111111_11111111_00000000
                            mov              outa,oewrrd                ' set oe,wr,rd and latch high
    selectlatch_ret         ret
                          
    ' Write a "len" byte block given by t3 in external ram from hub at t4  - Cluso99's ram driver 
    wrblock
                            mov             dira,pin0to23            ' dira pins %00000000_11111111_11111111_11111111                   
    wrloop                  mov             t1,t3                    ' store sram address in t1
                            shl             t1,#8                    ' shift left so in the right position
                            rdbyte          t2,t4                    ' get the byte
                            or              t1,oewr                  ' or with %00000000_00110000_00000000_00000000
                            or              t1,t2                    ' or with the data byte read from hub
                            mov             outa,t1                  ' send it out
                            nop                                      ' wait a little
                            nop
                            or              outa,oewrrd              ' oe,wr and rd all high
                            add             t4,#1                    ' add 1 to hub address
                            add             t3,#1                    ' add 1 to sram address
                            djnz            len, #wrloop             ' loop for xxx bytes
                            mov              dira,dira_pins          ' enable these pins for output  %00000000_11111111_11111111_00000000 
                            mov             outa,oewrrd              ' oe,wr and rd and latch all high
    wrblock_ret             ret
    
    
    
    
    
    ' read serial data. The first byte is the row number, then 256 bytes
    ' if the last byte is zero then the buffer has already been read, if not zero then it still needs reading
    ' if the last byte is not zero, then process the data and then write a zero to the last byte
    ' if processing and the row is even then store to hub ram
    ' if the row is odd then store to external ram
    
    process_serial          rdbyte   t1,serial_buf_last                 ' value in the last byte of the buffer
                            cmp      t1,#0 wz
                      if_z  jmp      #main_loop_1                       ' do nothing if zero
                            mov      t2,serial_buffer                   ' t2 is a counter
                            mov      t3,screen_buf                      ' t3 is the screen buffer
                            rdbyte   t1,t2                              ' t1 is the row number
                            mov      t5,t1                              ' t5 also contains the row number            
                            add      t2,#1                              ' increment - 1st byte is row number, data starts at 1
                            mov      t4,#256                            ' count timer
                            and      t1,#1                              ' mask out all but LSB
                            cmp      t1,#1 wz                           ' test if odd or even
                      if_z jmp      #serial_external                    ' odd so write to external ram
                            'put in hub ram line 0 =0, 2=1, 4=2 (y/2)*256 is the same as y*128
                            shl      t5,#7                              ' multiply by 128
                            add      t3,t5                              ' get hub location to store
    serial_hub
                            rdbyte   t1,t2                              'get the first byte t1 = row number, multiply by 256 and add to t3
                            wrbyte   t1,t3                              ' store to hub ram
                            add      t2,#1                              ' increment source
                            add      t3,#1                              ' increment destination
                            djnz     t4,#serial_hub                    ' do 256 times            
                            jmp      #serial_finish                     ' set last byte to zero to say finished
    
    serial_external        ' move data to external ram t5 contains row number - see pixels routine for the maths
                            rdbyte   t1,serial_buffer                   ' t1 is the row number 0-31 = block 0, 32-63=block1            
                            shr      t1,#5                              ' divide by 32
                            mov      latchcounter,t1                    ' set up latchcounter
                            call     #selectlatch                       ' select the correct latch
                            rdbyte   t3,serial_buffer                   ' get the row number again
                            and      t3,#%11111                         ' mask off so max is 31 as sram resets each block
                            sub      t3,#1                              ' subtract 1
                            shl      t3,#7                              ' multiply by 128 ie 256*(y-1)/2
                            mov      len,#256                           ' 256 bytes to write
                            mov      t4,serial_buffer                   ' serial buffer location
                            add      t4,#1                              ' add 1 so points to correct place
                            call     #wrblock                           ' write len bytes to sram t3 from hub t4
                            jmp      #serial_finish                     ' set last byte to zero to say finished reading
    
    serial_finish           mov      t1,#0                              ' set value to zero to say this has been read
                            wrbyte   t1,serial_buf_last                 ' store to the serial buffer
                            jmp      #main_loop_1
    'row 1 =sram0,row 3=sram256, row 5=sram512, row33=sram0, block1
    
    finish     jmp #finish
    
    
    ' Initialised data
    'address_mask    long %00000000_00000000_00011111_11111111                     ' 13 bits max 32 lines otherwise need to latch
    oewrrd          long %00000000_11110000_00000000_00000000                     ' 23 = oe, 22=wr, 21=rd, latch high
    oerd            long %00000000_01010000_00000000_00000000                     ' oe and rd low
    oewr            long %00000000_00110000_00000000_00000000                     ' oe and wr low
    dira_pins       long %00000000_11111111_11111111_00000000                     ' direction pins to be enabled                         
    pin0to23        long %00000000_11111111_11111111_11111111                     ' for testing
    zero            long %00000000_00000000_00000000_00000000                     ' for testing
    'smallblock      long 1024
    fivetwelve      long %00000000_00000000_00000010_00000000                       ' 512 for sram increment
    blockmask       long %00000000_00010000_00000000_11111111                       ' mask for block select
    latchlow        long %11111111_11101111_11111111_11111111                               ' mask for P20 latch                                                   
    topram          long 28672
    
    ' Uninitialized data
    
    status          res 1        ' status location in hub ram
    t1              res 1        ' local variable
    t2              res 1        ' local variable
    t3              res 1        ' local variable
    t4              res 1        ' local variable
    t5              res 1        ' local variable
    screen_buf      res 1        ' location of the screen buffer
    _nextline       res 1        ' current line - updated by the video driver
    sram_address    res 1        ' ram address value
    hubaddr         res 1
    len             res 1
    current_line    res 1
    pixels_x        res 1
    localcounter    res 1
    blockcounter    res 1
    latchcounter    res 1
    serial_buffer   res 1
    serial_buf_last res 1
    ramline         res 1 ' location of a single line in the top part of the screen memory that can be rewritten
    
            
            fit 496
    

    and converted to a C syntax (apologies if this wraps on your computer; if it does, copy it into Notepad or Word and turn wordwrap off)
    // External ram pasm driver in PasmC
    // variables passed to functions are fake and are ignored by the compiler but are there for improving readability (all variables are global)
    // the generic return variable from a function is returnvalue
    
    // Constants (all global)
    const unsigned long oewrrd =         0x00F00000UL;              // %00000000_11110000_00000000_00000000  23 = oe, 22=wr, 21=rd, latch high
    const unsigned long oerd =           0x00500000UL;              // %00000000_01010000_00000000_00000000  oe and rd low
    const unsigned long oewr =           0x00300000UL;              // %00000000_00110000_00000000_00000000  oe and wr low
    const unsigned long dira_pins =      0x00FFFF00UL;              // %00000000_11111111_11111111_00000000  direction pins to be enabled
    const unsigned long pin0to23 =       0x00FFFFFFUL;              // %00000000_11111111_11111111_11111111  for testing
    const unsigned long zero =           0x00000000UL;              // %00000000_00000000_00000000_00000000  for testing
    const unsigned long fivetwelve =     0x00000200UL;              // %00000000_00000000_00000010_00000000  512 for sram increment
    const unsigned long blockmask =      0x001000FFUL;              // %00000000_00010000_00000000_11111111  mask for block select
    const unsigned long latchlow =       0xFFEFFFFFUL;              // %11111111_11101111_11111111_11111111  mask for P20 latch
    const unsigned long topram =         28672;						// size of ram
    
    // Variables (all global)
    
    unsigned long returnvalue; 								// generic return from functions
    unsigned long status;          								// status location in hub ram
    unsigned long t1;              								// local variable
    unsigned long t2;              								// local variable
    unsigned long t3;              								// local variable
    unsigned long t4;              								// local variable
    unsigned long t5;              								// local variable
    unsigned long screen_buf;      								// location of the screen buffer
    unsigned long _nextline;       								// current line - updated by the video driver
    unsigned long sram_address;    								// ram address value
    unsigned long hubaddr;         
    unsigned long len;             
    unsigned long current_line;    
    unsigned long pixels_x;        
    unsigned long localcounter;   
    unsigned long blockcounter;    
    unsigned long latchcounter;    
    unsigned long serial_buffer;   
    unsigned long serial_buf_last; 
    unsigned long ramline;         								// location of a single line in the top part of the screen memory that can be rewritten
    
    // Keyword variables for the propeller, these are declared here but don't cost any longs
    // par, outa,dira,ina
    
    unsigned long par;
    unsigned long outa;
    unsigned long dira;
    unsigned long ina;
    
    //        org
    //entry
    
    main()
    {
    	t1 = par;						// mov             t1,par
    	status = t1;						// mov             status,t1
    	t1 += 4;						// add             t1,#4
            rdlong(_nextline,t1);           			// rdlong          _nextline,t1               ' line number in hub (updated by graphics cog) use with rdlong t1,_nextline
    	t1 += 4;						// add             t1,#4
            rdlong(screen_buf,t1);					// rdlong          screen_buf,t1              ' screen buffer location
            t1 += 4;                				// add             t1,#4
            rdlong(pixels_x,t1);					// rdlong          pixels_x,t1                 ' pixels per line
            t1 += 4;						// add             t1,#4
            rdlong(serial_buffer,t1);				// rdlong          serial_buffer,t1            ' serial buffer for data coming in
    	ramline = screen_buf;					// mov             ramline,screen_buf
            ramline += topram;					// add             ramline,topram              ' location of the temp 256 byte buffer that is rewritten each alternate line
            serial_buf_last = serial_buffer;			// mov             serial_buf_last,serial_buffer
            serial_buf_last += 256;         			// add             serial_buf_last,#256        ' last entry in the serial buffer            
            outa = oewrrd;                				// mov             outa,oewrrd                 ' oe/wr/rd/latch all high  %00000000_11110000_00000000_00000000         
            dira = dira_pins;					// mov             dira,dira_pins              ' enable these pins for output  %00000000_11111111_11111111_00000000
    	do							// main loop 1
    	{
    		sram_address = 0;	// mov             sram_address,#0                 ' start reading ram at address #0
                    localcounter = 0;	// mov             localcounter,#0                 ' my local line counter
                    blockcounter = 0;	// mov             blockcounter,#0                 ' block is 256 bytes, 16 blocks per latch
                    latchcounter = 0;	// mov             latchcounter,#0                 ' block is 4096 bytes reset block counter
    		selectlatch(latchcounter);	// call            #selectlatch                    ' latch to zero
    		do
    		{
                 		hubaddr = ramline;			// mov             hubaddr,ramline                 ' setup for rdblock
                            len = pixels_x;				// mov             len,pixels_x                    ' setup for rdblock
    			do
    			{
    				rdlong(current_line,_nextline);	// rdlong          current_line,_nextline          ' read the line that is being displayed
    			}
    			while(current_line != localcounter);	// cmp             current_line,localcounter wz    ' equal to localcounter? then if_nz     jmp             #main_loop_3                    ' loop until it is
    			rdblock(len,sram_address,hubaddr);	// call            #rdblock                        ' read in 256 bytes
    			sram_address += 256;			// add             sram_address,#256               ' next 256 bytes
            		localcounter += 2;			// add             localcounter,#2                 ' add 2 get ready for next line
    			blockcounter++;        			// add             blockcounter,#1                 ' add 1 to block counter
    			if (blockcounter > 15)			// time for a new block?
    			{
    				sram_address = 0;		// mov             sram_address,#0                 ' reset the ram address
                            	blockcounter = 0;		// mov             blockcounter,#0                 ' reset blockcounter
                            	latchcounter++;			// add             latchcounter,#1                 ' add 1 to latchcounter
            			selectlatch(latchcounter);	// call            #selectlatch                    ' latch to latchcounter
    			}
    		}
    		while(localcounter < 224);
    		process_serial();				// any new pixels?
    	} 
    	while(1);						// repeat main loop forever
    }
    
    rdblock(unsigned long len,unsigned long sram_address, unsigned long hubaddr) // read a "len" byte block given by "sram_address" into hub at "hubaddr"  - Cluso99's ram driver 
    {              
    	t1 = sram_address;					// mov             t1,sram_address         ' store sram address
            t1 <<= 8;						// shl             t1,#8                   ' shift left so in the right position
            t1 |= oerd;						// or              t1,oerd                 ' %00000000_01010000_00000000_00000000 ' /oe and /rd low 
            outa = t1;						// mov             outa,t1                        ' send it out
            // nop();							// 'nop                             ' cluso's driver has a nop but it seems ok without one    
    	for(len = len;len > 0;len--)				// generic djnz loop, translated to a djnz at the closing curly bracket
    	{
    		t2 = ina;					// mov             t2, ina               ' read byte from SRAM \ ignores upper bits
                    wrbyte(hubaddr,t2);				// wrbyte(destination,source)  wrbyte          t2, hubaddr           ' copy byte to hub    /
                    hubaddr++;        				// add             hubaddr, #1             ' inc hub pointer
                    outa += 256;					// add             outa, #(1 << 8)         ' inc sram address
            }							// djnz            len, #rdloop            ' loop for xxx bytes
            outa = oewrrd;                				// mov             outa,oewrrd             ' oe,wr and rd and latch all high
    }								// rdblock_ret     ret                     ' return from a function
    
    selectlatch(unsigned long latchcounter)				// pass latchcounter
    {
    	dira = blockmask;					// mov              dira,blockmask            ' enable different pins to reading, ie P0-7 are outputs now
            t1 = latchcounter;					// mov              t1,latchcounter           ' get the latch counter
            t1 &= 0x7F;						// and              t1,#%01111111              'mask off high bit as 512k/4096/16 is 128
            outa = t1;						// mov              outa,t1                    ' output blockcount byte
            outa &= latchlow;					// and              outa,latchlow              ' set latch low
            							// 'nop                                                       ' delay not needed
                            					// 'nop
          	dira = dira_pins;					// mov              dira,dira_pins             ' enable these pins for output  %00000000_11111111_11111111_00000000
            outa = oewrrd;						// mov              outa,oewrrd                ' set oe,wr,rd and latch high
    }								// selectlatch_ret         ret
                          
    wrblock(unsigned long len,unsigned long t3,unsigned long t4)     // ' Write a "len" byte block given by t3 in external ram from hub at t4  - Cluso99's ram driver 
    {
    	dira = pin0to23;					// mov             dira,pin0to23            ' dira pins %00000000_11111111_11111111_11111111 
    	for(len = len;len >0; len--)				// generic djnz loop                  
    	{                  
    		t1 = t3;					// mov             t1,t3                    ' store sram address in t1
                    t1 <<= 8;					// shl             t1,#8                    ' shift left so in the right position
                    rdbyte(t2,t4);					// rdbyte          t2,t4                    ' get the byte
                    t1 |= oewr;					// or              t1,oewr                  ' or with %00000000_00110000_00000000_00000000
                    t1 |= t2;					// or              t1,t2                    ' or with the hub address
                    outa = t1;					// mov             outa,t1                  ' send it out
                    nop();       					// nop                                      ' wait a little
                    nop();						// nop
                    outa |= oewrrd;					// or              outa,oewrrd              ' oe,wr and rd all high
                    t4++;						// add             t4,#1                    ' add 1 to hub address
                    t3++;						// add             t3,#1                    ' add 1 to sram address
    	}							// djnz            len, #wrloop             ' loop for xxx bytes
            dira = dira_pins;					// mov              dira,dira_pins          ' enable these pins for output  %00000000_11111111_11111111_00000000 
            outa = oewrrd;						// mov             outa,oewrrd              ' oe,wr and rd and latch all high
    }								// wrblock_ret             ret
    
    
    process_serial()
    {
    	rdbyte(t1,serial_buf_last);		// rdbyte   t1,serial_buf_last                 ' value in the last byte of the buffer
            if (t1 != 0)				// cmp      t1,#0 wz  if_z  jmp      #main_loop_1                       ' do nothing if zero
    	{                       
    		t2 = serial_buffer;		// mov      t2,serial_buffer                   ' t2 is a counter
                    t3 = screen_buf;		// mov      t3,screen_buf                      ' t3 is the screen buffer
                    rdbyte(t1,t2);			// rdbyte   t1,t2                              ' t1 is the row number
                    t5 = t1;			// mov      t5,t1                              ' t5 also contains the row number            
                    t2 += 1;			// add      t2,#1                              ' increment - 1st byte is row number, data starts at 1
                    t4 = 256;			// mov      t4,#256                            ' count timer
                    t1 &= 1;			// and      t1,#1                              ' mask out all but LSB
    		if (t1 == 0)
    		{				// cmp      t1,#1 wz                           ' test if odd or even,  if_z jmp      #serial_external                    ' odd so write to external ram
                            			//'put in hub ram line 0 =0, 2=1, 4=2 (y/2)*256 is the same as y*128
    			t5 <<= 7;		// shl      t5,#7                              ' multiply by 128
    			t3 += t5;		// add      t3,t5                              ' get hub location to store
    			for (t4 = t4;t4 > 0;t4--)	// generic djnz loop
    			{
    				rdbyte(t1,t2);	// rdbyte   t1,t2                              'get the first byte t1 = row number, multiply by 256 and add to t3
                            	wrbyte(t3,t1);	// wrbyte   t1,t3                              ' store to hub ram
                            	t2++;		// add      t2,#1                              ' increment source
                            	t3++;		// add      t3,#1                              ' increment destination
                            }			//djnz     t4,#serial_hub                    ' do 256 times, jmp      #serial_finish                     ' set last byte to zero to say finished
    		}
    	        else
    		{
    		rdbyte(t1,serial_buffer);	// rdbyte   t1,serial_buffer                   ' t1 is the row number 0-31 = block 0, 32-63=block1            
                    t1 >>= 5;			// shr      t1,#5                              ' divide by 32
                    latchcounter = t1;		// mov      latchcounter,t1                    ' set up latchcounter
                    selectlatch(latchcounter);	// call     #selectlatch                       ' select the correct latch
                    rdbyte(t3,serial_buffer);	// rdbyte   t3,serial_buffer                   ' get the row number again
                    t3 &= 0x1F;		        // and      t3,#%11111                         ' mask off so max is 31 as sram resets each block
                    t3--;				// sub      t3,#1                              ' subtract 1
                    t3 <<= 7;		        // shl      t3,#7                              ' multiply by 128 ie 256*(y-1)/2
                    len = 256;			// mov      len,#256                           ' 256 bytes to write
                    t4 = serial_buffer;		// mov      t4,serial_buffer                   ' serial buffer location
                    t4++;				// add      t4,#1                              ' add 1 so points to correct place
    		wrblock(len,t3,t4);		// call     #wrblock                           ' write len bytes to sram t3 from hub t4
    		}
    	}
    	t1 = 0;					// mov      t1,#0                              ' set value to zero to say this has been read
            wrbyte(serial_buf_last,t1);		// wrbyte   t1,serial_buf_last                 ' store to the serial buffer
    }
    
    // list of fake functions that do nothing (in a simulator though they could be very useful)
    nop()
    {
    }
    
    rdbyte(unsigned long destination, unsigned long source)
    {
    }
    
    wrbyte(unsigned long destination, unsigned long source)
    {
    }
    }
    
  • Phil Pilgrim (PhiPi)Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2011-09-07 07:42
    Dr_A, that's pretty clever. It reminds me a lot of PL360, the "high-level assembler language" designed by Niklaus Wirth for the IBM 360.

    -Phil
  • Mike GreenMike Green Posts: 23,101
    edited 2011-09-07 20:48
    Phil,
    I worked a lot with PL360 maybe 35 years ago, made extensive changes to the existing compiler and used it for a large project. I've thought several times about its applicability to the Propeller. The biggest issues involve the use of conditional execution and tight control of timing. I couldn't come up with a way to communicate what information would be needed. Less difficult, but not solved, would be how to communicate the use of the flags when they're used in complex ways. The 360 had just one condition code that was always set by a large group of instructions in a mostly consistent way. The processors that had multiple execution units did all of the work in hardware to keep the execution units busy and sort out data dependencies.
  • Phil Pilgrim (PhiPi)Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2011-09-07 21:10
    Mike,

    38 years ago (OMG: scary realization!) is about the time I had my (brief) exposure to PL360. I was taking a compiler course in grad school (U. Mich.), and my finals project partner and I decided to write the compiler in PL360. I really liked the concept, and it was possible to write tight -- but readable -- code.

    Something like PL360 would certainly be nice for the Propeller, but I can well understand the difficulties arising from conditional execution and timing issues. Should you decide to pursue it, though, I would definitely be a willing guinea pig!

    -Phil
  • Dr_AculaDr_Acula Posts: 5,484
    edited 2011-09-07 22:56
    Next step is to write the code to translate this. Because there is only one instruction per line it should be a fairly simple 'find and replace' eg <<= is replaced with shl.
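    As a rough illustration of that 'find and replace' idea, here is a minimal sketch in C. It is not the translator described in this post; the function name `translate_line` and the operator table are assumptions made up for the example. It handles only lines of the form `dst OP src;` and treats any source that starts with a digit as an immediate.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table: C assignment operator -> PASM mnemonic. */
struct op_map { const char *c_op; const char *pasm; };

static const struct op_map OPS[] = {
    { "=",   "mov" }, { "+=",  "add" }, { "-=",  "sub" },
    { "<<=", "shl" }, { ">>=", "shr" }, { "|=",  "or"  }, { "&=", "and" },
};

/* Translate one line "dst OP src;" into "mnemonic dst,src".
 * Returns 0 on success, -1 if the line does not fit the pattern.
 * Limitation: a literal above 511 is still emitted as an immediate,
 * which PASM would reject; a real tool would move it to a constant. */
int translate_line(const char *line, char *out, size_t out_size)
{
    char dst[32], op[4], src[32];

    /* dst is an identifier, op a run of operator characters,
     * src everything up to a ';' or whitespace. */
    if (sscanf(line, " %31[A-Za-z0-9_] %3[-+<>|&=] %31[^; \t]",
               dst, op, src) != 3)
        return -1;

    for (size_t i = 0; i < sizeof OPS / sizeof OPS[0]; i++) {
        if (strcmp(op, OPS[i].c_op) == 0) {
            /* Numeric sources become PASM immediates (#n). */
            const char *imm = isdigit((unsigned char)src[0]) ? "#" : "";
            snprintf(out, out_size, "%s %s,%s%s",
                     OPS[i].pasm, dst, imm, src);
            return 0;
        }
    }
    return -1;
}
```

    With this sketch, `t1 <<= 8;` comes out as `shl t1,#8` and `outa = t1;` as `mov outa,t1`. Because `sscanf` skips optional whitespace, `a=b;`, `a= b;` and `a =b;` are all accepted. Anything else (for example `blockcounter++;`) is rejected and would need its own rule.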

    It might also be possible to add in Basic and Spin syntax - I'll see how the coding works out. Maybe even PL360!
  • Bill HenningBill Henning Posts: 6,445
    edited 2011-09-08 13:35
    Minor correction:

    rdlong r,x

    only takes 8 cycles if the two "hub delay slots" can be filled with non-hub instructions
    long a,b,c;
    
    c = a + b
    

    with a, b and c being hub variables, would compile to:
    // best case
    
    rdlong r0, a_ptr   ' 16-25 cycles
    rdlong r1, b_ptr   ' 16 cycles
    add  r0, r1            '  free
    wrlong  r0, c_ptr  ' 16 cycles
    
    // GCC might be able to insert another 6 non-hub instructions in the unused delay slots
    
    // Total execution time: 48-57 cycles
    
    a_ptr  long @hub_a
    b_ptr  long @hub_b
    c_ptr  long @hub_c
    

    Now a propeller oriented cog only mode could come up with:
    mov  c,a
    add   c,b
    
    // 8 cycles, 6..7+ times faster than best case hub variable version
    
    a  long
    b  long
    c  long
    

    I am still very impressed with the code quality shown in this thread for the early gcc port!
    ersmith wrote: »
    Arrays will have to be stored in hub memory. In fact by default all data will be stored in hub memory, so if you just declare "int x" or "int a[9]" these will be declaring hub variables. To put "x" in cog memory you'll have to give it a special attribute. There will be no way (at least in the first release) to put an array like "a" into cog memory.

    Yes, it will be possible to create/access arrays in cog memory using inline assembly. We'll probably create some macros for doing this, so everyone doesn't have to re-invent the wheel.

    Note, though, that if you think about it arrays in cog memory are not any more efficient than arrays in hub memory. The sequence to access a pointer in hub memory (r = *x) would look something like:
    rdlong r,x
    
    and it takes 8 cycles (if we're aligned on the hub window). The same sequence for accessing a cog ram variable would look like:
    movs :temp,x
    nop
    :temp mov r,0-0
    
    and it takes 12 cycles. So the hub memory code will actually be faster, if it can be kept aligned to the hub window access. gcc tries to do this; it can't always succeed, but keeping track of tedious things like instruction timings is something computers are actually pretty good at.

    Eric
  • AribaAriba Posts: 2,690
    edited 2011-09-08 18:20
    @Dr_Acula

    Before you do the work and reinvent the AAC look at this thread from Bob Anderson:
    http://forums.parallax.com/showthread.php?117211

    Andy
  • Dr_AculaDr_Acula Posts: 5,484
    edited 2011-09-08 18:36
    Great link, thanks Ariba.

    I'm still experimenting with either a 'new' language, or whether it is possible to create something that is a standard language like C. I think the latter is almost possible.

    We are all thinking very similar thoughts here. Bob Anderson's work is nice because he is writing a C-like language in C itself (I was going to use vb.net but C# is very similar).

    Down the track I'd like to think about an IDE that has an 'autocomplete' feature that writes some of the code needed to integrate cogs. Write some of the 'postbox' code automatically so you don't have to remember that you need to increment pointers in the pasm side by 4 instead of 1. A first step down that path is to pick one language and I think C is a good choice (though I think this is entirely possible in Spin (and also an augmented version of Basic if you allow things like bitshifting)).
  • Dr_AculaDr_Acula Posts: 5,484
    edited 2011-09-11 23:40
    I did some experimental coding last night. I've just had a long read through Bill Henning's LMM thread http://forums.parallax.com/showthread.php?89640-ANNOUNCING-Large-memory-model-for-Propeller-assembly-language-programs!

    I've started with the idea that you translate machine instructions into C rather than the other way round, so there is no need for stacks and asking questions about the most efficient code because you write it to be the most efficient.

    Where can you take this?

    Well, I started with trying to work with Spin, Basic, C and PASM and I think the number of combinations goes up as a factorial. Add in one more language (Pascal, or Java) and it becomes so daunting that no-one will ever do it.

    So I started with another approach. Translate Spin to Basic. Basic to C. C to Pasm. Pasm to Spin. Go round in a loop like that and you can start with any one of those languages and generate the other three. If you go round in a loop, you should end up with the same code at the end in the language you started.

    Now - if you wanted to add another language, you only have to insert it into the loop, ie write two translators.
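    To put rough numbers on the loop idea, the following sketch compares the two layouts. It assumes one translator per ordered pair of languages for the direct approach, and a one-way ring (Spin, Basic, C, Pasm, back to Spin) for the loop approach. The function names are made up for this illustration.

```c
/* Direct approach: one translator for every ordered pair of languages. */
unsigned direct_translators(unsigned n)
{
    return n == 0 ? 0 : n * (n - 1);
}

/* Ring approach: each language only translates to its successor. */
unsigned ring_translators(unsigned n)
{
    return n;
}

/* Number of translation steps needed to get from one language to
 * another in a one-way ring of n languages (indices 0..n-1). */
unsigned ring_hops(unsigned from, unsigned to, unsigned n)
{
    return (to + n - from) % n;
}
```

    For four languages this gives 12 direct translators against 4 in the ring. The price is that a translation may need up to n-1 hops: with Spin=0, Basic=1, C=2, Pasm=3, going from Spin to Pasm takes 3 hops, while Pasm to Spin takes 1.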

    So I started with Spin to Basic then Basic to C. I have hand coded Pasm to C above. Most of this is just 'find and replace' and sometimes it is easy like 'find := and replace with ='.

    Some are a little more complex - eg with Spin, there is a CON section but for both C and Basic that gets replaced with things like Public Const or const. Each language seems to have the things you need. Even modern versions of Basic have << bitshifts so we can replicate pasm instructions without resorting to too many fake function calls.

    I was even getting a bit carried away thinking that if this works in cog pasm, could it work in LMM pasm as well? It probably can, and it would be fast too, because if you wrote in the limited Spin language you would effectively be writing in compiled spin rather than interpreted spin.

    One could even think about objects using this language. An object is self contained in terms of constants and variables, and one can replicate that in vb.net syntax using classes, and in C# using the same class structure. Spin defines objects as separate files, whereas .net languages define objects within the same large text file but with classes.

    I'm experimenting with tiny programs at the moment, but if you replace pasm's "entry" with vb.net's standard form1_load, you could start to have code that might be able to run in C# and vb.net. That could be very handy for debugging your pasm code - just drop it into .net.

    I'm reading this tutorial on OOP http://msdn.microsoft.com/en-us/library/ms973814.aspx and there is some information in there that is relevant to the way Spin works. This could also lead to an OOP version of LMM Spin that could take us to Big Spin using external memory.

    I'm not sure where it could go. First step is to get a loop working where the code goes in as any language and comes out after being translated to the other three and then comes back the same code. Easy if you insist on strict syntax rules eg "a = b" with spaces, but a little harder if you have to allow for "a= b" and "a =b", with or without a comment line, and with or without the comment starting after spaces/tabs or straight after the 'b'.

    The strings.instr function is getting a good workout but has to be tested over and over with real code because if it returns zero for no match that crashes most string functions.

    I don't think this is going to lead to "printf" in a cog, but it could lead to a C program with all the code in C and no pasm visible. Back to coding...
  • Heater.Heater. Posts: 21,230
    edited 2011-09-12 01:50
    Dr_A,
    So I started with another approach. Translate Spin to Basic. Basic to C. C to Pasm. Pasm to Spin. Go round in a loop like that and you can start with any one of those languages and generate the other three. If you go round in a loop, you should end up with the same code at the end in the language you started.
    If you haven't already I think you are about to lose your mind:)

    I think you will find that in general, given some assembler language code compiled from a high level language, it is impossible to translate that back to anything that looks like the original high level language source. For example, when compiling a C program to ASM it is impossible to get the original C source back from the ASM listing.

    Basically the compilation process throws away a lot of information that was in the original high level language source. Making the dis-assembly impossible.

    This is a general problem in all computation, given the number 9 you have no idea if that was computed from 3 times 3 or 7 + 2 or whatever.

    The dis-assembly problem gets even worse if the compiler has been busy optimizing your code as it goes.

    My conclusion: You are attempting the impossible and will go nuts trying:)

    Aside: Every time I say something is impossible around here someone manages to go ahead and do it. Nowadays I say everything is impossible just to ensure that it gets done:)

    Turning to the topic of this thread: C Expressive enough for idiomatic PASM?

    Perhaps not. I suspect there are always features of processors that are hard to make use of from bog standard C. Often these features are accommodated by making use of some "magic" built in functions. Or attributes attached to functions/variables. Or odd compiler pragmas. (All of which makes the resulting code totally non-portable by the way) It may well be impossible to produce the most optimal PASM from a C program as a programmer might claim to be able to do by hand.

    However, noting that the evolving Propeller GCC C compiler can produce real native PASM to run in COG I think you will be surprised at how well it does.
  • Dr_AculaDr_Acula Posts: 5,484
    edited 2011-09-12 17:02
    Good points heater
    For example when compiling a C program to ASM it is impossible to get the original C source back from the ASM listing.

    Basically the compilation process throws away a lot of information that was in the original high level language source. Making the dis-assembly impossible.

    This is the key to what I believe is possible - write code that does not throw away any information.

    Consider an "if" statement. In C
    if (condition == value)
    {
      some code
    }
    

    and in pasm
             cmp condition,value wz
    if_nz    jmp #skipcode
             some code
    skipcode more code
    

    No information is lost there. But what is difficult is extracting the information. I am thinking of two pieces of generic code.

    The first extracts information out of a single line. Add in spaces and then cut up the line into various elements. Then look for matches.
    The second piece of code looks for patterns between lines. eg if line1, element1 is "cmp" and line2, element2 = "jmp" then this is an 'if' statement.

    The patterns differ for loops (for, while, repeat, do) and differ again for select/case/switch, but I believe each pattern is distinct, so it should be possible to do the translation without losing information.
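    A minimal sketch of the second piece of code (matching patterns between lines) might look like the following. It assumes the first piece has already split each PASM line into fields. The `pasm_line` struct and the `match_if` function are hypothetical names for this example, and only the simple 'if' pattern is recognised.

```c
#include <string.h>

/* One PASM line after it has been cut into elements.
 * Empty strings stand for "not present". */
struct pasm_line {
    const char *label;     /* e.g. "skipcode" */
    const char *cond;      /* e.g. "if_nz"    */
    const char *opcode;    /* e.g. "cmp"      */
    const char *operands;  /* e.g. "#skipcode" */
};

/* If lines[i] is a cmp and lines[i+1] is a conditional jmp to a label
 * further down, return the index of the line carrying that label
 * (the end of the 'if' body). Otherwise return -1. */
int match_if(const struct pasm_line *lines, int count, int i)
{
    if (i + 1 >= count)
        return -1;
    if (strcmp(lines[i].opcode, "cmp") != 0)
        return -1;
    if (strcmp(lines[i + 1].opcode, "jmp") != 0)
        return -1;
    if (lines[i + 1].cond[0] == '\0')
        return -1;                      /* unconditional jmp: not an 'if' */

    const char *target = lines[i + 1].operands;
    if (target[0] == '#')
        target++;                       /* drop the immediate marker */

    /* Only forward jumps count; a backward jump would be a loop. */
    for (int j = i + 2; j < count; j++)
        if (strcmp(lines[j].label, target) == 0)
            return j;
    return -1;
}
```

    Run against the four-line pasm example above, the function would report that the 'if' body ends at the `skipcode` line. Loop patterns would need their own matchers, for instance one that looks for a backward jump.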

    Not all pasm code can be reverse translated - eg escaping from the middle of subroutines with a jmp, but my feeling is this sort of 'bad spaghetti' code is the sort of code to avoid anyway.

    The code ought to end up as standard C, Spin and Basic - just a subset of these languages (things like floating point maths are not going to be easy, but strings might be possible).

    I shall give this a try (or go mad in the process!)
  • Heater.Heater. Posts: 21,230
    edited 2011-09-13 01:50
    Dr_A,
    This is the key to what I believe is possible - write code that does not throw
    away any information. ...No information is lost there.
    That's very clever, but you are only trying to trick people into agreeing with
    your delusion:)

    The reason your example appears not to throw away any information is that there
    is pretty much no information in the original C source! That code will not
    compile as it is so you at least have to wrap it up a bit and declare your
    variables:
    int main(int argc, char* argv[])
    {
      int condition;
      int value;
      if (condition == value)
      {
        // Some code
      }
      return(0);
    }
    
    Now potentially that PASM sequence could result from compiling this. And
    potentially you could translate that PASM sequence back to the program above.

    BUT wait a minute. Who said condition and value were signed integers?
    They could as well have been chars or shorts, they could have been signed or
    not. They could have been pointers or the names of arrays (same thing in C) or
    pointers to functions.
    From that little ASM you cannot tell how condition and value are declared.
    Therefore you cannot recreate the C program from the PASM.

    Why do these types matter?
    Well what if they were actually declared as char in the original C source? From
    translating this PASM snippet you may end up with them as int. BUT it might be
    that elsewhere the program relies on an incrementing char value to
    overflow, so that its value wraps around after 255. As we have decided they are
    ints, that incrementing will not behave as expected.
    So you would have to carefully analyze the entire body of PASM to find all the
    variables, see how they are used and then deduce the correct types for them.
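    The wrap-around point can be shown with a small sketch. It uses `unsigned char` explicitly, since the signedness of plain `char` is implementation-defined; the function names are made up for the example.

```c
/* Increment as the original source might have intended: 8-bit, wraps. */
unsigned char bump_char(unsigned char v)
{
    return (unsigned char)(v + 1);   /* 255 + 1 wraps to 0 */
}

/* Increment after a translator guessed a wider type: no wrap at 255. */
unsigned int bump_int(unsigned int v)
{
    return v + 1;                    /* 255 + 1 is 256 */
}
```

    Starting from 255, the first function returns 0 and the second returns 256, so code that depended on the wrap would silently change behaviour.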

    What if they were pointers?
    In C adding 1 to a pointer (ptr += 1) will actually increase the pointer by the
    size of the type that it points at. So if ptr points to a 32-bit int you will see
    "ADD PTR, #4" in the PASM. If you have not deduced the types correctly this is
    going to go all wrong when translating back to C.
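    A short sketch of that pointer scaling, assuming a platform that provides `int32_t`. The helper names are hypothetical; each returns how many bytes the address moves when 1 is added to the pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes advanced by "p + 1" when p points at 32-bit ints. */
size_t stride_of_int32_ptr(void)
{
    int32_t a[2] = { 0, 0 };
    int32_t *p = a;
    return (size_t)((char *)(p + 1) - (char *)p);
}

/* Bytes advanced by "p + 1" when p points at chars. */
size_t stride_of_char_ptr(void)
{
    char a[2] = { 0, 0 };
    char *p = a;
    return (size_t)((p + 1) - p);
}
```

    The first returns 4 and the second returns 1, which is why the same C source `ptr += 1` can show up as either `add ptr,#4` or `add ptr,#1` in the generated code.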

    It gets worse.
    I put condition and value as local variables in the code, perhaps they were
    supposed to be global or static or volatile.

    Anyway I think that in your example, as simple as it is, the C source is not
    representative of that PASM. Given the PASM it might be better to translate it
    to the following C snippet:
    zeroFlag = (condition == value);
    if (zeroFlag == 0) goto skipCode;
      // Some code
    skipCode:
      // More code
    
    Why?
    1) In PASM the zero flag only changes when you specify that it can with WZ. So
    effectively it acts like a boolean variable that can hold a truth value for a
    long time as many other instructions execute. It could be that further down the
    PASM code the value of the zero flag set in your snippet is used again. Your C
    version has not preserved that value.

    2) Unless you have analyzed the entire program you cannot be sure, looking at
    the PASM snippet, that the label "skipCode" is not used from some other place in
    the PASM: call, jump, self-modifying code, whatever. So it's better to keep it in the C source.
  • Dr_AculaDr_Acula Posts: 5,484
    edited 2011-09-13 06:56
    Ah, I have you intrigued then?

    First
    int main(int argc, char* argv[])
    {
      int condition;
      int value;
      if (condition == value)
      {
        // Some code
      }
      return(0);
    }
    

    Yes I agree.
    Well we have two types of variables in PasmC and I think there are only two. They are
    1) const unsigned long myconstant = nnnn
    2) unsigned long myvariable

    These then end up at the bottom of the DAT section and ideally you would group all the constants together and then all the variables.
    The constants are all values above 511 as anything under 512 can be written as a #number.

    I can't see how you can have integers, pointers, bytes, chars or even char arrays. They don't really mean anything in pasm.

    So this is a specific subset of the C language but if we work within the limitations of what a cog does we can still write C code. I think...
    It gets worse.
    I put condition and value as local variables in the code, perhaps they were
    supposed to be global or static or volatile.

    I am fairly certain that there are no local variables at all and they are all global. That is the way pasm works within a cog. So you declare them before the main() and that fits with the rules of C. No variables are declared within functions as everything is global.

    I know that sounds a bit strange, but these are the rules of cog code that can't be changed. What you can do though is note what those rules are, then write C code that obeys those rules. So, all variables are unsigned longs, they are all global so get declared at the beginning of the program, and there are no local variables. That still works for C, and then any code you write ought to be possible to debug on any C compiler.
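    A minimal sketch of what those rules look like in practice, loosely based on the selectlatch routine in the earlier listing but simplified. The value of `oewrrd` is taken from that listing's comments. The global `outa` is only a stand-in for the real register so the code can run on a PC, and the function name is made up. Note that on some desktop compilers `unsigned long` is 64 bits rather than the cog's 32.

```c
/* Constants: values above 511, which cannot be PASM immediates. */
const unsigned long oewrrd = 0x00F00000UL; /* %00000000_11110000_00000000_00000000 */

/* Variables: all global, all unsigned long, one per cog long. */
unsigned long outa;          /* stand-in for the OUTA register */
unsigned long t1;
unsigned long latchcounter;

/* No locals and no parameters: everything lives in the globals above. */
void select_latch_sketch(void)
{
    t1 = latchcounter;       /* mov t1,latchcounter */
    t1 &= 0x7F;              /* and t1,#%01111111   */
    outa = t1;               /* mov outa,t1         */
    outa |= oewrrd;          /* or  outa,oewrrd     */
}
```

    Setting `latchcounter` to 0x183 and calling the function leaves 0x03 in `t1` and 0x00F00003 in `outa`, which can be checked on any desktop C compiler before the code goes near a cog.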

    That ought to be useful.
    1) In PASM the zero flag only changes when you specify that it can with WZ. So
    effectively it acts like a boolean variable that can hold a truth value for a
    long time as many other instructions execute. It could be that further down the
    PASM code the value of the zero flag set in your snippet is used again. Your C
    version has not preserved that value.

    That gets tricky. Yes, the cog remembers the flag for a long time until it is changed again. BUT - I have preferred when writing code not to rely on this too much for fear that one day I might add some extra code in between the working code and upset the whole program.

    So, I write with a programming style where any flags are tested on the next line, and then after that the flag is no longer valid.

    I think that translates to C. 'if' statements are tested straight after and branch or don't branch. Subsequent 'if' statements within the first statement then are tested again. Loops are tested either at the beginning of the loop or at the end depending on what C loop structure is being used.

    The only one I'm not sure about is 'switch' statements with breaks and I need to think about that, though I suspect they can be simplified down to a series of 'if' statements.

    The structure that could be a little complex would be if...else statements. There are several jumps implicit in that code (three I think). But I think that the structure is consistent, so I still think that could be reverse compiled back to the C.
    2) Unless you have analyzed the entire program you cannot be sure, looking at
    the PASM snippet, that the label "skipCode" is not used from some other place in
    the PASM: call, jump, self-modifying code, whatever. So it's better to keep it in the C source.

    Can I take a raincheck on self modifying code?? *please*

    Actually, going off on a tangent, when (apart from maybe LMM) would one ever need self modifying code? Is there not a solution that can be done with reloadable cogjects if one runs out of memory etc?


    As for labels, no the label 'skipcode' is not used for anything else apart from that 'if' jump. I think it has to be that way. If you truly want spaghetti code, write in pasm. Even though C does allow 'goto', I believe that it is quite possible to write a C cog program with no 'goto' or 'jmp' instructions as part of the program. They are all hidden within higher level constructs, and more importantly, I believe they still compile to the most efficient code possible. Maybe there is a cunning bit of code that does not fit within this rule, but I think that if you can write a video driver in C, with all the really tight timing requirements one needs, then pretty much all cog code ought to be possible in this variant of C. Happy to be proved wrong though!

    To expand on that further, when I took my working high-res graphics driver and reverse compiled it to C by hand, I do need to confess that I rewrote the code in the process so that it obeyed all the rules of 'if' jumps and loops. There were some cunning jumps into other bits of code, and I removed them in the process. I think it added a couple more instructions. But it made the code far more readable. So yes, this concept is going to fail for code where the package has been squeezed into a cog with one or two longs to spare (eg some of the CP/M code).

    But what you lose in adding a few longs, you gain in spades with much more legible code. Conditional branches make sense, indents work as they should, and the overall readability of the code is far better in C than it is in pasm. IMHO *grin*

    Please keep the critiques coming. This is extremely useful in defining the rules of PasmC and what you can do and what you can't.
  • Dave HeinDave Hein Posts: 6,347
    edited 2011-09-13 10:03
    There was a thread called "Quine in Spin" about 6 months ago. A quine is a program that prints out its own source. I wrote a quine in Spin that would disassemble its own bytecodes and generate the Spin source. It is posted near the end of the thread. I had to use a subset of the Spin language to be able to convert from bytecodes back to Spin. The biggest limitation was how to interpret jumps. I basically had to implement IF's and loops using "repeat while".

    The biggest issue with converting from one language to another, and back is that you will have to severely limit the syntax that you can use in any language to the lowest common "denominator" of all the languages you include. I think it would be useful to be able to convert C or Spin directly into PASM. Converting from PASM back to a high level language may not be so useful, and is difficult because it is hard to convert jumps back into structured code.

    Bean's PropBasic does a good job of converting Basic into PASM or LMM PASM. Programs could be written in a subset of C or Spin that could be converted to PropBasic, and then compiled into PASM. Another approach with Spin would be to compile it to bytecodes, and then convert the bytecodes into PASM. However, a big problem with this approach is that the Spin VM relies heavily on the stack for every operation that is performed. It would be possible to create an optimizer for the generated Spin bytecodes that would convert stack pushes and pops to loads and stores. In Spin, a statement such as "A := B + C" would generate code like
    PUSH B
    PUSH C
    ADD
    POP A
    
    which could be optimized in PASM to
    mov A, B
    add A, C
    
    if A, B and C were treated as cog variables instead of hub variables.
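    A minimal sketch of that peephole step in C, assuming a made-up three-opcode stack machine (the `StackOp` names and the `fold_add` helper are hypothetical, not the real Spin bytecodes or any existing tool). It matches the four-instruction window above and emits the register form. The one subtlety is aliasing: when the destination is also the second operand, a leading `mov` would destroy it, so the sketch relies on addition being commutative.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stack-machine opcodes; NOT the real Spin bytecode set. */
typedef enum { OP_PUSH, OP_POP, OP_ADD } StackOp;

typedef struct {
    StackOp op;
    const char *name;   /* variable name for PUSH/POP, NULL for ADD */
} StackInsn;

/*
 * If the window starts with PUSH b / PUSH c / ADD / POP a, write the
 * equivalent cog-register PASM text into out and return 4 (the number of
 * stack instructions consumed). Return 0 when the pattern does not match.
 */
static size_t fold_add(const StackInsn *in, size_t n, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    if (n < 4 || in[0].op != OP_PUSH || in[1].op != OP_PUSH ||
        in[2].op != OP_ADD || in[3].op != OP_POP) {
        return 0;
    }
    const char *b = in[0].name, *c = in[1].name, *a = in[3].name;
    if (strcmp(a, c) == 0) {
        /* A := B + A: "mov A, B" would clobber A first, so add B instead. */
        snprintf(out, out_size, "add %s, %s\n", a, b);
    } else if (strcmp(a, b) == 0) {
        /* A := A + C: the mov would be redundant. */
        snprintf(out, out_size, "add %s, %s\n", a, c);
    } else {
        snprintf(out, out_size, "mov %s, %s\nadd %s, %s\n", a, b, a, c);
    }
    return 4;
}
```

    Feeding it the PUSH B / PUSH C / ADD / POP A sequence gives the two-instruction mov/add form shown above. This only holds if A, B and C really are cog registers, as the post says.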
  • Heater.Heater. Posts: 21,230
    edited 2011-09-13 10:44
    Anyway, presumably everyone here has noticed there is a GCC based C compiler about to hit the scene. It can generate PASM to run in COG, PASM to run from HUB (LMM), and PASM to run from external memory (XMM). Rumour has it that its native COG PASM is pretty tight and speedy.
    So a lot of the proposals going on in this thread seem moot.
  • Phil Pilgrim (PhiPi)Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2011-09-13 12:17
    Since I'm the one who started this thread with a rather provocative question, I guess I should mention now that I'm pretty satisfied with what the GCC folks are showing us. I probably won't ever use Propeller C for anything, but I think it will be a net positive for those of Parallax's customers who demand it.

    As to the original (rhetorical) question, the answer is, "No, of course not." But the discussion it provoked has been a really good one, and I'd like to thank everyone who's contributed to the thread. :)

    -Phil
  • Dave HeinDave Hein Posts: 6,347
    edited 2011-09-13 16:24
    Phil Pilgrim (PhiPi) wrote:
    As to the original (rhetorical) question, the answer is, "No, of course not." But the discussion it provoked has been a really good one, and I'd like to thank everyone who's contributed to the thread.
    Phil, I disagree with your conclusion. It is certainly possible to write a C compiler that generates PASM in the same style as a human PASM programmer. So, I agree that your question was rhetorical, but the answer would be "Yes, of course".
  • Phil Pilgrim (PhiPi)Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2011-09-13 16:43
    Dave, that's certainly not a response I expected! However, you said, "It is certainly possible to write a C compiler that generates PASM in the same style as a human PASM programmer," which actually answers a different question, since it accepts the constraints of the C language.

    The question in my title was more about the language itself. Phrased more precisely, it asks, "For any given PASM program P, is there a computable mapping f (i.e. a compiler) from some C program C such that f(C) = P (or an equivalent to P of equal or less size, requiring equal or fewer clock cycles to execute)?" If you answer "yes" to that I will have to say, "Show me! And show the GCC guys, too!" :)

    -Phil
  • Bill HenningBill Henning Posts: 6,445
    edited 2011-09-13 17:00
    Phil Pilgrim (PhiPi) wrote:
    Dave, that's certainly not a response I expected! However, you said, "It is certainly possible to write a C compiler that generates PASM in the same style as a human PASM programmer," which actually answers a different question, since it accepts the constraints of the C language.

    The question in my title was more about the language itself. Phrased more precisely, it asks, "For any given PASM program P, is there a computable mapping f (i.e. a compiler) from some C program C such that f(C) = P (or an equivalent to P of equal or less size, requiring equal or fewer clock cycles to execute)?" If you answer "yes" to that I will have to say, "Show me! And show the GCC guys, too!" :)

    -Phil

    LOL... you are getting formal now!

    Actually, I agree with your premise.

    I do not believe any compiler for the prop can generate code as good as a really good pasm programmer.

    Cog mode is generating very good code, especially when you are careful and make sure no stack access will occur.

    LMM mode is generating very good LMM code, which will improve as we add more optimizations.
  • Dave HeinDave Hein Posts: 6,347
    edited 2011-09-13 17:57
    So if I understand your new premise, you are asking if it's possible to write C code that maps directly into PASM code as if I was writing PASM code directly. It's possible to write a compiler that recognizes patterns in a C statement that would translate directly into a PASM instruction. Something like "x = (x << y) | ((unsigned)x >> (32 - y))" could be recognized as "rol x, y". A bit-reverse loop could be recognized and converted into a rev instruction. So yes, the C language is capable of expressing instructions that could be converted into highly optimized PASM that conforms to the way people would program PASM manually, which I believe is what you mean by idiomatic.
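    For concreteness, here is a sketch of the two C idioms mentioned, written so they are well defined for every input. The function names `rol32` and `reverse32` are just illustrative. The rotate expression as quoted shifts by 32 when y is 0, which C leaves undefined for a 32-bit operand, so this version masks the counts. Whether a particular Propeller compiler actually folds these into `rol` or `rev` is an assumption, not something shown in this thread.

```c
#include <stdint.h>

/* Rotate x left by y bits. Masking both counts keeps every shift in 0..31. */
static uint32_t rol32(uint32_t x, unsigned y)
{
    y &= 31u;
    return (x << y) | (x >> ((32u - y) & 31u));
}

/* Reverse the order of all 32 bits with a plain loop. */
static uint32_t reverse32(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u);  /* move the lowest bit of x onto r */
        x >>= 1;
    }
    return r;
}
```

    A pattern-matching compiler would have to recognise the whole body of each function as one instruction, which is the kind of mapping being described here.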
  • Phil Pilgrim (PhiPi)Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2011-09-13 18:17
    Dave,

    Idiomatic PASM goes well beyond the kind of example you cite. It includes effective use of self-modifying code, conditional execution for more than just jmps, taking good advantage of the more esoteric instructions, like cmpsub, addabs, sumc, and the mux and wait instructions, including waitvid for generating waveforms, to name just a few. Even a simple optimization like keeping a flag state around for use further on is probably beyond C's mission statement. So I stand by my, "No, of course not."
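    To make one of those concrete, here is a rough C model of what cmpsub does in a single instruction, as I understand the Propeller manual: compare two unsigned values, subtract only if the subtraction fits, and report in the carry flag whether it happened. The function names are illustrative only. The second function shows the sort of loop where a PASM programmer would reach for it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of cmpsub d, s wc: subtract s from *d only when *d >= s.
 * The return value stands in for the carry flag. */
static bool cmpsub_model(uint32_t *d, uint32_t s)
{
    if (*d >= s) {
        *d -= s;
        return true;
    }
    return false;
}

/* Illustrative use: reduce v into the range 0..m-1 by repeated cmpsub.
 * Assumes m > 0; this is a sketch, not an efficient modulo. */
static uint32_t wrap_model(uint32_t v, uint32_t m)
{
    while (cmpsub_model(&v, m)) {
        /* keep subtracting while the "carry" says a subtraction occurred */
    }
    return v;
}
```

    In C the compare, the conditional subtract and the flag result are three separate things, and a compiler has to notice that together they are one instruction.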

    -Phil
  • kenr707kenr707 Posts: 151
    edited 2011-09-13 21:03
    Any human writing assembly code can beat any compiler on a 20 instruction program. Any compiler can beat any human on a million-instruction program. (That's before somebody modifies the requirements.) In between, you need to thoughtfully balance the costs and benefits. Most commercial users would say that you should use assembly only when you've demonstrated that you can't meet the needs with compiled code.
  • Heater.Heater. Posts: 21,230
    edited 2011-09-14 03:19
    No compiler is ever going to put together the kind of tangled, short circuited, fragile, brittle, unmaintainable PASM code that I have created for the Z80 emulation or ZPU virtual machine:)
    Not unless the C source is written as a single function with no loops, all control flow done via GOTO etc etc.
    At which point there is little point in using a high level language anyway.

    On the other hand. Rumor has it that the propeller GCC compiler is now producing PASM to run in COG for the fft_benchmark that is only a tad slower than my hand made PASM version.

    Maybe my PASM is just crappy:(
  • Dr_AculaDr_Acula Posts: 5,484
    edited 2011-09-14 06:37
    You are right, Heater, it is enough to drive one mad!

    especially when there are people out there who can write compilers that could fit in a cog?? http://bellard.org/otcc/

    My code is slow and horribly inefficient and I'm sure there are many ways to optimise things. And it will never satisfy the qualification of 'idiomatic'.

    At the same time though, it is a start towards a higher level language suitable for producing cog code that has no 'goto' statements.

    Testing 'if' statements first. I'll then move on to 'for', and 'while'
    if (i == j)
    {
      if (n == m)
      {
        a = b;
      }
    }
    c = d;
    if (k != g)
    {
      e = f;
    }
    g = h;
    

    and the output (which I hope is correct) is
    L0000       cmp i,j wz
    L0001 if_nz jmp #L0007
    L0002       cmp n,m wz
    L0003 if_nz jmp #L0007
    L0004       mov a,b
    L0007       mov c,d
    L0008       cmp k,g wz
    L0009 if_z  jmp #L0012
    L0010       mov e,f
    L0012       mov g,h
    L0013 
    

    the vb.net code is
    Imports System.IO
    Imports System.Windows.Forms
    Imports System.IO.Ports
    Imports Strings = Microsoft.VisualBasic ' so can use things like left( and right( for strings
    Public Class Form1
        Public Filename As String
        Private Sub ExitToolStripMenuItem_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles ExitToolStripMenuItem.Click
            End
        End Sub
    
        Private Sub NewToolStripMenuItem_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles NewToolStripMenuItem.Click
            Filename = "New.Spin"
            'Filename = InputBox("Filename eg NEW.C or NEW.BAS or NEW.SPIN or NEW.PASM", , Filename)
            RichTextBox1.Text = ""
            RichTextBox1.AppendText("if (i == j)" + vbCrLf)
            RichTextBox1.AppendText("{" + vbCrLf)
            RichTextBox1.AppendText("  if (n == m)" + vbCrLf)
            RichTextBox1.AppendText("  {" + vbCrLf)
            RichTextBox1.AppendText("    a = b;" + vbCrLf)
            RichTextBox1.AppendText("  }" + vbCrLf)
            RichTextBox1.AppendText("}" + vbCrLf)
            RichTextBox1.AppendText("c = d;" + vbCrLf)
            RichTextBox1.AppendText("if (k != g)" + vbCrLf)
            RichTextBox1.AppendText("{" + vbCrLf)
            RichTextBox1.AppendText("  e = f;" + vbCrLf)
            RichTextBox1.AppendText("}" + vbCrLf)
            RichTextBox1.AppendText("g = h;" + vbCrLf)
        End Sub
    
        Private Sub ToOtherLanguagesToolStripMenuItem_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles ToOtherLanguagesToolStripMenuItem.Click
            RichTextBox2.Text = RichTextBox1.Text ' copy the C source to the output text box
            ConvertCtoPasm()
            TabControl1.SelectedIndex = 1 ' display the code
        End Sub
    
        Private Sub ConvertCtoPasm()
            ' convert the C source in RichTextBox2 to PASM
            Dim LineOfText As String
            Dim i, j As Integer
            Dim a, b, c As Integer
            Dim v1, v2 As String
            Dim TextArray(2000) As String
            Dim BracketCounter As Integer = 1
            Dim BracketStack(20) As Integer
            Dim BracketStackCounter As Integer = 0
            ' move data into a string array - easier to work with than richtextbox
            i = 0
            For Each Line As String In RichTextBox2.Lines ' much faster than for i=0 to richtextbox1.lines.length
                TextArray(i) = Line
                i += 1
            Next
            TextArray(i) = "End File" ' add an end of file marker
            ' remove all leading spaces
            i = 0
            Do
                TextArray(i) = LTrim(TextArray(i))
                i += 1
            Loop Until TextArray(i) = "End File"
            ' remove all other spaces
            i = 0
            Do
                TextArray(i) = TextArray(i).Replace(" ", String.Empty)
                i += 1
            Loop Until TextArray(i) = "End File"
            ' label the brackets
            i = 0
            BracketCounter = 1
            Do
                If Strings.Left(TextArray(i), 1) = "{" Then
                    TextArray(i) = "BracketLeft" + Trim(Strings.Str(BracketCounter))
                    BracketStack(BracketStackCounter) = BracketCounter
                    BracketCounter += 1
                    BracketStackCounter += 1
                End If
                If Strings.Left(TextArray(i), 1) = "}" Then
                    BracketStackCounter -= 1
                    TextArray(i) = "BracketRight" + Trim(Strings.Str(BracketStack(BracketStackCounter)))
                End If
                i += 1
            Loop Until TextArray(i) = "End File"
            ' add in label numbers
            i = 0
            Do
                LineOfText = "00000" + Trim(Strings.Str(i))
                LineOfText = "L" + Strings.Right(LineOfText, 4) ' always L+4 characters long
                TextArray(i) = LineOfText + " " + TextArray(i)
                i += 1
            Loop Until TextArray(i) = "End File"
            ' search for 'if (n == m) and replace the line below with a if_nz jump
            i = 0
            Do
                If Strings.Mid(TextArray(i), 7, 3) = "if(" And Strings.InStr(TextArray(i), "==") <> 0 Then
                    ReplaceRightBracket(TextArray, TextArray(i + 1), "if_nz jmp #", i) ' pass the left bracket number
                End If
                i += 1
            Loop Until TextArray(i) = "End File"
            ' search for if(n==m) and replace with cmp
            i = 0
            Do
                If Strings.Mid(TextArray(i), 7, 3) = "if(" And Strings.InStr(TextArray(i), "==") <> 0 Then
                    LineOfText = TextArray(i)
                    a = Strings.InStr(LineOfText, "(") ' flags to separate out the variables
                    b = Strings.InStr(LineOfText, "==")
                    c = Strings.InStr(LineOfText, ")")
                    v1 = Strings.Mid(LineOfText, a + 1, b - a - 1)
                    v2 = Strings.Mid(LineOfText, b + 2, c - b - 2)
                    TextArray(i) = Strings.Left(TextArray(i), 5) + "       cmp " + v1 + "," + v2 + " wz"
                End If
                i += 1
            Loop Until TextArray(i) = "End File"
            ' search for 'if (n != m) and replace the line below with a if_z jump
            i = 0
            Do
                If Strings.Mid(TextArray(i), 7, 3) = "if(" And Strings.InStr(TextArray(i), "!=") <> 0 Then
                    ReplaceRightBracket(TextArray, TextArray(i + 1), "if_z  jmp #", i) ' pass the left bracket number
                End If
                i += 1
            Loop Until TextArray(i) = "End File"
            ' search for if(n!=m) and replace with cmp
            i = 0
            Do
                If Strings.Mid(TextArray(i), 7, 3) = "if(" And Strings.InStr(TextArray(i), "!=") <> 0 Then
                    LineOfText = TextArray(i)
                    a = Strings.InStr(LineOfText, "(") ' flags to separate out the variables
                    b = Strings.InStr(LineOfText, "!=")
                    c = Strings.InStr(LineOfText, ")")
                    v1 = Strings.Mid(LineOfText, a + 1, b - a - 1)
                    v2 = Strings.Mid(LineOfText, b + 2, c - b - 2)
                    TextArray(i) = Strings.Left(TextArray(i), 5) + "       cmp " + v1 + "," + v2 + " wz"
                End If
                i += 1
            Loop Until TextArray(i) = "End File"
    
    
            ' do all "a=b;" lines - check not an if statement
            i = 0
            Do
                If Strings.InStr(TextArray(i), "=") <> 0 And Strings.Mid(TextArray(i), 7, 3) <> "if(" Then
                    LineOfText = TextArray(i)
                    a = 7
                    b = Strings.InStr(LineOfText, "=")
                    c = Strings.InStr(LineOfText, ";")
                    v1 = Strings.Mid(LineOfText, a, b - a) ' variable 1
                    v2 = Strings.Mid(LineOfText, b + 1, c - b - 1) ' variable 2
                    TextArray(i) = Strings.Left(TextArray(i), 5) + "       mov " + v1 + "," + v2
                End If
                i += 1
            Loop Until TextArray(i) = "End File"
    
    
            ' convert back to the rich text box and remove redundant right brackets
            i = 0
            RichTextBox2.Text = ""
            Do
                If Strings.Mid(TextArray(i), 7, 12) <> "BracketRight" Then
                    RichTextBox2.AppendText(TextArray(i) + vbCrLf)
                End If
                i += 1
            Loop Until TextArray(i) = "End File"
        End Sub
    
        Private Sub ReplaceRightBracket(ByVal TextArray() As String, ByVal LeftBracket As String, ByVal ReplaceString As String, ByVal i As Integer)
            ' take the value of the leftbracket, search through and replace the same rightbracket with replacestring
            Dim j As Integer
            ' find the matching right bracket number
            j = 0
            While Strings.Mid(TextArray(j), 7) <> "BracketRight" + Strings.Mid(LeftBracket, 18) And TextArray(j) <> "End File"
                j += 1 ' j contains the line with the right bracket
            End While
            ' now increment j until a real line of code (skip through this bracket and any further right brackets)
            While Strings.Mid(TextArray(j), 7, 12) = "BracketRight"
                j += 1
            End While
            TextArray(i + 1) = Strings.Left(TextArray(i + 1), 6) + ReplaceString + Strings.Left(TextArray(j), 5)
        End Sub
    End Class
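
    The same translation can also be sketched as a single pass with a stack of pending skip labels, here in C. This is a different technique from the multi-pass search-and-replace above, and everything in it is an assumption for illustration: the `translate_ifs` name, the `skipN` label scheme (instead of line-number labels), and the requirement that each input line already has its spaces stripped. It also assumes the assembler accepts a label on a line of its own.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 20   /* maximum nesting of if blocks, as in BracketStack(20) */

/* Translate space-stripped lines ("if(a==b)", "{", "}", "x=y;") into PASM
 * text in out. Returns 0 on success, -1 on a parse error or overflow. */
static int translate_ifs(const char *const *lines, size_t n,
                         char *out, size_t out_size)
{
    int stack[MAX_DEPTH];          /* labels of the currently open if blocks */
    int depth = 0, next_label = 0;
    size_t used = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        char v1[16], v2[16], op[3], text[96];
        const char *line = lines[i];

        if (sscanf(line, "if(%15[^=!]%2[=!]%15[^)])", v1, op, v2) == 3) {
            const char *cond;
            /* Jump over the body when the condition is false. */
            if (strcmp(op, "==") == 0)      cond = "if_nz";
            else if (strcmp(op, "!=") == 0) cond = "if_z";
            else return -1;
            if (depth == MAX_DEPTH) return -1;
            stack[depth++] = next_label;
            snprintf(text, sizeof text, "\tcmp %s,%s wz\n\t%s jmp #skip%d\n",
                     v1, v2, cond, next_label++);
        } else if (strcmp(line, "{") == 0) {
            continue;              /* the if line already opened the block */
        } else if (strcmp(line, "}") == 0) {
            if (depth == 0) return -1;
            snprintf(text, sizeof text, "skip%d\n", stack[--depth]);
        } else if (sscanf(line, "%15[^=]=%15[^;];", v1, v2) == 2) {
            snprintf(text, sizeof text, "\tmov %s,%s\n", v1, v2);
        } else {
            return -1;
        }

        size_t len = strlen(text);
        if (used + len >= out_size) return -1;
        memcpy(out + used, text, len + 1);   /* copy including terminator */
        used += len;
    }
    return depth == 0 ? 0 : -1;    /* unbalanced braces are an error */
}
```

    Nested blocks that close together produce two labels in a row (for example skip1 then skip0), which both name the next instruction. That is the same effect as jumping past consecutive right brackets in the VB version.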
    
  • tonyp12tonyp12 Posts: 1,951
    edited 2011-09-18 09:38
    L0008 cmp k,g wz
    L0009 if_z jmp #L0012
    L0010 mov e,f

    Could it be optimized to create this shorter faster version?
    L0008 cmp k,g wz
    L0009 if_nz mov e,f
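
    A translator could apply this whenever the body of the 'if' is a single instruction. A minimal sketch in C, where `emit_predicated_if` and its parameters are hypothetical names: instead of jumping around the body on the opposite condition, it puts the condition itself on the body instruction.

```c
#include <stdio.h>
#include <string.h>

/* Emit "cmp" followed by one conditionally executed instruction.
 * op is "==" or "!=", body is a single PASM instruction such as "mov e,f".
 * Returns 0 on success, -1 for an unsupported operator or a short buffer. */
static int emit_predicated_if(const char *op, const char *lhs, const char *rhs,
                              const char *body, char *out, size_t out_size)
{
    const char *cond;
    if (strcmp(op, "==") == 0)      cond = "if_z";    /* run body when equal */
    else if (strcmp(op, "!=") == 0) cond = "if_nz";   /* run body when different */
    else return -1;

    int n = snprintf(out, out_size, "cmp %s,%s wz\n%s %s\n", lhs, rhs, cond, body);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

    Called with "!=", "k", "g" and "mov e,f" it produces the two-line version above, one long shorter than the jump form. For longer bodies the same trick only works if no instruction in the body changes the Z flag, so a translator would need to check that before using it.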