C Expressive Enough for Idiomatic PASM?

Phil Pilgrim (PhiPi) · 2011-08-21 13:34

Kevin Wood wrote:

Wouldn't the "same" code in say 3 cogs just r/w to a single array?

Consider the case of serial I/O drivers, each with its own buffer, or multiple FIR filters.

-Phil

Dr_Acula · 2011-09-06 17:51

I have been thinking some more about this topic. Rather than try to compile all of C (or any other language) into pasm I am wondering if one could consider a more simplified version of a higher level language.

One big problem with any assembly is 'spaghetti' code that jumps all over the place, so one goal would be to hide all jumps (GOTO is evil, right?). That means that jumps are hidden inside higher level structures like If, Switch, Repeat.

I've been thinking of a 'bottom up' approach. Think of a commonly used pasm structure, and then replicate it in a higher level language.
eg Conditional branching.

cmp variable1, variable2 wz
if_z jmp #mylabel

That translates into four types of If statement - equals, not equals, greater than and less than. For the sake of simplicity, we can leave out "greater than or equal" and "less than or equal".
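A minimal sketch, in C, of how a precompiler might turn those four tests into pasm. The names emit_if and cmp_kind are made up for illustration, and the operands are assumed to be unsigned. The jump skips the body when the test fails, so the condition on the jmp is the inverse of the high-level test.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Comparison kinds the hypothetical precompiler would accept. */
typedef enum { CMP_EQ, CMP_NE, CMP_GT, CMP_LT } cmp_kind;

/*
 * Write the pasm that guards an if-body into out: a cmp followed by a
 * conditional jmp to end_label that is taken when the test FAILS.
 * Operands are compared as unsigned values.
 * Returns the number of characters written.
 */
static int emit_if(char *out, size_t size, cmp_kind kind,
                   const char *a, const char *b, const char *end_label)
{
    const char *skip = NULL;
    int n;

    assert(a != NULL && b != NULL && end_label != NULL);
    assert(strlen(end_label) > 0);

    switch (kind) {
    case CMP_EQ: skip = "if_ne"; break;  /* skip body when a != b */
    case CMP_NE: skip = "if_e";  break;  /* skip body when a == b */
    case CMP_GT: skip = "if_be"; break;  /* skip body when a <= b */
    case CMP_LT: skip = "if_ae"; break;  /* skip body when a >= b */
    }
    assert(skip != NULL);

    n = snprintf(out, size, "cmp %s,%s wz,wc\n%s jmp #%s\n",
                 a, b, skip, end_label);
    assert(n > 0 && (size_t)n < size);
    return n;
}
```

Calling emit_if(buf, sizeof buf, CMP_EQ, "variable1", "variable2", "endif_1") fills buf with a cmp line followed by "if_ne jmp #endif_1". The body of the If would be emitted next, then the endif_1 label.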

With those four structures we can translate that to higher level languages. Because the pasm is the same, one can think about writing a pre-compiler to do three languages at once - C, Spin and Basic

So in Basic

If variable1 = variable2 then
   some code
EndIf

in C

if (variable1 == variable2)
  {
   some code;
  }

and in Spin

if (variable1 == variable2)
   some code

We could call this new language Spin_C_Basic or something? Write in whatever syntax you prefer.

I'm thinking of this hybrid language because not everything can be done in every language.

Basic does not support >> or <<, for instance.
Standard C has no binary number literals.
And Spin may not be for every programmer.

For/Do/Repeat/While loops translate fairly easily into pasm without needing optimisation as you use the most optimised code you can.

One structure worth considering is the DJNZ loop.
Normally with a loop you would set a variable, then run some code, then subtract or add 1, then test if it has reached the end and jump to the beginning if not.
That takes three pasm instructions but if you use DJNZ you can get that down to two instructions.
The equivalent in a higher level language is the 'repeat' in Spin, where you do not really care what the variable is within the code - you just want to do something n times. I am not sure if such a structure exists within C or Basic. In C the specific instance is for(i=n;i>0;i--), but nothing in the syntax makes it explicit that this will be converted to a DJNZ loop while a for loop with i++ will not. So this might be an argument for using Spin's 'repeat' and calling this a hybrid language rather than a pure subset of a language.

That could all be debatable of course!

What I am thinking is that if you can create a structure with no goto statements and which runs just as fast as the best pasm code, it makes the code much more readable. The odd pasm command in amongst that structure is now not so bad - eg instructions that really have no high level language equivalent like rdlong and wrlong. So just put them in.

There was an interesting example on the first post about maths
a = b+c

now in pasm that is really two instructions, and you want b and c to be unchanged at the end and the variable a is the one changing so this leads to
a = b
a = a+c

which translates to two pasm instructions
More complex things like a = b+c+d on one line might be best avoided though as it may lead to different pasm code. Maybe there is a 'perfect' optimisation but one gets into shuffling railcar algorithms.

assembly is still 'close to the metal' so I'm thinking of keeping math to just one instruction per line.
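As a rough illustration of the one-instruction-per-line idea, here is a C sketch of how a precompiler could expand a = b + c. The function name emit_add is an assumption for this example. It also covers the cases where the destination is already one of the operands, where a leading mov would either be wasted or would overwrite a value still needed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Emit pasm for "dst = lhs + rhs" into out.
 *   dst == lhs : add dst,rhs            (one instruction)
 *   dst == rhs : add dst,lhs            (addition commutes, so no mov)
 *   otherwise  : mov dst,lhs / add dst,rhs
 * lhs and rhs are left unchanged unless one of them is dst.
 * Returns the number of characters written.
 */
static int emit_add(char *out, size_t size,
                    const char *dst, const char *lhs, const char *rhs)
{
    int n;

    assert(dst != NULL && lhs != NULL && rhs != NULL);

    if (strcmp(dst, lhs) == 0)
        n = snprintf(out, size, "add %s,%s\n", dst, rhs);
    else if (strcmp(dst, rhs) == 0)
        n = snprintf(out, size, "add %s,%s\n", dst, lhs);
    else
        n = snprintf(out, size, "mov %s,%s\nadd %s,%s\n",
                     dst, lhs, dst, rhs);

    assert(n > 0 && (size_t)n < size);
    return n;
}
```

For a = b + c this gives "mov a,b" then "add a,c", matching the two-line form above.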

Another thing I was thinking about was the way pasm puts all the variable declarations at the end of the program, whereas higher level languages declare variables at the beginning. A precompiler could simply move them all to the end, but for Spin/C/Basic programmers it will be one less thing that is new and unfamiliar about pasm.

Then there is the issue of variables and constants. This one gets me every time as I keep needing to remind myself that a pasm instruction only has 9 bits for an immediate value, so the number 511 is special.

mov variable,#511 is allowed
mov variable,#512 is not

and if you leave out that # the program won't work either.

These things can be picked up by a precompiler. If you try to put in a constant greater than 511, it could suggest that you create a new long to store the value, or reuse another long that already holds the same value, or, if the number is a value of 511 or less multiplied by a power of 2, use the trick of setting a number then bitshifting it. You might also declare these as 'constants' rather than as variables, and then if you accidentally wrote some code that changed the value, the precompiler could warn that this is a constant, not a variable.

In summary, I think pasm can be made a lot more legible by adding a structure that is more like a higher level language, removing all jumps, removing the need to know the difference between :mylabel and label, removing the need for adding # before labels and some values, and adding in some common error checking.

I'm going to have a go at writing a precompiler and see how it works out.

jazzed · 2011-09-06 18:28

C works good enough for most people when it is available.
GCC does a very good job generating code that runs in a COG today.

PASM can be used when C is not good enough.
Many prefer PASM, so any tool should support that.

If your new tool is good enough I'm sure people will use it.
Maybe it is worth a new topic in the Propeller forum?

Heater. · 2011-09-06 22:42

There are, and will continue to be, a multitude of programming languages. All designed to address some issue or other. Or just because people seem to have a fascination with creating such things.
The fundamental idea with a language like C is to be able to express an algorithm in such a way that it can easily be moved from one processor architecture to another. In that light, creating a language that is heavily dependent on a particular processor's instruction set serves no purpose. Might as well use the available assemblers.

Dr_Acula · 2011-09-07 07:07

The aim is to write some C code that:
1) Passes through a compilation with a standard C compiler with no errors (even though it won't produce anything that works) AND
2) can be passed through a precompiler that can convert it to pasm that works

The test of 2 is to take some display code and see it run on a display.

I'd like to edit my post above.

First, the C language CAN distinguish between constants and variables. Just use 'const' before each declaration. This will be brilliant because it will enable error generation if you ever try to change a value that has been declared a constant. So that is one thing that the current pasm compilers can't do.

Next, all variables are 'unsigned long'

Also, all variables are global. But many subroutines only work on a few variables, so it seems to me neater to pass those variables to a function as you can then see which variables that function needed. The precompiler would ignore these passed variables as they are global.

Next, C uses 'return' from functions, so it could be useful to spend one long for a generic variable 'returnvalue'

There are a number of keywords in pasm that don't mean anything in C, like par and dira. We can declare these as fake variables but they don't use any longs.

There are also a number of pasm commands that can be declared as fake functions, like rdlong().
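One possible way to set up the fake variables and fake functions so the source also does something on a PC is sketched below. Everything here is an assumption made for illustration: the hub array, the helper names hub_read_long and hub_write_long, and the choice to make rdlong a macro. A macro is used because a plain C function receives its arguments by value and could not store into the destination variable.

```c
#include <assert.h>
#include <stdint.h>

#define HUB_RAM_SIZE 32768u          /* hub RAM on the Propeller 1, in bytes */
uint8_t hub[HUB_RAM_SIZE];           /* host-side stand-in for hub RAM */

/* Fake registers: plain globals on the host, real cog registers in pasm. */
uint32_t par, outa, dira, ina;

/* Read one little-endian long from the simulated hub RAM. */
static uint32_t hub_read_long(uint32_t addr)
{
    addr &= ~3u;                     /* rdlong ignores the low two address bits */
    assert(addr + 3u < HUB_RAM_SIZE);
    return (uint32_t)hub[addr]
         | (uint32_t)hub[addr + 1] << 8
         | (uint32_t)hub[addr + 2] << 16
         | (uint32_t)hub[addr + 3] << 24;
}

/* Write one little-endian long to the simulated hub RAM. */
static void hub_write_long(uint32_t value, uint32_t addr)
{
    addr &= ~3u;                     /* wrlong is long-aligned as well */
    assert(addr + 3u < HUB_RAM_SIZE);
    hub[addr]     = (uint8_t)(value & 0xFFu);
    hub[addr + 1] = (uint8_t)((value >> 8) & 0xFFu);
    hub[addr + 2] = (uint8_t)((value >> 16) & 0xFFu);
    hub[addr + 3] = (uint8_t)((value >> 24) & 0xFFu);
}

/* Same operand order as pasm: rdlong dest,addr and wrlong value,addr. */
#define rdlong(dest, addr)  ((dest) = hub_read_long(addr))
#define wrlong(value, addr) hub_write_long((value), (addr))
```

With these in place, wrlong(0x12345678u, 8) followed by rdlong(t1, 8) leaves 0x12345678 in t1. The precompiler would ignore the helper definitions and emit the real rdlong and wrlong instructions.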

Finally, with respect to binary numbers, well they don't exist in C, but Leor Zolman of BDSC fame (good 'ol CP/M) posted a cunning solution using macros http://www.velocityreviews.com/forums/t318127-using-binary-numbers-in-c.html so we could use that.
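For reference, a sketch of the general macro technique in C. This is not necessarily the version from the linked thread, and the macro names B8 and B32 are chosen here for illustration. The digits are pasted into a hexadecimal constant, then each hex nibble is tested to recover one bit. It needs C11 for static_assert and an unsigned long of at least 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Paste eight binary digits into a hex constant: 11110000 -> 0x11110000UL */
#define HEX_(n) 0x##n##UL

/* Each hex nibble of x is 0 or 1; turn nibble k into bit k. */
#define B8_(x) ( (((x) & 0x0000000FUL) ? 0x01u : 0u) \
               | (((x) & 0x000000F0UL) ? 0x02u : 0u) \
               | (((x) & 0x00000F00UL) ? 0x04u : 0u) \
               | (((x) & 0x0000F000UL) ? 0x08u : 0u) \
               | (((x) & 0x000F0000UL) ? 0x10u : 0u) \
               | (((x) & 0x00F00000UL) ? 0x20u : 0u) \
               | (((x) & 0x0F000000UL) ? 0x40u : 0u) \
               | (((x) & 0xF0000000UL) ? 0x80u : 0u) )

/* One byte from up to eight binary digits, e.g. B8(01010101). */
#define B8(bits) B8_(HEX_(bits))

/* One 32-bit value from four bytes of binary digits, most significant first. */
#define B32(b3, b2, b1, b0) ( ((uint32_t)B8(b3) << 24) \
                            | ((uint32_t)B8(b2) << 16) \
                            | ((uint32_t)B8(b1) << 8)  \
                            |  (uint32_t)B8(b0) )

/* Compile-time check that the macro decodes as intended. */
static_assert(B8(11110000) == 0xF0u, "B8 should decode eight binary digits");
```

A Spin-style constant such as %00000000_11110000_00000000_00000000 would then be written B32(00000000,11110000,00000000,00000000), which evaluates to 0x00F00000.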

So, having set this up, time to convert some pasm to C. E&OE with my C programming - there are bound to be some semicolons left out and hopefully there are no syntax errors. I'll find them anyway when I run this through a real C compiler. In general, take some spaghetti code and write it such that there are no GOTO's and no jumps. Can it be done? yes, I think it can, and I think it is more easily readable with the indents.

And by reverse engineering pasm into a higher level language, the code should end up the same. That is the aim, anyway!

Original code (from my 256x224 line full color TV driver)

' Ram driver

VAR

  long  cog

PUB Start(ramptr) : okay
  stop
  okay := cog := cognew(@entry, ramptr) + 1  

PUB stop

'' Stop ram driver - frees a cog

  if cog
    cogstop(cog~ - 1)   

DAT

'*******************************
'* Assembly language Ram driver *
'*******************************

' check the line number. If it has changed, then start reading the next line in and put @ screen
' if it has not changed then repeat the loop
' if it is the last line then restart the ram counter variable

' currentline counts up from 0 to tiles*16
' the video driver recycles the same 8*256 bytes in the screen buffer
' first, wait for currentline to equal zero and when it does, start reading in ram to @screen
' check the currentline. Shift left by 2 and mask-and with 1. 0-3 is 0, 4-7 is 1, 8-11 is 0, 12-15 is 1 etc

' Ram format. Each line is up to 512 pixels long. 256 is the standard video driver but over 300 does work
' so each line is 9 bits
' Then there are 4 lines in a group. Read these all in from ram in one subroutine call
' So that is 11 bits.
' And then these are in two groups of 4. So that is 12 bits.
' So - once 8 lines have been read in (12 bits = 4096 bytes), then change the latch
' There are 256/8 = 32 groups of 8 lines so that is 5 bits of the latch that are used
' and the 512k ram chip can store 4 frames (if needed)
' P0-7 is the data
' P8-P19 is the address for 8 lines (12 bits)
' P20 controls the latch enable
' P21 is /rd, P22 is /wr and P23 is /oe (chip select)
' ** There are pullups on P20 to P23 as the ram chip needs to retain data when transferring from spin to cog **




        org

entry
                        mov             t1,par
                        mov             status,t1
                        add             t1,#4
                        rdlong          _nextline,t1               ' line number in hub (updated by graphics cog) use with rdlong t1,_nextline
                        add             t1,#4
                        rdlong          screen_buf,t1              ' screen buffer location
                        add             t1,#4
                        rdlong          pixels_x,t1                 ' pixels per line
                        add             t1,#4
                        rdlong          serial_buffer,t1            ' serial buffer for data coming in

                        mov             ramline,screen_buf
                        add             ramline,topram              ' location of the temp 256 byte buffer that is rewritten each alternate line
                        mov             serial_buf_last,serial_buffer
                        add             serial_buf_last,#256        ' last entry in the serial buffer            
                        mov             outa,oewrrd                 ' oe/wr/rd/latch all high  %00000000_11110000_00000000_00000000         
                        mov             dira,dira_pins              ' enable these pins for output  %00000000_11111111_11111111_00000000

' in 'spaghetti' code with jumps all over the place - but in a way that maximises speed (this works better than djnz loops)
' reset everything
' wait for zero line
' label1
' read line of data
' add 2 to local line counter
' if locallinecounter>=224 then back to beginning
' add 1 to blockcounter
' if blockcounter=16 then reset blockcounter and increment latchcounter
' wait for current_line to equal localline (will be 2,4,6,8 etc)
' goto label1

' *** don't forget # after all jmp and call instructions!!! ****

main_loop_1             mov             sram_address,#0                 ' start reading ram at address #0
                        mov             localcounter,#0                 ' my local line counter
                        mov             blockcounter,#0                 ' block is 256 bytes, 16 blocks per latch
                        mov             latchcounter,#0                 ' block is 4096 bytes reset block counter
                        call            #selectlatch                    ' latch to zero
main_loop_2             mov             hubaddr,ramline                 ' setup for rdblock
                        mov             len,pixels_x                    ' setup for rdblock
main_loop_3             rdlong          current_line,_nextline          ' read the line that is being displayed
                        cmp             current_line,localcounter wz    ' equal to localcounter?
              if_nz     jmp             #main_loop_3                    ' loop until it is
                        call            #rdblock                        ' read in 256 bytes
                        add             sram_address,#256               ' next 256 bytes
                        add             localcounter,#2                 ' add 2 get ready for next line
                        cmp             localcounter,#224 wc            ' end of the screen = 224
'              if_nc     jmp             #main_loop_1                    ' restart everything
              if_nc     jmp             #process_serial   ' any new pixels?  
                        add             blockcounter,#1                 ' add 1 to block counter
                        cmp             blockcounter,#16  wz            ' time for new block?
              if_nz     jmp             #main_loop_2                    ' skip if still on current block
                        mov             sram_address,#0                 ' reset the ram address
                        mov             blockcounter,#0                 ' reset blockcounter
                        add             latchcounter,#1                 ' add 1 to latchcounter
                        call            #selectlatch                    ' latch to latchcounter
                        jmp             #main_loop_2


' 10k pullups on /oe, /rd and /wr and latch and don't put leds on these three pins

'Read a "len" byte block given by "sram_address" into hub at "hubaddr"  - Cluso99's ram driver
' preserves sram_address
rdblock               
                        mov             t1,sram_address         ' store sram address
                        shl             t1,#8                   ' shift left so in the right position
                        or              t1,oerd                 ' %00000000_01010000_00000000_00000000 ' /oe and /rd low 
                        ' outa pre-filled with the address shifted over by 8 and the /oe and /rd low
                        mov             outa,t1                        ' send it out
                        'nop                             ' cluso's driver has a nop but it seems ok without one    
rdloop                  mov             t2, ina               ' read byte from SRAM \ ignores upper bits
                        wrbyte          t2, hubaddr           ' copy byte to hub    /
                        add             hubaddr, #1             ' inc hub pointer
                        add             outa, #(1 << 8)         ' inc sram address
                        djnz            len, #rdloop            ' loop for xxx bytes
                        mov             outa,oewrrd             ' oe,wr and rd and latch all high
rdblock_ret             ret


' pass latchcounter
selectlatch             mov              dira,blockmask            ' enable different pins to reading, ie P0-7 are outputs now
                        mov              t1,latchcounter           ' get the latch counter
                        and              t1,#%01111111              'mask off high bit as 512k/4096/16 is 128
                        mov              outa,t1                    ' output blockcount byte
                        and              outa,latchlow              ' set latch low
                        'nop                                                       ' delay not needed
                        'nop
                        mov              dira,dira_pins             ' enable these pins for output  %00000000_11111111_11111111_00000000
                        mov              outa,oewrrd                ' set oe,wr,rd and latch high
selectlatch_ret         ret
                      
' Write a "len" byte block given by t3 in external ram from hub at t4  - Cluso99's ram driver 
wrblock
                        mov             dira,pin0to23            ' dira pins %00000000_11111111_11111111_11111111                   
wrloop                  mov             t1,t3                    ' store sram address in t1
                        shl             t1,#8                    ' shift left so in the right position
                        rdbyte          t2,t4                    ' get the byte
                        or              t1,oewr                  ' or with %00000000_00110000_00000000_00000000
                        or              t1,t2                    ' or with the hub address
                        mov             outa,t1                  ' send it out
                        nop                                      ' wait a little
                        nop
                        or              outa,oewrrd              ' oe,wr and rd all high
                        add             t4,#1                    ' add 1 to hub address
                        add             t3,#1                    ' add 1 to sram address
                        djnz            len, #wrloop             ' loop for xxx bytes
                        mov              dira,dira_pins          ' enable these pins for output  %00000000_11111111_11111111_00000000 
                        mov             outa,oewrrd              ' oe,wr and rd and latch all high
wrblock_ret             ret





' read serial data. The first byte is the row number, then 256 bytes
' if the last byte is zero then has been read, if not zero then read
' if the last byte is not zero, then process the data and then write a zero to the last byte
' if processing and the row is even then store to hub ram
' if the row is odd then store to external ram

process_serial          rdbyte   t1,serial_buf_last                 ' value in the last byte of the buffer
                        cmp      t1,#0 wz
                  if_z  jmp      #main_loop_1                       ' do nothing if zero
                        mov      t2,serial_buffer                   ' t2 is a counter
                        mov      t3,screen_buf                      ' t3 is the screen buffer
                        rdbyte   t1,t2                              ' t1 is the row number
                        mov      t5,t1                              ' t5 also contains the row number            
                        add      t2,#1                              ' increment - 1st byte is row number, data starts at 1
                        mov      t4,#256                            ' count timer
                        and      t1,#1                              ' mask out all but LSB
                        cmp      t1,#1 wz                           ' test if odd or even
                  if_z jmp      #serial_external                    ' odd so write to external ram
                        'put in hub ram line 0 =0, 2=1, 4=2 (y/2)*256 is the same as y*128
                        shl      t5,#7                              ' multiply by 128
                        add      t3,t5                              ' get hub location to store
serial_hub
                        rdbyte   t1,t2                              'get the first byte t1 = row number, multiply by 256 and add to t3
                        wrbyte   t1,t3                              ' store to hub ram
                        add      t2,#1                              ' increment source
                        add      t3,#1                              ' increment destination
                        djnz     t4,#serial_hub                    ' do 256 times            
                        jmp      #serial_finish                     ' set last byte to zero to say finished

serial_external        ' move data to external ram t5 contains row number - see pixels routine for the maths
                        rdbyte   t1,serial_buffer                   ' t1 is the row number 0-31 = block 0, 32-63=block1            
                        shr      t1,#5                              ' divide by 32
                        mov      latchcounter,t1                    ' set up latchcounter
                        call     #selectlatch                       ' select the correct latch
                        rdbyte   t3,serial_buffer                   ' get the row number again
                        and      t3,#%11111                         ' mask off so max is 31 as sram resets each block
                        sub      t3,#1                              ' subtract 1
                        shl      t3,#7                              ' multiply by 128 ie 256*(y-1)/2
                        mov      len,#256                           ' 256 bytes to write
                        mov      t4,serial_buffer                   ' serial buffer location
                        add      t4,#1                              ' add 1 so points to correct place
                        call     #wrblock                           ' write len bytes to sram t3 from hub t4
                        jmp      #serial_finish                     ' set last byte to zero to say finished reading

serial_finish           mov      t1,#0                              ' set value to zero to say this has been read
                        wrbyte   t1,serial_buf_last                 ' store to the serial buffer
                        jmp      #main_loop_1
'row 1 =sram0,row 3=sram256, row 5=sram512, row33=sram0, block1 

finish     jmp #finish


' Initialised data
'address_mask    long %00000000_00000000_00011111_11111111                     ' 13 bits max 32 lines otherwise need to latch
oewrrd          long %00000000_11110000_00000000_00000000                     ' 23 = oe, 22=wr, 21=rd, latch high
oerd            long %00000000_01010000_00000000_00000000                     ' oe and rd low
oewr            long %00000000_00110000_00000000_00000000                     ' oe and wr low
dira_pins       long %00000000_11111111_11111111_00000000                     ' direction pins to be enabled                         
pin0to23        long %00000000_11111111_11111111_11111111                     ' for testing
zero            long %00000000_00000000_00000000_00000000                     ' for testing
'smallblock      long 1024
fivetwelve      long %00000000_00000000_00000010_00000000                       ' 512 for sram increment
blockmask       long %00000000_00010000_00000000_11111111                       ' mask for block select
latchlow        long %11111111_11101111_11111111_11111111                               ' mask for P20 latch                                                   
topram          long 28672

' Uninitialized data

status          res 1        ' status location in hub ram
t1              res 1        ' local variable
t2              res 1        ' local variable
t3              res 1        ' local variable
t4              res 1        ' local variable
t5              res 1        ' local variable
screen_buf      res 1        ' location of the screen buffer
_nextline       res 1        ' current line - updated by the video driver
sram_address    res 1        ' ram address value
hubaddr         res 1
len             res 1
current_line    res 1
pixels_x        res 1
localcounter    res 1
blockcounter    res 1
latchcounter    res 1
serial_buffer   res 1
serial_buf_last res 1
ramline         res 1 ' location of a single line in the top part of the screen memory that can be rewritten

        
        fit 496

and converted to a C syntax (apologies if this has wordwrap on your computer, if it does then copy it into Notepad or Word and turn wordwrap off)

// External ram pasm driver in PasmC
// variables passed to functions are fake and are ignored by the compiler but are there for improving readability (all variables are global)
// the generic return variable from a function is returnvalue

// Constants (all global)
const unsigned long oewrrd =         %00000000_11110000_00000000_00000000;            	//   23 = oe, 22=wr, 21=rd, latch high
const unsigned long oerd =           %00000000_01010000_00000000_00000000;              // oe and rd low
const unsigned long oewr =           %00000000_00110000_00000000_00000000;              // oe and wr low
const unsigned long dira_pins =      %00000000_11111111_11111111_00000000;              // direction pins to be enabled
const unsigned long pin0to23 =       %00000000_11111111_11111111_11111111;              // for testing
const unsigned long zero =           %00000000_00000000_00000000_00000000;              // for testing
const unsigned long fivetwelve =     %00000000_00000000_00000010_00000000;              // 512 for sram increment
const unsigned long blockmask =      %00000000_00010000_00000000_11111111;              // mask for block select
const unsigned long latchlow =       %11111111_11101111_11111111_11111111;              // mask for P20 latch  
const unsigned long topram =         28672;						// size of ram

// Variables (all global)

unsigned long returnvalue; 								// generic return from functions
unsigned long status;          								// status location in hub ram
unsigned long t1;              								// local variable
unsigned long t2;              								// local variable
unsigned long t3;              								// local variable
unsigned long t4;              								// local variable
unsigned long t5;              								// local variable
unsigned long screen_buf;      								// location of the screen buffer
unsigned long _nextline;       								// current line - updated by the video driver
unsigned long sram_address;    								// ram address value
unsigned long hubaddr;         
unsigned long len;             
unsigned long current_line;    
unsigned long pixels_x;        
unsigned long localcounter;   
unsigned long blockcounter;    
unsigned long latchcounter;    
unsigned long serial_buffer;   
unsigned long serial_buf_last; 
unsigned long ramline;         								// location of a single line in the top part of the screen memory that can be rewritten

// Keyword variables for the propeller, these are declared here but don't cost any longs
// par, outa,dira,ina

unsigned long par;
unsigned long outa;
unsigned long dira;
unsigned long ina;

//        org
//entry

main()
{
	t1 = par;						// mov             t1,par
	status = t1;						// mov             status,t1
	t1 += 4;						// add             t1,#4
        rdlong(_nextline,t1);           			// rdlong          _nextline,t1               ' line number in hub (updated by graphics cog) use with rdlong t1,_nextline
	t1 += 4;						// add             t1,#4
        rdlong(screen_buf,t1);					// rdlong          screen_buf,t1              ' screen buffer location
        t1 += 4;                				// add             t1,#4
        rdlong(pixels_x,t1);					// rdlong          pixels_x,t1                 ' pixels per line
        t1 += 4;						// add             t1,#4
        rdlong(serial_buffer,t1);				// rdlong          serial_buffer,t1            ' serial buffer for data coming in
	ramline = screen_buf;					// mov             ramline,screen_buf
        ramline += topram;					// add             ramline,topram              ' location of the temp 256 byte buffer that is rewritten each alternate line
        serial_buf_last = serial_buffer;			// mov             serial_buf_last,serial_buffer
        serial_buf_last += 256;         			// add             serial_buf_last,#256        ' last entry in the serial buffer            
        outa = oewrrd;                				// mov             outa,oewrrd                 ' oe/wr/rd/latch all high  %00000000_11110000_00000000_00000000         
        dira = dira_pins;					// mov             dira,dira_pins              ' enable these pins for output  %00000000_11111111_11111111_00000000
	do							// main loop 1
	{
		sram_address = 0;	// mov             sram_address,#0                 ' start reading ram at address #0
                localcounter = 0;	// mov             localcounter,#0                 ' my local line counter
                blockcounter = 0;	// mov             blockcounter,#0                 ' block is 256 bytes, 16 blocks per latch
                latchcounter = 0;	// mov             latchcounter,#0                 ' block is 4096 bytes reset block counter
		selectlatch(latchcounter);	// call            #selectlatch                    ' latch to zero
		do
		{
             		hubaddr = ramline;			// mov             hubaddr,ramline                 ' setup for rdblock
                        len = pixels_x;				// mov             len,pixels_x                    ' setup for rdblock
			do
			{
				rdlong(current_line,_nextline);	// rdlong          current_line,_nextline          ' read the line that is being displayed
			}
			while(current_line != localcounter);	// cmp             current_line,localcounter wz    ' equal to localcounter? then if_nz     jmp             #main_loop_3                    ' loop until it is
			rdblock(len,sram_address,hubaddr);	// call            #rdblock                        ' read in 256 bytes
			sram_address += 256;			// add             sram_address,#256               ' next 256 bytes
        		localcounter += 2;			// add             localcounter,#2                 ' add 2 get ready for next line
			blockcounter++;        			// add             blockcounter,#1                 ' add 1 to block counter
			if (blockcounter > 15)			// time for a new block?
			{
				sram_address = 0;		// mov             sram_address,#0                 ' reset the ram address
                        	blockcounter = 0;		// mov             blockcounter,#0                 ' reset blockcounter
                        	latchcounter++;			// add             latchcounter,#1                 ' add 1 to latchcounter
        			selectlatch(latchcounter);	// call            #selectlatch                    ' latch to latchcounter
			}
		}
		while(localcounter < 224);
		process_serial();				// any new pixels?
	} 
	while(1);						// repeat main loop forever
}

rdblock(unsigned long len,unsigned long sram_address, unsigned long hubaddr) // read a "len" byte block given by "sram_address" into hub at "hubaddr"  - Cluso99's ram driver 
{              
	t1 = sram_address;					// mov             t1,sram_address         ' store sram address
        t1 <<= 8;						// shl             t1,#8                   ' shift left so in the right position
        t1 |= oerd;						// or              t1,oerd                 ' %00000000_01010000_00000000_00000000 ' /oe and /rd low 
        outa = t1;						// mov             outa,t1                        ' send it out
        // nop();							// 'nop                             ' cluso's driver has a nop but it seems ok without one    
	for(len = len;len > 0;len--)				// generic djnz loop, translated to a djnz at the closing curly bracket
	{
		t2 = ina;					// mov             t2, ina               ' read byte from SRAM \ ignores upper bits
                wrbyte(hubaddr,t2);				// wrbyte(destination,source)  wrbyte          t2, hubaddr           ' copy byte to hub    /
                hubaddr++;        				// add             hubaddr, #1             ' inc hub pointer
                outa += 256;					// add             outa, #(1 << 8)         ' inc sram address
        }							// djnz            len, #rdloop            ' loop for xxx bytes
        outa = oewrrd;                				// mov             outa,oewrrd             ' oe,wr and rd and latch all high
}								// rdblock_ret     ret                     ' return from a function

selectlatch(unsigned long latchcounter)				// pass latchcounter
{
	dira = blockmask;					// mov              dira,blockmask            ' enable different pins to reading, ie P0-7 are outputs now
        t1 = latchcounter;					// mov              t1,latchcounter           ' get the latch counter
        t1 &= 0x7F;						// and              t1,#%01111111              'mask off high bit as 512k/4096/16 is 128 (0x7F = %01111111, C has no binary literals)
        outa = t1;						// mov              outa,t1                    ' output blockcount byte
        outa &= latchlow;					// and              outa,latchlow              ' set latch low
        							// 'nop                                                       ' delay not needed
                        					// 'nop
      	dira = dira_pins;					// mov              dira,dira_pins             ' enable these pins for output  %00000000_11111111_11111111_00000000
        outa = oewrrd;						// mov              outa,oewrrd                ' set oe,wr,rd and latch high
}								// selectlatch_ret         ret
                      
wrblock(unsigned long len,unsigned long t3,unsigned long t4)     // ' Write a "len" byte block given by t3 in external ram from hub at t4  - Cluso99's ram driver 
{
	dira = pin0to23;					// mov             dira,pin0to23            ' dira pins %00000000_11111111_11111111_11111111 
	for(len = len;len >0; len--)				// generic djnz loop                  
	{                  
		t1 = t3;					// mov             t1,t3                    ' store sram address in t1
                t1 <<= 8;					// shl             t1,#8                    ' shift left so in the right position
                rdbyte(t2,t4);					// rdbyte          t2,t4                    ' get the byte
                t1 |= oewr;					// or              t1,oewr                  ' or with %00000000_00110000_00000000_00000000
                t1 |= t2;					// or              t1,t2                    ' or with the hub address
                outa = t1;					// mov             outa,t1                  ' send it out
                nop();       					// nop                                      ' wait a little
                nop();						// nop
                outa |= oewrrd;					// or              outa,oewrrd              ' oe,wr and rd all high
                t4++;						// add             t4,#1                    ' add 1 to hub address
                t3++;						// add             t3,#1                    ' add 1 to sram address
	}							// djnz            len, #wrloop             ' loop for xxx bytes
        dira = dira_pins;					// mov              dira,dira_pins          ' enable these pins for output  %00000000_11111111_11111111_00000000 
        outa = oewrrd;						// mov             outa,oewrrd              ' oe,wr and rd and latch all high
}								// wrblock_ret             ret


process_serial()
{
	rdbyte(t1,serial_buf_last);		// rdbyte   t1,serial_buf_last                 ' value in the last byte of the buffer
        if (t1 != 0)				// cmp      t1,#0 wz  if_z  jmp      #main_loop_1                       ' do nothing if zero
	{                       
		t2 = serial_buffer;		// mov      t2,serial_buffer                   ' t2 is a counter
                t3 = screen_buf;		// mov      t3,screen_buf                      ' t3 is the screen buffer
                rdbyte(t1,t2);			// rdbyte   t1,t2                              ' t1 is the row number
                t5 = t1;			// mov      t5,t1                              ' t5 also contains the row number            
                t2 += 1;			// add      t2,#1                              ' increment - 1st byte is row number, data starts at 1
                t4 = 256;			// mov      t4,#256                            ' count timer
                t1 &= 1;			// and      t1,#1                              ' mask out all but LSB
		if (t1 == 0)
		{				// cmp      t1,#1 wz                           ' test if odd or even,  if_z jmp      #serial_external                    ' odd so write to external ram
                        			//'put in hub ram line 0 =0, 2=1, 4=2 (y/2)*256 is the same as y*128
			t5 <<= 7;		// shl      t5,#7                              ' multiply by 128
			t3 += t5;		// add      t3,t5                              ' get hub location to store
			for (t4 = t4;t4 > 0;t4--)	// generic djnz loop
			{
				rdbyte(t1,t2);	// rdbyte   t1,t2                              'get the first byte t1 = row number, multiply by 256 and add to t3
                        	wrbyte(t3,t1);	// wrbyte   t1,t3                              ' store to hub ram
                        	t2++;		// add      t2,#1                              ' increment source
                        	t3++;		// add      t3,#1                              ' increment destination
                        }			//djnz     t4,#serial_hub                    ' do 256 times, jmp      #serial_finish                     ' set last byte to zero to say finished
		}
	        else
		{
		rdbyte(t1,serial_buffer);	// rdbyte   t1,serial_buffer                   ' t1 is the row number 0-31 = block 0, 32-63=block1            
                t1 >>= 5;			// shr      t1,#5                              ' divide by 32
                latchcounter = t1;		// mov      latchcounter,t1                    ' set up latchcounter
                selectlatch(latchcounter);	// call     #selectlatch                       ' select the correct latch
                rdbyte(t3,serial_buffer);	// rdbyte   t3,serial_buffer                   ' get the row number again
                t3 &= 0x1F;		        // and      t3,#%11111                         ' mask off so max is 31 as sram resets each block (0x1F = %11111)
                t3--;				// sub      t3,#1                              ' subtract 1
                t3 <<= 7;		        // shl      t3,#7                              ' multiply by 128 ie 256*(y-1)/2
                len = 256;			// mov      len,#256                           ' 256 bytes to write
                t4 = serial_buffer;		// mov      t4,serial_buffer                   ' serial buffer location
                t4++;				// add      t4,#1                              ' add 1 so points to correct place
		wrblock(len,t3,t4);		// call     #wrblock                           ' write len bytes to sram t3 from hub t4
		}
	}
	t1 = 0;					// mov      t1,#0                              ' set value to zero to say this has been read
        wrbyte(serial_buf_last,t1);		// wrbyte   t1,serial_buf_last                 ' store to the serial buffer
}

// list of fake functions that do nothing (in a simulator though they could be very useful)
nop()
{
}

rdbyte(unsigned long destination, unsigned long source)
{
}

wrbyte(unsigned long destination, unsigned long source)
{
}
}

Phil Pilgrim (PhiPi) · 2011-09-07 07:42

Dr_A, that's pretty clever. It reminds me a lot of PL360, the "high-level assembler language" designed by Niklaus Wirth for the IBM 360:

http://en.wikipedia.org/wiki/PL360

-Phil

Mike Green · 2011-09-07 20:48

Phil,
I worked a lot with PL360 maybe 35 years ago, made extensive changes to the existing compiler and used it for a large project. I've thought several times about its applicability to the Propeller. The biggest issues involve the use of conditional execution and tight control of timing. I couldn't come up with a way to communicate what information would be needed. Less difficult, but not solved, would be how to communicate the use of the flags when they're used in complex ways. The 360 had just one condition code that was always set by a large group of instructions in a mostly consistent way. The processors that had multiple execution units did all of the work in hardware to keep the execution units busy and sort out data dependencies.

Phil Pilgrim (PhiPi) · 2011-09-07 21:10

Mike,

38 years ago (OMG: scary realization!) is about the time I had my (brief) exposure to PL360. I was taking a compiler course in grad school (U. Mich.), and my finals project partner and I decided to write the compiler in PL360. I really liked the concept, and it was possible to write tight -- but readable -- code.

Something like PL360 would certainly be nice for the Propeller, but I can well understand the difficulties arising from conditional execution and timing issues. Should you decide to pursue it, though, I would definitely be a willing guinea pig!

-Phil

Dr_Acula · 2011-09-07 22:56

Next step is to write the code to translate this. Because there is only one instruction per line it should be a fairly simple 'find and replace' eg <<= is replaced with shl.

It might also be possible to add in Basic and Spin syntax - I'll see how the coding works out. Maybe even PL360!
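
As a rough sketch of the 'find and replace' idea, the operator on each one-instruction line can be looked up in a small table. The table name, the struct and the function below are hypothetical, and only the pairings that appear in the listing above (for example <<= to shl, |= to or, = to mov) come from the thread; the rest are assumed by analogy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup table: C assignment operator -> PASM mnemonic. */
struct op_map { const char *c_op; const char *pasm; };

static const struct op_map OPS[] = {
    { "<<=", "shl" }, { ">>=", "shr" }, { "+=", "add" }, { "-=", "sub" },
    { "&=",  "and" }, { "|=",  "or"  }, { "^=", "xor" }, { "=",  "mov" },
};

/* Return the PASM mnemonic for a C operator token, or NULL if unknown. */
const char *pasm_mnemonic(const char *c_op)
{
    for (size_t i = 0; i < sizeof OPS / sizeof OPS[0]; i++)
        if (strcmp(OPS[i].c_op, c_op) == 0)
            return OPS[i].pasm;
    return NULL;
}
```

A translator built this way would still need to split each line into destination, operator and source before the lookup, and to decide when the source needs a # prefix.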

Bill Henning · 2011-09-08 13:35

Minor correction:

rdlong r,x

only takes 8 cycles if the two "hub delay slots" can be filled with non-hub instructions

long a,b,c;

c = a + b;

with a, b and c being hub variables, would compile to:

// best case

rdlong r0, a_ptr   ' 16-25 cycles
rdlong r1, b_ptr   ' 16 cycles
add  r0, r1            '  free
wrlong  r0, c_ptr  ' 16 cycles

// GCC might be able to insert another 6 non-hub instructions in the unused delay slots

// Total execution time: 48-57 cycles

a_ptr  long @hub_a
b_ptr  long @hub_b
c_ptr  long @hub_c

Now a propeller oriented cog only mode could come up with:

mov  c,a
add   c,b

// 8 cycles, 6 to 7+ times faster than the best case hub variable version

a  long
b  long
c  long

I am still very impressed with the code quality shown in this thread for the early gcc port!

ersmith wrote: »
Arrays will have to be stored in hub memory. In fact by default all data will be stored in hub memory, so if you just declare "int x" or "int a[9]" these will be declaring hub variables. To put "x" in cog memory you'll have to give it a special attribute. There will be no way (at least in the first release) to put an array like "a" into cog memory.

Yes, it will be possible to create/access arrays in cog memory using inline assembly. We'll probably create some macros for doing this, so everyone doesn't have to re-invent the wheel.

Note, though, that if you think about it arrays in cog memory are not any more efficient than arrays in hub memory. The sequence to access a pointer in hub memory (r = *x) would look something like:
rdlong r,x
and it takes 8 cycles (if we're aligned on the hub window). The same sequence for accessing a cog ram variable would look like:
movs :temp,x
nop
:temp mov r,0-0
and it takes 12 cycles. So the hub memory code will actually be faster, if it can be kept aligned to the hub window access. gcc tries to do this; it can't always succeed, but keeping track of tedious things like instruction timings is something computers are actually pretty good at.

Eric

Ariba · 2011-09-08 18:20

@Dr_Acula

Before you do the work and reinvent the AAC look at this thread from Bob Anderson:
http://forums.parallax.com/showthread.php?117211

Andy

Dr_Acula · 2011-09-08 18:36

Great link, thanks Ariba.

I'm still experimenting with either a 'new' language, or whether it is possible to create something that is a standard language like C. I think the latter is almost possible.

We are all thinking very similar thoughts here. Bob Anderson's work is nice because he is writing a C-like language in C itself (I was going to use vb.net but C# is very similar).

Down the track I'd like to think about an IDE that has an 'autocomplete' feature that writes some of the code needed to integrate cogs. Write some of the 'postbox' code automatically so you don't have to remember that you need to increment pointers in the pasm side by 4 instead of 1. A first step down that path is to pick one language and I think C is a good choice (though I think this is entirely possible in Spin (and also an augmented version of Basic if you allow things like bitshifting)).
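
The "increment by 4 instead of 1" detail can be made concrete with a small helper. This is only a sketch: the function name and the idea of a postbox made of consecutive 32-bit slots are assumptions for illustration, not a layout taken from this thread.

```c
#include <stdint.h>

/* Byte address of the n-th 32-bit slot in a postbox that starts at base.
   Hub addresses count bytes, so consecutive longs are 4 apart, not 1. */
uint32_t mailbox_slot_addr(uint32_t base, uint32_t n)
{
    return base + n * (uint32_t)sizeof(uint32_t);
}
```

An autocomplete feature of the kind described could generate calls like this so that the factor of 4 never has to be typed by hand.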

Dr_Acula · 2011-09-11 23:40

I did some experimental coding last night. I've just had a long read through Bill Henning's LMM thread http://forums.parallax.com/showthread.php?89640-ANNOUNCING-Large-memory-model-for-Propeller-assembly-language-programs!

I've started with the idea that you translate machine instructions into C rather than the other way round, so there is no need for stacks and asking questions about the most efficient code because you write it to be the most efficient.

Where can you take this?

Well, I started with trying to work with Spin, Basic, C and PASM and I think the number of combinations goes up as a factorial. Add in one more language (Pascal, or Java) and it becomes so daunting that no-one will ever do it.

So I started with another approach. Translate Spin to Basic. Basic to C. C to Pasm. Pasm to Spin. Go round in a loop like that and you can start with any one of those languages and generate the other three. If you go round in a loop, you should end up with the same code at the end in the language you started.

Now - if you wanted to add another language, you only have to insert it into the loop, ie write two translators.
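
To put numbers on this, here is a minimal sketch that counts translators under two assumptions: a direct scheme needs one translator for every ordered pair of languages (which grows quadratically), while the loop needs one per hop. Both function names are made up for the example.

```c
/* Direct scheme: one translator for every ordered pair of distinct languages. */
unsigned pairwise_translators(unsigned n)
{
    return n * (n - 1);
}

/* Loop scheme: one translator per hop around the ring. */
unsigned ring_translators(unsigned n)
{
    return n;
}
```

With four languages that is 12 translators against 4. Adding a fifth language to the ring replaces one hop with two, a net increase of one, whereas the direct scheme would grow to 20.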

So I started with Spin to Basic then Basic to C. I have hand coded Pasm to C above. Most of this is just 'find and replace' and sometimes it is easy like 'find := and replace with ='.

Some are a little more complex - eg with Spin, there is a CON section but for both C and Basic that gets replaced with things like Public Const or const. Each language seems to have the things you need. Even modern versions of Basic have << bitshifts so we can replicate pasm instructions without resorting to too many fake function calls.

I was even getting a bit carried away thinking that if this works in cog pasm, could it work in LMM pasm as well? It probably can, and it would be fast too, because if you wrote in the limited Spin language you would effectively be writing in compiled spin rather than interpreted spin.

One could even think about objects using this language. An object is self contained in terms of constants and variables, and one can replicate that in vb.net syntax using classes, and in C# using the same class structure. Spin defines objects as separate files, whereas .net languages define objects within the same large text file but with classes.

I'm experimenting with tiny programs at the moment, but if you replace pasm's "entry" with vb.net's standard form1_load, you could start to have code that might be able to run in C# and vb.net. That could be very handy for debugging your pasm code - just drop it into .net.

I'm reading this tutorial on OOP http://msdn.microsoft.com/en-us/library/ms973814.aspx and there is some information in there that is relevant to the way Spin works. This could also lead to an OOP version of LMM Spin that could take us to Big Spin using external memory.

I'm not sure where it could go. First step is to get a loop working where the code goes in as any language and comes out after being translated to the other three and then comes back the same code. Easy if you insist on strict syntax rules eg "a = b" with spaces, but a little harder if you have to allow for "a= b" and "a =b", with or without a comment line, and with or without the comment starting after spaces/tabs or straight after the 'b'.

The strings.instr function is getting a good workout but has to be tested over and over with real code because if it returns zero for no match that crashes most string functions.
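
The spacing and no-match problems described above can be sketched in C as follows. This is an illustration under assumptions rather than the vb.net code being discussed: the function names are hypothetical, it only handles a single "=" (not "==" or ":="), and it treats an apostrophe as the start of a comment.

```c
#include <ctype.h>
#include <string.h>

/* Copy len bytes of src into dst, dropping leading and trailing blanks. */
static void copy_trimmed(char *dst, size_t cap, const char *src, size_t len)
{
    while (len > 0 && isspace((unsigned char)*src)) { src++; len--; }
    while (len > 0 && isspace((unsigned char)src[len - 1])) len--;
    if (len >= cap) len = cap - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Split "a = b ' comment" into lhs and rhs, whatever the spacing.
   Returns 1 on success, 0 when the line has no '=' at all. */
int split_assignment(const char *line, char *lhs, char *rhs, size_t cap)
{
    const char *eq = strchr(line, '=');
    if (eq == NULL)
        return 0;                          /* test for no match before using it */
    size_t end = strcspn(eq + 1, "'");     /* stop at the comment marker, if any */
    copy_trimmed(lhs, cap, line, (size_t)(eq - line));
    copy_trimmed(rhs, cap, eq + 1, end);
    return 1;
}
```

Checking the search result first is the C equivalent of guarding against instr returning zero before handing the position to other string functions.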

I don't think this is going to lead to "printf" in a cog, but it could lead to a C program with all the code in C and no pasm visible. Back to coding...

Heater. · 2011-09-12 01:50

Dr_A,

So I started with another approach. Translate Spin to Basic. Basic to C. C to Pasm. Pasm to Spin. Go round in a loop like that and you can start with any one of those languages and generate the other three. If you go round in a loop, you should end up with the same code at the end in the language you started.

If you haven't already I think you are about to lose your mind:)

I think you will find that in general, given some assembler language code compiled from a high level language, it is impossible to translate that back to anything that looks like the original high level language source. For example, when compiling a C program to ASM it is impossible to get the original C source back from the ASM listing.

Basically the compilation process throws away a lot of information that was in the original high level language source, making the dis-assembly impossible.

This is a general problem in all computation, given the number 9 you have no idea if that was computed from 3 times 3 or 7 + 2 or whatever.

The dis-assembly problem gets even worse if the compiler has been busy optimizing your code as it goes.

My conclusion: You are attempting the impossible and will go nuts trying:)

Aside: Every time I say something is impossible around here someone manages to go ahead and do it. Nowadays I say everything is impossible just to ensure that it gets done:)

Turning to the topic of this thread: C Expressive enough for idiomatic PASM?

Perhaps not. I suspect there are always features of processors that are hard to make use of from bog standard C. Often these features are accommodated by making use of some "magic" built in functions. Or attributes attached to functions/variables. Or odd compiler pragmas. (All of which makes the resulting code totally non-portable by the way) It may well be impossible to produce the most optimal PASM from a C program as a programmer might claim to be able to do by hand.

However, noting that the evolving Propeller GCC C compiler can produce real native PASM to run in COG I think you will be surprised at how well it does.

Dr_Acula · 2011-09-12 17:02

Good points heater

For example when compiling a C program to ASM it is impossible to get the original C source back from the ASM listing.

Basically the compilation process throws away a lot of information that was in the original high level language source. Making the dis-assembly impossible.

This is the key to what I believe is possible - write code that does not throw away any information.

Consider an "if" statement. In C

if (condition == value)
{
  some code
}

and in pasm

         cmp condition,value wz
if_nz    jmp #skipcode
         some code
skipcode more code

No information is lost there. But what is difficult is extracting the information. I am thinking of two pieces of generic code.

The first extracts information out of a single line. Add in spaces and then cut up the line into various elements. Then look for matches.
The second piece of code looks for patterns between lines. eg if line1, element1 is "cmp" and line2, element2 = "jmp" then this is an 'if' statement.

The patterns are different for loops - for, while, repeat, do, and different again for select/case/switch but I believe the patterns are different and so it should be possible to do the translation without losing information.
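
A minimal sketch of the second piece of code, the one that looks for patterns between lines, might look like this. It assumes each line has already been stripped of any label column, and the function name is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 when a cmp line followed by a conditional jmp looks like an 'if'.
   Assumes neither line starts with a label. */
int looks_like_if(const char *line1, const char *line2)
{
    char op1[16] = "", cond[16] = "", op2[16] = "";

    if (sscanf(line1, "%15s", op1) != 1)
        return 0;
    if (sscanf(line2, "%15s %15s", cond, op2) != 2)
        return 0;
    return strcmp(op1, "cmp") == 0
        && strncmp(cond, "if_", 3) == 0
        && strcmp(op2, "jmp") == 0;
}
```

Telling an 'if' from a loop would need more than this, for example checking whether the jump target lies before or after the cmp.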

Not all pasm code can be reverse translated - eg escaping from the middle of subroutines with a jmp, but my feeling is this sort of 'bad spaghetti' code is the sort of code to avoid anyway.

The code ought to end up as standard C, Spin and Basic - just a subset of these languages (things like floating point maths are not going to be easy, but strings might be possible).

I shall give this a try (or go mad in the process!)

Heater. · 2011-09-13 01:50

Dr_A,

This is the key to what I believe is possible - write code that does not throw
away any information. ...No information is lost there.

That's very clever, but you are only trying to trick people into agreeing with
your delusion:)

The reason your example appears not to throw away any information is that there
is pretty much no information in the original C source! That code will not
compile as it is so you at least have to wrap it up a bit and declare your
variables:

int main(int argc, char* argv[])
{
  int condition;
  int value;
  if (condition == value)
  {
    // Some code
  }
  return(0);
}

Now potentially that PASM sequence could result from compiling this. And
potentially you could translate that PASM sequence back to the program above.

BUT wait a minute. Who said condition and value were signed integers?
They could as well have been chars or shorts, they could have been signed or
not. They could have been pointers or the names of arrays (same thing in C) or
pointers to functions.
From that little ASM you cannot tell how condition and value are declared.
Therefore you cannot recreate the C program from the PASM.

Why do these types matter?
Well what if they were actually declared as char in the original C source? From
translating this PASM snippet you may end up with them as int. BUT it might be
that elsewhere in the program it relies on an incrementing char value to
overflow so that its value wraps around after 255. As we have decided they are
ints that incrementing will not behave as expected.
So you would have to carefully analyze the entire body of PASM to find all the
variables, see how they are used and then deduce the correct types for them.
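
A small sketch of the wraparound point, using an unsigned char to stand for the original 8-bit variable (the function names are invented for the example):

```c
/* Counter kept in an 8-bit variable: incrementing 255 wraps to 0. */
unsigned char bump_as_char(unsigned char v)
{
    return (unsigned char)(v + 1);
}

/* The same counter wrongly recovered as an int: 255 goes on to 256. */
int bump_as_int(int v)
{
    return v + 1;
}
```

The two functions agree for every input below 255 and then diverge, which is why the type has to be deduced correctly before translating back.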

What if they were pointers?
In C adding 1 to a pointer (ptr += 1) will actually increase the pointer by the
size of the type that it points at. So if ptr points to 32 bit int you will see
"ADD PTR, #4" in the PASM. If you have not deduced the types correctly this is
going to go all wrong when translating back to C.
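
The pointer scaling can be checked directly in C. This is a sketch with a hypothetical function name; it measures how many bytes "ptr += 1" moves a pointer to a 32-bit int.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes that "ptr += 1" advances a pointer to a 32-bit int. */
size_t int32_step_bytes(void)
{
    int32_t cells[2] = { 0, 0 };
    int32_t *ptr = cells;

    ptr += 1;                                      /* one element forward */
    return (size_t)((char *)ptr - (char *)cells);  /* distance in bytes */
}
```

The result is 4, matching the "ADD PTR, #4" that would show up in the PASM.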

It gets worse.
I put condition and value as local variables in the code, perhaps they were
supposed to be global or static or volatile.

Anyway I think that in your example, as simple as it is, the C source is not
representative of that PASM. Given the PASM it might be better to translate it
to the following C snippet:

zeroFlag = (condition == value);
if (zeroFlag == 0) goto skipCode;
  // Some code
skipCode:
  // More code

Why?
1) In PASM the zero flag only changes when you specify that it can with WZ. So
effectively it acts like a boolean variable that can hold a truth value for a
long time as many other instructions execute. It could be that further down the
PASM code the value of the zero flag set in your snippet is used again. Your C
version has not preserved that value.

2) Unless you have analyzed the entire program you cannot be sure, looking at
the PASM snippet, that the label "skipCode" is not used from some other place in
the PASM, call, jump, self-modifying, whatever. So it's better to keep it in the C source.

Dr_Acula · 2011-09-13 06:56

Ah, I have you intrigued then?

First

int main(int argc, char* argv[])
{
  int condition;
  int value;
  if (condition == value)
  {
    // Some code
  }
  return(0);
}

Yes I agree.
Well we have two types of variables in PasmC and I think there are only two. They are
1) const unsigned long myconstant = nnnn
2) unsigned long myvariable

These then end up at the bottom of the DAT section and ideally you would group all the constants together and then all the variables.
The constants are all values above 511 as anything under 512 can be written as a #number.

I can't see how you can have integers, pointers, bytes, chars or even char arrays. They don't really mean anything in pasm.

So this is a specific subset of the C language but if we work within the limitations of what a cog does we can still write C code. I think...

It gets worse.
I put condition and value as local variables in the code, perhaps they were
supposed to be global or static or volatile.

I am fairly certain that there are no local variables at all and they are all global. That is the way pasm works within a cog. So you declare them before the main() and that fits with the rules of C. No variables are declared within functions as everything is global.

I know that sounds a bit strange, but these are the rules of cog code that can't be changed. What you can do though is note what those rules are, then write C code that obeys those rules. So, all variables are unsigned longs, they are all global so get declared at the beginning of the program, and there are no local variables. That still works for C, and then any code you write ought to be possible to debug on any C compiler.

That ought to be useful.

1) In PASM the zero flag only changes when you specify that it can with WZ. So
effectively it acts like a boolean variable that can hold a truth value for a
long time as many other instructions execute. It could be that further down the
PASM code the value of the zero flag set in your snippet is used again. Your C
version has not preserved that value.

That gets tricky. Yes, the cog remembers the flag for a long time until it is changed again. BUT - I have preferred when writing code not to rely on this too much for fear that one day I might add some extra code in between the working code and upset the whole program.

So, I write with a programming style where any flags are tested on the next line, and then after that the flag is no longer valid.

I think that translates to C. 'if' statements are tested straight after and branch or don't branch. Subsequent 'if' statements within the first statement then are tested again. Loops are tested either at the beginning of the loop or at the end depending on what C loop structure is being used.

The only one I'm not sure about is 'switch' statements with breaks and I need to think about that, though I suspect they can be simplified down to a series of 'if' statements.

The structure that could be a little complex would be if...else statements. There are several jumps implicit in that code (three I think). But I think that the structure is consistent, so I still think that could be reverse compiled back to the C.

2) Unless you have analyzed the entire program you cannot be sure, looking at
the PASM snippet, that the label "skipCode" is not used from some other place in
the PASM, call, jump, self-modifying, whatever. So it's better to keep it in the C source.

Can I take a raincheck on self modifying code?? *please*

Actually, going off on a tangent, when (apart from maybe LMM) would one ever need self modifying code? Is there not a solution that can be done with reloadable cogjects if one runs out of memory etc?

As for labels, no the label 'skipcode' is not used for anything else apart from that 'if' jump. I think it has to be that way. If you truly want spaghetti code, write in pasm. Even though C does allow 'goto', I believe that it is quite possible to write a C cog program with no 'goto' or 'jmp' instructions as part of the program. They are all hidden within higher level constructs, and more importantly, I believe they still compile to the most efficient code possible. Maybe there is a cunning bit of code that does not fit within this rule, but I think that if you can write a video driver in C, with all the really tight timing requirements one needs, then pretty much all cog code ought to be possible in this variant of C. Happy to be proved wrong though!

To expand on that further, when I took my working high-res graphics driver and reverse compiled it to C by hand, I do need to confess that I rewrote the code in the process so that it obeyed all the rules of 'if' jumps and loops. There were some cunning jumps into other bits of code, and I removed them in the process. I think it added a couple more instructions. But it made the code far more readable. So yes, this concept is going to fail for code where the package has been squeezed into a cog with one or two longs to spare (eg some of the CP/M code).

But what you lose in adding a few longs, you gain in spades with much more legible code. Conditional branches make sense, indents work as they should, and the overall readability of the code is far better in C than it is in pasm. IMHO *grin*

Please keep the critiques coming. This is extremely useful in defining the rules of PasmC and what you can do and what you can't.

Dave Hein · 2011-09-13 10:03

There was a thread called "Quine in Spin" about 6 months ago. A quine is a program that prints out its own source. I wrote a quine in Spin that would disassemble its own bytecodes and generate the Spin source. It is posted near the end of the thread. I had to use a subset of the Spin language to be able to convert from bytecodes back to Spin. The biggest limitation was how to interpret jumps. I basically had to implement IF's and loops using "repeat while".

The biggest issue with converting from one language to another, and back is that you will have to severely limit the syntax that you can use in any language to the lowest common "denominator" of all the languages you include. I think it would be useful to be able to convert C or Spin directly into PASM. Converting from PASM back to a high level language may not be so useful, and is difficult because it is hard to convert jumps back into structured code.

Bean's PropBasic does a good job of converting Basic into PASM or LMM PASM. Programs could be written in a subset of C or Spin that could be converted to PropBasic, and then compiled into PASM. Another approach with Spin would be to compile it to bytecodes, and then convert the bytecodes into PASM. However, a big problem with this approach is that the Spin VM relies heavily on the stack for every operation that is performed. It would be possible to create an optimizer for the generated Spin bytecodes that would convert stack pushes and pops to loads and stores. In Spin, a statement such as "A := B + C" would generate code like

PUSH B
PUSH C
ADD
POP A

which could be optimized in PASM to

mov A, B
add A, C

if A, B and C were treated as cog variables instead of hub variables.
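
The stack-to-register rewrite above can be sketched as a tiny peephole pass. Everything here is illustrative: the struct, the function name and the textual opcode form are assumptions, not the real Spin bytecode format. The two extra cases cover a destination that is also one of the operands, where a plain mov would overwrite a value before it is used.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stack operation: opcode text plus an optional operand name. */
struct stack_op { const char *op; const char *arg; };

/* If ops[0..3] is PUSH x / PUSH y / ADD / POP z, write the PASM form into
   out and return 1; otherwise leave out untouched and return 0. */
int fold_add(const struct stack_op ops[4], char *out, size_t cap)
{
    if (strcmp(ops[0].op, "PUSH") != 0 || strcmp(ops[1].op, "PUSH") != 0 ||
        strcmp(ops[2].op, "ADD") != 0  || strcmp(ops[3].op, "POP") != 0)
        return 0;

    const char *x = ops[0].arg, *y = ops[1].arg, *z = ops[3].arg;

    if (strcmp(z, x) == 0)            /* A := A + C  ->  add A, C */
        snprintf(out, cap, "add %s, %s\n", z, y);
    else if (strcmp(z, y) == 0)       /* A := B + A  ->  add A, B */
        snprintf(out, cap, "add %s, %s\n", z, x);
    else                              /* A := B + C  ->  mov A, B / add A, C */
        snprintf(out, cap, "mov %s, %s\nadd %s, %s\n", z, x, z, y);
    return 1;
}
```

As in the post, this only makes sense if the operands are cog variables, since mov and add cannot address hub memory directly.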

Heater. · 2011-09-13 10:44

Anyway, presumably everyone here has noticed there is a GCC based C compiler about to hit the scene. It can generate PASM to run in COG, PASM to run from HUB (LMM), and PASM to run from external memory (XMM). Rumour has it that its native COG PASM is pretty tight and speedy.
So a lot of the proposals going on this thread seem moot.

Phil Pilgrim (PhiPi) · 2011-09-13 12:17

Since I'm the one who started this thread with a rather provocative question, I guess I should mention now that I'm pretty satisfied with what the GCC folks are showing us. I probably won't ever use Propeller C for anything, but I think it will be a net positive for those of Parallax's customers who demand it.

As to the original (rhetorical) question, the answer is, "No, of course not." But the discussion it provoked has been a really good one, and I'd like to thank everyone who's contributed to the thread.

-Phil

Dave Hein · 2011-09-13 16:24

Phil Pilgrim (PhiPi) wrote: »

As to the original (rhetorical) question, the answer is, "No, of course not." But the discussion it provoked has been a really good one, and I'd like to thank everyone who's contributed to the thread.

Phil, I disagree with your conclusion. It is certainly possible to write a C compiler that generates PASM in the same style as a human PASM programmer. So, I agree that your question was rhetorical, but the answer would be "Yes, of course".

Phil Pilgrim (PhiPi) · 2011-09-13 16:43

Dave, that's certainly not a response I expected! However, you said, "It is certainly possible to write a C compiler that generates PASM in the same style as a human PASM programmer," which actually answers a different question, since it accepts the constraints of the C language.

The question in my title was more about the language itself. Phrased more precisely, it asks, "For any given PASM program P, is there a computable mapping f (i.e. a compiler) from some C program C such that f(C) = P (or an equivalent to P of equal or less size, requiring equal or fewer clock cycles to execute)?" If you answer "yes" to that I will have to say, "Show me! And show the GCC guys, too!"

-Phil

Bill Henning · 2011-09-13 17:00

Phil Pilgrim (PhiPi) wrote: »

Dave, that's certainly not a response I expected! However, you said, "It is certainly possible to write a C compiler that generates PASM in the same style as a human PASM programmer," which actually answers a different question, since it accepts the constraints of the C language.

The question in my title was more about the language itself. Phrased more precisely, it asks, "For any given PASM program P, is there a computable mapping f (i.e. a compiler) from some C program C such that f(C) = P (or an equivalent to P of equal or less size, requiring equal or fewer clock cycles to execute)?" If you answer "yes" to that I will have to say, "Show me! And show the GCC guys, too!"

-Phil

LOL... you are getting formal now!

Actually, I agree with your premise.

I do not believe any compiler for the prop can generate code as good as a really good pasm programmer.

Cog mode is generating very good code, especially when you are careful and make sure no stack access will occur.

LMM mode is generating very good LMM code, which will improve as we add more optimizations.

Dave Hein · 2011-09-13 17:57

So if I understand your new premise, you are asking if it's possible to write C code that maps directly into PASM code as if I was writing PASM code directly. It's possible to write a compiler that recognizes patterns in a C statement that would translate directly into a PASM instruction. Something like "x = (x << y) | ((unsigned)x >> (32 - y))" could be recognized as "rol x, y". A bit-reverse loop could be recognized and converted into a rev instruction. So yes, the C language is capable of expressing instructions that could be converted into highly optimized PASM that conforms to the way people would program PASM manually, which I believe is what you mean by idiomatic.
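
As a minimal sketch of the two source patterns mentioned above, here is how the rotate and the bit-reverse loop might look in portable C. The function names `rotl32` and `reverse_bits` are made up for illustration, and whether any particular compiler actually maps them to `rol` or `rev` is an assumption, not something shown in this thread.

```c
#include <stdint.h>

/* Rotate x left by y bits. The count is masked to 0..31 so that the
   code never shifts by 32, which is undefined behaviour in C.
   This is the shape a compiler could pattern-match to a single rol. */
uint32_t rotl32(uint32_t x, unsigned y)
{
    y &= 31u;
    return (x << y) | (x >> ((32u - y) & 31u));
}

/* Reverse the low `bits` bits of x, one bit per iteration.
   This is the kind of loop a compiler would have to recognise as a
   whole before it could replace it with a bit-reverse instruction. */
uint32_t reverse_bits(uint32_t x, unsigned bits)
{
    uint32_t r = 0;
    for (unsigned i = 0; i < bits; i++) {
        r = (r << 1) | (x & 1u);  /* take the lowest bit of x */
        x >>= 1;
    }
    return r;
}
```

Note that the expression as quoted, with a plain `32 - y`, shifts by 32 when y is 0, which C leaves undefined for a 32-bit operand; the masking above avoids that case.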

Phil Pilgrim (PhiPi) · 2011-09-13 18:17

Dave,

Idiomatic PASM goes well beyond the kind of example you cite. It includes effective use of self-modifying code, conditional execution for more than just jmps, taking good advantage of the more esoteric instructions, like cmpsub, addabs, sumc, and the mux and wait instructions, including waitvid for generating waveforms, to name just a few. Even a simple optimization like keeping a flag state around for use further on is probably beyond C's mission statement. So I stand by my, "No, of course not."
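
To make the point about flags concrete, here is a rough C model of a `cmpsub`-based divide step. It assumes that `cmpsub` subtracts the source from the destination only when the destination is greater than or equal to it (unsigned) and sets C when it does, and that `rcl` shifts C into bit 0; check the Propeller manual before relying on that. The names `cmpsub` and `div16` here are illustrative C helpers, not an existing API.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed model of "cmpsub d,s wc": if d >= s then d -= s and C = 1,
   otherwise d is unchanged and C = 0. The return value plays C. */
bool cmpsub(uint32_t *d, uint32_t s)
{
    if (*d >= s) {
        *d -= s;
        return true;
    }
    return false;
}

/* Divide a 16-bit x by a non-zero 16-bit y. Returns the quotient in
   the low 16 bits and the remainder in the high 16 bits. */
uint32_t div16(uint32_t x, uint32_t y)
{
    y <<= 15;                          /* shl y,#15                      */
    for (int t = 16; t != 0; t--) {    /* mov t,#16 ... djnz t,#loop     */
        bool c = cmpsub(&x, y);        /* cmpsub x,y wc                  */
        x = (x << 1) | (c ? 1u : 0u);  /* rcl x,#1: uses C from cmpsub   */
    }
    return x;
}
```

In PASM the loop body is two instructions, because the carry produced by one instruction is consumed directly by the next. In C the flag has to be spelled out as a variable, and the compiler has to rediscover that it can live in C.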

-Phil

kenr707 · 2011-09-13 21:03

Any human writing assembly code can beat any compiler on a 20 instruction program. Any compiler can beat any human on a million-instruction program. (That's before somebody modifies the requirements.) In between, you need to thoughtfully balance the costs and benefits. Most commercial users would say that you should use assembly only when you've demonstrated that you can't meet the needs with compiled code.

Heater. · 2011-09-14 03:19

No compiler is ever going to put together the kind of tangled, short circuited, fragile, brittle, unmaintainable PASM code that I have created for the Z80 emulation or ZPU virtual machine:)
Not unless the C source is written as a single function with no loops, all control flow done via GOTO etc etc.
At which point there is little point in using a high level language anyway.

On the other hand, rumor has it that the propeller GCC compiler is now producing PASM to run in COG for the fft_benchmark that is only a tad slower than my hand-made PASM version.

Maybe my PASM is just crappy:(

Dr_Acula · 2011-09-14 06:37

You are right, Heater, it is enough to drive one mad!

Especially when there are people out there who can write compilers that could fit in a cog?? http://bellard.org/otcc/

My code is slow and horribly inefficient, and I'm sure there are many ways to optimise things. And it will never satisfy the qualification of 'idiomatic'.

At the same time though, it is a start towards a higher level language suitable for producing cog code that has no 'goto' statements.

Testing 'if' statements first. I'll then move on to 'for' and 'while'.

if (i == j)
{
  if (n == m)
  {
    a = b;
  }
}
c = d;
if (k != g)
{
  e = f;
}
g = h;

and the output (which I hope is correct) is

L0000       cmp i,j wz
L0001 if_nz jmp #L0007
L0002       cmp n,m wz
L0003 if_nz jmp #L0007
L0004       mov a,b
L0007       mov c,d
L0008       cmp k,g wz
L0009 if_z  jmp #L0012
L0010       mov e,f
L0012       mov g,h
L0013

the vb.net code is

Imports System.IO
Imports System.Windows.Forms
Imports System.IO.Ports
Imports Strings = Microsoft.VisualBasic ' so can use things like left( and right( for strings
Public Class Form1
    Public Filename As String
    Private Sub ExitToolStripMenuItem_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles ExitToolStripMenuItem.Click
        End
    End Sub

    Private Sub NewToolStripMenuItem_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles NewToolStripMenuItem.Click
        Filename = "New.Spin"
        'Filename = InputBox("Filename eg NEW.C or NEW.BAS or NEW.SPIN or NEW.PASM", , Filename)
        RichTextBox1.Text = ""
        RichTextBox1.AppendText("if (i == j)" + vbCrLf)
        RichTextBox1.AppendText("{" + vbCrLf)
        RichTextBox1.AppendText("  if (n == m)" + vbCrLf)
        RichTextBox1.AppendText("  {" + vbCrLf)
        RichTextBox1.AppendText("    a = b;" + vbCrLf)
        RichTextBox1.AppendText("  }" + vbCrLf)
        RichTextBox1.AppendText("}" + vbCrLf)
        RichTextBox1.AppendText("c = d;" + vbCrLf)
        RichTextBox1.AppendText("if (k != g)" + vbCrLf)
        RichTextBox1.AppendText("{" + vbCrLf)
        RichTextBox1.AppendText("  e = f;" + vbCrLf)
        RichTextBox1.AppendText("}" + vbCrLf)
        RichTextBox1.AppendText("g = h;" + vbCrLf)
    End Sub

    Private Sub ToOtherLanguagesToolStripMenuItem_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles ToOtherLanguagesToolStripMenuItem.Click
        RichTextBox2.Text = RichTextBox1.Text ' copy the C source to the output text box
        ConvertCtoPasm()
        TabControl1.SelectedIndex = 1 ' display the code
    End Sub

    Private Sub ConvertCtoPasm()
        ' convert the C source in RichTextBox2 to PASM
        Dim LineOfText As String
        Dim i, j As Integer
        Dim a, b, c As Integer
        Dim v1, v2 As String
        Dim TextArray(2000) As String
        Dim BracketCounter As Integer = 1
        Dim BracketStack(20) As Integer
        Dim BracketStackCounter As Integer = 0
        ' move data into a string array - easier to work with than richtextbox
        i = 0
        For Each Line As String In RichTextBox2.Lines ' much faster than for i=0 to richtextbox1.lines.length
            TextArray(i) = Line
            i += 1
        Next
        TextArray(i) = "End File" ' add an end of file marker
        ' remove all leading spaces
        i = 0
        Do
            TextArray(i) = LTrim(TextArray(i))
            i += 1
        Loop Until TextArray(i) = "End File"
        ' remove all other spaces
        i = 0
        Do
            TextArray(i) = TextArray(i).Replace(" ", String.Empty)
            i += 1
        Loop Until TextArray(i) = "End File"
        ' label the brackets
        i = 0
        BracketCounter = 1
        Do
            If Strings.Left(TextArray(i), 1) = "{" Then
                TextArray(i) = "BracketLeft" + Trim(Strings.Str(BracketCounter))
                BracketStack(BracketStackCounter) = BracketCounter
                BracketCounter += 1
                BracketStackCounter += 1
            End If
            If Strings.Left(TextArray(i), 1) = "}" Then
                BracketStackCounter -= 1
                TextArray(i) = "BracketRight" + Trim(Strings.Str(BracketStack(BracketStackCounter)))
            End If
            i += 1
        Loop Until TextArray(i) = "End File"
        ' add in label numbers
        i = 0
        Do
            LineOfText = "00000" + Trim(Strings.Str(i))
            LineOfText = "L" + Strings.Right(LineOfText, 4) ' always L+4 characters long
            TextArray(i) = LineOfText + " " + TextArray(i)
            i += 1
        Loop Until TextArray(i) = "End File"
        ' search for 'if (n == m) and replace the line below with a if_nz jump
        i = 0
        Do
            If Strings.Mid(TextArray(i), 7, 3) = "if(" And Strings.InStr(TextArray(i), "==") <> 0 Then
                ReplaceRightBracket(TextArray, TextArray(i + 1), "if_nz jmp #", i) ' pass the left bracket number
            End If
            i += 1
        Loop Until TextArray(i) = "End File"
        ' search for if(n==m) and replace with cmp
        i = 0
        Do
            If Strings.Mid(TextArray(i), 7, 3) = "if(" And Strings.InStr(TextArray(i), "==") <> 0 Then
                LineOfText = TextArray(i)
                a = Strings.InStr(LineOfText, "(") ' flags to separate out the variables
                b = Strings.InStr(LineOfText, "==")
                c = Strings.InStr(LineOfText, ")")
                v1 = Strings.Mid(LineOfText, a + 1, b - a - 1)
                v2 = Strings.Mid(LineOfText, b + 2, c - b - 2)
                TextArray(i) = Strings.Left(TextArray(i), 5) + "       cmp " + v1 + "," + v2 + " wz"
            End If
            i += 1
        Loop Until TextArray(i) = "End File"
        ' search for 'if (n != m) and replace the line below with a if_z jump
        i = 0
        Do
            If Strings.Mid(TextArray(i), 7, 3) = "if(" And Strings.InStr(TextArray(i), "!=") <> 0 Then
                ReplaceRightBracket(TextArray, TextArray(i + 1), "if_z  jmp #", i) ' pass the left bracket number
            End If
            i += 1
        Loop Until TextArray(i) = "End File"
        ' search for if(n!=m) and replace with cmp
        i = 0
        Do
            If Strings.Mid(TextArray(i), 7, 3) = "if(" And Strings.InStr(TextArray(i), "!=") <> 0 Then
                LineOfText = TextArray(i)
                a = Strings.InStr(LineOfText, "(") ' flags to separate out the variables
                b = Strings.InStr(LineOfText, "!=")
                c = Strings.InStr(LineOfText, ")")
                v1 = Strings.Mid(LineOfText, a + 1, b - a - 1)
                v2 = Strings.Mid(LineOfText, b + 2, c - b - 2)
                TextArray(i) = Strings.Left(TextArray(i), 5) + "       cmp " + v1 + "," + v2 + " wz"
            End If
            i += 1
        Loop Until TextArray(i) = "End File"


        ' do all "a=b;" lines - check not an if statement
        i = 0
        Do
            If Strings.InStr(TextArray(i), "=") <> 0 And Strings.Mid(TextArray(i), 7, 3) <> "if(" Then
                LineOfText = TextArray(i)
                a = 7
                b = Strings.InStr(LineOfText, "=")
                c = Strings.InStr(LineOfText, ";")
                v1 = Strings.Mid(LineOfText, a, b - a) ' variable 1
                v2 = Strings.Mid(LineOfText, b + 1, c - b - 1) ' variable 2
                TextArray(i) = Strings.Left(TextArray(i), 5) + "       mov " + v1 + "," + v2
            End If
            i += 1
        Loop Until TextArray(i) = "End File"


        ' convert back to the rich text box and remove redundant right brackets
        i = 0
        RichTextBox2.Text = ""
        Do
            If Strings.Mid(TextArray(i), 7, 12) <> "BracketRight" Then
                RichTextBox2.AppendText(TextArray(i) + vbCrLf)
            End If
            i += 1
        Loop Until TextArray(i) = "End File"
    End Sub

    Private Sub ReplaceRightBracket(ByVal TextArray() As String, ByVal LeftBracket As String, ByVal ReplaceString As String, ByVal i As Integer)
        ' take the value of the leftbracket, search through and replace the same rightbracket with replacestring
        Dim j As Integer
        ' find the matching right bracket number
        j = 0
        While Strings.Mid(TextArray(j), 7) <> "BracketRight" + Strings.Mid(LeftBracket, 18) And TextArray(j) <> "End File"
            j += 1 ' j contains the line with the right bracket
        End While
        ' now increment j until a real line of code (skip through this bracket and any further right brackets)
        While Strings.Mid(TextArray(j), 7, 12) = "BracketRight"
            j += 1
        End While
        TextArray(i + 1) = Strings.Left(TextArray(i + 1), 6) + ReplaceString + Strings.Left(TextArray(j), 5)
    End Sub
End Class
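
The bracket-labelling pass in ConvertCtoPasm is the part that makes nested ifs work: each "{" gets a fresh number, and a small stack hands the same number back to the matching "}". Here is a minimal sketch of just that step in C, assuming as the VB code does that every brace sits on its own line. The name `label_braces`, the `ids` encoding and the depth limit are illustrative choices, not part of the program above.

```c
#include <stddef.h>

#define MAX_DEPTH 20   /* same nesting limit as BracketStack(20) above */

/* For each source line, set ids[i] to +k if the line opens the k-th
   brace, -k if it closes brace k, and 0 for any other line.
   Returns 0 on success, -1 if the braces are unbalanced or too deep. */
int label_braces(const char *const lines[], size_t n, int ids[])
{
    int stack[MAX_DEPTH];
    int depth = 0;   /* number of currently open braces */
    int next = 1;    /* number to give the next "{"     */

    for (size_t i = 0; i < n; i++) {
        const char *p = lines[i];
        while (*p == ' ' || *p == '\t')   /* skip leading whitespace */
            p++;
        if (*p == '{') {
            if (depth == MAX_DEPTH)
                return -1;
            stack[depth++] = next;        /* remember which brace is open */
            ids[i] = next++;
        } else if (*p == '}') {
            if (depth == 0)
                return -1;
            ids[i] = -stack[--depth];     /* pair with the innermost open brace */
        } else {
            ids[i] = 0;
        }
    }
    return depth == 0 ? 0 : -1;
}
```

For the thirteen-line test program above, the two nested braces come out as 1 and 2 and close in the order 2, 1, and the brace after "if (k != g)" gets 3.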

tonyp12 · 2011-09-18 09:38

L0008 cmp k,g wz
L0009 if_z jmp #L0012
L0010 mov e,f

Could it be optimized to produce this shorter, faster version?
L0008 cmp k,g wz
L0009 if_nz mov e,f
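
One way such a rewrite could be done is as a peephole pass over the generated lines: whenever a conditional jmp skips exactly one unconditional instruction, drop the jmp and put the inverted condition on that instruction. The sketch below is an assumption about how that might look, not part of the posted translator; the `Insn` layout and the name `fold_skips` are invented for the example, and only if_z and if_nz are handled.

```c
#include <string.h>
#include <stddef.h>

typedef struct {
    int label;         /* numeric part of the line label, e.g. 9 for L0009 */
    const char *cond;  /* "" for unconditional, else "if_z" or "if_nz"     */
    const char *op;    /* instruction text, e.g. "mov e,f" or "jmp"        */
    int target;        /* label jumped to, or -1 if not a jump             */
} Insn;

/* Return the opposite condition, or NULL if cond is not one we handle. */
static const char *invert(const char *cond)
{
    if (strcmp(cond, "if_z") == 0)  return "if_nz";
    if (strcmp(cond, "if_nz") == 0) return "if_z";
    return NULL;
}

/* True if any instruction in the input jumps to the given label. */
static int is_targeted(const Insn *in, int n, int label)
{
    for (int i = 0; i < n; i++)
        if (in[i].target == label)
            return 1;
    return 0;
}

/* Copy in[0..n) to out, folding each "conditional jmp over a single
   unconditional instruction" into one predicated instruction.
   Returns the number of instructions written to out. */
int fold_skips(const Insn *in, int n, Insn *out)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        const char *inv = in[i].target >= 0 ? invert(in[i].cond) : NULL;
        if (inv != NULL && i + 2 < n
            && in[i].target == in[i + 2].label          /* skips exactly one line */
            && in[i + 1].cond[0] == '\0'                /* which is unconditional */
            && !is_targeted(in, n, in[i + 1].label)) {  /* and nobody jumps to it */
            Insn folded = in[i + 1];
            folded.cond = inv;
            folded.label = in[i].label;  /* keep the jmp's label for other jumps */
            out[count++] = folded;
            i++;                         /* the skipped instruction is consumed */
        } else {
            out[count++] = in[i];
        }
    }
    return count;
}
```

Applied to the L0008 to L0012 lines above, this yields "cmp k,g wz" followed by "L0009 if_nz mov e,f". In the nested example it would fold only the inner test (L0003 and L0004), because the outer jmp skips more than one instruction.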
