Catalina 2.9

David Betz · 2011-03-17 19:49

RossH wrote: »

That is odd. Can you try a different SD Card? If possible, try one that has just been formatted with a cluster size of 32k and an MBR?

Ross.

Hmmm... This is even more odd. I took a different SD card and formatted it for FAT/32k and wrote test.bas on it. Now when I type "load test.bas" I get an infinite string of error messages complaining about an unknown character code. It seems it can mount the card and open the file but it reads garbage from the file. I put the card back into my PC and verified that the data was written correctly to the card.

RossH · 2011-03-17 20:00

I agree, this is getting more worrying.

Could be a timing problem in my SD card handling, or it could be a problem with your C3.

Have you successfully used the SD card on your C3 for anything else?

Ross.

David Betz · 2011-03-17 20:02

RossH wrote: »

I agree, this is getting more worrying.

Could be a timing problem in my SD card handling, or it could be a problem with your C3.

Have you successfully used the SD card on your C3 for anything else?

Ross.

Both cards work fine with xbasic compiled with ZOG running on the same C3.

RossH · 2011-03-17 20:15

David Betz wrote: »

Both cards work fine with xbasic compiled with ZOG running on the same C3.

Okay. Zog presumably uses the SD access code in the original caching driver, whereas Catalina uses separate SD code, and needs to synchronize the two.

I've run lots of programs that use the SD card and not seen a problem, but a possible explanation is that the timing of SD cards varies and my code just happens to work with the card I use.

I'll try a few different cards tonight and see if I can get it to fail.

Ross.

David Betz · 2011-03-17 20:20

Could you send me the xbasic.binary file that is working for you? Maybe I have some problem with the way I set up the Catalina toolchain.

Thanks,
David

RossH · 2011-03-17 20:35

David Betz wrote: »

Could you send me the xbasic.binary file that is working for you? Maybe I have some problem with the way I set up the Catalina toolchain.

Thanks,
David

Okay - I'll do that when I get home.

Ross.

jazzed · 2011-03-17 20:37

Nice to see that you've integrated the new homespun features.

RossH · 2011-03-17 22:07

jazzed wrote: »

Nice to see that you've integrated the new homespun features.

Yes, homespun's new features allow me to simplify a lot of things internally. I am trying to make 3.0 a lot "cleaner" than previous versions. I still have some way to go!

Ross.

RossH · 2011-03-18 02:16

David,

I think I have identified (but not solved) the problem. It seems to be a problem with the integer version of the library. When I compile with -lcx everything seems to work. When I compile with -lcix it resets the C3 when you try and do a "load".

Attached is a binary (compiled with libcx) that works on my C3. Here are the makefile options:

CFLAGS=-x3
LDFLAGS=$(CFLAGS) -DC3 -M256k -DCACHED -lcx -y

Can you confirm this works on your C3? Note you will need to use a version of xmm.binary compiled with both the C3 and CACHED options.

Ross.

RossH · 2011-03-18 02:39

David, Jazzed ...

Well, I just rebuilt all the libraries, and the problem seems to have disappeared - i.e. xbasic now works whether compiled with -lcix or -lcx.

In my 3.0 pre-release "upgrade" I tried to include only the changed library files, but I must have done something wrong. I have emailed you both a complete new compiled set of library files which you should unzip over the upgrade. Or you can just rebuild the library from source.

David, let me know if this fixes your problem,

Ross,

David Betz · 2011-03-18 04:32

RossH wrote: »

David, Jazzed ...

Well, I just rebuilt all the libraries, and the problem seems to have disappeared - i.e. xbasic now works whether compiled with -lcix or -lcx.

In my 3.0 pre-release "upgrade" I tried to include only the changed library files, but I must have done something wrong. I have emailed you both a complete new compiled set of library files which you should unzip over the upgrade. Or you can just rebuild the library from source.

David, let me know if this fixes your problem,

Ross,

Yes, the new libraries fix my SD card problem. Thanks!

The Catalina version of xbasic compiles a lot faster than the ZOG version but takes far longer to load. One reason, of course, is that it is almost twice as big but I think payload is also slower than zogload. I wonder if I could modify zogload to load Catalina binaries? As far as execution speed goes, both are very slow running entirely from external memory but they are close enough that I'll have to use a stopwatch to determine which is faster. Anyway, thanks for the help with Catalina!

Dr_Acula · 2011-03-18 06:05

Loadable COG code - two options now working:
1) Compile cog code to an array, include the array in the C program (ends up in external memory), then load and run OR
2) Compile cog code to a binary file, rename to a .cog file, download to sd card, load into external ram, load and run

I need to tidy this up and have just one "serial_start". Probably with two functions - one loads the cog from the array in hub, and one loads it from the sd card. There are merits in both systems so all the differences between the two systems ought to come down to calling one load function or the other.

But it does work, and this is pretty cool!

Where do you think a logical place would be for the array that contains the cog code - either at the beginning of the program all in a group, or would you put each array after its associated pasm code? I can see merits in both options so I'm not sure about that.
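
One way to get down to a single serial_start is to pass the loader in as a function pointer, so the parameter block is filled in only once. This is a minimal sketch that can be tried on a PC, not a drop-in replacement: cog_loader, serial_start_with and the two stub loaders are hypothetical names, the common loader signature is an assumption, and the stubs only record which path ran instead of calling _coginit.

```c
#include <stddef.h>

/* Hypothetical common signature shared by both loaders. */
typedef void (*cog_loader)(int cognumber, const unsigned long *image,
                           unsigned long *par);

static int last_loader;   /* demo only: 1 = hub array, 2 = external RAM */

/* Stand-in for mycogject(): would start the cog straight from the hub array. */
static void load_from_array(int cognumber, const unsigned long *image,
                            unsigned long *par)
{
    (void)cognumber; (void)image; (void)par;
    last_loader = 1;
}

/* Stand-in for external_cog_load(): would copy to hub first, then start the cog. */
static void load_from_external(int cognumber, const unsigned long *image,
                               unsigned long *par)
{
    (void)cognumber; (void)image; (void)par;
    last_loader = 2;
}

/* Single start routine: fills the parameter block once, then calls the chosen loader. */
static void serial_start_with(cog_loader load, const unsigned long *image,
                              unsigned long rxpin, unsigned long txpin,
                              unsigned long mode, unsigned long bit_ticks,
                              unsigned long par[])
{
    int i;
    for (i = 0; i < 4; i++)
        par[i] = 0;                           /* rx_head, rx_tail, tx_head, tx_tail */
    par[4] = rxpin;
    par[5] = txpin;
    par[6] = mode;
    par[7] = bit_ticks;
    par[8] = (unsigned long)(size_t)&par[9];  /* buffer_ptr: start of the rx buffer */
    load(7, image, par);                      /* cog 7, as in the listing */
}
```

With this shape the only difference between the two systems is which loader is passed in, for example serial_start_with(load_from_external, external_serial, 31, 30, 0, bit_ticks, par).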

/* PASM demo for use with Catalina IDE Precompiler and Compiler */

#include <stdio.h>
#include <stdlib.h>							// exit()

	unsigned long external_serial[511];					// external memory store for .cog file


/* PASM Code for compilation using SpinC and inclusion as a data file
PASM Start mycogject.spin
CON
                                                                     ' add your values here
  _clkfreq = 80_000_000                                              ' 5Mhz Crystal
  _clkmode = xtal1 + pll16x                                          ' x 16
                                                                     ' Start of C hub constants
                                                                     ' End of C constants
PUB Main
    coginit(1,@entry,0)                                           ' cog 1, cogstart, dummy value

DAT

'***********************************
'* Assembly language serial driver *
'***********************************

                        org
'
'
' Entry
'
entry                   mov     t1,par                'get structure address
                        add     t1,#4 << 2            'skip past heads and tails

                        rdlong  t2,t1                 'get rx_pin
                        mov     rxmask,#1
                        shl     rxmask,t2

                        add     t1,#4                 'get tx_pin
                        rdlong  t2,t1
                        mov     txmask,#1
                        shl     txmask,t2

                        add     t1,#4                 'get rxtx_mode
                        rdlong  rxtxmode,t1

                        add     t1,#4                 'get bit_ticks
                        rdlong  bitticks,t1

                        add     t1,#4                 'get buffer_ptr
                        rdlong  rxbuff,t1
                        mov     txbuff,rxbuff
                        add     txbuff,#16

                        test    rxtxmode,#%100  wz    'init tx pin according to mode
                        test    rxtxmode,#%010  wc
        if_z_ne_c       or      outa,txmask
        if_z            or      dira,txmask

                        mov     txcode,#transmit      'initialize ping-pong multitasking
'
'
' Receive
'
receive                 jmpret  rxcode,txcode         'run a chunk of transmit code, then return

                        test    rxtxmode,#%001  wz    'wait for start bit on rx pin
                        test    rxmask,ina      wc
        if_z_eq_c       jmp     #receive

                        mov     rxbits,#9             'ready to receive byte
                        mov     rxcnt,bitticks
                        shr     rxcnt,#1
                        add     rxcnt,cnt                          

:bit                    add     rxcnt,bitticks        'ready next bit period

:wait                   jmpret  rxcode,txcode         'run a chunk of transmit code, then return

                        mov     t1,rxcnt              'check if bit receive period done
                        sub     t1,cnt
                        cmps    t1,#0           wc
        if_nc           jmp     #:wait

                        test    rxmask,ina      wc    'receive bit on rx pin
                        rcr     rxdata,#1
                        djnz    rxbits,#:bit

                        shr     rxdata,#32-9          'justify and trim received byte
                        and     rxdata,#$FF
                        test    rxtxmode,#%001  wz    'if rx inverted, invert byte
        if_nz           xor     rxdata,#$FF

                        rdlong  t2,par                'save received byte and inc head
                        add     t2,rxbuff
                        wrbyte  rxdata,t2
                        sub     t2,rxbuff
                        add     t2,#1
                        and     t2,#$0F
                        wrlong  t2,par

                        jmp     #receive              'byte done, receive next byte
'
'
' Transmit
'
transmit                jmpret  txcode,rxcode         'run a chunk of receive code, then return

                        mov     t1,par                'check for head <> tail
                        add     t1,#2 << 2
                        rdlong  t2,t1
                        add     t1,#1 << 2
                        rdlong  t3,t1
                        cmp     t2,t3           wz
        if_z            jmp     #transmit

                        add     t3,txbuff             'get byte and inc tail
                        rdbyte  txdata,t3
                        sub     t3,txbuff
                        add     t3,#1
                        and     t3,#$0F
                        wrlong  t3,t1

                        or      txdata,#$100          'ready byte to transmit
                        shl     txdata,#2
                        or      txdata,#1
                        mov     txbits,#11
                        mov     txcnt,cnt

:bit                    test    rxtxmode,#%100  wz    'output bit on tx pin according to mode
                        test    rxtxmode,#%010  wc
        if_z_and_c      xor     txdata,#1
                        shr     txdata,#1       wc
        if_z            muxc    outa,txmask        
        if_nz           muxnc   dira,txmask
                        add     txcnt,bitticks        'ready next cnt

:wait                   jmpret  txcode,rxcode         'run a chunk of receive code, then return

                        mov     t1,txcnt              'check if bit transmit period done
                        sub     t1,cnt
                        cmps    t1,#0           wc
        if_nc           jmp     #:wait

                        djnz    txbits,#:bit          'another bit to transmit?

                        jmp     #transmit             'byte done, transmit next byte
'
'
' Uninitialized data
'
t1                      res     1
t2                      res     1
t3                      res     1

rxtxmode                res     1
bitticks                res     1

rxmask                  res     1
rxbuff                  res     1
rxdata                  res     1
rxbits                  res     1
rxcnt                   res     1
rxcode                  res     1

txmask                  res     1
txbuff                  res     1
txdata                  res     1
txbits                  res     1
txcnt                   res     1
txcode                  res     1
PASM End
*/ 

void mycogject(int cognumber, unsigned long *parameters_array)                                        // this name copied from the .spin name in the pasm section - names must match eg void mycogject matches mycogject.spin. Also first code after this must be the .h array file. Put your code after the };
{
       /** 
        * @file mycogject_array.h
        * Created with spin.binary PASM to C Array Converter.
        * Copyright (c) 2011, John Doe
        */
       unsigned long mycogject_array[] =
       {
           0xa0bca9f0, 0x80fca810, 0x08bcaa54, 0xa0fcb201, 
           0x2cbcb255, 0x80fca804, 0x08bcaa54, 0xa0fcbe01, 
           0x2cbcbe55, 0x80fca804, 0x08bcae54, 0x80fca804, 
           0x08bcb054, 0x80fca804, 0x08bcb454, 0xa0bcc05a, 
           0x80fcc010, 0x627cae04, 0x617cae02, 0x689be85f, 
           0x68abec5f, 0xa0fcc833, 0x5cbcbc64, 0x627cae01, 
           0x613cb3f2, 0x5c640016, 0xa0fcb809, 0xa0bcba58, 
           0x28fcba01, 0x80bcbbf1, 0x80bcba58, 0x5cbcbc64, 
           0xa0bca85d, 0x84bca9f1, 0xc17ca800, 0x5c4c001f, 
           0x613cb3f2, 0x30fcb601, 0xe4fcb81e, 0x28fcb617, 
           0x60fcb6ff, 0x627cae01, 0x6cd4b6ff, 0x08bcabf0, 
           0x80bcaa5a, 0x003cb655, 0x84bcaa5a, 0x80fcaa01, 
           0x60fcaa0f, 0x083cabf0, 0x5c7c0016, 0x5cbcc85e, 
           0xa0bca9f0, 0x80fca808, 0x08bcaa54, 0x80fca804, 
           0x08bcac54, 0x863caa56, 0x5c680033, 0x80bcac60, 
           0x00bcc256, 0x84bcac60, 0x80fcac01, 0x60fcac0f, 
           0x083cac54, 0x68fcc300, 0x2cfcc202, 0x68fcc201, 
           0xa0fcc40b, 0xa0bcc7f1, 0x627cae04, 0x617cae02, 
           0x6ce0c201, 0x29fcc201, 0x70abe85f, 0x7497ec5f, 
           0x80bcc658, 0x5cbcc85e, 0xa0bca863, 0x84bca9f1, 
           0xc17ca800, 0x5c4c004d, 0xe4fcc446, 0x5c7c0033
       };

	printf("first long is %u \n",mycogject_array[0]);
       _coginit((int)parameters_array>>2, (int)mycogject_array>>2, cognumber);  // array name built from spin file name
}

void external_cog_load(int cognumber, unsigned long cogdata[], unsigned long *parameters_array)    	//  load a cog from external memory
{
	unsigned long hubcog[511];			
	int i;	
	for(i=0;i<511;i++)								// hubcog and the source array each hold 511 longs
	{
		hubcog[i]=cogdata[i];					// move from external memory to a local array in hub
	}
 	_coginit((int)parameters_array>>2, (int)hubcog>>2, cognumber);		// load the cog
}                                  

void clearscreen()                                                   // white text on dark blue background
{
       int i;
       for (i=0;i<40;i++)
       {
               t_setpos(0,0,i);                                      // move cursor to next line
               t_color(0,0x08FC);                                    // RRGGBBxx eg dark blue background 00001000 white text 11111100
       }
}

void sleep(int milliseconds)                                         // sleep function
{
       _waitcnt(_cnt()+(milliseconds*(_clockfreq()/1000))-4296);
}

char peek(int address)                                               // function implementation of peek
{
       return *((char *)address);
}

void poke(int address, char value)                                   // function implementation of poke
{
       *((char *)address) = value;
}

unsigned long serial_start(unsigned long rxpin,unsigned long txpin,unsigned long mode, unsigned long baudrate, unsigned long par[])
{
/*
PUB start(rxpin, txpin, mode, baudrate) : okay

'' Start serial driver - starts a cog
'' returns false if no cog available
''
'' mode bit 0 = invert rx
'' mode bit 1 = invert tx
'' mode bit 2 = open-drain/source tx
'' mode bit 3 = ignore tx echo on rx

  stop
  longfill(@rx_head, 0, 4)
  longmove(@rx_pin, @rxpin, 3)
  bit_ticks := clkfreq / baudrate
  buffer_ptr := @rx_buffer
  okay := cog := cognew(@entry, @rx_head) + 1
*/

	unsigned long okay = 0;				// the cog result is not captured, so return a defined value
	unsigned long bit_ticks;
	unsigned long buffer_ptr;
	par[0] = 0;						// rx_head   longfill(@rx_head, 0, 4)
	par[1] = 0;						// rx_tail
	par[2] = 0;						// tx_head
	par[3] = 0;						// tx_tail
	par[4] = rxpin;					//   longmove(@rx_pin, @rxpin, 3)
	par[5] = txpin;					// note - if rewrite the pasm code could save a couple of hub longs here
	par[6] = mode;					// as rxpin and txpin are not used anywhere else
	bit_ticks = _clockfreq() / baudrate;   		//   bit_ticks := clkfreq / baudrate
	par[7] = bit_ticks;
	buffer_ptr = (unsigned long)&par[9];		//   buffer_ptr := @rx_buffer  points to start of circular buffer
	par[8] = buffer_ptr;					// pointer to the start of the circular buffers
								// rx buffer is 9 to 12 and tx buffer is 13 to 16 (16 bytes =4 longs)
	mycogject(7,par);					// pass the packaged up array
	// okay returns the cog number or -1 if a fail page 119 manual. Ignored here
	printf("par array is at %u \n",(unsigned long)&par[0]);
	printf("par array entry 1 is at %u \n",(unsigned long)&par[1]);
	printf("par array entry 7 is at %u \n",(unsigned long)&par[7]);
	printf("rx_head is at %u \n",(unsigned long)&par[9]);
	printf("buffer_ptr is %u \n",par[8]);
	return okay;
}

unsigned long new_serial_start(unsigned long rxpin,unsigned long txpin,unsigned long mode, unsigned long baudrate, unsigned long par[], unsigned long cogdata[])
{
/*
PUB start(rxpin, txpin, mode, baudrate) : okay

'' Start serial driver - starts a cog
'' returns false if no cog available
''
'' mode bit 0 = invert rx
'' mode bit 1 = invert tx
'' mode bit 2 = open-drain/source tx
'' mode bit 3 = ignore tx echo on rx

  stop
  longfill(@rx_head, 0, 4)
  longmove(@rx_pin, @rxpin, 3)
  bit_ticks := clkfreq / baudrate
  buffer_ptr := @rx_buffer
  okay := cog := cognew(@entry, @rx_head) + 1
*/

	unsigned long okay = 0;				// the cog result is not captured, so return a defined value
	unsigned long bit_ticks;
	unsigned long buffer_ptr;
	par[0] = 0;						// rx_head   longfill(@rx_head, 0, 4)
	par[1] = 0;						// rx_tail
	par[2] = 0;						// tx_head
	par[3] = 0;						// tx_tail
	par[4] = rxpin;					//   longmove(@rx_pin, @rxpin, 3)
	par[5] = txpin;					// note - if rewrite the pasm code could save a couple of hub longs here
	par[6] = mode;					// as rxpin and txpin are not used anywhere else
	bit_ticks = _clockfreq() / baudrate;   		//   bit_ticks := clkfreq / baudrate
	par[7] = bit_ticks;
	buffer_ptr = (unsigned long)&par[9];		//   buffer_ptr := @rx_buffer  points to start of circular buffer
	par[8] = buffer_ptr;					// pointer to the start of the circular buffers
								// rx buffer is 9 to 12 and tx buffer is 13 to 16 (16 bytes =4 longs)
	//mycogject(7,par);					// pass the packaged up array
	external_cog_load(7,cogdata,par);			// load from external ram
	// okay returns the cog number or -1 if a fail page 119 manual. Ignored here
	printf("par array is at %u \n",(unsigned long)&par[0]);
	printf("par array entry 1 is at %u \n",(unsigned long)&par[1]);
	printf("par array entry 7 is at %u \n",(unsigned long)&par[7]);
	printf("rx_head is at %u \n",(unsigned long)&par[9]);
	printf("buffer_ptr is %u \n",par[8]);
	return okay;
}



void serial_tx(char tx,unsigned long par[])
{
/*
PUB tx(txbyte)
'' Send byte (may wait for room in buffer)
  repeat until (tx_tail <> (tx_head + 1) & $F)
  tx_buffer[tx_head] := txbyte
  tx_head := (tx_head + 1) & $F
  if rxtx_mode & %1000
    rx
*/
	unsigned long tx_head;
	int address;
	while ( par[3] == ((par[2] + 1 ) & 0xF)) {} 	// wait for room: par[2] is tx_head, par[3] is tx_tail (== binds tighter than &)
	tx_head = par[2];				// get the head value
	address = par[8] + 16 + tx_head;		// location of rx buffer plus 16 to get tx buffer plus the head value
	poke(address,tx);				// poke the tx byte value to hub ram
	tx_head = tx_head + 1;			// add one
	tx_head = tx_head & 0xF; 			// logical and with 15
	par[2] = tx_head;				// store it back again
							// need to add the echo mode?
}

unsigned long serial_rxcheck(unsigned long par[])
{
/*
PUB rxcheck : rxbyte
'' Check if byte received (never waits)
'' returns -1 if no byte received, $00..$FF if byte
  rxbyte--
  if rx_tail <> rx_head
    rxbyte := rx_buffer[rx_tail]
    rx_tail := (rx_tail + 1) & $F
*/
	unsigned long rxbyte;			// actually is a long, so can return -1 FFFFFFFF if nothing and 0-FF if a byte
	int address;					// hub address
	rxbyte = 0;					// set explicitly to zero
	rxbyte = rxbyte - 1;				// return ffffffff if nothing
	if (par[1] != par[0])
	{
		address = par[8] + par[1];		// par[8] is the rx buffer, par[1] is rx_tail
		rxbyte = (unsigned char)peek(address);	// get the return byte from the buffer; the cast stops $80..$FF sign-extending
		par[1] = (par[1] +1) & 0xF;		// add one to tail
	}
	return rxbyte;
}

unsigned long serial_rx(unsigned long par[])
{
/*
PUB rx : rxbyte
'' Receive byte (may wait for byte)
'' returns $00..$FF
  repeat while (rxbyte := rxcheck) < 0	
*/
	unsigned long rxbyte;			// actually is a long, not a byte
	while ((rxbyte = serial_rxcheck(par)) == -1) {} // 0xffffffff and -1 works, but " < 0" gives a compiler error
	return rxbyte;				// return the value
}

int EoF (FILE* stream)
{
  	register int c, status = ((c = fgetc(stream)) == EOF);
  	ungetc(c,stream);
  	return status;
}

void readcog(char *filename,unsigned long external_cog[])			// read in a .cog file into external memory array 
{
	int i;
	FILE *FP1;
	i = 0;
	if((FP1=fopen(filename,"rb"))==0)					// open the file
   	{
  		fprintf(stderr,"Can't open file %s\n",filename);
		exit(1);
   	}
  	fseek(FP1,0,0);
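  	while(!EoF(FP1) && (i<511))							// run until end of file or the array is full
	{
		unsigned long b0, b1, b2, b3;					// read the bytes in separate statements: C does not
		b0 = getc(FP1);							// define the order in which operands of | are evaluated
		b1 = getc(FP1);
		b2 = getc(FP1);
		b3 = getc(FP1);
		external_cog[i] = b0 | (b1<<8) | (b2<<16) | (b3<<24);	// assemble the little-endian long
		// printf("%u ",external_cog[i]);
		i+=1;
	}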
	if(FP1)
       {
     		fclose(FP1);							// close the file
     		FP1=NULL;
   	}
	printf("cog data 0 = %lu \n",external_cog[0]);
}


void main ()
{
	int i;
	unsigned long received_byte;
	unsigned long serial_parameters[17];				// reserve hub space: 9 control longs plus two 16-byte buffers (par[9] to par[16])
       clearscreen();
       printf("Clock speed %u \n",_clockfreq());                     // see page 28 of the propeller manual for other useful commands
       printf("Catalina running in cog number %i \n",_cogid());      // integer
	readcog("serial.cog",external_serial);				// read in serial.cog to external memory
//	serial_start(31,30,0,1200,serial_parameters);			// start serial cog pins 31,30, mode 0, 1200 baud
	new_serial_start(31,30,0,1200,serial_parameters,external_serial);			// start serial cog pins 31,30, mode 0, 1200 baud
       printf("Started serial driver\n");
	for(i=0; i<10; i++)
	{
		serial_tx(65+i,serial_parameters); 					// test sending a byte
		sleep(500);
		printf("send byte %u \n",65+i);
	}
/*
	printf("type some characters, will return that character plus 1 \n");
	for (i=0;i<19;i++)							// test 19 times, so tests buffer restarting
	{
		received_byte = serial_rx(serial_parameters);		// get a byte
		received_byte = received_byte + 1;				// add one and send it back
		serial_tx(received_byte,serial_parameters);
		printf("sent back byte %u \n",received_byte);
	}
*/
	printf("program finished \n");
	while (1); // Prop reboots on exit from main()
}
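
The head/tail arithmetic that serial_tx and serial_rxcheck rely on can be tried on a PC without a Propeller. The following is a sketch under stated assumptions: struct ring, ring_put and ring_get are hypothetical names, and the buffer lives in an ordinary struct rather than in the par array shared with the cog. It shows the same 16-byte circular buffer rules as the Spin comments: empty when head equals tail, full when tail equals (head + 1) & $F.

```c
#define RING_SIZE 16                     /* must be a power of two */
#define RING_MASK (RING_SIZE - 1)

struct ring {
    unsigned long head;                  /* next slot to write */
    unsigned long tail;                  /* next slot to read  */
    unsigned char data[RING_SIZE];
};

/* Returns 1 if the byte was queued, 0 if the buffer is full. */
static int ring_put(struct ring *r, unsigned char b)
{
    /* The inner parentheses matter: in C, == binds tighter than &. */
    if (r->tail == ((r->head + 1) & RING_MASK))
        return 0;
    r->data[r->head] = b;
    r->head = (r->head + 1) & RING_MASK;
    return 1;
}

/* Returns the byte (0..255), or -1 if the buffer is empty. */
static int ring_get(struct ring *r)
{
    int b;
    if (r->tail == r->head)
        return -1;
    b = r->data[r->tail];
    r->tail = (r->tail + 1) & RING_MASK;
    return b;
}
```

Because one slot is always left unused to tell full from empty, a 16-byte buffer holds at most 15 bytes at a time.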

David Betz · 2011-03-18 14:22

I just timed running my xbasic test program on the C3 under both Catalina C and GCC/ZOG. Here are the results:

Catalina: 12 seconds
GCC/ZOG: 17 seconds

This isn't a completely fair comparison because the Catalina version was using the hires TV driver and the ZOG version was using the lowres TV driver. For some reason the hires driver isn't working under ZOG at the moment.

Of course, both of these results are horrible. This is a fairly simple program and shouldn't take this long to compile into bytecodes and run. I need to test both C compilers with data in SRAM and stack in hub memory. That should speed up execution significantly. Unfortunately, neither C compiler supports that mode yet. I could also try data/stack in hub. That mode is supported in Catalina and could probably be added to ZOG fairly easily. It isn't done yet though so the test will have to wait.

In case you're interested, here is my test program:

10 dim z$(3) = { "foo", "bar", "silly" }
20 dim q$ = "abc"
30 def greeting$ = "Hello, world!"

40 testargs(greeting$, 42)

50 for x = 1 to 10
60   print x, square(x)
70 next x

80 r$ = "abd"
90 s$ = "cdf"
100 t$ = "cde"
110 if q$ < r$ then print "ok"
120 if s$ < t$ then print "bad"
130 print "q$ is '"; q$; "'"
140 print "length of q$ is "; len(q$)
150 print "first character of q$ is "; asc(q$)

160 for x = 0 to 2
170   print z$(x)
180 next x

190 print left$(greeting$, 5)
200 print right$(greeting$, 6)
210 print mid$(greeting$, 2, 6)
211 xx = 4
212 yy = 3
215 print "4 - 3 = " + str$(xx - yy)

220 def square(n)
230   return n * n
240 end def

250 def testargs(a$, i)
260   print a$, i
270 end def

Dr_Acula · 2011-03-18 15:11

That code looks interesting David. I wonder what a C version would run at?

Some musings re self contained cog programs:

1) Define the cog arrays at the beginning of the program, e.g. unsigned long serial_cog[511], followed by all the data. The data corresponds to the pasm code. If you want to load a .cog off an sd card, it will overwrite this array with the sd card data. If you want to use the pure sd card system, don't put any data in this array (and you could leave out the pasm part too once .cog objects become standardised enough).

2) The equivalent of adding Spin objects I think becomes a merge of two files. Merge all the .cog arrays at the beginning so they end up one after the other. Then merge all the pasm comment code, one after the other. Then merge all the functions, one after the other. Then merge the main programs. This could be a command line program - two source files and a destination file. Then repeat for more objects. I think this might allow the release of C objects for use in a non object oriented language (and this ought to work the same for BCX basic and maybe PropBasic as well?). The merge program would need to know the start and end of the blocks but this is pretty easy with comment lines.
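
Finding the blocks by their comment lines could look like the sketch below. It is only an illustration of the idea: extract_block is a hypothetical helper, it works on a source file already read into a string, and it assumes the end marker sits at the start of its own line, as "PASM End" does in the listing earlier in the thread.

```c
#include <string.h>

/*
 * Copies the text that follows the line containing start_marker, up to (not
 * including) the next occurrence of end_marker, into out.
 * Returns the number of bytes copied, or -1 if a marker is missing or out
 * is too small.
 */
static long extract_block(const char *src, const char *start_marker,
                          const char *end_marker, char *out, size_t out_size)
{
    const char *s = strstr(src, start_marker);
    const char *body;
    const char *e;
    size_t len;

    if (s == NULL)
        return -1;
    body = strchr(s, '\n');              /* skip the rest of the marker line */
    if (body == NULL)
        return -1;
    body++;
    e = strstr(body, end_marker);
    if (e == NULL)
        return -1;
    len = (size_t)(e - body);
    if (len + 1 > out_size)
        return -1;
    memcpy(out, body, len);
    out[len] = '\0';
    return (long)len;
}
```

A merge tool could call this once per block type on each source file and write the pieces to the destination file in the order described above: arrays, then PASM comments, then functions, then main.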

David Betz · 2011-03-18 15:24

I just tried building xbasic with -x4 to put the data in hub memory and the resulting binary didn't work. Nothing appeared on the TV when I loaded it. I also tried -DPC to use the serial HMI and got no response from the serial port. Is there anything else I need to do to build for this memory model other than using -x4?

David Betz · 2011-03-18 17:37

Ross,

I think you said at one point that you had not used my combined C3 SPI driver for Catalina but instead used separate drivers for SRAM/flash and the SD card. How do you coordinate between those two drivers? I'm wondering what I would have to do to use the SPI SRAM in my C code if I'm using the -x4 memory model that doesn't use it for Catalina itself.

Thanks,
David

RossH · 2011-03-18 18:35

David,

I just compiled xbasic to use the -x4 mode. You have to remember that with everything except code in Hub RAM, you probably cannot afford to dedicate 8K just as cache. I used a 2K cache and it works. You may be able to use a 4K cache - I didn't try that.

Here are the make options I used:

CFLAGS=-x4
LDFLAGS=$(CFLAGS) -DC3 -M256k -DCACHED_2K -D PC -DCR_ON_LF -lcix -y

Also note that you need to recompile the "utilities" folder using a command like:

build C3 CACHED_2K

I used the Parallax Serial Terminal program and everything seems to work ok. It doesn't seem to be any faster, although it's difficult to tell with this program - you need to come up with a more compute-intensive basic program to compare the two - all this program does is a small amount of I/O and I wouldn't expect much difference there.

Ross.

RossH · 2011-03-18 18:39

David Betz wrote: »

I need to test both C compilers with data in SRAM and stack in hub memory. That should speed up execution significantly. Unfortunately, neither C compiler supports that mode yet.

Hi David,

You appear to be describing what Catalina does in the -x3 mode - i.e. all the code is in SPI Flash. All the global data (and the heap) is in SPI RAM. The stack and all local data is in Hub.

Ross.

RossH · 2011-03-18 18:41

Hi Dr_A,

I'm impressed with what you've achieved so far - I hope you will keep it up. This weekend I have to spend working on school projects with my kids - but I hope next week to try out your IDE and try out some 'cogjects'!

Ross.

David Betz · 2011-03-18 18:41

RossH wrote: »

Hi David,

You appear to be describing what Catalina does in the -x3 mode - i.e. all the code is in SPI Flash. All the global data (and the heap) is in SPI RAM. The stack and all local data is in Hub.

Ross.

Yes, that's exactly what I meant. I'll have to try that. Thanks! Do I have to rebuild the utilities to try that memory model?

RossH · 2011-03-18 18:48

David Betz wrote: »

Ross,

I think you said at one point that you had not used my combined C3 SPI driver for Catalina but instead used separate drivers for SRAM/flash and the SD card. How do you coordinate between those two drivers? I'm wondering what I would have to do to use the SPI SRAM in my C code if I'm using the -x4 memory model that doesn't use it for Catalina itself.

Thanks,
David

David,

The coordination is done using the XMM API functions XMM_Activate and XMM_TriState (or equivalents). We are not really tristating any pins on the C3; the terminology is a hangover from a platform where this was in fact the case, and on the C3 we are instead assigning logical control of the SPI bus. The kernel has to make sure that it calls XMM_TriState before requesting a service from another cog that might need to use the SPI bus, and those cogs must do the equivalent of XMM_Activate to gain access to the SPI bus and XMM_TriState before returning control to the kernel.
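
The hand-over can be modelled on a PC as a simple ownership flag. This is only a sketch of the protocol as described here: bus_release, bus_acquire and kernel_request_sd_service are hypothetical stand-ins written for illustration, not the Catalina XMM API, and the real functions run inside the kernel and driver cogs.

```c
enum bus_owner { BUS_FREE, BUS_KERNEL, BUS_SD_DRIVER };

static enum bus_owner spi_owner = BUS_KERNEL;

/* Hypothetical stand-in for the "give up the bus" step (the XMM_TriState role). */
static void bus_release(enum bus_owner who)
{
    if (spi_owner == who)
        spi_owner = BUS_FREE;
}

/* Hypothetical stand-in for the "take the bus" step (the XMM_Activate role).
 * Returns 1 on success, 0 if somebody else still owns the bus. */
static int bus_acquire(enum bus_owner who)
{
    if (spi_owner != BUS_FREE)
        return 0;
    spi_owner = who;
    return 1;
}

/* Kernel side: give up the bus, let the driver work, then take the bus back. */
static int kernel_request_sd_service(void)
{
    int served = 0;
    bus_release(BUS_KERNEL);             /* kernel lets go before asking for service */
    if (bus_acquire(BUS_SD_DRIVER)) {    /* driver claims the bus */
        served = 1;                      /* the SD transfer would happen here */
        bus_release(BUS_SD_DRIVER);      /* driver lets go before returning */
    }
    bus_acquire(BUS_KERNEL);             /* kernel resumes XMM access */
    return served;
}
```

The point of the model is the ordering: at no time do two parties believe they own the SPI bus.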

To access the SPI RAM from C, I don't see why you don't just use the Catalina -x3 mode. That's what it's for.

Ross.

David Betz · 2011-03-18 19:02

RossH wrote: »

To access the SPI RAM from C, I don't see why you don't just use the Catalina -x3 mode. That's what it's for.

Because it's very slow. I figured my code would run faster if I put the code in flash, data in hub memory and just used the SRAM as a buffer for the editor and for scratch space for the compiler.

RossH · 2011-03-18 21:28

David Betz wrote: »

Because it's very slow. I figured my code would run faster if I put the code in flash, data in hub memory and just used the SRAM as a buffer for the editor and for scratch space for the compiler.

When you use the -x3 mode, that's essentially what you are doing. If you want data in Hub RAM then declare it as local (e.g. local to the main function). If you want data in SPI RAM then declare it as global (or allocate it on the heap).
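
As a small illustration of that rule, the sketch below keeps a large scratch buffer at file scope and the working variables local. The placement comments describe what the -x3 model is said to do in this thread; on a PC the code is ordinary C and everything lives in normal memory. The names edit_buffer, checksum and demo are made up for the example.

```c
#include <string.h>

#define SCRATCH_SIZE 4096

/* File-scope data: under Catalina's -x3 model this would be placed in SPI RAM. */
static char edit_buffer[SCRATCH_SIZE];

/* Sums n bytes. The accumulator and index are locals, so under -x3 they
 * would live on the stack in Hub RAM. */
static unsigned long checksum(const char *p, unsigned long n)
{
    unsigned long sum = 0;
    unsigned long i;
    for (i = 0; i < n; i++)
        sum += (unsigned char)p[i];
    return sum;
}

static unsigned long demo(void)
{
    char line[80];                                 /* local: Hub RAM under -x3 */
    strcpy(line, "10 print x");
    memcpy(edit_buffer, line, strlen(line) + 1);   /* global: SPI RAM under -x3 */
    return checksum(edit_buffer, (unsigned long)strlen(edit_buffer));
}
```

Data that is touched in tight loops is the natural candidate for a local, while large, rarely accessed buffers can stay global.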

For xbasic I didn't notice much speed difference between -x3 and -x4, so I wouldn't expect much of a performance gain in any case.

Ross.

RossH · 2011-03-18 21:29

David Betz wrote: »

Yes, that's exactly what I meant. I'll have to try that. Thanks! Do I have to rebuild the utilities to try that memory model?

No, you only have to rebuild the utilities if you change cache size.

Ross.

David Betz · 2011-03-18 22:58

RossH wrote: »

No, you only have to rebuild the utilities if you change cache size.

Ross.

Thanks Ross! I'll have to give that a try tomorrow.

Edit: Ummm... I must be losing my mind. The -x3 layout is what I started out with. I hadn't realized that it was putting the stack and locals in hub memory. I thought that they were being placed in SRAM along with the rest of the R/W data. So I guess I've pretty much gotten as much as I can out of Catalina. My only hope for better performance is to get ZOG working with the stack and locals in hub memory. That won't be easy though since the ZPU is big-endian and the Propeller is little-endian and there are endless conflicts trying to resolve that difference.

David Betz · 2011-03-18 23:29

RossH wrote: »
David,

I just compiled xbasic to use the -x4 mode. You have to remember that with everything except code in Hub RAM, you probably cannot afford to dedicate 8K just as cache. I used a 2K cache and it works. You may be able to use a 4K cache - I didn't try that.

Here are the make options I used:
CFLAGS=-x4
LDFLAGS=$(CFLAGS) -DC3 -M256k -DCACHED_2K -D PC -DCR_ON_LF -lcix -y
Also note that you need to recompile the "utilities" folder using a command like:
build C3 CACHED_2K
I used the Parallax Serial Terminal program and everything seems to work ok. It doesn't seem to be any faster, although it's difficult to tell with this program - you need to come up with a more compute-intensive basic program to compare the two - all this program does is a small amount of I/O and I wouldn't expect much difference there.

Ross.

I just tried -x4 as you described above except with the standard HIRES_TV HMI and it sort of works but loading test.bas fails. Is that likely because I ran out of memory?

RossH · 2011-03-18 23:34

Hi David,

That's what I've been trying to tell you!

However, I think there is scope for improvement - I've just been running some benchmarks, and the cache is not behaving nearly as well as it should. I think there may be something wrong in the cache hit detection - i.e. it is not detecting a cache hit when it should, and is reloading from SPI too often.
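
For readers unfamiliar with the term, hit detection in a direct-mapped cache boils down to one tag comparison per access. The sketch below is a generic PC-side model and not Catalina's cache code: the 64-byte line size, the struct layout and the name cache_access are all assumptions chosen so that 32 lines make up a 2K cache.

```c
#define LINE_BYTES  64u                  /* assumed line size */
#define NUM_LINES   32u                  /* 32 * 64 bytes = 2K cache */

struct cache {
    unsigned long tag[NUM_LINES];        /* which memory line each slot holds */
    unsigned char valid[NUM_LINES];      /* 0 until the slot is first filled */
    unsigned long hits;
    unsigned long misses;
};

/* Returns 1 on a hit, 0 on a miss (after which the slot is marked as refilled). */
static int cache_access(struct cache *c, unsigned long addr)
{
    unsigned long line = addr / LINE_BYTES;     /* line number in external memory */
    unsigned long slot = line % NUM_LINES;      /* direct-mapped slot for that line */

    if (c->valid[slot] && c->tag[slot] == line) {
        c->hits++;
        return 1;
    }
    c->tag[slot] = line;                 /* a real cache would reload from SPI here */
    c->valid[slot] = 1;
    c->misses++;
    return 0;
}
```

Counting hits and misses this way on a known access pattern is one way to spot a hit test that fails too often, since every false miss costs a full line reload over SPI.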

I'll let you know what I find out - but my time this weekend is fairly limited.

Ross.

RossH · 2011-03-18 23:35

David Betz wrote: »

I just tried -x4 as you described above except with the standard HIRES_TV HMI and it sort of works but loading test.bas fails. Is that likely because I ran out of memory?

Most likely. Try reducing the cache size (I suppose 1K is all that's left!) or use the LORES_TV option.

Ross.

David Betz · 2011-03-18 23:35

RossH wrote: »

Most likely. Try reducing the cache size.

Ross.

You mean it can go lower than 2K? I tried it again with -DPC and now I can load test.bas but it crashes when I type RUN.

RossH · 2011-03-18 23:36

Yes, you can go to 1K. But I had it working with PC and (I think) a 2K cache. Post your exact options.
