flexspin compiler for P2: Assembly, Spin, BASIC, and C in one compiler

Rayman · 2020-12-02 20:00

Ok, it does actually work when p is initialized to 0 on P2.

Rayman · 2020-12-03 17:59

Looking at the Flexspin docs, and the section "### Changing Hub address"...

It appears like I could compile some small app that could be copied over from uSD, say, at run time and then do something like this on it:

flexspin -2 -H 0x10000 -E fibo.bas

Would that work? If so, what's a good strategy for keeping it away from stack and heap?

ersmith · 2020-12-03 22:32

Rayman wrote: »
Looking at the Flexspin docs, and the section "### Changing Hub address"...

It appears like I could compile some small app that could be copied over from uSD, say, at run time and then do something like this on it:
flexspin -2 -H 0x10000 -E fibo.bas
Would that work? If so, what's a good strategy for keeping it away from stack and heap?

Yes, that would work to compile fibo.binary that could be loaded and run from another program. You'd have to make sure that the calling program (the one that invokes fibo.binary) plus its stack fits below 0x10000. The heap is statically sized in FlexProp, so it's already included in program size. That just leaves the stack, and there's no general way to calculate the stack size required (it's equivalent to the halting problem, IIRC) but if the program doesn't have recursion you can usually figure it out. As a rule of thumb hardly any programs need more than 4K of stack, unless they declare big local arrays.
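
The layout constraint described above can be sketched as a quick arithmetic check. This is only an illustration: `caller_fits_below` and the example sizes are hypothetical and not something flexspin reports, and the heap is assumed to be counted in the program size already, as stated above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: does the calling program, plus a stack allowance,
   end at or below the hub address passed to -H?
   program_bytes is assumed to include the statically sized heap. */
bool caller_fits_below(uint32_t program_bytes, uint32_t stack_bytes,
                       uint32_t load_addr)
{
    /* Widen before adding so a large size cannot wrap around. */
    return (uint64_t)program_bytes + stack_bytes <= load_addr;
}
```

With the 4K rule of thumb, a caller of 0xE000 bytes would pass this check for `-H 0x10000`, while one of 0xF800 bytes would not.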

evanh · 2020-12-06 22:35

Eric,
I just assembled Chip's HDMI demos from Pnut v35 and discovered he is using a new mnemonic, ASMCLK, which inserts a canned routine that sets the default sysclock according to the _clkfreq constant.

Interestingly, Flexspin does not complain about this when assembling. But it doesn't do the right job either. Indications are ASMCLK is ignored and the sysclock is left at RCFAST.

00000                 | CON		hdmi_base	= 48		'must be a multiple of 8
00000                 | 
00000                 | 		_clkfreq	= 250_000_000	'system clock frequency must be 250 MHz for HDMI
00000                 | 
00000                 | 		fast		= 1		'0 for small code (7.8 fps), 1 for fast code (36.6 fps)
00000                 | 
00000                 | 		bitmap		= $400		'HDMI bitmap (300 KB)
00000                 | 
00000 000             | DAT		org
00000 000             | 
00000 000             | 		asmclk
00000 000             | 
00000 000 00 00 00 FF 
00004 001 10 02 EC FC | 		coginit	#1,##@pgm_hdmi		'launch HDMI
00008 002 00 00 00 FF 
0000c 003 A0 00 EC FC | 		coginit	#0,##@pgm_bmap		'launch bitmap cog

ersmith · 2020-12-07 14:15

@evanh: I think it was interpreting the "asmclk" as a label.

I'm not sure if I can support "asmclk" or not. It seems like a pretty dangerous feature: if someone wants to insert some assembly instructions I think they should write them out rather than having the compiler mysteriously insert them.

evanh · 2020-12-08 05:19

Okay, makes sense. It's no problem to substitute in Cluso's old method. eg: 300 MHz sysclock:

	XTALFREQ	= 20_000_000		'PLL stage 0: crystal frequency
	XDIV		= 20			'PLL stage 1: crystal divider (1..64)
	XMUL		= 300			'PLL stage 2: crystal / div * mul (1..1024)
	XDIVP		= 1			'PLL stage 3: crystal / div * mul / divp (1,2,4,6..30)


' Clock modes
	' %0000_000e_dddddd_mmmmmmmmmm_pppp_cc_ss	' enable oscillator
	XOSC		= %10				' OSC    ' %00=OFF, %01=OSC, %10=15pF, %11=30pF
	XSEL		= %11				' XI+PLL ' %00=rcfast(20+MHz), %01=rcslow(~20KHz), %10=XI(5ms), %11=XI+PLL(10ms)
	XPPPP		= ((XDIVP>>1) + 15) & $F	' 1->15, 2->0, 4->1, 6->2...30->14
	_clkfreq	= round(float(XTALFREQ) / float(XDIV) * float(XMUL) / float(XDIVP))
	CLK_MODE	= 1<<24 | (XDIV-1)<<18 | (XMUL-1)<<8 | XPPPP<<4 | XOSC<<2


DAT		org
                hubset  ##CLK_MODE		'config PLL
                waitx   ##25_000_000 / 200	'allow crystal+PLL 5ms to stabilize
                hubset  ##CLK_MODE | XSEL	'switch to PLL
                ...
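
For checking the constants by hand, the same bit packing can be sketched in C. The function names here are made up for illustration, and the frequency helper truncates instead of using the floating-point `round()` in the Spin constants.

```c
#include <stdint.h>

/* XDIVP to the 4-bit field: 1 -> 15, 2 -> 0, 4 -> 1, 6 -> 2 ... 30 -> 14 */
uint32_t pll_pppp(uint32_t xdivp)
{
    return ((xdivp >> 1) + 15) & 0xF;
}

/* Packs the fields the same way as the CLK_MODE constant above
   (select bits left at %00, so this is the "config PLL" value). */
uint32_t pll_clk_mode(uint32_t xdiv, uint32_t xmul, uint32_t xdivp,
                      uint32_t xosc)
{
    return (1u << 24) | ((xdiv - 1) << 18) | ((xmul - 1) << 8)
         | (pll_pppp(xdivp) << 4) | (xosc << 2);
}

/* crystal / div * mul / divp, computed in 64 bits and truncated. */
uint32_t pll_clkfreq(uint32_t xtal, uint32_t xdiv, uint32_t xmul,
                     uint32_t xdivp)
{
    return (uint32_t)((uint64_t)xtal * xmul / xdiv / xdivp);
}
```

For the values in this post (XDIV = 20, XMUL = 300, XDIVP = 1, XOSC = %10) this gives a mode word of $014D_2BF8 and a frequency of 300_000_000.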

TonyB_ · 2020-12-08 10:25

evanh wrote: »
Okay, makes sense. It's no problem to substitute in Cluso's old method. eg: 300 MHz sysclock:
	XTALFREQ	= 20_000_000		'PLL stage 0: crystal frequency
	XDIV		= 20			'PLL stage 1: crystal divider (1..64)
	XMUL		= 300			'PLL stage 2: crystal / div * mul (1..1024)
	XDIVP		= 1			'PLL stage 3: crystal / div * mul / divp (1,2,4,6..30)

I know it's an example but would you really use these XDIV & XMUL for 300 MHz?

EDIT:
Maybe you just changed XMUL that was 297?

Cluso99 · 2020-12-08 10:41

When I was having jitter problems, Chip recommended a change from dividing down to 0.5MHz to divide down to 3MHz, multiply up to 297MHz and then divide by 2.

{{'P2D2
  _XTALFREQ     = 12_000_000                                    ' crystal frequency
''_XDIV         = 12  * 2       '\                              ' crystal divider to give 0.5MHz
''_XMUL         = 297           '| 148.5MHz                     ' crystal / div * mul
''_XDIVP        = 1             '/                              ' crystal / div * mul /divp to give _CLKFREQ (1,2,4..30)
  _XDIV         = 4             '\                              '\ crystal divider                      to give 3.0MHz
  _XMUL         = 99            '| 148.5MHz                     '| crystal / div * mul                  to give 297MHz
  _XDIVP        = 2             '/                              '/ crystal / div * mul /divp            to give 148.5MHz
  _XOSC         = %01                                   'OSC    ' %00=OFF, %01=OSC, %10=15pF, %11=30pF
}}
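
The two settings in that comment block can be compared stage by stage with a small sketch. The struct and function names are hypothetical, and plain integer division is used, which is exact for these values.

```c
#include <stdint.h>

typedef struct {
    uint32_t divided_hz;     /* crystal / XDIV */
    uint32_t multiplied_hz;  /* crystal / XDIV * XMUL */
    uint32_t sysclock_hz;    /* crystal / XDIV * XMUL / XDIVP */
} pll_stages_t;

/* Walks the three PLL stages in order. */
pll_stages_t pll_stages(uint32_t xtal, uint32_t xdiv, uint32_t xmul,
                        uint32_t xdivp)
{
    pll_stages_t s;
    s.divided_hz    = xtal / xdiv;
    s.multiplied_hz = s.divided_hz * xmul;
    s.sysclock_hz   = s.multiplied_hz / xdivp;
    return s;
}
```

Both settings end at 148.5 MHz with a 12 MHz crystal. The commented-out one (24, 297, 1) divides down to 0.5 MHz first, while the active one (4, 99, 2) divides to 3 MHz, multiplies to 297 MHz, then halves.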

evanh · 2020-12-08 16:18

My frequency spanning tests are pretty much all done with adjusting XMUL alone. I don't bother with trying to get 0.5 MHz steps. I'll just round up instead. It's not like it matters.

evanh · 2020-12-09 02:07

Here's the recalculation I do for each step:

setclkfrq
		getqx	inb				'get excess results, final one sets event flag
		jnqmt	#setclkfrq			'wait for flag to collect final results - CORDIC pipeline flushed

'recalculate sysclock hertz using cordic
		qmul	xtalmul, ##(XTALFREQ / XDIV / XDIVP)	'integer component of pre-divided crystal frequency
		mov	pa, xtalmul
		mul	pa, ##round((float(XDIV*XDIVP)+0.5) * (float(XTALFREQ)/float(XDIV)/float(XDIVP) - float(XTALFREQ/XDIV/XDIVP)))
		qdiv	pa, ##(XDIV * XDIVP)		'fractional component of pre-divided crystal frequency
		getqx	clk_freq			'result of integer component
		getqx	pa				'result of fractional component
		add	clk_freq, pa			'de-error the integer rounding

		...
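
The idea of splitting the calculation into an integer component plus a correction for what the integer division dropped can be sketched in C. This is a simplification, not a translation: it uses the exact remainder where the PASM uses a rounded floating-point constant, and the function name is made up.

```c
#include <stdint.h>

/* Sysclock in Hz for a given multiplier, computed in two parts so that
   the truncation in (xtal / d) does not accumulate across xmul. */
uint32_t recalc_clkfreq(uint32_t xtal, uint32_t xdiv, uint32_t xdivp,
                        uint32_t xmul)
{
    uint32_t d     = xdiv * xdivp;
    uint32_t whole = xtal / d;   /* integer component of pre-divided crystal */
    uint32_t rem   = xtal % d;   /* part the integer division dropped */
    return xmul * whole + (xmul * rem) / d;
}
```

When the crystal divides evenly, as with 20 MHz / 20, the correction term is zero and the result is simply `xmul * whole`.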

The newly calculated clk_freq is then used to recalculate and set the serial port timings to keep the same baud:

'recalculate baud divider (clk_freq / asyn_baud) of diag comport using cordic
'  low bauds won't operate at high sysclocks, the divider only has 16-bit reach
		qdiv	clk_freq, asyn_baud		'comport divider
		qfrac	#1, asyn_baud			'remainder scale factor, 2**32 / baud
		getqx	pa				'comport divider
		getqy	pb				'divider remainder, for .6 fraction
		getqx	temp1				'scale factor
		qmul	pb, temp1			'convert remainder to a "big" fraction
		getqx	pb				'fractional component of comport divider
		rolword	pa, pb, #1			'16.16 comport divider
		sets	pa, #7				'comport 8N1 framing (bottom 10 bits should be replaced but 9's enough)

		call	#waittx
		wxpin	pa, #DIAGTXPIN			'set tx baud and framing (divider format is 16.0 if divider >= 1024.0)
		wxpin	pa, #DIAGRXPIN			'set rx baud and framing (divider format is 10.6 if divider < 1024.0)
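
As a rough cross-check of the value written by `wxpin`, the same 16.16 divider can be computed in C with 64-bit arithmetic. This is a sketch under assumptions: the function name is invented, the low 9 bits are overwritten with 7 to mirror what `sets pa, #7` does, and the fraction may differ in its last bits from the CORDIC result.

```c
#include <stdint.h>

/* Returns the 32-bit baud/framing word, or 0 if baud is 0 or the integer
   part of clk_freq / baud does not fit in 16 bits. */
uint32_t async_baud_word(uint32_t clk_freq, uint32_t baud)
{
    if (baud == 0)
        return 0;
    uint64_t div16_16 = ((uint64_t)clk_freq << 16) / baud;  /* 16.16 fixed point */
    if (div16_16 > 0xFFFFFFFFu)
        return 0;                       /* divider only has 16-bit reach */
    return ((uint32_t)div16_16 & 0xFFFFFE00u) | 7u;  /* 8N1: bits - 1 = 7 */
}
```

At 300 MHz and 115200 baud the integer part of the divider is 2604. At 300 baud it would be 1_000_000, which is the "low bauds won't operate at high sysclocks" case noted in the comment.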

And finally, the new sysclock mode is set:

'adjust hardware to new XMUL sysclock frequency
		andn	clk_mode, #%11			'clear the two select bits to force RCFAST selection
		hubset	clk_mode			'**IMPORTANT**  Switches to RCFAST using known prior mode

		mov	clk_mode, xtalmul		'replace old with new ...
		sub	clk_mode, #1			'range 1-1024
		shl	clk_mode, #8
		or	clk_mode, ##(1<<24 + (XDIV-1)<<18 + XPPPP<<4 + XOSC<<2)
		hubset	clk_mode			'setup PLL mode for new frequency (still operating at RCFAST)

		or	clk_mode, #XSEL			'add PLL as the clock source select
		waitx	##25_000_000/100		'~10ms (at RCFAST) for PLL to stabilise
		hubset	clk_mode			'engage!  Switch back to newly set PLL
		...
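
The three `hubset` values in that sequence can be laid out in a small C sketch, which may make the ordering easier to follow. The struct and function names are hypothetical, and `fixed_bits` stands for the `1<<24 + (XDIV-1)<<18 + XPPPP<<4 + XOSC<<2` constant above.

```c
#include <stdint.h>

typedef struct {
    uint32_t to_rcfast;  /* old mode with the select bits cleared */
    uint32_t configure;  /* new PLL settings, still clocked from RCFAST */
    uint32_t engage;     /* new PLL settings with XI+PLL selected */
} clk_switch_t;

/* Computes the three mode words in the order they are written. */
clk_switch_t plan_clk_switch(uint32_t old_mode, uint32_t fixed_bits,
                             uint32_t xmul)
{
    clk_switch_t p;
    p.to_rcfast = old_mode & ~3u;                  /* andn clk_mode, #%11 */
    p.configure = ((xmul - 1) << 8) | fixed_bits;  /* range 1-1024 -> 0-1023 */
    p.engage    = p.configure | 3u;                /* or clk_mode, #XSEL */
    return p;
}
```

The wait for the PLL to stabilise sits between writing `configure` and writing `engage`.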

JRoark · 2020-12-12 19:09

@ersmith FlexProp V5.0.1-beta has some new issues in the BASIC side of the house:

1). The following syntax variations all throw the error: "syntax error, unexpected shl, expecting $end or end of line or end or ':'"

   x >> 1   
   x SHR 1
   x << 1
   x SHL 1

However, if you write them this way, they all work fine:

   x = x >> 1   
   x = x SHR 1
   x = x << 1
   x = x SHL 1

The allowable syntax has changed somewhere along the line, because the first syntax used to be legal (and it's actually found verbatim on page 66 in the FlexBASIC manual). When it changed I can't tell you, but today when I pulled some old and tested code out, the compiler got grumpy at this.

2). The OCT$(), BIN$(), HEX$() functions all return an empty string if the GUI menu option "Options->Full Optimization" is selected. There is no error thrown at compile time. Changing the optimization level to "Default" fixes the problem.

ersmith · 2020-12-13 00:06

JRoark wrote: »
@ersmith
1). The following syntax variations all throw the error: "syntax error, unexpected shl, expecting $end or end of line or end or ':'"
   x >> 1   
   x SHR 1
   x << 1
   x SHL 1 
However, if you write them this way, they all work fine:
   x = x >> 1   
   x = x SHR 1
   x = x << 1
   x = x SHL 1 
The allowable syntax has changed somewhere along the line, because the first syntax used to be legal (and it's actually found verbatim on page 66 in the FlexBASIC manual). When it changed I can't tell you, but today when I pulled some old and tested code out, the compiler got grumpy at this.

It never should have accepted it, and it wouldn't have done anything even if it did. The documentation is poorly written here. "x >> 1" is syntactically the same as "x + 1", that is it's an expression that returns a new value based on "x" and which leaves "x" itself unchanged. The documentation incorrectly implies that it modifies "x", but that has never been the case. I'll fix the docs.
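
The same distinction exists in C, so a C analogue may help (the function names are only for illustration): a shift is an expression that yields a new value, and the variable changes only when the result is assigned back.

```c
#include <stdint.h>

/* Evaluates x >> 1 but never stores it back into x. */
uint32_t x_after_expression(uint32_t x)
{
    uint32_t unused = x >> 1;  /* new value computed, x itself untouched */
    (void)unused;
    return x;
}

/* Stores the shifted value back, like "x = x >> 1" in BASIC. */
uint32_t x_after_assignment(uint32_t x)
{
    x = x >> 1;
    return x;
}
```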

2). The OCT$(), BIN$(), HEX$() functions all return an empty string if the GUI menu option "Options->Full Optimization" is selected. There is no error thrown at compile time. Changing the optimization level to "Default" fixes the problem.

Thanks for the bug report, it's an error in the loop-reduce optimization. I'll try to fix that soon.

JRoark · 2020-12-17 18:31

@ersmith Flex v 5.0.2 in BASIC: It appears HEX$() has a bug:

	
	print hex$(15)   'prints "F".  OK
	print hex$(16)   'prints "0".  Not "10".  Whoops.
	print hex$(17)   'prints "11".  OK

	print hex$(255)   'prints "FF".  OK
	print hex$(256)   'prints "00".  Not "100".  Whoops.
	print hex$(257)   'prints "101".  OK

	print hex$(4095)   'prints "FFF".  OK
	print hex$(4096)   'prints "000".  Not "1000".  Whoops.
	print hex$(4097)   'prints "1001".  OK

It appears that the switch from having 2 chars returned, to 3 chars returned, isn't being handled properly?

ersmith · 2020-12-17 18:42

@JRoark: there's an off-by-one error in include/libsys/strings.bas. It's fixed in github now, but the patch is simple and you can apply it locally:

diff --git a/include/libsys/strings.bas b/include/libsys/strings.bas
index c2130fe0..39796e19 100644
--- a/include/libsys/strings.bas
+++ b/include/libsys/strings.bas
@@ -112,7 +112,7 @@ function number$(val as uinteger, n as uinteger, B as uinteger) as string
      ' we have to watch out for overflow in very large
      ' numbers; if tmp wraps around (so tmp < last tmp)
      ' then stop
-     while tmp < val and lasttmp < tmp
+     while tmp <= val and lasttmp < tmp
        lasttmp = tmp
        tmp = tmp * B
        n = n + 1
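
To see why `<` drops a digit exactly at powers of the base, here is a C reconstruction of just the digit-counting loop. It is a sketch, not the library source: the starting values of `n`, `tmp` and `lasttmp` are assumptions, since the diff does not show them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Counts the digits needed to print val in the given base.
   patched == false uses the old "tmp < val" test, true uses "tmp <= val". */
unsigned count_digits(uint32_t val, uint32_t base, bool patched)
{
    unsigned n = 1;        /* assumed: at least one digit */
    uint32_t lasttmp = 1;  /* assumed initial value */
    uint32_t tmp = base;   /* assumed: smallest value needing n + 1 digits */

    while ((patched ? tmp <= val : tmp < val) && lasttmp < tmp) {
        lasttmp = tmp;
        tmp = tmp * base;  /* wraps on overflow, which ends the loop */
        n = n + 1;
    }
    return n;
}
```

With the old comparison, 16 and 256 in base 16 come out one digit short, matching the "0" and "00" output reported above, while 17 and 257 are unaffected.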

JRoark · 2020-12-17 19:50

@ersmith The patch works like a charm! Thank you.

JRoark · 2020-12-20 18:04

@ersmith A question about strings and garbage collection in FlexBASIC:

Assume that I create a string: A$ = "This is a test". This will be stored with a trailing 00h to mark the end of the string. Now assume that I remove the trailing 00h, and replace the space character between "THIS" and "IS" with a 00h.

Question 1: When the GC runs, will it reclaim *all* of the space that was originally used by the string, or will it hit that new 00h at offset 5 and quit, thereby leaving part of the original string unreclaimed?

Question 2: Does _gc_free(myvar) actually reclaim the space used by "myvar" or does it just mark it as being unused so the next time _gc_collect() runs it gets reclaimed?

I'm trying to locate a very slow memory leak/corruption problem that takes days to manifest itself, and these two scenarios seem to be possibilities.

ersmith · 2020-12-20 22:42

JRoark wrote: »

Assume that I create a string: A$ = "This is a test". This will be stored with a trailing 00h to mark the end of the string. Now assume that I remove the trailing 00h, and replace the space character between "THIS" and "IS" with a 00h.

Question 1: When the GC runs, will it reclaim *all* of the space that was originally used by the string, or will it hit that new 00h at offset 5 and quit, thereby leaving part of the original string unreclaimed?

The gc does not operate on strings, it operates on blocks of memory. The contents of the memory are not relevant and are not even looked at. So no, modifying the internal bytes of the string won't make any difference (trying to write past the allocated memory might, but that would definitely be a bug).

Question 2: Does _gc_free(myvar) actually reclaim the space used by "myvar" or does it just mark it as being unused so the next time _gc_collect() runs it gets reclaimed?

It actually reclaims it.

Note that the Achilles heel of the flexprop memory allocator is fragmentation. Memory is never moved, so if there are a lot of allocations and frees of different sizes, and some of those in the middle are not freed, then even though there may be enough freed memory in total for an allocation there may not be enough contiguous memory. As an illustrative example (it may not be exactly accurate), there may be 4K in use and 12K free, but that 12K might be split up in such a way that you cannot allocate more than 1K of it, if the 4K used is in small blocks located in bad places. There's no way of analyzing this at present. If I have time I may try to add a way to find the largest free block.
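
The gap between total free memory and the largest contiguous free block can be shown with a toy model. This is not the FlexProp allocator: the 256-byte unit size, the flat "used" map and all function names are assumptions made for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define UNIT_BYTES 256u  /* hypothetical allocation granularity */

/* Sum of all free units, in bytes. */
size_t total_free_bytes(const uint8_t *used, size_t units)
{
    size_t free_units = 0;
    for (size_t i = 0; i < units; i++)
        if (!used[i])
            free_units++;
    return free_units * UNIT_BYTES;
}

/* Longest run of adjacent free units, in bytes. */
size_t largest_free_bytes(const uint8_t *used, size_t units)
{
    size_t best = 0, run = 0;
    for (size_t i = 0; i < units; i++) {
        if (used[i])
            run = 0;
        else if (++run > best)
            best = run;
    }
    return best * UNIT_BYTES;
}

/* Marks every step-th unit as used, a deliberately bad placement. */
void mark_every_nth(uint8_t *used, size_t units, size_t step)
{
    for (size_t i = 0; i < units; i++)
        used[i] = ((i + 1) % step == 0);
}
```

In a 16K heap of 64 units with every fourth unit in use, 12K is free in total but no free run is longer than 768 bytes, so a 1K allocation would fail.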

JRoark · 2020-12-20 23:26

@ersmith That explanation helps me a LOT. Thanks for taking the time to explain it so well. This suggests to me that the way/order that I define the bigger things in the main code body is important, i.e., keep structures together that don't get deallocated, and allocate the smaller/on-the-fly stuff later.

It would be wonderful to have a means of compaction... but there are likely a few (hundred) things higher on the “need it” list! Lol.

Rayman · 2020-12-21 17:41

Code I'm working on has an "anonymous union" that doesn't compile. Is this a C++ thing?

void cFFFFFF(byte v) {
    union {
        uint32_t c;
        uint8_t b[4];
    };
    b[0] = v;
    b[1] = 0xff;
    b[2] = 0xff;
    b[3] = 0xff;
    cmd32(c);
}

Wuerfel_21 · 2020-12-21 18:24

Rayman wrote: »
Code I'm working on has an "anonymous union" that doesn't compile. Is this a c++ thing?
void cFFFFFF(byte v) {
    union {
        uint32_t c;
        uint8_t b[4];
    };
    b[0] = v;
    b[1] = 0xff;
    b[2] = 0xff;
    b[3] = 0xff;
    cmd32(c);
}

No, this is a not-technically-standard thing that most compilers just happen to support because it's useful.
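
If the compiler rejects the anonymous form, one portable workaround is to give the union a name. The sketch below is an assumption-laden variant of the function above: the type and function names are invented, and it returns the union instead of calling `cmd32`, whose signature is not shown in the thread.

```c
#include <stdint.h>

/* Named union: the same overlay of one 32-bit word and four bytes. */
typedef union {
    uint32_t c;
    uint8_t  b[4];
} word_bytes_t;

/* Builds the word with v in byte 0 and 0xff in the other three bytes. */
word_bytes_t make_cFFFFFF(uint8_t v)
{
    word_bytes_t u;
    u.b[0] = v;
    u.b[1] = 0xff;
    u.b[2] = 0xff;
    u.b[3] = 0xff;
    return u;  /* the caller would pass u.c on, e.g. to cmd32() */
}
```

Which end of `c` receives `b[0]` depends on byte order; on a little-endian target it is the low byte.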

Rayman · 2020-12-21 21:14

Is there a limit to fread size? This loop doesn't work when size increased from 1000 to 2000:

    int n;
    while ((n = fread(LogBuf, 1, 1000, f)) > 0)  //RJA we seem to have a limit somewhere between 1000 and 2000
    {

David Betz · 2020-12-21 21:21

Rayman wrote: »
Is there a limit to fread size? This loop doesn't work when size increased from 1000 to 2000:
    int n;
    while ((n = fread(LogBuf, 1, 1000, f)) > 0)  //RJA we seem to have a limit somewhere between 1000 and 2000
    {

Is the buffer on the stack? Maybe the stack is overflowing?

Rayman · 2020-12-21 21:55

I think I remember now about some limits when using plan 9...

ersmith · 2020-12-22 00:34

It's earlier in the thread, but yes, there's currently a bug in the 9p file system that causes reads/writes greater than 1K to fail. That's fixed in github and things will work properly in the upcoming 5.0.3 release.
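
Until the fixed release is installed, one possible workaround is to keep each individual `fread` at or below the size that was reported to work. This is a sketch under that assumption: `read_in_chunks` and `MAX_CHUNK` are hypothetical names, and only standard C library calls are used.

```c
#include <stddef.h>
#include <stdio.h>

#define MAX_CHUNK 1000  /* the request size reported to work above */

/* Reads up to `want` bytes into buf, never asking fread for more than
   MAX_CHUNK at a time. Returns the number of bytes actually read. */
size_t read_in_chunks(void *buf, size_t want, FILE *f)
{
    unsigned char *p = buf;
    size_t got = 0;

    while (got < want) {
        size_t ask = want - got;
        if (ask > MAX_CHUNK)
            ask = MAX_CHUNK;
        size_t n = fread(p + got, 1, ask, f);
        got += n;
        if (n < ask)
            break;  /* short read: end of file or error */
    }
    return got;
}
```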

JRoark · 2020-12-22 16:36

@ersmith It looks like the FlexBASIC += operator doesn't like to be used with array variables:

dim Thing(5) as long
dim total as long
dim i as long

	total = 0
	for i = 0 to 4
		total += Thing(i)   ' <---- throws error
	next i
	print total

The error thrown is "error: Object called is not a function"
If we rewrite it like below, then it works fine.

	for i = 0 to 4
		total = total + Thing(i)   ' <---- works fine
	next i

It seems to think that Thing(i) is a function instead of an array?

Edit: The same error gets thrown for all of the compound operators: +=, -=, *=, /=, and=, or=, xor=.

ManAtWork · 2020-12-22 17:17

Why does this work

void TestPwm ()
{
  int ampl= 0x0100;
  int theta;
  for (theta=0; true; theta+= 0x01000000);
  {
    __asm {
      qrotate ampl,theta

but this doesn't?

void TestPwm ()
{
  int ampl= 0x0100;
  for (int theta=0; true; theta+= 0x01000000);
  {
    __asm {
      qrotate ampl,theta

It throws an error message "undefined symbol theta". I thought loop variables and local function variables are both held in registers. If it's not legal to use a loop variable in assembler the compiler should say so.

ManAtWork · 2020-12-22 17:32

Stupid C error, as always. One semicolon too much!
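
For anyone who hits the same thing, a bounded sketch of the effect (the function names are only for illustration): the stray semicolon is itself the loop body, so the braces that follow form an ordinary block that runs once after the loop. A variable declared in the `for` header is already out of scope in that block, which is consistent with the "undefined symbol theta" message.

```c
/* The ';' after the for header is the entire loop body. */
int runs_with_stray_semicolon(void)
{
    int runs = 0;
    int i;
    for (i = 0; i < 4; i++);
    {
        runs++;  /* plain block, executed once after the loop ends */
    }
    return runs;
}

/* Without the semicolon the block is the loop body. */
int runs_without_semicolon(void)
{
    int runs = 0;
    for (int i = 0; i < 4; i++)
    {
        runs++;  /* i is in scope here */
    }
    return runs;
}
```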

ersmith · 2020-12-22 22:57

JRoark wrote: »

@ersmith It looks like the FlexBASIC += operator doesn't like to be used with array variables:

It seems to think that Thing(i) is a function instead of an array?

Edit: The same error gets thrown for all of the compound operators: +=, -=, *=, /=, and=, or=, xor=.

Thanks for catching this bug, it'll be fixed real soon now. The problem is that syntactically BASIC array refs look like function calls, and there's some hackery to fix up apparent function calls that are really array references. But some code for dealing with += and the like gets called before that fixup, so triggers the error. It shouldn't be hard to work around.

ersmith · 2020-12-23 15:38

flexprop 5.0.3 is released now. I've also uploaded the flexspin/flexcc and spin2cpp binaries to the spin2cpp repository.

jmg · 2020-12-24 19:56

ersmith wrote: »

I'm not sure if I can support "asmclk" or not. It seems like a pretty dangerous feature: if someone wants to insert some assembly instructions I think they should write them out rather than having the compiler mysteriously insert them.

Invisible insertion is not a good idea, but Assemblers do commonly support macros, and there, the code appears clearly in the listing file, showing exactly what was done by the macro.
