flexspin compiler for P2: Assembly, Spin, BASIC, and C in one compiler

pik33 · 2023-07-28 14:26

Something strange is going here:

sub compile_immediate_assign(linetype as ulong)

dim t1 as expr_result
dim i,j as integer
dim  suffix$ as string
dim varname$ as string 

t1.result_type=result_error : t1.result.uresult=0
i=-1: j=-1
if linetype=0 then varname$=lparts(0).part$ :  ct=2 ': print varname$    <---------------------- here varname$ exists and can be printed
if linetype=1 then varname$=lparts(1).part$ :  ct=3
print "Called compile immediate assign with linetype",linetype
suffix$=right$(varname$,1)  <---------------------------- varname$ doesn't exist here any more unless I printed it 3 lines earlier
expr()

if suffix$="$"  then
  if svarnum>0 then
    for i=0 to svarnum-1
      if svariables(i).name=varname$ then j=i : exit
    next i
  endif
  if  j=-1 andalso svarnum<maxvars then   
    svariables(svarnum).name=varname$
    j=svarnum
    svarnum+=1
  endif
  t1.result.uresult=j: t1.result_type=fun_assign_s  

else if suffix$="%" then
  if uvarnum>0 then
    for i=0 to uvarnum-1
      if uvariables(i).name=varname$ then j=i : exit
    next i
  endif
  if  j=-1 andalso uvarnum<maxvars then   
    uvariables(uvarnum).name=varname$
    j=uvarnum
    uvarnum+=1
  endif
  t1.result.uresult=j: t1.result_type=fun_assign_u  

else if suffix$="!" then
  if fvarnum>0 then
    for i=0 to fvarnum-1
      if fvariables(i).name=varname$ then j=i : exit
    next i
  endif
  if  j=-1 andalso fvarnum<maxvars then   
    fvariables(fvarnum).name=varname$
    j=fvarnum
    fvarnum+=1
  endif
  t1.result.uresult=j: t1.result_type=fun_assign_f  

else '  if suffix$<>"$" andalso suffix$<>"!" andalso suffix$<>"%"  then
  if ivarnum>0 then
    for i=0 to ivarnum-1
      if ivariables(i).name=varname$ then j=i : exit
    next i
  endif
  if  j=-1 andalso ivarnum<maxvars then   
    ivariables(ivarnum).name=varname$
    j=ivarnum
    ivarnum+=1  
  endif
  t1.result.uresult=j: t1.result_type=fun_assign_i  
endif  

compiledline(lineptr)=t1:  lineptr+=1 
compiledline(lineptr).result_type=token_end 
end sub

If varname$ was declared as local inside the sub, the sub didn't work, as if there was no varname$ inside these 'ifs'. If the varname$ is not in the table, the procedure should assign a new entry n the table with the new name given by varname$. This didn't happen.

Of course I tried to debug the code by adding "print varname$" after it is set This magically repairs the procedure.
I made a workaround by declaring varname$ as a global variable and that also repairs the procedure.
If not printed imediately after assign, varname$ disappears immediately.

Is it possible that some kind of optimizer/garbage collector thinks that varname$ is no longer needed and cleans it out before using it inside one of ifs?

Edit: assigning anything (eg. an empty string) to a local declared varname$ before 'if linetype=" also repairs the procedure.

This is Version 6.2.4-beta-v6.2.3-13-gbbc68880 Compiled on: Jul 26 2023

I will try this on a simpler piece of code, and also on a non-beta flexprop.

ersmith · 2023-07-28 15:44

@pik33 : Thanks for the bug report. I think this should be fixed in the github source code now (it was an optimizer bug where a register was being renamed inconsistently).

pik33 · 2023-07-28 20:23

It works now

Wuerfel_21 · 2023-07-28 20:57

@ersmith said:
@pik33 : Thanks for the bug report. I think this should be fixed in the github source code now (it was an optimizer bug where a register was being renamed inconsistently).

Not entirely sure your fix is correct. SafeToReplaceForward should already catch all unsafe replacements, should it? That should be fixed inside the function itself then. Or am I just being thick?

ersmith · 2023-07-29 12:10

@Wuerfel_21 said:

@ersmith said:
@pik33 : Thanks for the bug report. I think this should be fixed in the github source code now (it was an optimizer bug where a register was being renamed inconsistently).

Not entirely sure your fix is correct. SafeToReplaceForward should already catch all unsafe replacements, should it? That should be fixed inside the function itself then. Or am I just being thick?

You're right, that is a better place to fix it. I'll update the code.

Wuerfel_21 · 2023-07-29 12:14

Fix might still not be correct, I think the upper isDeadAfter check you added is not needed? That's the one where orig starts a new dependency chain and will actually never be dead (because in such cases the op that sets it would be dead and get deleted)

ersmith · 2023-07-29 13:18

@Wuerfel_21 said:
Fix might still not be correct, I think the upper isDeadAfter check you added is not needed? That's the one where orig starts a new dependency chain and will actually never be dead (because in such cases the op that sets it would be dead and get deleted)

I'd rather be safe than sorry. There have been too many optimizer bugs.

Wuerfel_21 · 2023-07-29 13:58

Looking at it again, you're checking IsDeadAfter on last_ir (should be ir->prev), which means it should actually return true in all cases where that branch is taken (InstrModifies(ir,orig) && !InstrUses(ir, orig)) ??? Yet it clearly has some effect on codegen, so that's not actually what happens.

Can you give me a standalone sample of the original bug pik had? I think there might be a deeper bug lurking somewhere.

Speaking of codegen quality, it might be a good idea to set up larger scale CI tests that report overall degradation/improvement per-commit on real code (and of course compile errors, too). I've seen that on some projects. I think we mostly care about RAM usage, but for P1 targets we could also hook up spinsim for cycle benchmarks. Also really need to resurrect the test line coverage tracking, I think the old PR that's still open might have bit-rotted.

ersmith · 2023-07-29 23:10

The simplified repo for the bug is:

dim ct as integer

sub tryit2(linetype as ulong)
dim varname$ as string
dim suffix$ as string
ct=99
if linetype=0 then
  varname$="part0" : ct = 2
end if
if linetype=1 then
  varname$="part1" : ct = 3
end if
'print "called tryit with linetype",linetype
suffix$=varname$ ' right$(varname$,1)
print "suffix=", suffix$
end sub

tryit2(0)
print "ct=",ct
tryit2(1)
print "ct=",ct
tryit2(2)
print "ct=",c

When the bug hits (e.g. with an older compiler) bad code is generated for the if linetype=0 block such that the varname$="part0" assignment is removed. That's because the register renamer renames the register holding varname$ to one of the argument registers, but only does so in the if statement (and not in the later uses); dead code removal then notices that the argument register is never used and removes the assignment.

pik33 · 2023-07-30 15:26

A FlexBasic feature request: static variables - if posssible and if they don't exist (if they exist - how to get them?) Local function/sub variables that don't disappear between function calls.

There isstatickeyword in C. Pascal does this in very strange ways using typed const:

const a:integer=0;

This declares a static variable initialized to 0. Strange but does what is needed.

Now I simply use global variables to do what I want (to keep state variables of the function that can be in multiple states) and it of course work, but having static variables may make the code more readable and less bloated with global variables

Wuerfel_21 · 2023-07-30 15:49

Okay I figured out / fixed the actual issue (SafeToReplaceForward doesn't track when condition flags get rewritten, making subsequent CondIsSubset checks return false positives (since the condition at that point can't really be a subset of the original condition the original setter had)). Will submit that fix later, but still wondering about the aforementioned isDeadAfter oddity, so I'll look into that first.

diff --git a/backends/asm/optimize_ir.c b/backends/asm/optimize_ir.c
index 36362934..bee47ba7 100644
--- a/backends/asm/optimize_ir.c
+++ b/backends/asm/optimize_ir.c
@@ -1409,6 +1409,7 @@ SafeToReplaceForward(IR *first_ir, Operand *orig, Operand *replace, IRCond sette
     bool assignments_are_safe = true;
     bool orig_modified = false;
     bool isCond = (setterCond != COND_TRUE);
+    bool condition_safe = true;

     if (SrcOnlyHwReg(replace) || !IsRegister(replace->kind)) {
         return NULL;
@@ -1519,7 +1520,7 @@ SafeToReplaceForward(IR *first_ir, Operand *orig, Operand *replace, IRCond sette
                 return NULL;
             }
         }
-        if (!CondIsSubset(setterCond,ir->cond) && InstrUses(ir,orig)) {
+        if ((!condition_safe || !CondIsSubset(setterCond,ir->cond)) && InstrUses(ir,orig)) {
             return NULL;
         }
         if (InstrModifies(ir,replace)) {
@@ -1531,7 +1532,7 @@ SafeToReplaceForward(IR *first_ir, Operand *orig, Operand *replace, IRCond sette
             //      value is being put into it
             //  if "assignments_are_safe" is false then we don't know if another
             //  branch might still use "replace", so punt and give up
-            if (!CondIsSubset(ir->cond,setterCond)) {
+            if (ir->cond != COND_TRUE && (!condition_safe || !CondIsSubset(ir->cond,setterCond))) {
                 return NULL;
             }
             if (ir->dst->kind == REG_SUBREG || replace->kind == REG_SUBREG) {
@@ -1562,9 +1563,9 @@ SafeToReplaceForward(IR *first_ir, Operand *orig, Operand *replace, IRCond sette
                 // sub registers are complicated, punt
                 return NULL;
             }
-            if (ir->cond != setterCond) {
+            if (ir->cond != setterCond || !condition_safe) {
                 assignments_are_safe = false;
-                if (!CondIsSubset(setterCond,ir->cond)) {
+                if (!condition_safe || !CondIsSubset(setterCond,ir->cond)) {
                     // Not a subset of the setter condition, can't replace
                     return NULL;
                 }
@@ -1572,7 +1573,7 @@ SafeToReplaceForward(IR *first_ir, Operand *orig, Operand *replace, IRCond sette
             if (!InstrUses(ir, orig) && assignments_are_safe) {
                 // we are completely re-setting "orig" here, so we can just
                 // leave now
-                return (last_ir && IsDeadAfter(last_ir, orig)) ? last_ir : NULL;
+                return (last_ir /*&& IsDeadAfter(last_ir, orig)*/) ? last_ir : NULL;
             }
             // we do not want to end accidentally modifying "replace" if it is still live
             // note that IsDeadAfter(first_ir, replace) gives a more accurate
@@ -1584,9 +1585,13 @@ SafeToReplaceForward(IR *first_ir, Operand *orig, Operand *replace, IRCond sette
             }
             orig_modified = true;
         }
+        if (isCond && InstrSetsFlags(ir,FlagsUsedByCond(setterCond))) {
+            condition_safe = false;
+        }
+
         last_ir = ir;
     }
-    if (last_ir && IsLocalOrArg(orig) && IsDeadAfter(last_ir, orig)) {
+    if (last_ir && IsLocalOrArg(orig) /*&& IsDeadAfter(last_ir, orig)*/) {
         return last_ir;
     }
     return NULL;

EDIT: The forum has diff highlighting, neat.

Wuerfel_21 · 2023-07-30 18:27

Ah okay, the IsDeadAfter thing is just because of the same condition-related futzery. A variable isn't dead in the general case if it's conditionally set. In SafeToReplaceForward we know the condition that originally set it, etc, etc. Submitted better fix plus a test.

pik33 · 2023-08-13 10:27

sub do_gettime

dim lo1, hi1 as ulong
dim lo2, hi2 as ulong
dim t1 as expr_result

asm 
   getct hi1 wc
   getct lo1
end asm   
lo2=lo1 : hi2=hi1
t1.result.twowords(0)=lo2
t1.result.twowords(1)=hi2
t1.result_type=result_uint 
push t1 
end sub

The result:

D:/programowanie/p2-retromachine/Propeller/Basic/basic018.bas:2419: error: Variable hi1 must be placed in memory (probably due to an @ expression) and hence cannot be accessed in inline assembly
D:/programowanie/p2-retromachine/Propeller/Basic/basic018.bas:2420: error: Variable lo1 must be placed in memory (probably due to an @ expression) and hence cannot be accessed in inline assembly

The error appears only if I call push t1. If push t1 is not called, then the error doesn't occur. Push does this:

sub push(t1 as expr_result)

if stackpointer<maxstack then 
  stack(stackpointer)=t1
  stackpointer+=1
' error reporting here  
endif
end sub

lo2, hi2 was declared to avoid this error by letting lo1, hi1 be used only for asm part and assign inside the procedure, but it didn't help at all.

What "helped", was changing the declaration of push:

sub push( byval t1 as expr_result)

But after this, the procedure doesn't work any more, pushing all zeros. It is called also by do_rnd: after 'byval' do_rnd also return an empty result with all zeros in it.
So 'byval' had to be removed.

Then I tried to do nothing with lo1, hi1 at all:

sub do_gettime

dim lo1, hi1 as ulong
dim t1 as expr_result

const asm 
   getct hi1 wc
   getct lo1
end asm   

t1.result.twowords(0)=1 
t1.result.twowords(1)=2 
t1.result_type=result_uint 
push t1 
end sub

and still got the same, now very stupid, error message.

D:/programowanie/p2-retromachine/Propeller/Basic/basic018.bas:2421: error: Variable hi1 must be placed in memory (probably due to an @ expression) and hence cannot be accessed in inline assembly
D:/programowanie/p2-retromachine/Propeller/Basic/basic018.bas:2422: error: Variable lo1 must be placed in memory (probably due to an @ expression) and hence cannot be accessed in inline assembly

Tried for "opt(0)" - no change

The workaround that works:

sub do_gettime

dim hi2, lo2 as ulong

dim t1 as expr_result
hi2,lo2=do_gettime2()
t1.result_type=result_uint
t1.result.twowords(0)=lo2
t1.result.twowords(1)=hi2
push t1
end sub

function do_gettime2() as ulong,ulong

dim lo1, hi1 as ulong

const asm 
   getct hi1 wc
   getct lo1
end asm   

return hi1, lo1
end function

Now the asm and call to push are in different functions. No errors.

The summary:
' byval' seems to not work with a class, without any warning,
calling a function that passes a class as a parameter inside a function that has asm in it, generates strange error messages

ersmith · 2023-08-13 20:10

@pik33 : The error message is poorly worded. As it stands now, if any variable within a subroutine or function must be placed on the stack, they all do.

evanh · 2023-08-14 06:56

Yes, that's something I've bumped into a few times with inline assembly myself. Having a local static variable in C will trigger it. In Spin at least, creating a reference to a local variable also triggers this function wide effect.

evanh · 2023-08-17 09:50

Eric,
I'm guessing you've already seen this Flexspin warning from Chip's code ... warning: used tabs for indentation (previous lines used spaces) Tabs between a case entry and a comment on the same line will trigger it. Eg:

    case first.[1..0]
      %00:                      'head/last
        ByteCount := first.[31..20] - $040
        return
      %01:                      'head/more
        ByteCount := $FFC - $040
      %10:                      'body/last
        ByteCount += first.[31..20] - $004
        return
      %11:                      'body/more
        ByteCount += $FFC - $004

ersmith · 2023-08-17 17:36

Thanks @evanh . I think this should be better in the github source code now.

pik33 · 2023-08-21 09:55

sin(x) fails if x>4*pi. It returns 0 then.

ersmith · 2023-08-21 15:42

@pik33 said:
sin(x) fails if x>4*pi. It returns 0 then.

Thanks, this should be improved now (although it's still true that the further away from the origin the number is the less accurate the results will be, at least they don't drop immediately to 0 at 4*pi).

pik33 · 2023-08-27 12:38

I managed to generate a segfault.

This line had an error in it:

 t1.result=varptr(variables(compiledline(lineptr_e).result.uresult).value)

Should be

 t1.result.uresult=varptr(variables(compiledline(lineptr_e).result.uresult).value)

as t1.result is an union.

Got a segfault instead of an error message.

ManAtWork · 2023-08-28 09:02

@ersmith Does Flexspin already support the PASM level debugger? I get an error message error: syntax error, unexpected end of line, expecting '(' if I place a debug without braces in my code. The debug command itself should be trivial because it translates to a simple brk #0. But I guess some more code in the protected RAM area is needed to communicate with the debugger on the PC side. Could this simply be copied from Chip's code? AFAIK it doesn't depend on any symbol information or other data from the compiler but just uses raw addresses.

pik33 · 2023-08-28 12:30

A feature for discussion/implement if possible: a disposable shared asm/DAT section

The current behavior is: if shared asm/file is declared, that file/data goes to the program area. That is OK if it is a part of the program

However, if the data is "load, use once and forget" type, for example to load and initialize the cog, or move it to the PSRAM, if any, that wastes the space for a program: after initializing, the code is no longer needed and can be used as heap and stack space.

The proposition is:

add a modifier keyboard, for example "disposable'
if used (FlexBasic mode example):

asm disposable
mydriver  file "mydriver.bin"
end asm

then "mydriver" should go up as high as possible, saving the space for the rest of the program/heap/stack.

pik33 · 2023-08-29 17:13

While the FlexBasic compiler accepts $100 as a hexadecimal number, val% does not. (and all my FlexBasic stuff uses $ for hexadecimal - old, 40 years long, habit... )

... edit: and 'val' does not accept even '&h100' syntax returning 0.0

ersmith · 2023-08-29 18:37

@pik33 : Disposable memory sounds like an interesting proposal, but how can the compiler / libraries know when it's safe to re-use the memory? I think we'd need to add explicit calls to do this, at which point it seems like providing general calls to add heap memory might be more useful.

An alternative would be to read the binary files from disk or flash into ordinary heap memory, which can then be managed with the existing functions.

(PS: I'll look at how val% is implemented -- it's probably re-using some C code that doesn't understand $)

pik33 · 2023-08-29 18:54

I think that the programmer knows it is disposable, so the compiler places it there and that's all. The compiler may alert if the initial stack pointer overlaps or is near the first disposable fragment of the code, so the user can lower the heap or do something to make the code shorter

calls to add heap memory might be more useful.

This is useful in combination with disposable memory blocks. Starting with a small heap, so the code fits, init all things that have to be started, then expand the heap, overwriting drivers.

Wuerfel_21 · 2023-08-29 20:07

@pik33 said:
This is useful in combination with disposable memory blocks. Starting with a small heap, so the code fits, init all things that have to be started, then expand the heap, overwriting drivers.

I think this could be very nicely handled by pre-allocating the "disposable" blocks into the heap at compile time. So after you're done using it, you just free it like a normal heap allocation. Zero extra complexity.

Wuerfel_21 · 2023-08-29 20:09

Though you can already compile the PASM drivers into a separate file and re-use the memory manually. For extreme example of doing this with flexspin on P1, check spin hexagon: https://github.com/IRQsome/Spin-Hexagon (check Rakefile and hexagon.spin - PASM blocks are built as separate files, included into DAT section and then re-used as display list buffer)

pik33 · 2023-08-29 20:39

you can already compile the PASM drivers into a separate file and re-use the memory manually.

That I already did, with a success : I have an obsolete now video driver and PSRAM driver placed somewhere in the flash and a procedure that loads it and initializes the cog. That disposable memory may made the process simpler, especially for an "end user" who simply flashes/loads a binary and it works instead of "run the flasher.bin first to flash the stuff, then run the program.bin to do things"

And the video driver in the flash is now obsolete so I have to create and reflash a new one, and I simply don't want to, until the interpreter, that's my current project, will grow too big. I still have 192 kB of heap declared. But in the case of the interpreter, this heap size decides how many resources you can have: variables, sprites, fonts and other stuff that can not be placed in the PSRAM.

ersmith · 2023-08-29 21:24

@pik33 said:
I think that the programmer knows it is disposable, so the compiler places it there and that's all. The compiler may alert if the initial stack pointer overlaps or is near the first disposable fragment of the code, so the user can lower the heap or do something to make the code shorter

Yes, I wasn't clear with my question. The compile time allocation is done by the programmer marking the block as disposable, but at run time the compiler doesn't know at what time during program execution it is safe to re-use that memory. That is, the actual disposal cannot happen automatically, the programmer will have to manually indicate at run-time when the blocks are no longer in use.

pik33 · 2023-08-29 21:27

Something like this?

coginit/cpu (... ,pointer, ...)
dispose(pointer)

flexspin compiler for P2: Assembly, Spin, BASIC, and C in one compiler

Comments