flexspin compiler for P2: Assembly, Spin, BASIC, and C in one compiler

ersmith · 2023-07-11 10:51

Returning large classes from a function does not work yet, as you discovered, and even when it does get fixed (soon I hope) it will be slow, because a lot of data will have to be copied around. It's far more efficient to use a BYREF or pointer parameter to set the result in this case.
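
The difference between the two approaches can be sketched in C (used here instead of BASIC purely for illustration). The `Token` type and both function names are hypothetical; this shows the general idea of letting the caller supply the storage, not how flexspin generates code.

```c
#include <stdint.h>

/* Hypothetical 7-field record standing in for a large BASIC class. */
typedef struct {
    int32_t kind;
    int32_t a, b, c, d, e, f;
} Token;

/* By-value return: the whole struct has to be copied back to the caller. */
Token make_token_by_value(int32_t kind)
{
    Token t = { kind, 1, 2, 3, 4, 5, 6 };
    return t;
}

/* Out-parameter version: the caller owns the storage and the callee
   fills it in place, so no temporary copy of the struct is needed. */
void make_token(Token *out, int32_t kind)
{
    out->kind = kind;
    out->a = 1;
    out->b = 2;
    out->c = 3;
    out->d = 4;
    out->e = 5;
    out->f = 6;
}
```

Both functions produce the same contents; only the second avoids moving the struct through a return value.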

pik33 · 2023-07-12 09:02

I tested the current version

Version 6.2.1 Compiled on: Jul 11 2023

Now I can assign a result of the function to a class variable in both cases (3 and 4 fields) and it works.

However, a test code from post 3210 doesn't compile now if the class has more than 3 fields. If it has 3 fields, it compiles and works. Adding the 4th field causes this:

"C:/Users/Piotr/Downloads/flexprop-6.1.5/flexprop/bin/flexspin" -2  -l --tabs=8 -D_BAUD=230400 -O1    --charset=utf8 -I "C:/Users/Piotr/Downloads/flexprop-6.1.5/flexprop/include"  "D:/Programowanie/P2-retromachine-1/Propeller/Basic/test5.bas" -DFF_USE_LFN -DFF_CODE_PAGE=852
Propeller Spin/PASM Compiler 'FlexSpin' (c) 2011-2023 Total Spectrum Software Inc. and contributors
Version 6.2.1 Compiled on: Jul 11 2023
test5.bas
D:/Programowanie/P2-retromachine-1/Propeller/Basic/test5.bas:15: error: Method reference on non-class expression
D:/Programowanie/P2-retromachine-1/Propeller/Basic/test5.bas:8: error: Method reference on non-class expression
child process exited abnormally
Finished at Wed Jul 12 10:59:47 2023

The lines that caused compilation errors are both "print"

print f().b

What is magic about the number 3, such that a class with more fields than that triggers the problem?

pik33 · 2023-07-12 11:58

Now I can go further with my first attempt to write an interpreter. The class I return from the function has 7 fields in it and it now works (I can assign it, and I don't use print anywhere as in the previous post).

While testing my code I encountered a strange result returned by mid$() called with 0 as its start position. I thought mid$(a$,0,1) would always return an empty string, but this was not true. Sometimes it returns something that is not an empty string.
Of course I shouldn't call mid$() with 0 as its start position (and I am not doing it anymore: a proper 'if' now skips mid$() if the start position held in the variable is less than 1), but the bug was very hard to find, as it worked... until it failed, after several calls.
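
The kind of guard described above might look like the following C sketch. It is not FlexBASIC's mid$() implementation; `safe_mid` is a hypothetical helper that treats positions as 1-based and yields an empty string for any start position below 1.

```c
#include <stddef.h>
#include <string.h>

/* Copy up to len characters of src, starting at 1-based position start,
   into dst (capacity dstsize, including the terminator). The result is
   an empty string when start < 1, len < 1, or start is past the end. */
char *safe_mid(char *dst, size_t dstsize, const char *src, int start, int len)
{
    size_t srclen, avail, n;

    if (dstsize == 0)
        return dst;               /* no room even for the terminator */
    dst[0] = '\0';
    if (start < 1 || len < 1)
        return dst;               /* the guard: reject start position 0 */

    srclen = strlen(src);
    if ((size_t)start > srclen)
        return dst;

    avail = srclen - (size_t)(start - 1);
    n = ((size_t)len < avail) ? (size_t)len : avail;
    if (n > dstsize - 1)
        n = dstsize - 1;          /* never overflow the destination */

    memcpy(dst, src + (start - 1), n);
    dst[n] = '\0';
    return dst;
}
```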

ersmith · 2023-07-12 15:08

@pik33 said:
Now I can assign a result of the function to a class variable in both cases (3 and 4 fields) and it works.

Great!

However, a test code from post 3210 doesn't compile now if the class has more than 3 fields. If it has 3 fields, it compiles and works. Adding the 4th field causes this:

The lines that caused compilation errors are both "print"

print f().b

That should be fixed now.

What is magic about the number 3, such that a class with more fields than that triggers the problem?

12 bytes / 3 longs is the limit for when classes can be returned in registers versus when they have to be returned in memory. Classes can be any size, so clearly there has to be some limit, and there are other parts of the compiler that restrict how many registers can be returned from functions (for example only 3 registers are available in the bytecode back end, I think).
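
As a rough model of that rule, assuming the 12-byte limit stated above and 4-byte fields, a C sketch could look like this. The struct types, the constant name, and the helper are hypothetical, and the real decision is made inside the compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* 3 longs of 4 bytes each, per the limit described above. */
#define MAX_REGISTER_RETURN_BYTES 12

/* Hypothetical classes with 3 and 4 long-sized fields. */
typedef struct { int32_t a, b, c; } ThreeFields;
typedef struct { int32_t a, b, c, d; } FourFields;

/* Returns 1 if a value of this size would fit in the return registers,
   0 if it would have to be returned through memory. */
int returned_in_registers(size_t size_bytes)
{
    return size_bytes <= MAX_REGISTER_RETURN_BYTES;
}
```

Under this model a 3-field class (12 bytes) stays in registers, while adding a 4th field (16 bytes) pushes it onto the memory path.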

pik33 · 2023-07-12 17:38

It seems the garbage collector interferes with redirecting channel #0 with

open SendRecvDevice(@v.putchar, nil, nil) as #0

The symptoms:

What I am trying to do is a retromachine-style Basic interpreter. This means it does a lot of things with strings while analysing the line, so the heap is used a lot. It gets its input from a USB keyboard and sends its output to the video driver.
To redirect prints to the video driver, I reopened channel #0 with the video driver's putchar() procedure.

After inputting several lines, my program stops printing using print while it still prints when using the driver text output procedures directly.

So I made the heap larger and that only made the problem appear later.
Then I tried to add _gc_collect at the start of the procedure. The thinking was... clean the heap before interpreting every line.

The printing stopped immediately after this.

The workaround was to reopen the channel after collecting the garbage.

_gc_collect()
open SendRecvDevice(@v.putchar, nil, nil) as #0

This works.

The problem doesn't affect built-in print to the serial terminal.

ersmith · 2023-07-13 12:03

@pik33 : Thanks for the garbage collection bug report, I'll look into it.

pik33 · 2023-07-13 17:19

It seems the new version works without a workaround

evanh · 2023-07-14 15:47

Eric,
I've bumped into the old glitch again. As per usual, I've got __asm volatile {} blocks again, although the bug is not presenting in that routine.
Flexspin Version 6.1.9-beta-v6.1.8-25-g568ad971 Compiled on: Jul 2 2023

New info gleaned is it triggers at 32 byte boundaries (and -1 of each boundary) of compiled binary size. eg: File sizes 1023, 1024, 1055, 1056, ...
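
That size pattern reduces to a remainder check, sketched here in C. The function name is hypothetical and it only restates the observation above (sizes on a 32-byte boundary or one byte short of it); it says nothing about the cause.

```c
#include <stdint.h>

/* Returns 1 if a binary of this size matches the observed pattern:
   an exact multiple of 32 bytes, or one byte less than a multiple. */
int is_suspect_size(uint32_t size_bytes)
{
    uint32_t r = size_bytes % 32u;
    return r == 0u || r == 31u;
}
```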

I can't remember how it manifested previously but this time the executable locks up, apparently, when timing gets tight. And is consistently locking up at the same place every time. It's the 20th call to this routine each run. The code is heavily tested already but I can't be 100% positive it isn't a timing flaw on my part because edits are prone to hiding the bug again.

The new info about file size was gleaned because adding characters to strings is a consistent way to bring the bug in and out at will - when it is present at all that is. Change actual code anywhere in the whole program and the bug will vanish completely.

I don't know if you want to, or even can, narrow this down but here's the source code. I was editing the debug lines 121 to 130 in function rxblock() when it went to custard on me. The lock-up happens at start of function rxcmd400() after changing SD clock rate to sysclock/4.

EDIT: Err, um, you can't use the program without Roger's SD card add-on board. Not the best test candidate for you to work with.

EDIT2: LOL, and of course, a newer compiler build produces a different sized binary from this source code. So the problem goes away and no certain way to get it back. Typical.

pik33 · 2023-07-17 14:52

How to get a pointer to a function/sub?

I tried

dim testptr as any pointer
testptr=@do_fcircle : print testptr

Got $1a32c

In the listing, the do_fcircle starts at $077F8, and $1a32c points to objmem at the end of the program.

Edit: do_fcircle is a sub. Will try with a function....

ersmith · 2023-07-17 16:30

@pik33 said:
How to get a pointer to a function/sub?

With @, just as you did in your example. Note though that a method/function pointer is not simply the address of the method or function; it also has to include information about the object associated with the method. Different compiler targets handle this differently. In the P2 assembly backend the upper 12 bits are an index into a method table, and the lower 20 bits point to the object data.
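
The 12-bit/20-bit split described above can be sketched in C as a pair of helpers. This only illustrates the layout as stated in the post; the type and function names are hypothetical, and the real encoding may differ between flexspin versions and back ends.

```c
#include <stdint.h>

/* The two parts of a P2 asm backend method pointer, as described above. */
typedef struct {
    uint32_t method_index;  /* upper 12 bits: index into a method table */
    uint32_t object_addr;   /* lower 20 bits: address of the object data */
} MethodPtrParts;

/* Split a raw 32-bit method pointer into its two fields. */
MethodPtrParts split_method_ptr(uint32_t raw)
{
    MethodPtrParts parts;
    parts.method_index = raw >> 20;
    parts.object_addr  = raw & 0xFFFFFu;
    return parts;
}

/* Pack a method index and an object address back into one 32-bit value. */
uint32_t join_method_ptr(uint32_t method_index, uint32_t object_addr)
{
    return ((method_index & 0xFFFu) << 20) | (object_addr & 0xFFFFFu);
}
```

Applied to the value printed earlier, `split_method_ptr(0x1a32c)` gives method index 0 and object address $1a32c, which is consistent with the pointer landing in objmem rather than at the start of do_fcircle.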

pik33 · 2023-07-17 17:16

Then how to - in FlexBasic compiled to a P2 asm - get an address to the function or something I can call ?
And then, how to call it?
This is another Basic interpreter problem. I have to create a list of functions to call, then call them according to the list. I tried the @, but I don't know what I can do with the result.
Maybe I can declare an array of variables that are functions - and then for i=1 to n :fun(i)() : next i - is it possible?

And I wonder why the @ operation returned a pointer to objmem label at the end of the program. There are zeros there - or is it initialized at runtime?

pik33 · 2023-07-17 17:28

Solved.

var testptr=varptr(do_fcircle)

Then

  asm
  call testptr
  end asm

and there is a circle on the screen

ersmith · 2023-07-17 23:09

@pik33 said:
Then how to - in FlexBasic compiled to a P2 asm - get an address to the function or something I can call ?
And then, how to call it?

You can call it directly from BASIC. Just declare it as a function pointer, something like:

dim f as function getx(y as integer) as integer
f = @some_method
a = f(b)

You can create an array of such pointers, too. It's probably easiest to do it with a type definition, something like:

sub f1(x as integer)
  print "called f1 "; x
end sub

sub f2(x as integer)
  print "called f2 "; x
end sub

type int1sub as sub(x as integer)
dim fa(2) as int1sub

fa(0) = @f1
fa(1) = @f2

print "calling"

for i = 0 to 1
  fa(i)(99)
next i

print "done"

And I wonder why the @ operation returned a pointer to objmem label at the end of the program. There are zeros there - or is it initialized at runtime?

Because the function is a method of the default class, and the default class memory starts at label objmem. Remember, the internal representation of a function pointer is as a 12 bit method index (which in this case may have been 0) and a 20 bit pointer to the object memory.

ersmith · 2023-07-17 23:11

@pik33 said:
Solved.

var testptr=varptr(do_fcircle)

Then
  asm
  call testptr
  end asm
and there is a circle on the screen

If that worked it was probably by accident. I would not recommend trying to call BASIC functions via a method pointer in inline assembly, your code will be heavily dependent on the flexspin version used to compile.

Mickster · 2023-07-18 04:36

@ersmith

Oh this is nice and exactly what I was enquiring about recently when I asked "do we have code ptrs"

@pik33's solution worked for me and I was clicking my heels, but your 2nd example is exactly what I need.

Great stuff, many thanks

Craig

pik33 · 2023-07-18 05:46

If that worked it was probably by accident. I would not recommend trying to call BASIC functions via a method pointer in inline assembly, your code will be heavily dependent on the flexspin version used to compile.

The method with an array of function variables is more readable and elegant, so I will start with that, but let varptr remain there. This is logical: varptr is an absolute address of something, so if this something is a function, it returns its start address. I have a use for this too: resident code and a table of its functions. Let the resident code remain in upper RAM while the new program loads from #0. Function pointers that are not addresses will not work there.

rogloh · 2023-07-19 12:12

@ersmith Hit a problem tonight with flexspin. I am using something based on 6.2.0 so it may have been fixed by now, but I noticed that the flexspin application will crash with a segmentation fault if you compile code with extra parentheses on a SEND pointer assignment statement by mistake. It's building fine of course without this (as should be the case) but I think it would be good not to crash the flexspin application if they are present by mistake. Maybe it would be better to generate an error/warning instead if this is detected.

Propeller Spin/PASM Compiler 'FlexSpin' (c) 2011-2023 Total Spectrum Software Inc. and contributors
Version 6.2.0-beta-v6.1.7-2-g588815b8 Compiled on: Jun 15 2023
proplcd.spin2
|-ers_fmt.spin2
|-SmartSerial.spin2
zsh: segmentation fault  ~/Applications/spin2cpp/build/flexspin -2 proplcd.spin2

Offending code was this snippet.

OBJ
    f:"ers_fmt"
    uart:"SmartSerial"

PUB main() | c
    uart.start(115200)
    send:=@uart.tx()  ' <-------- my flexspin seg faults if the () are included at end of line, okay without
    pinh(DATAPIN addpins 4)
    send("Running",13,10)
    repeat
        c:=uart.rx()
        sendspi(c,0,0)
        sendspi(c,1,0)
        sendspi(c,0,1)

ersmith · 2023-07-19 14:29

@pik33 : I can't guarantee anything about the behavior of varptr with functions, because it was never supposed to work the way it happens to and I don't know why it isn't returning a proper method pointer. I'll leave it alone for now but please don't rely on it -- use real method pointers (with function or subroutine type) instead. Those have tests and will work. I've added some additional documentation which I hope will make how to use method pointers a bit clearer.

@rogloh : thanks for the bug report, that crash is fixed in 6.2.3.

ersmith · 2023-07-19 14:32

I've released FlexProp 6.2.3. This version has quite a few bug fixes and adds some missing string functions to the C library.

It may be downloaded from my Patreon page (see my signature) or from https://github.com/totalspectrum/flexprop/releases .

pik33 · 2023-07-19 14:50

@ersmith said:
@pik33 : I can't guarantee anything about the behavior of varptr with functions, because it was never supposed to work the way it happens to and I don't know why it isn't returning a proper method pointer. I'll leave it alone for now but please don't rely on it -- use real method pointers (with function or subroutine type) instead. Those have tests and will work. I've added some additional documentation which I hope will make how to use method pointers a bit clearer.

@rogloh : thanks for the bug report, that crash is fixed in 6.2.3.

I started to implement the interpreter's runtime procedure using standard method pointers. These integers returned by varptr are usable only for calling something that takes no parameters and returns no results.
However, something that returns the real function entry point may be useful to have, for example for debugging (let the function log its entry address somewhere to save the time spent searching for it in the .lst file), and it doesn't harm anything. It is already implemented, so let it be there.

RichardB · 2023-07-19 18:50

@ersmith said:

@pik33 said:
Then how to - in FlexBasic compiled to a P2 asm - get an address to the function or something I can call ?
And then, how to call it?

You can create an array of such pointers, too. It's probably easiest to do it with a type definition, something like:

I tried this example, I see the logic and usefulness behind it, and it seems it should work. However when I tried it, (Copied and pasted into 6.2.3) I get 'called f1 99' twice.

I think I should have got one 'called f1 99' and one 'called f2 99'

What is wrong?

sub f1(x as integer)
  print "called f1 "; x
end sub

sub f2(x as integer)
  print "called f2 "; x
end sub

type int1sub as sub(x as integer)
dim fa(2) as int1sub

fa(0) = @f1
fa(1) = @f2

print "calling"

for i = 0 to 1
  fa(i)(99)
next i

print "done"

ersmith · 2023-07-19 19:08

The parsing of the line fa(i)(99) is a bit dodgy. You could try this instead:

sub f1(x as integer)
  print "called f1 "; x
end sub

sub f2(x as integer)
  print "called f2 "; x
end sub

type int1sub as sub(x as integer)

dim fa(2) as int1sub
dim p as int1sub

fa(0) = @f1
fa(1) = @f2

for i = 0 to 1
  p = fa(i)
  p(99)
next i

print "done"
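
For comparison, the same dispatch-table pattern written in C, with the table entry copied into a temporary before the call as in the BASIC workaround above. This is only a host-side sketch: `call_log`, `call_count`, and `run_all` are hypothetical names added so the call order can be checked.

```c
#include <stdio.h>

/* Pointer to a procedure taking one integer, like "type int1sub" above. */
typedef void (*int1sub)(int x);

/* Records which handler ran, in order (hypothetical, for checking only). */
static int call_log[2];
static int call_count = 0;

static void f1(int x)
{
    printf("called f1 %d\n", x);
    call_log[call_count++] = 1;
}

static void f2(int x)
{
    printf("called f2 %d\n", x);
    call_log[call_count++] = 2;
}

/* Call every entry of the table once with the same argument. */
void run_all(int arg)
{
    int1sub fa[2] = { f1, f2 };
    int i;

    call_count = 0;               /* start a fresh log on each run */
    for (i = 0; i < 2; i++) {
        int1sub p = fa[i];        /* fetch the table entry first ... */
        p(arg);                   /* ... then call through the temporary */
    }
}
```

In C, `fa[i](arg)` would work equally well; the temporary simply mirrors the BASIC version, where the two-step form sidesteps the parsing problem.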

RichardB · 2023-07-20 11:03

@ersmith said:
The parsing of the line fa(i)(99) is a bit dodgy. You could try this instead:

Thanks!

evanh · 2023-07-21 20:50

I think I've worked out that I did indeed have a timing issue. If the response code missed any clocks it got stuck. The earlier code must have been teetering on the edge of missing the first pulse, which would be tripped up by worst-case hubRAM access timings.

I've worked out I can reliably create a similar situation by making the command/response clocks faster than sysclock/4. Then the rx smartpin itself becomes unreliable at seeing every clock pulse - Which has the same outcome of appearing to lock up for the same reason. I still assume the rx smartpin is counting all clock pulses. Time to fix that ...

EDIT: Err, looks like it was still my code in this case too. It can't quite keep up at draining the rx smartpin when data arrives at sysclock/3.

ManAtWork · 2023-07-22 09:27

void BackpropAsm (AxisBuffEntry* buffer, int wri, int rdi)
// backpropagate deceleration ramp, assembler version
{
  uint32_t dlo = 0;
  uint32_t dhi = 0;
  uint32_t v2lo = halfAcc2;
  uint32_t v2hi = 0;
  fixp30_t v1, vMax;
  uint32_t* ptr;
  uint32_t hA = halfAcc;
  uint32_t nA = nomAcc;
  uint32_t mlo, mhi;
  __asm{
  .loop
    decmod wri,#AXIS_BUFFER_SIZE-1
    mov    ptr,wri
    shl    ptr,#6
    add    ptr,buffer
    add    ptr,#40   // v1= buffer[i].vNom
    rdlong v1,ptr
    mov    vMax,hA
    add    vMax,v1
    qmul   vMax,vMax // vMax² (<<<line 291)
    add    dlo,v1 wc
    addx   dhi,#0
    qmul   v1,nA     // v1 * nomAcc (pipelined CORDIC)
    add    ptr,#8
    setq   #2-1
    wrlong dlo,ptr   // buffer[i].dDec= dist
    getqx  mlo
    getqy  mhi  // first getqX/Y
    shr    mlo,#30
    mov    v1,mhi
    shl    v1,#2
    shr    mhi,#30
    or     mlo,v1
    cmp    v2lo,mlo wcz // if (v2 > vMax²) break
    cmpx   v2hi,mhi wcz
    getqx  mlo
    getqy  mhi // second getqX/Y
  if_a  jmp #.break
    mov    v1,mlo
    shl    mhi,#3
    shl    mlo,#3
    shr    v1,#32-3
    or     mhi,v1
    add    v2lo,mlo wc // v2+= (v1 * nomAcc) * 8
    addx   v2hi,mhi
    cmp    wri,rdi wz
  if_nz  jmp #.loop
  .break
  }
}

E:/Projekte/BeamiconII/Source/BeamiModular/P2/B2M_Axes.c:291: warning: Deleting apparently unused cordic instruction qmul

Huh, what? [drama mode on]
How have I deserved this? My wife tells me what clothes to wear. Traffic signs tell me to drive slower than I could. The tax office withdraws money from my account without asking... Programming in assembler is my last resort of true freedom! I can tell the machine exactly what to do and exactly in the way I want. Every single instruction is there for a reason. I don't claim to be perfect and I'm happy being warned of errors like a forgotten '#' and things like that. But no one should dare to optimize away any command I put there!
[drama mode off]

evanh · 2023-07-22 10:01

I've never hit that message myself, but a known workaround would be to declare the assembly as const. This keeps the optimiser out and also forces it to stay inline hubexec. Eg:

  __asm const {
  .loop
    decmod wri,#AXIS_BUFFER_SIZE-1
    mov    ptr,wri
    shl    ptr,#6
    add    ptr,buffer
    add    ptr,#40   // v1= buffer[i].vNom
    rdlong v1,ptr
    mov    vMax,hA
    add    vMax,v1
    qmul   vMax,vMax // vMax² (<<<line 291)
    ...
  }

ersmith · 2023-07-22 11:07

@ManAtWork : if you want your inline asm code to be left alone by the optimizer you have to declare it as __asm const (if it should be run from HUB) or __asm volatile (if it should be run from FCACHE). This is deliberate: the libraries use inline assembly and I actually do want that to be optimized, because there are frequently cases where the library code can be inlined and optimized (e.g. when we know some parameters are constant).

ManAtWork · 2023-07-22 13:00

Yes, you are right. But deleting the first qmul is still a mistake as it alters the semantics. Optimizations should never result in different computation results, only in different timing or memory usage. I think the optimizer just doesn't know about pipelined use of the CORDIC unit.

Wuerfel_21 · 2023-07-22 13:40

@ManAtWork said:
Yes, you are right. But deleting the first qmul is still a mistake as it alters the semantics. Optimizations should never result in different computation results, only in different timing or memory usage. I think the optimizer just doesn't know about pipelined use of the CORDIC unit.

I wrote the CORDIC-related optimization passes. Pipelined CORDIC ops are very tricky. Whether or not deleting the QMUL affects the program behaviour is actually uncomputable in the general case. It depends on the (potentially unpredictable) timing of other instructions in between. Therefore the optimizer operates under the constraint/assumption that each CORDIC op has its results read before another is put into the pipeline, which allows it to perform useful operations on generated code.

ManAtWork · 2023-07-22 14:41

Ok, I see. I'll have to use __asm const or __asm volatile. Pipelined CORDIC ops are very tricky, I agree. But they are very useful if you need lots of them. I make extensive use of them for motion control applications such as Park or Clarke transformations.
