flexspin compiler for P2: Assembly, Spin, BASIC, and C in one compiler

evanh · 2025-01-31 21:16

@evanh said:
File size for -O1 is 84_844 bytes
File size for -O2 is 81_980 bytes

That was for my custom loaded 4-bit SD mode driver.

For the built-in SPI mode driver the sizes are:
File size for -O1 is 71_592 bytes
File size for -O2 is 69_208 bytes

So -O2 saves 2864 bytes with the custom driver versus 2384 bytes with the built-in one, i.e. 2864 - 2384 = 480 bytes extra when the driver is a separate object. Given the custom driver is bigger, a decent amount of that could just be the driver itself being optimised.

ersmith · 2025-01-31 23:23

@evanh said:
Oddly, it also showed up two bugs:

One was a non-existent symbol being referenced in my driver. I've fixed the bug but am puzzled as to why it didn't produce a compile error when compiling the C version of the SD tester program.

Perhaps the function containing the non-existent symbol was never referenced? If so it can sometimes be skipped at a very early stage so we don't even see those kinds of errors. This is less likely to happen when the function is in an object/struct (because we have to assume all public functions in an object might be called).

The other is include/fcntl.h:56: error: syntax error, unexpected identifier 'mode_t'. It's missing an include of its own, I guess. My current fix is to shift the order of includes in my driver. Again, the C version of the tester didn't produce any error here even though it's the identical driver code.
EDIT2: Come to think of it. A compile warning that I'd earlier dismissed shows up this same way - include/filesys/fatfs/fatfs_vfs.c:96: warning: mixing pointer and integer types in return: expected pointer to _struct__vfs but got int

These are both bugs in the header files. I've checked in some fixes. Thanks for catching these!

evanh · 2025-02-01 01:53

One was a non-existent symbol being referenced in my driver. I've fixed the bug but am puzzled as to why it didn't produce a compile error when compiling the C version of the SD tester program.

Solved! The driver was pilfering from an identically named Enum in the top-level program. The values were also the same, so it didn't cause any buggy behaviour. The Spin2 CON symbols at top level were named differently, so the bug only showed up then. I'd revised the Spin2 program as I wrote it, so there are some minor tweaks like that.

I've changed the C tester program to now use the same Enum symbol names as the Spin2 tester program.

Huh, that kind of begs the question, can I legally use Enums at the top level to set driver compile time presets?

Rayman · 2025-02-02 16:26

Is printf() blocking? I.e., does it not return until the string is completely sent?
Or, is it doing something with buffers and/or interrupts to return before string is sent?

evanh · 2025-02-02 19:26

I presume the one-deep hardware buffer will be used but, either way, it's still going to be blocking on a whole string.
The Flex compiler doesn't support interrupts.

Rayman · 2025-02-02 19:35

Ok, I was thinking Flex doesn't allow interrupts because it uses them itself and doesn't want conflicts. But maybe I'm remembering wrong...

evanh · 2025-02-02 19:44

There are a number of features of the Prop2 that require multi-instruction atomic actions, e.g. SETQ+RDLONG for a fast copy. An interrupt splitting them would bust that. The Cordic's pipelining is also timing sensitive; an interrupt could lose Cordic results.

There are other examples, but suffice to say each and every case needs careful explicit shielding before interrupts can be allowed to operate on the same cog.

EDIT: Err, the SETQ might shield one subsequent instruction automatically.
The hidden Q register has to be managed though. It can be trashed by a number of instructions. If an ISR uses Q, even inadvertently, it prolly needs to stash and restore Q's content. This wasn't considered during design so there isn't an explicit instruction for reading Q. MUXQ can be configured to do it though.

evanh · 2025-02-02 20:32

Looking at my SD block device driver code, the CRC processing would need to be modified to be friendly to interrupts. Not because it won't work, but because it naturally shields from interrupts for a very long interval by using a large REP loop.

    if_z    rep @.rend, #512/4+16/8    // one SD block + CRC
    if_z    alti    pb, #0b100_111_000    // next D-field substitution, then increment PB, result goes to PA
    if_z    movbyts pa, #0b00_01_10_11    // byte swap within longword
    if_z    splitb  pa    // 8-nibble order swapped to bit order of 4 bytes
    if_z    setq    pa
    if_z    crcnib  crc3, poly
    if_z    crcnib  crc3, poly
    if_z    crcnib  crc2, poly
    if_z    crcnib  crc2, poly
    if_z    crcnib  crc1, poly
    if_z    crcnib  crc1, poly
    if_z    crcnib  crc0, poly
    if_z    crcnib  crc0, poly
.rend

130 loops of 12 instructions ... 130 x 24 + 2 = 3122 sysclock ticks.
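
The tick count above can be reproduced with a few lines of C. This is only a sketch of the arithmetic in the post: the 2 ticks per instruction and the 2 extra ticks for the REP itself are read off the 130 x 24 + 2 figure, the function names are made up for illustration, and the 200 MHz sysclock used afterwards is an assumed example value, not something stated in the thread.

```c
/* Assumptions taken from the figures in the post: every instruction in the
   loop costs 2 sysclock ticks, and the REP instruction adds 2 more. */
enum { TICKS_PER_INSTR = 2, REP_OVERHEAD_TICKS = 2 };

/* Number of REP iterations, exactly as written in the REP operand:
   one pass per longword of a 512-byte block, plus the CRC passes. */
unsigned rep_iterations(void)
{
    return 512 / 4 + 16 / 8;
}

/* Total sysclock ticks during which the REP loop holds off interrupts. */
unsigned rep_shield_ticks(unsigned instrs_per_loop)
{
    return rep_iterations() * instrs_per_loop * TICKS_PER_INSTR
           + REP_OVERHEAD_TICKS;
}

/* Convert ticks to microseconds for a given sysclock in MHz
   (the clock frequency is a hypothetical input). */
double ticks_to_us(unsigned ticks, double sysclock_mhz)
{
    return (double)ticks / sysclock_mhz;
}
```

With these assumptions rep_shield_ticks(12) gives 3122, and at an assumed 200 MHz sysclock that is roughly 15.6 microseconds of interrupt latency added by this one loop.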

Possibly the only shielding I'd need to add is for five instances of this across the whole driver:

        xinit   m_align, #0    // lead-in delay from here at sysclock/1
        setq    v_nco        // streamer transfer rate (takes effect with buffered command below)
        xzero   m_dat, #0     // rx buffered-op, aligned to clock via lead-in
        dirh    p_clk    // clock timing starts here
        wypin   clocks, p_clk    // first pulse outputs during second clock period

Just have to add a REP #5,#1 ahead of each XINIT.

Might have to also reorder a few instructions where I'm currently counting on an instruction to not take effect instantly.

My SETQs are all safe I think and I don't use the Cordic in the Pasm code, therefore it shouldn't matter if Q got trashed by an ISR - not speaking for any of the C code though.

Wuerfel_21 · 2025-02-02 20:57

@evanh said:
There are a number of features of the Prop2 that require multi-instruction atomic actions. eg: The Cordic's pipelining is timing sensitive, an interrupt could lose Cordic results.

The flexspin code generator never does anything like that on its own. As long as the ISR doesn't touch CORDIC or FIFO state, everything should be fine. However, it will generate REP loops, block fills and other interrupt-blocking sequences. You could set up an ISR by writing it in assembly and manually loading it into the explicitly free LUT area. But unsupported at your own risk etc. You may also find that flexspin has slightly relaxed memory semantics (i.e. a value that is semantically in hub RAM may be cached in a register when it is used multiple times) but that's also visible to regular multi-core code.

evanh · 2025-02-02 21:13

@Wuerfel_21 said:
... As long as the ISR doesn't touch CORDIC or FIFO state, everything should be fine.

Ah, yeah, my block driver uses the FIFO, with the Streamer. It would be catastrophic if an ISR used RDFAST for example.

However, it will generate REP loops, block fills and other interrupt-blocking sequences. ...

Might be fun to do some interrupt jitter measuring ...

evanh · 2025-02-02 21:21

I also heavily use SETSE1 and SETSE2. An ISR using those would destroy the driver too.

ersmith · 2025-02-03 12:47

@Rayman said:
Is printf() blocking? I.e., does it not return until the string is completely sent?
Or, is it doing something with buffers and/or interrupts to return before string is sent?

It depends on what file the printf output is going to. For the default case (serial output) it blocks until the last character has been put in the smartpin output buffer; it'll then take a little longer for that character to actually get to the host.
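
To get a feel for how long that blocking lasts, here is a rough estimate in C. It is a sketch under stated assumptions, not a description of flexspin's serial code: it assumes 10 bits on the wire per character (8N1 framing) and no gaps between characters, and the function name and the baud rate in the note below are hypothetical.

```c
#include <stddef.h>

/* Rough upper bound on how long a string takes to leave the serial pin.
   Assumptions (not from the thread): 8N1 framing, so 10 bits per
   character, and back-to-back characters with no idle time. */
double serial_tx_seconds(size_t nchars, unsigned long baud)
{
    const double bits_per_char = 10.0;  /* start + 8 data + stop */
    return (double)nchars * bits_per_char / (double)baud;
}
```

For a 100-character string at an assumed 115200 baud this comes to about 8.7 ms. Per the description above, printf() would return a little before that, once the last character is in the smartpin buffer.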

evanh · 2025-02-17 23:36

Eric,
I want to duplicate, in Spin2, a technique that I've copied of yours where, in C, you created some magical way to have the SD driver as a local scoped pointer to malloc()'d memory. Yet, the driver still has to be brought in at compile time. Which I'm scratching my head over.

My example use:

FILE * mountsd( void )
{
    struct __using("blkdrvr/sdmm_bashed.cc") *DRV;
    FILE *handle;

    DRV = _gc_alloc_managed(sizeof(*DRV));

    _seterror(0);
    handle = DRV->_sdmm_open(CLK_RL, CS_RL, MOSI_RL, MISO_RL);
    if( !handle )  {
        printf(" device open failed!   errno = %d: %s\n", errno, strerror(errno));
        _gc_free(DRV);
        exit(1);
    }
    mount("/sd", _vfs_open_fat_handle(handle));

    return handle;
}

ersmith · 2025-02-18 12:12

@evanh said:
I want to duplicate, in Spin2, a technique that I've copied of yours where, in C, you created some magical way to have the SD driver as a local scoped pointer to malloc()'d memory. Yet, the driver still has to be brought in at compile time. Which I'm scratching my head over.

An object is just (1) a set of methods, and (2) some memory for the member variables (and also (3) some static DAT section, but that's shared by all instances of an object so it isn't very interesting). In C/C++ terms a Spin object is just a C++ class, or equivalently a C struct with some associated functions. So to get a dynamically allocated instance of an object you just have to make sure you have enough memory for the member variables (which sizeof() can tell you).

I can't remember if the official Spin2 compiler supports object pointers, but flexspin does, and the syntax is like:

OBJ sdmm = "blkdrv/sdmm_bashed.cc"  ' = instead of : to declare an "abstract" object
VAR long sdptr
...
sdptr := _gc_alloc_managed(sizeof(sdmm)) ' get space for member variables
sdmm[sdptr].open(CLK_RL, CS_RL, MOSI_RL, MISO_RL)

One thing to watch out for is that the allocated memory is not initialized to 0, so if your methods expect this you have to set it manually.
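
The "object is just methods plus member memory" idea can be sketched in plain C. This is an analogy only: `struct drv`, `drv_new()` and `drv_open()` are hypothetical names, and standard `malloc()` stands in for `_gc_alloc_managed()`. The explicit `memset()` reflects the point above that the allocated memory is not zeroed for you.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical driver "object": the struct holds the member variables,
   and functions taking a pointer to it play the role of methods. */
struct drv {
    int clk_pin;
    int is_open;
};

/* "Method": relies on is_open starting out as 0. */
int drv_open(struct drv *self, int clk_pin)
{
    if (self->is_open)
        return -1;              /* already open */
    self->clk_pin = clk_pin;
    self->is_open = 1;
    return 0;
}

/* Allocate one instance. malloc() is a stand-in for _gc_alloc_managed();
   the memory is zeroed by hand because the allocator is not assumed
   to do it. */
struct drv *drv_new(void)
{
    struct drv *self = malloc(sizeof(*self));
    if (self)
        memset(self, 0, sizeof(*self));
    return self;
}
```

Without the `memset()`, `is_open` could hold garbage and the first `drv_open()` call might fail for no visible reason, which is the kind of problem the warning above is about.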

Having said all that, why not just use C? You can mix C and Spin2 freely, so there's no great need to translate this kind of thing into Spin2. It's more conveniently expressed in C, and you can provide whatever API you want for Spin2 programs to call it.

evanh · 2025-02-18 12:54

@ersmith said:
An object is just (1) a set of methods, and (2) some memory for the member variables (and also (3) some static DAT section, but that's shared by all instances of an object so it isn't very interesting). In C/C++ terms a Spin object is just a C++ class, or equivalently a C struct with some associated functions. So to get a dynamically allocated instance of an object you just have to make sure you have enough memory for the member variables (which sizeof() can tell you).

I must admit I'm struggling with the OOP side of things. I never learnt C++. Thanks for some descriptions.

I can't remember if the official Spin2 compiler supports object pointers, but flexspin does, and the syntax is like:
```
OBJ sdmm = "blkdrv/sdmm_bashed.cc" ' = instead of : to declare an "abstract" object
VAR long sdptr
...
```

It's not the instancing that I'm after here. The way you did it in C looked like you were intentionally hiding the object existence inside a function's local namespace. Because, aside from the initial device open call, the object methods of the driver are only meant to be accessed via your underlying VFS layer.

Having said all that, why not just use C? You can mix C and Spin2 freely, so there's no great need to translate this kind of thing into Spin2. It's more conveniently expressed in C, and you can provide whatever API you want for Spin2 programs to call it.

I wanted to be able to provide an example integration of the driver plug-in for people coding in just Spin2.

I guess I'll just stick with the static object approach I already have. eg: https://forums.parallax.com/discussion/comment/1565714/#Comment_1565714

ersmith · 2025-02-18 14:16

@evanh said:

@ersmith said:
I can't remember if the official Spin2 compiler supports object pointers, but flexspin does, and the syntax is like:
```
OBJ sdmm = "blkdrv/sdmm_bashed.cc" ' = instead of : to declare an "abstract" object
VAR long sdptr
...
```

It's not the instancing that I'm after here. The way you did it in C looked like you were intentionally hiding the object existence inside a function's local namespace. Because, aside from the initial device open call, the object methods of the driver are only meant to be accessed via your underlying VFS layer.

Oh, no, I was just being lazy. The direct C equivalent to the Spin2 code would be something like:

typedef struct __using("blkdrv/sdmm_bashed.cc") SDMMObj;
...
SDMMObj *sdptr = _gc_alloc_managed(sizeof(SDMMObj));
sdptr->open(...)

I just combined the typedef and the use of SDMMObj into the same line.

Your static object approach seems perfectly reasonable and is probably more idiomatic Spin2. Thinking about it, it's probably better to avoid dynamic allocations anyway.

evanh · 2025-02-18 18:40

Ah, wrong direction. The C code of yours that I've worked from is in include/filesys/fatfs/fatfs_vfs.c
As far as I can see, you've created a local namespace instance with the purpose of hiding the object from all else. Then only the assigned VFS function pointers remain afterwards.

Here's the original complete function:

vfs_file_t *
_sdmm_open(int pclk, int pss, int pdi, int pdo)
{
    int r;
    int drv = 0;
    struct __using("filesys/block/sdmm.cc") *SDMM;
    unsigned long long pmask;
    vfs_file_t *handle;

    SDMM = _gc_alloc_managed(sizeof(*SDMM));

#ifdef _DEBUG
    __builtin_printf("sdmm_open: using pins: %d %d %d %d\n", pclk, pss, pdi, pdo);
#endif    
    pmask = (1ULL << pclk) | (1ULL << pss) | (1ULL << pdi) | (1ULL << pdo);
    if (!_usepins(pmask)) {
        _gc_free(SDMM);
        _seterror(EBUSY);
        return 0;
    }
    SDMM->f_pinmask = pmask;
    r = SDMM->disk_setpins(drv, pclk, pss, pdi, pdo);
    if (r == 0)
        r = SDMM->disk_initialize(0);
    if (r != 0) {
#ifdef _DEBUG
       __builtin_printf("sd card initialize: result=[%d]\n", r);
       _waitms(1000);
#endif
       goto cleanup_and_out;
    }
    handle = _get_vfs_file_handle();
    if (!handle) goto cleanup_and_out;

    handle->flags = O_RDWR;
    handle->bufmode = _IONBF;
    handle->state = _VFS_STATE_INUSE | _VFS_STATE_WROK | _VFS_STATE_RDOK;
    handle->read = &SDMM->v_read;
    handle->write = &SDMM->v_write;
    handle->close = &SDMM->v_close;
    handle->ioctl = &SDMM->v_ioctl;
    handle->flush = &SDMM->v_flush;
    handle->lseek = &SDMM->v_lseek;
    handle->putcf = &SDMM->v_putc;
    handle->getcf = &SDMM->v_getc;
    return handle;
cleanup_and_out:
    _freepins(pmask);
    _gc_free(SDMM);
    _seterror(EIO);
    return 0;
}

ersmith · 2025-02-18 19:44

@evanh said:
Ah, wrong direction. The C code of yours that I've worked from is in include/filesys/fatfs/fatfs_vfs.c
As far as I can see, you've created a local namespace instance with the purpose of hiding the object from all else. Then only the assigned VFS function pointers remain afterwards.

No, that was what I was talking about. There was no particular desire (or need) to hide the object from anything else. The line:

     struct __using("filesys/block/sdmm.cc") *SDMM;

is directly equivalent to:

    typedef struct __using("filesys/block/sdmm.cc") SDMMObj;
    SDMMObj *SDMM;

and the typedef can then be pulled out of the function without really changing anything. I mean, yeah, the typedef is then in the file namespace, but that shouldn't interfere with what any other file has.

Spin2's OBJ foo = "filename" acts the same as a typedef, creating an alias for the object without actually creating any instances of the object.

evanh · 2025-02-18 20:11

Hmm, why not just use a static C object then? No malloc()'s and it's less clutter in the source.

ersmith · 2025-02-18 20:42

@evanh said:
Hmm, why not just use a static C object then? No malloc()'s and it's less clutter in the source.

I think I wanted to provide for the (unlikely) possibility of having multiple SD cards in a system.

Rayman · 2025-02-25 01:49

Way back when, I got FatFs to compile as an add-on to FlexC. You can probably still do that and have all the uSD cards you want…

evanh · 2025-02-25 10:17

@Rayman said:
Way back when, I got FatFs to compile as an add-on to FlexC. You can probably still do that and have all the uSD cards you want…

If that's directed at me, I don't think I ever got it working properly ... nope, even now I can open two different drivers and copy between them just fine. But if I try to open two instances of the same driver then the first of the two instances is gone/unusable.

EDIT: I seem to have got it fully working with the sdmm_bashed.cc driver. A ripped and tweaked version of sdmm.cc works too.

evanh · 2025-02-25 10:23

The following code mounts fine and works. The second driver source file sdsd2.cc is literally a duplicated sdsd.cc.
However, if I change it to using "blkdrvr/sdsd.cc" for both drv1_t and drv2_t then sd2/ appears to replace sd1/

EDIT: Err, interesting. If I use a simpler driver then multiple instances do work!

ersmith · 2025-02-25 13:57

@evanh said:
The following code mounts fine and works. The second driver source file sdsd2.cc is literally a duplicated sdsd.cc.
However, if I change it to using "blkdrvr/sdsd.cc" for both drv1_t and drv2_t then sd2/ appears to replace sd1/

EDIT: Err, interesting. If I use a simpler driver then multiple instances do work!

If you want to post the latest version of the more complicated driver we could try to figure out why it doesn't. Generally as long as you avoid static variables it should be OK to have multiple instances.

evanh · 2025-02-25 21:28

I had lots of statics in driver namespace, none inside functions. I thought the whole struct __using thing hid all that.

EDIT: Right, thanks, all fixed up now. Tested and working. Converting those DMA parameter structures into non-statics meant I also had to change their respective Pasm code to use a runtime C assigned local pointer instead of compile-time symbolic address.

Finished driver, with tester programs - https://obex.parallax.com/obex/sdsd-cc/

ersmith · 2025-02-26 13:49

@evanh said:
I had lots of statics in driver namespace, none inside functions. I thought the whole struct __using thing hid all that.

static works in objects like it does in C++ classes, namely it defines something in the DAT section that's shared among all instances of the object. Sometimes that's what you want, but it definitely isn't right for some things.
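
A plain C analogy shows why shared statics break a second instance. This is a sketch, not flexspin's object model itself: `struct card`, `card_seek()` and the other names are hypothetical, and a file-scope `static` stands in for a `static` declared in the driver's namespace.

```c
/* One copy for the whole program, like a static (DAT) variable
   shared by every instance of an object. */
static int shared_sector;

/* Hypothetical per-card state: one copy per instance. */
struct card {
    int sector;
};

/* Records the sector both per instance and in the shared static. */
void card_seek(struct card *c, int sector)
{
    c->sector = sector;
    shared_sector = sector;   /* every instance overwrites the same variable */
}

int card_sector(const struct card *c)
{
    return c->sector;
}

int card_shared_sector(void)
{
    return shared_sector;
}
```

If two cards seek to different sectors, each `struct card` keeps its own value, but the shared static only remembers the last call. State kept in statics therefore behaves as if the second instance replaced the first, which fits the symptom reported earlier where sd2/ appeared to replace sd1/.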

evanh · 2025-02-26 19:03

Perfectly reasonable.
Thanks for the explain.

evanh · 2025-02-26 22:52

There is a way to assemble a 4-bit SD slot for testing if you don't mind getting out the soldering iron.

It does require having the components lying around of course. In particular you have to find an unused SD card reader from which to desolder the SD slot. This along with a 2x6 IDC female plug, some hookup wire and six resistors makes the complete assembly. Pictures here - https://forums.parallax.com/discussion/comment/1563018/#Comment_1563018

evanh · 2025-03-03 20:16

What options are there for the user program to query the cluster size of a mounted FAT volume?

ersmith · 2025-03-03 21:57

@evanh said:
What options are there for the user program to query the cluster size of a mounted FAT volume?

I don't think there is a way. FatFs has an ioctl for the sector size, but not for the cluster size.
