Shop Learn P1 Docs P2 Docs
flexspin compiler for P2: Assembly, Spin, BASIC, and C in one compiler - Page 93 — Parallax Forums

flexspin compiler for P2: Assembly, Spin, BASIC, and C in one compiler

19091939596103

Comments

  • evanhevanh Posts: 13,853
    edited 2022-04-16 04:12

    Cool.

    A couple of points I should make:
    - Both build paths use pasm2. I'm not sure if this source file is required to work with the Prop1 or not.
    - I've discovered that, above about 320 MHz sysclock (40 MHz SPI clock), the bit-bashed path isn't quite as rugged as the smartpins path has become. I had the advantage of easy fine tuning the clock high-low ratio with smartpins to give it more leeway.

    I'm thinking about retiring the optimised bit-bashed path anyway. Maybe revert to the old non-optimised path if Prop1 compatibility is desired.

    EDIT: On a tangent, there is no verification of valid data within the driver code. Getting verification might be a decent reason to eventually discontinue the use of SPI. I know CRC can be enabled with SPI but how much processing is that? It might be just as fast processing the full SD protocol.

  • pik33pik33 Posts: 1,817
    edited 2022-04-16 14:54

    I created a topic in Basic section - something is wrong, as changing the order of variable declaration changes the behavior of the code. Then I wanted to copy the output from the Flexprop's internal serial terminal and paste it to the forum topic editor.

    Then I discovered that I cannot do it (or I don't know how) - ctrl-c doesn't work on the internal terminal. To make a post I had to switch to the external terminal. Xubuntu 20.04

  • pik33pik33 Posts: 1,817
    edited 2022-04-21 17:34

    I compiled the new flexprop and spin2cpp

    The player no longer works as expected - it cannot open files.

    Something changed in the filesystem and the first impresion is the LFN support is now lost. If the file name is short, it opens. If the name is long, I cannot load the file.

  • iseriesiseries Posts: 1,286

    @pik33 ,

    You have to turn that on. It's not on by default.

    In this folder: \flexprop\include\filesys\fatfs find file ffconf.h.

    Look for this line:

     #define FF_USE_LFN     0
    

    Change the value from 0 to 1 and save it. Your good to go.

    Mike

  • pik33pik33 Posts: 1,817
    edited 2022-04-21 20:05

    The previous version uses -DFF_USE_LFN as a compiler line option and that was enough.

    But I tried to change the value to 1.

    It now cannot compile the player at all/ The result while compiling:

    ff.c:805: error: unknown identifier ff_oem2uni used in function call
    ff.c:805: error: Unknown symbol ff_oem2uni
    ff.c:887: error: unknown identifier ff_uni2oem used in function call
    ff.c:887: error: Unknown symbol ff_uni2oem
    ff.c:1886: error: unknown identifier ff_wtoupper used in function call
    ff.c:1886: error: Unknown symbol ff_wtoupper
    ff.c:2909: error: unknown identifier ff_uni2oem used in function call
    ff.c:2909: error: Unknown symbol ff_uni2oem
    

    Also I found cp932 is defined there. Japanese.

    Reverting now to the older version.

  • iseriesiseries Posts: 1,286

    Right, did run into that error, it has been a while.

    This is easy to fix. As I said 99% of the FatFs was used unchanged.

    Need to add a from file to this line to ff.h:

    WCHAR ff_oem2uni (WCHAR oem, WORD cp) __fromfile("filesys/fatfsx/ffunicode.c"); /* OEM code to Unicode conversion */
    

    Mike

  • ersmithersmith Posts: 5,516

    @pik33 : Sorry, I forgot that I had to add a #ifndef FF_USE_LFN in ffconf.h when I updated FatFs. The long file name support should be working again now.

  • evanhevanh Posts: 13,853

    Eric,
    I don't have any test hardware with me but this is a small streamline to the second half of rcvr_mmc() function. I think it's bug-free:

    // 8-bit rx (WFBYTE)
    rx_bytes
            and bc, #3  wz      // up to 3 bytes
        if_z    ret
    
            wypin   #8, PIN_CLK     // begin SPI clocks
            wxpin   #7 | 32, PIN_DO     // 8 bits, sample after rising clock
    rx_loop1
            sub bc, #1  wz      // check if last byte
    rx_wait1
            testp   PIN_DO  wc      // wait for received byte
        if_nc   jmp #rx_wait1
    
        if_nz   wypin   #8, PIN_CLK     // begin SPI clocks for next byte
            rdpin   r, PIN_DO       // byte from rx buffer
            rev r
    //      wfbyte  r           // store byte
            wrbyte  r, buff         // store byte
            add buff, #1
        if_nz   jmp #rx_loop1       // loop if more bytes
    
  • pik33pik33 Posts: 1,817

    In Basic, 1 shl 32 gives 1. Should be 0, as this 1 goes out of the register.

  • ersmithersmith Posts: 5,516

    @pik33 said:
    In Basic, 1 shl 32 gives 1. Should be 0, as this 1 goes out of the register.

    This happens in all languages; only the bottom 5 bits of the shift amount are used, and so 1 shl 32 is identical to 1 shl 0. That's the way the hardware works, and it would be expensive to try to work around this. I think you'll find something similar happens in FreeBASIC on x86, except that on a 64 bit machine it's 1 shl 64 that equals 1.

  • pik33pik33 Posts: 1,817
    edited 2022-04-22 17:23

    The code I am porting, that didn't work because of this, was written for a PC. They used int variable in Java and shl it from 1 to 32. Maybe this worked because PC CPUs are 64bit based, so they accept 0..63 as the shift amount.
    My ported procedure didn't work because of this and I didn't know why for a while.
    Now it is worked around by isolating the special case (n=32) using if

  • evanhevanh Posts: 13,853

    Eric,
    In disk_write() function in sdmm.c, I had earlier found a bug with order of execution in the case of single block writes. My fix was to replace this:

            if ((send_cmd(CMD24, sect) == 0)    /* WRITE_BLOCK */
                && xmit_datablock(buff, 0xFE))
                count = 0;
    

    with this:

            if( send_cmd(CMD24, sect) == 0 )    /* WRITE_BLOCK */
                if( xmit_datablock(buff, 0xFE) )
                    count = 0;
    
  • ersmithersmith Posts: 5,516

    @evanh said:
    Eric,
    In disk_write() function in sdmm.c, I had earlier found a bug with order of execution in the case of single block writes. My fix was to replace this:

          if ((send_cmd(CMD24, sect) == 0)    /* WRITE_BLOCK */
              && xmit_datablock(buff, 0xFE))
              count = 0;
    

    with this:

          if( send_cmd(CMD24, sect) == 0 )    /* WRITE_BLOCK */
              if( xmit_datablock(buff, 0xFE) )
                  count = 0;
    

    Aren't those two pieces of code identical? Internally I think if (A && B) F() is implemented as if (A) { if (B) F(); }.

  • evanhevanh Posts: 13,853
    edited 2022-05-03 14:05

    The first one executes in reverse order - Which is a problem when the command needs to come before the data block.

    EDIT: I only found it accidentally during my optimisation testing. I've forgotten the exact circumstances though. There was some error that wasn't explained by my changes and narrowing down the error lead me to that.

  • ersmithersmith Posts: 5,516

    @evanh said:
    The first one executes in reverse order - Which is a problem when the command needs to come before the data block.

    Not as it was written in the comment. Did you mean to put the xmit_datablock() first? Because I'm sure that if (send_cmd && xmit_datablock) will be the same as if (send_cmd) if (xmit_datablock) ....

  • evanhevanh Posts: 13,853

    Maybe it's an optimiser thing. I've found the posting now - https://forums.parallax.com/discussion/comment/1537502/#Comment_1537502

    The original code wasn't working. There was an error return code but I didn't know which one was returning the error. I split the two parts up to collect the return codes and it started working properly. I stopped looking any further at that point. So, I'm only guessing the problem is due to order of execution within the if() statement.

  • evanhevanh Posts: 13,853

    I'm guessing C doesn't rigidly define the execution order. ie: If the optimiser finds an advantage in reverse order then it's free to do so.

  • @evanh said:
    I'm guessing C doesn't rigidly define the execution order. ie: If the optimiser finds an advantage in reverse order then it's free to do so.

    It most certainly does define evaluation order for && and ||. Otherwise code like if (ptr && ptr->member) wouldn't be valid.

  • ersmithersmith Posts: 5,516

    I don't see any significant difference in the generated code for the two versions; the new version (with nested if's) does save one instruction, but I find it hard to believe that the timing is that sensitive that it would matter. But perhaps it does? The diff between the original (with &&) and new (with nested if's) is shown below. The difference is the same for both -O1 and -O2:

    --- orig-O2.p2asm   2022-05-03 14:06:34.436157276 -0300
    +++ new-O2.p2asm    2022-05-03 14:06:12.731017413 -0300
    @@ -2772,20 +2772,19 @@
        add ptr__fatfs_cc_dat__, #381
        rdbyte  local06, ptr__fatfs_cc_dat__
        sub ptr__fatfs_cc_dat__, #381
    -   test    local06, #8 wz
    +   and local06, #8 wz
      if_e  shl local05, #9
        cmp local04, #1 wz
      if_ne jmp #LR__0131
    -   mov arg01, #24
        mov arg02, local05
    +   mov arg01, #24
        call    #_fatfs_cc_send_cmd_0740
    -   mov local06, result1
    -   zerox   local06, #7 wz
    +   zerox   result1, #7 wz
      if_ne jmp #LR__0134
        mov arg01, local02
        mov arg02, #254
        call    #_fatfs_cc_xmit_datablock_0736
    -   cmps    result1, #0 wz
    +   cmp result1, #0 wz
      if_ne mov local04, #0
        jmp #LR__0134
    

    It does look like in the optimizer has trouble figuring out that local06 is no longer live after the whole expression, and hence generates slightly different code in the two cases. But both sets of code look correct to me.

  • evanhevanh Posts: 13,853

    @Wuerfel_21 said:
    It most certainly does define evaluation order for && and ||. Otherwise code like if (ptr && ptr->member) wouldn't be valid.

    I'd be concerned about such a use. Funnily, without memory protection, even if your code was executed in reverse, that example wouldn't cause any problem. Whereas Eric's certainly does.

  • Wuerfel_21Wuerfel_21 Posts: 3,310
    edited 2022-05-03 22:13

    @evanh said:
    I'd be concerned about such a use.

    Look at any C program (that deals with nullable struct pointers, anyways) and you will see that pattern everywhere.

  • evanhevanh Posts: 13,853

    @ersmith said:
    I don't see any significant difference in the generated code for the two versions; ... The difference is the same for both -O1 and -O2:

    Well, I couldn't think of another reason at the time. If Ada is correct then maybe the optimiser isn't always following the standard.

  • ersmithersmith Posts: 5,516

    @evanh said:

    @ersmith said:
    I don't see any significant difference in the generated code for the two versions; ... The difference is the same for both -O1 and -O2:

    Well, I couldn't think of another reason at the time. If Ada is correct then maybe the optimiser isn't always following the standard.

    Comparing the generated code shows though that the optimizer is following the standard, at least in this case. The timings are slightly different; could that be the source of the difference? Or, could there have been some other bug that you fixed around the same time?

  • evanhevanh Posts: 13,853
    edited 2022-05-03 22:54

    @ersmith said:
    ... at least in this case.

    I'll have a shot at reproducing it tonight ...

    No timing issues. Only thing currently timing sensitive in my code is buffer underflow/overflow [of smartpins because the SPI clock is non-stop for whole block]. The unused bit-bashed version had tight timing constraints but those were limited to within the volatile cogexec section.

    EDIT: So, it's not interrupt friendly at this stage. I'd probably rewrite the low-level code again using streamer+FIFO for supporting interrupts.

  • evanhevanh Posts: 13,853
    edited 2022-05-04 10:47

    Hmm, no luck. I've tried three editions of the driver: The latest edition, the pre-bug edition as posted, and the post-bug edition as posted, both bit-bashed and smartpin code paths where available. None exhibit the behaviour I'd observed. That's as far I want to go retracing it.

    I'll put it down to a freak combination/bug at the time and guessing incorrectly what was wrong. There was a lot of corrupted SD cards back then. And I often used them long after they should have been reformatted.

    Lol, I still feel nervous about the logical evaluation order. :)

  • ersmithersmith Posts: 5,516

    @evanh said:

    >

    Lol, I still feel nervous about the logical evaluation order. :)

    I understand :). But the C standard is pretty explicit that in A && B and A || B the A has to be evaluated first, and B is only evaluated if it needs to be.

    Thanks for digging into this and trying to reproduce it. I was afraid there was some subtle bug in flexspin (and there probably are still some, but apparently not in this particular case).

  • Wuerfel_21Wuerfel_21 Posts: 3,310
    edited 2022-05-07 13:20

    I think the updated FS code has a bug somewhere in it. Commit b46478b975db78c63173e0f01b2d1995c280de58 breaks loading some ROMs in MegaYume (Red screen of death -> SEGA boot code fails to verify game checksum). I think it has something to do with running into EOF during fread, because the only ROM that causes the issue isn't a multiple of the 16k read block size I use:

    PRI rom_read_block(dest) | i,j,tmp1,tmp2
    '' Read a 16K block of ROM data
    if tmp1 := $4000 - c.fread(dest,1,$4000,romfile)
      bytefill(dest+$4000-tmp1,0,tmp1)
    

    Will look into this a bit deeper....

    EDIT: Verified the assumption: padding the file to a 16K multiple makes it work with the new FS code.

  • Issue resolved: new code runs too fast (?) and the loop that calls rom_read_block was written in a very stupid way that just happened to work if rom_read_block is slow enough to never catch up to the PSRAM write. Odd that that never manifested with the "old FS code + faster SPI" setup.

  • evanhevanh Posts: 13,853

    I like those ones! New code so much faster it breaks the calling program.

  • evanhevanh Posts: 13,853
    edited 2022-05-18 08:41

    Eric,
    Struck what must be a regression, when testing Ahle's simpleSound - https://forums.parallax.com/discussion/172894/simplesound-a-simple-to-use-sound-engine-for-the-p2/p1

    Compiling "main.spin2" produces the following error message without any reference to a source file:

    Propeller Spin/PASM Compiler 'FlexSpin' (c) 2011-2022 Total Spectrum Software Inc.
    Version 5.9.11-beta-v5.9.10-14-g28045294 Compiled on: May 15 2022
    main.spin2
    |-simpleSound.spin2
    |-|-reSound.spin2
    |-|-trackerPlayer.spin2
    warning: Internal warning, found two GETQX during constant propagate??
    warning: Internal warning, found two GETQX during constant propagate??
    main.p2asm
    Done.
    

    Behaviour is the binary runs but only a low audio buzzing noise is produced from speakers. Maybe with signs of the sound module still sequencing in the noise.

    I'm pretty certain this wasn't a problem last time I compiled this code ... which was back in March. Certainly Pnut.exe has no issues.

Sign In or Register to comment.