A couple of points I should make:
- Both build paths use pasm2. I'm not sure if this source file is required to work with the Prop1 or not.
- I've discovered that, above about 320 MHz sysclock (40 MHz SPI clock), the bit-bashed path isn't quite as rugged as the smartpins path has become. I had the advantage of easy fine tuning the clock high-low ratio with smartpins to give it more leeway.
I'm thinking about retiring the optimised bit-bashed path anyway. Maybe revert to the old non-optimised path if Prop1 compatibility is desired.
EDIT: On a tangent, there is no verification of valid data within the driver code. Getting verification might be a decent reason to eventually discontinue the use of SPI. I know CRC can be enabled with SPI but how much processing is that? It might be just as fast processing the full SD protocol.
I created a topic in Basic section - something is wrong, as changing the order of variable declaration changes the behavior of the code. Then I wanted to copy the output from the Flexprop's internal serial terminal and paste it to the forum topic editor.
Then I discovered that I cannot do it (or I don't know how) - ctrl-c doesn't work on the internal terminal. To make a post I had to switch to the external terminal. Xubuntu 20.04
The player no longer works as expected - it cannot open files.
Something changed in the filesystem and the first impresion is the LFN support is now lost. If the file name is short, it opens. If the name is long, I cannot load the file.
The previous version uses -DFF_USE_LFN as a compiler line option and that was enough.
But I tried to change the value to 1.
It now cannot compile the player at all/ The result while compiling:
ff.c:805: error: unknown identifier ff_oem2uni used in function call
ff.c:805: error: Unknown symbol ff_oem2uni
ff.c:887: error: unknown identifier ff_uni2oem used in function call
ff.c:887: error: Unknown symbol ff_uni2oem
ff.c:1886: error: unknown identifier ff_wtoupper used in function call
ff.c:1886: error: Unknown symbol ff_wtoupper
ff.c:2909: error: unknown identifier ff_uni2oem used in function call
ff.c:2909: error: Unknown symbol ff_uni2oem
@pik33 : Sorry, I forgot that I had to add a #ifndef FF_USE_LFN in ffconf.h when I updated FatFs. The long file name support should be working again now.
Eric,
I don't have any test hardware with me but this is a small streamline to the second half of rcvr_mmc() function. I think it's bug-free:
// 8-bit rx (WFBYTE)
rx_bytes
and bc, #3 wz // up to 3 bytes
if_z ret
wypin #8, PIN_CLK // begin SPI clocks
wxpin #7 | 32, PIN_DO // 8 bits, sample after rising clock
rx_loop1
sub bc, #1 wz // check if last byte
rx_wait1
testp PIN_DO wc // wait for received byte
if_nc jmp #rx_wait1
if_nz wypin #8, PIN_CLK // begin SPI clocks for next byte
rdpin r, PIN_DO // byte from rx buffer
rev r
// wfbyte r // store byte
wrbyte r, buff // store byte
add buff, #1
if_nz jmp #rx_loop1 // loop if more bytes
@pik33 said:
In Basic, 1 shl 32 gives 1. Should be 0, as this 1 goes out of the register.
This happens in all languages; only the bottom 5 bits of the shift amount are used, and so 1 shl 32 is identical to 1 shl 0. That's the way the hardware works, and it would be expensive to try to work around this. I think you'll find something similar happens in FreeBASIC on x86, except that on a 64 bit machine it's 1 shl 64 that equals 1.
The code I am porting, that didn't work because of this, was written for a PC. They used int variable in Java and shl it from 1 to 32. Maybe this worked because PC CPUs are 64bit based, so they accept 0..63 as the shift amount.
My ported procedure didn't work because of this and I didn't know why for a while.
Now it is worked around by isolating the special case (n=32) using if
Eric,
In disk_write() function in sdmm.c, I had earlier found a bug with order of execution in the case of single block writes. My fix was to replace this:
@evanh said:
Eric,
In disk_write() function in sdmm.c, I had earlier found a bug with order of execution in the case of single block writes. My fix was to replace this:
The first one executes in reverse order - Which is a problem when the command needs to come before the data block.
EDIT: I only found it accidentally during my optimisation testing. I've forgotten the exact circumstances though. There was some error that wasn't explained by my changes and narrowing down the error lead me to that.
@evanh said:
The first one executes in reverse order - Which is a problem when the command needs to come before the data block.
Not as it was written in the comment. Did you mean to put the xmit_datablock() first? Because I'm sure that if (send_cmd && xmit_datablock) will be the same as if (send_cmd) if (xmit_datablock) ....
The original code wasn't working. There was an error return code but I didn't know which one was returning the error. I split the two parts up to collect the return codes and it started working properly. I stopped looking any further at that point. So, I'm only guessing the problem is due to order of execution within the if() statement.
@evanh said:
I'm guessing C doesn't rigidly define the execution order. ie: If the optimiser finds an advantage in reverse order then it's free to do so.
It most certainly does define evaluation order for && and ||. Otherwise code like if (ptr && ptr->member) wouldn't be valid.
I don't see any significant difference in the generated code for the two versions; the new version (with nested if's) does save one instruction, but I find it hard to believe that the timing is that sensitive that it would matter. But perhaps it does? The diff between the original (with &&) and new (with nested if's) is shown below. The difference is the same for both -O1 and -O2:
It does look like in the optimizer has trouble figuring out that local06 is no longer live after the whole expression, and hence generates slightly different code in the two cases. But both sets of code look correct to me.
@Wuerfel_21 said:
It most certainly does define evaluation order for && and ||. Otherwise code like if (ptr && ptr->member) wouldn't be valid.
I'd be concerned about such a use. Funnily, without memory protection, even if your code was executed in reverse, that example wouldn't cause any problem. Whereas Eric's certainly does.
@ersmith said:
I don't see any significant difference in the generated code for the two versions; ... The difference is the same for both -O1 and -O2:
Well, I couldn't think of another reason at the time. If Ada is correct then maybe the optimiser isn't always following the standard.
Comparing the generated code shows though that the optimizer is following the standard, at least in this case. The timings are slightly different; could that be the source of the difference? Or, could there have been some other bug that you fixed around the same time?
No timing issues. Only thing currently timing sensitive in my code is buffer underflow/overflow [of smartpins because the SPI clock is non-stop for whole block]. The unused bit-bashed version had tight timing constraints but those were limited to within the volatile cogexec section.
EDIT: So, it's not interrupt friendly at this stage. I'd probably rewrite the low-level code again using streamer+FIFO for supporting interrupts.
Hmm, no luck. I've tried three editions of the driver: The latest edition, the pre-bug edition as posted, and the post-bug edition as posted, both bit-bashed and smartpin code paths where available. None exhibit the behaviour I'd observed. That's as far I want to go retracing it.
I'll put it down to a freak combination/bug at the time and guessing incorrectly what was wrong. There was a lot of corrupted SD cards back then. And I often used them long after they should have been reformatted.
Lol, I still feel nervous about the logical evaluation order.
Lol, I still feel nervous about the logical evaluation order.
I understand . But the C standard is pretty explicit that in A && B and A || B the A has to be evaluated first, and B is only evaluated if it needs to be.
Thanks for digging into this and trying to reproduce it. I was afraid there was some subtle bug in flexspin (and there probably are still some, but apparently not in this particular case).
I think the updated FS code has a bug somewhere in it. Commit b46478b975db78c63173e0f01b2d1995c280de58 breaks loading some ROMs in MegaYume (Red screen of death -> SEGA boot code fails to verify game checksum). I think it has something to do with running into EOF during fread, because the only ROM that causes the issue isn't a multiple of the 16k read block size I use:
PRI rom_read_block(dest) | i,j,tmp1,tmp2
'' Read a 16K block of ROM data
if tmp1 := $4000 - c.fread(dest,1,$4000,romfile)
bytefill(dest+$4000-tmp1,0,tmp1)
Will look into this a bit deeper....
EDIT: Verified the assumption: padding the file to a 16K multiple makes it work with the new FS code.
Issue resolved: new code runs too fast (?) and the loop that calls rom_read_block was written in a very stupid way that just happened to work if rom_read_block is slow enough to never catch up to the PSRAM write. Odd that that never manifested with the "old FS code + faster SPI" setup.
Compiling "main.spin2" produces the following error message without any reference to a source file:
Propeller Spin/PASM Compiler 'FlexSpin' (c) 2011-2022 Total Spectrum Software Inc.
Version 5.9.11-beta-v5.9.10-14-g28045294 Compiled on: May 15 2022
main.spin2
|-simpleSound.spin2
|-|-reSound.spin2
|-|-trackerPlayer.spin2
warning: Internal warning, found two GETQX during constant propagate??
warning: Internal warning, found two GETQX during constant propagate??
main.p2asm
Done.
Behaviour is the binary runs but only a low audio buzzing noise is produced from speakers. Maybe with signs of the sound module still sequencing in the noise.
I'm pretty certain this wasn't a problem last time I compiled this code ... which was back in March. Certainly Pnut.exe has no issues.
Comments
Cool.
A couple of points I should make:
- Both build paths use pasm2. I'm not sure if this source file is required to work with the Prop1 or not.
- I've discovered that, above about 320 MHz sysclock (40 MHz SPI clock), the bit-bashed path isn't quite as rugged as the smartpins path has become. I had the advantage of easy fine tuning the clock high-low ratio with smartpins to give it more leeway.
I'm thinking about retiring the optimised bit-bashed path anyway. Maybe revert to the old non-optimised path if Prop1 compatibility is desired.
EDIT: On a tangent, there is no verification of valid data within the driver code. Getting verification might be a decent reason to eventually discontinue the use of SPI. I know CRC can be enabled with SPI but how much processing is that? It might be just as fast processing the full SD protocol.
I created a topic in Basic section - something is wrong, as changing the order of variable declaration changes the behavior of the code. Then I wanted to copy the output from the Flexprop's internal serial terminal and paste it to the forum topic editor.
Then I discovered that I cannot do it (or I don't know how) - ctrl-c doesn't work on the internal terminal. To make a post I had to switch to the external terminal. Xubuntu 20.04
I compiled the new flexprop and spin2cpp
The player no longer works as expected - it cannot open files.
Something changed in the filesystem and the first impresion is the LFN support is now lost. If the file name is short, it opens. If the name is long, I cannot load the file.
@pik33 ,
You have to turn that on. It's not on by default.
In this folder: \flexprop\include\filesys\fatfs find file ffconf.h.
Look for this line:
Change the value from 0 to 1 and save it. Your good to go.
Mike
The previous version uses -DFF_USE_LFN as a compiler line option and that was enough.
But I tried to change the value to 1.
It now cannot compile the player at all/ The result while compiling:
Also I found cp932 is defined there. Japanese.
Reverting now to the older version.
Right, did run into that error, it has been a while.
This is easy to fix. As I said 99% of the FatFs was used unchanged.
Need to add a from file to this line to ff.h:
Mike
@pik33 : Sorry, I forgot that I had to add a #ifndef FF_USE_LFN in ffconf.h when I updated FatFs. The long file name support should be working again now.
Eric,
I don't have any test hardware with me but this is a small streamline to the second half of
rcvr_mmc()
function. I think it's bug-free:In Basic, 1 shl 32 gives 1. Should be 0, as this 1 goes out of the register.
This happens in all languages; only the bottom 5 bits of the shift amount are used, and so 1 shl 32 is identical to 1 shl 0. That's the way the hardware works, and it would be expensive to try to work around this. I think you'll find something similar happens in FreeBASIC on x86, except that on a 64 bit machine it's 1 shl 64 that equals 1.
The code I am porting, that didn't work because of this, was written for a PC. They used int variable in Java and shl it from 1 to 32. Maybe this worked because PC CPUs are 64bit based, so they accept 0..63 as the shift amount.
My ported procedure didn't work because of this and I didn't know why for a while.
Now it is worked around by isolating the special case (n=32) using if
Eric,
In
disk_write()
function in sdmm.c, I had earlier found a bug with order of execution in the case of single block writes. My fix was to replace this:with this:
Aren't those two pieces of code identical? Internally I think
if (A && B) F()
is implemented asif (A) { if (B) F(); }
.The first one executes in reverse order - Which is a problem when the command needs to come before the data block.
EDIT: I only found it accidentally during my optimisation testing. I've forgotten the exact circumstances though. There was some error that wasn't explained by my changes and narrowing down the error lead me to that.
Not as it was written in the comment. Did you mean to put the xmit_datablock() first? Because I'm sure that
if (send_cmd && xmit_datablock)
will be the same asif (send_cmd) if (xmit_datablock) ...
.Maybe it's an optimiser thing. I've found the posting now - https://forums.parallax.com/discussion/comment/1537502/#Comment_1537502
The original code wasn't working. There was an error return code but I didn't know which one was returning the error. I split the two parts up to collect the return codes and it started working properly. I stopped looking any further at that point. So, I'm only guessing the problem is due to order of execution within the
if()
statement.I'm guessing C doesn't rigidly define the execution order. ie: If the optimiser finds an advantage in reverse order then it's free to do so.
It most certainly does define evaluation order for && and ||. Otherwise code like
if (ptr && ptr->member)
wouldn't be valid.I don't see any significant difference in the generated code for the two versions; the new version (with nested if's) does save one instruction, but I find it hard to believe that the timing is that sensitive that it would matter. But perhaps it does? The diff between the original (with &&) and new (with nested if's) is shown below. The difference is the same for both -O1 and -O2:
It does look like in the optimizer has trouble figuring out that local06 is no longer live after the whole expression, and hence generates slightly different code in the two cases. But both sets of code look correct to me.
I'd be concerned about such a use. Funnily, without memory protection, even if your code was executed in reverse, that example wouldn't cause any problem. Whereas Eric's certainly does.
Look at any C program (that deals with nullable struct pointers, anyways) and you will see that pattern everywhere.
Well, I couldn't think of another reason at the time. If Ada is correct then maybe the optimiser isn't always following the standard.
Comparing the generated code shows though that the optimizer is following the standard, at least in this case. The timings are slightly different; could that be the source of the difference? Or, could there have been some other bug that you fixed around the same time?
I'll have a shot at reproducing it tonight ...
No timing issues. Only thing currently timing sensitive in my code is buffer underflow/overflow [of smartpins because the SPI clock is non-stop for whole block]. The unused bit-bashed version had tight timing constraints but those were limited to within the
volatile
cogexec section.EDIT: So, it's not interrupt friendly at this stage. I'd probably rewrite the low-level code again using streamer+FIFO for supporting interrupts.
Hmm, no luck. I've tried three editions of the driver: The latest edition, the pre-bug edition as posted, and the post-bug edition as posted, both bit-bashed and smartpin code paths where available. None exhibit the behaviour I'd observed. That's as far I want to go retracing it.
I'll put it down to a freak combination/bug at the time and guessing incorrectly what was wrong. There was a lot of corrupted SD cards back then. And I often used them long after they should have been reformatted.
Lol, I still feel nervous about the logical evaluation order.
>
I understand . But the C standard is pretty explicit that in
A && B
andA || B
theA
has to be evaluated first, andB
is only evaluated if it needs to be.Thanks for digging into this and trying to reproduce it. I was afraid there was some subtle bug in flexspin (and there probably are still some, but apparently not in this particular case).
I think the updated FS code has a bug somewhere in it. Commit b46478b975db78c63173e0f01b2d1995c280de58 breaks loading some ROMs in MegaYume (Red screen of death -> SEGA boot code fails to verify game checksum). I think it has something to do with running into EOF during fread, because the only ROM that causes the issue isn't a multiple of the 16k read block size I use:
Will look into this a bit deeper....
EDIT: Verified the assumption: padding the file to a 16K multiple makes it work with the new FS code.
Issue resolved: new code runs too fast (?) and the loop that calls
rom_read_block
was written in a very stupid way that just happened to work ifrom_read_block
is slow enough to never catch up to the PSRAM write. Odd that that never manifested with the "old FS code + faster SPI" setup.I like those ones! New code so much faster it breaks the calling program.
Eric,
Struck what must be a regression, when testing Ahle's simpleSound - https://forums.parallax.com/discussion/172894/simplesound-a-simple-to-use-sound-engine-for-the-p2/p1
Compiling "main.spin2" produces the following error message without any reference to a source file:
Behaviour is the binary runs but only a low audio buzzing noise is produced from speakers. Maybe with signs of the sound module still sequencing in the noise.
I'm pretty certain this wasn't a problem last time I compiled this code ... which was back in March. Certainly Pnut.exe has no issues.