Comments
`git pull` update. The error went after Eric suggested I try a `make clean`.
I'm using flexprop v5.9.2 to compile taqoz v2.8 210401-1230.
I have changed the clock routines to use spin2 methods described in parallax spin2 v35m doc pg 40-41
start of code
    CON
      _xtlfreq = 25_000_000
      _clkfreq = 150_000_000
associated taqoz-spin2 code
    '' clkset ( newclckmode newclkfreq -- )
    'clkset         clkset  a,b
    '               jmp     #DROP2
    ' rdmode ( -- m )
    rdmode          rdlong  x,#@clkmode
                    jmp     #pushx
    ' wrmode ( m -- )
    wrmode          wrlong  a,#@clkmode
                    jmp     #drop
    ' rdfreq ( -- f )
    rdfreq          rdlong  x,#@clkfreq
                    jmp     #pushx
    ' wrfreq ( f -- )
    wrfreq          wrlong  a,#@clkfreq
                    jmp     #drop
    '' rderr ( -- f )
    'rderr          rdlong  x,#@errfreq
    '               jmp     #pushx
    '' wrerr ( f -- )
    'wrerr          wrlong  a,#@errfreq
    '               jmp     #drop
clkset doesn't compile in flexprop,
errfreq isn't available for read/write with running code
first code executed in initsys
    hubset  ##clkmode_ & !%11       'start crystal/PLL, stay in RCFAST
    waitx   ##25_000_000/100        'wait 10ms
    hubset  ##clkmode_ | %11        'switch to crystal/PLL
compile and run from flexprop
    *Cold start*
    -------------------------------------------------------------------------------
      Parallax P2  *TAQOZ RELOADED SIDE*  V2.8 'CHIP' Prop_Ver G 150MHz 210401-1230
    -------------------------------------------------------------------------------
    TAQOZ# rdmode .bin --- %00000001000000000000010111111000 ok
appears to be pll,150mhz,15pF,rcfast
I can manually change to pll with
rdmode %11 or dup wrmode hubset
Where am I going wrong ??
I thought taqoz is written in assembly? If so then you cannot directly use any Spin2 methods from within it. Spin2 can call assembly, but the reverse (assembly calling Spin2) is not supported.
Yup. A fresh install cured everything for me too. Amazing how bit-gremlins form over time.
Ah, thanks,
I'll define clkset and errfreq in assembler for use in taqoz,
looking through the compiler code to see how it is all done,
so far -
    '' clkset ( newclckmode newclkfreq -- )
    '' and fix baud rate
    clkset          hubset  ##clkmode_ & !%11       'RCFAST mode
                    wrlong  a,#@clkfreq
                    wrlong  b,#@clkmode
                    waitx   ##25_000_000/100        'allow 10ms for external clock to stabilize
                    hubset  ##clkmode_
                    call    #SETBAUD
                    jmp     #drop2
Bigtree,
That's a compile-time version. It makes the assumption of RCFAST hand-over. You'll want to do a run-time version that uses the `clkmode` variable instead ...
PS: And `clkfreq` too I guess.
Anyway, it would start with the critical clean switch back to RCFAST:

    rdlong  a, #@clkmode
    andn    a, #%11
    hubset  a
Eric,
Another bug. Tested in C but presumably is across the board for the optimiser. I got this:
    0099c D1 BE 01 FB |     rdlong  local03, ptr__dat__
    009a0 68 BE 61 FD |     xoro32  local03
    009a4 D1 BE 61 FC |     wrlong  local03, ptr__dat__
From this:
    static uint32_t xoro_test( uint32_t *stateptr )
    {
        uint32_t  r, s;

        s = *stateptr;
        __asm {
            xoro32  s
            mov     r, 0-0
        }
        *stateptr = s;
        return r;
    }
The XORO32 can't prefix a WRLONG like that. It'll even be corrupting hubRAM as a result.
PS: I petitioned Chip, at a late stage, to have XORO32, and SCA and others, operate on the D operand just so this sort of combo could be done. It wasn't deemed important enough, I think revB sign-off was too close for design changes. Only bug fixes by then.
PPS: Well, "can't" is maybe too strong a word there. If one was wanting to pepper hubRAM in a random address order then that would be ideal solution.
Thanks @evanh, that's fixed in the latest github. I think you're building from source, but for those who aren't a work-around is to write "__asm const" to tell the optimizer not to mess with the contents of the __asm.
Thanks Eric. Yep, looks good now.
I am also getting a warning for each `0-0`, telling me to add a `-0` funnily. This was happening before too.
Eric,
There's no user reservable cogRAM is there? I can't say I want to keep the state variable for XORO32 stored in cogRAM permanently can I? I'm thinking of "static" declarations, btw.
No, there is no way to force a variable to be in COG ram. The best way to "persuade" it to be there is to always pass it as a parameter and return it from a function, something like:
    #include <stdio.h>
    #include <stdint.h>

    typedef struct xoroState {
        uint32_t state;
        uint32_t randVal;
    } XoroState;

    XoroState xoro_test(XoroState xs)
    {
        uint32_t r, s;

        s = xs.state;
        __asm {
            xoro32 s;
            mov r, 0-0;
        }
        xs.state = s;
        xs.randVal = r;
        return xs;
    }

    int main()
    {
        int i;
        XoroState xs;

        xs.state = 0xdeadbeef;
        for (i = 0; i < 10; i++) {
            xs = xoro_test(xs);
            printf("xoro %d = 0x%08x\n", i, xs.randVal);
        }
        return 0;
    }
Oh, is that how you've handled Spin's multiple return values?
Well, that's the only way to return multiple values in C. It is a bit simpler and cleaner in Spin2 (or BASIC).
I guess passing arguments by reference is another way:
https://www.geeksforgeeks.org/how-to-return-multiple-values-from-a-function-in-c-or-cpp/
Passing by reference passes a pointer, so while it does accomplish allowing a function to have multiple results, it doesn't compile to multiple return values like Spin2 functions do.
Very valuable insight on the nucode backend: Still can't build Spin Hexagon, lol. Chokes on the unimplemented `AST_POSTSET` and then segfaults.
It's getting closer, but still not quite there yet. I've implemented the AST_POSTSET and fixed the segfault (infinite recursion blowing up the stack), but there are still a few things missing from nucode. It's definitely not ready for real use yet!
Well this turned interesting:
It now builds, but for some reason only if I run it in GDB, crashes otherwise (that seems like a great issue to debug).
If I load that build, it does in fact boot up and you can get to the stage select screen just fine, working audio and all. But when you actually start a game it hangs on the white screen. Odd.
Actually, it seems that crashing is just a thing that happens now even when compiling something less complex to PASM (read: the QuadView test program I just posted in another thread)
This one I've been able to catch in GDB
    Reading symbols from C:\zeug\fastspin\flexspin.exe...done.
    (gdb) run -2 quadview_test.spin2
    Starting program: C:\zeug\fastspin/flexspin.exe -2 quadview_test.spin2
    [New Thread 30328.0xb1a0]
    Propeller Spin/PASM Compiler 'FlexSpin' (c) 2011-2021 Total Spectrum Software Inc.
    Version 5.9.3-beta-v5.9.2-33-g35412c83 Compiled on: Sep 25 2021
    quadview_test.spin2
    |-QuadView.spin2
    quadview_test.p2asm

    Program received signal SIGSEGV, Segmentation fault.
    0x0040b6e7 in MarkStaticFunctionPointers (list=0x53001e0) at spinc.c:852
    852             MarkStaticFunctionPointers(list->left);
    (gdb)
Very funny recursion bug, it seems (`bt` makes the terminal explode).
@Wuerfel_21 : For some reason gcc isn't doing tail call optimization on MarkStaticFunctionPointers. Actually I guess that's not so mysterious after all, there are no optimizations enabled in the default makefile. Anyway, the stack size on Windows is relatively small, and hence the overflow.
I've modified MarkStaticFunctionPointers to loop instead of recurse at the top level, and also added -Og to the make flags, so that particular problem should be fixed now.
Eric,
In FlexC, is there a graceful way to handle SD card removal after mount()ing? I don't see any umount().
EDIT: Although I do find an unmount() in ff.h header ... but it's not recognised by default.
No, there's no umount() yet (and no VFS hook to call the FatFs unmount). Something for the TODO list...
No worries. Thanks for the reply.
Eric,
I started looking at the base level SD bit-bashing in <filesys/fatfs/sdmm.cc> and can't work out why inline assembly can't be "volatile"d the way I normally do. FlexC is saying FCACHE is disabled and I don't know why.
Here's an initial working version of the block read subroutine. I can make it go a lot faster if it's in cogRAM. Instead of 26 sysclock ticks per bit, it'll be 8 ticks per bit. EDIT: It's actually a lot worse than 26 right now because of the multiple hubRAM stalls.
    static void rcvr_mmc (
        BYTE *buff,     /* Pointer to read buffer */
        UINT bc         /* Number of bytes to receive */
    )
    {
    #ifdef _fast_pasm_eh
        int r;
        int PIN_SS = pins.pin_ss;
        int PIN_CLK = pins.pin_clk;
        int PIN_DI = pins.pin_di;
        int PIN_DO = pins.pin_do;

        __asm volatile {
            drvh    PIN_DI          // Send 0xFF
    rx_loop
            rep     @rx_rend, #8
            waitx   #16
            testp   PIN_DO  wc
            outnot  PIN_CLK
            rcl     r, #1
            outnot  PIN_CLK
    rx_rend
            wrbyte  r, buff
            add     buff, #1
            djnz    bc, #rx_loop
        }
@evanh ,
I rewrote most of that code a long time ago and get great performance compared to what is there now. I think what is there now is just some code to make it work. I wrote the bit banging a little different than that but when it compiles it kind of looks like yours. I guess the compiler makes it work better.
I want to submit this version to Eric but need to get it in sync with what is there now.
Mike
I'll be removing that WAITX completely if I can sort the FCACHE problem.
Here's what I want it to be:
    __asm volatile {    // IMPORTANT! "volatile" prevents automated optimising
            wrfast  #0, buff        // Use the FIFO to buffer hubRAM write timing
            drvh    PIN_DI          // Send 0xFF
    rx_loop
            outnot  PIN_CLK         // I/O lag start
            nop                     // 2
            outnot  PIN_CLK         // 4
            rep     @rx_rend, #7    // 6
            outnot  PIN_CLK         // 8
            rcl     data, #1        // 10
            outnot  PIN_CLK         // 12
            testp   PIN_DO  wc      // 13 (TESTP does an early sample of pin)
    rx_rend
            rcl     data, #1
            waitx   #2
            testp   PIN_DO  wc      // final bit received
            rcl     data, #1
            wfbyte  data            // Store a received byte
            djnz    bc, #rx_loop
    }
PS: This snippet is ripped from another program. It assumes a slightly different internal SPI clock phase than what Eric uses. I'll have to fiddle around with that after FCACHE issue is sorted.
@evanh ,
The question then becomes will this work with my old 16meg SD card. I believe the current driver works with all the different cards.
Mike
Yep, it's had a lot of testing. The trickiest part has always been handling the lag created by the Prop2's speed versus its I/O latencies. Once you've got your head around those latencies then it's just the simpler part of getting SPI's CPHA and CPOL sorted out.
@evanh : I'm afraid there's not enough context to know why FCACHE is disabled in your program. The most common reason is disabling optimizations (compiling with -O0). Otherwise, could you post a compilable example that shows the warning? When I tried inserting your code into the sdmm.cc file it compiled without any problems, although it didn't seem to work on my SD card.
There's now a umount() call which is hooked up to fatfs (but not to the host fs yet). It seemed to work for me with simple testing, but I haven't tested a lot of different cards.