I'm working on a "template" cog engine, mostly for my own understanding. What I have so far is this :
PUBstart_explicit( DO, CLK, DI, CS ) : card_type
stop
long[@cmd] := "i"'set flag to know if the started cog has read its parameterslong[@cmd][1] := @DO
...
Ugh! Please don't do this. I know it was popular in Spin1 to save time by passing pointers to the parameters that happened to be on the stack, but it really kills performance in fastspin (it'll work, but it forces the compiler to store local variables on the stack instead of in registers, which is much slower). It also makes porting the code to other languages harder, because not all of them will pass variables on the stack -- many, like fastspin, will pass them in registers instead.
You've also got the mailbox in the DAT section, which means it's shared by all instances of the object. If you wanted to drive two different cards, with two different COGs, you'd need to have a lock on the mailbox so they don't conflict. Whereas if the mailbox is a local variable of the object this wouldn't be an issue. Maybe it doesn't matter for the SD card driver, but you mentioned wanting to use this as a template for other objects, so it might come up there.
So I would suggest that the more natural way to write this is:
That'll also simplify your PASM startup code, because you won't have to juggle multiple pointers (you just have the one to mbox), and I think makes everything easier to understand, both for the human reader and for the compiler.
As to passing parameters, I'll need to figure out the best way to handle this. I use this trick because there are many times I would like to pass more initialization info to the cog than I have parameters. I never did get my head around the "pre-load" trick used in the P1 and always did it this way. I guess I could use several commands for initialization, need to think on this more.
Regarding the mailbox in the dat section, I also have a commented out VAR section if I wanted multiple instances running at the same time. I will have one platform with 2 sd cards so this will be something to think about. That hardware also has a shared SPI bus so was thinking about providing some SPI methods for accessing those devices as well. I think it's best to start out "generic" so maybe others can find it helpful as well.
Attached is a sdrw test for the eval board. Confirmed to work consistently on my P2ES. You may want to consider adding this to the RC3 distribution. It was the file @Rayman included in his, with additions for CR + LF in the terminal. Thank you @cheezus !
So I'm looking at the docs, trying the RTFM manual way.. I think I'm close to testing the improved read, using smartpins. I'm sure there's something still wrong though. I'm sure the experts can see where I'm going with this though.
pri read : r | c, o ' Read eight bits from the card.
c := clk
o := do
asm
dirl c
wrpin sp_async_tx,c
wxpin ##$9,c 'set base period dirh c
dirl o
wrpin sp_clk_cfg,o 'sync txwxpin #%1_00111,o 'stop/start modedirh o 'enable smartpin' wypin txdata,o 'data, need to do anything?wypin #8,c 'start clock, tx data '' ! should this be 8 or 16?
busy testp c wcif_ncjmp #busy
rdpinpa,c ' needed?rdpin r,o
dirl o
dirl c
'' next part per chip's smart pin documentation REV r ' Set MSB First
TRIML r,#7
endasm
I've been working on this a bit more, trying to prove concepts using inline asm before trying to put them in their own cog. Spent way too much time trying to get the smartpin read working without any luck but I was able to get free-running clock gen for init clocks working. The one "improvement" I've been focusing on is converting the ReadBlock function to inline asm. I need to do the same for the WriteBlock, which should improve performance a few percent. It's not 100% asm yet, I got lazy..
pubreadblock(n, b) | c, i, o, outv, r'' Read a single block. The "n" passed in is the' block number (blocks are 512 bytes); the b passed' in is the address of 512 blocks to fill with the' data.'
c := clk
o := do
i := di
DRVL_(cs)
starttime := cnt
asm
rep #.end_read0, #8drvl c
drvh c
.end_read0
mov outv, #$40+17rol outv, #24rep #.end_send0, #8drvl c
rol outv, #1wcdrvc i
drvh c
.end_send0
drvh i
mov outv, n
rep #.end_send1, #32drvl c
rol outv, #1wcdrvc i
drvh c
.end_send1
drvh i
mov outv, #$87rol outv, #24rep #.end_send2, #8drvl c
rol outv, #1wcdrvc i
drvh c
.end_send2
drvh i
endasm
readresp
readresp
repeat sectorsize
asm
mov r, #0rep #.end_read, #8drvl c
waitx #14'' !! 13 safe for up to 160 mhz, 14 for 320mhz, 57@320mhz externaltestp o wcdrvh c
rcl r, #1
.end_read
endasm
byte[b++] := r
asm
rep #.end_read2, #16drvl c
drvh c
.end_read2
endasm
return endcmd
I've also mostly converted CMD to inline ASM. Again proving things work before burying in a cog!
pricmd(op, parm) | c, i, o, outv'' Send a full command sequence, and get and' return the response. We make sure cs is low,' send the required eight clocks, then the' command and parameter, and then the CRC for' the only command that needs one (the first one).' Finally we spin until we get a result.
DRVL_(cs)
c := clk
o := do
i := di
starttime := cnt
asm
rep #.end_read0, #8drvl c
drvh c
.end_read0
mov outv, op
add outv, #$40rol outv, #24rep #.end_send0, #8drvl c
rol outv, #1wcdrvc i
drvh c
.end_send0
drvh i
mov outv, parm
rep #.end_send1, #32drvl c
rol outv, #1wcdrvc i
drvh c
.end_send1
drvh i
cmp op, #0wzif_zmov outv, #$95if_nzmov outv, #$87rol outv, #24rep #.end_send2, #8drvl c
rol outv, #1wcdrvc i
drvh c
.end_send2
drvh i
endasm
return readresp
I'm really hoping to get much closer to sysclock /4 or maybe /8, the waitx #14 looks really ugly, lol
I have a few ??? I'm still trying to understand, such as I'm running @320mhz as max clock but seeing that 290mhz is where things top out.. I should probably start testing 290mhz clock??? Although once things are solid at the 160mhz "standard" clock, I haven't had a problem running up @320.
Also, since I need to slow things down for LA scoping I've been putting the PLL out of spec in all kinds of strange ways and things are surprisingly stable. Is there a recommended pll setup for 8mhz or 16mhz?
One thing I'd really like to do is figure out how to "tune" that waitx automatically. I tested using the SD card on my hardware and I had to push this to almost 60. I'm assuming ringing of cable and lots of PCB trace, I'll be shortening the cable eventually, hopefully this helps.
I finally started making progress, realized I was making a few dumb mistakes. Forgivable since I've had a bit of a break. I've been able to get the smartpins working, although it's still not quite right. Just missing a waitx somewhere I'm pretty sure, since removing the debugging code causes things to stop working. I'd investigate further but pretty sure I'm going to lose power soon. Once the weather calms down I'll get back on it. Now that it's working I can fine tune things as needed.
sp_sync_rx = %0000_0011_000_0000000000000_00_11101_0'' +3, inv
sp_sync_cfg = %1_00111pri read : r | c, o ' Read eight bits from the card.
c := clk
o := do
asm
dirl o
wrpin #sp_sync_rx,o 'sync txwxpin #sp_sync_cfg,o 'stop/start modedirh o 'enable smartpinakpin o
waitx #12rep #.end_rd, #8drvl c
waitx #12drvh c
waitx #2
.end_rd
waitx #32rdpin r, o
rev r
and r, #$ffdirl o
endasm
te.tx(r) ' debugging, remove breaks
I'm glad I've got it working and I'll figure out what I'm doing wrong shortly. Then getting smartpin TX and clock working!
I'm just going to drop this working version here, in case.. Smartpin RX is working and I've got the clock for RX using a smartpin as well. I just need to get the smartpin TX working and then see about writing an SPI cog. Much work to be done but take one victory at a time!
con
sp_srx = %0000_0011_000_0000000000000_00_11101_0'' +3, inv
sp_srx_c = %1_00111
sp_clk = %1000_0000_000_0000000000000_01_00101_0varlong spi_bitcycles
.. 'init code does this
spi_bitcycles :=((clkfreq / spi_clk_max) /2 )+1' needs to be inverse of freq / max
bt := ((clkfreq / 400_000 ) / 2 ) +1'' UNDER 400k
asm
wrpin #sp_srx, o 'sync txwxpin #sp_srx_c, o 'stop/start mode dirl c
wrpin #sp_clk, c
wxpin bt, c 'set base period (*2) ''2.5 400kmax initdirh c
wypin #initclks, c 'start clock, tx data n-1?
.busy testp c wcif_ncjmp #.busy
drvh csn
wypin #initclks, c 'start clock, tx data n-1?rdpinpa,c
.busy2 testp c wcif_ncjmp #.busy2
wrpin #0, c
endasm
...
pri read | c, o, t, bc ' Read eight bits from the card
c := clk
o := do
bc := spi_bitcycles
asm
dirl c
wrpin #sp_clk, c ' set clk pin modewxpin bc, c ' set clk base period dirh c ' enable sp clk dirh o ' enable sp diwypin #byteclks, c ' start clock
.busy testp c wcif_ncjmp #.busy
rdpin t, o ' get datadirl o ' disable smartpin di wrpin #0, c ' clear clk sp config rev t
and t, #$ff
endasm
' term.tx(t)return t
Once the TX is using smartpins, I won't need to disable the clock pin at the end of the read method. Really liking these smartpins. What a cool chip, once you get past that initial learning curve!
I think I've got the smartpins thing good. I'm still not 100% sure I understand the timings but it everything seems pretty solid. I've still got several things to do, one I noticed in testing is default the max spi clock to sysclock /2 in cases where sysclock is too low to meet max spi clock. I also need to generalize the bpin config to allow other sd card configurations. I'll need to wrap my head around this next but I wanted to get it working. You can invert the clock polarity by commenting out + sp_inv_pin on the clock config line. (I use default high because of pullups and pin sharing)
@"Peter Jakacki" I can't remember if the p2d2 uses the same pins as the p2ES. If not, you will probably need to change the smartpin config lines for SP_STX and SP_SRX. Let me know if you need any help.
I have a bunch of work left to do before hiding the spi away in it's own cog. I think I'm getting close, finally!
{{
SDSPI driver for P2-ES FastSpin/InlineASM RC3
Cheezus Slice (Joe Heinz) - July - 2019
Thanks to all those out there in fourm land!
Original spin version by Radical Eye Software Copyright 2008
}}
'CON Baudrate = 115200 'baud rate for serial communicationscon
sectorsize = 512
sectorshift = 9
spi_clk_max = 20_000_000' 20mhz safe, 25 seems to work though?'' NOTE, SYSCLOCK needs to be 2x spi_clk_max for now'%AAAA_BBBB_FFF_PPPPPPPPPPPPP_TT_MMMMM_0'ppp =''xxxx_CIO_HHH_LLL
sp_inv_pin = %0000_001_000_000 << 8'smartpin modes TT_MMMMM_0
sp_stx = (%0010 << 24) + %01_11100_0' +2
sp_srx = (%0011 << 24) + %00_11101_0' +3
sp_srx_c = %0_00111
sp_clk = %01_00101_0 + sp_inv_pin ' inv clock right way
initclks = (20*(2*8)) ' send 20 bytes to init card
byteclks = (2*8)
varlong di, do, clk, cs, starttime, sdhc, di_mask, do_mask, clk_mask, do_pin
long spi_bitcycles
'obj term: "SimpleSerial.spin2" ''-P2 TERMprisend(outv) | c, i' Send eight bits, then raise di.
i := di
c := clk
asm
SHL outv, #32-8' Set MSB FirstREV outv ' Set MSB Firstwypin outv, i ' data !!rdpinpa, c
wypin #byteclks, c 'start clock, tx data
.busy testp c wcif_ncjmp #.busy
wypin #$FF, i
endasm
pri read | c, o, t ' Read eight bits from the card.
c := clk
o := do
asm
rdpinpa, c
wypin #byteclks, c ' start clock
.busy testp c wcif_ncjmp #.busy
rdpin t, o ' get datarev t
and t, #$ff
endasm
return t
pubstart_explicit(iDO, iCLK, iDI, iCS)| t, c,csn,o,i, bt'RJA adding in some things to make work'' globals
do := iDO
clk := iCLK
di := iDI
cs := iCS
spi_bitcycles :=((clkfreq / spi_clk_max) /2 )+1'' locals
c := clk
csn := cs
o := do
i := di
bt := spi_bitcycles
asm
drvh csn
dirl c
wrpin #sp_clk, c
wxpin bt, c 'set base period (*2) ''2.5 400kmax initdirh c
dirl i
wrpin #sp_stx, i 'sync txwxpin #%1_00111, i 'stop/start modedirh i
wypin #$FF, i
dirl o
wrpin #sp_srx, o 'sync rxwxpin #sp_srx_c, o 'stop/start mode dirh o
wypin #initclks, c 'start clock, tx data n-1?
.busy testp c wcif_ncjmp #.busy
wypin #byteclks, c 'start clock, tx data n-1?rdpinpa, c
.busy2 testp c wcif_ncjmp #.busy2
endasm
starttime := t
cmd(0, 0)
drvh_(cs) ' Deselect the card to terminate a command.
cmd(8, $1aa)
read
read
read
read
drvh_(cs) ' Deselect the card to terminate a command. repeat
cmd(55, 0)
t := cmd(41, $4000_0000)
drvh_(cs) ' Deselect the card to terminate a command. if t <> 1
quit
if t
return -40'abort -40 ' could not initialize card
cmd(58, 0)
sdhc := (read >> 6) & 1
read
read
read
drvh_(cs) ' Deselect the card to terminate a command.if sdhc == 0' if not hc return -40' fail mount return sdhc +1' return card typepri readresp | r
'' made return timeout to caller CHZ 2-2019' Read eight bits, and loop until we' get something other than $ff.'repeatif (r := read) <> $ffreturn r
if checktime == -41return -41pri busy | r ' Wait until card stops returning busy'' made return timeout to caller CHZ 2-2019'repeatif (r := read)
return r
if checktime == -41return -41pri checktime ' Did we go over our time limit yet?if cnt - starttime > clkfreqreturn -41'abort -41 ' Timeout during readpricmd(op, parm)'' Send a full command sequence, and get and' return the response. We make sure cs is low,' send the required eight clocks, then the' command and parameter, and then the CRC for' the only command that needs one (the first one).' Finally we spin until we get a result.
DRVL_(cs)
starttime := cnt
read
send($40+op)
send(parm >> 24)
send(parm >> 16)
send(parm >> 8)
send(parm << 0)
if (op == 0)
send($95)
else
send($87)
return readresp
pri endcmd ' Deselect the card to terminate a command.
drvh_(cs) ' Deselect the card to terminate a command.PUB stop ' RJA adding in some thingsPUB release
pubstart(basepin)result := start_explicit(basepin, basepin+1, basepin+2, basepin+3)
pubreadblock(n, b) | c, i, o, outv, r'' Read a single block. The "n" passed in is the' block number (blocks are 512 bytes); the b passed' in is the address of 512 blocks to fill with the' data.'
starttime := cnt
cmd(17, n)
readresp
repeat sectorsize
byte[b++] := read
read
read
return endcmd
pubwriteblock(n, b)'' Write a single block. Mirrors the read above.'
starttime := cnt
cmd(24, n)
' cmd(24, doSDHC(n))
send($fe)
repeat sectorsize
send(byte[b++])
read
read
if ((readresp & $1f) <> 5)
return -42'abort -42
busy
return endcmd
{{
' Permission is hereby granted, free of charge, to any person obtaining
' a copy of this software and associated documentation files
' (the "Software"), to deal in the Software without restriction,
' including without limitation the rights to use, copy, modify, merge,
' publish, distribute, sublicense, and/or sell copies of the Software,
' and to permit persons to whom the Software is furnished to do so,
' subject to the following conditions:
'
' The above copyright notice and this permission notice shall be included
' in all copies or substantial portions of the Software.
'
' THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
' EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
' MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
' IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
' CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
' TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
' SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
}}
Pretty sure I'm getting close now. This is completely untested and I need to double check the init sequence a few times before I even try to debug this. I just wanted to post this in case someone cares to take a quick look for obvious errors. You know how it is when you stare at code too long and start missing things.
CON'Constants section
spi_clk_max = 20_000_000' 20mhz safe
initclks = (20*(2*8)) ' send 20 bytes to init card
byteclks = (2*8)
VARbyte cog
byte do, clk, di, cs
PUB HOOK ' test compile
start_explicit( 3, 2, 1, 0 )
PUB Stop
if cog
cogstop(cog-1)
cog := 0PUB GetCog
return cog
PUBstart_explicit( _DO, _CLK, _DI, _CS ) : card_type | domask, dimask, sbt'' save pins for later?
do := _DO
clk := _CLK
di := _DI
cs := _CS
'' prepare smartpin clock mask and bit time
domask := getmask(clk, do) ' bpin mask (clk)
dimask := getmask(clk, di)
sbt :=((clkfreq / spi_clk_max) /2 )+1' SPI bit timeif sbt < 2' make sure at least sysclock /2
sbt := 2
stop
long[@cmd] := "i"'set flag to know if the started cog has read its parameterslong[@cmd][1] := do | (clk << 8) + (di << 16) + (cs << 24)
long[@cmd][2] := domask + (dimask << 8) ' clock maskslong[@cmd][3] := sbt ' spi bit timelong[@cmd][4] := clkfreq'1slong[@cmd][5] := clkfreq / 100_000_000'1us
cog := cognew(@_asm_start,@cmd) + 1if cog
repeatuntillong[@cmd] <> "i"' wait until cog is done reading parameter
card_type := long[@cmd][1] ' arg1 returns card type or returns with ptr to error from low level driverelse' we don't have a cog? returnPUBReadBlock(n, b)return DoCommand("r", n, b, 0)
PUBWriteBlock(n, b)return DoCommand("w", n, b, 0)
PRIDoCommand(_cmd, _arg1, _arg2, _arg3)if cmd <> 0return @cogbusymsg '' busy
arg1 := _arg1
arg2 := _arg2
arg3 := _arg3
cmd := _cmd
repeatuntil cmd <> _cmd
return err
PRIgetmask(clk, data) | t
t := clk - data
if ( || t ) <> t
t := ( || t ) + %0100DATorg
_asm_start mov ptr_to_mbox, ptra' save mbox address for later'' get parameters from spin
do_init movptra, ptr_to_mbox ' return ptra back to mbox rdlong tmp, ptra++ ' pins long mov pinDO, tmp
and pinDO, #$ffmov pinCLK, tmp
shr pinCLK, #8and pinCLK, #$ffmov pinDI, tmp
shr pinDI, #16and pinDI, #$ffmov pinCS, tmp
shr pinCS, #24and pinCS, #$ffrdlong tmp, ptra++ 'mask longmov sp_doclk, tmp
and sp_doclk, #$fmov sp_diclk, tmp
shr sp_diclk, #4and sp_diclk, #$frdlong sp_bt, ptra++ ' timing long rdlong delay1s, ptra++ ' 1s longrdlong delay1us, ptra' 1us long movptra, ptr_to_mbox ' return ptra back to mbox '' setup pinsdrvh pinCS
dirl pinCLK '' setup Clock pinmov tmp, sp_clk
add tmp, sp_inv_pin ' invert clock wrpin tmp, pinCLK
wxpin sp_bt, pinCLK ' set base period (*2) dirh pinCLK ' enable clk dirl pinDI '' setup DI pin mov tmp, sp_stx ' save mode, bpin might changeadd tmp, sp_diclk ' setup bpinwrpin tmp, pinDI ' sync txwxpin sp_stx_c, pinDI ' stop/start modedirh pinDI ' enable smartpinwypin #$FF, pinDI ' prime shifter dirl pinDO '' setup DO pin mov tmp, sp_srx ' save mode, bpin might changeadd tmp, sp_doclk ' setup bpinwrpin tmp, pinDO ' sync rxwxpin sp_stx_c, pinDO ' 8 bit, pre dirh pinDO ' enable pin '' card init drvh pinCS ' set cs highwypin #initclks, pinCLK ' set init clocksgetct timeout ' setup 1s timeout addct1 timeout, delay1s ' in case card does not respond mov sd_cmd, #0'cmd 0,0mov sd_parm, #0call #wait_clock ' wait for smartpins to finish call #do_sdcmd
drvh pinCS ' set cs highmov sd_cmd, #8' cmd 8 $1aamov sd_parm, #$1AAcall #do_sdcmd
call #read_byte ' read 32 bitcall #read_byte
call #read_byte
call #read_byte
drvh pinCS ' set cs high
.cmd55 mov err_out, ##-40mov sd_cmd, #55'cmd 55,0mov sd_parm, #0call #do_sdcmd
mov sd_cmd, #41'cmd 55,0mov sd_parm, ##$4000_0000'call #do_sdcmd
drvh pinCS ' set cs highcmp data_out, #$01wz' 01 = idle if_zjmp #.cmd55
mov sd_cmd, #58'cmd 0,0mov sd_parm, #0call #do_sdcmd
call #read_byte
and data_in, #40cmp data_in, #0wzif_zmov adrShift, #9if_nzmov adrShift, #0call #read_byte
call #read_byte
call #read_byte
drvh pinCS ' set cs highmov err_out, #0mov sd_cmd, #16'cmd 0,0mov sd_parm, ##$200call #do_sdcmd
drvh pinCS ' set cs high
do_init_ret ret
get_cmd rdlong cmd_in, ptrawz''' timeout and release card after 10sif_zjmp #get_cmd 'no cmd so handle time rdlong arg1_in, ++ptra' rdlong arg2_in, ++ptra' rdlong arg3_in, ++ptra'rdlong arg4_in, ++ptra'cmp cmd_in, #"r"wzif_zjmp #read_a
cmp cmd_in, #"w"wzif_zjmp #write_b
' cmp cmd_in, #"z" wz' if_z jmp #releasecmp cmd_in, #"i"wzif_zcall #do_init
'' do commands
end_cmd wrlong err_out, ++ptramov cmd_in, #0movptra, ptr_to_mbox ' return ptra back to mboxwrlong cmd_in, ptr_to_mbox
jmp #get_cmd
read_a '' arg1 is sd ptr, arg2 is hubmov sd_cmd, #17' cmd 8 $1aa' shr adrShift, arg1_in ' non-hc compatibilitymov sd_parm, arg1_in
call #do_sdcmd
call #read_resp
movptrb, arg2_in
rep @.endread, sectorsize
call #read_byte
wrbyte data_in, ptrb++
.endread
call #read_byte
call #read_byte
drvh pinCS ' set cs high
read_a_ret ret
write_b '' arg1 is sd ptr, arg2 is hubmov sd_cmd, #24' cmd 8 $1aa' shr adrShift, arg1_in ' non-hc compatibilitymov sd_parm, arg1_in
call #do_sdcmd
mov data_out, #$fecall #send_byte
movptrb, arg2_in
rep @.endsend, sectorsize
rdbyte data_out, ptrb++
call #send_byte
.endsend
call #read_byte
call #read_byte
call #read_resp
and data_in, #$1fcmp data_in, #5wzif_nzmov err_out, ##-42call #sd_busy
drvh pinCS ' set cs high
write_b_ret ret
do_sdcmd ' OP is sd_cmd, parm = sd_parm drvl pinCS
getct timeout ' setup 1s timeout addct1 timeout, delay1s
call #read_byte
mov data_out, sd_cmd
add data_out, #$40call #send_byte
mov data_out, sd_parm
shr data_out, #24and data_out, #$ffcall #send_byte
mov data_out, sd_parm
shr data_out, #16and data_out, #$ffcall #send_byte
mov data_out, sd_parm
shr data_out, #8and data_out, #$ffcall #send_byte
mov data_out, sd_parm
and data_out, #$ffcall #send_byte
cmp sd_cmd, #0wzif_zmov data_out, #$95if_nzmov data_out, #$87call #send_byte
call #read_resp
do_sdcmd_ret ret
send_byte SHL data_out, #32-8' Set MSB FirstREV data_out ' Set MSB Firstwypin data_out, pinDI ' data !!rdpinpa, pinCLK
wypin #byteclks, pinCLK ' start clock, tx data call #wait_clock
wypin #$FF, pinDI
send_byte_ret ret
read_byte rdpinpa, pinCLK
wypin #byteclks, pinCLK ' start clock call #wait_clock
rdpin data_in, pinDO ' get datarev data_in
and data_in, #$ff
read_byte_ret ret
sd_busy call #read_byte
cmp data_in, #0wzif_zcall #check_time
sd_busy_ret ret
read_resp call #read_byte
cmp data_in, #$ffwzif_zcall #check_time
read_resp_ret ret
check_time pollct1wcif_cmov err_out, ##-41drvh pinCS ' set cs high' if_c jmp #timed_out
check_time_ret ret
wait_clock testp pinCLK wcif_ncjmp #wait_clock
wait_clock_ret ret{=== Smart pin constants ===}
sp_clk long%01_00101_0
sp_stx long%01_11100_0
sp_srx long%00_11101_0
sp_stx_c long%1_00111
sp_srx_c long%0_00111
sp_inv_pin long%0000_001_000_000 << 8
adrShift long0' will be 0 for SDHC, 9 for MMC & SD
sectorsize long512' loaded on init{=== Pin Variables ===}
pinDO long0' do
pinCLK long0' clk
pinDI long0' di
pinCS long0' chip select{===== Smartpin Clock Variables =====}'' smartpins stuffs
sp_doclk long0
sp_diclk long0
sp_bt long0{===== Command Mailbox =====}
ptr_to_mbox res1
cmd_in res1
arg1_in res1
arg2_in res1
arg3_in res1
arg4_in res1
err_out res1{=== Timing Variables ===}
delay1s res1' clkfreq = 1 s
delay1us res1' clkfreq / 100_000_000
timeout res1{=== Temp Variables ===}
tmp res1' this may get used in all subroutines...don't use except in lowest
data_in res1
data_out res1
sd_cmd res1
sd_parm res1'buff res 299FIT496dat'' debug codes
cogbusymsg byte"SPI Busy",0
badpins byte"pins out of range",0dat'' singleton spin mailbox
alignl
cmd long0
arg1 long0
arg2 long0
arg3 long0
arg4 long0
err long0{
var '' multi spin mailbox
long cmd, arg1, arg2, arg3, arg4, err
}
I couldn't get your code and my comments to fit in one post, so only code relevant to my comments follows.
In the asm init section, I don't see where do_init is used separately from _asm_start. Moving the just saved pointer back isn't really needed is it?
Rearranging your code could allow the use of a setq+ rdlong construct to read the whole parameter block in one action. It could save some pointer maths and the code reduction will allow for the extra long needed (tmp2). Using getnib, getbyte, and getword instead of shift and mask for the slice and dice of the packed parameters is worth investigating too.
In the clock pin setup you add two constants before the WRPIN. Why not just a single constant?
read_byte destroys the contents of data_in on each call, so where you call it multiple times in sequence are you only interested in the last byte received?
Pretty sure I'm getting close now. This is completely untested and I need to double check the init sequence a few times before I even try to debug this. I just wanted to post this in case someone cares to take a quick look for obvious errors. You know how it is when you stare at code too long and start missing things.
DATorg
_asm_start mov ptr_to_mbox, ptra' save mbox address for later'' get parameters from spin
do_init movptra, ptr_to_mbox ' return ptra back to mbox rdlong tmp, ptra++ ' pins long mov pinDO, tmp
and pinDO, #$ffmov pinCLK, tmp
shr pinCLK, #8and pinCLK, #$ffmov pinDI, tmp
shr pinDI, #16and pinDI, #$ffmov pinCS, tmp
shr pinCS, #24and pinCS, #$ffrdlong tmp, ptra++ 'mask longmov sp_doclk, tmp
and sp_doclk, #$fmov sp_diclk, tmp
shr sp_diclk, #4and sp_diclk, #$frdlong sp_bt, ptra++ ' timing long rdlong delay1s, ptra++ ' 1s longrdlong delay1us, ptra' 1us long movptra, ptr_to_mbox ' return ptra back to mbox '' setup pinsdrvh pinCS
dirl pinCLK '' setup Clock pinmov tmp, sp_clk
add tmp, sp_inv_pin ' invert clock wrpin tmp, pinCLK
wxpin sp_bt, pinCLK ' set base period (*2) dirh pinCLK ' enable clk dirl pinDI '' setup DI pin mov tmp, sp_stx ' save mode, bpin might changeadd tmp, sp_diclk ' setup bpinwrpin tmp, pinDI ' sync txwxpin sp_stx_c, pinDI ' stop/start modedirh pinDI ' enable smartpinwypin #$FF, pinDI ' prime shifter dirl pinDO '' setup DO pin mov tmp, sp_srx ' save mode, bpin might changeadd tmp, sp_doclk ' setup bpinwrpin tmp, pinDO ' sync rxwxpin sp_stx_c, pinDO ' 8 bit, pre dirh pinDO ' enable pin '' card init drvh pinCS ' set cs highwypin #initclks, pinCLK ' set init clocksgetct timeout ' setup 1s timeout addct1 timeout, delay1s ' in case card does not respond mov sd_cmd, #0'cmd 0,0mov sd_parm, #0call #wait_clock ' wait for smartpins to finish call #do_sdcmd
drvh pinCS ' set cs highmov sd_cmd, #8' cmd 8 $1aamov sd_parm, #$1AAcall #do_sdcmd
call #read_byte ' read 32 bitcall #read_byte
call #read_byte
call #read_byte
drvh pinCS ' set cs high
.cmd55 mov err_out, ##-40mov sd_cmd, #55'cmd 55,0mov sd_parm, #0call #do_sdcmd
mov sd_cmd, #41'cmd 55,0mov sd_parm, ##$4000_0000'call #do_sdcmd
drvh pinCS ' set cs highcmp data_out, #$01wz' 01 = idle if_zjmp #.cmd55
mov sd_cmd, #58'cmd 0,0mov sd_parm, #0call #do_sdcmd
call #read_byte
and data_in, #40cmp data_in, #0wzif_zmov adrShift, #9if_nzmov adrShift, #0call #read_byte
call #read_byte
call #read_byte
drvh pinCS ' set cs highmov err_out, #0mov sd_cmd, #16'cmd 0,0mov sd_parm, ##$200call #do_sdcmd
drvh pinCS ' set cs high
do_init_ret ret
get_cmd rdlong cmd_in, ptrawz''' timeout and release card after 10sif_zjmp #get_cmd 'no cmd so handle time rdlong arg1_in, ++ptra' rdlong arg2_in, ++ptra' rdlong arg3_in, ++ptra'rdlong arg4_in, ++ptra'cmp cmd_in, #"r"wzif_zjmp #read_a
cmp cmd_in, #"w"wzif_zjmp #write_b
' cmp cmd_in, #"z" wz' if_z jmp #releasecmp cmd_in, #"i"wzif_zcall #do_init
'' do commands
end_cmd wrlong err_out, ++ptramov cmd_in, #0movptra, ptr_to_mbox ' return ptra back to mboxwrlong cmd_in, ptr_to_mbox
jmp #get_cmd
read_a '' arg1 is sd ptr, arg2 is hubmov sd_cmd, #17' cmd 8 $1aa' shr adrShift, arg1_in ' non-hc compatibilitymov sd_parm, arg1_in
call #do_sdcmd
call #read_resp
movptrb, arg2_in
rep @.endread, sectorsize
call #read_byte
wrbyte data_in, ptrb++
.endread
call #read_byte
call #read_byte
drvh pinCS ' set cs high
read_a_ret ret
write_b '' arg1 is sd ptr, arg2 is hubmov sd_cmd, #24' cmd 8 $1aa' shr adrShift, arg1_in ' non-hc compatibilitymov sd_parm, arg1_in
call #do_sdcmd
mov data_out, #$fecall #send_byte
movptrb, arg2_in
rep @.endsend, sectorsize
rdbyte data_out, ptrb++
call #send_byte
.endsend
call #read_byte
call #read_byte
call #read_resp
and data_in, #$1fcmp data_in, #5wzif_nzmov err_out, ##-42call #sd_busy
drvh pinCS ' set cs high
write_b_ret ret
do_sdcmd ' OP is sd_cmd, parm = sd_parm drvl pinCS
getct timeout ' setup 1s timeout addct1 timeout, delay1s
call #read_byte
mov data_out, sd_cmd
add data_out, #$40call #send_byte
mov data_out, sd_parm
shr data_out, #24and data_out, #$ffcall #send_byte
mov data_out, sd_parm
shr data_out, #16and data_out, #$ffcall #send_byte
mov data_out, sd_parm
shr data_out, #8and data_out, #$ffcall #send_byte
mov data_out, sd_parm
and data_out, #$ffcall #send_byte
cmp sd_cmd, #0wzif_zmov data_out, #$95if_nzmov data_out, #$87call #send_byte
call #read_resp
do_sdcmd_ret ret
send_byte SHL data_out, #32-8' Set MSB FirstREV data_out ' Set MSB Firstwypin data_out, pinDI ' data !!rdpinpa, pinCLK
wypin #byteclks, pinCLK ' start clock, tx data call #wait_clock
wypin #$FF, pinDI
send_byte_ret ret
read_byte rdpinpa, pinCLK
wypin #byteclks, pinCLK ' start clock call #wait_clock
rdpin data_in, pinDO ' get datarev data_in
and data_in, #$ff
read_byte_ret ret
sd_busy call #read_byte
cmp data_in, #0wzif_zcall #check_time
sd_busy_ret ret
read_resp call #read_byte
cmp data_in, #$ffwzif_zcall #check_time
read_resp_ret ret
check_time pollct1wcif_cmov err_out, ##-41drvh pinCS ' set cs high' if_c jmp #timed_out
check_time_ret ret
wait_clock testp pinCLK wcif_ncjmp #wait_clock
wait_clock_ret ret{=== Smart pin constants ===}
sp_clk long%01_00101_0
sp_stx long%01_11100_0
sp_srx long%00_11101_0
sp_stx_c long%1_00111
sp_srx_c long%0_00111
sp_inv_pin long%0000_001_000_000 << 8
adrShift long0' will be 0 for SDHC, 9 for MMC & SD
sectorsize long512' loaded on init{=== Pin Variables ===}
pinDO long0' do
pinCLK long0' clk
pinDI long0' di
pinCS long0' chip select{===== Smartpin Clock Variables =====}'' smartpins stuffs
sp_doclk long0
sp_diclk long0
sp_bt long0{===== Command Mailbox =====}
ptr_to_mbox res1
cmd_in res1
arg1_in res1
arg2_in res1
arg3_in res1
arg4_in res1
err_out res1{=== Timing Variables ===}
delay1s res1' clkfreq = 1 s
delay1us res1' clkfreq / 100_000_000
timeout res1{=== Temp Variables ===}
tmp res1' this may get used in all subroutines...don't use except in lowest
data_in res1
data_out res1
sd_cmd res1
sd_parm res1'buff res 299FIT496dat'' debug codes
cogbusymsg byte"SPI Busy",0
badpins byte"pins out of range",0dat'' singleton spin mailbox
alignl
cmd long0
arg1 long0
arg2 long0
arg3 long0
arg4 long0
err long0{
var '' multi spin mailbox
long cmd, arg1, arg2, arg3, arg4, err
}
@AJL, Thanks for taking a look. I've got a bunch of problems in my code above... The first I found was I was missing a dummy rdlong at the beginning of my init. I also am packing the clock pin masks wrong. There's also something ?else? wrong because I'm outputting clock but still not getting data.
Re. init -
This is one I'm not 100% sure how I want to do. I'd like to be able to change sysclock or pins by restarting the cog (it's called from get_cmd) I could probably get away with one entry point but was even thinking about separating the timing stuff from pin stuff but going to leave it like this for now I think.
re. setq + rdlong -
This is one improvement I'm going to look at once I have things working. That goes for the getnib / setnib instructions, just not quite there yet. Trying to stay as close to what I know..
The extra constant for the clock pin inverts it and I plan on having a switch somewhere to control the polarity of the clock, as well as set default max SPI clock. Just haven't been able to get my head around that yet.
Read_byte trashes data_out and most of the time I'm not interested in any of the data.
I'm stumbling my way through, found all kinds of errors.. I'm stuck on the conversion of readblock… It took me a while to figure out what is going on and it seems like I'm trashing the call stack? I believe it's because of my use of CALLs and JMPs. I'm missing the obvious and correct way of conditional repeat. Spin code is this.
prireadresp | r' Read eight bits, and loop until we' get something other than $ff.'repeatif (r := read) <> $ffreturn r
if checktime == -41return -41
When I was trying to convert it to ASM I had a feeling this wouldn't work.
Scratch that, seems the problem is REP is not working I noticed this while trying to use the output of fastspin to help guide me. For some reason, this is not repeating 512 times, just falling through. I know there was a bug with the REP instruction but I can't remember what it was exactly. I think I'm just doing something wrong.
That's a normal REP rule. The CALL counts as an exit branch from the REP mechanism. Another rule is any exit branching can't be the last instruction of the REP block. Another rule is interrupts are put on hold until the REP completes/exits.
The flaw (Corrected in the final silicon) is to do with a group of branching instructions that test the hardware events mechanism, eg: JCT1.
Thanks guys! I'm getting close, just got to nail down block writes! Once I get that done I'll try to clean up code and post. A quick benchmark (since I don't have writes yet and can't run the full test) for mounting a card is 4.57ms bit-bashed inline asm, 4.04ms inline asm smartpins and 2.33ms smartpins in their own cog. These were all @160mhz sysclock and 20mhz spi clock. Note, my LA can't keep up with 20mhz SPI.
Interestingly enough, the bit-bashed command packet is 2nd to complete although it's the all out loser in final times.
Bit - bashed SPI
Smartpins inline
Smartpins cog
*edit
Got block writes working so was able to come up with a full set of test times. To mount the card, read and write 2m.
I'll give the code a quick cleanup, now that I've got everything working including clock polarity. There's a lot of places to improve still. I'd like to figure out a way to determine max spi clock at init, although I really can't wrap my head around any way besides brute-force init card and increase speed till it fails. Maybe I'm missing something obvious as usual. Really happy I have this working though.
*final edit
I did some code cleanup and have just about everything where I want it. I'm posting a NEW RC3 here and once things are tested further I'll update the first post.
Still testing RC3 posted above. I've got a bunch of work left to do but it's fairly functional AFAIK. I'm wondering if anyone can give me a reason why it would be better to return a negative error code, as opposed to aborting? I'm probably going to re-write with aborts, as V1 used. Just wondering if anyone has a good argument against aborts?
I don’t like abort
Seems like unnecessary complexity to me
For some cases I can see how it could be considered unnecessary but I'm working on the Touchscreen OS again and aborts seem really helpful for SD cards. The ability to point to an error message, instead of needing to do a lookup on an error code. Six of one, half dozen of the other... I guess it could be done with conditional compile and would be the best, because a lot of P1 apps expect negative return type errors from FSRW. I really like how verbose Kye's SD driver is, plus it would be nice to not need to rewrite the ENTIRE touch.spin.
As I said, I really like Kye's driver and have pretty much talked myself into trying to get the genius piece of code working. One of the functions that is very essential to the OS is the ability to launch a file from SD card and it seems that @Cluso99 's on-chip SD could do this with little additional code. I just haven't wrapped my head around that yet. In the past I had commented out large portions of Kye's driver to free up memory but now that we have enough hubram I'm hoping to be able to do BIG THINGS much easier.
Perhaps the title of this thread should be changed to cover both FSRW and KYE's SD?
Kye’s SD driver is great. It’s easy to follow. I added a few extras such as returning the physical sector etc.
From a users perspective the names are too long, but makes understanding what its doing a breeze.
This is one of the reasons i wanted to get the Spin1 interpreter working with P2. But i got sidetracked near the end with work and haven’t managed to get back to it yet.
FWIW the ROM SD code, with the addition of write, and tweeking the actual spi routine, this should slot into Kye’s code nicely.
Kye’s SD driver is great. It’s easy to follow. I added a few extras such as returning the physical sector etc.
From a users perspective the names are too long, but makes understanding what its doing a breeze.
This is one of the reasons i wanted to get the Spin1 interpreter working with P2. But i got sidetracked near the end with work and haven’t managed to get back to it yet.
FWIW the ROM SD code, with the addition of write, and tweeking the actual spi routine, this should slot into Kye’s code nicely.
Kye's driver is great and the Touch OS is based largely on your OS based on Kye's SD driver I really wish I could wrap my head around how to interface cleanly with the ROM SD code. I know I'm probably just making it harder than I need to.
The main reason I didn't focus on using the ROM code was needing to understand SPI on the P2 a little better.
I'm stuck needing a driver for the resistive touch ADC XPT2046. Going to try modifying SDSPI for this. Once I get that done I'll be able to test a lot more, although I'm sure there will be a ton of debugging and bugfixes.
One of the things that looks really promising is LUT sharing. IE, read block from sd using CogN into shared LUT and use CogN+1 to push that block to display (or ram/flash). CogN could be reading the next block while N+1 is writing the current block. Seems like this could save a TON of hubcycles over the way things are done on the P1. I don't see myself getting around to this any time soon since I'm not sure how other people would be able to use such a specific driver.
I'm hoping to try using the rom to run a file, how many times can I get this one wrong? lol
I updated the 1st post with the current RC and some RC notes. I really wish I was better at documentation but until I get some feedback from people using this it seems pointless. I'm guessing right now there isn't much need for a P2 FSRW or Kyes, but that will change soon hopefully!
Really hoping the new silicon doesn't break things too much, hopefully worst case is change smartpin clock from sysclock /2 to /3 or /4. Time will tell though!
@cheezus,
Note the ES ROM has some fixes in the respin ROM to better recover from the SD DO problem which fails to release the DO drive from the SD card. This means the SD ROM code positions change in the new ROM. The Serial Monitor/Debugger did not change position so this is fine to call.
Second thing to note is that the SD ROM code (routine) to drive the SPI code is deliberately not fast to ensure it works reliably from hubexec. It has send/receive combined ie it shifts in (reads) as it shifts out (writes).
The code in the new ROM is way better in the DO error handling, so this is what I will be using as a base for my SD (if you or others haven't done it when I need it), with just the write sector added, and the SPI send/receive routine replaced with a faster version.
BTW there is IIRC a single further fix for the DO that did not make it into the new ROM, so I will need to dig it out. I am overseas for another 2 weeks so that will have to wait.
@cheezus,
Note the ES ROM has some fixes in the respin ROM to better recover from the SD DO problem which fails to release the DO drive from the SD card. This means the SD ROM code positions change in the new ROM. The Serial Monitor/Debugger did not change position so this is fine to call.
Second thing to note is that the SD ROM code (routine) to drive the SPI code is deliberately not fast to ensure it works reliably from hubexec. It has send/receive combined ie it shifts in (reads) as it shifts out (writes).
The code in the new ROM is way better in the DO error handling, so this is what I will be using as a base for my SD (if you or others haven't done it when I need it), with just the write sector added, and the SPI send/receive routine replaced with a faster version.
BTW there is IIRC a single further fix for the DO that did not make it into the new ROM, so I will need to dig it out. I am overseas for another 2 weeks so that will have to wait.
I've looked at the first ROM code quite a bit. It's beautiful, but I was planning on having a cog just for the SD driver so hubexec is what got me confused. I was also aware of the changes for the respin, another reason I haven't worked on this. I would have liked to build a driver around it but figured you would be doing one (using hubexec) so focusing on something that could be stuffed in a cog like P1. FSRW is a bare minimum driver and so my focus on early development. Soon I'll need a way to run a precompiled binary and if I can figure out how to copy the filename to the right place in hub and jmp into the hubexec code I'll be really happy.
I don't have time to verify just now, but this is what you need to know
'+-------[ SD: Initialise/Locate/Load/Run a file from SD ]---------------------+ <--- SD: init/locate/load/run a file --->'+ On Entry: +'+ fname[3]: filename[11] 8+3 without "." (will be $0 terminated) +'+ Call Format: +'+ CALL #@_Run_SDfile ' + < call: init/locate/load/run a file >'+ On Return: +'+ "NZ" if error, else does not return +'+-----------------------------------------------------------------------------+'+-----------------------------------------------------------------------------+
fc560 fdb00040 _Start_SDcard call #@_SDcard_Init ' initialise & read CSD/CID
fc564 f6079a00 mov skiprun, #0' load/run MBR/VOL code
fc568 adb00174 if_ecall #@_readMBR ' read MBR/VOL/FSI/FAT
fc56c adb002a0 if_ecall #@_readDIR ' read directory for filenames'' mov skiprun, #0 ' load/run <file> (already 0)
fc570 adb0038c if_ecall #@_readFILE ' read/load/run the file
fc574 fd800100 JMP #try_serial ' failed: so go back and try serial'+-----------------------------------------------------------------------------+
fc578 fdb00028 _Run_SDfile call #@_SDcard_Init ' initialise & read CSD/CID'' mov skiprun, #1 ' do not load/run MBR/VOL code
fc57c adb0015c if_ecall #@_readMBR1 ' read MBR/VOL/FSI/FAT (don't run)
fc580 adb002bc if_ecall #@_searchDIR ' search dir for <fname>
fc584 f6079a00 mov skiprun, #0' load/run <file>
fc588 adb00374 if_ecall #@_readFILE ' read/load/run the file
fc58c fd64002d RET' return "NZ" = failed, "Z" if loaded ok'+-----------------------------------------------------------------------------+
fc590 fdb00010 _Load_SDfile call #@_SDcard_Init ' initialise & read CSD/CID'' mov skiprun, #1 ' do not load/run MBR/VOL code
fc594 adb00144 if_ecall #@_readMBR1 ' read MBR/VOL/FSI/FAT (don't run)
fc598 adb002a4 if_ecall #@_searchDIR ' search dir for <fname>'' mov skiprun, #1 ' do not load/run <file> (already 1)
fc59c adb00360 if_ecall #@_readFILE ' read/load the file
fc5a0 fd64002d RET' return "NZ" = failed, "Z" if loaded ok'+-----------------------------------------------------------------------------+
and
DAT''============[ COG VARIABLES - SD BOOT]========================================
fc560 1c0 org$1C0' place the variables in cog $1C0-$1DF
fc560 1c0 cmdout res1' The 8b CMDxx | $40
fc560 1c1 cmdpar res1' The 32b parameters
fc560 1c2 cmdcrc res1' The 8b CRC (must be valid for CMD0 & CMD8)
fc560 1c3 cmdpar2 res1' SDV1=$0, SDV2=$40000000
fc560 1c4 cmdtype res1' reply is R1=1, R3=3, R7=7, else 0
fc560 1c5 reply res1' R1 reply (moved to replyR1 when R3/R7 32b reply here)
fc560 1c6 replyR1 res1' R1 reply (8b saved when R3/R7 32b reply follows)
fc560 1c7 dataout res1' 8/32 bit data being shifted out
fc560 1c8 bytescnt res1' #bytes to send/recv
fc560 1c9 bitscnt res1' #bits to be shifted in/out
fc560 1ca ctr1 res1
fc560 1cb timeout res1' = starttime + delay
fc560 1cc spare res1
fc560 1cd skiprun res1' 1= skip load/run mbr/vol & load/no-run fname'\ 1=SDV1, 2=SDV2(byte address), 3=SDHC/SDV2(block address)
fc560 1ce blocksh res1'/ block shift 0/9 bits
fc560 1cf clustersh res1' sectors/cluster SHL 'n' bits
fc560 1d0 vol_begin res1'$0000_2000 ' Ptn0: first sector of PTN
fc560 1d1 fsi_begin res1'$0000_2001 ' Ptn0: sector of file system info
fc560 1d2 fat_begin res1'$0000_3122 ' Ptn0: first sector of FAT table
fc560 1d3 dir_begin res1' $0000_4000 ' Ptn0: first sector of DATA is DIR table
fc560 1d4 dat_begin res1'$0000_4580 $0000_54C0' Ptn0: first sector of file's DATA
fc560 1d5 ptn_size res1'$0008_0000 ' file-size 32KB = 64<<9 sectors
fc560 1d6 _bufad res1
fc560 1d7 _blocknr res1
fc560 1d8 _sectors res1
fc560 1d9 _entries res1
fc560 1da bufad res1' ptr sector buffer
fc560 1db blocknr res1' sector#
fc560 1dc fname res3' 8+3+1
fc560 1df _hubdata res1
In your code...
cog $1C0-$1DF is required for use by the SD ROM driver
Place the filename (8+3+$0) into 3 longs at fname (cog $1DC) (remember little endian so reverse each group of 4 characters!)
So "filename.xtn" would become "elif" + "eman" + ($0,"ntx")
I'm working on this again and I think I'm getting ready for another release. I need to verify all my changes are correct first but it looks much nicer now that I'm using the getbyte instructions. There's a couple uglies left maybe someone can point to the CLEAN way to do this.
... ' ex1
send_byte SHL data_out, #32-8' Set MSB FirstREV data_out ' Set MSB First
…
… 'ex2rev data_in
and data_in, #$ff
read_byte_ret ret
I also need to use all the new test instructions but I think I can get those SPI smartpin setup looks like this now
PUBstart_explicit( _DO, _CLK, _DI, _CS ) : card_type | domask, dimask
stop
long[@cmd] := "i"'set flag to know if the started cog has read its parameterslong[@cmd][1] := _DO | (_CLK << 8) + (_DI << 16) + ( _CS << 24)' pins long[@cmd][2] := getmask(_CLK, _DO) + (getmask(_CLK, _DI) << 4) ' clock maskslong[@cmd][2] += spi_clk_inv << 8' clock invertlong[@cmd][3] := GetBitTime(spi_clk_max) ' spi bit timelong[@cmd][4] := GetBitTime(400_000) '400k
cog := cognew(@_asm_start,@cmd) + 1if cog
repeatuntillong[@cmd] <> "i"' wait until cog is done reading parameter
card_type := long[@cmd][4] ' arg5 returns card type or returns with ptr to error from low level driverDATorg
_asm_start mov ptr_to_mbox, ptra' save mbox address for later'' get parameters from spin
do_init movptra, ptr_to_mbox ' return ptra back to mbox rdlong tmp, ptra++ ' command long 0 rdlong tmp, ptra++ ' pins long 1 getbyte pinDO, tmp, #0getbyte pinCLK, tmp, #1getbyte pinDI, tmp, #2getbyte pinCS, tmp, #3rdlong tmp, ptra++ 'mask long 2 getnib sp_doclk, tmp, #0getnib sp_diclk, tmp, #1getnib sp_inv_pin, tmp, #2shl sp_doclk, #24shl sp_diclk, #24shl sp_inv_pin, #14rdlong sp_bt, ptra++ ' timing long 3 rdlong sp_init_bt, ptra' 400k 4 rdlong delay1s, #$14' clkfreq from hub '' setup pinsdrvh pinCS
dirl pinCLK '' setup Clock pinmov tmp, sp_clk
add tmp, sp_inv_pin ' invert clock wrpin tmp, pinCLK
wxpin sp_init_bt, pinCLK ' set base period (*2) dirh pinCLK ' enable clk dirl pinDI '' setup DI pin mov tmp, sp_stx ' save mode, bpin might changeadd tmp, sp_diclk ' setup bpinwrpin tmp, pinDI ' sync txwxpin sp_stx_c, pinDI ' stop/start modedirh pinDI ' enable smartpinwypin #$FF, pinDI ' prime shifter dirl pinDO '' setup DO pin mov tmp, sp_srx ' save mode, bpin might changeadd tmp, sp_doclk ' setup bpinwrpin tmp, pinDO ' sync rxwxpin sp_stx_c, pinDO ' 8 bit, pre dirh pinDO ' enable pin '' card init drvh pinCS ' set cs highwypin sectorsize, pinCLK ' set init clocksgetct timeout ' setup 1s timeout addct1 timeout, delay1s ' in case card does not respond call #wait_clock ' wait for smartpins to finishdirl pinCLK '' setup Clock pinwxpin sp_bt, pinCLK ' set base period (*2) dirh pinCLK ' enable clk
I think the fifo could help in the readbyte and writebyte but haven't wrapped my head around that yet.
I'm ready to try getting this runfile thing sorted and I THINK understand things pretty well. I'm about to start renaming some of my variables to match the convention laid out by @Cluso99. Not that there's a shortage of cogram yet with 237 longs left...
The other thing that I've been trying to wrap my head around is the LUT sharing.. Seems that YModem would really benefit from being able to load the LUT with 1k from the SD and then the ASync serial cog could just use a shared LUT bypassing the hub. I would also like to do this with my RAM/Display driver i'm working on and it seems probable, it would need some convention to be set with the objects that share the adjacent cog.
Packing the mask long into nibbles instead of bytes doesn't really gain anything.
How about something like this using bytes:
DATorg
_asm_start mov ptr_to_mbox, ptra' save mbox address for later'' get parameters from spin
do_init movptra, ptr_to_mbox ' return ptra back to mbox rdlong tmp, ptra++ ' command long 0; not usedrdbyte pinDO, ptra++ ' pins long 1rdbyte pinCLK, ptra++
rdbyte pinDI, ptra++
rdbyte pinCS, ptra++
rdlong sp_doclk, ptra' mask long 2getbyte sp_diclk, sp_doclk, #1getbyte sp_inv_pin, sp_doclk, #2shl sp_doclk, #24shl sp_diclk, #24shl sp_inv_pin, #14rdlong delay1s, #$14' clkfreq from hub '' setup pins
Slightly smaller memory footprint, and saves double handling the data when it isn't necessary.
I'm packing into nibbles so I have extra bits later if I need them. Not real sure HOW to use them yet so it might be a waste. I was thinking about different ways I could do this and I really wanted it to all fit into the mailbox. I finally got the init sequence the way I wanted, with all of the parameters being loaded by the normal command loop.
DATorg
[code]
DATorg
_asm_start mov ptr_to_mbox, ptra' save mbox address for later'' get parameters from spin
get_cmd rdlong cmd_in, ptrawz''' timeout and release card after 10s if_zjmp #get_cmd ' no cmd so handle time rdlong arg1_in, ++ptrardlong arg2_in, ++ptrardlong arg3_in, ++ptrardlong err_out, ++ptra' dummy read to align err pointer'' decode command cmp cmd_in, #"i"wzif_zjmp #do_init
cmp cmd_in, #"c"wzif_zcall #card_init
cmp cmd_in, #"r"wzif_zjmp #read_a
cmp cmd_in, #"w"wzif_zjmp #write_b
' cmp cmd_in, #"z" wz ' if_z jmp #release mov err_out, #5' err = no command ' '' do commands
end_cmd drvh pinCS ' set cs highwrlong err_out, ptramov cmd_in, #0' signal done to spin driver movptra, ptr_to_mbox ' return ptra back to mboxwrlong cmd_in, ptr_to_mbox
jmp #get_cmd
do_init getbyte pinDO, arg1_in, #0getbyte pinCLK, arg1_in, #1getbyte pinDI, arg1_in, #2getbyte pinCS, arg1_in, #3getnib sp_doclk, arg2_in, #0getnib sp_diclk, arg2_in, #1getnib sp_inv_pin, arg2_in, #2shl sp_doclk, #24shl sp_diclk, #24shl sp_inv_pin, #14mov sp_bt, arg3_in ' get max clockmov sp_init_bt, err_out ' get 400k clockmov err_out, #0' err = no error rdlong delay1s, #$14' clkfreq from hub dirl pinCLK '' setup Clock pin mov spare, ##sp_clk ' transition add spare, sp_inv_pin ' invert clock wrpin spare, pinCLK ' set clock modewxpin sp_init_bt, pinCLK ' set base period (*2) dirh pinCLK ' enable clk dirl pinDI '' setup DI pin mov spare, ##sp_stx ' sync txadd spare, sp_diclk ' setup bpinwrpin spare, pinDI ' set di modewxpin #sp_stx_c, pinDI ' set di configdirh pinDI ' enable smartpinwypin #$FF, pinDI ' prime shifter dirl pinDO '' setup DO pin mov spare, ##sp_srx ' sync rxadd spare, sp_doclk ' setup bpinwrpin spare, pinDO ' set do modewxpin #sp_srx_c, pinDO ' set do config dirh pinDO ' enable pin call #card_init
jmp #end_cmd
I'm pretty happy with this although the memory footprint is a little wasteful, with cluso's variables in cog. I could probably squeeze out a few cog longs but it's getting there.
I keep thinking about ways to deal using the LUT and I've got a proposal. Instead of passing a positive number for the hub address for block read/write, if you pass a negative number it will be loaded to LUT. This would give us 4 pages, I just need to figure out how to handle loading cog2 and doing something with the data. Since the LUT can't be accessed in bytes like the cogram can, i'm not sure how much more efficient packing bytes into longs and copying to LUT will be. I think that's the next step though.
I did separate the SD card init from pins init, to allow mounting of cards without resetting pins. I need to setup commands for this, as well as handle release properly. I also want to implement an auto-release timeout too, just need to try to implement it.
I'm going to release RC4 here, it's a major code cleanup. I tried to use all the instructions where applicable. Things clean up nicely this way. I know that with 3 operand instructions vs 2 regular instructions it's only readability but still, I LOVE P2ASM!!!!
I'm also including Ymodem with this release, although it appears receive is broken. I think it's baud related, or maybe it's a buffer overflow issue since smartpin serial isn't buffered? I'd think that would show up as a problem sending a file TO the sd card?? I'll debug that as soon as I'm done with this SD driver
I updated the above code blurbs to reflect the new versions. Now that I've got cog variables cleaned up a bit and I'm getting ready to try to run a file I have a couple final questions. First to verify the file that is renamed and placed on card is the .EEPROM file created? (?duh?) Most important, if I create a constants section with the rom hook addresses... what's the proper JMP convention? I'm SOOOO lost on that part!
I'm also thinking about the whole pushing data to LUT. I guess as far as raw performance goes it would help to use the FIFO, so I should probably start reading up on that... LUTS seems like the way to go, although if you have to pack / unpack bytes it might not be as efficient? It would be wasteful to only get one block read/write into luts and I think it would also kinda defeat the purpose?!?
*edit
Thanks for the suggestion @msrobots, here's the updated version. I made the change to the release real quick.
PUBstart_explicit( _DO, _CLK, _DI, _CS ) : card_type | domask, dimask
stop
cmd := "i"'set flag to know if the started cog has read its parameters
arg1 := _DO | (_CLK << 8) + (_DI << 16) + ( _CS << 24)' pins
arg2 := getmask(_CLK, _DO) + (getmask(_CLK, _DI) << 4) ' clock masks
arg2 += spi_clk_inv << 8' clock invert
arg3 := GetBitTime(spi_clk_max) ' spi bit time
err := GetBitTime(400_000 ) '400k
cog := cognew(@_asm_start,@cmd) + 1if cog
repeatuntil cmd <> "i"' wait until cog is done reading parameter
Still looking for "fixes" for these -
... ' ex1
send_byte SHL data_out, #32-8' Set MSB FirstREV data_out ' Set MSB First
…
… 'ex2rev data_in
and data_in, #$ff
read_byte_ret ret
Comments
Ugh! Please don't do this. I know it was popular in Spin1 to save time by passing pointers to the parameters that happened to be on the stack, but it really kills performance in fastspin (it'll work, but it forces the compiler to store local variables on the stack instead of in registers, which is much slower). It also makes porting the code to other languages harder, because not all of them will pass variables on the stack -- many, like fastspin, will pass them in registers instead.
You've also got the mailbox in the DAT section, which means it's shared by all instances of the object. If you wanted to drive two different cards, with two different COGs, you'd need to have a lock on the mailbox so they don't conflict. Whereas if the mailbox is a local variable of the object this wouldn't be an issue. Maybe it doesn't matter for the SD card driver, but you mentioned wanting to use this as a template for other objects, so it might come up there.
So I would suggest that the more natural way to write this is:
VAR long mbox[10] ' or whatever size ... PUB start_explicit(DO, CLK, DI, CS) : card_type stop mbox[0] := "i" mbox[1] := DO mbox[2] := CLK mbox[3] := DI mbox[4] := CS ... cog := cognew(@asm_start, @mbox)
That'll also simplify your PASM startup code, because you won't have to juggle multiple pointers (you just have the one to mbox), and I think makes everything easier to understand, both for the human reader and for the compiler.
Regards,
Eric
As to passing parameters, I'll need to figure out the best way to handle this. I use this trick because there are many times I would like to pass more initialization info to the cog than I have parameters. I never did get my head around the "pre-load" trick used in the P1 and always did it this way. I guess I could use several commands for initialization, need to think on this more.
Regarding the mailbox in the dat section, I also have a commented out VAR section if I wanted multiple instances running at the same time. I will have one platform with 2 sd cards so this will be something to think about. That hardware also has a shared SPI bus so was thinking about providing some SPI methods for accessing those devices as well. I think it's best to start out "generic" so maybe others can find it helpful as well.
pri read : r | c, o ' Read eight bits from the card. c := clk o := do asm dirl c wrpin sp_async_tx,c wxpin ##$9,c 'set base period dirh c dirl o wrpin sp_clk_cfg,o 'sync tx wxpin #%1_00111,o 'stop/start mode dirh o 'enable smartpin ' wypin txdata,o 'data, need to do anything? wypin #8,c 'start clock, tx data '' ! should this be 8 or 16? busy testp c wc if_nc jmp #busy rdpin pa,c ' needed? rdpin r,o dirl o dirl c '' next part per chip's smart pin documentation REV r ' Set MSB First TRIML r,#7 endasm
pub readblock(n, b) | c, i, o, outv, r ' ' Read a single block. The "n" passed in is the ' block number (blocks are 512 bytes); the b passed ' in is the address of 512 blocks to fill with the ' data. ' c := clk o := do i := di DRVL_(cs) starttime := cnt asm rep #.end_read0, #8 drvl c drvh c .end_read0 mov outv, #$40+17 rol outv, #24 rep #.end_send0, #8 drvl c rol outv, #1 wc drvc i drvh c .end_send0 drvh i mov outv, n rep #.end_send1, #32 drvl c rol outv, #1 wc drvc i drvh c .end_send1 drvh i mov outv, #$87 rol outv, #24 rep #.end_send2, #8 drvl c rol outv, #1 wc drvc i drvh c .end_send2 drvh i endasm readresp readresp repeat sectorsize asm mov r, #0 rep #.end_read, #8 drvl c waitx #14 '' !! 13 safe for up to 160 mhz, 14 for 320mhz, 57@320mhz external testp o wc drvh c rcl r, #1 .end_read endasm byte[b++] := r asm rep #.end_read2, #16 drvl c drvh c .end_read2 endasm return endcmd
I've also mostly converted CMD to inline ASM. Again proving things work before burying in a cog!
pri cmd(op, parm) | c, i, o, outv ' ' Send a full command sequence, and get and ' return the response. We make sure cs is low, ' send the required eight clocks, then the ' command and parameter, and then the CRC for ' the only command that needs one (the first one). ' Finally we spin until we get a result. DRVL_(cs) c := clk o := do i := di starttime := cnt asm rep #.end_read0, #8 drvl c drvh c .end_read0 mov outv, op add outv, #$40 rol outv, #24 rep #.end_send0, #8 drvl c rol outv, #1 wc drvc i drvh c .end_send0 drvh i mov outv, parm rep #.end_send1, #32 drvl c rol outv, #1 wc drvc i drvh c .end_send1 drvh i cmp op, #0 wz if_z mov outv, #$95 if_nz mov outv, #$87 rol outv, #24 rep #.end_send2, #8 drvl c rol outv, #1 wc drvc i drvh c .end_send2 drvh i endasm return readresp
I'm really hoping to get much closer to sysclock /4 or maybe /8, the waitx #14 looks really ugly, lol
I have a few ??? I'm still trying to understand, such as I'm running @320mhz as max clock but seeing that 290mhz is where things top out.. I should probably start testing 290mhz clock??? Although once things are solid at the 160mhz "standard" clock, I haven't had a problem running up @320.
Also, since I need to slow things down for LA scoping I've been putting the PLL out of spec in all kinds of strange ways and things are surprisingly stable. Is there a recommended pll setup for 8mhz or 16mhz?
One thing I'd really like to do is figure out how to "tune" that waitx automatically. I tested using the SD card on my hardware and I had to push this to almost 60. I'm assuming ringing of cable and lots of PCB trace, I'll be shortening the cable eventually, hopefully this helps.
sp_sync_rx = %0000_0011_000_0000000000000_00_11101_0 '' +3, inv sp_sync_cfg = %1_00111 pri read : r | c, o ' Read eight bits from the card. c := clk o := do asm dirl o wrpin #sp_sync_rx,o 'sync tx wxpin #sp_sync_cfg,o 'stop/start mode dirh o 'enable smartpin akpin o waitx #12 rep #.end_rd, #8 drvl c waitx #12 drvh c waitx #2 .end_rd waitx #32 rdpin r, o rev r and r, #$ff dirl o endasm te.tx(r) ' debugging, remove breaks
I'm glad I've got it working and I'll figure out what I'm doing wrong shortly.
con sp_srx = %0000_0011_000_0000000000000_00_11101_0 '' +3, inv sp_srx_c = %1_00111 sp_clk = %1000_0000_000_0000000000000_01_00101_0 var long spi_bitcycles .. 'init code does this spi_bitcycles :=((clkfreq / spi_clk_max) /2 )+1 ' needs to be inverse of freq / max bt := ((clkfreq / 400_000 ) / 2 ) +1 '' UNDER 400k asm wrpin #sp_srx, o 'sync tx wxpin #sp_srx_c, o 'stop/start mode dirl c wrpin #sp_clk, c wxpin bt, c 'set base period (*2) ''2.5 400kmax init dirh c wypin #initclks, c 'start clock, tx data n-1? .busy testp c wc if_nc jmp #.busy drvh csn wypin #initclks, c 'start clock, tx data n-1? rdpin pa,c .busy2 testp c wc if_nc jmp #.busy2 wrpin #0, c endasm ... pri read | c, o, t, bc ' Read eight bits from the card c := clk o := do bc := spi_bitcycles asm dirl c wrpin #sp_clk, c ' set clk pin mode wxpin bc, c ' set clk base period dirh c ' enable sp clk dirh o ' enable sp di wypin #byteclks, c ' start clock .busy testp c wc if_nc jmp #.busy rdpin t, o ' get data dirl o ' disable smartpin di wrpin #0, c ' clear clk sp config rev t and t, #$ff endasm ' term.tx(t) return t
Once the TX is using smartpins, I won't need to disable the clock pin at the end of the read method. Really liking these smartpins. What a cool chip, once you get past that initial learning curve!
@"Peter Jakacki" I can't remember if the p2d2 uses the same pins as the p2ES. If not, you will probably need to change the smartpin config lines for SP_STX and SP_SRX. Let me know if you need any help.
I have a bunch of work left to do before hiding the spi away in it's own cog. I think I'm getting close, finally!
{{ SDSPI driver for P2-ES FastSpin/InlineASM RC3 Cheezus Slice (Joe Heinz) - July - 2019 Thanks to all those out there in fourm land! Original spin version by Radical Eye Software Copyright 2008 }} 'CON Baudrate = 115200 'baud rate for serial communications con sectorsize = 512 sectorshift = 9 spi_clk_max = 20_000_000 ' 20mhz safe, 25 seems to work though? '' NOTE, SYSCLOCK needs to be 2x spi_clk_max for now '%AAAA_BBBB_FFF_PPPPPPPPPPPPP_TT_MMMMM_0 'ppp =''xxxx_CIO_HHH_LLL sp_inv_pin = %0000_001_000_000 << 8 'smartpin modes TT_MMMMM_0 sp_stx = (%0010 << 24) + %01_11100_0 ' +2 sp_srx = (%0011 << 24) + %00_11101_0 ' +3 sp_srx_c = %0_00111 sp_clk = %01_00101_0 + sp_inv_pin ' inv clock right way initclks = (20*(2*8)) ' send 20 bytes to init card byteclks = (2*8) var long di, do, clk, cs, starttime, sdhc, di_mask, do_mask, clk_mask, do_pin long spi_bitcycles 'obj term: "SimpleSerial.spin2" ''-P2 TERM pri send(outv) | c, i ' Send eight bits, then raise di. i := di c := clk asm SHL outv, #32-8 ' Set MSB First REV outv ' Set MSB First wypin outv, i ' data !! rdpin pa, c wypin #byteclks, c 'start clock, tx data .busy testp c wc if_nc jmp #.busy wypin #$FF, i endasm pri read | c, o, t ' Read eight bits from the card. c := clk o := do asm rdpin pa, c wypin #byteclks, c ' start clock .busy testp c wc if_nc jmp #.busy rdpin t, o ' get data rev t and t, #$ff endasm return t pub start_explicit(iDO, iCLK, iDI, iCS)| t, c,csn,o,i, bt 'RJA adding in some things to make work '' globals do := iDO clk := iCLK di := iDI cs := iCS spi_bitcycles :=((clkfreq / spi_clk_max) /2 )+1 '' locals c := clk csn := cs o := do i := di bt := spi_bitcycles asm drvh csn dirl c wrpin #sp_clk, c wxpin bt, c 'set base period (*2) ''2.5 400kmax init dirh c dirl i wrpin #sp_stx, i 'sync tx wxpin #%1_00111, i 'stop/start mode dirh i wypin #$FF, i dirl o wrpin #sp_srx, o 'sync rx wxpin #sp_srx_c, o 'stop/start mode dirh o wypin #initclks, c 'start clock, tx data n-1? .busy testp c wc if_nc jmp #.busy wypin #byteclks, c 'start clock, tx data n-1? rdpin pa, c .busy2 testp c wc if_nc jmp #.busy2 endasm starttime := t cmd(0, 0) drvh_(cs) ' Deselect the card to terminate a command. cmd(8, $1aa) read read read read drvh_(cs) ' Deselect the card to terminate a command. repeat cmd(55, 0) t := cmd(41, $4000_0000) drvh_(cs) ' Deselect the card to terminate a command. if t <> 1 quit if t return -40'abort -40 ' could not initialize card cmd(58, 0) sdhc := (read >> 6) & 1 read read read drvh_(cs) ' Deselect the card to terminate a command. if sdhc == 0 ' if not hc return -40 ' fail mount return sdhc +1 ' return card type pri readresp | r '' made return timeout to caller CHZ 2-2019 ' Read eight bits, and loop until we ' get something other than $ff. ' repeat if (r := read) <> $ff return r if checktime == -41 return -41 pri busy | r ' Wait until card stops returning busy '' made return timeout to caller CHZ 2-2019 ' repeat if (r := read) return r if checktime == -41 return -41 pri checktime ' Did we go over our time limit yet? if cnt - starttime > clkfreq return -41'abort -41 ' Timeout during read pri cmd(op, parm) ' ' Send a full command sequence, and get and ' return the response. We make sure cs is low, ' send the required eight clocks, then the ' command and parameter, and then the CRC for ' the only command that needs one (the first one). ' Finally we spin until we get a result. DRVL_(cs) starttime := cnt read send($40+op) send(parm >> 24) send(parm >> 16) send(parm >> 8) send(parm << 0) if (op == 0) send($95) else send($87) return readresp pri endcmd ' Deselect the card to terminate a command. drvh_(cs) ' Deselect the card to terminate a command. PUB stop ' RJA adding in some things PUB release pub start(basepin) result := start_explicit(basepin, basepin+1, basepin+2, basepin+3) pub readblock(n, b) | c, i, o, outv, r ' ' Read a single block. The "n" passed in is the ' block number (blocks are 512 bytes); the b passed ' in is the address of 512 blocks to fill with the ' data. ' starttime := cnt cmd(17, n) readresp repeat sectorsize byte[b++] := read read read return endcmd pub writeblock(n, b) ' ' Write a single block. Mirrors the read above. ' starttime := cnt cmd(24, n) ' cmd(24, doSDHC(n)) send($fe) repeat sectorsize send(byte[b++]) read read if ((readresp & $1f) <> 5) return -42'abort -42 busy return endcmd {{ ' Permission is hereby granted, free of charge, to any person obtaining ' a copy of this software and associated documentation files ' (the "Software"), to deal in the Software without restriction, ' including without limitation the rights to use, copy, modify, merge, ' publish, distribute, sublicense, and/or sell copies of the Software, ' and to permit persons to whom the Software is furnished to do so, ' subject to the following conditions: ' ' The above copyright notice and this permission notice shall be included ' in all copies or substantial portions of the Software. ' ' THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, ' EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF ' MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. ' IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY ' CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, ' TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE ' SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. }}
CON 'Constants section spi_clk_max = 20_000_000 ' 20mhz safe initclks = (20*(2*8)) ' send 20 bytes to init card byteclks = (2*8) VAR byte cog byte do, clk, di, cs PUB HOOK ' test compile start_explicit( 3, 2, 1, 0 ) PUB Stop if cog cogstop(cog-1) cog := 0 PUB GetCog return cog PUB start_explicit( _DO, _CLK, _DI, _CS ) : card_type | domask, dimask, sbt '' save pins for later? do := _DO clk := _CLK di := _DI cs := _CS '' prepare smartpin clock mask and bit time domask := getmask(clk, do) ' bpin mask (clk) dimask := getmask(clk, di) sbt :=((clkfreq / spi_clk_max) /2 )+1 ' SPI bit time if sbt < 2 ' make sure at least sysclock /2 sbt := 2 stop long[@cmd] := "i" 'set flag to know if the started cog has read its parameters long[@cmd][1] := do | (clk << 8) + (di << 16) + (cs << 24) long[@cmd][2] := domask + (dimask << 8) ' clock masks long[@cmd][3] := sbt ' spi bit time long[@cmd][4] := clkfreq '1s long[@cmd][5] := clkfreq / 100_000_000 '1us cog := cognew(@_asm_start,@cmd) + 1 if cog repeat until long[@cmd] <> "i" ' wait until cog is done reading parameter card_type := long[@cmd][1] ' arg1 returns card type or returns with ptr to error from low level driver else ' we don't have a cog? return PUB ReadBlock(n, b) return DoCommand("r", n, b, 0) PUB WriteBlock(n, b) return DoCommand("w", n, b, 0) PRI DoCommand(_cmd, _arg1, _arg2, _arg3) if cmd <> 0 return @cogbusymsg '' busy arg1 := _arg1 arg2 := _arg2 arg3 := _arg3 cmd := _cmd repeat until cmd <> _cmd return err PRI getmask(clk, data) | t t := clk - data if ( || t ) <> t t := ( || t ) + %0100 DAT org _asm_start mov ptr_to_mbox, ptra ' save mbox address for later '' get parameters from spin do_init mov ptra, ptr_to_mbox ' return ptra back to mbox rdlong tmp, ptra++ ' pins long mov pinDO, tmp and pinDO, #$ff mov pinCLK, tmp shr pinCLK, #8 and pinCLK, #$ff mov pinDI, tmp shr pinDI, #16 and pinDI, #$ff mov pinCS, tmp shr pinCS, #24 and pinCS, #$ff rdlong tmp, ptra++ 'mask long mov sp_doclk, tmp and sp_doclk, #$f mov sp_diclk, tmp shr sp_diclk, #4 and sp_diclk, #$f rdlong sp_bt, ptra++ ' timing long rdlong delay1s, ptra++ ' 1s long rdlong delay1us, ptra ' 1us long mov ptra, ptr_to_mbox ' return ptra back to mbox '' setup pins drvh pinCS dirl pinCLK '' setup Clock pin mov tmp, sp_clk add tmp, sp_inv_pin ' invert clock wrpin tmp, pinCLK wxpin sp_bt, pinCLK ' set base period (*2) dirh pinCLK ' enable clk dirl pinDI '' setup DI pin mov tmp, sp_stx ' save mode, bpin might change add tmp, sp_diclk ' setup bpin wrpin tmp, pinDI ' sync tx wxpin sp_stx_c, pinDI ' stop/start mode dirh pinDI ' enable smartpin wypin #$FF, pinDI ' prime shifter dirl pinDO '' setup DO pin mov tmp, sp_srx ' save mode, bpin might change add tmp, sp_doclk ' setup bpin wrpin tmp, pinDO ' sync rx wxpin sp_stx_c, pinDO ' 8 bit, pre dirh pinDO ' enable pin '' card init drvh pinCS ' set cs high wypin #initclks, pinCLK ' set init clocks getct timeout ' setup 1s timeout addct1 timeout, delay1s ' in case card does not respond mov sd_cmd, #0 'cmd 0,0 mov sd_parm, #0 call #wait_clock ' wait for smartpins to finish call #do_sdcmd drvh pinCS ' set cs high mov sd_cmd, #8 ' cmd 8 $1aa mov sd_parm, #$1AA call #do_sdcmd call #read_byte ' read 32 bit call #read_byte call #read_byte call #read_byte drvh pinCS ' set cs high .cmd55 mov err_out, ##-40 mov sd_cmd, #55 'cmd 55,0 mov sd_parm, #0 call #do_sdcmd mov sd_cmd, #41 'cmd 55,0 mov sd_parm, ##$4000_0000' call #do_sdcmd drvh pinCS ' set cs high cmp data_out, #$01 wz ' 01 = idle if_z jmp #.cmd55 mov sd_cmd, #58 'cmd 0,0 mov sd_parm, #0 call #do_sdcmd call #read_byte and data_in, #40 cmp data_in, #0 wz if_z mov adrShift, #9 if_nz mov adrShift, #0 call #read_byte call #read_byte call #read_byte drvh pinCS ' set cs high mov err_out, #0 mov sd_cmd, #16 'cmd 0,0 mov sd_parm, ##$200 call #do_sdcmd drvh pinCS ' set cs high do_init_ret ret get_cmd rdlong cmd_in, ptra wz ' '' timeout and release card after 10s if_z jmp #get_cmd 'no cmd so handle time rdlong arg1_in, ++ptra ' rdlong arg2_in, ++ptra ' rdlong arg3_in, ++ptra ' rdlong arg4_in, ++ptra ' cmp cmd_in, #"r" wz if_z jmp #read_a cmp cmd_in, #"w" wz if_z jmp #write_b ' cmp cmd_in, #"z" wz ' if_z jmp #release cmp cmd_in, #"i" wz if_z call #do_init '' do commands end_cmd wrlong err_out, ++ptra mov cmd_in, #0 mov ptra, ptr_to_mbox ' return ptra back to mbox wrlong cmd_in, ptr_to_mbox jmp #get_cmd read_a '' arg1 is sd ptr, arg2 is hub mov sd_cmd, #17 ' cmd 8 $1aa ' shr adrShift, arg1_in ' non-hc compatibility mov sd_parm, arg1_in call #do_sdcmd call #read_resp mov ptrb, arg2_in rep @.endread, sectorsize call #read_byte wrbyte data_in, ptrb++ .endread call #read_byte call #read_byte drvh pinCS ' set cs high read_a_ret ret write_b '' arg1 is sd ptr, arg2 is hub mov sd_cmd, #24 ' cmd 8 $1aa ' shr adrShift, arg1_in ' non-hc compatibility mov sd_parm, arg1_in call #do_sdcmd mov data_out, #$fe call #send_byte mov ptrb, arg2_in rep @.endsend, sectorsize rdbyte data_out, ptrb++ call #send_byte .endsend call #read_byte call #read_byte call #read_resp and data_in, #$1f cmp data_in, #5 wz if_nz mov err_out, ##-42 call #sd_busy drvh pinCS ' set cs high write_b_ret ret do_sdcmd ' OP is sd_cmd, parm = sd_parm drvl pinCS getct timeout ' setup 1s timeout addct1 timeout, delay1s call #read_byte mov data_out, sd_cmd add data_out, #$40 call #send_byte mov data_out, sd_parm shr data_out, #24 and data_out, #$ff call #send_byte mov data_out, sd_parm shr data_out, #16 and data_out, #$ff call #send_byte mov data_out, sd_parm shr data_out, #8 and data_out, #$ff call #send_byte mov data_out, sd_parm and data_out, #$ff call #send_byte cmp sd_cmd, #0 wz if_z mov data_out, #$95 if_nz mov data_out, #$87 call #send_byte call #read_resp do_sdcmd_ret ret send_byte SHL data_out, #32-8 ' Set MSB First REV data_out ' Set MSB First wypin data_out, pinDI ' data !! rdpin pa, pinCLK wypin #byteclks, pinCLK ' start clock, tx data call #wait_clock wypin #$FF, pinDI send_byte_ret ret read_byte rdpin pa, pinCLK wypin #byteclks, pinCLK ' start clock call #wait_clock rdpin data_in, pinDO ' get data rev data_in and data_in, #$ff read_byte_ret ret sd_busy call #read_byte cmp data_in, #0 wz if_z call #check_time sd_busy_ret ret read_resp call #read_byte cmp data_in, #$ff wz if_z call #check_time read_resp_ret ret check_time pollct1 wc if_c mov err_out, ##-41 drvh pinCS ' set cs high ' if_c jmp #timed_out check_time_ret ret wait_clock testp pinCLK wc if_nc jmp #wait_clock wait_clock_ret ret {=== Smart pin constants ===} sp_clk long %01_00101_0 sp_stx long %01_11100_0 sp_srx long %00_11101_0 sp_stx_c long %1_00111 sp_srx_c long %0_00111 sp_inv_pin long %0000_001_000_000 << 8 adrShift long 0 ' will be 0 for SDHC, 9 for MMC & SD sectorsize long 512 ' loaded on init {=== Pin Variables ===} pinDO long 0 ' do pinCLK long 0 ' clk pinDI long 0 ' di pinCS long 0 ' chip select {===== Smartpin Clock Variables =====}'' smartpins stuffs sp_doclk long 0 sp_diclk long 0 sp_bt long 0 {===== Command Mailbox =====} ptr_to_mbox res 1 cmd_in res 1 arg1_in res 1 arg2_in res 1 arg3_in res 1 arg4_in res 1 err_out res 1 {=== Timing Variables ===} delay1s res 1 ' clkfreq = 1 s delay1us res 1 ' clkfreq / 100_000_000 timeout res 1 {=== Temp Variables ===} tmp res 1 ' this may get used in all subroutines...don't use except in lowest data_in res 1 data_out res 1 sd_cmd res 1 sd_parm res 1 'buff res 299 FIT 496 dat '' debug codes cogbusymsg byte "SPI Busy",0 badpins byte "pins out of range",0 dat '' singleton spin mailbox alignl cmd long 0 arg1 long 0 arg2 long 0 arg3 long 0 arg4 long 0 err long 0 { var '' multi spin mailbox long cmd, arg1, arg2, arg3, arg4, err }
In the asm init section, I don't see where do_init is used separately from _asm_start. Moving the just saved pointer back isn't really needed is it?
Rearranging your code could allow the use of a setq+ rdlong construct to read the whole parameter block in one action. It could save some pointer maths and the code reduction will allow for the extra long needed (tmp2). Using getnib, getbyte, and getword instead of shift and mask for the slice and dice of the packed parameters is worth investigating too.
In the clock pin setup you add two constants before the WRPIN. Why not just a single constant?
read_byte destroys the contents of data_in on each call, so where you call it multiple times in sequence are you only interested in the last byte received?
I've run out of time, but that's a start.
Re. init -
This is one I'm not 100% sure how I want to do. I'd like to be able to change sysclock or pins by restarting the cog (it's called from get_cmd) I could probably get away with one entry point but was even thinking about separating the timing stuff from pin stuff but going to leave it like this for now I think.
re. setq + rdlong -
This is one improvement I'm going to look at once I have things working. That goes for the getnib / setnib instructions, just not quite there yet. Trying to stay as close to what I know..
The extra constant for the clock pin inverts it and I plan on having a switch somewhere to control the polarity of the clock, as well as set default max SPI clock. Just haven't been able to get my head around that yet.
Read_byte trashes data_out and most of the time I'm not interested in any of the data.
pri readresp | r ' Read eight bits, and loop until we ' get something other than $ff. ' repeat if (r := read) <> $ff return r if checktime == -41 return -41
When I was trying to convert it to ASM I had a feeling this wouldn't work.
read_resp call #read_byte cmp data_in, #$ff wz if_z call #check_time if_z jmp #read_resp read_resp_ret ret
I'm completely stuck right now and i'm pretty sure this is the problem.
… mov ptrb, arg2_in rep @.endread, ##512 call #read_byte wrbyte data_in, ptrb++ .endread call #read_byte call #read_byte ...
*edit
CONFIRMED. I tried a couple different things but couldn't get rep to work??? I have it working inline ASM but not in actual asm! RTFM!!
I was able to get it working and now the SD card mounts!
.loop call #read_byte wrbyte data_in, ptrb++ djnz count, #.loop
I need to cleanup code and do some testing to make sure things are stable but I think I'm close!!
DUH!!! Thanks Peter, feeling kinda silly now
The flaw (Corrected in the final silicon) is to do with a group of branching instructions that test the hardware events mechanism, eg: JCT1.
Interestingly enough, the bit-bashed command packet is 2nd to complete although it's the all out loser in final times.
Bit - bashed SPI
Smartpins inline
Smartpins cog
*edit
Got block writes working so was able to come up with a full set of test times. To mount the card, read and write 2m.
inline asm bit-bashed-19.5s
inline asm smart-pin-14s
cog smart-pin-10.9s
I'll give the code a quick cleanup, now that I've got everything working including clock polarity. There's a lot of places to improve still. I'd like to figure out a way to determine max spi clock at init, although I really can't wrap my head around any way besides brute-force init card and increase speed till it fails. Maybe I'm missing something obvious as usual. Really happy I have this working though.
*final edit
I did some code cleanup and have just about everything where I want it. I'm posting a NEW RC3 here and once things are tested further I'll update the first post.
Seems like unnecessary complexity to me
For some cases I can see how it could be considered unnecessary but I'm working on the Touchscreen OS again and aborts seem really helpful for SD cards. The ability to point to an error message, instead of needing to do a lookup on an error code. Six of one, half dozen of the other... I guess it could be done with conditional compile and would be the best, because a lot of P1 apps expect negative return type errors from FSRW. I really like how verbose Kye's SD driver is, plus it would be nice to not need to rewrite the ENTIRE touch.spin.
As I said, I really like Kye's driver and have pretty much talked myself into trying to get the genius piece of code working. One of the functions that is very essential to the OS is the ability to launch a file from SD card and it seems that @Cluso99 's on-chip SD could do this with little additional code. I just haven't wrapped my head around that yet. In the past I had commented out large portions of Kye's driver to free up memory but now that we have enough hubram I'm hoping to be able to do BIG THINGS much easier.
Perhaps the title of this thread should be changed to cover both FSRW and KYE's SD?
From a users perspective the names are too long, but makes understanding what its doing a breeze.
This is one of the reasons i wanted to get the Spin1 interpreter working with P2. But i got sidetracked near the end with work and haven’t managed to get back to it yet.
FWIW the ROM SD code, with the addition of write, and tweeking the actual spi routine, this should slot into Kye’s code nicely.
Kye's driver is great and the Touch OS is based largely on your OS based on Kye's SD driver
The main reason I didn't focus on using the ROM code was needing to understand SPI on the P2 a little better.
I'm stuck needing a driver for the resistive touch ADC XPT2046. Going to try modifying SDSPI for this. Once I get that done I'll be able to test a lot more, although I'm sure there will be a ton of debugging and bugfixes.
One of the things that looks really promising is LUT sharing. IE, read block from sd using CogN into shared LUT and use CogN+1 to push that block to display (or ram/flash). CogN could be reading the next block while N+1 is writing the current block. Seems like this could save a TON of hubcycles over the way things are done on the P1. I don't see myself getting around to this any time soon since I'm not sure how other people would be able to use such a specific driver.
I'm hoping to try using the rom to run a file, how many times can I get this one wrong? lol
I updated the 1st post with the current RC and some RC notes. I really wish I was better at documentation but until I get some feedback from people using this it seems pointless. I'm guessing right now there isn't much need for a P2 FSRW or Kyes, but that will change soon hopefully!
Really hoping the new silicon doesn't break things too much, hopefully worst case is change smartpin clock from sysclock /2 to /3 or /4. Time will tell though!
Note the ES ROM has some fixes in the respin ROM to better recover from the SD DO problem which fails to release the DO drive from the SD card. This means the SD ROM code positions change in the new ROM. The Serial Monitor/Debugger did not change position so this is fine to call.
Second thing to note is that the SD ROM code (routine) to drive the SPI code is deliberately not fast to ensure it works reliably from hubexec. It has send/receive combined ie it shifts in (reads) as it shifts out (writes).
The code in the new ROM is way better in the DO error handling, so this is what I will be using as a base for my SD (if you or others haven't done it when I need it), with just the write sector added, and the SPI send/receive routine replaced with a faster version.
BTW there is IIRC a single further fix for the DO that did not make it into the new ROM, so I will need to dig it out. I am overseas for another 2 weeks so that will have to wait.
I've looked at the first ROM code quite a bit. It's beautiful, but I was planning on having a cog just for the SD driver so hubexec is what got me confused. I was also aware of the changes for the respin, another reason I haven't worked on this. I would have liked to build a driver around it but figured you would be doing one (using hubexec) so focusing on something that could be stuffed in a cog like P1. FSRW is a bare minimum driver and so my focus on early development. Soon I'll need a way to run a precompiled binary and if I can figure out how to copy the filename to the right place in hub and jmp into the hubexec code I'll be really happy.
'+-------[ SD: Initialise/Locate/Load/Run a file from SD ]---------------------+ <--- SD: init/locate/load/run a file ---> '+ On Entry: + '+ fname[3]: filename[11] 8+3 without "." (will be $0 terminated) + '+ Call Format: + '+ CALL #@_Run_SDfile ' + < call: init/locate/load/run a file > '+ On Return: + '+ "NZ" if error, else does not return + '+-----------------------------------------------------------------------------+ '+-----------------------------------------------------------------------------+ fc560 fdb00040 _Start_SDcard call #@_SDcard_Init ' initialise & read CSD/CID fc564 f6079a00 mov skiprun, #0 ' load/run MBR/VOL code fc568 adb00174 if_e call #@_readMBR ' read MBR/VOL/FSI/FAT fc56c adb002a0 if_e call #@_readDIR ' read directory for filenames '' mov skiprun, #0 ' load/run <file> (already 0) fc570 adb0038c if_e call #@_readFILE ' read/load/run the file fc574 fd800100 JMP #try_serial ' failed: so go back and try serial '+-----------------------------------------------------------------------------+ fc578 fdb00028 _Run_SDfile call #@_SDcard_Init ' initialise & read CSD/CID '' mov skiprun, #1 ' do not load/run MBR/VOL code fc57c adb0015c if_e call #@_readMBR1 ' read MBR/VOL/FSI/FAT (don't run) fc580 adb002bc if_e call #@_searchDIR ' search dir for <fname> fc584 f6079a00 mov skiprun, #0 ' load/run <file> fc588 adb00374 if_e call #@_readFILE ' read/load/run the file fc58c fd64002d RET ' return "NZ" = failed, "Z" if loaded ok '+-----------------------------------------------------------------------------+ fc590 fdb00010 _Load_SDfile call #@_SDcard_Init ' initialise & read CSD/CID '' mov skiprun, #1 ' do not load/run MBR/VOL code fc594 adb00144 if_e call #@_readMBR1 ' read MBR/VOL/FSI/FAT (don't run) fc598 adb002a4 if_e call #@_searchDIR ' search dir for <fname> '' mov skiprun, #1 ' do not load/run <file> (already 1) fc59c adb00360 if_e call #@_readFILE ' read/load the file fc5a0 fd64002d RET ' return "NZ" = failed, "Z" if loaded ok '+-----------------------------------------------------------------------------+
andDAT ''============[ COG VARIABLES - SD BOOT]======================================== fc560 1c0 org $1C0 ' place the variables in cog $1C0-$1DF fc560 1c0 cmdout res 1 ' The 8b CMDxx | $40 fc560 1c1 cmdpar res 1 ' The 32b parameters fc560 1c2 cmdcrc res 1 ' The 8b CRC (must be valid for CMD0 & CMD8) fc560 1c3 cmdpar2 res 1 ' SDV1=$0, SDV2=$40000000 fc560 1c4 cmdtype res 1 ' reply is R1=1, R3=3, R7=7, else 0 fc560 1c5 reply res 1 ' R1 reply (moved to replyR1 when R3/R7 32b reply here) fc560 1c6 replyR1 res 1 ' R1 reply (8b saved when R3/R7 32b reply follows) fc560 1c7 dataout res 1 ' 8/32 bit data being shifted out fc560 1c8 bytescnt res 1 ' #bytes to send/recv fc560 1c9 bitscnt res 1 ' #bits to be shifted in/out fc560 1ca ctr1 res 1 fc560 1cb timeout res 1 ' = starttime + delay fc560 1cc spare res 1 fc560 1cd skiprun res 1 ' 1= skip load/run mbr/vol & load/no-run fname '\ 1=SDV1, 2=SDV2(byte address), 3=SDHC/SDV2(block address) fc560 1ce blocksh res 1 '/ block shift 0/9 bits fc560 1cf clustersh res 1 ' sectors/cluster SHL 'n' bits fc560 1d0 vol_begin res 1 '$0000_2000 ' Ptn0: first sector of PTN fc560 1d1 fsi_begin res 1 '$0000_2001 ' Ptn0: sector of file system info fc560 1d2 fat_begin res 1 '$0000_3122 ' Ptn0: first sector of FAT table fc560 1d3 dir_begin res 1' $0000_4000 ' Ptn0: first sector of DATA is DIR table fc560 1d4 dat_begin res 1 '$0000_4580 $0000_54C0' Ptn0: first sector of file's DATA fc560 1d5 ptn_size res 1 '$0008_0000 ' file-size 32KB = 64<<9 sectors fc560 1d6 _bufad res 1 fc560 1d7 _blocknr res 1 fc560 1d8 _sectors res 1 fc560 1d9 _entries res 1 fc560 1da bufad res 1 ' ptr sector buffer fc560 1db blocknr res 1 ' sector# fc560 1dc fname res 3 ' 8+3+1 fc560 1df _hubdata res 1
In your code...cog $1C0-$1DF is required for use by the SD ROM driver
Place the filename (8+3+$0) into 3 longs at fname (cog $1DC) (remember little endian so reverse each group of 4 characters!)
So "filename.xtn" would become "elif" + "eman" + ($0,"ntx")
Call _Load_SDfile
or
Call _Run_SDfile
... ' ex1 send_byte SHL data_out, #32-8 ' Set MSB First REV data_out ' Set MSB First … … 'ex2 rev data_in and data_in, #$ff read_byte_ret ret
I also need to use all the new test instructions but I think I can get those
PUB start_explicit( _DO, _CLK, _DI, _CS ) : card_type | domask, dimask stop long[@cmd] := "i" 'set flag to know if the started cog has read its parameters long[@cmd][1] := _DO | (_CLK << 8) + (_DI << 16) + ( _CS << 24)' pins long[@cmd][2] := getmask(_CLK, _DO) + (getmask(_CLK, _DI) << 4) ' clock masks long[@cmd][2] += spi_clk_inv << 8 ' clock invert long[@cmd][3] := GetBitTime(spi_clk_max) ' spi bit time long[@cmd][4] := GetBitTime(400_000) '400k cog := cognew(@_asm_start,@cmd) + 1 if cog repeat until long[@cmd] <> "i" ' wait until cog is done reading parameter card_type := long[@cmd][4] ' arg5 returns card type or returns with ptr to error from low level driver DAT org _asm_start mov ptr_to_mbox, ptra ' save mbox address for later '' get parameters from spin do_init mov ptra, ptr_to_mbox ' return ptra back to mbox rdlong tmp, ptra++ ' command long 0 rdlong tmp, ptra++ ' pins long 1 getbyte pinDO, tmp, #0 getbyte pinCLK, tmp, #1 getbyte pinDI, tmp, #2 getbyte pinCS, tmp, #3 rdlong tmp, ptra++ 'mask long 2 getnib sp_doclk, tmp, #0 getnib sp_diclk, tmp, #1 getnib sp_inv_pin, tmp, #2 shl sp_doclk, #24 shl sp_diclk, #24 shl sp_inv_pin, #14 rdlong sp_bt, ptra++ ' timing long 3 rdlong sp_init_bt, ptra ' 400k 4 rdlong delay1s, #$14 ' clkfreq from hub '' setup pins drvh pinCS dirl pinCLK '' setup Clock pin mov tmp, sp_clk add tmp, sp_inv_pin ' invert clock wrpin tmp, pinCLK wxpin sp_init_bt, pinCLK ' set base period (*2) dirh pinCLK ' enable clk dirl pinDI '' setup DI pin mov tmp, sp_stx ' save mode, bpin might change add tmp, sp_diclk ' setup bpin wrpin tmp, pinDI ' sync tx wxpin sp_stx_c, pinDI ' stop/start mode dirh pinDI ' enable smartpin wypin #$FF, pinDI ' prime shifter dirl pinDO '' setup DO pin mov tmp, sp_srx ' save mode, bpin might change add tmp, sp_doclk ' setup bpin wrpin tmp, pinDO ' sync rx wxpin sp_stx_c, pinDO ' 8 bit, pre dirh pinDO ' enable pin '' card init drvh pinCS ' set cs high wypin sectorsize, pinCLK ' set init clocks getct timeout ' setup 1s timeout addct1 timeout, delay1s ' in case card does not respond call #wait_clock ' wait for smartpins to finish dirl pinCLK '' setup Clock pin wxpin sp_bt, pinCLK ' set base period (*2) dirh pinCLK ' enable clk
I think the fifo could help in the readbyte and writebyte but haven't wrapped my head around that yet.
I'm ready to try getting this runfile thing sorted and I THINK understand things pretty well. I'm about to start renaming some of my variables to match the convention laid out by @Cluso99. Not that there's a shortage of cogram yet with 237 longs left...
The other thing that I've been trying to wrap my head around is the LUT sharing.. Seems that YModem would really benefit from being able to load the LUT with 1k from the SD and then the ASync serial cog could just use a shared LUT bypassing the hub. I would also like to do this with my RAM/Display driver i'm working on and it seems probable, it would need some convention to be set with the objects that share the adjacent cog.
How about something like this using bytes:
DAT org _asm_start mov ptr_to_mbox, ptra ' save mbox address for later '' get parameters from spin do_init mov ptra, ptr_to_mbox ' return ptra back to mbox rdlong tmp, ptra++ ' command long 0; not used rdbyte pinDO, ptra++ ' pins long 1 rdbyte pinCLK, ptra++ rdbyte pinDI, ptra++ rdbyte pinCS, ptra++ rdlong sp_doclk, ptra ' mask long 2 getbyte sp_diclk, sp_doclk, #1 getbyte sp_inv_pin, sp_doclk, #2 shl sp_doclk, #24 shl sp_diclk, #24 shl sp_inv_pin, #14 rdlong delay1s, #$14 ' clkfreq from hub '' setup pins
Slightly smaller memory footprint, and saves double handling the data when it isn't necessary.
I'm packing into nibbles so I have extra bits later if I need them. Not real sure HOW to use them yet so it might be a waste. I was thinking about different ways I could do this and I really wanted it to all fit into the mailbox. I finally got the init sequence the way I wanted, with all of the parameters being loaded by the normal command loop.
DAT org [code] DAT org _asm_start mov ptr_to_mbox, ptra ' save mbox address for later '' get parameters from spin get_cmd rdlong cmd_in, ptra wz ' '' timeout and release card after 10s if_z jmp #get_cmd ' no cmd so handle time rdlong arg1_in, ++ptra rdlong arg2_in, ++ptra rdlong arg3_in, ++ptra rdlong err_out, ++ptra ' dummy read to align err pointer '' decode command cmp cmd_in, #"i" wz if_z jmp #do_init cmp cmd_in, #"c" wz if_z call #card_init cmp cmd_in, #"r" wz if_z jmp #read_a cmp cmd_in, #"w" wz if_z jmp #write_b ' cmp cmd_in, #"z" wz ' if_z jmp #release mov err_out, #5 ' err = no command ' '' do commands end_cmd drvh pinCS ' set cs high wrlong err_out, ptra mov cmd_in, #0 ' signal done to spin driver mov ptra, ptr_to_mbox ' return ptra back to mbox wrlong cmd_in, ptr_to_mbox jmp #get_cmd do_init getbyte pinDO, arg1_in, #0 getbyte pinCLK, arg1_in, #1 getbyte pinDI, arg1_in, #2 getbyte pinCS, arg1_in, #3 getnib sp_doclk, arg2_in, #0 getnib sp_diclk, arg2_in, #1 getnib sp_inv_pin, arg2_in, #2 shl sp_doclk, #24 shl sp_diclk, #24 shl sp_inv_pin, #14 mov sp_bt, arg3_in ' get max clock mov sp_init_bt, err_out ' get 400k clock mov err_out, #0 ' err = no error rdlong delay1s, #$14 ' clkfreq from hub dirl pinCLK '' setup Clock pin mov spare, ##sp_clk ' transition add spare, sp_inv_pin ' invert clock wrpin spare, pinCLK ' set clock mode wxpin sp_init_bt, pinCLK ' set base period (*2) dirh pinCLK ' enable clk dirl pinDI '' setup DI pin mov spare, ##sp_stx ' sync tx add spare, sp_diclk ' setup bpin wrpin spare, pinDI ' set di mode wxpin #sp_stx_c, pinDI ' set di config dirh pinDI ' enable smartpin wypin #$FF, pinDI ' prime shifter dirl pinDO '' setup DO pin mov spare, ##sp_srx ' sync rx add spare, sp_doclk ' setup bpin wrpin spare, pinDO ' set do mode wxpin #sp_srx_c, pinDO ' set do config dirh pinDO ' enable pin call #card_init jmp #end_cmd
I'm pretty happy with this although the memory footprint is a little wasteful, with cluso's variables in cog. I could probably squeeze out a few cog longs but it's getting there.I keep thinking about ways to deal using the LUT and I've got a proposal. Instead of passing a positive number for the hub address for block read/write, if you pass a negative number it will be loaded to LUT. This would give us 4 pages, I just need to figure out how to handle loading cog2 and doing something with the data. Since the LUT can't be accessed in bytes like the cogram can, i'm not sure how much more efficient packing bytes into longs and copying to LUT will be. I think that's the next step though.
I did separate the SD card init from pins init, to allow mounting of cards without resetting pins. I need to setup commands for this, as well as handle release properly. I also want to implement an auto-release timeout too, just need to try to implement it.
long[@cmd] := "i" 'set flag to know if the started cog has read its parameters long[@cmd][1] := _DO | (_CLK << 8) + (_DI << 16) + ( _CS << 24)' pins long[@cmd][2] := getmask(_CLK, _DO) + (getmask(_CLK, _DI) << 4) ' clock masks long[@cmd][2] += spi_clk_inv << 8 ' clock invert long[@cmd][3] := GetBitTime(spi_clk_max) ' spi bit time long[@cmd][4] := GetBitTime(400_000) '400k
=cmd := "i" 'set flag to know if the started cog has read its parameters cmd[1] := _DO | (_CLK << 8) + (_DI << 16) + ( _CS << 24)' pins cmd[2] := getmask(_CLK, _DO) + (getmask(_CLK, _DI) << 4) ' clock masks cmd[2] += spi_clk_inv << 8 ' clock invert cmd[3] := GetBitTime(spi_clk_max) ' spi bit time cmd[4] := GetBitTime(400_000) '400k
Mike
I'm also including Ymodem with this release, although it appears receive is broken. I think it's baud related, or maybe it's a buffer overflow issue since smartpin serial isn't buffered? I'd think that would show up as a problem sending a file TO the sd card?? I'll debug that as soon as I'm done with this SD driver
I updated the above code blurbs to reflect the new versions. Now that I've got cog variables cleaned up a bit and I'm getting ready to try to run a file I have a couple final questions. First to verify the file that is renamed and placed on card is the .EEPROM file created? (?duh?) Most important, if I create a constants section with the rom hook addresses... what's the proper JMP convention? I'm SOOOO lost on that part!
DAT … … jmp #$FC578 'jump into rom code? … … CON P2_REVa_ROM_ADDRESS_TABLE _Start_SDcard = $FC560 _Run_SDfile = $FC578 _SDcard_Init = $FC5A4 readFILE = $FC900 search_dir = $FC840 _reset_booter = $FCA78 _Start_Monitor = $FCA88 _Enter_Monitor = $FCA8C _Redo_Monitor = $FCA9C _SerialAddr = $FCAA8 _SerialBaud = $FCAB0 _SerialInit = $FCAB8 _HubTxCR = $FCAE4 _HubTxRev = $FCAEC _HubTx = $FCAF0 _HubRx = $FCB10 _HubHexRev = $FCB24 _HubHex8 = $FCB28 _HubHex = $FCB2C _HubTxStrVer = $FCB9C _HubTxString = $FCBA4 _HubListA2H = $FCBC4 _HubList = $FCBC8 _RdLongCogHub = $FCF34 _HubRxStrMon = $FCCC4 _HubRxString = $FCCCC _HubMonitor = $FCD78 _Download = $FCEA4 _ParseHex = $FCF68
I'm also thinking about the whole pushing data to LUT. I guess as far as raw performance goes it would help to use the FIFO, so I should probably start reading up on that... LUTS seems like the way to go, although if you have to pack / unpack bytes it might not be as efficient? It would be wasteful to only get one block read/write into luts and I think it would also kinda defeat the purpose?!?
*edit
Thanks for the suggestion @msrobots, here's the updated version. I made the change to the release real quick.
PUB start_explicit( _DO, _CLK, _DI, _CS ) : card_type | domask, dimask stop cmd := "i" 'set flag to know if the started cog has read its parameters arg1 := _DO | (_CLK << 8) + (_DI << 16) + ( _CS << 24)' pins arg2 := getmask(_CLK, _DO) + (getmask(_CLK, _DI) << 4) ' clock masks arg2 += spi_clk_inv << 8 ' clock invert arg3 := GetBitTime(spi_clk_max) ' spi bit time err := GetBitTime(400_000 ) '400k cog := cognew(@_asm_start,@cmd) + 1 if cog repeat until cmd <> "i" ' wait until cog is done reading parameter
Still looking for "fixes" for these -
... ' ex1 send_byte SHL data_out, #32-8 ' Set MSB First REV data_out ' Set MSB First … … 'ex2 rev data_in and data_in, #$ff read_byte_ret ret