Flash file driver for P2

evanh · 2022-02-08 03:46

Doh! After compiling for Prop1 I'd left off the -2 in the compile options.

EDIT: It WORKS now.
Lol, Wine has a "Wordpad" lookalike that is better at viewing RTF docs than LibreOffice Writer is. It doesn't try to make pages out of the document and is much more responsive.

EDIT2: Wow, the memories are back - Entering code in a terminal with line numbers somehow instantly transports me back to school days.

Mike Green · 2022-02-08 04:46

For now, INPUT works only for a single variable with or without a prompt. PRINT and WRITE also seem to have problems with more than one variable.

evanh · 2022-02-08 05:25

Bug report: I've saved a program and can reload and rerun it with no problem. But when I do a FILES, it lists the new name and then continuously spews blank lines, never returning to command entry.

Detail: Blocks free has decreased by one. Here's first bootup message:

FemtoBASIC 1.001 (46), 32K bytes free
Flash size 16MB, 4040 4K blocks free
OK

and current message:

FemtoBASIC 1.001 (46), 32K bytes free
Flash size 16MB, 4039 4K blocks free
OK

evanh · 2022-02-08 12:17

I've cut another 16 clocks off the overhead ... but it relies on a Flexspin-specific directive. So, not as portable.

It's simple enough: use ASM/ENDASM to move the non-repeating setup code back out to the calling hubRAM. The smaller block copy into cogRAM is then done quicker.

Mike Green · 2022-02-08 14:31

Thanks evanh. The "missing" free block is the one that's holding the program you saved.

Given the overhead of the interpreter, there's a point where speeding up SPI I/O or a few other things makes little difference in overall execution speed. In addition, a "high performance" driver for flash memory should have a decent wear leveling algorithm, subdirectories, and good behavior on power failure. Hmmm ... sounds like SD cards and something like FAT16?

evanh · 2022-02-08 15:11

@"Mike Green" said:
Thanks evanh. The "missing" free block is the one that's holding the program you saved.

Yep. Just providing what info I have. FILES doesn't like something.

@"Mike Green" said:
Given the overhead of the interpreter, there's a point where speeding up SPI I/O or a few other things makes little difference in overall execution speed. In addition, a "high performance" driver for flash memory should have a decent wear leveling algorithm, subdirectories, and good behavior on power failure. Hmmm ... sounds like SD cards and something like FAT16?

SD cards will do wear levelling at the block level irrespective of the filesystem in use. It's definitely not a FAT16/32 feature.

Mike Green · 2022-02-08 18:46

INPUT appears to work properly with more than one variable. FILES also seems to work properly.

Yes, SD cards do wear levelling at the block level. Some filesystems like FAT16 provide for subdirectories. Journalling file systems provide one way of protecting against power glitches and the like. I'm just saying that this filesystem is very primitive, but probably appropriate for a little Basic interpreter or an application that may benefit from overlays or data logging.

Mike Green · 2022-02-08 18:47

Oops ... here's the archive.

evanh · 2022-02-09 01:44

Solved: the highest 64 kB chunk of the EEPROM mostly contained zeros. Once I erased that area, FILES behaved.

Here's my diagnostic printout before erasing:

  *** FreeMap ***
 00000000 ffffff00 ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff fffffeff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff 0000ffff

  *** Used File Headers ***
    Header 1,064
 74736574 ffffffff 00736162 ffffffff 50203031 544e4952 65682220 226f6c6c 30320a0d 54454c20 3d207920 0a0d3220
    Header 4,080
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    Header 4,081
 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
    Header 4,082
 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
    Header 4,083
 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
    Header 4,084
 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
    Header 4,085
 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
    Header 4,086
 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
    Header 4,087
 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
    Header 4,088
 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
    Header 4,089
 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
    Header 4,090
 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
    Header 4,091
 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
    Header 4,092
 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
    Header 4,093
 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
    Header 4,094
 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
    Header 4,095
 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
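
As a cross-check of the FreeMap dump against the boot message, here is a minimal Python sketch. It assumes (this is not confirmed by the driver source) that the map is 128 longs with one bit per 4K block, that a set bit means free, and that bit n of long k stands for block 32*k + n. The map is rebuilt from its four non-ffffffff longs rather than pasted in full, and the position of each of those longs is inferred from the header numbers in the printout.

```python
# Hypothetical reconstruction of the FreeMap dump: 128 longs, all ffffffff
# except the four below. Their indices are an assumption inferred from the
# header numbers in the printout, not read from the driver source.
WORDS = 128                      # 16MB / 4K = 4096 blocks = 128 x 32 bits
free_map = [0xFFFFFFFF] * WORDS
free_map[0] = 0x00000000         # blocks 0..31 in use
free_map[1] = 0xFFFFFF00         # blocks 32..39 in use
free_map[33] = 0xFFFFFEFF        # one block in use (the saved program)
free_map[127] = 0x0000FFFF       # top 16 blocks (64 kB) in use


def used_blocks(words):
    """Return block numbers whose bit is clear (assumed to mean 'in use')."""
    return [32 * k + n
            for k, w in enumerate(words)
            for n in range(32)
            if not (w >> n) & 1]


used = used_blocks(free_map)
free_count = WORDS * 32 - len(used)
print(free_count)        # 4039, matching "4039 4K blocks free"
print(used[40])          # 1064, the header holding the saved program
print(used[-16:])        # 4080..4095, the top 64 kB
```

Under those assumptions the count agrees with the second boot message, and the 16 blocks at the top are exactly the headers that turned out to be full of zeros.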

Mike Green · 2022-02-09 01:53

When the first 12 bytes of a 4K block are zeros, FILES thinks that this is the first block in a file with zeros as its name. That'll display as spaces.
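
To see what FILES is looking at, the header longs from the diagnostic printout can be unpacked back into bytes. This is a minimal Python sketch, assuming the longs were printed as little-endian 32-bit values and taking the 12-byte name field from the description above. The helper names are made up for illustration and the rest of the header layout is not known here.

```python
import struct


def longs_to_bytes(hex_words):
    """Pack space-separated hex longs back into flash byte order
    (little-endian is assumed)."""
    words = [int(w, 16) for w in hex_words.split()]
    return struct.pack("<%dI" % len(words), *words)


header_1064 = longs_to_bytes(
    "74736574 ffffffff 00736162 ffffffff 50203031 544e4952 "
    "65682220 226f6c6c 30320a0d 54454c20 3d207920 0a0d3220")
header_4080 = longs_to_bytes("00000000 " * 12)

NAME_LEN = 12  # from the description above; the real field layout is not shown


def name_is_all_zeros(header):
    """True when the first NAME_LEN bytes of a block are all zero."""
    return header[:NAME_LEN] == bytes(NAME_LEN)


print(header_1064[:4])      # b'test'
print(header_1064[16:])     # b'10 PRINT "hello"\r\n20 LET y = 2\r\n'
print(name_is_all_zeros(header_1064), name_is_all_zeros(header_4080))  # False True
```

Header 1,064 decodes to readable text, while header 4,080 is the all-zero case described above.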

evanh · 2022-02-09 01:57

It certainly did that alright. Never seemed to end though.

Mike Green · 2022-02-09 03:23

Some examples of what this FemtoBasic implementation can do:

90 OPEN "testdata",w
100 FOR i=0 TO 255
110 LET p=3.1415926
120 LET a=SIN ((i/256.0) * p)
130 LET b=COS ((i/256.0) * p)
140 WRITE i;", ";a;", ";b
150 NEXT i
160 CLOSE
170 END

This creates (or replaces) a file "testdata" with 256 sets of values separated by commas ... i, the number of the line of values, the sine of (i x pi) / 256, and the cosine of (i x pi) / 256.

100 OPEN "testdata",r
110 READ i,a,b
120 PRINT i,a,b
130 GOTO 110

This reads the "testdata" file and PRINTs each line of data. The program gets an error after the last line of data is read. The error can be avoided by reading a fixed number of lines when that is known.

100 OPEN "testdata",r
105 FOR i=0 TO 255
110 READ i,a,b
120 PRINT i,a,b
130 NEXT i
140 END

evanh · 2022-02-12 09:54

This is as much for my own posterity as anything ... What triggered my attention was a diagram that Chip/Jeff has drawn up in the new draft hardware doc ... I became a little puzzled over why so many sysclock ticks of lag were effective. And they are. But it turns out my inline code comments weren't very precise as to why. I had proven to myself that it worked, but without full understanding.

After some more testing I've now clarified the relationship between instructions and I/O stages. It turns out the staging latencies introduce an even number of ticks from OUT to TESTP, namely 8 (without registered I/O), whereas OUT to IN takes one extra tick on top, making 9.

So, not the 13 I had mentioned in the old source code, but it's still helpful to use all the available time of an SPI clock period. E.g. without factoring in SPI device response time, just the Prop2 I/O pins themselves need +2 ticks when above 320 MHz sysclock.

Therefore, the stance is to use everything available, which is 8 (min) + 6 = 14 ticks. Anything more and the +6 will step into the next 8-tick SPI clock period. Well, 8+7 is theoretically doable, but that obviously doesn't suit the instruction intervals.
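
The arithmetic behind the 14 ticks can be written out as a small Python sketch. It assumes each instruction in the loop takes 2 sysclock ticks and that the extra lag must stay under one 8-tick bit period. The function names are made up for illustration, and the 320 MHz figure is only used to convert ticks to time.

```python
BIT_PERIOD = 8        # sysclock ticks per SPI bit in the receive loop
MIN_LAG_TESTP = 8     # OUT to TESTP, unregistered I/O (figure from the post)
MIN_LAG_IN = 9        # OUT to IN is one tick more
INSTR_TICKS = 2       # assumed: each instruction in the loop takes 2 ticks


def max_sample_lag(min_lag, bit_period=BIT_PERIOD, step=INSTR_TICKS):
    """Minimum lag plus the largest extra delay, in whole instruction steps,
    that still stays inside one bit period."""
    extra = ((bit_period - 1) // step) * step   # largest multiple of step < bit_period
    return min_lag + extra


def ticks_to_ns(ticks, sysclock_hz):
    """Convert sysclock ticks to nanoseconds."""
    return ticks * 1e9 / sysclock_hz


lag = max_sample_lag(MIN_LAG_TESTP)
print(lag)                        # 14
print(ticks_to_ns(lag, 320e6))    # 43.75
```

With a step of 1 the same function gives 8+7 = 15, the theoretical limit mentioned above that the 2-tick instruction spacing cannot hit.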

Here's the same code but with better commenting:

' Bit-bashed SPI byte receiver, CPHA=1 (SPI clocking modes 1 and 3)
        outnot  CLK             ' I/O lag start
        nop                     ' 2
        outnot  CLK             ' 4  First clock edge now at physical pin

        rep     @rx_rend, #7    ' 6
        outnot  CLK             ' 8  (CPHA=1) min TESTP rx lag, +1 if using IN reg
        rcl     sD, #1          ' 10   As well as internal latencies, externals also need to be
        outnot  CLK             ' 12   allowed for when at high frequencies.  So use all spare ...
        testp   DO      wc      ' 14   +6 to minimum lag (Eight ticks per bit)
rx_rend

evanh · 2022-02-12 10:05

Smartpins and streamers are different again ...

My prior testing of smartpins suggests the combined input-to-output smartpin response is 4 ticks, including a tick for processing. So that'll be one staging register each for input and output routing in the silicon. Quite favourable compared to the 8 or 9 ticks for a cog.

evanh · 2022-02-12 11:11

Some more details:
For just pin output, at low sysclock frequency, an unregistered output pin will transition 3 ticks after the OUTx instruction has completed. Under the same conditions, the pin will transition 1 tick after a smartpin does the same.

Pin input, at low sysclock frequency, always has one tick added for input settling ahead of the front register. So 1+4 ticks from pin to TESTP, 1+5 to IN. And 1+1 from pin to smartpin.

You'll note that's a total of only 3 ticks for combined smartpin I/O. To get the needed 4, add one for smartpin processing.
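
Tallying the per-stage figures quoted above gives the round-trip numbers from the earlier posts. This is only a bookkeeping sketch in Python; the constant names are made up, and the tick counts are the ones stated in this thread for low sysclock and unregistered pins.

```python
# Tick counts as quoted in this thread (low sysclock, unregistered pins).
OUT_TO_PIN = 3          # OUTx completion to pin transition
PIN_SETTLE = 1          # input settling ahead of the front register
PIN_TO_TESTP = 4        # pin to TESTP, after settling
PIN_TO_IN = 5           # pin to IN, after settling
SMARTPIN_TO_PIN = 1     # smartpin output to pin transition
PIN_TO_SMARTPIN = 1     # pin to smartpin input, after settling
SMARTPIN_PROCESS = 1    # one tick of smartpin processing

round_trips = {
    "OUT -> TESTP": OUT_TO_PIN + PIN_SETTLE + PIN_TO_TESTP,
    "OUT -> IN": OUT_TO_PIN + PIN_SETTLE + PIN_TO_IN,
    "smartpin -> smartpin": (SMARTPIN_TO_PIN + PIN_SETTLE
                             + PIN_TO_SMARTPIN + SMARTPIN_PROCESS),
}

for path, ticks in round_trips.items():
    print(path, ticks)   # 8, 9 and 4 ticks respectively
```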

PS: OUT being shorter latency than IN is something of an illusion. Those numbers are all referenced to instruction completion, which skews them. There are three I/O routing stages each way for the cogs.

PPS: Actually, it's really hard to know these internal relationships with instruction execution. My numbers above could easily be out by one so not skewed at all. ie: It could take an extra tick to get out and one less tick to arrive in. Doesn't change the round trip, which is all that really matters.
