I'm so happy, I got my Rev A P2 board to work with flexprop. I was thinking it was banished to the paperweight pile.
There is no real documentation that says what will not work on a Rev A chip versus a Rev B/C.
I knew it had something to do with SETQ not working correctly. So after a lot of hunting and pecking I got it to at least run some code.
It's not 100% but the change was small and seems to work for the most part. I will have to try some other programs and see how it works.
I still have issues loading code onto the boards, as it fails quite often with checksum errors and timeouts. I think it's because my computer runs too fast or something. I tried to rebuild loadp2 to see if I could add some pauses or something, but the build errors out because it can't find fastspin.
I guess I can continue to use my CSharp loader, as it works every time.
There is a change in the streamer modes, so Rev A encodes these differently in the instruction set from Rev B/C. There is also a change to PTRsomething to allow different ranges for relative addressing.
There was a command-line switch in fastspin, but @eric said he will no longer support Rev A since there are only about 120 out in the wild. Understandable, and not really a problem since the real P2 is out now.
Rev B to C is just a change to the DAC/ADC circuit; I think there are no changes to the instruction set.
Your loading issue is weird. I have 3 Rev A and some Rev B boards and have never had any loading problems. But I do supply power to both USB ports on the Eval boards via a powered USB hub. Maybe give that a try?
If you're able to get it working reliably I'd be happy to accept a pull request to restore Rev A support; it's just not something I can support myself.
I've updated the loadp2 Makefile to use "flexspin" instead of "fastspin".
Does "loadp2 -single" work for you? It uses the slow single-stage loader, which should be fine as long as you don't need any special loader features (like loading at a different starting address).
The Pushregs and Popregs sequences, which are used all over the place, are the main problem for Rev A chips. Rev A does not advance the pointer register correctly when a multi-long read or write is set up: it only increments the pointer by 4 (one long) instead of by the full amount for the number of longs moved.
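To make the difference concrete, here is a small host-side C model of the pointer arithmetic as described in this post. It is only a sketch, not code from flexprop's outasm.c: the function names are made up for illustration, and it assumes every move is one 4-byte long.

```c
#include <stdint.h>

/* Hypothetical model: value of the pointer register after a block move of
   `nlongs` longs with post-increment, as described in the post above. */

/* Rev A as reported: the pointer advances by a single long (4 bytes),
   no matter how many longs were moved. */
uint32_t ptr_after_block_rev_a(uint32_t ptr, uint32_t nlongs)
{
    (void)nlongs;              /* the count is ignored in this model */
    return ptr + 4u;
}

/* Rev B/C: the pointer advances past every long that was moved. */
uint32_t ptr_after_block_rev_bc(uint32_t ptr, uint32_t nlongs)
{
    return ptr + 4u * nlongs;
}

/* Extra bytes that would have to be added on Rev A to land on the same
   address as Rev B/C (assumes nlongs >= 1). */
uint32_t rev_a_fixup_bytes(uint32_t nlongs)
{
    return 4u * (nlongs - 1u);
}
```

For a 5-long push starting at 0x1000, the Rev A model ends at 0x1004 while the Rev B/C model ends at 0x1014, which is why code that relies on the pointer afterwards goes wrong.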
So in "outasm.c" the following lines don't work on Rev A chips.
Version 5.0.2
- Added "noinline" attribute
- Added a warning for some non-terminating loops
- Fixed a compiler warning in ff.cc
- Fixed a bug with returning structures in C
- Fixed an optimizer bug with trying to replace ptra++
- Fixed an overly aggressive loop reduction optimization
- Made BASIC open command use "throw" to report errors
Unofficially, it should also work on Rev. A boards again, but that's not an officially supported configuration and I'd urge everyone to upgrade to a board with a Rev. B or Rev. C chip on it.
Does anyone care about separate spin2cpp and flexspin binaries? flexspin binaries may be obtained from the flexprop.zip package, but spin2cpp isn't in there. Should it be?
About that...
I've been using FlexProp with great success, but haven't had the time to work out how to get it running on Linux. Where does one obtain binary versions of flexspin and loadp2 for Linux?
There are no binary versions of these for Linux. If you check out the source and do "make install" it'll build and install in your $HOME/flexprop directory. You'll need Tcl/Tk, bison, and the usual build tools (gcc, make, etc.)
Yep, it's really easy. Unlike most C programs I've seen, attempting to compile flexspin (IDK about the GUI) does not suck all the lifeforce out of your soul (this isn't even a joke, that is legit the feeling CMake invokes in me), it just works.
It's as easy as finding the 'flexspin' source code... it's right... where?
Seriously, you underestimate the depth of my ignorance in this matter. Life force is about 70% at this point...
All the source is available in the github flexprop repository... https://github.com/totalspectrum/flexprop
As ersmith stated, you can easily build the whole project (i.e. GUI, all the binaries for compiling & loading) on WIN10, Linux, macOS...
I suggest (on Linux or macOS) something like:
$ mkdir source
$ cd source
$ git clone --recursive https://github.com/totalspectrum/flexprop.git
$ cd flexprop
$ make clean # only if you had previously run make
$ make install
Note: '--recursive' tells git to include all of the project's submodules (PropLoader, loadp2, & spin2cpp).
If successful, you'll find a new flexprop directory in your home directory, where you'll find the flexprop.tcl GUI, tools & sample code:
$ cd ~/flexprop
$ ls
License.txt bin doc include src
README.md board flexprop.tcl samples
$ ls bin
fastspin flexcc flexspin loadp2 mac_terminal.sh proploader
Use flexprop.tcl if you like or the tools in the bin directory from the Linux/macOS command line. It's "all there".
Note: flexprop may have some requirements for successful building... If make install does not work, you can probably get help here in the forum...
Plan9 question...
Trying to load up file from PC using Plan9.
This "fread" function doesn't work in USE_HOST mode. Worked in USE_SD mode.
I replaced it with fgetc to make it work.
Also, why is Plan9 so slow? Would increasing baud somewhere make it faster?
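//bytesRead = fread(imbuff,1,blocklen,pfile); /* copy the data into pbuff and then transfer it to command buffer */
bytesRead = 0;
for (i = 0; i < blocklen; i++)
{
    int c = fgetc(pfile); /* fgetc returns an int so that EOF can be told apart from data */
    if (c == EOF)
        break; /* stop at end of file instead of storing EOF as data */
    imbuff[i] = c;
    bytesRead++;
}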
Beginning to do useful things with flexbasic.
I noticed that your assembler is clever enough to use fcache to speed things up.
If I understand it correctly, your program looks for loops it can put into fcache, but not all loops; there must be rules determining what is allowed. Perhaps loops without calls?
What about for/next loops?
An example
sub spiout(dval as ulong, bits as ubyte) 'do/loop twice as fast
  dim sp1 as ulong
  sp1 = 1 << (bits-1) 'bit mask msbit first
  do
    if (dval and sp1) = 0 then
      output(dt) = 0
    else
      output(dt) = 1
    endif
    sp1 = sp1 >> 1 'put it here to allow a bit of data setup time
    output(lck) = 1 'strobe data in
    output(lck) = 0
  loop until sp1 = 0
end sub
@Rayman: There's a mistake in the 9p calculation for breaking large reads/writes up. So for now you'll have to stick to reading or writing <= 1024 bytes at a time. (The bug is fixed in github if you want to use bigger I/O)
The 9p file I/O goes over the serial line, so yes, increasing the baud rate will improve the performance. 2 megabaud seems to work fine.
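Until a build with the fix is in use, one way to stay under the 1024-byte limit is to wrap fread in a helper that never asks for more than that in one call. This is just a sketch using standard C; `read_in_chunks` and `MAX_IO_CHUNK` are names invented here, not part of flexprop or its libraries.

```c
#include <stdio.h>
#include <stddef.h>

#define MAX_IO_CHUNK 1024  /* per the post above: keep each read <= 1024 bytes */

/* Read up to `total` bytes from `fp` into `buf`, never asking fread()
   for more than MAX_IO_CHUNK bytes at once.
   Returns the number of bytes actually read (short on EOF or error). */
size_t read_in_chunks(FILE *fp, unsigned char *buf, size_t total)
{
    size_t done = 0;
    while (done < total) {
        size_t want = total - done;
        if (want > MAX_IO_CHUNK)
            want = MAX_IO_CHUNK;
        size_t got = fread(buf + done, 1, want, fp);
        done += got;
        if (got < want)
            break;   /* EOF or error: stop and report what we have */
    }
    return done;
}
```

A call like `bytesRead = read_in_chunks(pfile, imbuff, blocklen);` would then stand in for the single large fread, assuming imbuff is a byte buffer.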
@tritonium : Yes, for loops will be placed into the cache. The loop cannot contain any branches to outside the loop (including calls). So no subroutine calls or GOTO statements inside the loop. Also, the loop has to fit: on P2 that's not too much of a restriction since 1K of LUT is available, but on P1 it's an issue.
Also, you can further optimize your SPI code by testing for bit 31 or bit 0 inside your loop -- that way the compiler can use the carry bit from the shift. So for MSB first output either reverse the bits before the loop and shift right, or else shift the data up before the loop so that the next bit you need to shift out is at bit 31.
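The second variant (shift the data up first, then always test bit 31) can be sketched in C as below. This is a PC-side illustration of the bit ordering only: `spi_emit_bit` and `spi_log` are hypothetical stand-ins for driving the data pin and strobing the clock, and whether a given compiler turns the bit-31 test into a carry-flag shift is up to that compiler.

```c
#include <stdint.h>

/* Hypothetical stand-in for the pin writes: record each bit so the
   ordering can be checked on a PC. */
static int spi_log[32];
static int spi_log_len = 0;

static void spi_emit_bit(int bit)
{
    if (spi_log_len < 32)
        spi_log[spi_log_len++] = bit;
}

/* Send the low `bits` bits of `dval`, MSB first. */
void spiout(uint32_t dval, unsigned bits)
{
    if (bits == 0 || bits > 32)
        return;                           /* nothing sensible to send */
    dval <<= (32 - bits);                 /* first bit to send is now bit 31 */
    while (bits--) {
        spi_emit_bit((int)((dval >> 31) & 1u)); /* only bit 31 is ever tested */
        dval <<= 1;                       /* next bit moves up into bit 31 */
    }
}
```

Compared with the mask version, there is no separate mask variable to shift and compare against zero; the bit count drives the loop and the data word is the only thing shifted.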
That's brilliant!
How on earth does your compiler manage that!
I have many loops calling spiout, I might try pasting the spiout code into those loops and see the difference.
That's brilliant!
How on earth does your compiler manage that!
I wrote it to take advantage of the Propeller's features, and efficient bit-banging was a high priority.
I have many loops calling spiout, I might try pasting the spiout code into those loops and see the difference.
You could try that, although honestly I suspect most of the time is spent in the spiout loop itself, so as long as that loop gets fcached then it's not as big a deal if the outer ones don't.
Another thing you can do with small, frequently accessed functions (like spiout) is to explicitly place them in LUT or COG memory. Doing this too much will overflow, but for a few key functions it may be worthwhile.
Also, the loop has to fit: on P2 that's not too much of a restriction since 1K of LUT is available, but on P1 that's an issue...
This raises a follow-up question. Assume for the moment I have just under 1K of loop code that gets placed into the LUT... and a half dozen other similar loops that are each just under 1K in size. Does the compiler swap these in and out at runtime as needed, or is this fixed (i.e., the allocation of that 1K is static, and whatever winds up there first stays there with no swapping)?
flexprop .spin2 bug or poor programming on my part (using flexprop 5.0.3 Beta on macOS)
Have I missed something very simple in this code? The two arrays should print out with different results (I think). The X array of longs gets overwritten with the Y array, in this example:
CON
  _clkfreq = 200_000_000
  rx_pin = 63
  tx_pin = 62
  baud = 230_400
VAR
  long X[10]
  long Y[10]
OBJ
  ser: "spin/SmartSerial"
PUB Main() | i
  ser.start(rx_pin, tx_pin, 0, baud)
  ser.printf("Broken array problem\n\n")
  repeat i from 0 to 9
    long[X][i] := i
  repeat i from 0 to 9
    long[Y][i] := i + 103
  repeat i from 0 to 9
    ser.printf("%d x: %d ", i, long[X][i])
    ser.printf("%d y: %d \n", i, long[Y][i])
  repeat
Well, there may be a problem in FlexProp. I couldn't get it to work there, so I moved it over to Propeller Tool. It's okay there with the syntax I suggested.
I must have done something wrong. This works. Is there a reason you don't simplify to named arrays, e.g.
PUB main() | i
  ser.tstart(baud)
  ser.fstr0(string("Broken array problem\n\n"))
  repeat i from 0 to 9
    X[i] := i
    Y[i] := i + 103
  repeat i from 0 to 9
    ser.fstr3(string("%d x: %d y: %d\n"), i, X[i], Y[i])
  repeat
    waitct(0)
Also, the loop has to fit: on P2 that's not too much of a restriction since 1K of LUT is available, but on P1 that's an issue...
This raises a follow-up question. Assume for the moment I have just under 1K of loop code that gets placed into the LUT... and a half dozen other similar loops that are each just under 1K in size. Does the compiler swap these in and out at runtime as needed, or is this fixed (i.e., the allocation of that 1K is static, and whatever winds up there first stays there with no swapping)?
The fcache code is swapped out as needed. Only the most recently used loop is kept in there (so there's only one loop at a time in the cache).
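A toy C model of that "one loop at a time" policy shows what it means for code that alternates between loops. It is not how the flexspin runtime is written; `FcacheModel` and its functions are hypothetical names, and the only thing modelled is which loop is resident and how often a copy is needed.

```c
/* Toy model of a one-entry cache: only the most recently used loop is
   resident, so switching to a different loop forces a reload. */
typedef struct {
    int resident_loop;   /* id of the loop currently cached, -1 if none */
    int reloads;         /* how many times a loop had to be copied in */
} FcacheModel;

void fcache_init(FcacheModel *fc)
{
    fc->resident_loop = -1;
    fc->reloads = 0;
}

/* Call before "running" loop `loop_id`; returns 1 if it had to be loaded. */
int fcache_enter(FcacheModel *fc, int loop_id)
{
    if (fc->resident_loop == loop_id)
        return 0;                 /* already resident: no copy needed */
    fc->resident_loop = loop_id;  /* evict whatever was there before */
    fc->reloads++;
    return 1;
}
```

In this model, running the same loop repeatedly costs one load, while alternating between two loops reloads on every switch, so several large loops called in turn would each pay the copy cost every time they start.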
Is there a reason you don't simplify to named arrays, e.g.
That code was just an example that I quickly put together to show the problem. The real code has a single named array (named 'buffer' in both C and Spin2) that I send to C code (thanks to flexprop's support for multi-language projects), which fills the array with Bezier curve coordinates. I need the C code because I couldn't write a Bezier function with integers in Spin2.
I think the problem was within the C function... I was using the input array pointer as an indexed array. When I changed that code to use pointer arithmetic, the returned array was correct.
C code:
*buffer++ = x;
*buffer++ = y;
I now retrieve the coordinates in Spin and use them like this:
SetPixel(long[buffer][i], long[buffer][i+1],color)
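For reference, a self-contained C sketch of the pointer-arithmetic form described above might look like this. It is not the poster's actual Bezier routine: `store_points`, `xs` and `ys` are illustrative names, and the coordinates are simply passed in rather than computed.

```c
#include <stdint.h>

/* Store `npoints` (x, y) pairs into `buffer` as consecutive 32-bit longs:
   x0, y0, x1, y1, ...  Returns the number of longs written. */
int store_points(int32_t *buffer, const int32_t *xs, const int32_t *ys, int npoints)
{
    int32_t *start = buffer;
    for (int i = 0; i < npoints; i++) {
        *buffer++ = xs[i];   /* x goes in first ... */
        *buffer++ = ys[i];   /* ... then y, right after it */
    }
    return (int)(buffer - start);
}
```

With this layout, point k sits at longs 2k and 2k+1 of the buffer, so a reader that uses `[i]` and `[i+1]` as in the SetPixel call would presumably step i by 2 per point.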
dgately
In Spin if "X" is an array then just plain "X" in an expression is equivalent to "X[0]". (That's *very* different from C, of course!). So
long[X][i] := i
is the same as
long[X[0]][i] := i
which is probably not what you intended. As Jon mentioned, you more likely wanted
Comments
The Rev B to C change is only a die track cut that only affects the ADC noise, so for almost everyone you can safely ignore it.
So in "outasm.c" the following lines don't work on Rev A chips.
Changing it to this fixes the issue.
Since it's a small change, it could be applied for all versions of the chip.
I'm sure there are other issues but at least most of the code now works.
Mike
and it puts the binaries into the build directory.
Probably similar for loadp2.
It looks like it's putting the 'do loop' code between LR__0021 and LR__0022 into the cache for fast execution.
Makes a heck of a difference!
Can I write my flexbasic code in a way that I can 'encourage' this behaviour?
Dave
I changed the loadp2 command line to have the new baud.
Added this:
and then added this in the startup:
But, it doesn't appear to work... Actually, it does work...