Okay. Here is the intermediate release. Its documentation will be getting updated over the next few days. So, at first, you are on your own, needing to infer things from the instruction list in Prop2_Docs.txt, which IS accurate. Of course, you can ask any questions here on the forum and I'll answer them.
The DE0-Nano FPGA configuration lacks the following, in order to make one cog fit:
CTRB - 2nd counter (CTRA is complete, including waveform output)
SERB - 2nd serial port
CORDIC - qsincos/qarctan/qrotate/qlog/qexp
PIX - texture mapping, pixel blending
I'm feeling like maybe I need a DE2... Can someone please remind me: How much HUB RAM do you get with DE0 and DE2?
I seem to recall that DE0 was something like 32kB, but I don't remember what DE2 gives you...
I'm compiling the DE0-Nano now. I had to remove the PIX, CORDIC, CTRB, SERB, and waveform output from CTRA to get things to fit. If this compile works, I'll try putting the waveform output back in for CTRA. Anyway, you need the VID and XFR blocks to do multithreaded VGA w/SDRAM, and those blocks are both in the DE0-Nano compile. For the SDRAM writing/reading, you'll need to be single-tasking because that requires exact timing.
REPD #n,#i seems to be an exact subset of REPS #n,#i - if I am correct, that frees up a double argument instruction.
It isn't really a true subset:
Prop2_Docs.txt from 26Jan14, starting at line 1200:
REPS differs from REPD by executing at the 2nd stage of the pipeline, instead of the 4th. By
executing two stages earlier, it needs only one spacer instruction *. Because of its earliness,
no conditional execution is possible, so it is forced to always execute, allowing the CCCC bits
to be repurposed, affording a contiguous 16-bit constant for the repeat count.
Functionality is the same, so the only difference I can see is needing two extra spacers, which I was aware of when I posted it.
That is not an advantage as far as I am concerned... so it is a candidate for the chopping block.
Removing REPD would not free up a complete dual operand instruction.
It shares opcode space with a lot of other instructions that start with "1111111 00" and then use part or all of S as an additional part of determining the actual instruction.
Maybe Bill is suggesting removing the REPS codepoint. The drawback is that we would need three filler instructions instead of one, and REPD #D,#S does not allow for as many iterations as REPS. The processor could recognize an unconditional REPD #D,#S instruction at stage 2 and treat it like REPS. REPD could also be treated as unconditional at all times to gain 4 more bits for the iteration count. Personally, I prefer separate REPD and REPS instructions.
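To put rough numbers on the trade-off discussed above, here is a small Python sketch. The spacer counts (one for REPS, three for REPD) and the 16-bit REPS constant come from the posts and the quoted docs. The function names and the `base_bits` argument are hypothetical, for illustration only, because the thread does not state how wide the REPD immediate count is.

```python
# Spacer instructions needed after each repeat instruction, as stated in the thread.
SPACERS = {"REPS": 1, "REPD": 3}

def setup_slots(instr):
    """Instruction slots used before the loop body: the repeat instruction plus its spacers."""
    return 1 + SPACERS[instr]

def count_values(bits):
    """Number of distinct values an unsigned field of `bits` bits can hold."""
    return 1 << bits

def widened(base_bits, reclaimed_bits=4):
    """Field width after reclaiming the four CCCC condition bits (base_bits is an assumption)."""
    return base_bits + reclaimed_bits

print(setup_slots("REPS"), setup_slots("REPD"))   # 2 4
print(count_values(16))                           # 65536
```

Whatever the real base width is, reclaiming four bits multiplies the number of encodable counts by 16. The spacer slots are not necessarily wasted, since they can hold useful instructions.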
Bill, if you get rid of REPD you lose the ability to set a variable loop count in a register. Having a variable loop size may not be as useful, but there could be cases where you would want to conditionally include extra code within the loop. This would require a way to avoid executing the extra code when the REPD completes, but that could be done with a conditional jump.
2) Making PASM code universally usable across all languages as binary blobs no matter if it was written in a Spin object or in GAS or whatever. Does Spin wrap address that ?
No. And there will never be a universally accepted solution for this. Use what's available.
This is only because the PropTool was born "too simple".
You have hit the nail on the head.
Almost... It's not the PropTool that was born too simple; it was the Spin language itself. Spin + included PASM, that is.
The structure of which you speak, "spin in one side and pasm in other having a fixed interface", is not a function of the tool used for editing but of the very language definition. Don't forget there are many other Spin compilers that do not depend on any GUI tool.
Aside: I have noticed many times that users of IDEs, and especially VB, cannot tell the difference between a programming language and the tool they use to edit it.
Now, about those rules or structure...
I think that to get Spin programmers to create PASM code that is usable outside of Spin, with C say, would require only slight modification to the Spin language.
1) Introduce a PASM section. This is basically a DAT section with some extra rules...
2) Make it a requirement that any PASM code to be launched into a COG has to be in a PASM section.
3) Code in a PASM section cannot access any variables defined in Spin VAR blocks or DAT sections by name. This removes the dependency of PASM on the Spin data layout.
4) Spin code cannot access any locations in a PASM section, at least by name. This prevents initializing of LONGs and such prior to loading to COG. Data that C would not know about.
5) Due to 4) all parameters passed to PASM prior to launch are to go through pointers passed in PAR.
6) All communication between a running PASM code and Spin go through shared HUB as normal, using pointers passed through PAR.
With this in place there is no symbolic linkage between PASM and Spin, except for the starting address, which Spin needs. With no linkage, the resulting code can be compiled by tools such as openspin or BST into a binary blob that is usable by any language.
However, I suspect the Spin community would hate this idea.
Perhaps you don't even need a publicly accessible obex. It can be a repository integrated into the multi-platform IDE, and it would be nice if it could keep in sync with a local copy (in case internet is not available at the place of development). Publication of new objects would be through the IDE when on-line.
Ouch! No, the OBEX has to be a publicly accessible resource. That's a wonderful thing.
I personally think the whole thing should be moved to github but that's another story.
But yes, I think being able to explore OBEX and pull down objects dropping them straight into your project (or local library) in an IDE is a great idea.
Why can't you just give us a slower DE0-Nano instead of redacting so much functionality? I'd be happy with a 40-60 MHz DE0 if it meant a full implementation of the core.
Very similar to Heater's proposal above, but slightly simpler and thus more likely to be adopted.
From the latest docs:
COGNEW D, S/#
COGINIT D, S/#, #0..7
For COGNEW, D specifies a long address in hub memory that is the start of the program that is to be
loaded into the idle cog, while S is a 18-bit parameter (usually an address) that will be conveyed to
the PTRA's of that cog. The PTRB's of that cog will be set to the start address of its new program in
hub memory, which is the same as the D value used in the COGNEW instruction, AND'd with $3FFFC to
form a hub long address.
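The pointer arithmetic in the quoted text can be worked through in a few lines of Python. The $3FFFC mask is taken from the docs quoted above. Masking S to 18 bits is my reading of "18-bit parameter", and the function name and sample addresses are hypothetical.

```python
HUB_LONG_MASK = 0x3FFFC   # from the quoted docs: D is AND'd with $3FFFC
PARAM_MASK = 0x3FFFF      # assumption: an 18-bit parameter keeps the low 18 bits of S

def cognew_pointers(d, s):
    """Return (ptra, ptrb) as the newly started cog would see them, per the quoted text."""
    ptra = s & PARAM_MASK       # parameter, usually an address
    ptrb = d & HUB_LONG_MASK    # program start, forced to a long boundary
    return ptra, ptrb

ptra, ptrb = cognew_pointers(0x01237, 0x02000)
print(hex(ptra), hex(ptrb))     # 0x2000 0x1234
```

Clearing the two low bits is what turns an arbitrary byte address into a long-aligned one, which is why a program start of $1237 comes back as $1234.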
AUGS could be used to point to the parameter block with #.
The problem is that D is a register - a cog register - which pretty much requires the cog image to be patched. Except....
If SPIN is disallowed from accessing labels inside the program's code, then binary objects can be written that will work the same with Spin and C.
Spin & C can set up initial data in the parameter section, just like Spin currently does in the cog image - but this time, as the cog image is not modified, the same blob would work fine with C or SPIN.
Note the above does not care if the cog will be running in cog or hubexec mode!
A binary driver would just have to publish the definition of its parameter block, the start of which would have to be long aligned.
This also means that we can free up two dual-reg opcodes, and use argumentless opcodes for COGNEW and COGINIT ... forcing the PTRA/PTRB usage using LOCPTR. Unless I am mistaken, this will actually require less logic to implement than the current COGNEW/COGINIT!
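As a rough illustration of the published parameter block idea above, the following Python sketch shows how a host-side tool in any language might build such a block without touching the cog image. The field names, the little-endian 32-bit layout, and the helper names are all assumptions made for the example; nothing here is a defined Propeller tool format.

```python
import struct

# Hypothetical published layout for a driver's parameter block, one long per field.
FIELDS = ("pin_base", "frame_buffer", "status")

def pack_params(**values):
    """Pack the fields in published order as little-endian 32-bit longs; unset fields are 0."""
    return struct.pack("<%dI" % len(FIELDS), *(values.get(name, 0) for name in FIELDS))

def align_long(addr):
    """Round a hub address up to the next long (4-byte) boundary."""
    return (addr + 3) & ~3

block = pack_params(pin_base=16, frame_buffer=0x4000)
print(len(block), hex(align_long(0x1235)))   # 12 0x1238
```

The point of the design is that only `FIELDS` needs to be agreed on. The driver binary itself is never patched, so the same blob can be started from Spin, C, or anything else that can write longs to hub memory.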
If SPIN is disallowed from accessing labels inside the program's code, then binary objects can be written that will work the same with Spin and C.
I don't see how you do that without introducing something like a PASM section into the language. After all, Spin does need DAT for other purposes sometimes.
Doesn't the linkage have to be broken both ways? Code in a PASM section should not know about where named Spin variables are.
I've modified PASM programs to work with C by moving all of the variables that are initialized by Spin code to the beginning of the PASM code. The first line of the PASM code just jumps over the variables. This allows C to set up PASM variables just like Spin does. As an example, here are the first few lines of a VGA driver written in PASM.
                        org     0
initialization          jmp     #skipover
directionState          long    0
videoState              long    0
frequencyState          long    0
numTileLines            long    0
numTileVert             long    0
visibleScale            long    0
invisibleScale          long    0
horizontalLongs         long    0
tilePtr1                long    0
tileMap1                long    0
pixelColorsAddress      long    0
syncIndicatorAddress    long    0
skipover
                        mov     vcfg, videoState        ' Setup video hardware.
                        mov     frqa, frequencyState
                        movi    ctra, #%0_00001_101
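With the variables grouped right after the initial jump, a host program only needs to know their order to patch a copy of the image before loading it. Here is a minimal Python sketch of that patching step. The variable order is taken from the listing above; the little-endian long layout, the `patch_image` helper, and the all-zero placeholder image are assumptions for illustration.

```python
import struct

# Order of the longs that follow the initial jmp, taken from the listing above.
VARS = ["directionState", "videoState", "frequencyState", "numTileLines",
        "numTileVert", "visibleScale", "invisibleScale", "horizontalLongs",
        "tilePtr1", "tileMap1", "pixelColorsAddress", "syncIndicatorAddress"]

def patch_image(image, **values):
    """Return a copy of the image with the named variables overwritten."""
    patched = bytearray(image)
    for name, value in values.items():
        index = 1 + VARS.index(name)          # long 0 is the jmp instruction
        struct.pack_into("<I", patched, 4 * index, value & 0xFFFFFFFF)
    return bytes(patched)

image = bytes(4 * (1 + len(VARS)))            # placeholder standing in for a real binary
patched = patch_image(image, videoState=0x12345678)
print(patched[8:12].hex())                    # 78563412
```

Note that this approach still modifies the image itself, which is exactly the coupling the parameter-block proposals elsewhere in the thread try to avoid.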
Currently, with the boot loader and PNUT, and with no SPIN in ROM, we are writing and loading pure PASM that is essentially decoupled from Spin. With a few changes (lose the CON section and replace it with "label EQU value", and remove the DAT), the code is pure PASM that looks like almost any other ASM. Your PASM2 code no longer needs to have a one-line Spin program wrapping it to do a COGINIT/COGNEW. The bootloader takes care of that. (Yay!)
We are creating loadable blobs. Earlier, someone (sorry, forgot who) suggested removing the zero fill from the front of the binary to make it a more relocatable blob.
When OpenSpin gets the Spin2/PASM2 spec, or even with Pnut, how much harder would it be to finish the decoupling? Put in a $PASM_STRICT directive to give you PASM decoupled from SPIN. You, the PASM programmer, are responsible for properly using whatever you expect to be passed in the PAR register. Your variable namespace is the simple PASM namespace of offsets from ORG or ORGH. If you do not specify $PASM_STRICT, you get the wonderfully blended marriage of Spin and PASM that everyone has grown up with.
Re: SPIN and PASM should not know where variables are.
No. That tight marriage, particularly with inline PASM needs to be there. If somebody isn't interested in interoperating with other languages, having SPIN behave as it does right now is a huge benefit. I am opposed to breaking that for everyone.
Offering an option, such as STRICT, or INTEROP makes sense, because those who are interested get help making sure that all happens.
Right now, SPIN + PASM is seriously powerful precisely because those limits aren't there.
@Bill: One case is inline. With common labels, inline can work extremely lean. One can quite literally, type the block, using the same names, variables, addresses, the works. Doing this is super easy, and for people wanting to use some PASM, it is a no brainer to enter in just the bit they want to use and continue on. Originally, the snippet idea was to handle that case, but now that we have HUBEX, we can now just do it inline, quickly, easily.
Reuse is a great reason to author things intended for general consumption. We all benefit from the effort and I'm for ways to help people do that when doing so is their intent.
But reuse is not always the primary case, which is exactly why SPIN should do what it does now.
Think about UNIX. Some command line tools are written in the classic reuse way. They can be piped, redirected, whatever. Others are written to get stuff done, and they may not work in those classic ways.
Would we break UNIX just to ensure that anything people authored would be usable no matter what?
No, of course not. What we did do is make it easier to do that, but we didn't lock people into having to do that.
Same case here.
Let's get SPIN and PASM out the door as the flexible, powerful thing it is. Then we can add on and assist people with common sense options to address the "needs to work with C" use cases. Building it in takes away from the "who cares?" use case, and making people have to care is a sure-fire way to ensure they really don't. Ever.
Can't PNut skip the 0000000 fill in the bin file up to the first ORGH position?
That would make it simpler to build separately loaded modules.
It was this post about the zero fill that I couldn't remember. ORGH may not fill the gaps with zero, but the front end appears to be filler. If you don't zero fill everywhere when you create the binary, then the boot loader (or blob loader) needs the smarts to load pieces into place, which means you need headers inside the binary to tell it that the next blob is so long and loads at this absolute hub address, no? If you do zero fill, then a binary is truly a binary image of what should be loaded from address 0 to the last address.
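The two options in that post can be compared with a short Python sketch. The flat image is a plain zero-filled copy from address 0. The headered form uses a made-up format (an address and a length, each a little-endian long, before every segment) purely to show what a smarter loader would have to parse; it is not a format PNut or any Propeller tool is known to produce, and all function names are hypothetical.

```python
import struct

def flat_image(segments):
    """Zero-filled image from hub address 0 to the end of the last segment."""
    size = max(addr + len(data) for addr, data in segments)
    image = bytearray(size)
    for addr, data in segments:
        image[addr:addr + len(data)] = data
    return bytes(image)

def headered_image(segments):
    """Hypothetical format: (address, length) header before each segment's bytes."""
    out = bytearray()
    for addr, data in segments:
        out += struct.pack("<II", addr, len(data)) + data
    return bytes(out)

def load_headered(blob):
    """What the loader would need to do: walk the headers and recover each segment."""
    segments, pos = [], 0
    while pos < len(blob):
        addr, length = struct.unpack_from("<II", blob, pos)
        pos += 8
        segments.append((addr, blob[pos:pos + length]))
        pos += length
    return segments

segments = [(0x1000, b"\x01\x02\x03\x04"), (0x2000, b"\xAA\xBB")]
print(len(flat_image(segments)), len(headered_image(segments)))   # 8194 22
```

The flat file is much larger but needs no loader logic at all, while the headered file is compact at the cost of a parsing loop in the boot loader.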
Comments
Sorry Rick! I already had the output ready to go and hadn't posted it.
You have the next one!
Cheers
Brian
This time, I will make time to play!
Now I just have to remember how to load my DE2-115... it has been a while.
Unless I am missing something, a dual-op instruction can be recovered:
REPD #n,#i seems to be an exact subset of REPS #n,#i - if I am correct, that frees up a double argument instruction.
It isn't really a true subset:
Prop2_Docs.txt from 26Jan14, starting at line 1200:
REPS differs from REPD by executing at the 2nd stage of the pipeline, instead of the 4th. By
executing two stages earlier, it needs only one spacer instruction *. Because of its earliness,
no conditional execution is possible, so it is forced to always execute, allowing the CCCC bits
to be repurposed, affording a contiguous 16-bit constant for the repeat count.
C.W.
That is not an advantage as far as I am concerned... so it is a candidate for the chopping block.
Removing REPD would not free up a complete dual operand instruction.
It shares opcode space with a lot of other instructions that start with "1111111 00" and then use part or all of S as an additional part of determining the actual instruction.
C.W.
Seriously, what is the need for the REPD #i,#n variant, that does exactly the same thing, and requires two extra spacers?
Can anyone think of a single usage case where REPD #i,#n is needed (and REPS #i,#n cannot be used)?
I am not being snarky, I really am curious to see if I am missing a good usage case.
Ok, the freed slot would be limited to a dual op instruction that does not need to be able to specify WC WZ
The freed slot would also need the top 3 bits of S to be 001.
I'm not opposed to removing REPD, just pointing out it doesn't gain a full dual opcode instruction.
If a better use for the slot comes up then we have the option of letting it go.
C.W.
It is still potentially useful for a full D, partial S (0..63) alternate instruction, if needed.
I only want to get rid of the
REPD #i,#n ' constant loop count
variant, and keep the other register based variant, which I agree is extremely useful.
Thanks for the latest bundle!
Agreed. And like you said, only give up the immediate version.
C.W.
I think what we need to do is:
If SPIN is disallowed from accessing labels inside the program's code, then binary objects can be written that will work the same with Spin and C.
Spin & C can set up initial data in the parameter section, just like Spin currently does in the cog image - but this time, as the cog image is not modified, the same blob would work fine with C or SPIN.
Note the above does not care if the cog will be running in cog or hubexec mode!
A binary driver would just have to publish the definition of its parameter block, the start of which would have to be long aligned.
This also means that we can free up two dual-reg opcodes, and use argumentless opcodes for COGNEW and COGINIT ... forcing the PTRA/PTRB usage using LOCPTR. Unless I am mistaken, this will actually require less logic to implement than the current COGNEW/COGINIT!
How is yours a different proposal? I don't see how you do that without introducing something like a PASM section into the language. After all, Spin does need DAT for other purposes sometimes.
Doesn't the linkage have to be broken both ways? Code in a PASM section should not know about where named Spin variables are.
Breaking SPIN to make it some sort of feeder to C is a crock of you know what I'm tempted to write here.
Frankly, if SPIN gets "fixed" in that way, some of us will "unfix" it. Just know that.
But there are some differences.
- your version did not address hubexec code
- mine does not directly state your (3), I assumed it - my bad - due to defining these binary modules to be independent of anything external, except the parameter block
- I am proposing modifying COGINIT/COGNEW, simplifying them, by using LOCPTRA & LOCPTRB, instead of D&S, for the code pointer and parameter pointer (this frees two full dual op codes)
- not modifying the instructions for LOCPTRA/LOCPTRB gives some issues for specifying D
- explicit LOCPTRA/B loading should remove some logic from COGINIT/COGNEW
So it is very similar, except that it uses explicit LOCPTRA/LOCPTRB, simplifies the COGINIT/COGNEW logic, and eliminates the D issue.
Basically, it is additional suggestions to your proposal
I was expecting to re-purpose ORG / ORGH for the start of such a block, but we could have "PASM" or "DRIVER" or "MODULE" keyword for that.
Absolutely, the linkage has to be broken both ways, limited to just the Parameter block - which is public.
The whole idea was to totally decouple drivers from the language... be it Spin, C, Forth etc.
Don't we already have a directive that avoids the zero fill? It has been a while since that discussion, but I recall something like:
ORGH $1000
--some data---
ORGH $2000
ORG 0
--a cog image--
etc...
Did we not do that?
I am curious.
Why is it a problem for Spin to modify the variables in the parameter section (PTRA) instead of inside the body of the driver code?
By having a separate parameter section, pointed to by PTRA, we can use the neat PTRA indexed modes to read/write the parameters from the driver.
It makes the driver interfacing so much cleaner.
I am very open to examples of why this would be bad / inferior.