I really like how you've done Prop2_Docs! Even if a more verbose manual gets written someday, or if an ultra-terse quick reference guide is produced, I hope that this bridge between the two will continue to be supported and preserved.
It would take too long to explain this instant love affair in detail. So I'll just say that it's perfect.
Here is the latest P2 Instruction Set summary (Jan 28th)... (please note not all condition modifiers WZ and WC are correct for all instructions per line)
Is there a collision in your P2 instruction set summary document for the following two instruction groups, or am I just reading it wrong? Something doesn't seem right here.
UPDATE: Never mind, I went to Chip's latest docs and see that the I bit has to be a 0 for LOCINST (only) and a 1 for REPS even though both instructions use the exact same 01 pattern in the two ff bits. This was just not readily apparent in the summary document which had temporarily confused me.
Yes, there are a couple of places where the overlap has to be inspected closely to see what is really happening: Chip's tricks.
The push and pop instructions actually only use 1 of the block of 4 and the other 3 are free.
Thank you!
That makes me feel relieved, because I try to write them so they get the point across with brevity, so things don't get lost. They are insufficient in some regards, but that will get fixed a little later.
Those have changed, and have been updated in the latest .zip file. Thanks for finding these things. The newest Prop2_Docs.txt has all the fixes you found, plus some others.
Ah what? That doesn't compute to what I was linking. Chip, I think you might have replied to the wrong person. I was replying to David about allowing COGINIT to start a Cog directly to HubExec mode without any pre-loading of the Cog.
Cluso99, if you look in the latest .zip file, I got this straightened out. It uses all eight WIDE registers. I had to add a bit to the XFR counter.
Chip, this is from the latest docs (28th). It is the wide to pins direction - the two examples show only 4 longs are transferred either as 4 longs or 8 words, not 8 or 16.
BTW The P2 is looking amazing now. I cannot believe how much the P2 has advanced since Thanksgiving! You have really done a magnificent job!
There were a couple of errors in the Instruction Summary posted above, so here is an update... (flags are not correct for some instructions on the same line)
Thank you all for noticing these things. I'll review the docs today and fix all these things. I'm glad you guys understand this all well enough to notice problems like these. There are going to be phase errors as things evolve.
Chip, of course there are going to be errors - we are all human! It is a great credit to you that there are so few errors, and the explanations are quite clear without being verbose.
Now I have been thinking (dangerous, I know): Spin2 may as well use hubexec mode, freeing up cog RAM and aux RAM for variables and stack space. Here I mean the interpreter. The PASM code should be able to run as reentrant code (a term used on the mini I worked on, which permitted multiple processors to concurrently execute the same shared code). What I mean here is that multiple cogs can execute the same PASM hubexec Spin interpreter code concurrently without any conflicts.
Sorry for typos - on xoom
Absolutely! This will free up lots of cog RAM space. We can also run four hardware threads of interpreted Spin code, per cog.
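To illustrate what "reentrant" means here, below is a minimal Python sketch, not actual Spin2 or PASM. It assumes a toy byte-code set (PUSH, ADD, MUL, HALT) and a hypothetical CogState record, neither of which comes from the P2 docs. The point is only that one shared copy of the interpreter code can serve several executors at once, as long as every piece of mutable state (stack, program counter) lives in the caller's own context.

```python
from dataclasses import dataclass, field
from threading import Thread

# Hypothetical toy opcodes, invented for this illustration only.
PUSH, ADD, MUL, HALT = range(4)


@dataclass
class CogState:
    """All mutable interpreter state, private to one executor."""
    stack: list = field(default_factory=list)
    pc: int = 0


def run(program, state):
    """Shared interpreter loop. It touches no global mutable data,
    so any number of callers can execute it concurrently."""
    while True:
        op = program[state.pc]
        state.pc += 1
        if op == PUSH:
            state.stack.append(program[state.pc])
            state.pc += 1
        elif op == ADD:
            b = state.stack.pop()
            a = state.stack.pop()
            state.stack.append(a + b)
        elif op == MUL:
            b = state.stack.pop()
            a = state.stack.pop()
            state.stack.append(a * b)
        elif op == HALT:
            return state.stack[-1]
        else:
            raise ValueError(f"unknown opcode {op}")


def run_on_cogs(programs):
    """Run each program in its own thread, all sharing the same run()."""
    results = [None] * len(programs)

    def worker(index):
        results[index] = run(programs[index], CogState())

    threads = [Thread(target=worker, args=(i,)) for i in range(len(programs))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_on_cogs([(PUSH, 2, PUSH, 3, ADD, HALT),
                       (PUSH, 4, PUSH, 5, MUL, HALT)]))
```

In this analogy each thread stands in for a cog (or a hardware thread within a cog), and CogState stands in for the cog RAM and aux RAM that would hold the variables and stack.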
Hard to imagine it won't do byte codes. Inline PASM will be there for higher speed cases.
I was thinking that you could have Spin subroutines employ a keyword in their declaration to force real compilation for when the speed is really needed. You could even do the same at the object instantiation level, to compile everything in that object.
Excellent.
Sounds a lot like the HUBTEXT keyword in Propeller-GCC which is used to force XMM code into HUB RAM for functions that need speed such as serial IO which saves a COG.
I assume inline PASM would be treated in a similar way.
This sounds excellent. A Spin compiler that can produce native code would be great. You get speed where you need it and code density when speed doesn't matter as much. Sounds like a good combination.
I recall that this idea was discussed a few years ago in connection to SpinLMM. I proposed variations of the PUB and PRI keywords that would cause the entire method to be compiled to LMM PASM instead of Spin byte codes. The PASM code would have access to cog RAM so it could use PBASE, VBASE and DBASE to access Spin variables. Of course, this could be done inline within a method as well, but the PASM code will need to be careful about how it pushes and pops stuff off of the Spin stack.
I found the thread that I referred to in my previous post. It's a thread from about 4 years ago at http://forums.parallax.com/showthread.php/119695-Suggestion-for-improving-Spin-performance . However, the idea there was to compile Spin code into PASM that would run in a separate cog. I suggested using the keywords CPUB, CPRI, CDAT and CVAR to generate instructions and data that would reside within a Cog. I actually began writing a compiler that would generate the PASM code. Maybe something like this could be used to generate HUB Exec code, or maybe a compiler directive would work better.
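As a rough illustration of the keyword idea, here is a small Python sketch of how a compiler front end might pick a back end per method. It is not how any real Spin compiler works. CPUB and CPRI are the keywords suggested above. The choose_backends function, the "native" and "bytecode" labels, and the sample method names are all hypothetical.

```python
# Keywords from the suggestion above; the mapping itself is an assumption.
NATIVE_KEYWORDS = {"CPUB", "CPRI"}
BYTECODE_KEYWORDS = {"PUB", "PRI"}


def choose_backends(source):
    """Map each declared method name to 'native' or 'bytecode'
    based only on the keyword that starts its declaration line."""
    backends = {}
    for line in source.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        keyword = parts[0].upper()
        name = parts[1].split("(")[0]  # drop any parameter list
        if keyword in NATIVE_KEYWORDS:
            backends[name] = "native"
        elif keyword in BYTECODE_KEYWORDS:
            backends[name] = "bytecode"
    return backends


if __name__ == "__main__":
    sample = "PUB main\nCPUB tx(c)\nPRI helper(a, b)\n  x := 1"
    print(choose_backends(sample))
```

A compiler directive, the other option mentioned above, would amount to switching the default back end for a region of source instead of deciding per declaration.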
Comments
I think there is a bug in the latest info you posted.
I noted in the updated docs that Pin Transfer (wide to pins) only transfers 4 longs (not 8). Is this correct?
(Lines 2656+ and 2669+)
InstructionSet_20140128.spin
It is supposed to be @
DJNZ/DJNZD are now relative jumps.
(flags are not correct for some instructions on the same line) InstructionSet_20140128c.spin
Are there any provisions for handling quotient overflows from a 64/32 bit divide?
For example, if I divide 2^33 by 1, what will GETDIVQ return? Will carry be set?
Walter
I think you would have to pre-shift the denominator so that it is larger than the upper 32 bits of the numerator. E.g., the 1 becomes 4.
Hmm, has the ring of floating-point to it ...
No. I need to figure out what happens and document it. I think you might get $FFFFFFFF.
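The arithmetic behind this exchange can be checked with a short Python model. This is only a sketch of the math, not of the P2 hardware: it says nothing about what GETDIVQ actually returns or how the carry flag behaves, which is still undocumented as noted above. The function names are made up for the illustration. A 64/32 divide overflows a 32-bit quotient exactly when the upper 32 bits of the numerator are greater than or equal to the denominator, and pre-shifting the denominator trades low-order quotient bits for range.

```python
MASK32 = 0xFFFFFFFF


def quotient_overflows(numerator, denominator):
    """True when numerator // denominator does not fit in 32 bits.
    Equivalent to: upper 32 bits of numerator >= denominator."""
    if denominator == 0:
        raise ZeroDivisionError("denominator is zero")
    return (numerator >> 32) >= denominator


def prescale_divide(numerator, denominator):
    """Shift the denominator left until the quotient fits in 32 bits.
    Returns (scaled_quotient, shift); the true quotient is roughly
    scaled_quotient << shift, with the low 'shift' bits lost."""
    if denominator == 0:
        raise ZeroDivisionError("denominator is zero")
    shift = 0
    scaled = denominator
    while (numerator >> 32) >= scaled:
        scaled <<= 1
        shift += 1
    if scaled > MASK32:
        raise OverflowError("scaled denominator no longer fits in 32 bits")
    return numerator // scaled, shift


if __name__ == "__main__":
    n, d = 2**33, 1
    print(quotient_overflows(n, d))   # the 2^33 / 1 case from the question
    print(prescale_divide(n, d))      # denominator 1 is shifted up to 4
```

For the example in the question, the upper 32 bits of 2^33 are 2, so the denominator has to grow past 2, which is why the 1 becomes 4. Keeping a separate shift count alongside the quotient is what gives it the floating-point flavour.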
That's crazy talk!
So, a Spin2 compiler will generate byte-codes as well as HUBEXEC code?
Byte-codes are much more compact than PASM of course and would certainly be missed if not supported.
Watch out with that "interpreted" word. We don't want Bill getting all postal on us again.
I have the same assumption / question about inline PASM