@rjo_ For sure. It's going to be pretty easy to treat a P2 like a P1, but for a few basic things. Doing that will get somebody really productive. Then, they can start taking on advanced things, one at a time, gaining each time. IMHO, it's going to rock hard. And be a lot of fun!
Also IMHO, the ideal document is something like an introduction to the chip, block diagram, etc... Then the basics, SPIN, monitor and the list of easy cheezy instructions. Include all the instruction definitions, and ideally sample code. Follow that up with cook book / project type contexts for the advanced things, volume 1, 2, 3...
I can see all sorts of potential uses for this in graphics, string operations and even cryptography.
Sorry, cannot resist...
I wonder if there would be uses for the inverse?
SRCD D/#
SRCDP D/#
SRCDN D/#
SRCDOFF D/#
Ah, if only...
Anything like this has to occur two stages deeper in the pipeline and be resilient to any combination of pipeline cancellations; hence, INDA and INDB and their intransigence. It actually took me several days to get my head around the data forwarding rules for those indirect circuits, so that I could chart out every state and determine what needed to be done in each.
Redirecting the result register, on the other hand, is a cheap trick, since everything is known at the final stage of the pipeline, and there is no possibility of needing to backtrack in any way. It only took ~6 hours to implement the new RESD scheme, including assembler modifications and FPGA testing.
It's just for cog RAM registers - a little trick to swap out result register addresses.
Then in that case it is true that there is no point to add offset. Also I think that I misunderstood the new instruction. What I actually wanted to do was to add an offset to INDA++, and INDB++ (+2, +3, ... , +8), to be able to synchronize several cogs and write directly into HUB RAM. But also because those pointers are just for cog RAM, they cannot be used the way I wanted.
May I please put some urgency back into getting out a release. Currently many of us are on hold, anxiously waiting for that release. It has been way too long since the last usable release.
As soon as you have fpga and pnut working, may we have a release please? IMHO we all can wait for the additional docs to these instructions if it means we get the code earlier.
While all these extras are nice, let's get on with current testing, and USB and SERDES.
And while USB / SERDES happens, how about we start another thread to discuss nice extra simple instructions - at least with a new thread we don't have to go searching as much.
@Ramon,
Take a look at the PTRA[index] capability to access hub RAM. That may possibly help you do your interleaving depending on what you need to do and if you have time between all the memory transfers and whatever else the code is doing to come up for air so to speak.
Also once synchronized to the hub, you will have 7 other instructions between writes and I believe 5 between reads to modify your pointers if required.
I would suggest that it is better to wait, so that the next release is as close as possible to the final one. The reason is that the C compiler porting could then be started.
The C compiler will increase testing capabilities (due to both more applications and more testers). While this forum is full of experts in P2 assembler, newcomers outside of this forum will say 'ok, I will try it, let's see how easily I can port this to P2, and how it performs'.
(Two weeks ago, when Chip was thinking about how to do preemptive multithreading, David Betz made some good points about how this would work with C, and I think the discussion was greatly enriched. So I think it is important to restart the C compiler development as soon as possible. And it is worth the wait if we can guarantee that the instruction opcode list is the final one.)
Actually, the C compiler work can be started again once we have a stable instruction set encoding even if new instructions end up being added later. What is difficult is when the instruction encoding changes every time there is a release. However, I wouldn't want Chip to be constrained to keep the same encodings if he finds he needs more opcode space and thinks of a way to rearrange things to get that space. What I should do is write a program that parses Chip's instruction list and automatically generates an assembler to match.
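A rough sketch of that idea, purely as an illustration: the line format below (a 32-character bit pattern followed by a mnemonic and operands) and the mnemonics OPA/OPB are hypothetical placeholders, not the layout of Chip's actual instruction list. The point is only that fixed bits in each pattern can be turned into a mask/value table that both an assembler and a disassembler can share.

```python
# Hypothetical list format (an assumption): "<32-char pattern>  <MNEMONIC>  <operands>"
# '0'/'1' are fixed opcode bits, any other character marks a variable field bit.
SAMPLE_LIST = "\n".join([
    "0000000" + "x" * 25 + "  OPA  D,S",   # OPA/OPB are made-up mnemonics
    "0000001" + "x" * 25 + "  OPB  D,S",
])

def parse_line(line):
    """Turn one list line into (mnemonic, mask, value)."""
    pattern, mnemonic = line.split()[:2]
    if len(pattern) != 32:
        raise ValueError(f"expected 32 pattern bits, got {len(pattern)}")
    mask = value = 0
    for ch in pattern:
        mask <<= 1
        value <<= 1
        if ch in "01":          # fixed bit: it is part of the opcode
            mask |= 1
            value |= int(ch)
    return mnemonic, mask, value

def build_table(text):
    """Build {mnemonic: (mask, value)} from the whole list."""
    table = {}
    for line in text.splitlines():
        if line.strip():
            name, mask, value = parse_line(line)
            table[name] = (mask, value)
    return table

def encode(table, mnemonic, fields=0):
    """Combine the fixed opcode bits with caller-supplied field bits."""
    mask, value = table[mnemonic]
    return value | (fields & ~mask & 0xFFFFFFFF)

def decode(table, word):
    """Return the first mnemonic whose fixed bits match, else None."""
    for name, (mask, value) in table.items():
        if word & mask == value:
            return name
    return None

TABLE = build_table(SAMPLE_LIST)
```

With a table like this, an encoding change in a new release would only require regenerating the table from the updated list, rather than editing the assembler by hand.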
Here is an updated Instruction Summary with...
* Column for WZ & WC now has it for each op
* adds new instructions REPD & PICKZC (note PICKZC may not have the correct opcode bits)
* removes instruction SETZC
Tip: to view without wrap, reduce the text size (in IE, Ctrl-Scroll).
After changing to the new instruction set, my editor now reports that these instructions are NOT in use.
Can you check whether that is OK, or whether some of them are missing from the NEW instruction list?
Hi Sapieha,
Here are a few differences I noticed in your list.
SPA,SPB are now called PTRX,PTRY
TJZ,TJNZ are now JZ,JNZ
SETQUAD/Z are now SETWIDE/Z
RDQUAD,WRQUAD are now RDWIDE,WRWIDE
SETCOG is no longer used
SETMUL is no longer used, MUL32 A,B is the new format(2 ops)
ADDSPA is now ADDPTRX
IJZ,IJNZ and delayed variants have been removed
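As a small illustration of how an editor could use this list, here is a minimal sketch that classifies an old name as renamed, removed, or unknown. It covers only the entries spelled out unambiguously in the list above; the function name `classify` and the table layout are made up for the example.

```python
# Old name -> new name, taken only from the list above.
RENAMED = {
    "SPA": "PTRX",
    "SPB": "PTRY",
    "TJZ": "JZ",
    "TJNZ": "JNZ",
    "SETQUAD": "SETWIDE",
    "WRQUAD": "WRWIDE",
    "ADDSPA": "ADDPTRX",
}

# Names the list says are no longer used or have been removed.
REMOVED = {"SETCOG", "IJZ", "IJNZ"}

def classify(name):
    """Return ('renamed', new_name), ('removed', None) or ('unknown', None)."""
    key = name.upper()
    if key in RENAMED:
        return ("renamed", RENAMED[key])
    if key in REMOVED:
        return ("removed", None)
    return ("unknown", None)

if __name__ == "__main__":
    for old in ["tjz", "setcog", "addspa", "mac"]:
        print(old, "->", classify(old))
```

Anything that comes back as "unknown" still needs to be checked by hand against the current instruction list.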
-----------------------------------------------------------
Here are a few differences I noticed in your list.
SETCOG is no longer used
IJZ,IJNZ and delayed variants have been removed
SPA,SPB are now called PTRX,PTRY
SETMUL is no longer used, MUL32 A,B is the new format(2 ops)
TJZ,TJNZ are now JZ,JNZ
SETQUAD/Z are now SETWIDE/Z
RDQUAD,WRQUAD are now RDWIDE,WRWIDE
ADDSPA is now ADDPTRX
movi (= seti) is now SETINST
setporta is now SETPORA
setportb is now SETPORB
setportc is now SETPORC
setportd is now SETPORD
cachex has now become:
DCACHEX
ICACHEX
ICACHEP
ICACHEN
----------------------------------------------------------------
From the Prop2 docs
PUSHX D/# alias for 'WRAUX D/#,PTRX++'
PUSHY D/# alias for 'WRAUXR D/#,PTRY++'
POPX D alias for 'RDAUX D,--PTRX'
POPY D alias for 'RDAUXR D,--PTRY'
PUSHA/POPA are now called PUSHX,POPX
PUSHB/POPB are now called PUSHY,POPY
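To make the PUSHX/POPX aliases above concrete, here is a minimal model of what 'WRAUX D/#,PTRX++' and 'RDAUX D,--PTRX' imply: the push writes and then increments, the pop decrements and then reads. The 256-long aux RAM size and the wrap-around behaviour are assumptions for this sketch, and the PUSHY/POPY pair (which uses WRAUXR/RDAUXR) is not modelled.

```python
AUX_SIZE = 256  # assumed aux RAM size in longs (not stated in the post)

class AuxStack:
    """Toy model of the PTRX stack; names are illustrative only."""

    def __init__(self):
        self.ram = [0] * AUX_SIZE
        self.ptrx = 0

    def pushx(self, value):
        # PUSHX D/#  ==  WRAUX D/#,PTRX++  (write, then post-increment)
        self.ram[self.ptrx] = value & 0xFFFFFFFF
        self.ptrx = (self.ptrx + 1) % AUX_SIZE  # wrap-around is an assumption

    def popx(self):
        # POPX D  ==  RDAUX D,--PTRX  (pre-decrement, then read)
        self.ptrx = (self.ptrx - 1) % AUX_SIZE
        return self.ram[self.ptrx]

stack = AuxStack()
stack.pushx(0x11)
stack.pushx(0x22)
first = stack.popx()   # last value pushed comes back first
second = stack.popx()
```

The post-increment/pre-decrement pairing is what makes PTRX always point at the next free slot.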
--------------------------------------------------------------------------------------------------------
jmptask - parameters were #address,#mask now is #mask,#address
JMPTASK D/#,S/# force PC's in mask D/# to address S/#
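A minimal sketch of the new operand order, assuming four tasks and that bit n of the mask selects task n (neither is stated in the line above, so treat both as assumptions):

```python
NUM_TASKS = 4  # assumed number of hardware tasks

def jmptask(pcs, mask, address):
    """Model of JMPTASK D/#,S/#: D is the mask, S is the address.

    Returns a new list of program counters in which every task whose
    mask bit is set has been forced to `address`.
    """
    if len(pcs) != NUM_TASKS:
        raise ValueError("expected one PC per task")
    return [address if (mask >> task) & 1 else pc
            for task, pc in enumerate(pcs)]

# Force tasks 0 and 2 to address $40, leave tasks 1 and 3 alone.
new_pcs = jmptask([0x00, 0x10, 0x20, 0x30], 0b0101, 0x40)
```

Code written for the old #address,#mask order would silently swap the two operands, which is why this change is worth flagging in an editor.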
--------------------------------------------------------------------------------------------------------
setdivu setdiva setdivb are now the following
To start a 32/32-bit divide, execute one of the following:
DIV32 D/#,S/# - Begin 32/32-bit signed divide of D/# over S/#
DIV32U D/#,S/# - Begin 32/32-bit unsigned divide of D/# over S/#
To start a 64/32-bit divide, first set the denominator:
DIV64D D/# - Set the 32-bit denominator to D/#
Then execute one of the following:
DIV64 D/#,S/# - Set the 64-bit numerator to {S/#,D/#} and begin signed divide
DIV64U D/#,S/# - Set the 64-bit numerator to {S/#,D/#} and begin unsigned divide
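As a worked example of the {S/#,D/#} notation, here is a sketch of the unsigned 64/32 case. It reads the braces as a Verilog-style concatenation, so S supplies the upper 32 bits and D the lower 32 bits; that reading, the returned remainder, and the lack of any result truncation are assumptions, since the text above only describes how the divide is started.

```python
MASK32 = 0xFFFFFFFF

def div64u(d, s, denominator):
    """Model DIV64D + DIV64U: numerator = {S,D}, unsigned divide.

    Returns (quotient, remainder). How the hardware reports results,
    and what it does on divide-by-zero, is not covered here.
    """
    numerator = ((s & MASK32) << 32) | (d & MASK32)
    return divmod(numerator, denominator & MASK32)

# {S,D} = {1, 0} is 2**32; dividing by 2 gives $8000_0000.
quotient, remainder = div64u(d=0, s=1, denominator=2)
```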
--------------------------------------------------------------------------------------------------------
setsqrh setsqrl setsqrt are now the following
SQRT32 D/# - Begin computing square root of 32-bit unsigned D/#
SQRT64 D/#,S/# - Begin computing square root of 64-bit unsigned {S/#,D/#}
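The same operand convention can be sketched for the square roots. This assumes {S/#,D/#} means S is the upper long and D the lower long, and that the result is the integer square root rounded down; the rounding is a guess, not something the text above states.

```python
import math

MASK32 = 0xFFFFFFFF

def sqrt32(d):
    """Model of SQRT32 D/#: integer square root of a 32-bit unsigned value."""
    return math.isqrt(d & MASK32)

def sqrt64(d, s):
    """Model of SQRT64 D/#,S/#: integer square root of the 64-bit {S,D}."""
    value = ((s & MASK32) << 32) | (d & MASK32)
    return math.isqrt(value)

root_small = sqrt32(10)      # floor(sqrt(10))
root_big = sqrt64(d=0, s=1)  # {1, 0} is 2**32
```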
--------------------------------------------------------------------------------------------------------
setx is now SETCOND
Almost all those instructions are still there. They just have different names now.
The ones that aren't in there became something else to provide improved functionality, like FITACCA is now SARACCA.
As things developed, the mnemonic names evolved to reflect more standard ways of expressing ideas. For example SPA/SPB are now PTRX/PTRY, which have some conceptual relation to PTRA/PTRB. Both can do calls and returns now.
Chip, How is the fpga code coming along?
Did you manage to get it to compile to the Cyclone V or did you give up?
Did you get time to look at my verilog code for USB ? jmg fixed some of the syntax and helped with verilog
I didn't know how to express the input pins as pin[s[7:0]] and pin[s[7:0] | $01].
Sapieha & ozpropdev,
There is an opcode map posted by Chip a few pages back - dated 12 March - I posted a summary yesterday that excludes the new RESxx instructions.
Comments
SRCDP D/#
SRCDN D/#
SRCDOFF D/#
They would use the same source register, with increment or decrement.
uses:
- equivalent to using INDA/INDB, but task-unique
- string functions without using INDA/INDB
- summing a vector without using INDA/INDB
Basically, it cuts down on the issue of INDA/INDB not being task specific
I only thought of it because of symmetry.
Then in that case it is true that there is no point to add offset. Also I think that I misunderstood the new instruction. What I actually wanted to do was to add an offset to INDA++, and INDB++ (+2, +3, ... , +8), to be able to synchronize several cogs and write directly into HUB RAM. But also because those pointers are just for cog RAM, they cannot be used the way I wanted.
RESD is still very nice, and will be used a lot.
Is it possible to make MOVBYTE D,S,(1-3)?
In other words, move a byte from S (0 to 7) to D, in the position determined by (1-3).
http://forums.parallax.com/showthread.php/154686-P2-New-Instruction-Ideas-Discussions-and-Requests
May I please put some urgency back into getting out a release. Currently many of us are on hold, anxiously waiting for that release. It has been way too long since the last usable release.
As soon as you have fpga and pnut working, may we have a release please? IMHO we all can wait for the additional docs to these instructions if it means we get the code earlier.
While all these extras are nice, let's get on with current testing, and USB and SERDES.
And while USB / SERDES happens, how about we start another thread to discuss nice extra simple instructions - at least with a new thread we don't have to go searching as much.
Sounds like a plan???
I agree. I'm working on it...
In this Summary you are missing:
SETZC FIXINDB FIXINDS SETINDA SETINDB SETINDS WAITCNT
In your latest Prop2_Instructions_2014_03_12.txt, you are missing the NOP instruction and its bit usage.
Thanks for pointing that out, Sapieha. I'll put it in right now, before I forget.
Thanks
After changing to the new instruction set, my editor now reports that these instructions are NOT in use.
Can you check whether that is OK, or whether some of them are missing from the NEW instruction list?
jmptask tjnz tjz addspa setcog coginit setspa popar
chkspa subspa pushar getspb getspa chkspb setspb
pushb wrquad rdquad rdquadc cachex gettops setquad setquaz
popb addspb getspd chkspd sndser rcvser mac movi jmpretd absneg
seti setx pushx pushy popx popy chkptra chkptrb setqm
setsqrt getpn setporta setportb setportc setportd cashex
tasksw taskswd setmulb setdivu setdiva setdivb setsqrh
setsqrl setmulu setmula fitaccb fitaccs setskip
tjzd tjnzd ijz ijzd ijnz ijnzd
Thanks
I found one more
seti -- new one SETINST
Found some more differences that are identified now
NOT Identified
jmptask jmpretd coginit
pushb popb pushx popx
popar pushy popy pushar
gettops getpn
sndser rcvser mac absneg
chkptra chkptrb setqm
tasksw taskswd
setdivu setdiva setdivb
setsqrh setsqrl setsqrt
setskip fitaccb fitaccs
setx
getspd chkspd
Identified
ijz ijzd ijnz ijnzd
tjz tjnz tjzd tjnzd
setquad setquaz
wrquad rdquad rdquadc
setcog
setmulu
seti
setmulb setmula
setporta setportb setportc setportd
cachex
addspa addspb
subspa subspb
chkspb chkspa
setspa setspb
getspa getspb
Here's some more information.
Thanks.
I will update my previous post with your info.
Copies D[0:1+0-31] into Z,C
Have you seen the current Prop2 Docs (Feb 6 2014)?