I got that there seemed to be a caution there. Not sure why you quoted me to reply to.
If you mean that using an actual processor per pin, making it 10x-100x bigger for no more speed and maybe some extra capabilities, has a future, I doubt it. Configurable FPGA-like hardware would be a better way to deal with configurable I/O functions in a more bloated fashion, and is guaranteed to provide more capabilities. And this is already done on denser SoC offerings.
Cogs do fill this role already to some extent. It's the Prop way, you could say. The typical setup is that some, if not most, Cogs are servicing I/O. The big feature for the Prop2 (over the Prop1) is better low-level I/O hardware (Smartpins and Streamers) to reduce the reliance on bit-bashing the raw I/O data, allowing for improved throughput and greater supervisor/protocol handling roles.
Earlier today I mentioned to a hardened full-range PIC developer what Prop2's Smartpins will offer, and once it sank in what is to be available on every pin he was suitably impressed. Impressed enough to drop everything he's using now? Time will tell. We need a finished product before that.
I quoted your post because of the “No! ... no ... NO! ... Won't be faster nor smaller.” response to Cluso99’s post proposing a discussion of the idea of a specialised cog for each pin. I don’t think anything approaching a full cog for each pin is a good idea either, but a discussion of how to make the pins more flexible and powerful may lead to some improvements for the P3 when the time comes.
Yes, this may need some final 'shake out' numbers on the relative sizes of P2 COG vs P1 COG vs Smart Pin Cell.
Other vendors have done small state machines and very basic wait-move style nano-cores to try to offload servicing from the main CPU.
On P2, there are 16 COGS that can do that, but it may turn out that a good number of COGS are very lightly used, with a few lines of PASM doing the basic wait-move smart pin servicing.
Rather than a P1-COG, which would have differing binary files, a shrink-subset-P2 may be a more natural choice.
E.g., see the threads on P1 ASM -> P2 ASM mapping.
@Chip
DE0-Nano + add-on board flashed and runs OK, so it's just the DE0-Nano bare that doesn't respond.
@Heater
Sorry, trying to do too many things at once at the moment.
After flashing the Nano with the Quartus programmer and power cycling it, you will get a green LED indicating the cog is active.
Pressing Ctrl-G from Pnut will confirm the FPGA has loaded OK.
Because there is no ROM monitor like the old P2-Hot, the only other way to confirm the Nano is alive is to send a command to it via a serial terminal.
@Chip
I suspect the issue with the Nano bare image is related to a pin assignment error or a weak pull-up setting on the DTR/reset pin.
Toggling DTR does not reset the P2 (normally the cog LED will blink).
Sorry about the DE0_Nano_Bare image not working. It may just be a copy of the normal DE0_Nano image. At least, it works with the add-on board. I'll get this fixed in the morning. I did something wrong.
evanh
That wasn't an isolated statement. But you've got my position on this planned Prop3 now anyway.
I'm trying to run the 640x480x8 VGA program in cog 1. Cog 0 does a coginit, and then spins in a loop. If I change the coginit to start cog 0 it works OK, but any other cog fails. Will VGA only work from cog 0? The code is attached.
EDIT: Also, the last 50 or 60 lines are black on the display. Is that what everybody else sees? Why is that?
If you are running a DE0-Nano, you only have 32KB of hub RAM. Also, only one cog. What FPGA board are you using? And, yes, only COG0 connects to the DACs.
I'm using a DE2-115, which has 4 cogs. Is the COG0 DAC connection only on the FPGA, or will the silicon be that way also? And do you know the reason for the blank 50 to 60 lines on the bottom of the screen?
EDIT: I understand the black lines now. The DE2-115 only has 256K of hub RAM, and the BMP is over 300K in size.
EDIT2: The P2 document says that all cogs have 4 fast DACs, so the DACs only on COG0 must be an FPGA board limitation. I think I answered my remaining questions. Thanks
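The hub RAM explanation can be checked with a little arithmetic. This is only a rough sketch: it assumes one byte per pixel for the 640x480x8 mode and ignores the BMP header, the palette, and whatever hub RAM the program itself occupies, so the real number of blank rows will differ somewhat.

```python
# Rough estimate of how many image rows fit in hub RAM.
# Assumption: 8 bits per pixel means one byte per pixel; headers and code are ignored.
WIDTH, HEIGHT = 640, 480
HUB_RAM_BYTES = 256 * 1024     # DE2-115 hub RAM, per the post above

frame_bytes = WIDTH * HEIGHT                   # pixel data for a full frame
rows_that_fit = HUB_RAM_BYTES // WIDTH         # whole rows that fit in hub RAM
rows_missing = max(0, HEIGHT - rows_that_fit)  # rows with no data behind them

print(f"frame needs {frame_bytes} bytes, hub RAM has {HUB_RAM_BYTES}")
print(f"rows that fit: {rows_that_fit}, rows missing: {rows_missing}")
```

The pixel data alone is larger than 256K, so some tens of rows at the end of the image cannot be loaded, which is the same kind of effect as the black band described above.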
Yes, that is an FPGA board limitation, only. And you're right about the 256k.
PTRx expressions with AUGS:
If "##" is used before the index value in a PTRx expression, the assembler will automatically insert an AUGS instruction and assemble the 20-bit index instruction pair:
RDBYTE D,++PTRB[##$12345]
...becomes...
1111 1111000 000 000111000 010010001 AUGS #$00E12345
1111 1010110 001 DDDDDDDDD 101000101 RDBYTE D,#$00E12345 & $1FF
First, why does $12345 become $E12345? Does this assume PTRB=$E00000?
Second, what is meant by the "& $1FF"? Wouldn't trimming the argument to 9 bits kind of defeat the whole purpose of this?
The four bits you identify as $E are the "1SUP" subfield. In this case, S = 1 (PTRB), U = 1 (update PTRx), and P = 0 (pre-modify). The "& $1FF" is just communicating how instructions are encoded with AUGx: the 9 LSBs are encoded in the original instruction (RDBYTE, in this case), and the upper 23 bits are encoded in the AUGx (AUGS, in this case). Because this particular use only involves a 20-bit index, the index itself leaves 12 bits of the AUGS unused (S[22:11]). But, as this is an index (not an address), the "1SUP" goes in S[14:11]. Just like in the non-AUGx version, it immediately precedes the index field.
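To make the split concrete, here is a small Python sketch that rebuilds the numbers from the example. The function names are made up for illustration and are not part of any assembler. The meaning of P = 1 (post-modify) is my assumption; the thread only states that P = 0 is pre-modify.

```python
def ptr_index_immediate(index: int, use_ptrb: bool, update: bool, post: bool) -> int:
    """Build the "1SUP" nibble followed by a 20-bit index (hypothetical helper)."""
    # Leading 1, then S (1 = PTRB), U (1 = update PTRx), P (0 = pre-modify).
    sup = 0b1000 | (int(use_ptrb) << 2) | (int(update) << 1) | int(post)
    return (sup << 20) | (index & 0xFFFFF)


def split_augs(value: int) -> tuple[int, int]:
    """Split a 32-bit immediate into (23-bit AUGS field, 9-bit S field)."""
    value &= 0xFFFFFFFF
    return value >> 9, value & 0x1FF


# RDBYTE D,++PTRB[##$12345]: PTRB, update, pre-modify
imm = ptr_index_immediate(0x12345, use_ptrb=True, update=True, post=False)
augs_field, s_field = split_augs(imm)

print(f"immediate  = ${imm:08X}")
print(f"AUGS field = {augs_field:023b}")
print(f"S field    = {s_field:09b}")
```

The two bit strings printed should line up with the low 23 bits of the AUGS encoding and the last 9 bits of the RDBYTE encoding quoted above.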
Comments
- No instruction changes
- New custom bytecode executor with 6-clock overhead
- SKIPF now behaves like SKIP in hub-exec
All V18 images flashed and run OK (P123 A7 & A9, DE2-115, BeMicro A2 & A9),
except for:
DE0-Nano bare doesn't seem to work (checked on two different boards).
Haven't tested the Nano + add-on board yet (left it at the office).
Do any of the documents say which pins have smarts?
What does "DE0-Nano bare doesn't seem to work" mean?
I can program v18 to my bare nano and the COG 0 LED comes up as I seem to remember it should. Basically it does the same as v17.
However what next?
I have totally forgotten what to do with this since I last tried a P2 config. Where's the P2 on Nano For Idiots guide?
No response using the "> Prop_Chk 0 0 0 0" sequence either.
That is not helping at all.
I don't know what to expect to happen.
I seem to remember an LED coming on to indicate a COG was running. Then I could get some output from it on the Prop Plug and a terminal program.
Not today.
Because there is no ROM monitor like the old P2-Hot, the only other way to confirm the Nano is alive is to send a command to it via a serial terminal:
> Prop_Chk 0 0 0 0 <cr>
The Nano responds with:
Prop_Ver Cu
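If you want to script that check rather than type it into a terminal, the exchange boils down to the sketch below. It only builds the command bytes and inspects a reply. The serial port, baud rate, and the actual read/write calls are not given in this thread, so they are left out, and the helper name is made up for illustration.

```python
# Command text taken from the post above; "\r" stands in for <cr>.
PROP_CHK = b"> Prop_Chk 0 0 0 0\r"


def looks_alive(reply: bytes) -> bool:
    """Return True if the reply contains a Prop_Ver response (hypothetical helper)."""
    return b"Prop_Ver" in reply


# Example with a canned reply instead of a real serial port:
print(looks_alive(b"Prop_Ver Cu\r\n"))
print(looks_alive(b""))
```

Send `PROP_CHK` with whatever serial library you normally use and pass the bytes you read back to `looks_alive`.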
I'll see what I can do with that.
fc de0_nano_bare_prop2_v18.jic de0_nano_prop2_v18.jic
I wonder if the files are just identical. I'm not by my PC now; otherwise, I would just try it, myself. Thanks.
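For anyone not on Windows, the same byte-for-byte check as `fc` can be done with a few lines of Python. A minimal sketch, assuming both .jic files sit in the current directory; the function name is made up for illustration.

```python
import filecmp
from pathlib import Path


def same_contents(a: str, b: str) -> bool:
    """Compare two files byte for byte; shallow=False forces reading the contents."""
    return filecmp.cmp(a, b, shallow=False)


if __name__ == "__main__":
    bare = Path("de0_nano_bare_prop2_v18.jic")
    normal = Path("de0_nano_prop2_v18.jic")
    if bare.exists() and normal.exists():
        print("identical" if same_contents(str(bare), str(normal)) else "different")
    else:
        print("image files not found in the current directory")
```

If it prints "identical", the bare image is just a copy of the normal one.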
"nevermind"... Need to update files to v18.
Whoops... I didn't see your last post. I'll have a new zip file out in the morning. I'm hoping I didn't clobber things at the source level.
Thanks, Ozpropdev.