New FPGA files for next silicon version - 5th/final release - contains new ROM!!
cgracey
Posts: 14,232
5th Release
New ROM with updated SD booter and TAQOZ.
Extra register on each IN signal from pins to guard against metastability.
Fixes r/w glitch during LUT sharing.
Fixes JMP-event-within-REP bug.
'GETCT reg WC' doesn't change C.
This is for anyone who wants to try the next version of silicon, including the new ROM:
https://drive.google.com/file/d/1dOe3JPTZvcKvdE9SDOUSdMqM7BJ8Ixqk/view?usp=sharing
              cogs  smart pins   RAM      Freq    CORDIC  Filename
            +-------------------------------------------------------------------------
Prop123-A9  |  8    0-39,56-63   512k *   80MHz   Yes     Prop123_A9_Prop2_v33k.rbf
BeMicro-A9  |  8    0-39,56-63   512k *   80MHz   Yes     BeMicro_A9_Prop2_v33k.jic **
Prop123-A7  |  4    0-15,62-63   512k     80MHz   Yes     Prop123_A7_Prop2_v33k.rbf
DE2-115     |  4    0-7,60-63    256k     80MHz   Yes     DE2_115_Prop2_v33k.pof

*  Allows loading up to $FFFFF to rewrite ROM.
** I had a file overwrite and I don't think that the SD card pins are mapped properly anymore to P[61:58] on the BeMicro-A9 image.
Here are the differences between the current silicon and these next-silicon FPGA images:
RDLUT and WRLUT now support PTRA/PTRB expressions. This means immediate LUT addresses are limited to $000..$0FF, unless ## is used.
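To make the immediate-address limit concrete, here is a small Python model of when a literal LUT address needs the ## form. The function name and the $1FF upper bound (a 512-long LUT) are assumptions for this sketch; it is not how PNut decides anything internally.

```python
LUT_LAST = 0x1FF        # assumption: LUT RAM spans $000..$1FF (512 longs)
SHORT_IMM_LAST = 0x0FF  # from the post: plain '#' immediates reach $000..$0FF

def lut_immediate_form(addr):
    """Return '#' if addr fits a plain immediate, '##' if it needs the long form."""
    if not 0 <= addr <= LUT_LAST:
        raise ValueError(f"LUT address out of range: ${addr:03X}")
    return "#" if addr <= SHORT_IMM_LAST else "##"

# Show which form each literal address would need.
for a in (0x000, 0x0FF, 0x100, 0x1FF):
    print(f"rdlut x, {lut_immediate_form(a)}${a:03X}")
```

Addresses in a register are unaffected; only literal addresses above $0FF change.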
PTRA/PTRB expressions are now encoded slightly differently to allow wider address ranging. These are used by RDBYTE/RDWORD/RDLONG/WRBYTE/WRWORD/WRLONG/WMLONG, and now also RDLUT/WRLUT. The version of PNut.exe included in the .zip file handles all this. You don't need to do anything. PNut.exe will assemble proper object code from your PASM source code.
The system counter (CT) has been extended to 64 bits. 'GETCT reg WC' returns the top 32 bits of the 64-bit system counter and shields the next instruction from interrupts, so that a time-aligned reading of both halves can be made by following 'GETCT high WC' with 'GETCT low'. Earlier releases also cleared C on 'GETCT reg WC'; as of the 5th release, C is left unchanged.
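Once both halves are in hand, joining them is just a shift and an OR. The Python sketch below shows that arithmetic and how long each width takes to wrap; the function names are made up for illustration, and the 80 MHz default is the FPGA image clock from the table, not the final silicon speed.

```python
def combine_ct(high, low):
    """Join the two 32-bit GETCT results into one 64-bit count."""
    return ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)

def wrap_seconds(bits, clock_hz=80e6):
    """Seconds until a free-running counter of the given width wraps."""
    return (1 << bits) / clock_hz

ct = combine_ct(0x00000001, 0x00000010)
print(hex(ct))                      # 0x100000010
print(round(wrap_seconds(32), 1))   # 53.7 seconds for the low half alone
print(wrap_seconds(64) / (365.25 * 86400))  # thousands of years for all 64 bits
```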
There are two new instructions which set up and read the scope mode: 'SETSCP D/#' and 'GETSCP D'. SETSCP points the scope mux to a set of four pins starting at (D[5:0] AND $3C), with D[6]=1 to enable scope operation. Any time GETSCP is executed, the lower bytes of those four pins' RDPIN values are returned in D. This feature will mainly be useful on the next silicon, as the FPGAs don't have ADC-capable pins.
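As a rough model of the operand decoding described above, the Python sketch below pulls the enable bit and the four-pin base out of a SETSCP value, and packs four RDPIN low bytes the way GETSCP is described as returning them. The function names are hypothetical, and the byte order (lowest-numbered pin in the least significant byte) is an assumption the post does not state.

```python
def decode_setscp(d):
    """Return (enabled, base_pin) for a SETSCP operand, per the description above."""
    enabled = bool((d >> 6) & 1)   # D[6] = 1 enables scope operation
    base_pin = (d & 0x3F) & 0x3C   # D[5:0] AND $3C: base is a multiple of four
    return enabled, base_pin

def pack_getscp(rdpin_values):
    """Pack the low byte of four RDPIN values into one 32-bit result.

    Assumption: the lowest-numbered pin lands in the least significant byte.
    """
    if len(rdpin_values) != 4:
        raise ValueError("expected four RDPIN values")
    result = 0
    for i, value in enumerate(rdpin_values):
        result |= (value & 0xFF) << (8 * i)
    return result

print(decode_setscp(0x40 | 21))                      # (True, 20)
print(hex(pack_getscp([0x111, 0x22, 0x33, 0x44])))   # 0x44332211
```

Note that asking for pin 21 selects the group starting at pin 20, since the low two bits are masked off.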
Lastly, the USB smart pin modes have changed. There used to be four different USB modes, spanning %110xx. USB mode is now %11011, with WXPIN bits 15 and 14 selecting the sub-mode. Bits 13..0 still set the NCO frequency, as before; bits 15 and 14 were previously always '0'. Now, bit 15 = 0 for device mode or 1 for host mode, and bit 14 = 0 for low-speed mode or 1 for full-speed mode.
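The WXPIN layout for the USB mode can be sketched as a pair of Python helpers. The function names are invented for this example; only the bit positions come from the description above.

```python
def usb_wxpin(nco_freq, host, full_speed):
    """Build a WXPIN value: bit 15 = host, bit 14 = full-speed, bits 13..0 = NCO frequency."""
    if not 0 <= nco_freq <= 0x3FFF:
        raise ValueError("NCO frequency must fit in 14 bits")
    return (int(host) << 15) | (int(full_speed) << 14) | nco_freq

def decode_usb_wxpin(value):
    """Split a WXPIN value back into (nco_freq, host, full_speed)."""
    return value & 0x3FFF, bool(value & 0x8000), bool(value & 0x4000)

print(hex(usb_wxpin(0x1234, host=True, full_speed=False)))  # 0x9234
print(decode_usb_wxpin(0x4000))                             # (0, False, True)
```

Old code that wrote only the NCO frequency left bits 15 and 14 at zero, which now reads as low-speed device mode.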
Smart pin modes %1100x are SINC2/SINC3/raw ADC modes, while smart pin mode %11010 is Scope mode. These aren't very useful until the next silicon exists, so there's no need to elaborate, yet.
I think those are the only changes.
Wait, better review this list of changes, too, since all instructions that affect bits can now affect a RANGE of bits. You'll need to make sure you're not inadvertently affecting more than one bit, unless you intend to:
https://forums.parallax.com/discussion/169282/list-of-changes-in-next-p2-silicon/p1
The .spin2 files in the .zip have all been modified to take advantage, where possible, of the new bit/pin-range operations.
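To see why stray upper bits in a bit-number operand now matter, here is a Python model of which bits one bit-range instruction would touch. The field layout is an assumption based on my reading of the linked list of changes (base bit in operand bits 4..0, number of additional bits in bits 9..5), and the helper names are hypothetical. Ranges that run past bit 31 are not modeled.

```python
def bit_operand(base, extra=0):
    """Build a bit-range operand (assumed layout: extra count in [9:5], base bit in [4:0])."""
    if not 0 <= base <= 31 or not 0 <= extra <= 31:
        raise ValueError("base and extra must each be 0..31")
    return (extra << 5) | base

def affected_mask(operand):
    """Return the mask of bits that the operand would select."""
    base = operand & 0x1F
    extra = (operand >> 5) & 0x1F
    if base + extra > 31:
        raise ValueError("ranges past bit 31 are not modeled in this sketch")
    return ((1 << (extra + 1)) - 1) << base

print(hex(affected_mask(3)))                  # 0x8: a single bit, as before
print(hex(affected_mask(bit_operand(4, 3))))  # 0xf0: four bits in one instruction
print(hex(affected_mask(0x23)))               # 0x18: old code passing 35 now hits two bits
```

The last line is the case to watch for: a value that used to be truncated to a bit number now selects a range.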
Comments
My ROM Booter has not changed, at all, so you can use my prior code when putting the whole image together, which includes your code.
We are going to verify through simulation that the race condition is gone, so there's no need for me to change my code around to handle DIR differently.
It's critical, of course, that this latest version of PNut be used to assemble your programs, so that PTRx expressions are assembled correctly.
The mechanism for overwriting the ROM is in place, as during the last development period.
I'm hoping we can get this together in the next few days. And Thanks!!!
Have you set the dual-port SRAM parameter, READ_DURING_WRITE_MODE_MIXED_PORTS? See https://forums.parallax.com/discussion/comment/1462814/#Comment_1462814
very nice to have
this is just cool. Thank you very much
And now you got me worried, must read...
Enjoy!
Mike
Flashed all 4 images of "33g" to relevant FPGA boards.
All running Ok, will throw some more code at them tomorrow.
Wasn't the last proposal for the PTRx encoding backward compatible? It'd be really nice if we didn't have to have separate sets of tools for the P2ES and the next chip.
Yes, my idea (B2) for binary compatibility including the Verilog change is in the first post on this page:
http://forums.parallax.com/discussion/169243/rdlut-wrlut-with-auto-incrementing-address/p5
If implemented, an index of -16..+15 would encode the same in rev B as rev A.
Does hardware handle the case where the lower 32 bits wrap after reading the upper 32 bits? If software has to handle it we would need to do something like this:
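One common software approach, sketched here as a Python model rather than PASM, is to read high, low, then high again, and retry if the high half changed in between. FakeCounter is a hypothetical stand-in for the system counter that advances on every read; none of these names come from the thread.

```python
def read_ct64(read_high, read_low):
    """Read a 64-bit count from two 32-bit halves, retrying if the high half moved."""
    while True:
        hi1 = read_high()
        lo = read_low()
        hi2 = read_high()
        if hi1 == hi2:   # no rollover of the low half between the reads
            return (hi1 << 32) | lo

class FakeCounter:
    """Hypothetical stand-in for the system counter; advances on every read."""
    def __init__(self, start, step=2):
        self.value = start
        self.step = step

    def _advance(self):
        self.value = (self.value + self.step) & 0xFFFFFFFFFFFFFFFF
        return self.value

    def read_high(self):
        return self._advance() >> 32

    def read_low(self):
        return self._advance() & 0xFFFFFFFF

# Start just below a rollover of the low half so the first attempt is rejected.
ct = FakeCounter(start=0xFFFFFFFD)
print(hex(read_ct64(ct.read_high, ct.read_low)))  # 0x100000007
```

Without the second high read, the first attempt would have paired a high value of 0 with a low value of 1, which is off by a full 32-bit wrap.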
EDIT: I'm guessing that the high cycles look ahead by 2 cycles or the low cycles are delayed by 2 cycles so they are in sync, correct?
So there would be a time difference of 2 cycles if I read low immediately after reading high versus reading low by itself.
@Mark_T, thanks for correcting my code. That's what I get for trying to write code early in the morning.
Here's the original discussion, which went increasingly off-topic in the later pages:
https://forums.parallax.com/discussion/169267/cnt-extension-to-64-bit/p1
I think Chip has implemented GETCT slightly differently now but the method of sync'ing high and low counts is probably the same.
A suggestion: could 'wc' for high count copy CT[32] to C instead of clearing it?
It would be better to not change C at all, but it's likely more silicon, and not intuitive unless a pseudo-opcode such as GETCTH D were used.
Usually, latched opcodes grab both fields on the first opcode, and the second opcode merely reads the stored value.
There should be no rollover handling needed, and the only time difference should be that the 64b code is one opcode larger/slower than the 32b code, but the capture instant should not move.
Usually such details are hidden from the user, so it appears like a 'seamless 64b counter'.
E.g., on P2, a true 64b counter would run too slow, so the actual code will generate a terminal count on -1, which clock-enables the second 32b counter.
When running, all bits roll over to 0000 on the same clock.
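The cascaded arrangement described here can be modeled in a few lines of Python. This is only a behavioral sketch with made-up names, not the P2 Verilog: the low counter's all-ones state acts as the terminal count that enables the high counter on the same clock edge.

```python
MASK32 = 0xFFFFFFFF

class CascadedCounter:
    """Two 32-bit counters; the high half counts only when the low half is at -1."""
    def __init__(self, low=0, high=0):
        self.low = low & MASK32
        self.high = high & MASK32

    def tick(self):
        carry = self.low == MASK32           # terminal count: low half is all ones
        self.low = (self.low + 1) & MASK32   # low half always counts
        if carry:                            # high half is clock-enabled by the carry
            self.high = (self.high + 1) & MASK32

    @property
    def value(self):
        return (self.high << 32) | self.low

c = CascadedCounter(low=MASK32 - 1)
c.tick()
c.tick()
print(hex(c.value))   # 0x100000000: both halves changed on the same tick

full = CascadedCounter(low=MASK32, high=MASK32)
full.tick()
print(full.value)     # 0: every bit rolled over together
```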
It seems that using wc is the easiest way to read the high count. Clearing C is not the only possible option and I suggested an alternative but here's another one: copy C to C!
IMHO it would be better if the C flag was not changed as this allows user code to keep C unchanged. But that may be a few gates extra.
I'd be happy with this as standard baggage given there are already many instruction pairings that do this. AUGx/ALTx/SETQ come to mind.
Here is a program I made to verify rollover behavior on the next silicon:
I forgot!
I just talked to Wendy at ON Semi about this, though, and she is looking into what we must do to ensure that random data is not returned on a READ during a simultaneous write to the same location from the other port. She is going to call me back soon about this. If it's doable, I'll update the FPGA images, accordingly.
Thanks for bringing this up!!!
The new bit/pin-field capability lets you condense a series of bit/pin operations into one two-clock instruction. Keeps code small and fast.
Thanks, Brian!
I implemented what was simplest in logic, because the PTRx computation circuitry was near critical-path and I didn't want to possibly slow things down.
I need to document what the new scheme is, though you can run PNut and see the output for different expressions. There's not a whole lot to it, and I don't know exactly where it breaks compatibility. Need to look at it.
Yes! It's in there.
It COULD have, but I already sent the code off to ON Semi. If we wind up doing a bug fix because one of you guys detects a problem, I will make C=CT[32].
Ah, I could have had it make C = current C. That would have been so simple, and better.
Yes! I've made a note in the source to make that change if we submit more code, due to a bug or timing fix.