P2 Tricks, Traps & Differences between P1 (Reference Material Only)
Cluso99
Please keep this thread as a reference
Any corrections, additions and discussions can be done in the thread linked here...
forums.parallax.com/discussion/167812/p2-tricks-traps-differences-between-p1-discussion/p1?new=1
Update 2022-06-03: ROM code and docs (hard to find, so added links here)
ROM Code Listing (Rev 1 & 2)
forums.parallax.com/discussion/comment/1480216/#Comment_1480216
SD Boot Code (Rev 2)
forums.parallax.com/discussion/170637/p2-sd-boot-code-rev-2-silicon
Monitor/Debugger (Rev2)
forums.parallax.com/discussion/170638/p2-rom-monitor-debugger-rev-1-rev-2-silicon
Comments
In P1...
LOCKSET D WC returns C=previous lock setting ie C=1=previously set
In P2
LOCKTRY D WC returns C=1=obtained lock. Note the reversed result.
This is opposite to P1, and also differs from
LOCKNEW D WC returns C=1=no lock available
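The reversed sense of the C flag can be sketched with a small Python model. This is only an illustration of the flag meanings stated above, not P2 code: the `Lock` class and its method names are hypothetical, and cog ownership and LOCKNEW are not modelled.

```python
class Lock:
    """Toy model of a single lock bit (hypothetical helper, not cycle-accurate)."""

    def __init__(self):
        self.held = False  # True once some cog has taken the lock

    def p1_lockset(self):
        """P1 LOCKSET D WC: C = previous lock state (C=1 means it was already set)."""
        previous = self.held
        self.held = True
        return int(previous)

    def p2_locktry(self):
        """P2 LOCKTRY D WC: C = 1 means the lock was obtained."""
        if self.held:
            return 0  # held elsewhere, not obtained
        self.held = True
        return 1


p1_lock = Lock()
p2_lock = Lock()
print("P1 first LOCKSET, C =", p1_lock.p1_lockset())  # lock was free
print("P2 first LOCKTRY, C =", p2_lock.p2_locktry())  # lock was free
```

In both cases the lock was free and has now been taken, yet the two C results are opposite, so a P1 retry loop that spins while C=1 must spin while C=0 when ported to LOCKTRY.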
Here is the definitive example from Chip
https://forums.parallax.com/discussion/comment/1438380/#Comment_1438380
Here is a solution to skipping instructions without using condition codes... Acknowledgement to PeterJ
I noticed this very smart coding trick Chip used in checking for a hex character (0..9, A..F, a..f).
Code and table must be in cog.
I also found a shorter way to convert it to a binary nibble. The add x,#9 converts "A..F" to "J..O" and "a..f" to "j..o". "J" is ascii $4A and "j" is ascii $6A.
Then we strip off the upper nibble from both the "0..9" and "J..O"/"j..o", leaving $00..$0F.
It would be shorter code to first check for ASCII $80 and above and then use a 4 long table rather than the 8 long table. This would save 2 longs.
However, Chip is looking for the fastest code.
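As a rough illustration of the two ideas above (a bit table indexed by character code, then the add-9-and-mask conversion), here is a Python model. It is a sketch of the arithmetic only, not Chip's cog code; the names `build_table`, `is_hex` and `hex_nibble` are made up for this example.

```python
HEX_CHARS = "0123456789ABCDEFabcdef"


def build_table():
    """Return 8 longs (256 bits); bit c is set when chr(c) is a hex digit."""
    table = [0] * 8
    for ch in HEX_CHARS:
        c = ord(ch)
        table[c >> 5] |= 1 << (c & 31)
    return table


TABLE = build_table()


def is_hex(c):
    """Test bit c of the table; c is a byte value 0..255."""
    return bool((TABLE[c >> 5] >> (c & 31)) & 1)


def hex_nibble(c):
    """Convert an already validated hex character code to 0..15."""
    if c & 0x40:   # letters are $41..$46 and $61..$66, digits are $30..$39
        c += 9     # "A".."F" -> "J".."O", "a".."f" -> "j".."o"
    return c & 0x0F  # strip the upper nibble


for ch in "09AFaf":
    print(ch, is_hex(ord(ch)), hex_nibble(ord(ch)))
```

The upper four longs of the table are all zero, which is why a 4 long table suffices once codes of $80 and above have been rejected separately.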
The P1 ignores lowest hub address bit(s) for non-aligned hub reads and writes.
The P2 correctly accesses non-aligned word and long reads and writes.
In the P1, there were tricks associated with the lower hub address bit(s) being ignored. This code will fail on P2.
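The difference can be modelled in a few lines of Python. This is an illustrative sketch only: `hub` is a made-up 16-byte memory image and the two helper functions are hypothetical names, with both chips treated as little-endian.

```python
hub = bytes(range(0x10, 0x20))  # 16 bytes of example data: $10, $11, ... $1F


def rdlong_p1(addr):
    """P1 model: the two lowest address bits are ignored for a long read."""
    base = addr & ~3
    return int.from_bytes(hub[base:base + 4], "little")


def rdlong_p2(addr):
    """P2 model: the long is read from the exact byte address given."""
    return int.from_bytes(hub[addr:addr + 4], "little")


addr = 5  # deliberately not long-aligned
print(hex(rdlong_p1(addr)))  # reads bytes 4..7
print(hex(rdlong_p2(addr)))  # reads bytes 5..8
```

Any P1 code that stores flags in the low address bits and relies on the hub ignoring them reads different data under the P2 model.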
In P1 we did this...
In P2 we do this...
And here is the trap: our P1 code incorrectly converted to P2. Note that we need to include the addct1 instruction in the loop, because the wait instruction (unlike the P1 waitcnt) no longer adds the delta to the target count.
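The bookkeeping difference can be sketched in Python. This model only tracks which target value each wait uses; it is not cycle-accurate, the function names are hypothetical, and it assumes the target is set up once before the loop.

```python
MASK32 = 0xFFFFFFFF


def p1_loop(start, delta, n):
    """P1: WAITCNT target, delta wakes at target and then adds delta to it."""
    targets, target = [], start
    for _ in range(n):
        targets.append(target)               # wake at the current target
        target = (target + delta) & MASK32   # waitcnt itself advances it
    return targets


def p2_loop(start, delta, n, addct1_in_loop=True):
    """P2: the wait leaves the target alone; ADDCT1 must advance it."""
    targets, target = [], start
    for _ in range(n):
        targets.append(target)               # the wait uses the armed target
        if addct1_in_loop:
            target = (target + delta) & MASK32
    return targets


print(p1_loop(1000, 500, 3))
print(p2_loop(1000, 500, 3))
print(p2_loop(1000, 500, 3, addct1_in_loop=False))  # the trap: target never moves
```

With the ADDCT1 left out of the loop the target stays at its first value, so the loop cannot keep the intended period.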
WAITX D/#
To simply wait n+2 clocks, an alternative is the WAITX instruction.
This should work. Just fill in the values and the values will be calculated for you...
(Samples shown for P2-EVAL & P2D2 boards with 150MHz & 148.5MHz selected respectively.)
Oops. Fixed this: _SETFREQ = 1<<24 +.. (was 1<<25)
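For readers who want to check the arithmetic, here is a Python sketch of this kind of calculation. Only the `1<<24` term comes from the text above; the remaining field positions (divider at bit 18, multiplier at bit 8, post-divider code at bit 4, oscillator mode at bit 2, clock select in bits 1..0) are my understanding of the clock mode layout and should be checked against the P2 documentation. The function name and the example values (20 MHz crystal, divide by 2, multiply by 15) are assumptions chosen to give 150 MHz.

```python
def clock_settings(xtal_hz, xdiv, xmul, xdivp, xosc=0b10, xsel=0b11):
    """Return (clock_hz, setfreq, enafreq) for the given PLL parameters.

    Field layout is an assumption, see the note above.
    xdivp is the PLL post-divider: 1, 2, 4, 6, ...
    """
    pppp = ((xdivp >> 1) + 15) & 0xF          # 1 -> %1111, 2 -> %0000, 4 -> %0001
    clock_hz = xtal_hz // xdiv * xmul // xdivp
    setfreq = ((1 << 24)                      # PLL enable (the corrected bit)
               | ((xdiv - 1) << 18)
               | ((xmul - 1) << 8)
               | (pppp << 4)
               | (xosc << 2))
    enafreq = setfreq | xsel                  # same value with the clock source selected
    return clock_hz, setfreq, enafreq


hz, setfreq, enafreq = clock_settings(20_000_000, xdiv=2, xmul=15, xdivp=1)
print(hz, hex(setfreq), hex(enafreq))
```

Returning the value both without and with the select bits reflects the two-step use suggested by the name _SETFREQ: configure the oscillator and PLL first, then switch over to it.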
Updated: 22 Feb 2019
(Source - https://forums.parallax.com/discussion/comment/1392310/#Comment_1392310)
Mode list:
Digital I/O (default)
- inversion
- clocking (registered)
- out strength
- feedback select
- logic in / schmitt in / comparator in
COMP_DAC
- preset level
- pinA/B in select
- feedback select (1500 Ohm out strength)
ADC_MODE
- in/gain select (pinB is disabled in revC silicon)
DAC_MODE
- strength
- preset level
- COG_DAC
- - specify cog
- - cog/streamer provides level
- SMART_DAC
- - associated smartpin provides level
- BIT_DAC
- - preset split level
Updated 20-11-2018 to correct smartpin mode number for capturing ADC bitstream.
25-11-2018: Reverse the update. Doh!
26-12-2018: Typo, R -> % in BIT_DAC mode line. And stated for engineering sample.
26-3-2019: Add link to source "pin_modes.png" sheet.
3-7-2019: Add "0" and "5" bit position indicators for faster constant crafting.
17-11-2019: Add changed revB silicon BIT_DAC mode. Mark DAC_MODE and its sub-modes as registered I/O.
1-12-2019: Add short mode list.
4-4-2020: RevC update.
BMASK D,#11 'D = $FFF
BMASK D,#15 'D = $FFFF
BMASK D,#30 'D = $7FFF_FFFF
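The three results above follow one pattern: a mask of n+1 low bits. A Python one-liner reproduces them (the function name `bmask` is just for this sketch, and only the low 5 bits of n are used):

```python
def bmask(n):
    """Model of BMASK D,#n: set bits 0..n of a 32-bit result."""
    return ((2 << (n & 31)) - 1) & 0xFFFFFFFF


for n in (11, 15, 30):
    print(n, hex(bmask(n)))
```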
So here is another way of expressing this... adjusted per Chip's following suggestion
Here is a section of code to record 'sample_length' of P0-P31 or P32-P63 using the streamer into HUB RAM at 'samples'...
NOTE: When specifying a numerical immediate in place of the label, it is treated as an absolute address and the same rules apply, ie: it will be encoded as PC-relative if possible, as per above. However, "@" produces an error with Pnut but not with p2asm. p2asm will perform a left shift by 2.
Test code added: 2019-4-11
UPDATES:
2019-4-12: Bug is fixed in fastspin v3.9.25 - https://github.com/totalspectrum/spin2cpp
2019-4-13: Bug is fixed in p2asm v0.015 - https://github.com/davehein/p2gcc
2020-7-05: Bug is fixed in Pnut v34 (5 Feb 2020) - http://forums.parallax.com/discussion/comment/1488890/#Comment_1488890
For anyone new reading, all register-level instructions affect the Z flag as you would expect, where it gets set if the result was zero.
For the bit-level instructions, on the other hand, C and Z are both read and written directly, without semantic consideration for what Z usually means.
The known one is the last instruction within the REP block cannot be a branching instruction. If a branching instruction is used then the branch distance will have the block length subtracted from it. It won't be remedied, so just don't do it.
Any intentionally cancelling branches must be at least one instruction back from the end of the REP block.
UPDATE: This flaw only affects relative branching! Absolute immediate and register direct can work as usual with the last instruction of a REP block. Alternatively, a compensation (block length added) applied to the relative branch also works. Further reading - https://forums.parallax.com/discussion/comment/1488366/#Comment_1488366
The until recently unknown one is with the Jxxx branch-on-event instructions. Using any of these instructions anywhere within the REP block will intermittently fail to branch upon event. When the failure occurs, execution will continue out the end of the REP block and even fail to take the first following branch.
This one is expected to be fixed. Example code that accommodates it, plus an alternative, is here - https://forums.parallax.com/discussion/comment/1458393/#Comment_1458393
UPDATE: This one was fixed with v33 FPGA files, rev B silicon.
Some extra detail - https://forums.parallax.com/discussion/comment/1461458/#Comment_1461458
and https://forums.parallax.com/discussion/169438/rep-blocks-and-branching-issue/p1
There's a hardware flaw with shared lutRAM. A simultaneous read and write will glitch the read data. The glitch patterns change slightly with each run but the alignment doesn't. Same for different sys clock rates.
v32i FPGA image on P123 board @ 20 MHz
RevA P2ES board @ 20 MHz
EDIT: Updated with extra tests. Thanks Brian.
EDIT2: Added that revB has this fixed.
You've managed to add a fourth comment now!
It seems the forum software does not actually allow that, so I've done the next best thing..... ♪
much harder to spot and can lead you (well me) running around in circles trying to track down very strange behaviours
that seem to defy logic:
Of course the ORG should be after the random long data, otherwise all the addressing is thrown, but
the effect is to omit the leading instructions in called routines (jmps are relative and tend to work),
which can have subtle and bizarre consequences.
On the P1 all branches get affected, and things tend to fall over completely rather than limp along misbehaving.
It would be nice to get an assembler warning if the first entry after ORG 0 is not an instruction.
P2 will be a little different to P1 in that you can start the cog at other than address COG $0, and also because we don't really have a NOP opcode.
Postedit
In P1 it was a common trick to use
LONG <somevalue>
where <somevalue> fits in 18 bits or fewer
This meant that the conditional bits were EEEE=0000, so the instruction would be treated as a NOP.
In P2 the EEEE=0000 is a valid instruction with the _RET_ condition. The instruction executes, and if not a taken branch it will also perform a RET instruction.
Therefore, any value other than all zeros will execute some instruction.
Thus
LONG <any-non-zero-value>
will NOT BE TREATED AS A NOP in P2 !!!
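A small Python model makes the trap concrete. It assumes the condition field sits in bits 21..18 of a P1 instruction and in bits 31..28 of a P2 instruction; the function names and returned strings are made up for this sketch.

```python
def p1_is_nop(value):
    """P1: condition bits 21..18 all zero means the instruction never executes."""
    return ((value >> 18) & 0xF) == 0


def p2_behaviour(value):
    """P2: only an all-zero long is a NOP; EEEE=0000 otherwise means _RET_."""
    if value == 0:
        return "NOP"
    if ((value >> 28) & 0xF) == 0:
        return "executes with _RET_"
    return "executes with its condition code"


data = 0x0003_FFFF  # an 18-bit data value placed in cog code
print(p1_is_nop(data))     # P1 treats it as a NOP
print(p2_behaviour(data))  # P2 executes it and then returns
```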