Prop2 FPGA files!!! - Updated 2 June 2018 - Final Version 32i

jmg · 2017-04-23 22:50

Roy Eltham wrote: »

Warnings for valid code is just wrong.

True enough, but that's not quite what Rayman is asking for.

Warnings for typos, or common misuse, are not really warnings for valid code.
They are the tools helping the programmer.

Here is a real example, showing how another assembler manages this:

; for 8051
        MOV    A,@R0
        MOV    A,@R3
        MOV     A,#dR0_rb0
        MOV     A,#R0

This may appear fine in all cases, since R3 and R0 are valid names, but assembling it gives two error messages:

   617:    0131  E6                             MOV    A,@R0
   618:    0132  E6                             MOV    A,@R3
                                                            ^
(Test.asm 221,21) ERROR: Invalid index register
   619:    0133  74 00                          MOV     A,#dR0_rb0
   620:    0135  74 00                          MOV     A,#R0
                                                             ^
(Test.asm 223,22) ERROR: Error in expression

Dave Hein · 2017-04-23 23:52

With a little bit of smarts the compiler could figure out if the target address was intended to be a pointer or an instruction. It is unlikely that the programmer would intentionally do a jump without a "#" to a location that contained an instruction, and it's also not likely that one would jump to a location using a "#" that did not contain an instruction. Consider this code:

        jmp #label1
....
        jmp label2
...
        jmp label3
...
        jmp #label4
....
label1  mov x, y
label2  add x,#3
...
label3  long    label1
label4  long    label2

A smart compiler would issue a warning for "jmp label2" and "jmp #label4". Of course, it's always possible, though unlikely, that somebody did want to do something tricky and doesn't want the compiler to issue warnings. So we would need a way to disable warnings, such as a line that reads "'COMPILER WARNINGS OFF", and to turn warnings back on you would have a line reading "'COMPILER WARNINGS ON".
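The heuristic described above can be sketched in a few lines of Python. This is only an illustration and not how PNut or any existing tool works: the function names, the set of data directives, and the assumption that unlabelled statements are indented are all made up for the example.

```python
import re

# Hypothetical mini-linter; names and rules are illustrative, not PNut behaviour.
DATA_DIRECTIVES = {"long", "word", "byte", "res"}

# Optional label in column 0, then whitespace, an opcode, and the operands.
LINE_RE = re.compile(r"^(?P<label>[A-Za-z_]\w*)?\s+(?P<op>\w+)\s*(?P<args>.*)$")


def classify_labels(lines):
    """Map each label to 'data' or 'code' based on the statement it labels."""
    kinds = {}
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("label"):
            op = m.group("op").lower()
            kinds[m.group("label")] = "data" if op in DATA_DIRECTIVES else "code"
    return kinds


def check_jumps(lines):
    """Return (line_number, message) pairs for suspicious jmp operands."""
    kinds = classify_labels(lines)
    warnings = []
    for num, line in enumerate(lines, start=1):
        m = LINE_RE.match(line)
        if not m or m.group("op").lower() != "jmp":
            continue
        # Ignore anything after a ' comment marker.
        target = m.group("args").split("'")[0].strip()
        immediate = target.startswith("#")
        name = target.lstrip("#")
        kind = kinds.get(name)
        if kind == "code" and not immediate:
            warnings.append((num, f"jmp {name}: target is an instruction, missing '#'?"))
        elif kind == "data" and immediate:
            warnings.append((num, f"jmp #{name}: target is data, unexpected '#'?"))
    return warnings
```

Fed the eight statements from the listing above, this flags only "jmp label2" and "jmp #label4", which matches the two warnings suggested in the post.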

jmg · 2017-04-24 00:18

Dave Hein wrote: »
With a little bit of smarts the compiler could figure out if the target address was intended to be a pointer or an instruction. It is unlikely that the programmer would intentionally do a jump without a "#" to a location that contained an instruction, and it's also not likely that one would jump to a location using a "#" that did not contain an instruction. Consider this code:
        jmp #label1
....
        jmp label2
...
        jmp label3
...
        jmp #label4
....
label1  mov x, y
label2  add x,#3
...
label3  long    label1
label4  long    label2
A smart compiler would issue a warning for "jmp label2" and "jmp #label4". Of course, it's always possible, though unlikely, that somebody did want to do something tricky and doesn't want the compiler to issue warnings. So we would need a way to disable warnings, such as a line that reads "'COMPILER WARNINGS OFF", and to turn warnings back on you would have a line reading "'COMPILER WARNINGS ON".

Yup, though some more care would be needed for the LUT and HUB cases, so something like 'reg' or 'regseg' (or similar) could indicate a LONG in COG, i.e. a valid register,
e.g.

        jmp #label1
....
        jmp label2
...
        jmp label3
...
        jmp #label4
....
label1  mov x, y
label2  add x,#3
...
'~~~COG space~~~
label3  reg    label1   ' fine to use as indirect
label4  reg    label2
'~~~LUT space~~~
label5  long    label1  ' valid code, but do not try to use as indirect
label6  long    DataValue ' more common code
'~~~HUB space~~~
label7  long    label1  ' valid code, but do not try to use as indirect
label8  long    DataValue

Placing labeled initialized tables in LUT and HUB will be fine, but using those labels as indirect pointers will fail.
Reg above will give an error if copied/pasted into the wrong code location, and will help the tools know that reg marks a valid indirect usage.
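As a rough sketch of what such a check might look like, here is a small Python model. The 'reg' directive is only a proposal in this post, and the tuple layout and function names below are assumptions made for the example.

```python
def check_reg_placement(statements):
    """statements: (space, label, directive) tuples, with space in COG/LUT/HUB.

    Returns an error string for every hypothetical 'reg' found outside COG.
    """
    errors = []
    for space, label, directive in statements:
        if directive == "reg" and space != "COG":
            errors.append(f"{label}: 'reg' is only valid in COG space, found in {space}")
    return errors


def indirect_ok(statements, label):
    """True only if the label names a 'reg' declared in COG space."""
    for space, name, directive in statements:
        if name == label:
            return space == "COG" and directive == "reg"
    return False


# The labels from the listing above, tagged with the space they live in.
example = [
    ("COG", "label3", "reg"),
    ("COG", "label4", "reg"),
    ("LUT", "label5", "long"),
    ("LUT", "label6", "long"),
    ("HUB", "label7", "long"),
    ("HUB", "label8", "long"),
]
```

With this model, indirect_ok(example, "label3") holds, while label5 and label7 are rejected as indirect targets even though the tables themselves are valid.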

Dave Hein · 2017-04-24 00:42

PNut already knows that valid registers are between $00 and $1ff. It will issue an error if the value is outside of that range.

I've encountered the issue of trying to treat hub memory as registers in my little Prop2 GCC project. I generate assembly from C using the P1 GCC compiler. This uses the cog memory model, which assumes everything is in cog memory. However, I'm trying to execute it on the P2 from hub memory using hub exec. So any instructions that reference hub memory as a register are converted to a "RDLONG temp, label" followed by the instruction using the temp register instead of the label.
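The conversion described above might be modelled like this in Python. It is a simplified sketch and not the actual converter: instructions are plain (opcode, dest, src) tuples, only the source operand is handled, and the names rewrite_hub_operands and temp are assumptions.

```python
def rewrite_hub_operands(instr, hub_labels, temp="temp"):
    """Replace a hub-label source operand with a load into a temp register.

    instr is an (opcode, dest, src) tuple. If src names a hub label, emit
    a RDLONG into temp first, then the original instruction using temp.
    """
    op, dest, src = instr
    if src in hub_labels:
        return [("rdlong", temp, src), (op, dest, temp)]
    return [instr]


def rewrite_program(program, hub_labels):
    """Apply the rewrite to every instruction, preserving order."""
    out = []
    for instr in program:
        out.extend(rewrite_hub_operands(instr, hub_labels))
    return out
```

For example, ("add", "r0", "counter") with counter in hub memory becomes a "rdlong temp, counter" followed by "add r0, temp", while instructions that only touch cog registers pass through unchanged. A destination operand in hub memory would need a matching write-back, which this sketch leaves out.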

Cluso99 · 2017-04-24 00:56

Rayman wrote: »

I just wish you'd get a warning if the "#" is missing...
Seems that half my bugs turn out to be a missing #.

I think jmg is right and that direct calls are much more common.
If I were doing this from scratch, definitely drop the #...
Anyway, best if the normal usage is the simplest...

bst has an option to give a warning. Saved my bacon many times!

When I turned this option on from a working ZiCog, found a few bugs that had not shown their ugly heads.
Of course, there were a few warnings on real code, so I put in a special comment, "indirect jump", and then I could ignore those flagged warnings.
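A tool could automate that last step by dropping warnings on lines that carry the marker comment. The sketch below assumes warnings are (line_number, message) pairs and that comments start with a single quote; it does not describe how bst itself behaves.

```python
def filter_warnings(lines, warnings, marker="indirect jump"):
    """Drop warnings whose source line has the marker inside a ' comment.

    lines is the source as a list of strings; warnings holds 1-based
    (line_number, message) pairs.
    """
    kept = []
    for num, message in warnings:
        # Everything after the first ' on the line is treated as a comment.
        _, _, comment = lines[num - 1].partition("'")
        if marker not in comment.lower():
            kept.append((num, message))
    return kept
```

The marker only counts when it appears in the comment part of the line, so a label that happens to contain the same words does not silence a warning.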

jmg · 2017-04-24 00:56

Dave Hein wrote: »

PNut already knows that valid registers are between $00 and $1ff. It will issue an error if the value is outside of that range.

True, but I'm also thinking ahead a little to flows that use a Linker, as has been mentioned before for P2.
There, it can be useful to separate, at source level, those VARs that must be placed in COG from those that could be placed in COG, LUT, or HUB.
HUB reads come with a speed cost, but tables in HUB will likely be used to free up COG/LUT code space for a net gain in speed.
It also makes edit/merge harvesting of code examples easier.

Dave Hein · 2017-04-24 17:35

Does anybody know what the letter designations are for the various boards in the Prop_Ver response to Prop_Chk? From what I can tell, "A" is the Prop123-A9, "B" is the DE2-115 and "C" is the DE0-Nano. What are the letters for the BeMicro-A9, Prop123-A7 and BeMicro-A2?

cgracey · 2017-04-25 00:26

Dave Hein wrote: »

Does anybody know what the letter designations are for the various boards in the Prop_Ver response to Prop_Chk? From what I can tell, "A" is the Prop123-A9, "B" is the DE2-115 and "C" is the DE0-Nano. What are the letters for the BeMicro-A9, Prop123-A7 and BeMicro-A2?

From the ROM_Booter.spin2 file:

	ver		=	"A"		'Prop123-A9 / BeMicro-A9
'	ver		=	"B"		'DE2-115
'	ver		=	"C"		'DE0-Nano
'	ver		=	"D"		'BeMicro-A2
'	ver		=	"E"		'Prop123-A7
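For host-side tools, the same table can be kept as a lookup. This Python sketch only restates the letters listed above; the names PROP_VER_BOARDS and boards_for are made up, and extracting the letter from the Prop_Ver response is left out because the response format is not shown here.

```python
# Version letter -> board(s), taken from the ROM_Booter.spin2 comments above.
PROP_VER_BOARDS = {
    "A": ("Prop123-A9", "BeMicro-A9"),
    "B": ("DE2-115",),
    "C": ("DE0-Nano",),
    "D": ("BeMicro-A2",),
    "E": ("Prop123-A7",),
}


def boards_for(letter):
    """Return the tuple of boards for a version letter, or () if unknown."""
    return PROP_VER_BOARDS.get(letter.upper(), ())
```

Note that "A" covers two boards, so the letter alone cannot tell a Prop123-A9 from a BeMicro-A9.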

Dave Hein · 2017-04-25 00:54

Thanks Chip.

evanh · 2017-04-25 23:44

Chip,
I've noticed an easy spare double-operand instruction slot, if there were any need of one. The SFUNC instruction is currently filling one and can be moved to a group of single-operand slots instead.

Bonus: The individual instructions SFUNC encompasses will no longer be listed as aliases but become regular citizens in the documentation. SFUNC would vanish as a name.

cgracey · 2017-04-25 23:49

evanh wrote: »

Chip,
I've noticed an easy spare double-operand instruction slot, if there were any need of one. The SFUNC instruction is currently filling one and can be moved to a group of single-operand slots instead.

Bonus: The individual instructions SFUNC encompasses will no longer be listed as aliases but become regular citizens in the documentation. SFUNC can vanish as a name.

Man, we are thinking the same thing at the same moment. I just responded to TonyB's thread. I want the blessing of those working on tools, though, since they've been through the wringer a few times already.

evanh · 2017-04-25 23:55

Lol, timing ay, this all comes from a conversation in the random number topic. Hence, both of us posting together.

cgracey · 2017-05-07 09:09

I'm still working on getting the compiles done.

In looking at timing reports, I realized many of the slow paths were starting from the instruction bits and bogging down in the instruction decoding logic. I made some alternate macros to decode instructions in the prior clock and then capture them into registers, so they are ready at the start of the critical cycle. I just change macro names to cause a register to be made, in lieu of normal decoding logic. The following instructions use the register macros:

SCL/SCLU
REP
SETQ
SETQ2
SKIP
SKIPF
EXECF
AUGS
AUGD

This took 9 flops per cog, but jacked up the 2-cog Fmax from ~83MHz to ~96MHz. Now, the critical paths are things that I can't really speed up, anymore, as they are already quite optimized. This is good for the FPGA emulation and the silicon synthesis.

I'm running a 16-cog compile now for the Prop123-A9 board. It's going to take a few hours.

I hope to have a new set of files (v19) out on Monday.

cgracey · 2017-05-07 09:22

Here's the latest instruction file. Note that flag suffixes are now single symbols, with WCZ replacing 'WC,WZ'. Someone here suggested that and it really simplifies things.

jmg · 2017-05-07 09:22

cgracey wrote: »

This took 9 flops per cog, but jacked up the 2-cog Fmax from ~83MHz to ~96MHz.

Will be interesting to see a table of how fMax has changed across all the builds. 15.66% gain is significant.
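The 15.66% figure follows directly from the two approximate Fmax values quoted. A quick check in Python (the function name is just for illustration):

```python
def percent_gain(before_mhz, after_mhz):
    """Relative increase from before_mhz to after_mhz, in percent."""
    return (after_mhz - before_mhz) / before_mhz * 100.0


gain = percent_gain(83, 96)  # (96 - 83) / 83 = 13 / 83
print(f"{gain:.2f}%")
```

Since both inputs are given as "~" values, the result is only as precise as those estimates.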

cgracey · 2017-05-07 09:26

jmg wrote: »

cgracey wrote: »

This took 9 flops per cog, but jacked up the 2-cog Fmax from ~83MHz to ~96MHz.

Will be interesting to see a table of how fMax has changed across all the builds. 15.66% gain is significant.

The FMax is all over the place, depending on the luck of the fitter seed value, but things have definitely gotten faster. It would be neat if v19 could be 100MHz.

ozpropdev · 2017-05-07 09:28

~96MHz fMax is sounding very encouraging.

jmg · 2017-05-07 09:30

cgracey wrote: »

The FMax is all over the place, depending on the luck of the fitter seed value, but things have definitely gotten faster. It would be neat if v19 could be 100MHz.

To a human, perhaps.
To USB circuits, 84MHz or 96MHz might be more useful thresholds.

potatohead · 2017-05-07 11:45

I'll take the 100.

kwinn · 2017-05-07 16:31

potatohead wrote: »

I'll take the 100.

Of course, you're human ;-)

cgracey · 2017-05-07 16:36

I just looked at the compile for Prop123-A9. I had to set the fitter to area-optimization mode to get everything to fit, and we are not going to get above 80MHz, after all. To get to 100MHz, I'd have to drop maybe two smart pins.

Alexander (Sandy) Hapgood · 2017-05-07 18:52

cgracey wrote: »

NO MORE CHANGES PLANNED, barring bug fixes.

Chip; Are you at the point where you can revise the above statement to NO MORE CHANGES, barring bug fixes?

Sandy

cgracey · 2017-05-07 21:17

Alexander (Sandy) Hapgood wrote: »

cgracey wrote: »

NO MORE CHANGES PLANNED, barring bug fixes.

Chip; Are you at the point where you can revise the above statement to NO MORE CHANGES, barring bug fixes?

Sandy

Yes.

cgracey · 2017-05-07 21:18

Except, those recent changes weren't planned.

evanh · 2017-05-07 22:48

You got me chuckling there Chip!

cgracey · 2017-05-07 23:32

evanh wrote: »

You got me chuckling there Chip!

I do think we are done adding.

Rayman · 2017-05-08 00:43

IIRC ersmith was the source of the latest addition. I think it's great to optimize C as well as Spin....

potatohead · 2017-05-08 01:34

Worth it.

dMajo · 2017-05-09 12:59

cgracey wrote: »

evanh wrote: »

You got me chuckling there Chip!

I do think we are done adding.

One thing I would like you to spend some time thinking over: an additional error flag for math operations.
1) you need one instruction to clear the flag
2) all the math instructions (add, sub, ...) and the CORDIC result-reading instructions will OR (accumulate) their error outputs (overflow, underflow, division by 0, ...) into the internal flag.
3a) you need one instruction to test the flag and/or one to jump on its state
3b) you need one instruction to copy&clear the flag state to C or Z thus allowing the use of already made (test, jump, conditionals, ...) based on the error state transferred to C or Z

In this way, at critical points during math operations, or at the end of several math operations, one can determine whether the result is valid. It can be used to force default values in place of erroneously computed ones in chains of mathematical equations.
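To make the proposal concrete, here is a toy Python model of a sticky error flag. It is purely illustrative and not a P2 feature: the class name, the 32-bit signed wrap-around, and the choice to return 0 on division by zero are all assumptions.

```python
class MathUnit:
    """Toy 32-bit signed ALU with a sticky error flag (hypothetical)."""

    INT_MIN, INT_MAX = -(1 << 31), (1 << 31) - 1

    def __init__(self):
        self.err = False

    def clear_err(self):
        """Step 1: clear the flag."""
        self.err = False

    def take_err(self):
        """Step 3b: copy the flag out and clear it in one go."""
        flag, self.err = self.err, False
        return flag

    def _wrap(self, value):
        """Step 2: on overflow, set the flag and wrap to 32 bits."""
        if not (self.INT_MIN <= value <= self.INT_MAX):
            self.err = True
            value = (value - self.INT_MIN) % (1 << 32) + self.INT_MIN
        return value

    def add(self, a, b):
        return self._wrap(a + b)

    def sub(self, a, b):
        return self._wrap(a - b)

    def div(self, a, b):
        """Truncating division; division by zero sets the flag and yields 0."""
        if b == 0:
            self.err = True
            return 0
        q = abs(a) // abs(b)
        return self._wrap(q if (a < 0) == (b < 0) else -q)
```

A chain of operations can then run unchecked, with a single take_err() at the end deciding whether to keep the computed value or substitute a default.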

Another thing: is it possible to use the internal CORDIC engine to compute single-precision floating-point math (at least the basics: +, -, /, *, squarer, and integer2float and float2integer conversions)?
I've read on the web that with CORDIC you can do math to a precision of +/- 1/16777216, and that it is often used in FPGAs and MCUs where an FPU is not available or its cost is not reasonable.

Seairth · 2017-05-09 15:18

Quartus 17 has been released. Apparently, the big new thing is:

"The Intel® Quartus® Prime design software v17.0 Pro edition offers new team-based design flows that allow your geographically diverse development team to collaborate on a design."

Does this mean we'll all get to start working on the P2 verilog at the same time? Just think how much faster we'll get the final design knocked out!
