Prop2 FPGA files!!! - Updated 2 June 2018 - Final Version 32i

jmg · 2015-09-27 00:40

Seairth wrote: »
As for REP, looking at MainLoader.spin, there's this snippet:
        rep     #2,#7           'ready for 8 bits
        waitx   waita           'wait for middle of 1st data bit
        testb   inb,#31	wc      'sample rx
        rcr     x,#1            'rotate bit into byte
        waitx   waitb           'wait for middle of nth data bit
From that, it looks to me like:

1. The instruction immediately following REP is executed.
2. The first REP parameter is N-1 instructions that you want to repeat
3. The second REP parameter is N-1 times you want to repeat the loop

Is that correct? If so, for #2 and #3, I wish there were a better way of setting the parameters. Immediate thoughts are to change the second parameter to have #0 mean "forever" (if that's still possible) and have any positive value indicate the count.

I know the first parameter could also be done like:
        rep     #(r_end-r_beg) >> 2, #7  'ready for 8 bits
        waitx   waita           'wait for middle of 1st data bit
r_beg   testb   inb,#31	wc      'sample rx
        rcr     x,#1            'rotate bit into byte
r_end   waitx   waitb           'wait for middle of nth data bit
But the expression is... well... ugly. You might be able to do something like:
        rep     #(@r_end-4) >> 2, #7  'ready for 8 bits
        waitx   waita           'wait for middle of 1st data bit
        testb   inb,#31	wc      'sample rx
        rcr     x,#1            'rotate bit into byte
r_end   waitx   waitb           'wait for middle of nth data bit
which is a bit better. But what I'd really like is something like:
bits        rep     #7              'ready for 8 bits
            waitx   waita           'wait for middle of 1st data bit
            testb   inb,#31	wc  'sample rx
            rcr     x,#1            'rotate bit into byte
rep_bits    waitx   waitb           'wait for middle of nth data bit
where the compiler will do the work for me.

We've talked about this before, and I prefer the clearest approach, which is a version of your #!

        rep     r_end,r_beg, #7  'ready for 8 bits
        waitx   waita           'wait for middle of 1st data bit
r_beg   testb   inb,#31	wc      'sample rx
        rcr     x,#1            'rotate bit into byte
r_end   waitx   waitb           'wait for middle of nth data bit

Now, the tools can warn if the user makes a mistake, and the two labels make
the loop limits very clear to anyone quickly scanning the code.
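
For anyone wanting to see what the tool would have to compute from the two labels, here is a minimal Python sketch. It assumes Seairth's reading of REP is right (first operand = looped instructions minus 1, second operand = iterations minus 1) and that labels are byte addresses of 4-byte instructions; `rep_operands` and the example addresses are made up for illustration and are not part of pnut.

```python
INSTRUCTION_BYTES = 4  # assumption: every instruction is one 32-bit long


def rep_operands(r_beg, r_end, iterations):
    """Return the two REP operands under the N-1 convention inferred in the thread.

    r_beg and r_end are the byte addresses of the first and last looped
    instructions (hypothetical values, as an assembler would see them).
    """
    if r_end < r_beg or (r_end - r_beg) % INSTRUCTION_BYTES:
        raise ValueError("labels must be long-aligned with r_end >= r_beg")
    if iterations < 1:
        raise ValueError("need at least one iteration")
    instructions_minus_1 = (r_end - r_beg) >> 2  # same expression as in the post
    iterations_minus_1 = iterations - 1
    return instructions_minus_1, iterations_minus_1


# Three looped instructions (testb, rcr, waitx) executed 8 times:
print(rep_operands(r_beg=0x08, r_end=0x10, iterations=8))  # (2, 7)
```

With both labels in hand the tool can also reject a loop whose end label precedes its start label, which is the kind of warning a bare `#2` cannot give.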

Cluso99 · 2015-09-27 01:07

Here is the MEMORY MAP...

// P2 MEMORY MAP 24SEP2015
// 
//      addr            read            write           name
//      ---------------------------------------------------------------
// COG REGISTERS (9-bit addressable)
//
//      000             INA             -               INA / IJMP0
//      001             INB             -               INB / IRET0
//      002             RAM             RAM+OUTA        OUTA
//      003             RAM             RAM+OUTB        OUTB
//      004             RAM             RAM+DIRA        DIRA
//      005             RAM             RAM+DIRB        DIRB
//      006             PTRA            PTRA            PTRA
//      007             PTRB            PTRB            PTRB
//
//      008             RAM             RAM             user / ADRA
//      009             RAM             RAM             user / ADRB
//      00A             RAM             RAM             user / IJMP1
//      00B             RAM             RAM             user / IRET1
//      00C             RAM             RAM             user / IJMP2
//      00D             RAM             RAM             user / IRET2
//      00E             RAM             RAM             user / IJMP3
//      00F             RAM             RAM             user / IRET3
//
//      010-1FF         RAM             RAM             user
//      ---------------------------------------------------------------
// LUT
//      200-3FF         RAM             RAM             user / cog-exec
//
// LUT (possible expansion)
//      400-5FF         RAM             RAM             user / cog-exec
//      ---------------------------------------------------------------
// HUB
//      00000-7FFFF     RAM             RAM             user / hub-exec
//
// HUB (future expansion)
//      80000-FFFFF     RAM             RAM             user / hub-exec
//      ---------------------------------------------------------------
// HUB ROM
//      00000-03FFF     (not accessible)                boot
//      ---------------------------------------------------------------

Cluso99 · 2015-09-27 01:16

There is an Internal Stack in all COGs...

P2 INTERNAL STACK (ALL COGs) 24SEP2015
==================================================================================================
There is a 16 level 32-bit Internal Stack in all COGs.
This is accessible using the following instructions...
--------------------------------------------------------------------------------------------------
CCCC 1101011 00L DDDDDDDDD 000101000   PUSH    D/#            'push D/# on internal stack
CCCC 1101011 CZ0 DDDDDDDDD 000101100   POP     D     {WC,WZ}  'pop  D from internal stack
CCCC 1101011 CZ0 DDDDDDDDD 000101001   CALL    D     {WC,WZ}  'save return address on internal stack
CCCC 1101101 Rnn nnnnnnnnn nnnnnnnnn   CALL    #abs/@rel      'save return address on internal stack
CCCC 1101011 CZ0 000000000 000101101   RET           {WC,WZ}  'jump via internal stack
==================================================================================================
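
The bit patterns above can be packed into 32-bit words mechanically. The following Python sketch does that for the register forms (the `1101011` opcode rows). The field layout is taken straight from the table; the `encode` helper, the choice of register $010, and the assumption that CCCC = 1111 means "always execute" are mine, not from Chip's documentation.

```python
def encode(cond, czl, d, s, opcode=0b1101011):
    """Pack CCCC OOOOOOO CZL DDDDDDDDD SSSSSSSSS into one 32-bit word."""
    fields = (("cond", cond, 4), ("opcode", opcode, 7),
              ("czl", czl, 3), ("d", d, 9), ("s", s, 9))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    return (cond << 28) | (opcode << 21) | (czl << 18) | (d << 9) | s


ALWAYS = 0b1111  # assumption: CCCC = 1111 is the unconditional case

push_word = encode(ALWAYS, 0b000, 0x010, 0b000101000)  # PUSH register $010
ret_word = encode(ALWAYS, 0b000, 0x000, 0b000101101)   # RET without WC/WZ
print(f"{push_word:08X} {ret_word:08X}")  # FD602028 FD60002D
```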

evanh · 2015-09-27 01:24

What's ADRA/B used for?

Rayman · 2015-09-27 01:25

Is there any security advantage to leaving below $1000 non-executable?
Seems if there were an OS running, you could call it protected memory and using for buffering outside stuff that you need to make sure can't run...
Just a thought...

Seairth · 2015-09-27 01:33

evanh wrote: »

What's ADRA/B used for?

This threw me off too. Here's Chip's answer:

cgracey wrote: »
Seairth wrote: »
cgracey wrote: »
Here is the new cog register map:
//	addr		read		write		name
//	-------------------------------------------------------------
//
//	008		RAM		RAM		user / ADRA
//	009		RAM		RAM		user / ADRB
By the way, what is ADRA/ADRB?
They are generic registers that can receive 20-bit address results from the LOC instruction, in addition to PTRA and PTRB.

evanh · 2015-09-27 02:28

Ah, yes, and the link (CALLD) instruction also. LOC will mostly be used to reset PTRA/B, particularly when juggling lots of Hub based tables, while CALLD will use ADRA/B so as not to compete with PTR registers usage.

Doh! Even though I'd just been thinking about CALLD I hadn't even looked at what register it targeted.

Cluso99 · 2015-09-27 02:32

Seairth wrote: »
evanh wrote: »

What's ADRA/B used for?

This threw me off too. Here's Chip's answer:
cgracey wrote: »
Seairth wrote: »
cgracey wrote: »
Here is the new cog register map:
//	addr		read		write		name
//	-------------------------------------------------------------
//
//	008		RAM		RAM		user / ADRA
//	009		RAM		RAM		user / ADRB
By the way, what is ADRA/ADRB?
They are generic registers that can receive 20-bit address results from the LOC instruction, in addition to PTRA and PTRB.

And the destination register for the CALLA and CALLB return address.

Seairth · 2015-09-27 02:54

(moved to separate thread)

msrobots · 2015-09-27 03:29

cgracey wrote: »

Rayman wrote: »

I think I almost understand the blinky example, but some things look magic...

What does "orgh 1" do? Why not just orgh ? Does this code start at $1000 (I think so)?

The last two lines with org and res x are hurting my brain...
Does the compiler load anything after "org" into cog before starting?
Or does this only work for "res" reserved space that doesn't need initializing?

That ORGH 1 is there because that's where the loader jumps into your code. It's that non-aligned hub exec below $1000 that people hate. I just haven't changed it yet. I kind of don't want to, because it allows most efficient use of memory. You could always just put a JMP #$1000 after it and pretend it's not really happening.

That ORG + RES business was just a quick way to get some symbolic cog registers declared. It doesn't generate any code. Each blinking cog will use its own instance of those registers.

I think nobody hates it, it just sounded so complicated. If Hub Exec can work below $1000, nobody will complain. Most were just eager to get you to put an image out. Stop changing things, avoid the Homer Car, over $1000 is OK. At least I was.

Now that you 'gave birth', you should feel 100 lbs lighter or so. Take the kids hiking for a day or two, the weather is still good. Relax.

Thank you!

Mike

evanh · 2015-09-27 04:01

Cluso99 wrote: »

And the destination register for the CALLA and CALLB return address.

I believe CALLA/B push to HubRAM using PTRA/B as stack pointers, respectively.

Bill Henning · 2015-09-27 04:32

Wohoooooo!

Time to dust off the DE2-115...

Thanks Chip!

Seairth · 2015-09-27 14:31

Looking at the beginning of singlestep.spin, we see

dat
	orgh	1

' This program demonstrated single-stepping.
' You'll need to connect a logic analyzer to p4..p0.

entry	setq	#$1F7
	rdlong	begin,ptrb[(code-entry)>>2]
	jmp	#start
code
	org

begin	long	0,0
ret1	long	0,int1		'interrupt vectors
ret2	long	0,int2
ret3	long	0,int3

Does this mean that "org" without a parameter is implicitly the same as "org #8" (or is that "org #32")?

Also, I'm a little confused about the interrupt "table" layout. According to earlier posts (e.g. Cluso's post above), the pairs are:

IJMPx
IRETx

Which means the above code is initializing IRETx, not IJMPx. I think the labels in the other documentation are backwards?

And finally (for now), what is the format of the SETBRK operand?

mindrobots · 2015-09-27 16:48

Sanity check:

used px.exe to load the .rbf file
pnut tells me I have a Prop 2 1-2-3 FPGA A7 loaded
when powered up, the "conf status" lights along with all the red LEDs until loaded.
Once loaded, RED0 is solid on

Load of all_cogs_blink.spin appears to work - I do get a pop-up titled "Hardware" with no content while it loads

After the load completes, nothing happens. No LEDs blinking. My assumption is that the LEDs on the 1-2-3 board are hard wired to pins 0 through 15 of the P2 via the FPGA pins.

I'm using the 1-2-3 without a PLL fix of any kind.

Any thoughts?

Time to try my DE2.

EDIT: DE2 - same results, loads image through Quartus, ctrl-G in PNUT tells me I have a P2 running on a DE2-115. When I use PNUT to load the blinky program, it loads and I see nothing. I took an LED and a resistor and tried it on P0 - P8 coming out of the expansion board - nothing.
My P2 expansion board is the NEW variety, not the original one Chip pictured.

Seairth · 2015-09-27 17:38

mindrobots wrote: »

Load of all_cogs_blink.spin appears to work - I do get a pop-up titled "Hardware" with no content while it loads

That part sounds suspicious. I get the following text inside the status box, in quick succession:

1. A... (something too fast to see)
2. Checking COM3
3. Loading Loader
4. Loading RAM

mindrobots · 2015-09-27 17:55

Follow up on my problems above.

It is a PNUT under Wine issue*. I just dusted off my Win7 desktop, downloaded the latest file from Chip. When I loaded the blinky program from PNUT, I received all the dialog listed by Seairth above and now have blinky lights on my 1-2-3 FPGA board!!! Yay!!

EDIT: DE2-115 works as expected also...yay!!

Looks like I'm back to using Win7 until I spend time on my Wine issues.

Oh, P2, the sacrifices I make for you!!!

*I've seen other issues where some Windows programs can't talk to FTDI USB serial ports with any reliability. It still seems to be unresolved.

Seairth · 2015-09-27 18:43

@cgracey: I'm encountering an AUGS encoding issue, possibly with pnut.

If I have the following snippet:

addcnt c, ##$400000

I get the following binary:

FF000200 FA842E00

The AUGS seems to be off by 4 bits (I was expecting FF002000).

Then, if I instead try this manually:

augs #$2000
addcnt c, #0

I instead get

FF000010 FA842E00

The second instruction is still correct, but I have no idea how I'm ending up with $10 instead of $2000 in the first instruction.

Roy Eltham · 2015-09-27 22:57

Seairth,
Perhaps it's because the AUGS instruction only encodes the part that is above the first 9 bits. The compiler still expects you to put in the full 32-bit value, but it only encodes the upper 23 bits into the instruction.

I believe that would make your above examples "make sense".

Seairth · 2015-09-27 23:08

Roy Eltham wrote: »

Seairth,
Perhaps it's because the AUGS instruction only encodes the part that is above the first 9 bits. The compiler still expects you to put in the full 32-bit value, but it only encodes the upper 23 bits into the instruction.

I believe that would make your above examples "make sense".

Ahh! Well, that certainly solves one of the issues. I find it surprising that AUGx expects a 32-bit value, not a 23-bit value. However, I can live with that (I prefer the ## syntax anyhow).

Oh... and the other part turns out to be my mistake. Transposing pnut's LE view to BE, I accidentally flipped two hex digits. I am, indeed, seeing "FF002000 FA842E00", which is what I was expecting. Sorry about that!
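
For reference, the split Roy describes can be checked with a few lines of Python. This is only a sketch: the `AUGS_BASE` value of $FF000000 is inferred from the words quoted in this thread rather than taken from a spec, and the helper names are hypothetical.

```python
AUGS_BASE = 0xFF000000  # assumption: inferred from the binaries quoted above


def split_immediate(value):
    """Split a 32-bit immediate into (upper 23 bits, lower 9 bits)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("immediate must fit in 32 bits")
    return value >> 9, value & 0x1FF


def augs_word(value):
    """Build the AUGS word that carries the upper 23 bits of value."""
    upper, _lower = split_immediate(value)
    return AUGS_BASE | upper


print(f"{augs_word(0x400000):08X}")  # FF002000, the ##$400000 case
print(f"{augs_word(0x2000):08X}")    # FF000010, the manual augs #$2000 case

# The same word as it sits in memory, least significant byte first:
print(augs_word(0x400000).to_bytes(4, "little").hex())  # 002000ff
```

Both results line up with what pnut produced: the full 32-bit value goes in, and only bits 31..9 land in the AUGS word.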

David Betz · 2015-09-27 23:15

What is the difference between instructions.txt and instructions.sv? It looks like the .sv file contains more information. I'm looking at writing a program to parse these files to automatically generate the opcode table for gas. Which should I use? Are they both always guaranteed to match where the information between the two overlaps?

evanh · 2015-09-27 23:58

All human readable languages and even all computing formats are written/displayed in BE format. LE has no real technical advantages over BE.

I'm constantly surprised LE is still seen as preferable.

Roy Eltham · 2015-09-28 01:20

evanh,
LE is "preferred" partially because when casting between larger and smaller sized integer types you don't change the address of the data (ie the pointer doesn't change). Aside from that, perhaps it's also partly because x86 uses it and that's pretty widespread these days...
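
A small Python sketch of the casting point, using the standard `struct` module to stand in for pointer casts. The value $12 is arbitrary; the only thing being shown is which offset a narrower read has to use.

```python
import struct

value = 0x12  # small enough to fit in 8, 16 or 32 bits

little = struct.pack("<I", value)  # bytes in memory: 12 00 00 00
big = struct.pack(">I", value)     # bytes in memory: 00 00 00 12

# Little-endian: narrower reads at the same offset still see the value.
le16 = struct.unpack_from("<H", little, 0)[0]
le8 = struct.unpack_from("<B", little, 0)[0]

# Big-endian: the same offset now holds the high-order bytes,
# so the "pointer" has to move by 2 to reach the low half.
be16_same_offset = struct.unpack_from(">H", big, 0)[0]
be16_moved = struct.unpack_from(">H", big, 2)[0]

print(le16, le8, be16_same_offset, be16_moved)  # 18 18 0 18
```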

Tor · 2015-09-28 02:45

The 6502 implemented LE instead of BE as was used by its predecessor, the 6800. The main reason was that it saves cycles when doing operations which require addition of 16-bit numbers (including address indexing) on an 8-bit bus. With LE you can fetch the least significant byte first, then start working with that one immediately while fetching the next byte.
The casting thing was something which was kind of popular with Fortran programmers on VAX. The VAX would fetch 32 bits at a time, so there isn't any processing gain with LE. Fortran doesn't have casts, but you could have a function expecting a different size integer than the one you fed it, and it would still work (if the actual value was still sensible). Obviously that led to non-portable code (wouldn't work on BE systems), and was generally a dirty hack.
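
The 8-bit-bus argument can be sketched in Python as a byte-at-a-time add with carry. This only models the ordering (low byte first, carry into the next byte); it says nothing about actual 6502 cycle counts, and `add16_le_bytes` is a made-up helper.

```python
def add16_le_bytes(a_bytes, b_bytes):
    """Add two little-endian values one byte at a time; return (bytes, carry)."""
    if len(a_bytes) != len(b_bytes):
        raise ValueError("operands must have the same width")
    result = bytearray()
    carry = 0
    for a, b in zip(a_bytes, b_bytes):  # index 0 is the least significant byte
        total = a + b + carry
        result.append(total & 0xFF)     # this byte is final as soon as it is computed
        carry = total >> 8              # carry feeds the next, more significant byte
    return bytes(result), carry


base = (0x12F0).to_bytes(2, "little")   # arrives on the bus as F0, then 12
index = (0x0020).to_bytes(2, "little")
summed, carry = add16_le_bytes(base, index)
print(hex(int.from_bytes(summed, "little")), carry)  # 0x1310 0
```

Because the carry only ever moves toward the more significant byte, work on the low byte can begin as soon as it is fetched. With BE storage the first byte fetched is the high one, which cannot be finalized until the low-byte carry is known.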

Rayman · 2015-09-28 16:53

Unboxed my DE2

Had some trouble downloading the software here... Microsoft Security Essentials doesn't like file named px.exe ... Had to override settings to get it...

Had Quartus software from earlier work with DE0. It's telling me "cannot add target device EPCQ64 to device chain when in current programming mode". Clearly I have no idea what I'm doing... Guess I need to revisit the sticky threads...

Sticky threads didn't help... Had to read the manual

Seems I need to follow the "Configuring the EPCS64 in AS Mode" section

Rayman · 2015-09-28 17:30

Think I have it

I see now it's the red leds that blink with #cogs...

mindrobots · 2015-09-28 17:34

Yay!!! Blinky lights is GOOD!!!

A CTL-G in PNUT should tell you that you have a DE2-115 P2.....next stop, codeville!!!

Rayman · 2015-09-28 17:41

Ran the cog_1k_program.spin file and hooked an oscilloscope up to pins...
See 50% duty cycle burst (probably 500 pulses) with 25 kHz burst repetition frequency on P0, P1, P2 and P3.
See 12.5 kHz square wave on P4.

Think I see now that DIRA goes to Emulator Add-On pins and DIRB goes to red leds on DE2-115 board...

evanh · 2015-09-29 05:33

Roy Eltham wrote: »

evanh,
LE is "preferred" partially because when casting between larger and smaller sized integer types you don't change the address of the data (ie the pointer doesn't change). Aside from that, perhaps it's also partly because x86 uses it and that's pretty widespread these days...

The casting point is a hack at best, and there are also times when one wants the high bits instead. The x86 point is non-technical.

evanh · 2015-09-29 05:35

Tor wrote: »

The 6502 implemented LE instead of BE as was used by its predecessor, the 6800. The main reason was that it saves cycles when doing operations which require addition of 16-bit numbers (including address indexing) on an 8-bit bus. With LE you can fetch the least significant byte first, then start working with that one immediately while fetching the next byte.

The 16-bit (over 8-bit bus) addressing example still requires both bytes before an address can be actioned. I suspect a bigger adder circuit would resolve whatever is of concern here. Could do with some more detail.

jmg · 2015-09-29 05:39

Roy Eltham wrote: »

evanh,
LE is "preferred" partially because when casting between larger and smaller sized integer types you don't change the address of the data (ie the pointer doesn't change). Aside from that, perhaps it's also partly because x86 uses it and that's pretty widespread these days...

I think the smaller code size is the most compelling :
i.e. you can increment an array of larger entities with a single pointer pass with LE, starting from the low index, which is the convention.
Of course, sometimes you need to reverse things, so good debuggers should display both endian choices.
