Propeller II: Emulation of the P2 on FPGA boards (Prop123-A7/A9, DE0-NANO, DE2-115, etc)

cgracey · 2012-12-08 02:43

Ariba wrote: »

OK, I verified it and it is really 3MHz.

But after a while I also found the reason:
This two-instruction LMM code always stays in the quad cache, so there is no need to reload it! So rdlongc always takes only 1 cycle.
This changes if I make the LMM code longer or if the addresses of the two-instruction loop cross a quad boundary.

Andy

That's a relief! By the way, as someone pointed out, the FPGA and the actual chip are going to be functionally identical as far as logic behavior goes. They are the same thing, just in different forms.

Sapieha · 2012-12-08 04:01

Hi Chip.

Can you give some clarification on this?

http://forums.parallax.com/showthread.php?144432-The-unofficial-P2-documentation-project&p=1148354&viewfull=1#post1148354

thanks

Bill Henning · 2012-12-08 05:35

Thanks for those details Chip.

Now for REPx

1) can I assume if there is a jump out of the REP block it gets cancelled?

1a) above implies that if you call a subroutine from inside a REP block, the REP gets cancelled

2) can I assume that you cannot nest a REP'ed loop within another REP loop?

Regards,

Bill

cgracey wrote: »
Ariba,

1) Regardless of how few cogs an FPGA implementation has, it always cycles the hub as if there were eight cogs. So, the DE0-Nano board gives its single cog every 8th hub cycle, just as a single cog would get in a complete chip.

2) You are right about two instructions needing to be between an instruction modifier and the instruction getting modified. I was just writing some pipeline explanation about this:
PIPELINE
--------

Each cog has a 4-stage pipeline which all instructions progress through, in order to execute:


  1st stage    - Read instruction
  2nd stage    - Determine indirect/remapped D and S addresses, update INDA/INDB
  3rd stage    - Read D and S
  4th stage    - Execute instruction, writing D, Z/C/PC, and any other results


On every clock cycle, the instruction in each stage advances to the next stage, unless the instruction
in the 4th stage is stalling the pipeline because it's waiting for something (i.e. WRBYTE waits for
the hub).

To keep D and S data current within the pipeline, the resultant D from the 4th stage is passed back to
the 3rd stage to substitute for any obsoleted data being read from the cog register RAM. The same is
done for instruction data in the 1st stage, but there is still a two-stage gap between when a register
is modified and it can be executed, at the earliest:


        MOVD    :inst,top9         'modify instruction
        NOP                        '1...
        NOP                        '2... at least two instructions in-between
:inst   ADD     A,B                'modified instruction executes


Tasks that execute in at least every 3rd time slot don't need to observe this 2-instruction rule because
their instructions will always be sufficiently spread apart in the pipeline.
Your LMM code looks sensible to me. I would think that every 20 clocks you should suffer a 4-clock delay to get up to a multiple of 8 clocks (24). When RDLONGC needs to do a RDQUAD, it will take 3-10 clocks. What kind of 4-loop period are you seeing?
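
The rounding Chip describes (a 20-clock loop stalling 4 clocks to reach 24) can be sketched as below. This is only an illustration: it assumes one hub window per cog every 8 clocks and that the loop started aligned to a window. The function name is made up for this example and is not part of any Parallax tool.

```python
HUB_PERIOD = 8  # clocks between hub windows for one cog (hub cycles as if 8 cogs exist)

def clocks_until_hub_window(elapsed_clocks, period=HUB_PERIOD):
    """Return how many clocks a cog stalls before its next hub window.

    Assumption: the cog was aligned to a hub window at clock 0.
    """
    return (-elapsed_clocks) % period

loop_clocks = 20
stall = clocks_until_hub_window(loop_clocks)
print(loop_clocks, "+", stall, "=", loop_clocks + stall)  # 20 + 4 = 24
```

A loop that already takes a multiple of 8 clocks gets a stall of 0 under this model, so it does not capture the case Ariba found, where the cached loop never needs the hub at all.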

cgracey · 2012-12-08 09:56

Sapieha wrote: »

Hi Chip.

Can you give some clarification on this?

http://forums.parallax.com/showthread.php?144432-The-unofficial-P2-documentation-project&p=1148354&viewfull=1#post1148354

thanks

MMMM = PLL mode:

        %0000 for disabled, else XX must be set for XI input or XI/XO crystal oscillator
        %0001 for multiply XI by 2
        %0010 for multiply XI by 3
        %0011 for multiply XI by 4
        %0100 for multiply XI by 5
        %0101 for multiply XI by 6
        %0110 for multiply XI by 7
        %0111 for multiply XI by 8
        %1000 for multiply XI by 9
        %1001 for multiply XI by 10
        %1010 for multiply XI by 11
        %1011 for multiply XI by 12
        %1100 for multiply XI by 13
        %1101 for multiply XI by 14
        %1110 for multiply XI by 15
        %1111 for multiply XI by 16
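
The table above follows a simple pattern: a non-zero MMMM multiplies XI by MMMM + 1. A minimal sketch of that mapping (the function name is hypothetical, and nothing here models the XX field or real clock limits):

```python
def pll_multiplier(mmmm):
    """Map the 4-bit MMMM field to the XI multiplier, or None if the PLL is disabled."""
    if not 0 <= mmmm <= 0b1111:
        raise ValueError("MMMM is a 4-bit field")
    # %0000 disables the PLL; every other value multiplies XI by MMMM + 1.
    return None if mmmm == 0 else mmmm + 1

for mmmm in (0b0000, 0b0001, 0b0111, 0b1111):
    print(f"%{mmmm:04b} ->", pll_multiplier(mmmm))
```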

cgracey · 2012-12-08 09:59

Bill Henning wrote: »

Thanks for those details Chip.

Now for REPx

1) can I assume if there is a jump out of the REP block it gets cancelled?

1a) above implies that if you call a subroutine from inside a REP block, the REP gets cancelled

2) can I assume that you cannot nest a REP'ed loop within another REP loop?

Regards,

Bill

1) Yes, REPS/REPD gets cancelled if any branch occurs.
2) Yes, there is only one REPS/REPD circuit, so REPS/REPD's cannot be nested.

Bill Henning · 2012-12-08 10:44

Thanks!

That makes perfect sense.

Bob Lawrence (VE1RLL) · 2012-12-08 19:47

My DE0-NANO FPGA board arrived. Hopefully I can get going tonight or tomorrow. The Altera Quartus FPGA software was a pig to download and failed the download process several times. The DE0-Nano is powered up, so let the fun begin :cool:

D.P · 2012-12-08 20:19

Bob Lawrence (VE1RLL) wrote: »

My DE0-NANO FPGA board arrived. Hopefully I can get going tonight or tomorrow. The Altera Quartus FPGA software was a pig to download and failed the download process several times. The DE0-Nano is powered up, so let the fun begin :cool:

If you just want to program the DE0 with the existing P2 image all you need is this:

https://www.altera.com/download/software/prog-software

and these Windows drivers for USB connectivity.

Cluso99 · 2012-12-08 20:39

Chip will now continue posting instruction and other specific P2 information in this thread...
http://forums.parallax.com/showthread.php?144432-The-unofficial-P2-documentation-project&p=1148577&viewfull=1#post1148577

Bob Lawrence (VE1RLL) · 2012-12-08 20:45

@D.P

Thanks very much for that download link. I had two problems so far:

1. It couldn't find a hardware device, but downloading the USB Blaster files and manually adding them to the USB driver in the Control Panel / Drivers solved that problem.

2. Then when I loaded the DE0_Nano_Prop2.jic file it kept saying that it was corrupt. I downloaded it again several times but it still complained that it was corrupt. I read back through this thread and saw that others had the same problem and that they downloaded a standalone version of the programmer software, which worked for me as well.

Once I got it to compile the Spin file and load it into the Nano, I reset the Nano board and the green LEDs changed so that only LED0 is lit. Hitting the space bar in the PuTTY serial terminal activated the monitor program and it identified itself as Propeller 2. Hitting the Enter key allows me to step through the monitor program.

So far so good :smile:

potatohead · 2012-12-08 21:05

Programming now... I had a lot of trouble loading the software.

Fingers crossed!

While I wait, I might as well write up my snags. I'm on Win XP, and it's my old Prop dev laptop... Kind of want to replace it, but I really like it, so... Deal. Check.

The USB Blaster threw me for a loop! Drivers get dropped on to your machine in the Altera install directory. It's easiest to just do the Programmer or full software installation, then hook your stuff up, wait for the USB detection and point it at the directory supplied. Sheesh.

One other thing I got stuck on was creating an account. Didn't work no matter what, so I did the one time access. Recommended.

Programming done! Maybe I have a P2. (dang, this is kind of cool)

Well, I just bought two new Prop Plugs to replace the one that broke on me a month or two ago. I switched to both serial and the USB built in on some boards. This one isn't recognized, and I'm assuming that's a driver update...

potatohead · 2012-12-08 21:23

Glad I bought two Prop Plugs. I got a bogus one.

I'll get that sorted once I look it over some. Maybe it's something silly. I'm in now!

Another DE2-115 up and running!

=== Propeller II Monitor ===



>?



                - HUB -

{adr{.adr}}                     - View

{adr{.adr}}/{dat{ dat}}         - Search

{adr{.adr}}:{dat{ dat}}         - Enter

adr.adr[</>]adr                 - Move

adr.adr^                        - Checksum

adr@                            - Watch

[Y/W/N]                         - Byte/word/long

                - COGS -

cog+adr{+adr}                   - Start

cog-                            - Stop

M                               - Map

                - PINS -

{pin}[H/L/T/Z/R]                - High/low/toggle/off/read

pin#                            - Watch

pin|cfg                         - Configure

dat\                            - Set DACs

                - MISC -

dat*                            - Set clock

'                               - Repeat

Q                               - Quit



>

And on a related note, I have a Firefox browser that would not render code windows correctly. Somebody here suggested font size, and yes! That fixed it on my machine. FYI.

Now, off to read all the docs and play some.

SRLM · 2012-12-08 21:26

potatohead wrote: »

Glad I bought two Prop Plugs. I got a bogus one. I'll get that sorted once I look it over some. Maybe it's something silly. I'm in now!

This has happened to me several times, but I always assumed that it was my fault. I make sure to throw them away; otherwise, the next time I spend 1/2 hour trying to figure out what's wrong.

potatohead · 2012-12-08 21:30

Seriously! I dorked with that for a loooooong time. My original one was fine, it just got cold sitting next to a window and the thin solder joints snapped due to some cable stress. This one is getting a blob of reinforcing something or other, and I'll probably just leave it on the cable, so I don't have to go looking for the thing. I'll send the other one back, once I've determined it's not something I did, or just stupid that can be fixed, but not now.

cgracey · 2012-12-08 22:37

potatohead wrote: »
Glad I bought two Prop Plugs. I got a bogus one. I'll get that sorted once I look it over some. Maybe it's something silly. I'm in now!

Another DE2-115 up and running!
=== Propeller II Monitor ===



>?



                - HUB -

{adr{.adr}}                     - View

{adr{.adr}}/{dat{ dat}}         - Search

{adr{.adr}}:{dat{ dat}}         - Enter

adr.adr[</>]adr                 - Move

adr.adr^                        - Checksum

adr@                            - Watch

[Y/W/N]                         - Byte/word/long

                - COGS -

cog+adr{+adr}                   - Start

cog-                            - Stop

M                               - Map

                - PINS -

{pin}[H/L/T/Z/R]                - High/low/toggle/off/read

pin#                            - Watch

pin|cfg                         - Configure

dat\                            - Set DACs

                - MISC -

dat*                            - Set clock

'                               - Repeat

Q                               - Quit



>
And on a related note, I have a Firefox browser that would not render code windows correctly. Somebody here suggested font size, and yes! That fixed it on my machine. FYI.

Now, off to read all the docs and play some.

Doug, configure your terminal so it doesn't do two line feeds when it receives an $0D+$0A.

It sounds like we have some problems making Prop Plugs. I think I read that someone else was having trouble. Do any of you know if the problems have to do with the USB connector not soldering correctly?

potatohead · 2012-12-08 23:54

Got sorted and moved over to PUTTY. That literally was the first go.

I have a nice white on blue screen now with a good font. Happy days!

Maybe there is trouble. My first one suffered from cold and the low amount of solder. I don't really consider that a failure. It was snowing... I managed to fix it then, and it worked until just recently when I over stressed it. Didn't want to fix it again.

But, the two I got act different on the computer. One fails on Win XP consistently on a variety of cables and machine states, booted recently, etc... I've attached the error picture. It also blinks three or four times with a pause between each... The working one does the usual quick coupla blinks and is ready to go.

I'll try it on another machine later on to be sure, then probably send it back to you guys so somebody can ferret out what the trouble really is.

Sapieha · 2012-12-08 23:57

Hi Chip.

I know that it will cost Parallax a bit extra, but why not use a through-hole type USB connector? The same goes for the 4-pin one on the other side. Both give me problems.

I have needed to resolder them a few times already. (I have had mine for as long as I have been on the forum.)

cgracey wrote: »

Doug, configure your terminal so it doesn't do two line feeds when it receives an $0D+$0A.

It sounds like we have some problems making Prop Plugs. I think I read that someone else was having trouble. Do any of you know if the problems have to do with the USB connector not soldering correctly?

FredBlais · 2012-12-09 16:52

I have an old DE2 Cyclone II EP2C35F672 board lying around, is it compatible with the propeller 2 binary?

Sapieha · 2012-12-09 17:06

Hi Chip.

Can you clarify these instructions a little more?

  DECOD5        |    D        | Overwrite register D (0-511) with decoded D[1:0] repeated 1 time.
  DECOD4        |    D        | Overwrite register D (0-511) with decoded D[2:0] repeated 2 time.
  DECOD3        |    D        | Overwrite register D (0-511) with decoded D[3:0] repeated 4 time.
  DECOD2        |    D        | Overwrite register D (0-511) with decoded D[4:0] repeated 8 time.

Leon · 2012-12-09 17:12

FredBlais wrote: »

I have an old DE2 Cyclone II EP2C35F672 board lying around, is it compatible with the propeller 2 binary?

A DE2-115 or DE0-Nano with a Cyclone IV is required. The emulation won't work on a Cyclone II.

Cluso99 · 2012-12-09 17:20

Sapieha: Sorry I have not replied to your question regarding hub layout. I've been too busy to think about this for now. But yes, we do need to define a preferred hub layout for $0e80-$0fff.

FredBlais · 2012-12-09 17:25

Leon wrote: »

A DE2-115 or DE0-Nano with a Cyclone IV is required. The emulation won't work on a Cyclone II.

So I guess I'll have to wait for the real thing

jmg · 2012-12-09 17:47

Sapieha wrote: »

Hi Chip.
I know that it will cost Parallax a bit extra, but why not use a through-hole type USB connector? The same goes for the 4-pin one on the other side. Both give me problems.
I have needed to resolder them a few times already.

There is something of a trend back to thru-hole, at least for the shell 'mounting ears'. Most USB parts now also have offerings with SMD pins and thru-hole GND connections. Still reflow compatible, but much stronger.

David Betz · 2012-12-09 18:42

Has anyone started making a list of which P1 instructions have different encodings in P2? I've been trying to get PropGCC working and so far I've stumbled over WAITCNT and JMPRET. Obviously, P2 has many new instructions but does anyone have a complete list of which P1 instructions have different encodings in P2?

Dave Hein · 2012-12-09 19:10

Sapieha wrote: »

Hi Chip.

Can you clarify these instructions a little more?

  DECOD5        |    D        | Overwrite register D (0-511) with decoded D[1:0] repeated 1 time.
  DECOD4        |    D        | Overwrite register D (0-511) with decoded D[2:0] repeated 2 time.
  DECOD3        |    D        | Overwrite register D (0-511) with decoded D[3:0] repeated 4 time.
  DECOD2        |    D        | Overwrite register D (0-511) with decoded D[4:0] repeated 8 time.

Sapieha, I think the specification for these instructions isn't quite right. I believe the order of the D[x:0] specifications is reversed. DECOD5 should be D[4:0], DECOD4 should be D[3:0], and so on. At least that's the way I have it coded in SpinSim. Basically, DECOD5 produces 1 << D[4:0], DECOD4 produces (1 << D[3:0]) * $00010001, DECOD3 produces (1 << D[2:0]) * $01010101 and DECOD2 produces (1 << D[1:0]) * $11111111.

At least this is the way I understand it. Hopefully Chip can verify whether this is correct.

Bill Henning · 2012-12-09 19:17

The P2 pdf on ParallaxSemiconductor has bit encodings, however I don't know if those are up to date.

David Betz wrote: »

Has anyone started making a list of which P1 instructions have different encodings in P2? I've been trying to get PropGCC working and so far I've stumbled over WAITCNT and JMPRET. Obviously, P2 has many new instructions but does anyone have a complete list of which P1 instructions have different encodings in P2?

cgracey · 2012-12-09 19:26

Dave Hein wrote: »

Sapieha, I think the specification for these instructions isn't quite right. I believe the order of the D[x:0] specifications is reversed. DECOD5 should be D[4:0], DECOD4 should be D[3:0], and so on. At least that's the way I have it coded in SpinSim. Basically, DECOD5 produces 1 << D[4:0], DECOD4 produces (1 << D[3:0]) * $00010001, DECOD3 produces (1 << D[2:0]) * $01010101 and DECOD2 produces (1 << D[1:0]) * $11111111.

At least this is the way I understand it. Hopefully Chip can verify whether this is correct.

You're exactly right.

DECOD5 decodes the 5 LSB's.
DECOD4 decodes the 4 LSB's, replicating the result twice to fill 32 bits.
DECOD3 decodes the 3 LSB's, replicating the result four times to fill 32 bits.
DECOD2 decodes the 2 LSB's, replicating the result eight times to fill 32 bits.
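
For anyone wanting to check values by hand, here is a small sketch of the behavior as Dave Hein and Chip describe it in this thread. It is a software model only, not taken from SpinSim or the chip logic, and the function name `decod` is an assumption for illustration.

```python
MASK32 = 0xFFFFFFFF

# Multiplier that replicates the decoded field across the 32-bit result.
REPLICATE = {5: 0x00000001, 4: 0x00010001, 3: 0x01010101, 2: 0x11111111}

def decod(d, bits):
    """Model DECOD5/4/3/2: decode the low `bits` bits of d, replicated to fill 32 bits."""
    field = d & ((1 << bits) - 1)              # keep only the LSBs being decoded
    return ((1 << field) * REPLICATE[bits]) & MASK32

print(f"DECOD5 of 31 -> ${decod(31, 5):08X}")  # $80000000
print(f"DECOD4 of 3  -> ${decod(3, 4):08X}")   # $00080008
print(f"DECOD3 of 7  -> ${decod(7, 3):08X}")   # $80808080
print(f"DECOD2 of 3  -> ${decod(3, 2):08X}")   # $88888888
```

Bits above the decoded field are ignored in this model, so `decod(0x15, 4)` decodes the value 5.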

David Betz · 2012-12-09 19:55

David Betz wrote: »

Has anyone started making a list of which P1 instructions have different encodings in P2? I've been trying to get PropGCC working and so far I've stumbled over WAITCNT and JMPRET. Obviously, P2 has many new instructions but does anyone have a complete list of which P1 instructions have different encodings in P2?

I did a quick compare of opcodes and found that the following instructions differ between P1 and P2:

JMP/JMPRET/CALL/RET
CMPSUB
DJNZ
TJNZ
TJZ
WAITPEQ
WAITPNE
WAITCNT

There are of course some instructions whose opcodes haven't changed but that have enhanced capability in P2 like the RD/WR BYTE/WORD/LONG instructions.
And there are *many* new instructions.

One thing I noticed though is that you can't just use any instruction with 0000 in the CCCC field as a NOP because the REPS instruction uses those bits for other purposes.

cgracey · 2012-12-09 20:24

FredBlais wrote: »

I have an old DE2 Cyclone II EP2C35F672 board lying around, is it compatible with the propeller 2 binary?

No. It takes me a day to get a working compilation for a particular device/board combination. Then, I must maintain it as we move forward. I don't think it's worth doing for an FPGA that is three generations old. I wish you had some new Cyclone V board like this:

http://www.altera.com/products/devkits/altera/kit-cyclone-v-e.html

That board would be worth doing the work for, as it might enable 100MHz Prop2 emulation.

When looking at Altera's page on their forthcoming 20nm FPGA technology (http://www.altera.com/technology/system-tech/20nm/20nm-technology-features.html), I read this:

...Applications of this very-high-performance DSP capability in mixed-system fabric include high-performance and low-latency financial calculations, high-resolution broadcast and wireless, and advanced military sensors and battle-field awareness.

Someone had mentioned here about using FPGA's for high-speed stock trading. Altera seems to be aiming their new parts at this esoteric market. I've read how certain too-big-to-fail banks have massive low-latency connections to the main trading systems at the NYSE and CME. By intercepting asks and bids, they execute meta-trades within milliseconds that front run trades by regular dolts like us, and probably everyone else. That's how they can make money, no matter what the market does. The consensus at zerohedge.com is that the computer algorithms are about the only players left in the markets today, as volumes are quite low. The average stock share is now owned for only 20 seconds. Meanwhile, the Fed is said to be the only remaining buyer of government debt. I'd say there's a good future in all this.

Bill Henning · 2012-12-09 20:33

Chip,

I just finished experimenting with RDQUAD.

- I really like that you can have seven instructions in a RDQUAD "shadow"
- there need to be at least five instructions before the first instruction fetched by RDQUAD can execute.

I am posting the revised LMM2 RDQUAD loop in my thread... first time through the loop the fetched values are not executed
