Propeller II programming questions to Chip

David Betz · 2013-01-10 04:36

cgracey wrote: »

I forgot to mention this in the docs. I'll add it tonight.

In the meantime, you must use 'SETPORT D/#n' where bits 5 and 6 of the operand determine which port (0..3) the WAITPEQ/WAITPNE will look at.
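To make the bit layout concrete, here is a minimal Python sketch of how bits 5 and 6 of the SETPORT operand would select the port. The function name is hypothetical, and the thread does not say what the other operand bits do, so the sketch simply ignores them.

```python
def setport_port(operand: int) -> int:
    """Return the port index (0..3) held in bits 5 and 6 of the operand.

    Assumption: only bits 5 and 6 matter for port selection, as stated
    in the post above; all other bits are ignored here.
    """
    return (operand >> 5) & 0b11


# Walk through the four possible values of bits 6..5.
for operand in (0b0000000, 0b0100000, 0b1000000, 0b1100000):
    print(f"operand={operand:#09b} -> port {setport_port(operand)}")
```

Any value in the low five bits leaves the result unchanged, which is why the example only varies bits 5 and 6.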

Thanks Chip!

potatohead · 2013-01-10 08:43

Thanks Chip. I was just looking those over confused... Nice!

Is the text document linked above inclusive? In other words, are you just adding to your original source one and posting here Chip?

cgracey · 2013-01-11 03:26

potatohead wrote: »

Thanks Chip. I was just looking those over confused... Nice!

Is the text document linked above inclusive? In other words, are you just adding to your original source one and posting here Chip?

I keep adding to the same document, but I also keep making changes to older parts of it. In the end, it will be good to read the whole thing from beginning to end to get an updated overview of things.

potatohead · 2013-01-11 07:54

I'll treat them as atomic then. Won't hurt to re-read

Thanks.

mindrobots · 2013-01-11 08:14

potatohead wrote: »

I'll treat them as atomic then. Won't hurt to re-read

Thanks.

I've been trying to do that. Of course, with my memory, it's like I'm getting a new document each time!

Sapieha · 2013-01-22 14:14

Hi Chip.

It may be too late, but I have looked at these instructions,
and I am not sure they are that usable.

000011 ZCR 1 CCCC DDDDDDDDD 1100bbbbb   |   SETBC       |   D.b     | Set bit "b" (0-31) of register "D" (0-511) to C.
000011 ZCR 1 CCCC DDDDDDDDD 1101bbbbb   |   SETBNC      |   D.b     | Set bit "b" (0-31) of register "D" (0-511) to NC.
000011 ZCR 1 CCCC DDDDDDDDD 1110bbbbb   |   SETBZ       |   D.b     | Set bit "b" (0-31) of register "D" (0-511) to Z.
000011 ZCR 1 CCCC DDDDDDDDD 1111bbbbb   |   SETBNZ      |   D.b     | Set bit "b" (0-31) of register "D" (0-511) to NZ.

In my opinion, these would be more useful for writing CPU emulators:

000011 ZCR 1 CCCC DDDDDDDDD 1100bbbbb   |   SETBC       |   D.b     | Copy bit "b" (0-31) of register "D" (0-511) to C.
000011 ZCR 1 CCCC DDDDDDDDD 1101bbbbb   |   SETCB       |   D.b     | Copy C to bit "b" (0-31) of register "D" (0-511).
000011 ZCR 1 CCCC DDDDDDDDD 1110bbbbb   |   SETBZ       |   D.b     | Copy bit "b" (0-31) of register "D" (0-511) to Z.
000011 ZCR 1 CCCC DDDDDDDDD 1111bbbbb   |   SETZB       |   D.b     | Copy Z to bit "b" (0-31) of register "D" (0-511).

Roy Eltham · 2013-01-22 15:46

Aren't there already instructions to copy the flags to/from a register?

Ariba · 2013-01-22 18:20

The SETBC/SETBNC/SETBZ/SETBNZ instructions already copy the state of the C or Z flag to the bit.
If you want to copy a bit from one register into a bit of another register, you will need 2 instructions:

  isob  reg1,#bit1  wc,nr   ' C := bit1 of reg1; NR leaves reg1 unchanged
  setbc reg2,#bit2          ' bit2 of reg2 := C

It may be a good idea to make a pseudo-instruction for ISOB with the NR effect (like TEST is an AND with NR):

isob dst,#bit nr   =  getb dst,#bit
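For readers without the FPGA at hand, the two-instruction bit copy can be modelled in Python. This is only a sketch of the behaviour as described in this post, not something checked against the hardware; the helper names are hypothetical and registers are assumed to be 32 bits wide.

```python
MASK32 = 0xFFFFFFFF  # assumption: cog registers are 32 bits wide


def read_bit_to_c(reg: int, bit: int) -> int:
    """Model of 'isob reg,#bit wc,nr' as used above: C := selected bit.

    The register itself is not modified (NR effect).
    """
    return (reg >> bit) & 1


def setbc(reg: int, bit: int, c: int) -> int:
    """Model of 'setbc reg,#bit': write the C flag into the selected bit."""
    cleared = reg & ~(1 << bit) & MASK32
    return cleared | ((c & 1) << bit)


def copy_bit(reg1: int, bit1: int, reg2: int, bit2: int) -> int:
    """Copy bit1 of reg1 into bit2 of reg2 via the C flag; returns new reg2."""
    c = read_bit_to_c(reg1, bit1)
    return setbc(reg2, bit2, c)


# Copy bit 3 of %1000 into bit 0 of an all-zero register.
print(f"{copy_bit(0b1000, 3, 0, 0):#x}")
```

The intermediate value `c` plays the role of the carry flag, which is why the copy needs two steps rather than one.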

Andy

potatohead · 2013-01-23 09:38

Haven't we already made the shuttle?

FredBlais · 2013-01-23 09:44

potatohead wrote: »

Haven't we already made the shuttle?

I think the answer to your question is in this thread. http://forums.parallax.com/showthread.php/145201-Shuttle-today?p=1156240#post1156240

Fred

potatohead · 2013-01-23 09:58

Thanks! Too many threads. I'm just getting back from a conference in NOLA, got sick as a dog (seriously, use that hand cleaner well if you are there), and am just now catching back up.

Sapieha · 2013-03-13 13:38

Hi Chip.

Do you have any info or a demo on strobed I/O?

cgracey · 2013-03-13 14:26

Sapieha wrote: »

Hi Chip.

Do you have any info or a demo on strobed I/O?

What do you mean by strobed I/O?

Sapieha · 2013-03-13 14:34

Hi Chip.

This mode

xxxx CIOHHHLLL      | C | OUT / IN   | 0 | Live   | 1 | Clocked

cgracey wrote: »

What do you mean by strobed I/O?

Sapieha · 2013-03-13 14:50

Hi Chip.

I edited my previous post to make it clearer.

cgracey · 2013-03-13 15:01

Sapieha wrote: »

Hi Chip.

This mode

xxxx CIOHHHLLL      | C | OUT / IN   | 0 | Live   | 1 | Clocked

I understand now. You are talking about pin modes which are selectable on the actual chip, but not possible to fully implement on the FPGA. I could implement part of the pin circuitry on the FPGA, but not much, so I didn't bother doing it at all. We'll have to wait for the chip.

Sapieha · 2013-03-13 15:12

Hi Chip.

If it is not on the emulator, it is not of interest in this phase of experiments.

Thanks

cgracey wrote: »

I understand now. You are talking about pin modes which are selectable on the actual chip, but not possible to fully implement on the FPGA. I could implement part of the pin circuitry on the FPGA, but not much, so I didn't bother doing it at all. We'll have to wait for the chip.

Sapieha · 2013-03-24 02:15

Hi Chip.

How do I use MUL and DIV?
I see these instructions in the instruction set, but I can't find how to use them.
Any help is appreciated, with examples if possible.

GETMULL D WC attempt to get lower multiplier result
GETMULH D WC attempt to get upper multiplier result
GETDIVQ D WC attempt to get divider quotient result
GETDIVR D WC attempt to get divider remainder result

Sapieha · 2013-03-25 00:46

Hi Chip.

You have a simple multiply instruction, but why not a simple divide?

| MUL | D,S#n |
| DIV | D,S#n | Why not that one ?

cgracey · 2013-03-25 10:37

Sapieha wrote: »

Hi Chip.

You have a simple multiply instruction, but why not a simple divide?

| MUL | D,S#n |
| DIV | D,S#n | Why not that one ?

The reason is that there is no such thing as a simple (as in 'fast') divide technique. Division must be performed on a step-by-step test-case basis, unlike multiplication which is deterministic. There are two asynchronous 16-over-8-bit dividers in the texture mapper to handle the Z-perspective correction and they take 3 clocks just to settle. Their latency is the main reason GETPIX takes three clocks.

The 64-over-32-bit divider in each cog takes 17 clocks, being a radix-4 divider. It tests 4 cases of subtraction on each clock to generate 2 bits of quotient. The 17th clock is needed to possibly negate the quotient and remainder results, in case the operation was signed and the results were due to be negative.
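The step-by-step nature of that divider can be sketched in software. The Python model below is an illustration only: it assumes a restoring radix-4 scheme with a 64-bit dividend, a 32-bit divisor and a 32-bit quotient, and a final sign-fix step. The function name, the overflow check and the operand handling are assumptions, and real hardware would evaluate the four subtraction cases in parallel rather than in a loop.

```python
def radix4_divide(dividend: int, divisor: int, signed: bool = False):
    """Model a 64-over-32 radix-4 divide; returns (quotient, remainder, clocks)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor is zero")

    negate_q = negate_r = False
    if signed:
        # Work on magnitudes and remember which results need negating.
        negate_q = (dividend < 0) != (divisor < 0)
        negate_r = dividend < 0
        dividend, divisor = abs(dividend), abs(divisor)

    if (dividend >> 32) >= divisor:
        raise OverflowError("quotient does not fit in 32 bits")

    remainder = dividend >> 32        # upper half seeds the partial remainder
    low = dividend & 0xFFFFFFFF       # lower half is shifted in 2 bits per clock
    quotient = 0
    clocks = 0

    for step in range(16):            # 16 clocks x 2 bits = 32 quotient bits
        shift = 30 - 2 * step
        remainder = (remainder << 2) | ((low >> shift) & 0b11)
        # Test the 4 cases (subtract 3x, 2x, 1x or 0x the divisor).
        for digit in (3, 2, 1, 0):
            if remainder >= digit * divisor:
                remainder -= digit * divisor
                break
        quotient = (quotient << 2) | digit
        clocks += 1

    # 17th clock: possibly negate the results of a signed operation.
    if negate_q:
        quotient = -quotient
    if negate_r:
        remainder = -remainder
    clocks += 1

    return quotient, remainder, clocks


print(radix4_divide(100, 7))
print(radix4_divide(-100, 7, signed=True))
```

Each pass through the loop corresponds to one clock that yields two quotient bits, so the clock count comes out at 16 plus the final negate step.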

Sapieha · 2013-03-25 10:40

Hi Chip.

Thanks for explanation.

cgracey · 2013-03-25 10:42

Sapieha wrote: »

Hi Chip.

How do I use MUL and DIV?
I see these instructions in the instruction set, but I can't find how to use them.
Any help is appreciated, with examples if possible.

GETMULL D WC attempt to get lower multiplier result
GETMULH D WC attempt to get upper multiplier result
GETDIVQ D WC attempt to get divider quotient result
GETDIVR D WC attempt to get divider remainder result

I will document this after I get the SDRAM working.

Sapieha · 2013-03-25 10:43

Hi Chip.

Thanks

Sapieha · 2013-03-25 11:10

Hi Chip.

In the attachment you can see the DIVIDE.VHD that I use, which does not need settle time.

cgracey wrote: »

The reason is that there is no such thing as a simple (as in 'fast') divide technique. Division must be performed on a step-by-step test-case basis, unlike multiplication which is deterministic. There are two asynchronous 16-over-8-bit dividers in the texture mapper to handle the Z-perspective correction and they take 3 clocks just to settle. Their latency is the main reason GETPIX takes three clocks.

The 64-over-32-bit divider in each cog takes 17 clocks, being a radix-4 divider. It tests 4 cases of subtraction on each clock to generate 2 bits of quotient. The 17th clock is needed to possibly negate the quotient and remainder results, in case the operation was signed and the results were due to be negative.

cgracey · 2013-03-25 11:35

Sapieha wrote: »

Hi Chip.

In the attachment you can see the DIVIDE.VHD that I use, which does not need settle time.

It still needs settling time - not in clocks, but in nanoseconds. You can find out what the settling time is by putting flops on the inputs and outputs and then compiling it and checking the Fmax. You cannot clock that thing very quickly, compared to a multiplier.

For big dividers, you must have them work over multiple clocks with flip-flops, because they'd never be able to keep up with the main clock as asynchronous circuits. Even in a 16-over-8 asynchronous divider, the critical path is through a few hundred standard cells in an ASIC. It's fewer cells in an FPGA, because those cells are more complex, but even FPGAs will often have dedicated math circuits to overcome the logic fabric's speed limitations for math operations.

Sapieha · 2013-03-25 11:45

Hi Chip.

Thanks.
Now I understand what you mean by settle time.

Cluso99 · 2013-03-25 15:11

cgracey wrote: »

The reason is that there is no such thing as a simple (as in 'fast') divide technique. Division must be performed on a step-by-step test-case basis, unlike multiplication which is deterministic. There are two asynchronous 16-over-8-bit dividers in the texture mapper to handle the Z-perspective correction and they take 3 clocks just to settle. Their latency is the main reason GETPIX takes three clocks.

The 64-over-32-bit divider in each cog takes 17 clocks, being a radix-4 divider. It tests 4 cases of subtraction on each clock to generate 2 bits of quotient. The 17th clock is needed to possibly negate the quotient and remainder results, in case the operation was signed and the results were due to be negative.

This is a really impressive divide. Wow 17 clocks!!! Now, if we could just clock this thing at 1GHz

cgracey · 2013-03-25 16:54

Cluso99 wrote: »

This is a really impressive divide. Wow 17 clocks!!! Now, if we could just clock this thing at 1GHz

$3M would do it!

Cluso99 · 2013-03-25 18:57

cgracey wrote: »

$3M would do it!

Wouldn't $3M get me 3GHz? Maybe it is only $2.9M for 1GHz.
Of course I would also be expecting more SRAM for this.

I have my lotto ticket in

Cluso99 · 2013-03-26 18:45

Chip: Is there any restriction in pnut.exe that prevents it from compiling programs greater than 2KB?
Post edit: I just discovered that I had already found this restriction in December. Oops. It has been confirmed. Use p2load by David Betz instead to load a pnut binary image.
http://forums.parallax.com/showthread.php/144384-p2load-A-Loader-for-the-Propeller-II

pnut appears to cut the load at $165F and then fills 32 bytes (8 longs) with $00000001. This would mean $E80..$167F = $0800 = 2KB. I have tried making a new DAT section and also an ORG 0. Neither fixes this.
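The address arithmetic behind that 2KB figure can be checked with a few lines of Python. The variable names are made up for illustration; the addresses and the 32-byte fill are taken from the paragraph above.

```python
LOAD_START = 0x0E80   # first address of the range quoted above
CUT_OFF = 0x165F      # last byte that pnut appears to load correctly
FILL_BYTES = 32       # 8 longs of $00000001 that follow the cut

fill_end = CUT_OFF + FILL_BYTES        # last byte of the filled area
total = fill_end - LOAD_START + 1      # inclusive byte count of the range

print(f"${LOAD_START:04X}..${fill_end:04X} = ${total:04X} bytes")
```

The range is inclusive at both ends, hence the `+ 1` when counting bytes.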

Attached is a code sample. I have patched my code to go straight to the ROM Monitor after a 5 sec delay. If you then examine $1600-$16FF you will see the "==== HUB END ===" followed by some $33 ("3") bytes. In my code (at the end) is a byte $33[256] which should output 256 x $33 but they are truncated at $165F.

LMM_SerialDebugger_025_bug.spin
