Latency from WAITPEQ (or WAITPNE) to sampling an input pin in the next instruct

godzich · 2008-11-11 16:11

Hi,

I'm in the middle of writing code for a very high speed, application-specific serial data receiver. The technical detail I'm asking about could also interest other people who try to squeeze every drop of performance out of the Propeller...

I need to know the following (not mentioned anywhere in the documentation), so I hope Chip, or someone else very familiar with the Propeller's inner beauty, can answer. Even better, pin I/O timing should be added to the next release of the data sheet, where it belongs in the first place. Rayman is also asking elsewhere in this forum for the I/O pin timings, so this IS important technical info that is still not (publicly) documented.

Assume the following pasm code:

   WAITPEQ state, #mask            'wait for pin X to change state
   AND     zero,  #mask    wc, nr  'set Carry flag according to pin X's state (zero=0 and kept 0)

My question is (assuming the maximum safe 80MHz system clock frequency):

1. From the exact moment the pin X changes state - how long (in nanoseconds) does it take before the same pin X is sampled, in the next AND operation ?
2. How much jitter can be anticipated here structurally (in nanoseconds)?

The main interest here is to know whether the COG spends whole 50 ns cycles, or multiples thereof, during the WAITPEQ instruction.
Or is the COG on hold in a low power state waiting for the transition, and does it then start executing the next instruction at the next SYSTEM clock rising transition, or at the next SYSTEMCLK / 4 clock transition?

I know this IS a technical question, but I bet Chip can just pull the answer out of his sleeve :) Hopefully I was able to explain clearly enough what I'm trying to figure out...

Thanks for all help in advance!!!

Sincerely,

Christian

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
The future does not exist - we must invent it!

Post Edited (godzich) : 11/11/2008 4:18:55 PM GMT

Phil Pilgrim (PhiPi) · 2008-11-11 16:38

Christian,

Notwithstanding your main question, your example code will never set the carry flag, since you're ANDing with zero. Moreover, your AND statement bears no relation to any pin state. To recheck the pin, you will have to define a LONG with your mask value and check the pin like this:

        TEST mask_value,INA wc

-Phil

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
'Just a few PropSTICK Kit bare PCBs left!

Beau Schwabe · 2008-11-11 18:21

godzich,

Phil is correct. As far as timing is concerned, the WAITPEQ will take 6 clocks, but with jitter it could take 6, 7, or 8 clocks. Since each clock is 12.5 ns at 80 MHz, the WAITPEQ will take 75 ns, 87.5 ns, or 100 ns.

Keep in mind though... TESTing INA after the WAITPEQ command might negate the event you are waiting for in the first place.
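
The clock counts quoted above can be converted to nanoseconds with a few lines of arithmetic. This is only a sketch of that conversion, assuming the 80 MHz system clock from the question; the function name `clocks_to_ns` is made up for illustration and says nothing about where inside an instruction the pin is actually sampled.

```python
# Convert system clock ticks into nanoseconds.
SYSCLK_HZ = 80_000_000  # 80 MHz, as assumed in the thread


def clocks_to_ns(clocks, sysclk_hz=SYSCLK_HZ):
    """Return the duration of `clocks` system clock ticks in nanoseconds."""
    return clocks * 1e9 / sysclk_hz


# 6, 7 and 8 clocks are the WAITPEQ figures quoted in the post above.
waitpeq_ns = [clocks_to_ns(c) for c in (6, 7, 8)]

# Spread between the shortest and the longest case.
jitter_ns = max(waitpeq_ns) - min(waitpeq_ns)

print(waitpeq_ns, jitter_ns)
```

Passing a different `sysclk_hz` gives the same figures for other clock settings, for example `clocks_to_ns(6, 100_000_000)` for a 100 MHz system clock.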

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe

IC Layout Engineer
Parallax, Inc.

godzich · 2008-11-11 18:43

Thanks Phil for correcting me. That was my first attempt, before I also realised that the TEST instruction is the one to use :) Probably an OR instruction could be used as well...

But back to the original question, which I got an answer for, but not quite a complete one.

And thanks Beau for your input so far.

I fully understand that the next instruction after WAITPEQ is probably not very useful, and in the actual code it would be replaced by one or two NOPs. So here is my newest code example:

WAITPEQ state,     #mask            'wait for pin X to change state
NOP
NOP
TEST mask_value,   INA     wc       'get pin X state into C

The question remains (assuming 80MHz as the sysclk):

After the transition of the pin in the WAITPEQ instruction, after how many nanoseconds is the input pin sampled in the example above?

And how big is the sampling time jitter (min, average, max)? And what is the main reason for it?

After all, this is a piece of information that is hard or impossible to test in the lab without proper and expensive test equipment.

Is it so that the WAITPEQ instruction is actually executed almost completely (it must at least be started) and then execution halts until the condition is met? Can it still take 5 or more clocks before program execution continues? This seems strange.

Sorry but the whole picture is still a little blurry. I appreciate your further input here very much !!!

Christian


Phil Pilgrim (PhiPi) · 2008-11-11 19:12

Christian,

This is something you could determine empirically, using another cog to generate a pulse, along with a WAITCNT to determine the width. Then turn on an LED if the trailing edge of the pulse is detected by the TEST instruction.

This will give you an answer for signals synchronized to the system clock. For asynchronous signals, you will have to use another Propeller chip.

-Phil


godzich · 2008-11-11 19:54

Hi Phil,

You said it: "This is something you could determine empirically...". Those are the key words! Why do I have to determine something empirically when a hard fact is guaranteed to exist, one that should reside in the Propeller's data sheet in the first place? It cannot be the responsibility of the end users to figure out some BASIC functionality of the processor!!! Hobbyists might have time for experimenting, but why re-invent the wheel?

You probably misunderstood my idea. The idea was to wait for a start bit using the WAITPEQ instruction, then execute one or two NOPs to let the start bit pass, then execute a string of test and shift instructions to read in a data stream of some 10..20 bits, and then store this data during a longer stop bit. And then resync to the next start bit. I am hoping to input raw data at a speed of 5..8 Mbit/s using just one COG.

Practically an async protocol, but just a receiver, with a non-standard number of data-carrying bits. The same cog would not be doing anything else.

IF the instruction following the WAITPEQ instruction would sync itself within 12.5 ns precision to the pin signal edge releasing the execution (regardless of whatever other fixed delay is involved), then it would be possible to input data streams at remarkable speed.

However, it is probably so that all COGs run in "unison", all 8 of them starting execution of the next assembly instruction at the next system clock divisible by 4... I was hoping that COGs could run "out of phase", especially after a wait-for-something instruction, at some 12.5 ns, 25 ns or 75 ns offsets to the others. But that was probably just wishful thinking... Would have been handy, though...
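
For the 5..8 Mbit/s target mentioned above, the per-bit time budget can be estimated as follows. This is a rough sketch, assuming an 80 MHz system clock and 4 clocks per ordinary instruction (the figure used elsewhere in this thread); `bit_budget` is a hypothetical helper name, and the 10 Mbit/s entry is added only for comparison.

```python
# Per-bit budget for an unrolled receive loop on one cog.
SYSCLK_HZ = 80_000_000      # 80 MHz system clock, as assumed in the thread
CLOCKS_PER_INSTRUCTION = 4  # ordinary instructions such as NOP, per the thread


def bit_budget(bitrate_hz, sysclk_hz=SYSCLK_HZ):
    """Return (clocks per bit, 4-clock instructions per bit) for a bit rate."""
    clocks = sysclk_hz / bitrate_hz
    return clocks, clocks / CLOCKS_PER_INSTRUCTION


for rate in (5_000_000, 8_000_000, 10_000_000):
    clocks, instructions = bit_budget(rate)
    print(f"{rate / 1e6:.0f} Mbit/s: {clocks:g} clocks, "
          f"{instructions:g} instructions per bit")
```

Under these assumptions a bit at 8 Mbit/s lasts 10 clocks, which is not a whole number of 4-clock instructions, so a plain test-and-shift pair (8 clocks) would drift against the bit stream unless the sender's bit time is matched to the instruction timing.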

Christian


Beau Schwabe · 2008-11-12 04:12

godzich,

This thread probably does much of the timing you're describing... Using a single I/O you can reach throughput speeds of up to 8.42 Mega Baud ... Using 4 I/O's you can reach a throughput of 14.5 Mega Baud while still using only 1 cog.

http://forums.parallax.com/showthread.php?p=691952

Your timing question is answered on pages 350 and 351 of the Propeller Manual. Each NOP will require 4 clocks, so two successive NOPs will add a 100 ns delay if the Prop is running at 80 MHz.


godzich · 2008-11-12 10:13

Hi,

Thanks Beau for the link. It is proof that most things have already been done by someone else :)

I was aiming at something very similar, just using a slightly lower bit count in one transmission word. Therefore I wanted to know how the COG is resynced back to execution when waiting for a pin edge transition. I also wanted to know at what time (within one instruction) the INA pins are sampled, to plan the timing not empirically, but scientifically...

I will probably find those answers, or at least a practical implementation using these delays, in the link you provided. Thanks.

Your last comment indicated to me that you did not understand my original question. I, and everyone else, know that instructions in general (such as NOP) take 4 clock cycles to execute... Still, I would like to see some timing diagrams clarifying such I/O timing details in the next data sheet.

Christian

EDIT: I read your thread. Nice work! But it only proves my point: only someone KNOWING the (non-published) internals of the Propeller could design such a protocol.

I don't get it - why do you keep these things secret? It's as if you would not like people to use the Propeller for serious applications. As long as such technical details (like the I/O timing) are kept hidden, people will consider the Propeller more of a hobby processor. And that is very, very sad indeed. It deserves more serious attention, but that would also require more serious documentation. Sorry to criticize, but I try to do it in a constructive way.

EDIT2: After reading and looking at your code, it is now clear enough how the COG works internally when executing WAITPxx instructions. Thanks for the info. However, other people might ask the same questions again, and therefore it would be clever to add such information to the data sheet. I'm speaking from experience, since I was once responsible for technical documentation and support at my workplace. I added all the best FAQs to the technical manuals on a constant basis, and soon the email and phone traffic was reduced to a minimum. I just told them "RTFM"!!! :)


Post Edited (godzich) : 11/12/2008 4:12:20 PM GMT

godzich · 2008-11-12 17:24

Hi Beau,
Sorry for being stubborn or stupid or both, but in this serial communication scheme of yours, the data bits are not sampled at the middle of the bit time, but only about 25 ns after the bit transition, which occurs every 100 ns.

Is this correct, or do I still not get it?

Alternatively, could the start bit sequence be extended to double its length, to place the bit sampling time in the middle of the bit time? You would of course get better noise immunity, but at the expense of a slightly lower baud rate...

Following this idea, I would like to use the following sequence to catch the start bit, to avoid the need to generate frequencies twice as high as the data bit rate, especially if you use external differential drivers/receivers (the proper way of doing it if the distance is long). Would this work as intended (sampling in the middle of the data bits), assuming I also modify the transmitter accordingly?

 waitpeq    RX_PinMask, RX_PinMask      'Wait for SYNC - HIGH   (5 clocks, 12.5 ns sampling offset)
 waitpeq    RX_PinMask, RX_PinMask      'Wait for SYNC - HIGH   (5 clocks, 25.0 ns sampling offset)
 waitpne    RX_PinMask, RX_PinMask      'Wait for SYNC - LOW    (5 clocks, 37.5 ns sampling offset)
 waitpne    RX_PinMask, RX_PinMask      'Wait for SYNC - LOW    (5 clocks, 50.0 ns sampling offset)

 test       RX_PinMask, ina wc          'Read RX pin into "C"
 rcl        Buff,#1                     'Rotate Buff left 
 test       RX_PinMask, ina wc          'Read RX pin into "C"
 ...
 ...
 ...

Thanks for your help!

Christian

PS: Can you enlighten me about the technique you mention for increasing the bandwidth using parallel data lines?!


Post Edited (godzich) : 11/12/2008 5:30:56 PM GMT

Beau Schwabe · 2008-11-12 22:16

godzich,

"...in this serial communication scheme of yours - the data bits are not sampled at the middle of the bit time, but about only 25 ns after the bit transition, that occurs every 100ns." - actually they are... It was discovered sometime after I wrote that code that the waitpeq takes 6+ clocks and not 5+.

5+ vs. 6+ discussion thread here... http://forums.parallax.com/showthread.php?p=722987

For the TX/RX code that I put together, the 6+ still works out; just shift my original Start bit #2 and Data points accordingly.

Each clock tick represents 12.5 ns
      Start bit #2 begins read here
       │
 11112222dddddddd <- Beginning of Serial Data Stream
 │           │
 │          Data begins read here
 │
Start bit #1 begins read here
1 - Start Bit 1
2 - Start Bit 2
d - data

"Can you enlighten me about the technique you mention how to increase the bandwidth using parallel data lines?!" - The 14.5 Mega Baud version is an example of parallel data transfer.


Rayman · 2008-11-13 01:55

godzich said...
(assuming the maximum safe 80MHz system clock frequency):

100 MHz is also safe! But 128 MHz is not safe...

godzich · 2008-11-13 13:53

Hi,

Are you serious about running the Prop with a 100 MHz sysclk, for example using a 12.5 MHz xtal and the multiplier set to x8?

Is every Propeller chip guaranteed to run at this frequency?

Over the industrial temperature range, and with a VDD tolerance of +-5%?

That would be neat!!! Or would it be pushing your luck in a commercial product, and suited only for less serious projects?

Christian


Rayman · 2008-11-13 17:16

I've run it at PLL16X with a 6.25 MHz crystal.
The new datasheet (V1.1) shows it can run safe at 100 MHz...

Paul Baker · 2008-11-13 17:39

There is no guarantee; however, we reject chips which cannot run at 100 MHz at room temperature. We test each chip at 104 MHz at room temperature to ensure the chip will operate at 80 MHz over the entire temperature range. Since the two parameters are interrelated, stretching one provides an indicator of the other without having to go through the much more expensive process of temperature cycle testing every chip.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer

Parallax, Inc.

godzich · 2008-11-13 17:44

Paul - thanks for this clarification, direct from the "lion's den" ;)

Highly appreciated !!

Cheers,

Christian


Paul Baker · 2008-11-13 17:50

Christian,
One further point: I don't think you would be able to use a 12.5 MHz crystal, since this means the PLL would be operating at 200 MHz. I've seen products use a 10 MHz crystal, which results in the PLL operating at 160 MHz, but I would not expect 200 MHz to work. Everyone I've seen operate their Propeller at 100 MHz has used a 6.25 MHz crystal (not easy to find, but they do exist).
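
The crystal figures above follow from the PLL running at 16 times the crystal frequency, with the system clock taken from one of its taps. A small sketch of that arithmetic, assuming the x1/x2/x4/x8/x16 tap scheme; the helper names `pll_mhz` and `sysclk_mhz` are invented for illustration, and no hard PLL limit is encoded here since none is stated beyond the remarks in the post.

```python
# PLL and system clock frequencies for a given crystal.
PLL_MULTIPLIER = 16  # the internal PLL runs at 16x the crystal frequency


def pll_mhz(crystal_mhz):
    """Internal PLL frequency in MHz for a given crystal."""
    return crystal_mhz * PLL_MULTIPLIER


def sysclk_mhz(crystal_mhz, tap):
    """System clock in MHz for a PLL tap setting of 1, 2, 4, 8 or 16."""
    return pll_mhz(crystal_mhz) * tap / PLL_MULTIPLIER


# (crystal MHz, tap) pairs taken from the thread.
for crystal, tap in ((5.0, 16), (6.25, 16), (10.0, 8), (12.5, 8)):
    print(f"{crystal} MHz xtal, x{tap}: PLL {pll_mhz(crystal)} MHz, "
          f"sysclk {sysclk_mhz(crystal, tap)} MHz")
```

The last entry shows the problem raised above: a 12.5 MHz crystal with the x8 setting gives a 100 MHz system clock, but only by running the PLL itself at 200 MHz.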


godzich · 2008-11-13 18:25

Sad to have to admit publicly what a moron I am. Sorry, I never used any Xtal other than 5 MHz and forgot this detail. Sad also because I have a huge pile of 12.5 MHz Xtals. How about adding a prescaler after the Xtal oscillator and before the PLL input in the Prop II ;) ??

Christian


Paul Baker · 2008-11-13 19:26

Yeah you could use a crystal driver circuit as an input to a divide by two circuit before feeding it to the Propeller.


godzich · 2008-11-14 09:01

Hi,

Well, adding external clk circuitry and dividers would be easy, but would add to the complexity. The beauty of the Propeller is the minimal amount of support components needed! Next I'll try to hunt down some 6 MHz Xtals, which should be easier to get than 6.25 MHz ones. I'm probably also more on the safe side running only at room temperature at 6 MHz x 16 = 96 MHz, especially as that is marginally lower than your test frequency.

And thanks for the new 1.1 Prop datasheet!!! I noticed that the fix for the waitpxx clock count error (it says 5+ but should be 6+) did not make it into this release :(

When will you release the new Prop manual??

Cheers,

Christian


Paul Baker · 2008-11-14 17:24

Ah, you're right, the 6+ didn't get added. I'll make a note of it. The revised Prop manual is still a few months off.

