[FYI] waitpxx cog synchronisation
kuroneko
Disclaimer: It's not rocket science, but given what people think is acceptable, here are a few words of advice. The list below may not be complete, but it contains what's usually suggested in this forum and what I prefer to use.
- cogs waitpxx on a pin which is controlled e.g. by a master cog
- cogs waitcnt for a specific system counter value
- hub-window utilisation
Delay maps
What's a delay map? Well, it's simple. Cog A sends out a signal (rising edge) on pin B. Cog C does a waitpxx on pin B, looking for that rising edge. Both events are time-stamped, for all valid cog pairings and all available pins. Ideally the delay is always the same. Unfortunately it isn't. I looked at a number of different ways of generating a rising edge, and in the end it boils down to outa and phsx. The former simply sets a bit, while the latter can use any known way of affecting a counter output.
I'll attach delay maps for the PPDB with a naked prop (no EEPROM) testing all pins excluding 30/31 and for my rev G demoboard with a limited pin map. The hard-wired (expected) delay is 3 cycles (dash), deviations are marked with # (4 cycles). Unused pins are marked with a dot.
What we find is that while the outa method is superior to phsx it is not perfect. Let's look at the sections where cog 0/1 is the edge generator.
     0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
T.R: 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
outa
0.1: - - - - - - - - - - - - - - - - # # # # # # # - - # # # # # . .
0.2: - - - - - - - - - - - - - - - - # # # # # # # - - # # # # # . .
0.3: - - - - - - - - - - - - - - - - # # # # # # # - - # # # # # . .
0.4: - - - - - - - - - - - - - - - - # # # # # # # - - - # # # # . .
0.5: - - - - - - - - - - - - - - - - # # # # # # - - - - # # # # . .
0.6: - - - - - - - - - - - - - - - - # # # # # # - - - - # # # # . .
0.7: - - - - - - - - - - - - - - - - # # # # # - - - - - - # # # . .
1.0: - - - - - - - - - - - - - - - - # # # - - - - - - - - - - - . .
1.2: - - - - - - - - - - - - - - - - - # - - - - - - - - - - - - . .
1.3: - - - - - - - - - - - - - - - - - # - - - - - - - - - - - - . .
1.4: - - - - - - - - - - - - - - - - - # - - - - - - - - - - - - . .
1.5: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - . .
1.6: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - . .
1.7: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - . .
So if we look at pin 25 it turns out that cogs 1..3 will get the signal one clock cycle later than cogs 4..7. Or rather their waitpxx h/w sees it slightly later. NG
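If you'd rather not read the map by eye, a small host-side script can pull out the late receivers for any pin. This is only a sketch in Python, not part of the test program: parse_rows and late_receivers are names I made up, and it assumes the rows look like the cog 0 rows in the outa map above (label "T.R:", then one cell per pin 0..31).

```python
# Rows for transmitter cog 0, copied from the outa map in this post.
ROWS = """\
0.1: - - - - - - - - - - - - - - - - # # # # # # # - - # # # # # . .
0.2: - - - - - - - - - - - - - - - - # # # # # # # - - # # # # # . .
0.3: - - - - - - - - - - - - - - - - # # # # # # # - - # # # # # . .
0.4: - - - - - - - - - - - - - - - - # # # # # # # - - - # # # # . .
0.5: - - - - - - - - - - - - - - - - # # # # # # - - - - # # # # . .
0.6: - - - - - - - - - - - - - - - - # # # # # # - - - - # # # # . .
0.7: - - - - - - - - - - - - - - - - # # # # # - - - - - - # # # . .
"""


def parse_rows(text):
    """Return {(tx, rx): [cell, ...]} with one cell per pin 0..31."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        label, cells = line.split(":", 1)
        tx, rx = (int(part) for part in label.split("."))
        table[(tx, rx)] = cells.split()
    return table


def late_receivers(table, tx, pin):
    """Receiver cogs that saw the edge from cog `tx` on `pin` one cycle late ('#')."""
    return sorted(rx for (t, rx), cells in table.items()
                  if t == tx and cells[pin] == "#")


table = parse_rows(ROWS)
print(late_receivers(table, tx=0, pin=25))   # prints [1, 2, 3]
```

For pin 25 this gives cogs 1..3, which matches the observation above. Other pins can be checked the same way by changing the pin argument.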
For comparison the other end (cog 6/7):
6.0: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - . .
6.1: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - . .
6.2: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - . .
6.3: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - . .
6.4: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - . .
6.5: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - . .
6.7: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - . .
7.0: - - - - - - - - - - - - - # # # - - - - - - - - - - - - - - . .
7.1: - - - - - - - - - - - - - # # # - - - - - - - - - - - - - - . .
7.2: - - - - - - - - - - - - - # # # - - - - - - - - - - - - - - . .
7.3: - - - - - - - - - - - - # # # # - - - - - - - - - - - - - - . .
7.4: - - - - - - - - - - - - # # # # - - - - - - - - - - - - - - . .
7.5: - # - - - - - - - - - # # # # # - - - - - - - - - - - - - - . .
7.6: - # # - - - - - - - # # # # # # - - - - - - - - - - - - - - . .
T.R: 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
A few words on operating the test program. Results are written to serial after the test. The pins to be tested are simply defined in a 32-bit vector called pins, which is effectively your dira mask. To pick a test method (the default is outa), select either of the following:
CON
  pins = $0FFFF0FF                              ' demoboard
DAT
'               movi    ctrb, #%0_00100_000     '
'               neg     phsb, #1                ' ** phsx
                or      outa, m_tx              ' low high edge ** outa
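If it isn't obvious which pins a given mask covers, the following host-side Python sketch decodes it. It is not part of the test program, and pins_in_mask is a helper name I made up; the only value taken from this post is the demoboard mask $0FFFF0FF.

```python
def pins_in_mask(mask):
    """List the pin numbers (0..31) whose bit is set in a 32-bit mask."""
    return [pin for pin in range(32) if (mask >> pin) & 1]


DEMOBOARD = 0x0FFFF0FF          # same value as `pins` in the CON block

tested = pins_in_mask(DEMOBOARD)
skipped = [pin for pin in range(32) if pin not in tested]

print(len(tested))              # prints 24
print(skipped)                  # prints [8, 9, 10, 11, 28, 29, 30, 31]
```

So with the demoboard mask, pins 0..7 and 12..27 take part in the test, and the rest are left alone.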
Be careful out there!
Comments
Is this related to the "clean up" for audio in the Stereo-Duty object?
http://forums.parallax.com/showthread.php?123396-New-Propeller-based-Product&highlight=audio
It talks about "relationship between you IO pins for the DACs and the cog location on the die"
Is it due to physical distances being different for each cog?
So far I managed to simulate DUTY behaviour for $80010000 with an NCO (40MHz plus inversion of the output for 32767 cycles). What I get is my kHz noise loud and clear independent of cog ID (although slightly less so for cog 7, about 80% I'd say). But this suggests there is something funny going on with the DUTY carry output. But that's something for a rainy day.
I hope you're in communication with Chip on these matters. I can't help thinking that these anomalies will only be amplified in the Prop 2 if the same general circuitry is used.
-Phil
Not regarding this specific issue (yet), but I have a pending clarification request with Jeff (which in the end depends on Chip) regarding waitvid's [thread=126874]blind spot[/thread]. But now that you mention it I'll send it down the pipe as well.
The latter was newly introduced because DUTY cycles have the nasty property of occasionally being invisible, meaning the mapping wouldn't complete. This has now been resolved by introducing a timeout while waiting for the result. Test cancellation due to timeout is indicated by '?' (expected timing '-', extra cycle '#'). The example shows DUTY test mode on my primary demoboard (TX: cogs 1..3), which basically means that signalling from cog 3 to cog 2 via pin 18 using DUTY is not happening (@80MHz).
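With three result markers plus the dot for unused pins, a quick tally per row can be handy when skimming serial output. A rough Python sketch follows; tally and MEANING are hypothetical names, and the sample row is made up for illustration, not measured data from any board.

```python
from collections import Counter

# Marker legend as described in this thread.
MEANING = {"-": "expected", "#": "extra cycle", "?": "timeout", ".": "unused"}


def tally(cells):
    """Count how often each marker occurs in one space-separated map row."""
    counts = Counter(cells.split())
    unknown = set(counts) - set(MEANING)
    if unknown:
        raise ValueError(f"unexpected marker(s): {sorted(unknown)}")
    return {MEANING[mark]: n for mark, n in counts.items()}


# Hypothetical row, invented for illustration only.
sample = "- - # ? - . ."
print(tally(sample))
```

Any '?' count above zero means at least one cog/pin pairing in that row never saw the edge before the timeout.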
I have some general questions about this phenomenon that I'd like to ask to anyone who might have insight into it...
Q1 - Are these timing anomalies deterministic for a given chip (will the same chip always give an identical set of characteristics)?
Q2 - Do all propeller chips give the same exact set of characteristics?
Q3 - Do the different packages (D40, M44 and Q44) all have the same results?
Q4 - Presumably whatever is connected to the pin on the circuit board will have an effect on the rise and fall time of the pin, could that account for (or contribute to) these anomalies?
Q5 - What effect will cross talk between adjacent IO pins have on this inter-cog communication method at higher frequencies? (I've often noticed alarmingly high coupling on adjacent pins on the D40).
I've recently found that DUTY output can also be invisible to the cog that's producing it! I used that "feature" to advantage in the Ping))) workaround for the S2 by biasing the I/O pin to a lower voltage than the usual Vdd pullup. Even though the pin is thereby made into an output, ina is still sensitive to external influences. And, while waitpeq does not respond to the DUTY pulses, it does respond when combined with a momentary external low. It's weird. After writing the code, I thought, "How does this even work?"
-Phil