WAITVID, HUB OPs and general PASM timing questions (Parallax & others)
dMajo
Posts: 855
WAITVID (Posted 9/12/2009 5:18 PM (GMT +1), erroneously in the Sandbox)
1) Which one is true? (I understand them as contradicting each other.)
Propeller manual V1.1 (page 225)
... For VGA, each color value’s upper 6-bits is the 2-bit red, 2-bit green, and 2-bit blue color components describing the desired color; the lower 2-bits are "don’t care" bits. ...
Propeller datasheet V1.2 (page 12/13)
... For VGA mode, each 8-bit color value is written to the pins specified by the VGroup and VPins field. For VGA typically the 8 bits are grouped into 2 bits per primary color and Horizontal and Vertical Sync control lines, but this is up to the software and application of how these bits are used. ...
2) VGA mode, PLLA@80MHz, pixelclocks 1, frameclocks 5: can I say that on each system clock cycle 8 bits are delivered out:
    waitvid Data32N1,#0
    waitvid Data32N2,#0
    waitvid Data32N3,#0
Is the above code executed without wait penalties? What is output on the 5th clock? Is the 4th byte repeated?
3) One idea:
Is this correct: waitvid DestinationDataBus, SourceDataBus, or does the sentence quoted below refer to the address busses? Propeller datasheet V1.2 (page 13) said...
... When FrameClocks cycles occur and the cog is not in a WAITVID instruction, whatever data is on the source and destination busses at the time will be fetched and used. So it is important to be in a WAITVID instruction before this occurs. ...
If we look at this from another point of view, waitvid is not the only way to deliver data to the video hardware. Any instruction that places the right data on the right bus at the right time will fulfil the request! If PLLA is running at the same frequency as the system clock, how far out of sync are they? How difficult is it to deliver data with another instruction that, besides delivering the data, also does something useful at the same time?
Has anyone ever tried something like this?
- setup VGA mode, single-ended PLLA@80MHz with PinA on an output pin (data strobe), pixelclocks 1, frameclocks 9713717 (one clock more or less than some PASM lines), VGA on P0..7, PinA on P8
- loop to get in sync
- set pixelclocks 4
    mov buff1,buff1   ' just to be sure that both destination and source busses have the right value
    mov buff2,buff2   ' these instructions are synced with the video hardware window (like the hub one)
    mov buff3,buff3
    mov buff4,buff4
result: data output at 80MB/s
EDIT: Going on with the questions:
4) PASM instructions (according to datasheet V1.2, chapter 4.8, fig.4)
    mov outa, #1
    mov outa, #0   ' here, at the 2nd clock (rising edge?) of this instruction, the output physically changes to 1 due to the previous instruction
Is the above true? And what is the behavior for ina?
5) can someone map the execution stages (similar to Fig.4) for HubOps (RDxxxx/WRxxxx) ?
6) is the following true?
    COG0                    COG2                    COG4
    ======================  ======================  ======================
    rdlong temp1,address    rdlong temp2,address    rdlong temp3,address    '<= this executes at same system clock (cnt) cycle
    nop                     ' waiting window        ' waiting window        '<= this executes at same system clock (cnt) cycle
    nop                     nop                     ' waiting window        '<= this executes at same system clock (cnt) cycle
    mov any,any             mov any,any             mov any,any             '<= this executes at same system clock (cnt) cycle
So this means that by selectively starting a chosen cog, the hub can be used as a sync device.
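The alignment idea in point 6 can be checked on paper with a small host-side model. This is only a sketch in Python, not Propeller code, and it rests on assumptions that are not confirmed in this thread: the hub serves cog n once every 16 clocks at a phase of 2*n clocks, and a hub instruction needs at least 7 clocks (the 7..22 figure quoted in this post). The function names and the phase convention are made up for illustration.

```python
HUB_PERIOD = 16      # assumed: 8 cogs x 2 clocks per hub rotation
MIN_HUB_CLOCKS = 7   # assumed best case, taken from the 7..22 figure in this post


def hub_op_finish(cog, start):
    """Clock at which a hub instruction issued at `start` completes (model only)."""
    earliest = start + MIN_HUB_CLOCKS
    # Wait for the next clock whose phase matches this cog's hub window.
    wait = (2 * cog - earliest) % HUB_PERIOD
    return earliest + wait


def hub_op_duration(cog, start):
    """Total clocks spent in the hub instruction."""
    return hub_op_finish(cog, start) - start


def aligned_clock(cog, start, last_cog=4):
    """Clock after the hub instruction plus padding that cancels the window offset."""
    pad = 2 * (last_cog - cog)  # 8, 4, 0 clocks for cogs 0, 2, 4 (two, one, zero nops)
    return hub_op_finish(cog, start) + pad


if __name__ == "__main__":
    for cog in (0, 2, 4):
        print(cog, hub_op_finish(cog, 9), aligned_clock(cog, 9))
```

In this model the padded cogs always land on the same phase of the 16-clock rotation, whatever clock each rdlong started on. They land on the same absolute clock only if all of them caught the same rotation, so a real design would still have to guarantee that.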
BTW, since COGID just returns the cog's own number, how is it possible that the cog is not aware of it and has to ask the hub for it? In other words, why does it take 7..22 clocks? It seems to me like someone asking my name and me answering: wait a moment, I have to ask my mother. Does COGID have some other uses?
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
· Propeller Object Exchange (last Publications / Updates)
Comments
1) I hope you won't be angry with me for having moved the post here in this way
2) Thanks for being the only one who has answered (I thought these topics would have had a wider audience in the community)
3) I do not think the pipeline has to stall, nor that the waitvid takes more than 5+ clocks. According to Fig. 4 (datasheet V1.2, chapter 4.8), if the waitvid is our instruction N, the instruction is executed when it gets to stage 4 (M+4): the cog clock is held. Perhaps there is a set/reset flip-flop: the execution sets it and the video counter reaching 0 resets it. In this stage only the opcode of the next (N+1) instruction is fetched, while the data of the current (N) instruction are still present on the dest/source busses and the cog's clock is frozen.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Bye. Tomorrow it's another day
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
-Phil
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Composite NTSC sprite driver: Forum
NTSC & PAL driver templates: ObEx Forum
OnePinTVText driver: ObEx Forum
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
* No HUB access allowed while synchronized. The MOV color, pixel NR must happen every N cycles. To avoid HUB access you either need a second cog feeding the display cog, which limits your pixel frequency, or you render the line to a cog buffer (like I did for my NTSC sprite driver), which then limits your horizontal resolution.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
I stand corrected on the pipeline, confusing when it occurs.
To make this work, wouldn't frames then have to be some precise multiple of the system clock? I'm having trouble sorting out what that buys us, other than some really cool timing hack.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Propeller Wiki: Share the coolness!
Chat in real time with other Propellerheads on IRC #propeller @ freenode.net
Safety Tip: Life is as good as YOU think it is!
If frameclocks is set to 4 (PLL@80MHz) then a frame takes the same time as an ordinary PASM instruction. This is also 1/4 of the rotating hub window, so the 16 clocks of the hub window are a multiple of the "PASM (mov)-video" transmission window.
Once the video circuitry is synced with the hub window, the sync should hold forever. What matters is to know the timings and to develop the right protocol (so that the CPLD is aware of how and when data is coming). I am still a newbie in PASM (especially in self-modifying code) but I hope the above examples are correct as code. They should certainly meet the timings!
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Post Edited (dMajo) : 9/15/2009 4:20:06 PM GMT
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
In fact the frames are a multiple (4) of the system clock (video PLL running at 80MHz)
Some tricks that should help here:
1) run the system clock at 10M*8 (the final divide by 2 should give a 50% duty cycle): this is also the reason I want the CPLD to provide the 80M clock to the Prop (to avoid PLL phase drift)
2) run the video hardware at 160M with frameclocks 8, pixelclocks 2: in this way the video registers can be loaded in the middle of the system clock period, which should allow a greater timing tolerance (provided, of course, that everything is internally synced on rising edges)
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
-Phil
As you have seen, I was sure that in VGA mode the colors get multiplexed out independently of the pixels (waitvid dataout,#0). Now I have understood this (I hope: waitvid col,(%11_01_11_11_10_01_00) => c0,c1,c2,c3,c3,c1,c3)
I still cannot understand how to have 32 (data) bits in the color or pixel register and bit-bang them out through a single pin with a single instruction using VGA mode. I thought it could be done with composite 2-color mode, even if I haven't worked out the right software setup: perhaps cmode0, pixelclocks 1, frameclocks 32, waitvid dataout,($FFFFFFFF) or waitvid ($FFFFFFFF),dataout
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
The "pixels" (i.e. data bits) in data get shifted out one at a time by the video circuitry. But they don't get shifted to a pin. They're used as the index to a lookup table comprised of the mapping ("colors") long. Since there are only two colors, only the lower 16 bits get used: one block of eight for "1" pixels, and one block of eight for "0" pixels. It's one of these blocks of eight bits that gets written to the output pin, as selected by the "pixel" data in data.
-Phil
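The lookup described above can be modelled on a host PC. The following Python sketch is only an illustration, not Propeller code: the function name `waitvid_bytes` and its parameters are invented here, it assumes pixels are shifted out least-significant first (which matches the c0,c1,c2,c3,c3,c1,c3 example earlier in the thread), and it refuses frames longer than the pixel data rather than guess what the hardware does then.

```python
def waitvid_bytes(colors, pixels, four_color, frame_pixels):
    """Model of one WAITVID frame: return the byte driven onto the pin group per pixel.

    colors       -- 32-bit long holding up to four 8-bit "color" bytes (byte 0 is lowest)
    pixels       -- 32-bit long of pixel data, consumed from the low end
    four_color   -- True: 2 bits per pixel (4 colors); False: 1 bit per pixel (2 colors)
    frame_pixels -- number of pixels to shift out in this frame
    """
    bits = 2 if four_color else 1
    if frame_pixels > 32 // bits:
        raise ValueError("frame longer than the pixel data is not modelled")
    mask = (1 << bits) - 1
    out = []
    for _ in range(frame_pixels):
        index = pixels & mask                       # pixel value selects a color byte
        out.append((colors >> (8 * index)) & 0xFF)  # that byte goes to the pins
        pixels >>= bits
    return out


if __name__ == "__main__":
    # Four-color example from the thread, with c0..c3 = $10..$13 as stand-in values.
    print(waitvid_bytes(0x13121110, 0b11_01_11_11_10_01_00, True, 7))
    # Two-color mode used as a serializer: color 0 = $00, color 1 = $01,
    # so bit 0 of the pin group follows the data bits, low bit first.
    print(waitvid_bytes(0x0100, 0b1011, False, 4))
```

With the two color bytes set to $00 and $01, the lowest pin of the group simply repeats the data bits, which is the single-pin serializer asked about above. In this model the data goes in the pixel operand and the fixed mapping goes in the color operand.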
If you are using the video modes as a serializer, then you don't want to use the composite video mode - stick to the VGA mode. The Propeller datasheet has a good diagram (figure 6) which should help you achieve what you want.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
1) the hubop is my instruction N, and until stage 3 everything is the same as for any PASM instruction.
2) stage 4 is extended to at least 3 cycles: the first waits for hub sync, the last two (because the hub runs at half speed) exchange data with the hub
3) stage 5 becomes M+7 and is used only for read ops, where the data, once exchanged, must be written to the cog register
So the hubop takes 8 real clock (M) cycles, but because M+1 is devoted to the previous instruction (N-1) we say it takes 7.
Moreover, can we say that every output manipulation really (physically) takes place in the second cycle of the next PASM instruction?
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
I had it wrong on the pipeline. I was confusing the operation of the D & S registers with instruction pre-fetch.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
From an earlier posting:
So why would the h/w wait for another 2 cycles to actually output the new state? I'm pretty sure that output state is altered at the same time (i.e. result stage) otherwise most of the timing critical drivers around here wouldn't work (my hub DMA certainly wouldn't). FWIW, writing to frqx will show effect during the first cycle of the next instruction at the earliest.
Both the above examples will meet the hub window in sync, but the output behavior will be different: which one is right?
The hardware does not wait; the instructions simply take 6 clocks to execute. Since 2 stages are overlapped we have a throughput of one instruction per 4 cycles, but the real (physical) state is delayed by 2 clocks relative to the software lines. When you are accessing Prop internals it doesn't matter, since everything is running on 4 clocks, but if you are dealing with external timings I think it is important to know that. E.g.: if my instruction N in the example above is the first instruction, you should know that the pin will change state after 6 clocks and not 4.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
As for the 2 cycle offset, I know what you mean now. For ease of use I view instruction cycles as SDER (and counting S as 1) while the full sequence is IdSDER. So the result phase of instruction N coincides with the decode phase of instruction N+1 which also happens to be the 2nd cycle. Sorry, my mistake.
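The overlap discussed in the last few posts can be written down as simple arithmetic. The Python sketch below is a host-side illustration only. It assumes the IdSDER letters stand for instruction fetch, decode, source, destination, execute and result, one clock each, and that a new instruction starts every 4 clocks; the names `STAGE_OFFSET` and `stage_clock` are invented for this example.

```python
# Assumed reading of "IdSDER": one clock per stage, in this order.
STAGE_OFFSET = {"I": 0, "d": 1, "S": 2, "D": 3, "E": 4, "R": 5}
CLOCKS_PER_INSTRUCTION = 4  # throughput: a new instruction starts every 4 clocks


def stage_clock(n, stage):
    """Clock (counted from the fetch of instruction 0) in which instruction n is in `stage`."""
    return CLOCKS_PER_INSTRUCTION * n + STAGE_OFFSET[stage]


if __name__ == "__main__":
    for n in range(3):
        print(n, {stage: stage_clock(n, stage) for stage in STAGE_OFFSET})
```

Under these assumptions the result stage of instruction N falls in the same clock as the decode stage of instruction N+1, and the result of the very first instruction is complete 6 clocks after its fetch began, which is the 6-versus-4 difference raised above.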
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Links to other interesting threads:
· Home of the MultiBladeProps: TriBlade,·RamBlade, RetroBlade,·TwinBlade,·SixBlade, website
· Single Board Computer:·3 Propeller ICs·and a·TriBladeProp board (ZiCog Z80 Emulator)
· Prop Tools under Development or Completed (Index)
· Emulators: Micros eg Altair, and Terminals eg VT100 (Index) ZiCog (Z80) , MoCog (6809)
· Search the Propeller forums·(uses advanced Google search)
My cruising website is: ·www.bluemagic.biz·· MultiBladeProp is: www.bluemagic.biz/cluso.htm