Looking for answers & suggestions
BTX
Posts: 674
Hi all !!
Since I am thinking about digitizing a VGA 640x480 60 Hz signal, I have many questions about Propeller assembly.
I would appreciate it if somebody could answer some or all of the following questions:
1- Suppose you have four Propellers sharing the same clock input "XIN", all running the same
asm code. Will they all be in sync? I.e., when I execute a "waitpeq" instruction, do they all respond at the same time?
2- What is the minimum pulse width that "waitpeq" can detect? (80 MHz clock)
3- Suppose a clock output is generated by the counters in some COG. Will all the props have those clocks in sync (same phase)?
4- Since the "WRBYTE" instruction has an execution time between 7 and 22 clocks, will all props have the same execution times?
5- I will digitize a signal, taking samples at a fixed interval (maybe 24 samples in 25 us). How do I get all the samples
equally spaced in time? I thought about using a waitcnt statement after saving the sample data in
main memory, then taking the next sample, and so on. Will I get the same delay between samples with this method?
Think of a loop that takes the samples: I will never know how many clocks a "WRBYTE" instruction will take,
so the samples could have different amounts of time between them.
6- I must take 96 samples in 25 us and save them in main memory, so I think the Propeller will not be fast enough for me,
unless I use four COGs, each taking only 24 samples, with a different delay in each COG before it begins to sample the data.
Does somebody know another way to do this? Or am I missing something? Ideas are welcome!!
(Remember that "XIN" for all props comes from one common oscillator chip.)
And of course, thanks for your help!!
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
Comments
Great questions. I wish I had answers.
I have an interest in signal processing... and am convinced that a flexible multiprop design should be up to just about any signal... but the proof is in the pudding. You are asking all the right questions. Being able to do this sort of thing is one of my goals for the year. So I will be more than interested in your progress.
Why don't you start out with a single prop and a single cog? For example... keep the VGA screen the same. Then see what it takes to digitize it... over time?
That would be a great app all by itself.
Thanks
Rich
1) The cogs are not expected to be in sync unless the program deliberately synchronizes them. The high resolution VGA routines do this by starting with the time the first cog is initialized, adding a fixed delay, then all the cogs do a WAITCNT with the same time used. After that time, they are synchronized until they do a hub memory access or a WAITVID or WAITPEQ/PNE.
2) The WAITPEQ has a minimum execution time of 5 clocks. It rechecks the pin state every clock cycle beginning with the 5th cycle.
3) If you synchronize two cogs by using WAITCNT, then start each cog's counters the same way (with the same number of clock cycles), they should be in sync once they start.
4) The datasheet has some nice diagrams that show how hub access works and how the "round-robin" access timing goes.
5/6) If you take 24 samples in about 25us, that's roughly 1us per sample or about 80 system clocks per sample (at 80MHz) or roughly 20 instructions per sample. You can use the WAITCNT instruction to lock the sampling to any multiple of system clocks (12.5ns each). Since the system time is available to all cogs, you could even synchronize two cogs so they alternate sampling, with each cog sampling once a microsecond but the two cogs together sampling every 0.5us.
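The alternating scheme in answer 5/6) can be sketched as a table of WAITCNT targets. This is only a Python model of the arithmetic, not Propeller code; the function name, the start value `t0`, and the sample counts are made up for illustration, and the 32-bit wrap-around of CNT is ignored.

```python
def interleaved_schedule(t0, sample_period, n_cogs, n_samples):
    """Return {cog: [CNT values at which that cog samples]}.

    Cog k takes samples k, k + n_cogs, k + 2*n_cogs, ... so each cog
    repeats every n_cogs * sample_period clocks.
    """
    schedule = {cog: [] for cog in range(n_cogs)}
    for i in range(n_samples):
        schedule[i % n_cogs].append(t0 + i * sample_period)
    return schedule

# Two cogs, each sampling once per microsecond (80 clocks at 80 MHz),
# offset so that together they sample every 0.5 us (40 clocks).
sched = interleaved_schedule(t0=1_000, sample_period=40, n_cogs=2, n_samples=8)
for cog, times in sched.items():
    print(cog, times)
```

Each cog would add `n_cogs * sample_period` to its own target after every sample, so the spacing stays fixed as long as the work between samples fits inside that period.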
look here: http://forums.parallax.com/forums/default.aspx?f=25&m=120209&p=1
Rich
I should like to comment on your posting as follows (and maybe repeating some of Mike's remarks)
(A) Your questions do not strictly refer to "assembly language" but to hardware architecture and performance. You are requesting some features @ VGA clock rate (around 25 MHz). I am sure you are well aware that this is BEYOND the instruction clock speed (80 MHz /4 = 20 MHz). The propeller - at the moment - can only handle this (in ONE COG) by the most powerful WAITVID instruction, caring for 32 (or 16) pixels in a synchronized way, thus giving you about 10 to 20 instructions to feed the video output. There is no equivalent for input!
(B) A solution, as you are also aware, is "divide and conquer" (or "decimation" from another point of view). This clearly has to be done by external hardware!
(C) As Mike hinted, there is a wonderful means of synchronization: CNT, which is global to all COGs within ONE chip. With a multiple-chip architecture you have to provide a similar global time sync system externally. Maybe it can be integrated into the external "decimation logic".
(D) Getting out of sync is easy. Note that all COGs within one chip are in principle "out-of-sync" (out-of-phase) because of the HUB wheel. Each COG gets its "time slice" within the 80 MHz/16 = 5 MHz HUB frame. Being synchronized with a higher precision than 5 MHz would most likely mean explicitly waiting after each HUB access to sync to this 5 MHz, which sounds like (and is) a waste of time!
(E) On the other hand, you are talking about "96 samples @ 25 us" (??), which amounts to just 4 MHz, which is not VGA (640x480 @ 60 Hz). This MIGHT make your project more feasible with simplified external hardware...
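Point (C) relies on every cog in one chip waiting for the same CNT value. A minimal Python model of that idea follows, assuming only that CNT is a free-running 32-bit counter and that WAITCNT resumes when CNT equals the target; the launch value and the ready offsets are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # CNT is modelled as a 32-bit free-running counter

def sync_target(cnt_at_launch, delay_clocks):
    """Common wake-up value that every cog passes to WAITCNT."""
    return (cnt_at_launch + delay_clocks) & MASK32

def clocks_until(target, cnt_now):
    """Clocks a cog stalls until CNT equals target (modulo 2**32)."""
    return (target - cnt_now) & MASK32

# Hypothetical numbers: the first cog starts close to counter wrap-around,
# and three cogs become ready at different times after that.
launch = 0xFFFFFF00
target = sync_target(launch, 10_000)
ready_offsets = [120, 4_300, 8_750]   # clocks after launch, one per cog
wake_times = []
for off in ready_offsets:
    cnt_now = (launch + off) & MASK32
    wake_times.append(off + clocks_until(target, cnt_now))
print(wake_times)  # every cog resumes 10000 clocks after launch
```

The delay has to be longer than the slowest cog's start-up time. In this model a cog that arrives after the target has passed would stall for almost a full 2**32-clock wrap.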
Post Edited (deSilva) : 8/4/2007 7:33:49 AM GMT
Maybe it is hard for me to explain in English what I want to ask, and it is also hard for me to understand everything you mean.
Let me try again with some of these questions, and let me explain what I want to do.
I have three Propellers with their clock inputs tied together, driven from one oscillator chip.
I also have a VGA signal, generated by another device, and I want to take 96 samples of one line of it (in 25.17 us), every seventh line, so I will have about 96*68 samples in total per VGA frame.
One Propeller samples the VGA red output, one samples green, and one samples blue. All the Propellers will have their Hsync and Vsync inputs tied together to recognize the VGA sync.
So I must read the R, G, and B data at the same time in the three props, which means I need all three props running in step.
I also have three separate flash A/D converters to do the digitizing, so I only need the props to capture the data.
I hope it is clear now. :)
@Rich.
Great thread... it will be useful for me.
@Mike & @deSilva
My question number two is stupid, sorry... :( I want to detect the 3.77 us "H sync" pulse of a VGA output with waitpeq/waitpne, and these instructions are easily fast enough for that purpose.
@Mike... 5/6 is what I plan to do, but with four COGs instead of two (to have more margin).
So I assume that waitcnt can give me all my samples equally spaced in time, as you suggest. Your answer is exactly what I wanted to know!!!
@deSilva... Yes, maybe some FIFO memories could make my life easier, and an FPGA even more so... :) I'm thinking seriously about that now...
Thanks all again !!!
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
But I think you can handle this with 3 COGs in one chip, which would remove any need for inter-chip synchronization. In fact, you have already done what I called "decimation".
Anyway... I read somewhere here about a couple of fellas overclocking the Propeller with a 6.5 MHz resonator... that takes the Propeller to 104 MHz, which takes the instruction rate to 104 / 4 = 26 MHz, which is just above the VGA clock...
Here is a link related to overclocking the Prop to 104 MHz:
http://forums.parallax.com/showthread.php?p=635701
just an idea..
Post Edited (Joe "Bot" Red) : 8/4/2007 1:58:30 PM GMT
One active VGA line has about 25.17 us of data. Since I need to take 96 samples in it, I have about 0.262 us per sample. At an 80 MHz Propeller clock, that is almost 21 clocks per sample, so about 5 instructions.
Considering that WRBYTE can take between 7 and 22 clocks, plus the increment of the address pointer, plus the DJNZ loop, I'm out of time!!
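The budget above can be checked with a few lines of arithmetic. This is a rough Python sketch, not Propeller code. It uses the figures quoted in this thread (4 clocks per ordinary instruction, 7 clocks minimum for a hub write, a hub window every 16 clocks), and the loop body it assumes (read pins, WRBYTE, ADD, DJNZ) is a guess at the loop being described.

```python
import math

CLKFREQ_MHZ = 80.0
HUB_PERIOD = 16   # hub window spacing in clocks, as stated in this thread

def clocks_per_sample(window_us, samples, clkfreq_mhz=CLKFREQ_MHZ):
    """System clocks available for each sample."""
    return window_us * clkfreq_mhz / samples

def loop_period(other_clocks, hub_min_clocks=7, hub_period=HUB_PERIOD):
    """Steady-state clocks per pass for a loop containing one hub write.

    Each hub write must wait for the cog's next hub window, so the loop
    period is rounded up to a whole number of hub periods.
    """
    return math.ceil((other_clocks + hub_min_clocks) / hub_period) * hub_period

budget = clocks_per_sample(25.17, 96)
period = loop_period(other_clocks=4 + 4 + 4)   # read INA, ADD, DJNZ
print(f"budget {budget:.2f} clocks, loop needs {period} clocks")
```

Under these assumptions a single looping cog needs 32 clocks per sample against a budget of about 21, which matches the conclusion that one such loop is too slow.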
I thought about using four COGs, each one taking only 24 samples, but I must get them all in sync and must also have exactly the same time between samples... maybe using waitcnt to equalize the delays, but then I will spend more clocks on that.
So I thought about using three Propellers, one per color... but then I must get them in sync too... :(
My problem is in the "sampling lines"... between them, every seven lines, I have plenty of time to do other things.
Maybe, as you said, it could be useful to collect the data in COG memory before sending it to main memory... but I will lose a lot of time doing the transfer too. Just remember I can't lose many frames; my final target is to sample at about 15 fps, and to send that data to 16 slaves too.
Obviously we are talking about the "LED curtain" again here, with a real-time video controller.
Is that correct? Or am I misunderstanding something...
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
I will do that later; I'm leaving now until tonight... thanks again !!!
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
Your English is fine. BUT my engineer-speak is nearly incomprehensible. I'm going to try, but no doubt I'll get something backwards.
There is something that I don't understand here. Let's stick with the original description of the problem ... just for a moment (to force us to think of parallel Props :).
Bean's overlay board uses an external chip that takes a video signal and outputs a logic state to indicate the occurrence of a sync... so long as that logic state was being watched identically by several props... then wouldn't you simply have to ensure that the Props were synchronized within the same signal sync period?
I don't understand why the Prop clocks would have to be physically synchronized at all... the time critical measurement comes from the horizontal and vertical sync signals (that is, timing issues are relative to the signal). If that is true then the critical time variance would be the variance in counters performing the equivalent of a PEQ... So the question then becomes... how well two different counters can be synchronized to an external input... and that would be determined by the variance in the "slew rate" of an input pin and the interval = +/- 1/clkfreq... no?
This question would apply to several cogs in a single Prop of course.
Rich
Alberto, I TOLD YOU you have 5 instructions and I RECOMMENDED what you could try - if you would have read my posting :-( I highly esteem all the things you have accomplished and will soon accomplish, but it is too frustrating to communicate with you further - sorry.
@rjo: Your idea is fine, and Alberto most likely uses it when processing the independent RGB signals. As he is hesitating to use the 6 intermediate lines for transport - as his most recent posting shows - he is working on a scheme to do EVERYTHING in the busiest loop. So he is looking for a very tight sync at nearly pixel-clock level.
You are correct that you can sync again on each new line, but you can easily lose sync by using e.g. conditional jumps...
The most remarkable feature of the WAITVID output instruction is to uphold this sync, but there is no equivalent at the input side...
Post Edited (deSilva) : 8/4/2007 4:45:55 PM GMT
Here are some links I stumbled upon while surfing the internet:
http://www.epanorama.net/documents/pc/vga_timing.html
http://www.hardwarebook.info/VGA_%28VESA_DDC%29
http://www.repairfaq.org/sam/vidconv.htm#nvctoc
To first digitize the signal, have three 8-bit (parallel-out) ADCs sampling and outputting directly to three Propellers. Each Propeller would be responsible for one of R, G, and B. An external sync separator would provide the sync for the Propellers and the display.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
E3 = Thought
http://folding.stanford.edu/ - Donating some CPU/GPU downtime just might lead to a cure for cancer!
One single Propeller could do the sampling if you can share the I/O pins. You've got 3 x 8 = 24 pins you need for the 8-bit ADCs plus 2 pins for the horizontal and vertical sync signals from the sync separator. One additional line from the Propeller would enable the ADCs, maybe through a bus buffer with an enable line. The last line could be some other kind of status/control line.
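With all three ADCs on one Propeller, a single read of the input register captures R, G, and B together, and the bytes can be separated later when there is time to spare. A small Python sketch of that unpacking follows; the pin assignment (P0..P7 red, P8..P15 green, P16..P23 blue) is an assumption made for illustration and is not specified anywhere in this thread.

```python
def unpack_rgb(ina):
    """Split one 32-bit pin snapshot into three 8-bit samples.

    Assumed (hypothetical) wiring: P0..P7 = red ADC, P8..P15 = green ADC,
    P16..P23 = blue ADC; P24..P31 carry sync/control and are ignored here.
    """
    red = ina & 0xFF
    green = (ina >> 8) & 0xFF
    blue = (ina >> 16) & 0xFF
    return red, green, blue

snapshot = 0x8340C01F          # example pin state, upper byte is not ADC data
print(unpack_rgb(snapshot))    # (31, 192, 64)
```

Storing the raw 32-bit snapshot during the active line and unpacking it afterwards keeps the time-critical part down to one read and one write per sample.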
During the horizontal sync period, one cog would compute a system clock time for the start of the active scan line, enable the ADCs, and do a WAITCNT for the computed time. It would then go through 96 samples in an "unrolled" loop as follows:
Post Edited (Mike Green) : 8/4/2007 6:45:29 PM GMT
@Rich.
What you said is what I think... the Hsync will bring my props into sync, but I don't know whether the props' code will then be in sync too, so I am asking about that, even though I use the same oscillator for all of them. I'm afraid that the ADC data for R, G, and B will get out of sync between samples.
@deSilva.
Calm down... it is especially difficult for me to understand you... if that is what you mean, it is OK.
@RinksCustoms.
We are still assembling modules of the curtain... and I'm also beginning to think about the "real-time video controller" for it.
The links are very interesting (I already have the epanorama timing data). What you explain in the graphic is exactly what I thought of doing.
@Mike.
Your idea of using only one prop, reading all 24 bits with a single instruction, is wonderful!!!! I never thought of it this way... and you get the three samples (RGB) with the minimum waste of clocks!!
But doing this, I have a big problem: my Propeller must communicate with the 16 slave boards too, so I would need 8 more lines to send data to the slaves, and three more lines to synchronize with them... :( (Ideal for Propeller 2!!)
That's why I am thinking of separate props for RGB sampling.
But returning to your idea... PLEASE!! Could you explain three things to me?
1- I can't understand what the first 3 instructions in your code do.
2- WHY do you count the WRLONG as 8 clocks? (Why are you so sure of that, and not 7..22 as the manual says?)
3- It is also not clear to me how you "trigger" the sampling code from the Hsync signal if it is sensed in another COG. (You are not using waitpeq/waitpne to do that!!)
My goal could be to try something like your code with three props, one per color, communicating with a fourth prop that also sends the data to the 16 slaves. This fourth prop will have 8 lines to communicate with the three RGB props (a bus), 8 more lines to send to the slaves, plus 3 lines to synchronize with them. (Be careful, I'm not naming all the lines to be used; I will need more lines to synchronize the RGB props with this fourth one too.) I hope it is clear to you all!!!
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
2) The WRLONGs for the pixels all take 7 clock cycles. The first dummy one (before the first pixel) can take anywhere from 7 to 22 clock cycles. All of the subsequent ones take 7 clock cycles since the ADD and MOV instructions take only 8 cycles and the hub cycle time is 16. That leaves one extra. Have a look at the description in the Propeller datasheet.
3) I would use the same cog to detect the Hsync edge with a WAITPEQ/PNE and compute when the active scan line will start. It shouldn't take too many lines of code.
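Answer 2) can be illustrated with a simplified timing model. This is a Python sketch, not the code from the post. It assumes the figures quoted in this thread (a hub write takes 7 clocks when it hits its window exactly, and the window recurs every 16 clocks) and an unrolled sequence of one dummy WRLONG followed by MOV + ADD + WRLONG per pixel, as described in the answer; the window phase and the start clocks are arbitrary.

```python
HUB_PERIOD = 16   # a cog's hub window recurs every 16 system clocks
HUB_MIN = 7       # best-case hub instruction time quoted in this thread

def hub_cost(t, window_phase=0):
    """Clocks a hub instruction takes if it is issued at clock t."""
    wait = (window_phase - t) % HUB_PERIOD
    return HUB_MIN + wait

def unrolled_costs(start, n_pixels):
    """WRLONG costs: one dummy write, then MOV + ADD + WRLONG per pixel."""
    t = start
    costs = [hub_cost(t)]        # the dummy write aligns the cog to the hub
    t += costs[0]
    for _ in range(n_pixels):
        t += 4 + 4               # MOV from INA, ADD to the hub pointer
        c = hub_cost(t)
        costs.append(c)
        t += c
    return costs

for start in (0, 5, 15):
    print(start, unrolled_costs(start, 4))
```

In this model only the first write varies (7 to 22 clocks). Every later write costs 8 clocks, the 7-clock minimum plus the one spare clock, so each pixel takes exactly 16 clocks regardless of where the sequence started.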
The 24 input pins could also be used for other purposes, particularly if you latch them with 8-bit latches. You could even use transparent latches (like the 74HC373) that would follow their input signals as long as they're enabled, then latch the last setting when they're disabled (to read the ADCs).
2- That is correct, now I understand it well... perfect!!
F- In your last comment, you suggest that I use some latches (three) at the outputs of the A/Ds. I already have them; they are 74AC573s, and I can control their OE inputs so that all 24 pins of the prop are free to be used as a "driver" for the slaves... is this your suggestion? I will add another latch for the "slaves bus" too. So some of the time I will fill the buffer with a "data frame", and some of the time I will send it to the slaves.
Answers -1- and -3- are NOT understood by me...
Let's talk about -3- first... the idea is OK, but what do you mean by "compute when the active scan line will start"?
I understand that I first detect the Hsync, and then wait for some constant time before beginning to sample data.
Answer -3- is similar in that I can't see how you relate "time as a variable" to the Hsync detected in the same or another COG.
And I add a -4- question...
4- If the first dummy write (before the first pixel) can take anywhere from 7 to 22 clock cycles, what about this delay in the following sampled lines, after the following H sync pulses?
If it differs from line to line, I will have a digitized video signal with some lines displaced horizontally.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
2F) I'm talking about new latches ... input from Propeller ... output to Slaves. During the unused scan lines, the Propeller can send information to the slaves through the latches.
4) This is a small problem ... There may be some jitter in relation to the Hsync pulse edge ... If that's not tolerable, you could buffer the scan line in the cog's memory instead and that would eliminate the jitter issue.
Now it's OK... sorry. As I said before, sometimes English is difficult for me to understand on these issues. :)
-3- is OK, -F- is OK.
And on -4-... I was not wrong. Maybe I must check how much this jitter affects the video, and then make some decisions about what to do.
I was already thinking of using COG memory to avoid that, and then sending the data to main memory during the unsampled lines. I only need to get 15 fps video, so I think I will have enough time to do everything. Maybe one frame sampled, and then three more frames to do the transfers to the rest of the electronics.
In the case of saving data to COG memory... how do I address an array in this memory? Is that in the Propeller manual?
Thanks so much, all!! You were a big help!!
I'll keep you posted on this too.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
Thanks so much MIKE !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Please don't feel bad if I take several days/weeks to "analyze" or "understand" all your code!! It seems to be almost everything necessary to do the job.
I will ask about this code when I have some doubts.
You're really great at programming!! Thanks again!!
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.
OK, but you are assuming that I have 8 different bits for each slave, eight for R, eight for G, and eight for B, but that is not my reality...
Since I have not yet posted any of my code in the "Multipropeller board thread", you don't know how it works!!
I will post some code soon, so that everyone on the forum can see it too.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Regards.
Alberto.