
SDRAM Driver

245 Comments

  • Cluso99 Posts: 18,069
    edited 2013-04-09 16:37
    cgracey wrote: »
    Here is the new DE0_Nano_Prop2 config file that should make the SDRAM work properly:

    Attachment not found.
    Thanks Chip :):):)

    What a difference a day can make. Last night I was thinking I might port my debugger back to P1 - it is in fact quite a simple task as I hardly use any P2 features, at least for now (better get rid of cmpr as cmp can be almost as useful). But accessing the SDRAM on the DE0 is an extra bonus!!!

    Personally, I am more interested in just being able to access the extra memory for now. So a background task just to do refresh (or perhaps just a counter), plus a foreground task to read or write a byte/word when required, is all that interests me on my current quest. This opens the door, so thank you again Chip.

    I will leave all the great video work to the rest of you. I can and will revisit this later.
  • Ariba Posts: 2,682
    edited 2013-04-09 16:41
    Thanks Chip

    My idea is to just have 3 subroutines:
    - SDRAM init
    - Read a Quad from SDRAM (+ 1 Refresh)
    - Write a Quad to SDRAM (+ 1 Refresh)

    This may be enough to make a few Video Drivers with screen-buffer in SDRAM.
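
    A minimal sketch of what that three-call interface could look like, written here as a host-side C model just to pin the idea down (the names, the assumption that a quad is four longs / 16 bytes, and the folding of one auto-refresh into each call are illustrative guesses, not Chip's or Ariba's actual code):

        #include <stdint.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        /* Behavioural model only: the SDRAM is simulated with a plain array and
           "refresh" is just a counter, so all the timing-critical parts are absent. */

        #define SDRAM_BYTES  (32u * 1024u * 1024u)   /* the 256Mb (32MB) part */

        static uint8_t  *sdram_mem;       /* stands in for the external chip           */
        static uint32_t  refresh_count;   /* one auto-refresh folded into every access */

        static void sdram_init(void)
        {
            sdram_mem = calloc(SDRAM_BYTES, 1);
            refresh_count = 0;
        }

        /* Read one quad (assumed: 4 longs = 16 bytes) and issue one refresh. */
        static void sdram_read_quad(uint32_t addr, uint32_t dst[4])
        {
            memcpy(dst, sdram_mem + addr, 16);
            refresh_count++;
        }

        /* Write one quad and issue one refresh. */
        static void sdram_write_quad(uint32_t addr, const uint32_t src[4])
        {
            memcpy(sdram_mem + addr, src, 16);
            refresh_count++;
        }

        int main(void)
        {
            uint32_t in[4] = {1, 2, 3, 4}, out[4];
            sdram_init();
            sdram_write_quad(0x1000, in);
            sdram_read_quad(0x1000, out);
            printf("read back %u %u %u %u after %u refreshes\n",
                   (unsigned)out[0], (unsigned)out[1], (unsigned)out[2],
                   (unsigned)out[3], (unsigned)refresh_count);
            return 0;
        }

    Whether one refresh per access is enough depends, of course, on how often the foreground code actually calls these routines.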

    Andy
  • Cluso99 Posts: 18,069
    edited 2013-04-09 16:43
    I have updated the first post in the sticky to link to this thread and FPGA code.

    Sapieha: Looking forward to your Basic work ;)
  • pedward Posts: 1,642
    edited 2013-04-09 17:55
    Chip, is it possible to run this code on the DE0, or does it too need a new blob?
  • Ariba Posts: 2,682
    edited 2013-04-09 18:22
    Yeah it works on the Nano too :smile:
    Thank you Chip

    Here is my test/demo file for the DE0-Nano. You need the PropTerminal to write the list and start the driver.
    The modified code expects the Read/Write list at $7000 in hub and executes the list only once then returns to the monitor.
    So you write the list to $7000 with the monitor, switch back to byte mode with Y, start the driver by uploading the .OBJ file and launching it in cog 0, then press space again and dump the hub RAM that was written by the driver/list.
    Now you can write other list-commands to $7000 and start the driver again with 0+E80<return>.
    There is an example in the comments at the beginning of the Spin file.

    Be aware: While you are in the monitor there is no refresh, so the data in the SDRAM will be lost after a short time.

    Andy
  • cgracey Posts: 14,133
    edited 2013-04-09 22:01
    Ariba wrote: »
    Thanks Chip

    My idea is to just have 3 subroutines:
    - SDRAM init
    - Read a Quad from SDRAM (+ 1 Refresh)
    - Write a Quad to SDRAM (+ 1 Refresh)

    This may be enough to make a few Video Drivers with screen-buffer in SDRAM.

    Andy

    Keep in mind, though, that in order to get efficiency, you're going to have to do long, sustained burst transfers. Otherwise, all your time will be eaten up issuing the various commands that surround the actual reads and writes. The two-dollar 256Mb SDRAM can stream 512 words in or out in a single burst, and even doing that, my driver only gets to 89% efficiency.
  • Sapieha Posts: 2,964
    edited 2013-04-10 00:50
    Hi Cluso.

    Thanks
    It is NOT the work on the Interpreter that gives me problems ----> I am rewriting my old 8085 code (so I have a good base).

    It is PNut and the Prop2 instructions that I have more work to do on.

    PNut ---> For now the biggest problem is that I can't specify the code's HUBORG.
    Instructions ---> there are that many new ones, and every one needs testing to learn its behaviour.


    Cluso99 wrote: »
    I have updated the first post in the sticky to link to this thread and FPGA code.

    Sapieha: Looking forward to your Basic work ;)
  • Rayman Posts: 13,861
    edited 2013-04-10 06:04
    Is the SDRAM being given the same clock as the Propeller?
    Can the Propeller provide this clock to the SDRAM, or will we need to use a common external oscillator?
  • cgracey Posts: 14,133
    edited 2013-04-10 08:21
    Rayman wrote: »
    Is the SDRAM being given the same clock as the Propeller?
    Can the Propeller provide this clock to the SDRAM, or will we need to use a common external oscillator?

    The Propeller outputs its own clock and controls the phase of data in and out for SDRAM.
  • Rayman Posts: 13,861
    edited 2013-04-10 10:07
    Does this driver do this? Or, is it hardwired in the DE0, DE2 boards?
    I think I see that maybe for the real chip you'd use the CFGPINS command to put a clock pin in the "SDRAM Clock Out" state...
    Is that right?
  • cgracey Posts: 14,133
    edited 2013-04-10 15:28
    Rayman wrote: »
    Does this driver do this? Or, is it hardwired in the DE0, DE2 boards?
    I think I see that maybe for the real chip you'd use the CFGPINS command to put a clock pin in the "SDRAM Clock Out" state...
    Is that right?

    You got it. Right now, the FPGA just drives it directly, outside of the Prop2 logic.
  • rogloh Posts: 5,158
    edited 2013-04-10 17:39
    cgracey wrote: »
    I think it's cleanest to just make a generic SDRAM server, thereby decoupling any video dependency from it, and vice-versa. There is no speed advantage to be had, though saving a cog could be important.

    Agree that makes things cleaner and generic, though how to arbitrate/prioritize between multiple COGs (e.g. a video driver and some VM COG running C or whatever from SDRAM memory space) will become important in some applications and is required elsewhere in the system. One basic mechanism would be to have some sort of common cache driver/arbiter COG that uses some hub RAM and makes all SDRAM R/W requests on behalf of all the other clients. Or maybe the processing list used by the SDRAM driver can be shared by different clients, either by using different "request slot" list positions for each client (and supporting NOPs when a client's request slot is not needed), or alternatively by having a hub locking mechanism that blocks until the whole resource is free. Latency and bandwidth sharing/guarantees can then get tricky, however.
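
    To make the "request slot" idea a little more concrete, here is a rough host-side C model of a shared command list with one fixed slot per client COG and a NOP code for idle slots. The layout, field names and command codes are invented for illustration; they are not the list format Chip's driver actually uses:

        #include <stdint.h>
        #include <stdio.h>
        #include <string.h>

        /* Invented layout for a shared SDRAM request list: one fixed slot per
           client cog, scanned in order by the driver, with a NOP command for
           idle slots. Hub RAM and the SDRAM itself are faked with arrays so
           the sketch runs on a PC. */

        enum { SDRAM_NOP = 0, SDRAM_READ = 1, SDRAM_WRITE = 2 };

        typedef struct {
            volatile uint32_t cmd;   /* NOP/READ/WRITE; driver resets to NOP when done */
            uint32_t sdram_addr;     /* byte address in SDRAM                          */
            uint32_t hub_addr;       /* byte offset of the client's hub buffer         */
            uint32_t bytes;          /* burst length: bigger = more efficient, slower  */
        } sdram_req_t;

        #define MAX_CLIENTS 8
        static sdram_req_t request_list[MAX_CLIENTS];  /* would live in hub RAM       */

        static uint8_t fake_sdram[4096];               /* stand-in for the SDRAM chip */
        static uint8_t fake_hub[4096];                 /* stand-in for hub RAM        */

        /* Client side: fill a slot, writing cmd last so the driver never sees a
           half-formed request. */
        static void sdram_post(int slot, uint32_t cmd, uint32_t sdram_addr,
                               uint32_t hub_addr, uint32_t bytes)
        {
            sdram_req_t *r = &request_list[slot];
            r->sdram_addr = sdram_addr;
            r->hub_addr   = hub_addr;
            r->bytes      = bytes;
            r->cmd        = cmd;
        }

        /* Driver side: one scan over all slots, executing any armed request and
           clearing it back to NOP. Idle clients cost only the NOP check. */
        static void sdram_service_pass(void)
        {
            for (int slot = 0; slot < MAX_CLIENTS; slot++) {
                sdram_req_t *r = &request_list[slot];
                if (r->cmd == SDRAM_READ)
                    memcpy(fake_hub + r->hub_addr, fake_sdram + r->sdram_addr, r->bytes);
                else if (r->cmd == SDRAM_WRITE)
                    memcpy(fake_sdram + r->sdram_addr, fake_hub + r->hub_addr, r->bytes);
                r->cmd = SDRAM_NOP;
            }
        }

        int main(void)
        {
            memcpy(fake_hub, "hello sdram", 12);         /* a client composes some data   */
            sdram_post(1, SDRAM_WRITE, 0x100, 0, 12);    /* slot 1 asks to write it out   */
            sdram_service_pass();
            sdram_post(1, SDRAM_READ, 0x100, 64, 12);    /* read it back to hub offset 64 */
            sdram_service_pass();
            printf("%s\n", (char *)(fake_hub + 64));     /* prints "hello sdram"          */
            return 0;
        }

    Writing the command word last means the driver can never see a half-built request, and giving the video COG a low slot number is one crude way to prioritise it on every pass.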

    This whole issue is always going to be a problem when more than one client is using the SDRAM memory and it can't be left out. The good thing is with a decent amount of intermediate hub buffering a hires video driver can probably tolerate some interruptions to its SDRAM access while any VMs need to do their thing with the SDRAM, but the VM's accesses themselves can be blocked from time to time for a larger interval (this is where a hub cache will help reduce the impact).

    For 1k SDRAM burst transfers there is still ~284MB/s of bandwidth at 160MHz and 89% driver efficiency. That is quite a lot of raw bandwidth, but it will obviously drop away significantly when introducing shorter bursts, especially for single byte/word transfers. Smaller burst sizes will reduce access latency however, so it is a big tradeoff. I do like how your SDRAM driver allows variable burst sizes for each request made, which means this could all be controlled dynamically for the different clients in an attempt to tune or optimize performance for different applications.

    All good stuff.
  • rjo__ Posts: 2,114
    edited 2013-04-11 06:11
    Chip is busy. So, I expect you guys to handle this:)

    I fully understand about 10% of this conversation, but I do understand the purpose and overall organization of the driver.

    If the goal is to have a generic driver, then shouldn't the driver be capable of gathering data from pin states as well as memory?

    Rich
  • evanh Posts: 15,187
    edited 2013-04-12 16:00
    pedward wrote: »
    It seems to me that to save cycles, the tile buffer needs to be in COG memory, to avoid hub access.

    Hijack those modern DVI/HDMI displays, throw away the sequential scan loading and tile them instead. haha.
  • evanh Posts: 15,187
    edited 2013-04-12 16:22
    rogloh wrote: »
    This whole issue is always going to be a problem when more than one client is using the SDRAM memory and it can't be left out. The good thing is with a decent amount of intermediate hub buffering a hires video driver can probably tolerate some interruptions to its SDRAM access while any VMs need to do their thing with the SDRAM, but the VM's accesses themselves can be blocked from time to time for a larger interval (this is where a hub cache will help reduce the impact).

    Please give a concrete example. VMs (I assume you are talking about virtual machines) are typically not a timing critical component. The primary reason for them is to allow for high level user type coding, ie: Bigger, more complex and bloated code that relies more and more on lower level libs and handlers to hide the hardware. This type of coding cares more about throughput rather than timing, so, yes, caching it.

    In the cases where there is a need for more than one deterministic datastream, they would have to be hand crafted to fit together as a matched set.
  • rogloh Posts: 5,158
    edited 2013-04-13 02:14
    Please give a concrete example. VMs (I assume you are talking about virtual machines) are typically not a timing critical component. The primary reason for them is to allow for high level user type coding, ie: Bigger, more complex and bloated code that relies more and more on lower level libs and handlers to hide the hardware. This type of coding cares more about throughput rather than timing, so, yes, caching it.
    In the cases where there is a need for more than one deterministic datastream, they would have to be hand crafted to fit together as a matched set.
    Yep, I was talking about virtual machines such as some C code running from SDRAM. As you say, they are generally used for the high level user control code and not for timing critical parts like I/O drivers. It was probably not explained well, but my main point was that VM performance can really suffer if it is regularly held off for long periods of time once a hires video driver running in one COG is doing a large burst transfer from SDRAM and the VM's data request cannot be serviced from a hub cache in the meantime. We really must have a decent hub cache to mitigate this issue, otherwise we will start to see a big problem with latency as the video driver uses more SDRAM bandwidth and holds off other users from the SDRAM more often. The worst case obviously happens if you don't have a cache at all. That would be a very bad thing. :smile:

    By the way, I've put some more information below that I was computing for myself; it might be useful when figuring out the memory bandwidth used up by high-res / high-bit-depth video and the latencies one might expect to see for the transfers.

    The raw SDRAM bandwidth for 16 bit wide memory is 320 MB/s when running the bus at 160MHz, but due to setup and driver overheads it will always be less than this. In Chip's SDRAM driver, for example, he mentioned the efficiency of a large 1kB burst transfer was 8/9 or 88.9%, which extrapolates to 284 MB/s using 16 bit wide memory on a real P2 at 160MHz. To do much better than this likely requires faster SDRAM and a bit of overclocking. Unfortunately, wider 32 bit memory only doubles performance into the COG from the SDRAM; the data can't be sent on to hub RAM at the full rate (as the hub can only do 320MB/s to/from an individual COG). But combining a video and SDRAM driver in the one COG may still benefit from using 32 bit wide memory and allow higher resolutions/bit depths than those shown below... it's obviously a lot trickier to code and it is not generic once you do this, but it is still an interesting idea for higher performance.

    Below are some bitmapped video mode memory read bandwidth requirements if data is streamed from SDRAM for different (xx)VGA resolutions at 60Hz screen refresh rates. I've listed the main ones that could potentially work with 16 bit SDRAM clocked @ 160MHz (from http://tinyvga.com/vga-timing). Bandwidth is in millions of bytes, not real megabytes, so it is MB not MiB (a confusion I still detest).
     800 x 600 ~   40 MB/s (8bpp),  80 MB/s (16bpp), 160 MB/s (32bpp)
    1024 x 768  ~  65 MB/s (8bpp), 130 MB/s (16bpp), 260 MB/s (32bpp)
    1280 x 1024 ~ 108 MB/s (8bpp), 216 MB/s (16bpp)
    1600 x 1200 ~ 162 MB/s (8bpp)
    1920 x 1200 ~ 193 MB/s (8bpp)
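
    The arithmetic behind that table is just the pixel clock for each mode (40, 65, 108, 162 and 193.25 MHz for these timings) multiplied by the bytes per pixel. A quick C check of that, under that assumption (it prints all three depths even where the result is beyond what the 16 bit SDRAM could sustain):

        #include <stdio.h>

        /* Streaming bandwidth needed = pixel clock x bytes per pixel.
           Pixel clocks are the 60Hz figures behind the table above. */
        int main(void)
        {
            struct { const char *mode; double pixclk_mhz; } m[] = {
                { " 800 x 600 ",  40.00 },
                { "1024 x 768 ",  65.00 },
                { "1280 x 1024", 108.00 },
                { "1600 x 1200", 162.00 },
                { "1920 x 1200", 193.25 },
            };
            for (int i = 0; i < 5; i++)
                printf("%s ~ %5.0f MB/s (8bpp), %5.0f MB/s (16bpp), %5.0f MB/s (32bpp)\n",
                       m[i].mode, m[i].pixclk_mhz, m[i].pixclk_mhz * 2, m[i].pixclk_mhz * 4);
            return 0;
        }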
    

    If you reduce the SDRAM transfer burst size the efficiency reduces as does the bandwidth but the transfer latency also comes down proportionally (to the point where the hub access latency becomes dominant).
    burst      read efficiency     max driver bandwidth                   driver latency 
     1kB   64/72 = 8/9 = 88.9%         284.4MB/s                     512 + 64 = 576 clocks (3.60us)
    512B   32/40 = 4/5 = 80.0%         256.0MB/s                     256 + 64 = 320 clocks (2.00us)
    256B   16/24 = 2/3 = 66.7%         213.3MB/s                     128 + 64 = 192 clocks (1.20us)
    128B    8/16 = 1/2 = 50.0%         160.0MB/s                      64 + 64 = 128 clocks (0.80us)
     64B    4/12 = 1/3 = 33.3%         106.6MB/s                      32 + 64 =  96 clocks (0.60us)
     32B    2/10 = 1/5 = 20.0%          80.0MB/s                      16 + 64 =  80 clocks (0.50us)
     16B    1/9        = 11.1%          35.5MB/s                       8 + 64 =  72 clocks (0.45us)
      8B    1/9 x 1/2  =  5.5%          17.7MB/s                                 72 clocks (0.45us) (driver doesn't actually support this)
      4B    1/9 x 1/4  =  2.7%           8.8MB/s                                      "
      2B    1/9 x 1/8  =  1.3%           4.4MB/s                                      "
      1B    1/9 x 1/16 =  0.7%           2.2MB/s                                      "
    

    Now, just to prove that cache is the only way to go: for anyone thinking about doing a VM running from the SDRAM but without a working cache, just look at the performance for the small transfers. We are talking 0.45us of latency here, which yields 2.2 MIPS. And I'm not even accounting for the instruction execution, just the time to get the instruction data from the driver, and this is the case where the VM has exclusive access to SDRAM and doesn't compete with video. When it competes with video it gets worse, as there can be up to an additional 3.6us of latency (the average is half that) for whatever proportion of total time the video driver consumes SDRAM bandwidth. At say 1280 x 1024 x 16bpp and 512 byte bursts for video, video consumes 5/4 * 216MB/s = 270 MB/s of the 320MB/s, leaving 50MB/s of raw SDRAM bandwidth for the VM but adding on average 2.0/2 = 1.0us more latency to the VM for 270/320 = 84% of the time. By dividing video transfers into 512 byte bursts you need to make 5 transfers per video line at this resolution (2560 bytes per line). This then only gives sufficient time for about 10 opportunities for other small VM transfers to be interleaved with the video on each video line, which means 10 instructions per video line period, making 0.64 MIPS at best. Pretty slow and getting closer to P1 territory with SRAM.
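
    For reference, here is the simple model those burst-table numbers and the 2.2 MIPS figure fall out of, written as a small C sketch. The assumption is a 160MHz, 16 bit bus moving one word per clock (320MB/s raw), a fixed 64 clocks of command/setup overhead per burst, and a minimum supported burst of 16 bytes:

        #include <stdio.h>

        /* Model: 160MHz, 16-bit bus = one word per clock = 320MB/s raw,
           plus an assumed fixed 64 clocks of command/setup per burst. */
        int main(void)
        {
            const double clk_mhz = 160.0, raw_mbs = 320.0, overhead = 64.0;

            for (int bytes = 1024; bytes >= 16; bytes /= 2) {
                double xfer_clocks = bytes / 2.0;               /* 2 bytes per clock */
                double total       = xfer_clocks + overhead;    /* clocks per burst  */
                double efficiency  = xfer_clocks / total;
                double bandwidth   = raw_mbs * efficiency;      /* sustained MB/s    */
                double latency_us  = total / clk_mhz;
                printf("%5dB  eff %5.1f%%  %6.1f MB/s  %4.0f clocks (%.2fus)\n",
                       bytes, 100.0 * efficiency, bandwidth, total, latency_us);
            }

            /* A cacheless VM fetching one instruction per minimum (16 byte) burst: */
            double fetch_us = (16 / 2.0 + overhead) / clk_mhz;  /* 72 clocks = 0.45us */
            printf("cacheless VM upper bound: ~%.1f MIPS\n", 1.0 / fetch_us);
            return 0;
        }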

    When a hub cache is used, as would be expected, then depending on the video resolution being used it could be desirable to *lower* the maximum SDRAM memory bandwidth by reducing the video memory transfer burst size in order to get a latency reduction boost for the VM's cache, assuming it already gets good hit rates and doesn't need a significant amount of actual SDRAM bandwidth over what the video driver consumes. So you wouldn't simply choose a 1kB burst size for video accesses to get the highest efficiency if you don't need it. System performance could be improved if you reduce the transfer size (to a point).

    I expect you are also going to see quite similar things happen when a VM is drawing lines/circles etc. to video frames by writing single bytes/words of pixel data directly into the SDRAM (or even with a write-through cache), and in the same example given above it will probably achieve numbers like 0.64 Mpixels/s while video is actively transferring data and 2.2 Mpixels/s when it is not. So it may take ~2 seconds to fill the entire 2.5MB screen at 1280x1024x16bpp that way, which is rather slow. It would be much better to draw to a small off-screen temporary hub RAM buffer and then update some screen sections in a larger burst. Bitmap rendering software will need to take that into account. Actually the byte write performance may well be lower than this if such writes involve fiddling with the SDRAM qualifier mask (DQM) pins, which take extra cycles to control, but I didn't look at that part and non-quad-multiple writes are not in Chip's driver anyway (at least yet).

    Just some food for thought. Despite my various theories, I still can't wait to try this new device out and see how it really performs in the real world with hires graphics. :smile:

    Roger.
  • cgracey Posts: 14,133
    edited 2013-04-13 02:33
    Good analysis, Roger.

    I think a cache can/must be handled by the cog needing the data, leaving the SDRAM driver to just do its core job, without detailed concern.

    Like you pointed out, byte/word/long writes to SDRAM display memory are going to be slow. It might need to be that data is composed in hub RAM and then transferred to SDRAM in blocks. For drawing random lines and points, that doesn't work, though.
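
    To make the "compose in hub RAM, then transfer in blocks" idea concrete, here is a purely illustrative host-side C sketch of a small tile buffer that is drawn into freely and then pushed out one row-sized burst at a time. The helper names and sizes are invented, and sdram_write_burst() just stands in for whatever request would actually be posted to the SDRAM driver; as noted, scattered single-pixel writes (random lines and points) get no help from this:

        #include <stdint.h>
        #include <string.h>

        #define SCREEN_W   1280
        #define SCREEN_H   1024
        #define BYTES_PP      2                      /* 16bpp                       */
        #define TILE_W       64
        #define TILE_H       16

        static uint16_t tile[TILE_H][TILE_W];        /* small off-screen hub buffer */
        static uint8_t  fake_sdram[SCREEN_W * SCREEN_H * BYTES_PP];  /* model only  */

        /* Stand-in for a request to the SDRAM driver: burst-write 'bytes' bytes. */
        static void sdram_write_burst(uint32_t sdram_addr, const void *hub_src, uint32_t bytes)
        {
            memcpy(fake_sdram + sdram_addr, hub_src, bytes);
        }

        /* Draw freely into the hub tile: cheap, no SDRAM traffic at all. */
        static void tile_plot(int x, int y, uint16_t colour)
        {
            if (x >= 0 && x < TILE_W && y >= 0 && y < TILE_H)
                tile[y][x] = colour;
        }

        /* Flush the tile with one burst per row instead of one tiny write per pixel. */
        static void tile_flush(uint32_t fb_base, int dest_x, int dest_y)
        {
            for (int row = 0; row < TILE_H; row++) {
                uint32_t addr = fb_base +
                    ((uint32_t)(dest_y + row) * SCREEN_W + (uint32_t)dest_x) * BYTES_PP;
                sdram_write_burst(addr, tile[row], TILE_W * BYTES_PP);  /* 128B bursts */
            }
        }

        int main(void)
        {
            for (int i = 0; i < TILE_W; i++)
                tile_plot(i, i % TILE_H, 0xF800);    /* scribble a pattern into the tile */
            tile_flush(0, 100, 200);                 /* 16 bursts of 128 bytes each      */
            return 0;
        }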

    You understand the issue of having two SDRAMs. Maybe a cog set up to display video and serve blocks by hub requests can realize some advantage, but maybe not. I had the FPGA set up with two SDRAMs, at first, and it could do a 1080p 24bpp display, but there was no time for updating, which made it impractical.
  • rogloh Posts: 5,158
    edited 2013-04-13 09:06
    You understand the issue of having two SDRAMs. Maybe a cog set up to display video and serve blocks by hub requests can realize some advantage, but maybe not. I had the FPGA set up with two SDRAMs, at first, and it could do a 1080p 24bpp display, but there was no time for updating, which made it impractical.
    Yeah I imagined 1080p60 at 24bpp was going to be very difficult even with 32 bit wide SDRAM memory. The pixel clock rate for that resolution is 148.25MHz so with unpacked data (ie. one 24 bit pixel per 32 bit word) this needs 4x that data bandwidth or 593MB/s! That is higher than hub bandwidth so it needs the video and SDRAM driver to be combined in the same COG.

    Unfortunately as you found it really doesn't leave much SDRAM memory bandwidth free for other VM updates and refresh cycles.

    Once I get myself a P2 setup one day and a 32 bit wide SDRAM memory attached I might still like to have a play around with the concept and see how well it works and what can be achieved. It's likely the only way to get the performance needed for the highest resolutions if using a shared SDRAM approach on the P2. But despite all that, going for either 1080p24 or 1080p30 is more likely to be practical in the end instead of trying for the full blown 1080p60 resolution with truecolor.

    In the meantime 16 bits will just have to do. :smile:
  • potatohead Posts: 10,253
    edited 2013-04-13 11:09
    Can we do analog 24P? Edit: Well, of course we can. Displaying it without a converter chip is the real question. CRT displays would require very slow phosphor... Maybe the video processors in modern HDTV displays would handle this.

    If so, now I wonder about 24P in moderate and low resolutions. The product would be nice, slow sweep rates, which yield the most processing time and highest fill rates!

    Gotta go look at the signaling document now.
  • Roy Eltham Posts: 2,996
    edited 2013-04-13 12:12
    Don't have to go 60Hz for the frame rate. If you could do 30Hz or 24Hz, then that would work with a lot of HDTVs (if not all of them) also.
  • cgracey Posts: 14,133
    edited 2013-04-13 14:09
    Roy Eltham wrote: »
    Don't have to go 60Hz for the frame rate. If you could do 30Hz or 24Hz, then that would work with a lot of HDTVs (if not all of them) also.

    I've found, too, that my HDTV is happy to do 1080p at 30Hz refresh. These flat screens buffer all the pixel data, anyway, so there's no visible retrace/refresh issue, just a slower frame rate.
  • tritonium Posts: 540
    edited 2013-04-13 15:46
    I might be naive, but wasn't celluloid film 24 frames a sec? That always seemed smooth to me. I am often puzzled when video games hardware boasts of the number of frames per sec and says 24 is just barely playable. However resolution is everything as far as I'm concerned. Am I missing something?

    Dave
  • potatohead Posts: 10,253
    edited 2013-04-13 17:08
    Yes. That's the reason newer displays support 24p. The faster frame rates and/or interlaced displays all cause problems with the original cinematography. This is particularly noticeable on a 50/60Hz interlaced display like most older home TVs operate on. The film is at 24, meaning frames needed to be skipped or held over to match things up, and the interlace would cause motion artifacts across frames too, smearing things, tearing of edges, etc... where the original film didn't do that. You can see the difference on DVD media copies of TV programs filmed for TV. They look great, mostly because the original video capture was also interlaced. Where that's not true, processing needs to happen, or it just looks more crappy. Early progressive-scan-capable TVs were made to compensate for this.

    Originally, I really didn't think it was a big deal, particularly on progressive scan displays, but it is! On a lark, I bought a Blu-Ray copy of "How The West Was Won" that was encoded to play correctly in 24p. You get used to these things, until you view something without those artifacts.

    The Blu-Ray is capable of reproducing this film extremely well! I was stunned at the quality that process is capable of and realized very few people, prior to Blu-Ray availability, have seen what really was captured on older films. I may buy a few more of these that I like, just to really enjoy them. Color, contrast, detail are excellent. That movie is among the best in my collection. Pure eye candy to watch! Highly recommended, even if you don't like older westerns. You can view it technically and just marvel at just how good those guys were using analog means only. It's impressive.

    Motion artifacts are something I can see a lot more easily now too. After watching that one, which has many great scenes where there is a lot of basic activity and some camera motion, I can appreciate the cinematography. It just flows, which was the point.

    Re: Games and frames per second.

    A whole lot depends on the game. If the game demands short response times, frame rates really do matter. For those games, many people would trade both resolution and color depth for reduced latency in the experience overall. Quite a while back, I was playing a lot of Quake Arena. Great FPS shooter game. I used to run it at 640x480 to get high frame rates on the graphics device I was using. 100Hz was possible, and the CPU could keep up with that. Double the resolution, and the game would drop to 50-60Hz, and at "high resolution" would land somewhere around 20fps.

    The difference was easily noticed. Other players could move, fire off a shot, and move again where I was still aiming! So that's one element where latency matters.

    So then, I decided to maximize the experience. Set up my own server so that network latency was low (< 10ms average), got a high frequency input device, ran the graphics at 100fps frame rates, vertically synced to the CRT for a no-tearing, rock solid display experience, and the product of it was much better playability.

    It all adds up.

    These days, we struggle with the plasma / LCD display latency. CRT displays are basically frame accurate. If the display is running at 100Hz, then the game state is communicated 100 times per second, assuming the engine can do that and the display isn't multi-buffered. And that's an issue right there! Some game engines draw dynamically, meaning each frame is rendered to the current game state. Others buffer it one or more times, meaning you are not seeing the current state, but one from a few frames back. At 100Hz, this isn't much. At 25-30Hz, it's considerably more, and well-matched, serious players notice that impact on their performance.

    Now the modern HDTV displays buffer things, adding latency there too.

    A while back, I set up a nice CRT display running at high frame rates, a PC, low latency input devices, etc... and let them game on it for a while. Going back to the console with its dual analog sticks, compared to the insanely fast keyboard + mouse, along with the latency seen on modern displays / game engines, they had quite a few comments, mostly negative!

    That's the reason many will still game on a PC rather than a home console. Latency.

    Additionally, people vary in their response to these things, and that tends to roll off some with age, though not always. The center of our eye is the slowest, but it offers the most detail and color perception. Peripheral vision is much faster, but lower detail and low to no color, depending.

    For each frame, the persistence of vision helps to blend things together. In a dark theater, bright flashes from the projector persist long enough for most people to see a smooth sequence of images, but there are a lot of rules to be followed for that to really be true. When it's done well, the experience is immersive, and that is cinematography. When it's not done well, the illusion is broken, taking people out of the movie. A picket fence, for example, will jitter and appear disjointed when the camera motion isn't constrained to the set of vectors and velocities that ensure that fence is seen as smooth. The same scene without that fence can be shot "faster" with more aggressive motions.

    Some of us pick up on that more than others too.

    24p on a CRT is going to appear to flicker for most people, and that's due to the phosphors being fast enough to prevent smearing and ghosting of fast moving objects on high contrast backgrounds. That same 24p on a modern display will be just fine, because the display itself runs much faster and everything is buffered so that we don't see a series of flashing frames; the display is always lit, which matches well with the projector-in-a-dark-room effect.

    On a home TV, phosphors are a little slower than on VGA monitors, and this is so the 50 and 60Hz display rates used in television do not appear to flicker for most people. To some people they still do. VGA monitors really cease to flicker at 70-80Hz for central vision. Some people still see flicker in peripheral vision, depending on how their eyes operate. Most people don't however.

    60hz is about where the line is on that. Below 60 = flicker on just about any CRT display, unless it's an old slow-phosphor display like those old composite monochrome displays. I've got one of those, and it works down to about 45hz and appears flicker free, and it's high resolution, able to display 800 dots per line NTSC too. But it's crappy for higher frequency moving images. I use it to view PAL content at 50Hz as my only PAL color capable device is one of those capture things that run over USB, and I don't get a frame exact representation off the laptop LCD that way.

    If you have a CRT display, try setting it for 60hz, turn your room lights off and look at it with central vision. You may see some flicker if the contrast is turned up and you are looking at a bright image, or even solid white. Look away and let your eyes relax, then look at it with peripheral vision, just creeping your eyes toward it carefully. Chances are you can see the flicker that is really there.

    BTW: Side note. Animal eyes appear to run "faster" and they see flicker where we don't. I've noticed the animals will watch the HDTV intently at times where before they would ignore the CRT. As predators, this makes perfect sense the same as it does for that FPS gamer... They both have the same need for interacting quickly.

    Finally, interlaced displays bring a whole additional set of problems! If the primary display signal is 60Hz, a full frame only happens at 30Hz, and some detail elements present in one field or the other at high contrast will appear to flicker even though the display as a whole does not. NTSC TVs are particularly bad for this due to lower vertical resolution... The other problem is motion artifacts. To get smooth motion where the shape of things stays constant while in motion on an interlaced display, it's necessary to constrain that motion to half the scan frequency so that the object is presented in the same place for both even and odd fields. Failure to do this results in "tearing".

    Computer graphics show this tearing quite easily! They are high contrast things. Natural images of things moving that are captured on interlaced devices tend to blend together across frames, resulting in a much smoother image overall. It costs a lot of computer power to reproduce this, so it's mostly not done. Much better to either up the frame rate of the display, or just cut the motion frame rate back to avoid interlace tearing problems... A similar kind of tearing happens on progressive scan displays where the image is changing faster than the frame rate of the display is capable of rendering. The viewer will see part of the old image and part of the new image or just a kind of jittery one depending on how ugly that difference is.

    Early computers often ran TV displays non-interlaced to get the higher frame rates for motion and to avoid the flicker seen on high contrast objects that do not occupy a scan line in both frames. VGA displays would run interlaced for business graphics, and often switch to non-interlaced, at some reduction in vertical resolution, for gaming where motion was a higher priority than overall resolution was.

    Most of these things were bigger problems on CRT displays due to their real time nature. Modern LCD / Plasma displays buffer everything, trading accuracy for latency. They render a frame or few behind, saving the viewer many of the frame rate problems, which makes among other things possible, 24p displays that make sense!

    For us, on the Propeller, that buffering is probably exploitable. We can run slow sweep rates, which maximize display compute and fill rates, and still get high resolution displays that are useful for text and moving graphics, and where we can't do that and motion is a priority, drop the resolution down some and do it at a higher frame rate overall.

    Sheesh... Sorry. Long post. I get to thinking about these things and ramble. Sorry! :)
  • Cluso99 Posts: 18,069
    edited 2013-04-13 18:54
    Thanks for the detailed explanation potatohead. A lot of valuable info to digest.
  • tritonium Posts: 540
    edited 2013-04-14 12:01
    Phew!!
    Displays 101...
    I bet you feel better now that's off your mind :smile:
    Thanks, there's more to this than meets the eye...:innocent:

    Yep that makes a lot of sense and thanks for sharing.

    Dave
  • evanh Posts: 15,187
    edited 2013-04-14 15:04
    potatohead wrote: »
    Can we do analog 24P? Edit: Well, of course we can. Displaying it without a converter chip is the real question. CRT displays would require very slow phospher... Maybe the video processors in modern HDTV displays would handle this.

    If so, now I wonder about 24P in moderate and low resolutions. The product would be nice, slow sweep rates, which yield the most processing time and highest fill rates!

    The answer has always been, and still is to this day, interlace. Interlacing has the nice advantage of providing the needed higher sync rates for non-buffered displays (and by extension effective framerate when the source allows) while still providing full resolution picture and thereby inherently including progressive encoding for sources that don't use the full framerate.

    The alternative being half the vertical resolution for the same bandwidth, ie: sequential scan. On that note, I do wonder what a modern HDMI based TV would do with such a screenmode. Would it scale a digitally transmitted 320x240, 15kHz picture nicely? I know the average DVI computer monitor won't handle this; they have an arbitrarily limited minimum scanrate of about 30kHz, requiring the graphics chip to scan-double any lower res screenmode.
  • potatohead Posts: 10,253
    edited 2013-04-14 15:30
    I agree with you on older real time displays. New ones buffer the whole thing. If they can accept 24p at low and moderate resolutions, I'm inclined to do just that, simplifying things considerably.

    Sure worth a try at some point.
  • evanh Posts: 15,187
    edited 2013-04-14 15:47
    potatohead wrote: »
    Finally, interlaced displays bring a whole additional set of problems! If the primary display signal is 60hz, a full frame only happens every 30Hz, and some detail elements present on one frame or the other at high contrast will appear to flicker even though the display as a whole does not. NTSC TV's are particularly bad for this due to lower vertical resolution... The other problem is motion artifacts. To get smooth motion where the shape of things stays constant while in motion on an interlaced display, it's necessary to constrain that motion to half the scan frequency so that the object is presented in the same place for both even and odd frames. Failure to do this results in "tearing"

    Be a little wary of this type of thinking. It comes from situations where the interlaced data has been deinterlaced cheaply.

    The statement fails in two ways that I can think of off the top of my head:
    - On a display that directly uses the interlaced signal (an analogue CRT), the tearing is not present at all. And the effective framerate is the field rate. The level of flicker is the refresh rate of the display. There was a level of shimmering on high contrast, but that was considered OK for business applications of the time. Games and TV could use softer contrasts.
    - On a well implemented deinterlacer the motion tracking algorithm tracks the full framerate and thereby correctly prevents the visual artefacting from being displayed. This requires frame buffering and therefore some small lag. The algorithm has to add information to the image, because the reconstruction is typically doubling the stream bandwidth. This can result in no tearing, no flicker and, by extension, no shimmering at all, as the display device may not even have a refresh.

    Early computers often ran TV displays non-interlaced to get the higher frame rates for motion and to avoid the flicker seen on high contrast objects that do not occupy a scan line in both frames. VGA displays would run interlaced for business graphics, and often switch to non-interlaced, at some reduction in vertical resolution, for gaming where motion was a higher priority than overall resolution was.
    Games used "non-interlaced" sequential screenmodes for speed of execution and compatibility. RAM space and RAM speed were a premium. Interlaced on a TV was easy but on the PC it was problematic. And when it was used for CAD work and the likes, not only was effort put into making it work, it was done at a higher refresh rate to reduce shimmering.

    For us, on the Propeller, that buffering is probably exploitable. We can run slow sweep rates, which maximize display compute and fill rates, and still get high resolution displays that are useful for text and moving graphics, and where we can't do that and motion is a priority, drop the resolution down some and do it at a higher frame rate overall.
    In theory, yes; in practice, very unlikely due to arbitrary limits the manufacturers build into the displays.
  • jmg Posts: 15,145
    edited 2013-04-14 16:31
    cgracey wrote: »
    I've found, too, that my HDTV is happy to do 1080p at 30Hz refresh. These flat screens buffer all the pixel data, anyway, so there's no visible retrace/refresh issue, just a slower frame rate.

    At what frame update speeds do these HDTV displays start to have visible issues? ie. just how slow can you go?
  • potatohead Posts: 10,253
    edited 2013-04-14 19:06
    Be a little wary of this type of thinking. It comes from situations where the interlaced data has been deinterlaced cheaply.

    The statement fails in two ways that I can think of off the top of my head:
    On a display that directly uses the interlaced signal (an analogue CRT), the tearing is not present at all. And the effective framerate is the field rate.

    I'm sorry, but that is absolutely not my experience, or we are talking past one another. To be clear on what I was writing about, say we are on a PAL interlaced display. All the lines are displayed at a rate of 25hz. Motion faster than that will not be scan line synchronized, and the shape of the image will degrade. In progressive scan, motion can be 50hz without tearing. Interlaced is constrained to 25hz. The degree of object tearing depends on the position change. With standard definition TV sets, the cost is vertical resolution. Full vertical resolution = 1/2 the motion, half vertical resolution = full motion at 25 and 50hz respectively for PAL, 30 / 60hz NTSC.

    Video processing can mitigate this, at the potential cost of both detail and or temporal accuracy for the motion. Put one of those 3D games up on an interlaced display and go move around some. Or do the same with some bright sprites or vectors against a dark background. It's gonna tear or blur on an interlaced display when the motion exceeds the "show all the scan lines" rate, and the tear will happen on analog devices with no processing and the blur will happen on those that feature video processors.

    The limits are obvious, again, unless motion is constrained to the rate where all scan lines are displayed; if it is, then the only real cost is update latency and, with that, coarser motion perception.

    I'm not against interlaced displays, and in fact have often made use of them for various things, and was citing the potential results for the various display technologies out there. Given what modern displays can do, I question whether or not "interlace has always been and is the answer today". A 24p display will be simpler to manage than an interlaced one is.

    And since we are talking details, one big difference here, assuming we don't have a deinterlace system in play, is image tearing on eyeball movement. Given the same rate to display the entire image, motion and detail presented to the user would be equal. Say 30Hz progressive vs 60Hz interlaced. Both are capable of the same motion detail. However, a user glancing around at a larger display, or one with high contrast elements, or faster phosphors in the case of a CRT, will trigger the image tearing themselves just due to eye movement! For many people, this isn't something they perceive directly. Some of us can, and I can. What it does do generally is increase viewing fatigue, whereas with the same low frame rate displayed on a device that buffers the display, like a modern HDTV, that fatigue isn't there.

    ...that is assuming the display itself does not produce similar artifacts. Some projection type devices rapidly cycle through the colors at something like 90-120+hz, which can be seen on a bright object with some eye movement, as the full frame is displayed at a rate lower than 50-60hz. The effect is a whole lot like the one we experience with an LED clock in a dark room, or when viewing LEDs that are not constantly lit on a dark road. Eye movement leaves a series of sharp, distracting images rather than a smudge, and we deal with smudges better as people than we do lots of little sharp persistent images...

    Your final statement is one I agree with, as the cost is either detail, temporal accuracy in the motion, or latency in display getting to the user to patch it all up.
    "non-interlaced" sequential screenmodes for speed of execution and compatibility

    Sometimes yes. This depends on the game. People who like to play games and who author them prefer higher frame rates because there is less interaction latency and motion is smoother. For some kinds of games, it won't matter much. For others, particularly immersive ones, it matters a lot! I agree with speed of execution, which gets right back to frame rate and latency in the process overall. Higher latency contests that strain hand / eye skills suffer badly at lower frame rates. Other kinds of things don't.

    Which brings me back to this:
    On a well implemented deinterlacer the motion tracking algorithm tracks the full framerate and thereby correctly prevents the visual artefacting from being displayed. This requires frame buffering and therefore some small lag.

    I think we are saying the same thing differently. Not sure how what I wrote is a failure though.
    And when it was used for CAD work and the likes, not only was effort put into making it work, it was done at a higher refresh rate to reduce shimmering.

    I've been doing CAD since the 80's and getting progressive scan displays at any reasonable frame rate was THE goal. For older 2D CAD systems featuring wireframe 16 color type graphics, interlaced displays were *ok*, but far from optimal. Dark backgrounds could reduce fatigue, but not eliminate it. For 3D CAD of any kind, wireframe, surface, solids, an interlaced display very quickly becomes unacceptable, again due to motion artifacts or user interaction latency.

    When positioning a mouse for example, there are stark differences at various frame rates. Update the mouse pointer at 15Hz and most people will struggle, feeling a bit drunk as they overcompensate and are forced to move, slow down, tune in, then click. At 30hz this begins to diminish, and at 50hz and above, most people find it "real time" and no longer experience the difficulty with fine detail positioning. Ramp this to 100hz, and now it's smooth as glass, precise motions possible at very high speeds. The CAD user is all about this, and most of them doing serious CAD paid and paid big and early to get >60hz non-interlaced displays and input devices. (I should know, I sold a ton of them :) As well as owned one as quickly as I could, and I could go off on an SGI lark here too. Those guys got all this stuff, delivering very highly interactive 2D 3D computing while many were still trying to just get a working 2D desktop at speed....)