@Tubular said:
Here's a picture of WUXGA (1920x1200) DVI text on the Philips 288P monitor
That's so cool it displays that high a res at the low refresh and with coloured text too! VGA output above XGA can't even do coloured text like that with my single COG driver; it needs at least a 5x clock ratio, which is why it falls back to the blue monochrome text (uses palette entries 0 or 1 only). Maybe I should let those VGA modes run at lower refresh rates than the typical 50 or 60Hz too. But most CRT-based VGA monitors want at least 50Hz.
The problem might be that the pins (+6, +7) are floating and coupling noise from adjacent pins. Just drive them both low and that should take care of it.
Going much lower than 50 Hz on a CRT would be very flickery, if the screen even accepts it, because of how the phosphor loses energy between refreshes and because the frequency is below the flicker fusion threshold of the human eye. Modern screens have buffers and retain the image until it's time to do a new refresh.
Actually, LCDs aren't buffered in the LCD matrix drivers. The crystals are so slow they may as well be static. They need to be driven in both directions to have any speed.
There is a buffer but that's used for scan rate conversion in TVs and monitors and it's on a separate board. The LCD interface, CMOS or LVDS, links the converter to the LCD drivers.
High-end monitors are actually moving away from static displays again, realizing after like 20+ years that LCD motion clarity is still kinda trash. So they basically strobe either the backlight or the LCD itself, haha. OLED tech can do true scanned displays, but that's really fringe as of now.
Thinking about it, I could be wrong in another way. There may be latches that effectively act as buffers. They would be there, for say a whole scan line, to provide the needed lasting drive timings.
Got some P2 hw on my desk now, so time to try some video out.
Disappointingly, something seems not right. I wrote some test code to display a simple bitmap over svideo and this happens:
CON
OUTBUFSIZE = 360*2
VAR
long region[video.REGION_SIZE/4] ' text region structure
long display[video.DISPLAY_SIZE/4] ' display structure
byte context[video.CONTEXT_SIZE] ' context data for text region
byte outputBuffers[OUTBUFSIZE*2] ' space for two line buffers
byte bitBuffer[360*240] ' screen buffer
OBJ
video: "p2videodrv.spin2"
PUB main() |i
video.initDisplay(-1, { the cogid to use (-1 = auto-allocate)
} @display, { the display structure address in HUB RAM
} video.SVIDEO_CVBS, { video output type (VGA/DVI etc)
} 32, { base pin number (hsync pin) of DVI pin group
} 0, { VSYNC pin (not used for DVI)
} video.PROGRESSIVE, { display flags
} @outputBuffers, { address of the consecutive two scan line buffers in HUB RAM
} OUTBUFSIZE, { size of a single scan line buffer in bytes
} @prog240_timing, { obtain stock timing to use, (or create custom timing instead)
} 0, { optional external memory mailbox address in HUB RAM (0=none)
} video.initRegion( { setup a single text region as the display list
} @region, { region structure address in HUB RAM
} video.RGB8, { type of region is text
} 0, { size of region in scan lines (0=to end of screen)
} video.DOUBLE_WIDE, { region specific flags (if enabled text flashes if BG colour > 7)
} 0, { address of default palette to be used by region
} 0, { address of default font to be used by this region
} 0, { number of scan lines in font
} @bitBuffer, { address of screen buffer in HUB RAM
} 0) { link to next region, NULL = last region
} )
repeat i from 0 to 239
  bytefill(@bitBuffer+i*360,i&1 ? %111_111_11 : %000_000_11,360)
repeat
DAT
prog240_timing '720x240p timing @ 60Hz with 13.5MHz pixel clock
long video.CLK108MHz
long 108000000
'_HSyncPolarity___FrontPorch__SyncWidth___BackPorch__Columns
' 1 bit 7 bits 8 bits 8 bits 8 bits
long (video.SYNC_NEG<<31) | ( 16<<24) | ( 64<<16) | ( 58<<8 ) | (720/8)
'long (SYNC_NEG<<31) | ( 56<<24) | ( 64<<16) | ( 98<<8 ) | (640/8)
'_VSyncPolarity___FrontPorch__SyncWidth___BackPorch__Visible
' 1 bit 8 bits 3 bits 9 bits 11 bits
long (video.SYNC_NEG<<31) | ( 0<<23) | ( 0<<20) | ( 13<<11) | 240
long 8<<8
long (8<<24) + (36<<16)
'long round(3579545.0 / 108000000.0 * float($7FFFFFFF) * 2.0)
long 284704235/2 'reserved for CFRQ parameter
Unrelatedly, it also does this weird flagging thing on my TV card. Notice how the top quarter or so of the screen is slightly offset.
CON
OUTBUFSIZE = WIDTH*2
WIDTH = 720
VAR
long region[video.REGION_SIZE/4] ' text region structure
long display[video.DISPLAY_SIZE/4] ' display structure
byte context[video.CONTEXT_SIZE] ' context data for text region
byte outputBuffers[OUTBUFSIZE*2] ' space for two line buffers
byte bitBuffer[WIDTH*240] ' screen buffer
OBJ
video: "p2videodrv.spin2"
PUB main() |i,j,a,b,tmp
video.initDisplay(-1, { the cogid to use (-1 = auto-allocate)
} @display, { the display structure address in HUB RAM
} video.SVIDEO_CVBS, { video output type (VGA/DVI etc)
} 32, { base pin number (hsync pin) of DVI pin group
} 0, { VSYNC pin (not used for DVI)
} video.PROGRESSIVE, { display flags
} @outputBuffers, { address of the consecutive two scan line buffers in HUB RAM
} OUTBUFSIZE, { size of a single scan line buffer in bytes
} 0, { obtain stock timing to use, (or create custom timing instead)
} 0, { optional external memory mailbox address in HUB RAM (0=none)
} video.initRegion( { setup a single text region as the display list
} @region, { region structure address in HUB RAM
} video.RGB8, { type of region is text
} 0, { size of region in scan lines (0=to end of screen)
} 0*video.DOUBLE_WIDE, { region specific flags (if enabled text flashes if BG colour > 7)
} 0, { address of default palette to be used by region
} 0, { address of default font to be used by this region
} 0, { number of scan lines in font
} @bitBuffer, { address of screen buffer in HUB RAM
} 0) { link to next region, NULL = last region
} )
repeat i from 0 to 239
  tmp := @bitBuffer + (i*WIDTH)
  a := i>>0
  repeat j from 0 to WIDTH-1
    b:= j>>2
    byte[tmp+j] := ((a&b) AND((!a&!b)&255)) ? %111_111_11 : %000_000_11
repeat i from 0 to 239
  'bytefill(@bitBuffer+i*WIDTH,i&1 ? %111_111_11 : %000_000_11,WIDTH)
repeat
Additionally, the top left corner gets weird when DOUBLE_WIDE + border/skew is enabled (IMO skew shouldn't need to be set to account for the border, but maybe that saved an instruction somewhere, IDK)
CON
OUTBUFSIZE = WIDTH*2
WIDTH = 256
VAR
long region[video.REGION_SIZE/4] ' text region structure
long display[video.DISPLAY_SIZE/4] ' display structure
byte context[video.CONTEXT_SIZE] ' context data for text region
byte outputBuffers[OUTBUFSIZE*2] ' space for two line buffers
byte bitBuffer[WIDTH*240] ' screen buffer
OBJ
video: "p2videodrv.spin2"
PUB main() |i,j,a,b,tmp
video.initDisplay(-1, { the cogid to use (-1 = auto-allocate)
} @display, { the display structure address in HUB RAM
} video.SVIDEO_CVBS, { video output type (VGA/DVI etc)
} 32, { base pin number (hsync pin) of DVI pin group
} 0, { VSYNC pin (not used for DVI)
} video.PROGRESSIVE, { display flags
} @outputBuffers, { address of the consecutive two scan line buffers in HUB RAM
} OUTBUFSIZE, { size of a single scan line buffer in bytes
} @prog240_timing2, { obtain stock timing to use, (or create custom timing instead)
} 0, { optional external memory mailbox address in HUB RAM (0=none)
} video.initRegion( { setup a single text region as the display list
} @region, { region structure address in HUB RAM
} video.RGB8, { type of region is text
} 0, { size of region in scan lines (0=to end of screen)
} 1*video.DOUBLE_WIDE, { region specific flags (if enabled text flashes if BG colour > 7)
} 0, { address of default palette to be used by region
} 0, { address of default font to be used by this region
} 0, { number of scan lines in font
} @bitBuffer, { address of screen buffer in HUB RAM
} 0) { link to next region, NULL = last region
} )
video.setDisplayBorderSizes(@display,1,1,16)
video.setDisplayBorderColour(@display,$FF_FF_00)
video.setSkew(@region,-16*1)
repeat i from 0 to 239
  ' wordfill counts words, so WIDTH/2 words fill one WIDTH-byte row
  wordfill(@bitBuffer+i*WIDTH,i&1 ? %111_111_11__000_000_00 : %000_000_11__111_111_11,WIDTH/2)
repeat
DAT
prog240_timing2 '544x240p timing @ 60Hz with 10.41MHz pixel clock (32/11 NTSC )
long $0128BCFB ' 96x NTSC colorburst (343.6 MHz)
long $147B798C ' ^^
'_HSyncPolarity___FrontPorch__SyncWidth___BackPorch__Columns
' 1 bit 7 bits 8 bits 8 bits 8 bits
long (video.SYNC_NEG<<31) | ( (16+3)<<24) | ( 48<<16) | ( (48+3)<<8 ) | ((512+32)/8)
'long (SYNC_NEG<<31) | ( 56<<24) | ( 64<<16) | ( 98<<8 ) | (640/8)
'_VSyncPolarity___FrontPorch__SyncWidth___BackPorch__Visible
' 1 bit 8 bits 3 bits 9 bits 11 bits
long (video.SYNC_NEG<<31) | ( 0<<23) | ( 0<<20) | ( 13<<11) | 240
long 33<<8
long (8<<24) + (36<<16)
'long round(3579545.0 / 108000000.0 * float($7FFFFFFF) * 2.0)
'long 284704235/2 'reserved for CFRQ parameter
long round(1.0/96.0 * 4294967296.0)
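For anyone following along: the last long in each of these timing blocks looks like a plain NCO ratio, colour carrier divided by system clock, scaled by 2^32. That reading fits 284704235/2 at 108 MHz, 284704235/4 at 216 MHz, and 1/96 when the system clock is 96x the colourburst. Below is a minimal sketch of that arithmetic, assuming this interpretation is right. `colourNco` and `demo` are made-up names, not part of p2videodrv.spin2. It uses the integer `FRAC` operator rather than float constants, since single-precision floats cannot hold a value this size exactly.

```spin2
' Sketch only: colourNco and demo are hypothetical helpers, not driver API.
PUB colourNco(colourHz, sysclkHz) : nco
  ' FRAC is the Spin2 unsigned fraction operator: (x << 32) / y
  ' It is only meaningful while colourHz < sysclkHz.
  nco := colourHz frac sysclkHz

PUB demo() | ntsc108, ntsc216
  ntsc108 := colourNco(3_579_545, 108_000_000)   ' compare with 284704235/2
  ntsc216 := colourNco(3_579_545, 216_000_000)   ' compare with 284704235/4
  debug(udec(ntsc108), udec(ntsc216))
```

The result truncates rather than rounds, so it can differ from a `round()` based constant by one count.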
Ok, let me take a look at your samples later today when I get onto the P2. When I released this updated beta driver it was unverified for TV video graphics modes as it was aimed primarily at text and I'm still working on finalizing it and validating everything, but I'll check it out to see if somehow a regression has crept in. I did see something weird in graphics modes the other day when I ran some stuff for pik33 where I saw an offset but that was over 350MHz and so I put it down to that at the time as it went away when I reduced it to more normal speeds. I got this weird feeling the streamer fifo startup timing + PLL changes were a bit off in graphics modes. As I believed everything was 100% timing wise and pixel perfect before, I wonder if something crept in after these changes. I'll need to look into this, thanks for reporting it. You might be right about work buffer corruption being related, that can certainly mess the video up depending on the structure layout in memory.
@Wuerfel_21
Ok so I am reproducing the first issue. While normal pixel width looks okay, enabling double wide pixels with s-video/composite output seems to have triggered the offset and slip problem. I don't think I recall having tried that specific combo before, so this might have even been in there from ages ago. I'll have to dig into the PASM2 code a bit to see why this would have a problem.
As for the second one (flagging with TV capture), I don't get that outcome on my display, but I am feeding into a Dell LCD S-video input as well as the composite input for comparison. Both displays are stable here but there is certainly some amount of colour phasing on the composite on the sharp edges. I wonder if it might be something to do with the signal levels/sync accepted by your capture card, which may not like this P2 timing or waveform for some reason. Could it be the colour burst not being stable enough? In the past during development I have seen flagging like this if the timing between scan line syncs is not spot on, like if the streamer command to send sync gets delayed or something. When I can, I'll move the setup into another room and test my plasma's S-video input to see how that copes.
@Wuerfel_21 said:
Additionally, the top left corner gets weird when DOUBLE_WIDE + border/skew is enabled (IMO skew shouldn't need to be set to account for the border, but maybe that saved an instruction somewhere, IDK)
Yes the linkage between appropriate skew and border width is not automatic in the driver at this time. I could try to do something there at some future time in SPIN2 perhaps to compensate by adding skew if you enable a border, but right now if you enable a border it just eats into the amount of pixels normally displayed from real source data while it still uses the columns as setup in the timing.
As far as your 3rd issue goes, I think it is because your output buffer is not large enough. The timing says 544 pixels (68 columns), but your line buffer (OUTBUFSIZE) was only setup for 512 pixels. This is why there are 32 pixels more being generated and why your bitBuffer was being corrupted. The scan line buffer still needs room for the border as well.
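To make the sizing concrete, here is a minimal sketch of deriving the line buffer size from the timing instead of from the source width. It assumes one byte per output pixel in the scan line buffer for RGB8, which is consistent with the other listings in this thread (720 bytes for 720 pixels, 704 bytes for 704 pixels) but is not confirmed against the driver source. `TIMING_COLUMNS` and `LINE_PIXELS` are hypothetical names.

```spin2
CON
  ' Sketch only: hypothetical names, numbers taken from prog240_timing2.
  TIMING_COLUMNS = (512+32)/8          ' 68 columns in the horizontal timing long
  LINE_PIXELS    = TIMING_COLUMNS*8    ' 544 output pixels per scan line, border included
  OUTBUFSIZE     = LINE_PIXELS         ' assumes one byte per output pixel (RGB8)

VAR
  byte outputBuffers[OUTBUFSIZE*2]     ' space for two line buffers
```

Tying OUTBUFSIZE to the column count means widening the timing for a border also widens the buffer.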
Still digging into the first issue...something could be getting rounded down somewhere by a pixel. Maybe that explains the flagging too...? Did the flagging problem go away when you ran non pixel doubled?
Both displays are stable here but there is certainly some amount of colour phasing on the composite on the sharp edges.
Doing an xzero each frame did a lot to reduce that. It's accumulated phase error between the TV reference and the P2 generated signal.
To see it stand right out, create a set of vertical lines, half a color clock, or less, wide.
On moderate contrast images, this isn't very easily seen.
A P2 driven off a colorburst crystal would resolve it for either PAL or NTSC, depending. And there is always 13.500 MHz, which can be common to both of them.
I think I tried that xzero thing at some point and at the time I didn't really see a difference, but I am happy to test that again if I can get it to fit. I might have to put it in the ** lines as I don't think I have space for one extra standalone xzero instruction. It needs to fit in the area below (which gets patched differently for NTSC/PAL).
' SDTV interlaced sync code (larger PAL variant is coded, NTSC is patched over this)
interlacedsd 'some different sync code patches
cy testb fieldcount, #0 wc 'field interlace state
ci if_c call #\hsync 'deal with interlaced field lines
**if_c xcont m_slim, hsync0 'send a slim half line**
bitmask
ntsc1 rep #2, #5-0 'defaults to PAL
muxmask xcont sync_000, hsync1 'generate horizontal blanking/sync
offset xcont sync_001, hsync0
decod status, #31 'update status - in vertical sync
bpp setbyte status, fieldcount, #2 'update field counter in status
writestat wrlong status, statusaddr
ntsc2 rep #2, #5-0 'defaults to PAL
xcont sync_002, hsync1 'generate horizontal blanking/sync
xcont sync_003, hsync0
ntsc3 rep #2, #5-0 'defaults to PAL
xcont sync_000, hsync1 'generate horizontal blanking/sync
xcont sync_001, hsync0
'PAL variant below is the default
'while NTSC gets patched
ntsc4 if_c xor patchvbp, #1 'generate CQ XOR flip on fields 2,3
ntsc5 if_c callpa #1, #$+(blank_pal-syncspace-ntsc5+interlacedsd) 'relocated call
**ntsc6 if_nc xcont m_half, hsync0 'remainder after half line (PAL)**
I did not say it earlier, but the shimmer is phase error between the pixel clock and the color burst reference.
P1 did this too. But it was stable, and 4 bits in size. It never came up, unless one was mixing colors to get artifacts. The solution was to generate a software color burst, basically avoiding that circuit, using pixel clock only. Eric Ball did it, and my Nyan Cat demo puts it to use.
The P1 had its PLL, which kept it stable. The P2 NCO has a rolling error.
Chip suggested one xcont / frame. I had mixed, but better results. Next move was to use a colorburst crystal, or 13.5 MHz.
Yes the s-video is a much better looking image right now. I'd say it's not 100% perfect but probably quite usable at the lower resolutions. Composite is still a bit ugly though I feel I did improve it a bit since the last time around. I think NTSC looked better than PAL (it seemed brighter more vibrant, maybe oversaturated).
I think the color phasing also happens because you can't make the line frequency = color carrier / (N+0.5) when everything has to be quantized to the pixel clock and the pixel clock isn't a multiple of 2*color carrier (which is somewhat undesirable because it creates lots of artifacts).
@Wuerfel_21 I found this seems to be happening when the number of columns in the timing is not divisible by 4 and you also select double wide pixels. I don't know why this should be the case, but I can make it happen with DVI the same way (it's not limited to the TV mode). The streamer is meant to be able to stop on a single pixel boundary, there should not be a limitation there and I compute the number of pixels to generate by multiplying by 8 (vis_pixels and m_rf in the p2videodrv.spin2 file). It's somehow stretching out the line it generates which would probably also explain your flagging issue, especially if that flagging does not happen when you didn't set double wide. In the case of 720 pixel wide scan lines, this should be 45 x double wide (16 pixel) columns = 360 wide pixels when pixel doubled, but I see 46 on the screen (it is adding an extra column).
This would be why I've not seen this before because all the standard video modes I used have columns that are divisible by 4 and that is what I did most of my testing with. Now after today we know that the driver doesn't (yet) handle the cases of columns not being divisible by 4 whenever pixel doubling is enabled.
640x480 (80)
800x600 (100)
1024x768 (128)
1280x1024 (160)
1920x1080 (240)
1920x1200 (240)
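The rule of thumb above can be sketched as a quick check to run on a custom timing before enabling double wide pixels. Both methods are hypothetical helpers, not driver API. The round-up to 32 pixels only models what was observed on screen in this thread (720 requested, 46 double-wide columns shown), not a documented driver behaviour.

```spin2
' Sketch only: hypothetical helpers, not part of p2videodrv.spin2.
PUB doubleWideSafe(activePixels) : ok
  ' columns are 8 pixels wide; the beta driver only handles doubling
  ' cleanly when the column count is a multiple of 4
  ok := ((activePixels / 8) // 4) == 0

PUB observedWidth(activePixels) : pixels
  ' models the observed behaviour: round up to a multiple of 32 pixels
  pixels := ((activePixels + 31) / 32) * 32

PUB demo()
  debug(sdec(doubleWideSafe(720)), sdec(doubleWideSafe(704)))  ' 0 (false), -1 (true)
  debug(udec(observedWidth(720)))                              ' 736 = 46 x 16
```

720 pixels is 90 columns, which leaves a remainder of 2, while 704 pixels is 88 columns and divides evenly.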
This mod to your code works (if you set your base pin correctly for your setup):
OUTBUFSIZE = 352*2
VAR
long region[video.REGION_SIZE/4] ' text region structure
long display[video.DISPLAY_SIZE/4] ' display structure
byte context[video.CONTEXT_SIZE] ' context data for text region
byte outputBuffers[OUTBUFSIZE*2] ' space for two line buffers
byte bitBuffer[352*240] ' screen buffer
OBJ
video: "p2videodrv.spin2"
PUB main() |i
video.initDisplay(-1, { the cogid to use (-1 = auto-allocate)
} @display, { the display structure address in HUB RAM
} video.SVIDEO_CVBS, { video output type (VGA/DVI etc)
} 0, { base pin number (hsync pin) of DVI pin group
} 0, { VSYNC pin (not used for DVI)
} video.PROGRESSIVE*0, { display flags
} @outputBuffers, { address of the consecutive two scan line buffers in HUB RAM
} OUTBUFSIZE, { size of a single scan line buffer in bytes
} @prog240_timing, { obtain stock timing to use, (or create custom timing instead)
} 0, { optional external memory mailbox address in HUB RAM (0=none)
} video.initRegion( { setup a single text region as the display list
} @region, { region structure address in HUB RAM
} video.RGB8, { type of region is text
} 0, { size of region in scan lines (0=to end of screen)
} video.DOUBLE_WIDE*1+video.DOUBLE_HIGH*1, { region specific flags (if enabled text flashes if BG colour > 7)
} 0, { address of default palette to be used by region
} 0, { address of default font to be used by this region
} 0, { number of scan lines in font
} @bitBuffer, { address of screen buffer in HUB RAM
} 0) { link to next region, NULL = last region
} )
video.setDisplayBorderSizes(@display, 30, 0,8)
repeat i from 0 to 239
  bytefill(@bitBuffer+i*352,(i&1) ? %111_111_11 : %000_000_11,352)
repeat i from 0 to 100
  byte[@bitbuffer][i*353]:=$0
repeat
  bitBuffer[52]:=$ff
  bitBuffer[60]:=$ff
  bitBuffer[352+60]:=$03
  bitBuffer[720+60]:=$ff
  bitBuffer[1080+60]:=$03
  bitBuffer[2*360+45*8]:=3
  waitms(300)
  bitBuffer[52]:=$03
  bitBuffer[60]:=$03
  bitBuffer[360+60]:=$ff
  bitBuffer[720+60]:=$03
  bitBuffer[1080+60]:=$ff
  bitBuffer[2*360+45*8]:=$ff
  waitms(300)
repeat
DAT
prog240_timing '720x240p timing @ 60Hz with 13.5MHz pixel clock
long 0'video.CLK108MHz
long 216000000
'_HSyncPolarity___FrontPorch__SyncWidth___BackPorch__Columns
' 1 bit 7 bits 8 bits 8 bits 8 bits
long (video.SYNC_NEG<<31) | ( 16<<24) | ( 64<<16) | ( (58+0*16)<<8 ) | (704/8)
'long (SYNC_NEG<<31) | ( 56<<24) | ( 64<<16) | ( 98<<8 ) | (640/8)
'_VSyncPolarity___FrontPorch__SyncWidth___BackPorch__Visible
' 1 bit 8 bits 3 bits 9 bits 11 bits
long (video.SYNC_NEG<<31) | ( 0<<23) | ( 0<<20) | ( 13<<11) | 240
long 16<<8
long (8<<24) + (36<<16)
'long round(3579545.0 / 108000000.0 * float($7FFFFFFF) * 2.0)
long 284704235/4 'reserved for CFRQ parameter
Ok yes that makes sense. My Dell monitor TV input is too forgiving there and doesn't show it up as a problem. I'd been experimenting with the timing and had patched that value from the original 1*16 when I pasted it here, while I was switching between 720 and 704 pixels.
I still don't understand why the streamer is sending out a multiple of 32 pixels. In the case of 720 pixel timing it is being told to send 720, but seems to send 736. I need to dump my dynamic "m_rf" streamer command value into HUB RAM to be sure what is going on there in the driver. It's a bit bizarre as to what is happening in this odd double wide case. I understand the streamer to be able to stop on individual pixels, so if I'm telling it to do 720, it should do 720 and not round up to a multiple of 32 somehow. It still must be some weird side effect in my code I don't yet see, which means it is hopefully still fixable once I can figure it out.
I figured out why the flagging was happening: The DIP switches on the AV board are labeled reverse, so I erroneously left the luma pin in AC coupling mode.
@Wuerfel_21 said:
I figured out why the flagging was happening: The DIP switches on the AV board are labeled reverse, so I erroneously left the luma pin in AC coupling mode.
Ok, great that you found out why.
@Wuerfel_21 said:
Also, would it be possible to make the border color reload every line? That way it could be used as a low-overhead performance monitoring thing.
If a COG long can be found/saved somewhere it might be possible yeah. But finding space is much harder now, as the COG is full.
We might even need multiple longs, one to hold the hub address of where to read the colour from, one for the read instruction, and if we are paranoid to protect the driver, one to clear its LSByte so it can't mess up our sync.
Oh and another question: Can I link a couple of short regions in a loop? (So I could control attributes and stuff per scanline from a tile/sprite rendering cog)
@Wuerfel_21 said:
Oh and another question: Can I link a couple of short regions in a loop? (So I could control attributes and stuff per scanline from a tile/sprite rendering cog)
Yeah I think that could be possible if it ends on the screen that way but it's more CPU intensive compared to wrapping. You can back link the final region in the list to a prior one using "linkRegion" in the API. That group of regions linked together should then loop on indefinitely until the end of the screen (though I've not tried it). There is of course no loop count exit to this however.
You can also create quite a long display list instead of a loop, but you'd need to replicate the region information if something is common to all of them and needs to be updated down the list, still doable.
Depending on the type of region mode transition and P2 clock speed etc, there can be some performance limitation between regions as you hit cycle limits. Changing regions doesn't come for free and eats into the pixel rendering cycle budget too. This can be worse if you are pixel doubling too. But you can experiment to find the limits. A high CPU clock to pixel clock ratio will assist there if you push it very hard.
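A minimal sketch of the back-linked loop described above, untested like the idea itself. The `initRegion` argument order is copied from the listings in this thread. The `linkRegion` argument order (region, then next region) is an assumption and should be checked against p2videodrv.spin2. `buildLoop`, `regionA`, `regionB`, `bufA` and `bufB` are hypothetical names.

```spin2
' Sketch only: linkRegion's argument order is an assumption, check the driver source.
OBJ
  video : "p2videodrv.spin2"

VAR
  long regionA[video.REGION_SIZE/4]
  long regionB[video.REGION_SIZE/4]

PUB buildLoop(bufA, bufB) : first
  ' regionB: one scan line tall, no successor yet
  video.initRegion(@regionB, video.RGB8, 1, 0, 0, 0, 0, bufB, 0)
  ' regionA: one scan line tall, followed by regionB
  first := video.initRegion(@regionA, video.RGB8, 1, 0, 0, 0, 0, bufA, @regionB)
  ' close the loop: regionB -> regionA (assumed order: region, next region)
  video.linkRegion(@regionB, @regionA)
```

The returned `first` would go where the listings pass the `initRegion` result into `initDisplay`. A render cog could then rewrite each region's attributes just ahead of the beam.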
By the way I think I just found the issue with non-multiples of 4 columns when doubling, and I'm now figuring out if/how to solve that. It might be somewhat tricky as it relates to long only sized transfers with setq2 in my transfer loop during the doubling process.
Well, I'm asking because it'd be very helpful for porting JET Engine to P2 (well, "porting"... It'll only be able to use the first 64k RAM. But that's good enough to get my big game to also run on P2). JET Engine has its own region system (except it calls them subscreens) and has to pass some info (gfx/text mode, horizontal scroll enable and bottom 4 bits of scroll value) about them to the output driver along with the rendered scanlines. Thus a wrapping region buffer is needed (the alternative would be to translate the JET subscreens into regions 1:1, but that seems more complex).
The resolution will only ever be 256/512x224, so I don't think it's going to run into speed issues. (Maybe I'll have it do text modes at 512x448?)
Pity border width can't be set per region, will have to make the buffers slightly wider than they need to be (OK because they need to be bigger than on P1, anyways, to work around the lack of a 4-colors-per-16-pixels mode (which is what JET text mode line buffers are)) and have the render cogs paint the mask on. I guess I might as well make them much wider so I don't need borders at all (and draw my own borders in the render cogs).
Comments
Disappointingly, something seems not right. I wrote some test code to display a simple bitmap over svideo and this happens:
Huh, seems to work if I assume the resolution to be 368x240. Looks like there's an off-by-one somewhere.
Non-doubled 720x240 is fine.
Unrelatedly, it also does this weird flagging thing on my TV card. Notice how the top quarter or so of the screen is slightly offset.
Additionally, the top left corner gets weird when DOUBLE_WIDE + border/skew is enabled (IMO skew shouldn't need to be set to account for the border, but maybe that saved an instruction somewhere, IDK)
Perhaps relatedly, I'm seeing an issue where the driver writes up to 32 bytes beyond the given work buffer.
I did not say it earlier, but the shimmer is phase error between the pixel clock and the color burst reference.
P1 did this too. But it was stable, and 4 bits in size. It never came up, unless one was mixing colors to get artifacts. The solution was to generate a software color burst, basically avoiding that circuit, using pixel clock only. Eric Ball did it, and my Nyan Cat demo puts it to use.
The P1 had its PLL, which kept it stable. The P2 NCO has a rolling error.
Chip suggested one xcont / frame. I had mixed, but better results. Next move was to use a colorburst crystal, or 13.5 MHz.
I have not been able to revisit this since.
Now having thought about it, we may benefit by not using the prop color circuit to do colorburst.
Do that with pixels, so high contrast artifacts are always aligned.
This moves the error to the color circuit. That may be far less visible on high contrast elements. Or, it may make colors themselves shimmer.
I am having editor trouble on mobile. Will fix on desktop later.
Seems a short post works, so...
This can all be seen in the standard NTSC demo with the bird. May be easier to hack on that to learn a better solution.
Finally, it may be a lot less of a problem using S-video. The chroma artifacts go away, meaning this shimmer should too.
Yes, the s-video is a much better looking image right now. I'd say it's not 100% perfect but probably quite usable at the lower resolutions. Composite is still a bit ugly, though I feel I did improve it a bit since the last time around. I think NTSC looked better than PAL (it seemed brighter and more vibrant, maybe oversaturated).
I think the color phasing also happens because you can't make the line frequency = color carrier / (N+0.5) when everything has to be quantized to the pixel clock and the pixel clock isn't a multiple of 2*color carrier (which is somewhat undesirable because it creates lots of artifacts).
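To put numbers on the N+0.5 condition, here is a minimal Python sketch. It assumes the standard NTSC subcarrier of 315/88 MHz and a 13.5 MHz pixel clock with 858 pixel clocks per line; the function names are made up for illustration and nothing here is taken from the driver.

```python
from fractions import Fraction

# NTSC colour subcarrier: exactly 315/88 MHz (about 3.579545 MHz).
NTSC_CARRIER_HZ = Fraction(315_000_000, 88)

def carrier_cycles_per_line(pixel_clock_hz, pixels_per_line, carrier_hz=NTSC_CARRIER_HZ):
    """Colour carrier cycles that fit in one line of pixels_per_line pixel clocks."""
    line_freq = Fraction(pixel_clock_hz) / pixels_per_line
    return carrier_hz / line_freq

def is_half_integer(x):
    """True when x equals N + 0.5 for some integer N."""
    doubled = 2 * x
    return doubled.denominator == 1 and doubled.numerator % 2 == 1

# 13.5 MHz with 858 clocks per line lands exactly on 227.5 cycles per line.
ideal = carrier_cycles_per_line(13_500_000, 858)
# One pixel clock fewer per line no longer satisfies the N + 0.5 relation.
off_by_one = carrier_cycles_per_line(13_500_000, 857)

print(ideal, is_half_integer(ideal))
print(float(off_by_one), is_half_integer(off_by_one))
```

With any other pixel clock, the line length in pixel clocks has to be chosen so this ratio comes out as a half-integer, which is not always possible.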
@Wuerfel_21 I found this seems to happen when the number of columns in the timing is not divisible by 4 and you also select double wide pixels. I don't know why this should be the case, but I can make it happen with DVI the same way (it's not limited to the TV mode). The streamer is meant to be able to stop on a single pixel boundary, so there should not be a limitation there, and I compute the number of pixels to generate by multiplying by 8 (vis_pixels and m_rf in the p2videodrv.spin2 file). It's somehow stretching out the line it generates, which would probably also explain your flagging issue, especially if that flagging does not happen when double wide is not set. In the case of 720 pixel wide scan lines, this should be 45 double wide (16 pixel) columns = 360 wide pixels when pixel doubled, but I see 46 on the screen (it is adding an extra column).
This would be why I've not seen this before: all the standard video modes I used have column counts that are divisible by 4, and that is what I did most of my testing with. Now after today we know that the driver doesn't (yet) handle the cases of columns not being divisible by 4 whenever pixel doubling is enabled.
640x480 (80)
800x600 (100)
1024x768 (128)
1280x1024 (160)
1920x1080 (240)
1920x1200 (240)
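The divisible-by-4 observation can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming 8 pixels per column as in the list above; the helper names are hypothetical and not from p2videodrv.spin2.

```python
def columns(active_pixels, pixels_per_column=8):
    """Column count for a line width, assuming 8 pixels per column."""
    return active_pixels // pixels_per_column

def doubling_safe(active_pixels):
    """True when the column count is a multiple of 4 (the cases seen to work)."""
    return columns(active_pixels) % 4 == 0

# The standard widths from the list, plus the 720 and 704 pixel TV timings.
for width in (640, 800, 1024, 1280, 1920, 720, 704):
    print(width, columns(width), doubling_safe(width))
```

Of these widths only 720 (90 columns) fails the check, while 704 (88 columns) passes.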
This mod to your code works (if you set your base pin correctly for your setup):
long (video.SYNC_NEG<<31) | ( 16<<24) | ( 64<<16) | ( (58+0*16)<<8 ) | (704/8)
has to be
long (video.SYNC_NEG<<31) | ( 16<<24) | ( 64<<16) | ( (58+1*16)<<8 ) | (704/8)
The former causes visual artifacts on the tomato monitor.
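For anyone comparing the two longs, this Python sketch packs them with the same shifts and unpacks them again. It deliberately does not name the fields, since the thread doesn't say what each byte means. The value 1 for SYNC_NEG is an assumption, as the real constant isn't shown here.

```python
SYNC_NEG = 1  # assumption: the real value of video.SYNC_NEG is not shown in the thread

def pack_timing(sync_flag, a, b, c, cols):
    """Pack five values into one 32-bit long using the shifts from the posted lines."""
    return (sync_flag << 31) | (a << 24) | (b << 16) | (c << 8) | cols

def unpack_timing(value):
    """Split a long into (bit 31, bits 24-30, bits 16-23, bits 8-15, bits 0-7)."""
    return (
        (value >> 31) & 1,
        (value >> 24) & 0x7F,
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    )

broken = pack_timing(SYNC_NEG, 16, 64, 58 + 0 * 16, 704 // 8)
fixed = pack_timing(SYNC_NEG, 16, 64, 58 + 1 * 16, 704 // 8)

print(unpack_timing(broken))
print(unpack_timing(fixed))
```

The only difference between the two is the value in bits 8-15, which goes from 58 to 74.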
Ok yes that makes sense. My Dell monitor TV input is too forgiving there and doesn't show it up as a problem. I'd been experimenting with the timing while switching between 720 and 704 pixels, and had patched that value from the original 1x16 before I pasted it here.
I still don't understand why the streamer is sending out a multiple of 32 pixels. In the case of 720 pixel timing it is being told to send 720, but seems to send 736. I need to dump my dynamic "m_rf" streamer command value into HUB RAM to be sure what is going on there in the driver. It's a bit bizarre as to what is happening in this odd double wide case. I understand the streamer to be able to stop on individual pixels, so if I'm telling it to do 720, it should do 720, not round to the nearest 32 somehow. It still must be some weird side effect in my code I don't yet see, which means it is hopefully still doable once I can figure it out.
I figured out why the flagging was happening: The DIP switches on the AV board are labeled in reverse, so I erroneously left the luma pin in AC coupling mode.
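The numbers reported so far are at least consistent with the line length being rounded up to a multiple of 32 pixels. The Python sketch below is just that arithmetic; it says nothing about where in the driver the rounding comes from, and round_up is a made-up helper.

```python
def round_up(value, multiple):
    """Round value up to the next multiple (value itself if already aligned)."""
    return -(-value // multiple) * multiple

requested = 720
observed = round_up(requested, 32)

# 16 extra pixels is exactly one extra double wide (16 pixel) column: 46 instead of 45.
extra_columns = (observed - requested) // 16

print(observed, extra_columns)
print(round_up(704, 32))  # 704 is already a multiple of 32, so it is unchanged
```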
Also, would it be possible to make the border color reload every line? That way it could be used as a low-overhead performance monitoring thing.
Ok, great that you found out why.
If a COG long can be found/saved somewhere it might be possible yeah. But finding space is much harder now, as the COG is full.
We might even need multiple longs, one to hold the hub address of where to read the colour from, one for the read instruction, and if we are paranoid to protect the driver, one to clear its LSByte so it can't mess up our sync.
Oh and another question: Can I link a couple of short regions in a loop? (So I could control attributes and stuff per scanline from a tile/sprite rendering cog)
Yeah I think that could be possible if it ends on the screen that way but it's more CPU intensive compared to wrapping. You can back link the final region in the list to a prior one using "linkRegion" in the API. That group of regions linked together should then loop on indefinitely until the end of the screen (though I've not tried it). There is of course no loop count exit to this however.
You can also create quite a long display list instead of a loop, but you'd need to replicate the region information if something is common to all of them and needs to be updated down the list, still doable.
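As a rough mental model of the back-linked loop, the Python sketch below walks a list of regions where the last one links back to an earlier one. The dictionary layout and names are entirely hypothetical. It is not the driver's real region structure, and linkRegion itself is not called here.

```python
def walk_regions(regions, start, screen_lines):
    """Return the region name shown on each scan line.

    regions maps a name to (height_in_lines, next_name). This is a
    hypothetical model only, not the driver's region format.
    """
    shown = []
    current = start
    while len(shown) < screen_lines and current is not None:
        height, next_name = regions[current]
        if height < 1:
            raise ValueError("region heights must be at least 1 line")
        take = min(height, screen_lines - len(shown))
        shown.extend([current] * take)
        current = next_name
    return shown

regions = {
    "status": (2, "a"),
    "a": (1, "b"),
    "b": (1, "a"),  # back link: the last region points at a prior one
}

lines = walk_regions(regions, "status", 8)
print(lines)
```

In this model the only exit is running out of scan lines, which matches the point that there is no loop count.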
Depending on the type of region mode transition and P2 clock speed etc, there can be some performance limitation between regions as you hit cycle limits. Changing regions doesn't come for free and eats into the pixel rendering cycle budget too. This can be worse if you are pixel doubling too. But you can experiment to find the limits. A high CPU clock to pixel clock ratio will assist there if you push it very hard.
By the way I think I just found the issue with non-multiples of 4 columns when doubling, and I'm now figuring out if/how to solve that. It might be somewhat tricky as it relates to long only sized transfers with setq2 in my transfer loop during the doubling process.
Well, I'm asking because it'd be very helpful for porting JET Engine to P2 (well, "porting"... It'll only be able to use the first 64k RAM. But that's good enough to get my big game to also run on P2). JET Engine has its own region system (except it calls them subscreens) and has to pass some info (gfx/text mode, horizontal scroll enable and bottom 4 bits of scroll value) about them to the output driver along with the rendered scanlines. Thus a wrapping region buffer is needed (the alternative would be to translate the JET subscreens into regions 1:1, but that seems more complex).
The resolution will only ever be 256/512x224, so I don't think it's going to run into speed issues. (Maybe I'll have it do text modes at 512x448?)
Pity border width can't be set per region. I will have to make the buffers slightly wider than they need to be and have the render cogs paint the mask on. That's OK because they need to be bigger than on P1 anyway, to work around the lack of a 4-colors-per-16-pixels mode (which is what JET text mode line buffers are). I guess I might as well make them much wider so I don't need borders at all (and draw my own borders in the render cogs).