The graphic vga driver - now beta
pik33
Post #26 contains the newest Verilog and Spin/PASM code. The Propeller now runs @ 115.625 MHz and the basic graphics procedures are written in PASM, so the driver is now fast.
The hardware part of the vga/sram driver has reached beta stage, as all planned functions are now implemented. There is a display list, a hardware fine scroll register and 6 hardware sprites (32x32, 256 colors) with x/y zooming and a collision register. The sprites are completely independent from the rest of the graphics - they are 256 colors even in the 1792x1120x4 color mode.
There are still some timing problems with these sprites, and this limits their number to 6; adding a 7th sprite causes artifacts on the screen.
Use 77 MHz for the Propeller. This gives the proper 154 MHz pixel clock.
The actual driver code is attached to post #19.
Original post:
This is the first attempt to make a graphics vga driver for the DE2-115 p1v with an sram framebuffer. This time the Propeller sees the sram as 32-bit memory. The 16-to-32 bit translation is done in Verilog.
The graphics procedures are written in Spin, so they are slow. Only cls and putpixel are done in PASM; converting the rest is TODO.
The line and circle procedures are translated to Spin from (very) old Basic sources - I hope they are in the public domain. They will be converted to PASM as soon as possible.
There are 2 graphics modes possible, selected by register $80002. Register $80000 contains the display start address. $80001 is the line counter and h/vblank indicator. $8000F is the border color, $80010..$8001F are the 16 palette registers.
The SRAM is now 512kx32 from $0 to $7FFFF.
The 8x16 font definition starts at $180000, so you can redefine the font by poking to these addresses, 16 bytes for 1 character.
There are 2 cycles left - maybe it will be possible to extend the color space to 8/64 colors.
The attached .zip and .7z (should be the same content) contain the Quartus project file and the Spin driver. The p1v is programmable via a Prop Plug.
Comments
You might have some reasonable chance if you limit the external/shared RAM accesses to a single COG and lock the video reads to some multiple of the hub rate. Otherwise it likely gets trickier as you mention with multiport controllers and/or extra fifos needed etc.
On the DE-0 nano, I'm planning to sync the external SDRAM to the hub cycle and give 2 reads for the video and one access to a single COG per hub cycle. This (just) fits in theory and should be able to yield 1280x800x16bpp output capabilities at about 72MHz prop rate while a single COG can run LMM from external SDRAM seen as expanded hub address memory >64kB. That's the plan anyway. It's slowly coming together I hope. I found some sample SDRAM controller code for Altera which will need some extensive customization for this.
How's the DDR3 coming along?
I managed to run this @ 1920x1200 today with pixel clock = 154 MHz. The SRAM access is then @ 9.625 MHz.
Now that I have 32-bit RAM I can really start thinking about experimenting with executing PASM code directly from inb. It needs adding a new program counter, using it to address the RAM and then stopping the PC @ inb.
Your posts and comments are very interesting and helpful. I have compiled this, but I haven't had a chance to run it... I'm in the wrong place.
The compile time was longer for this code AND I now see that you are talking about higher clock speeds... so it looks like you have been fiddling with Quartus settings:)
It does seem odd that in order to get better RAM access, we have to increase the pixel clock... I think I understand it, because you have explained it very well. But it still seems curious. So, I thought I would ask if what is missing is a hold circuit on the DACs?
I was also wondering if you were planning to replace the pin0-1 usage with USR type instructions or would this mess up the timing?
And FINALLY, I see that you have a debug register in VGA.v, but I don't see it hooked up to anything...oops... there it is in the block diagram hooked up to LEDs:)
Thanks,
Rich
After a long fight with timings I managed to compile a working 1920x1200 driver. The compilation is not "stable" - not every compile of the same circuit works. It still allows one 32-bit Propeller access every 16 vga clocks, so it can be faster.
The debug register was needed to debug the ram state machine, which used to hang.
The result is a 1920x1200 vga screen with a 1792x1120 active display @ 4 colors and half of this (896x560) @ 16 colors.
Now I have a stable 1920x1200 vga driver; it works after every Quartus compilation. I will publish it here after some code cleaning.
The frame buffer has up to 512 kB now... Still 1.5 MB SRAM free.
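As a rough check of the frame buffer figures, here is a minimal sketch that computes the buffer size for the two modes mentioned above. It assumes pixels are tightly packed with no padding at the end of a line; the function name is only illustrative and is not part of the driver.

```python
def framebuffer_bytes(width, height, colors):
    """Bytes needed for a tightly packed framebuffer (assumption: no line padding)."""
    bits_per_pixel = (colors - 1).bit_length()  # 4 colors -> 2 bpp, 16 colors -> 4 bpp
    return width * height * bits_per_pixel // 8


hi_res = framebuffer_bytes(1792, 1120, 4)   # 4-color mode
lo_res = framebuffer_bytes(896, 560, 16)    # 16-color mode

print(hi_res, hi_res / 1024)  # 501760 bytes = 490.0 KiB
print(lo_res, lo_res / 1024)  # 250880 bytes = 245.0 KiB
```

Under that assumption the 4-color mode fits just below the 512 kB limit, and the 16-color mode needs about half of that.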
Interesting find - MUL works in PropTool. That probably means we can use ONES and ENC too to perform our other added instructions. Currently I had been using LONG for my new instructions.
Changed memory map (again)
80000..800FF - palette
81000 - font def
82000 - sprites (still unused)
83000 - internal regs
Changed the propeller. I needed two inputs for h/vblank - bidirs are good with GPIO but they are a PITA when connecting something inside the FPGA, so I had to isolate p[0..1], making ina[0..1] available as inputs.
- display list. The DL control register is $83006. Bit 31 - DL on, the rest is the DL start address
To run a display list you have to create a structure of 1200 longs, one long for each display line, place it in RAM and then set $83006 to point to this structure. Setting bit 31 of $83006 will then enable the display list.
DL entry structure:
Bit 31 - entry is on
Bit 30 - generate dl interrupt (for future use)
Bits 29..22 - palette index for the line
Bits 21..18 - graphics mode for the line
Bits 17..0 - display start address for the line. It will be shifted left by 2 (shl 2)
If you don't want display list, simply poke $83006,0
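The entry layout above can be sketched on the host side as plain bit packing. This is only an illustration of the bit fields as described; the function names, the mode number and the per-line stride used in the example are hypothetical and are not taken from the driver.

```python
DL_LINES = 1200  # one long per display line, as described above


def pack_dl_entry(start_addr, mode, palette, dli=False, valid=True):
    """Pack one display list long: [31] on, [30] dli, [29..22] palette, [21..18] mode, [17..0] address."""
    if not 0 <= start_addr < (1 << 18):
        raise ValueError("start address must fit in 18 bits")
    if not 0 <= mode < (1 << 4):
        raise ValueError("mode must fit in 4 bits")
    if not 0 <= palette < (1 << 8):
        raise ValueError("palette index must fit in 8 bits")
    return (int(valid) << 31) | (int(dli) << 30) | (palette << 22) | (mode << 18) | start_addr


def dl_control(dl_addr, enable=True):
    """Value for the DL control register: bit 31 = DL on, the rest = DL start address."""
    if not 0 <= dl_addr < (1 << 31):
        raise ValueError("DL address must fit in 31 bits")
    return (int(enable) << 31) | dl_addr


# Hypothetical example: every line uses mode 0 and palette 0,
# with an assumed stride of 112 address units per line.
STRIDE = 112
display_list = [pack_dl_entry(line * STRIDE, mode=0, palette=0) for line in range(DL_LINES)]

print(hex(pack_dl_entry(1, 2, 3)))  # 0x80c80001
print(hex(dl_control(0x1000)))      # 0x80001000
```

The resulting longs would then be written to RAM and the control value poked to $83006; writing 0 there switches the display list off again.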
- horizontal fine scroll register $83007
When non-zero, the whole display will be scrolled left by the register content. The resolution of scrolling is 1 hi-res pixel.
Attached is the Verilog code for vga only - replace the existing vga10 with the new vga11 in the old project.
Still TODO: sprites and line length register
Edit: self-starting demo .sof and .pof files added
Registers are available @ $83000
// 0 - frame counter
// 1 - display start address
// 2 - mode
// 3 - border color
// 4 - palette bank
// 5 - unused
// 6 - display list start address; bit 31: dl active
// 7 - horizontal scroll
// 8 - line length
// 9..14 - sprites
// [31..30] y zoom
// [29..28] palette bank
// [27..16] y position
// [15..14] x zoom
// [13..0] x position
// 15 - collision register
// display list
// [31] - entry valid
// [30] - generate dli
// [29..22] palette index
// [21..18] mode
// [17..0] line start address;
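The sprite register layout listed above (registers 9..14) can be illustrated with a small packing sketch. It assumes register n lives at $83000 + n, which matches the DL register at $83006 and the scroll register at $83007; the function names are illustrative only.

```python
REG_BASE = 0x83000
FIRST_SPRITE_REG = 9
NUM_SPRITES = 6


def sprite_reg_addr(index):
    """Address of the control register for sprite 0..5 (assumes register n is at REG_BASE + n)."""
    if not 0 <= index < NUM_SPRITES:
        raise ValueError("sprite index must be 0..5")
    return REG_BASE + FIRST_SPRITE_REG + index


def pack_sprite(x, y, x_zoom=0, y_zoom=0, palette_bank=0):
    """[31..30] y zoom, [29..28] palette bank, [27..16] y, [15..14] x zoom, [13..0] x."""
    if not (0 <= x < (1 << 14) and 0 <= y < (1 << 12)):
        raise ValueError("x must fit in 14 bits and y in 12 bits")
    if not all(0 <= v < 4 for v in (x_zoom, y_zoom, palette_bank)):
        raise ValueError("zoom and palette bank fields are 2 bits each")
    return (y_zoom << 30) | (palette_bank << 28) | (y << 16) | (x_zoom << 14) | x


def unpack_sprite(value):
    """Inverse of pack_sprite; returns (x, y, x_zoom, y_zoom, palette_bank)."""
    return (value & 0x3FFF, (value >> 16) & 0xFFF, (value >> 14) & 3,
            (value >> 30) & 3, (value >> 28) & 3)


print(hex(sprite_reg_addr(0)), hex(pack_sprite(100, 200)))  # 0x83009 0xc80064
```

How the zoom values map to actual magnification is not stated here, so the sketch treats them as opaque 2-bit fields.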
Reminds me a bit of the Atari 400/800 display lists.
It can be used, for example, in a wav/sid player, where a high resolution graphics mode allows long file names and song info to be displayed, and then a 256 color mode can be used on the same screen for visualization effects, like spectral bars or an oscilloscope.
Now there is the Propeller side to do: fast graphics procedures in PASM. Spin is too slow for graphics with 0.5 MB of framebuffer.
That is graphics, sounds & sprite engine in verilog and seems to be very small and efficient.
It has 32 k RAM - no real framebuffer. My driver now has a real framebuffer of up to 512 k, which allows hi-res modes. I have no rotating sprites and only 6 of them, but they are 32x32 pixels.
This thing's code is GPLed, so I have to look at it - how did they manage to add 96 sprites? Maybe the key is the speed: they use 800x600 with a 40 MHz pixel clock, while I use 1920x1200, where the pixel clock is 154 MHz. The FPGA can't select the 7th sprite in time in my code as it is now.
This Propeller runs @ 115_625_000 Hz. The Propeller clock is 3/4 of the pixel clock, so 1 SRAM cycle is now 3 Propeller instructions.
Pins 0..7 of Port A were split into ina/outa[7..0] to allow using them as inputs inside the FPGA without messing with TRI buffers. I needed them for simulated IRQs (vblank, hblank, DL interrupts)
I will publish the source project later. (Edit: added) Now the .zip I added contains the compiled .sof, .pof and a demo Spin code. This is still beta. After a monitor poll I decided to add 1920x1080 derived modes, but this is not done yet.
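The "3 Propeller instructions per SRAM cycle" figure can be checked with a little arithmetic. The sketch below assumes one SRAM access slot every 16 pixel clocks (as mentioned earlier in the thread) and 4 clocks for a typical Propeller instruction; the variable names are only illustrative.

```python
from fractions import Fraction

PROP_HZ = 115_625_000              # Propeller clock quoted above
PROP_TO_PIXEL = Fraction(3, 4)     # Propeller clock is 3/4 of the pixel clock
PIXEL_CLOCKS_PER_SRAM_CYCLE = 16   # assumption: one Propeller access slot per 16 vga clocks
PROP_CLOCKS_PER_INSTRUCTION = 4    # assumption: typical 4-clock PASM instruction

pixel_hz = PROP_HZ / PROP_TO_PIXEL
prop_clocks_per_sram_cycle = PIXEL_CLOCKS_PER_SRAM_CYCLE * PROP_TO_PIXEL
instructions_per_sram_cycle = prop_clocks_per_sram_cycle / PROP_CLOCKS_PER_INSTRUCTION

print(float(pixel_hz))              # pixel clock in Hz implied by the 3/4 ratio
print(prop_clocks_per_sram_cycle)   # 12
print(instructions_per_sram_cycle)  # 3
```

With these numbers the implied pixel clock comes out at roughly 154.17 MHz, slightly above the nominal 154 MHz, so one of the quoted figures is presumably rounded.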
KEY[0] resets the Propeller
The putchar and line procedures are now written in PASM, so the driver is fast.
It is time to add some sound to it and then... write a Basic interpreter - let this Basic demo screen become a reality. The "BASIC bytes free" figure is close to the real value (2 MB minus the framebuffer length, which is about 240 kB in this mode).
I think LVDS with 50 Ohm termination (with 3 lvds pairs) would do the job for HDMI without external components.
Then I have no experience with these interfaces.
Then the Cyclone 4E has no transmitters and is simply too slow to transmit more than a full hd bitbanged signal via HDMI.
So as it is now, let's stay with good old VGA. Today all or nearly all TVs and monitors still have a VGA input.
Of course, with a SoCKit board with hdmi output... this should be possible, but I don't have such a board.
PS: sound, SD card and kbd/mouse now added... SDRAM is next on the queue.
If I manage to get stable compilation with all these things, I will start a new topic - let's retrocompute with p1v and de2-115
Are you getting warnings when compiling the verilog? Just wondering since my DE0 compilations of the original P1V give ~40 warnings IIRC.
But some are timing errors which I know are also present in the original code. I am not sure how to fix them, or what impact they may have on the long term reliability of the P1V code. I think some are also due to the DE0 target. It bothers me when I don't understand them.