The "speed" of the driver is determined by the waitvid loop and the frame size. A 2 color driver can do 32 pixels in a frame, or tile. That's a lot of pixels! Higher pixel counts are associated with higher sweep frequencies. PASM instructions have a fixed execution time. As the sweep frequency goes up, the number of PASM instructions that fit in each frame goes down, which in turn places the emphasis on the longer frame sizes. 2 color = 32 pixels, 4 color = 16 pixels. Full color = 4 pixels!
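To put rough numbers on that trade-off, here is a minimal sketch of the instruction budget per frame. It assumes an 80 MHz system clock, 4 clocks per PASM instruction (hub accesses take longer), and the standard 640x480 VGA pixel clock of 25.175 MHz. The function name is made up for illustration.

```python
def instructions_per_frame(frame_pixels, pixel_clock_hz,
                           sys_clock_hz=80_000_000, clocks_per_instr=4):
    """Whole PASM instructions that fit while one waitvid frame is shifted out."""
    return (frame_pixels * sys_clock_hz) // (pixel_clock_hz * clocks_per_instr)

# Assumption: standard 640x480 @ 60 Hz VGA timing.
VGA_PIXEL_CLOCK_HZ = 25_175_000

# Frame sizes from the post: 2 color = 32, 4 color = 16, full color = 4 pixels.
budget = {pixels: instructions_per_frame(pixels, VGA_PIXEL_CLOCK_HZ)
          for pixels in (32, 16, 4)}
print(budget)
```

Under these assumptions a 32 pixel frame leaves about 25 instruction slots and a 4 pixel frame only about 3, and the waitvid itself uses one of them.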
As the frame size decreases, the number of waitvids possible per scan line goes down as well. This is why the TV driver could easily do 320 pixels, while the VGA one required careful programming to get the same. I think Baggers said two cogs.
Additionally, there is the task of getting the pixels to the waitvid. If the COG doing the video signal is going to do it, the scheme must be simple, like fetch, add, fetch, add, waitvid, etc... The more things that have to be done to get the pixels on the screen, the longer the waitvid loop is, and the lower the resolution and/or the color depth is.
If this is to happen at all, the sweep frequency needs to be as low as possible, and it's likely multiple cogs will be needed. I would be surprised to see 640 pixel base resolution doing this, if it's at all possible.
So the tiles are another story. Tiles are basically waitvid frames. That's the easiest way to build them on the Propeller. Waitvids get one long for color data, which can be four unique colors, and they get one long for pixel data, which can be up to 32 pixels. Bits per pixel can be 1, 2, 8. That's it! So, on an 8x8 text driver, 2 color mode is chosen, then a frame size of 8 pixels is chosen, resulting in one waitvid per text character, where 80 characters are difficult to do in one cog at 80 MHz.
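As a rough model of what one waitvid does with its two longs, here is a small Python sketch. It assumes pixels are shifted out starting from the least significant bit and that each pixel value selects one byte of the color long. It only models the 1 and 2 bit per pixel modes, and all names and color values are illustrative.

```python
def waitvid_frame(colors, pixels, four_color, frame_pixels):
    """Return the color byte put out for each pixel of one waitvid frame."""
    bits = 2 if four_color else 1      # bits per pixel in the pixel long
    mask = (1 << bits) - 1
    out = []
    for _ in range(frame_pixels):
        slot = pixels & mask           # which byte of the color long to use
        out.append((colors >> (8 * slot)) & 0xFF)
        pixels >>= bits                # next pixel, LSB first
    return out

# One row of an 8x8 text character: 2 color mode, 8 pixel frame.
BACKGROUND, FOREGROUND = 0x02, 0xFC    # hypothetical color bytes
text_colors = (FOREGROUND << 8) | BACKGROUND
font_row = 0b00111100
row = waitvid_frame(text_colors, font_row, four_color=False, frame_pixels=8)
print([hex(c) for c in row])
```

One call corresponds to one waitvid, so an 80 column line of 8 pixel characters needs 80 of these per scan line.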
Full color requires a 4 pixel frame, so that each pixel can have its own one of the four colors.
So, if we are going to mix COGs, we still need to abide by the four pixel frame, because when we do more than that, there are not enough color slots to color all the pixels any color desired. If there were 5 pixels, for example, two of them would have to be the same color no matter what.
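The four pixel limit can be sketched as follows: the pixel long stays fixed so that pixel n always selects color slot n, and the actual image data travels in the color long. This is a sketch assuming 2 bits per pixel and four byte-wide color slots. The helper name is hypothetical.

```python
# Pixel n selects color slot n: slots 0, 1, 2, 3 read from the low bits upward.
FULL_COLOR_PIXELS = 0b11_10_01_00

def full_color_frame(pixel_colors):
    """Pack four independent color bytes into (color long, pixel long)."""
    if len(pixel_colors) != 4:
        # Only four color slots exist, so a fifth pixel would have to reuse one.
        raise ValueError("a full color frame holds exactly 4 pixels")
    colors = 0
    for slot, color in enumerate(pixel_colors):
        colors |= (color & 0xFF) << (8 * slot)
    return colors, FULL_COLOR_PIXELS

colors_long, pixels_long = full_color_frame([0x11, 0x22, 0x33, 0x44])
print(hex(colors_long), hex(pixels_long))
```

Asking for five pixels raises an error here, which mirrors the point above: with only four slots, two of the five pixels would be forced to share a color.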
The code that Baggers wrote is probably the limit @ 320 pixels, unless the cogs are synced, and take turns generating video scan lines. Basically the cog would have one long delay, fetching data, then blast it out with back to back waitvids, only to wait again, with several of the cogs doing it one after the other...
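The take-turns idea can be sketched as a simple round-robin schedule. This assumes the cogs are already synced and that each cog outputs every Nth scan line; the function names and the 32 microsecond line period are illustrative, not taken from any existing driver.

```python
def line_schedule(total_lines, cog_count):
    """Which cog outputs each scan line when the cogs take turns."""
    return [line % cog_count for line in range(total_lines)]

def prep_time_us(line_period_us, cog_count):
    """Time a cog has to fetch data between two of its own scan lines."""
    return line_period_us * (cog_count - 1)

schedule = line_schedule(6, 3)
fetch_window = prep_time_us(32.0, 3)   # assumed line period of about 32 us
print(schedule, fetch_window)
```

With three cogs, each one gets two full line periods to fetch data before it has to blast out its own line with back to back waitvids.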
Thanks ++ for that explanation. It makes a lot more sense now.
Leaving aside the code to get the data for the moment, ok, ballpark, if you have two color and 32 bits per pixel and it can do a 1280 line (which the above driver can), then does that imply that waitvid can handle 4 color, 16 bits per pixel and do a 640 line? Half the speed and half the number of pixels, right?
Dr_Acula, two colours is not 32 bits per pixel, nor is 4 colour 16 bits per pixel.
2 colours means the 32 bits (the max amount in a frame, i.e. one waitvid) can be used for 32 independent pixel colours, as each pixel is one bit (for 2 colours).
4 colours means the 32 bits (the max amount in a frame, i.e. one waitvid) can be used for 16 independent pixel colours, as each pixel is two bits (for 4 colours).
256 colours means the 32 bits (the max amount in a frame, i.e. one waitvid) can be used for 4 independent pixel colours, as each pixel is eight bits (for 256 colours).
Hope this clears it up for you a bit.
So although the cog can do 1280, it's only because it does 32 pixels per frame (2 colour). It takes the same amount of time to do 16 pixels at 4 colour, thus lowering the resolution to 640, and again for 256 colour, down to 320 pixel resolution.
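The scaling can be checked with a little arithmetic. This sketch takes the 1280 pixel, 32 pixels per frame figure at face value and assumes the number of waitvids per scan line is the only limit; the function names are made up.

```python
def pixels_per_line(waitvids_per_line, frame_pixels):
    """Pixels on a scan line for a given waitvid count and frame size."""
    return waitvids_per_line * frame_pixels

def waitvids_needed(line_pixels, frame_pixels):
    """Waitvids per scan line needed to reach a given pixel count."""
    return line_pixels // frame_pixels

base_waitvids = waitvids_needed(1280, 32)   # the 2 colour, 1280 pixel case
widths = {frame: pixels_per_line(base_waitvids, frame) for frame in (32, 16, 4)}
full_colour_320 = waitvids_needed(320, 4)
print(base_waitvids, widths, full_colour_320)
```

Under that assumption the same 40 waitvids per line give 640 pixels at 4 colour but only 160 at a 4 pixel full colour frame. Reaching 320 would take 80 waitvids per line, so either a tighter loop or a second cog, which fits the earlier remark about two cogs.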
As the frame size decreases, the number of waitvids possible per scan line goes down as well. This is why the TV driver could easily do 320 pixels, while the VGA one required careful programming to get the same.
Additionally, the higher the line frequency (frame rate × number of lines per frame), the less time is available for each waitvid. TV is nice because it has a relatively low line frequency of 15.73 kHz, while VGA line frequencies start at 31.47 kHz, or double the frequency.
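The line frequency figures can be reproduced with a short calculation. This sketch assumes NTSC timing of 525 lines per frame at about 29.97 frames per second, standard 640x480 VGA timing of 525 total lines at about 59.94 frames per second (roughly 31.47 kHz), and an 80 MHz Propeller at 4 clocks per instruction.

```python
def line_frequency_hz(frame_rate_hz, lines_per_frame):
    """Line frequency is the frame rate multiplied by the total lines per frame."""
    return frame_rate_hz * lines_per_frame

def instructions_per_line(line_freq_hz, sys_clock_hz=80_000_000, clocks_per_instr=4):
    """Whole PASM instructions that fit in one scan line, blanking included."""
    return int(sys_clock_hz / clocks_per_instr / line_freq_hz)

ntsc_hz = line_frequency_hz(30000 / 1001, 525)   # about 15.73 kHz
vga_hz = line_frequency_hz(59.94, 525)           # about 31.47 kHz

print(round(ntsc_hz), instructions_per_line(ntsc_hz))
print(round(vga_hz), instructions_per_line(vga_hz))
```

So a TV line leaves room for roughly 1271 instructions, while a VGA line leaves about half that.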
That is an interesting teaser? I see Morpheus has a whole lot of video options. I think you pushed the video to 256x192 a couple of years ago - which is similar to the tile driver 256x240. Have you pushed it further with full color? (The Morpheus site is a treasure trove of info. I may be away for some time!)
I have an experimental 320x240 bitmap driver... it supports:
- 256 colors per pixel
- eight 16x16 sprites, with transparency and 255 colors per pixel
- horizontal split screen, each half independently fine scrollable
- support for (but not implemented yet) vertical split screen
Thanks for the nice comments... Morpheus is the result of over two years of work - and the drivers were an extreme pain in the posterior to get running. Sales have been slower than I expected, but I hope they will pick up this year.
I tried a few hacks on Kye's driver - it says 4 color but I'm not sure what that means. There are octuplets with values of 0,1,2,3 and I tried changing one to a different value but it seems to be 2 color, not 4. Maybe I changed the wrong value. Is there a speed difference between drivers that do 4 colors per tile vs 2? (actually, just to confirm, is there such a thing as a 4 color per tile driver?)
If that works, I'd like to try to synch three cogs, one for red, one for green and one for blue.
I think they are two bits per pixel, i.e. 4 colors per group of 16 pixels, but I have not had time to try his driver yet.
VRAM looks interesting. Lots of RAM around at 50-70ns, but once you go below that the price goes up, and it seems a project in TTL might end up costing more than an entire apad computer ($115).
I'll check out the atom.
VRAM is interesting, the problem is no one is making it any more.
The Atom, single core version, can easily be had for less than $60... I can get dual core Atom 330 mini itx board for $70Cdn. Add a power supply, ddr2 dimm, and a usb flash drive, stir with Knoppix -> instant 24 bit graphics Linux box ... I use them for small Linux servers.
Meanwhile, work continues porting the 1280 driver into C. I'm spending most of the time on the IDE rather than on the C. Sometimes the old school programming techniques work the best. Deleting and adding lines in a richtext box is slow with more than 100 lines. Write the richtextbox to a binary file, read it back using old fashioned mbasic type commands (which are still there in vb.net), edit, then write back as a binary file and then read back into a richtext box, and this has given about a 200 fold increase in speed.
Hopefully I'll have a demo in the next week swapping between several VGA modes from within code.
Thanks baggers. I think I get it now - always 32 bits and you can divide that up in various ways. Still, my comment about halving the screen ought to be true. I think I need to find an example of a 4 color driver - which I think Kye's code will do. 4 colors, 640 pixels per line, just drive the red gun and see if it will work in one cog.
If vram is not being made then not much point using out of date technology. I have similar feelings about using old ISA vga boards even though I have several sitting in a box in the shed.
Can Morpheus run Catalina in external memory? If so, I'd be hoping that all the C code I'm writing could be run on Morpheus as well.
The 'big picture' project is to have a black screen with a number of little 256 color icons, similar to an iphone, and you click one, and that runs other parts of the code which might, for instance, load a new vga driver to do text only.
With fast compilation (working now) and lots of external memory, I'm hoping one could include all sorts of cog drivers in a skeleton program that can then be added to as required.
Catalina runs fine on the extended memory of Morpheus, however the current XLMM kernel RossH wrote takes over the whole memory interface, and would need to be modified to share the bus with the video drivers.
This should (famous last words) be fairly easy to do; my GPU cog shares the bus with display refresh already, but there would be a slowdown.
How bad would the slowdown be?
It would depend on the bandwidth used by video display refresh; a fair guess would be a 50% slowdown in XLMM program speed at moderate resolutions.
however the current XLMM kernel RossH wrote takes over the whole memory interface, and would need to be modified to share the bus with the video drivers.
Yes, there is the same issue with the dracblade. I was going to bring this up with Ross when he comes back from leave.
I have a vague idea about an instruction you can send the kernel so that the C kernel pauses and goes into a loop, looking for a long/byte/bit in hub ram to change. Then another cog can take over, do its access etc, then change that flag at the end, and then C restarts.
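That handshake might look roughly like this. It is only a simulation: Python threads stand in for cogs and two `threading.Event` objects stand in for the flag in hub ram. None of the names correspond to anything in Catalina or the XLMM kernel.

```python
import threading

def run_handshake():
    """Simulate the C kernel lending the memory bus to another cog and resuming."""
    log = []
    bus_released = threading.Event()   # kernel tells the other cog the bus is free
    bus_returned = threading.Event()   # the hub flag the paused kernel watches

    def c_kernel():
        log.append("kernel: running C code")
        log.append("kernel: pause and release bus")
        bus_released.set()
        bus_returned.wait()            # stands in for looping on the hub flag
        log.append("kernel: resume C code")

    def other_cog():
        bus_released.wait()
        log.append("other cog: external memory access")
        bus_returned.set()             # change the flag so the kernel restarts

    threads = [threading.Thread(target=c_kernel),
               threading.Thread(target=other_cog)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return log

for entry in run_handshake():
    print(entry)
```

The ordering is enforced by the two flags alone, so the kernel never touches the bus while the other cog is using it.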
This might be easier than, say, customising every Catalina version.
I think only high level code can be shared; the low level stuff is too unique to each platform; but most high level code written in Catalina should easily be portable.
Sapieha reminded me to mention that CPU#1 on Morpheus also has two 8 pin SPI memory sockets, by default populated with 1MB of FLASH (W25X80) and 32KB of RAM (23K256) ... which could run rather large C programs on CPU#1, leaving CPU#2 for graphics.
Sounds interesting!
Great to hear that we can share code.