Successfully implemented smooth horizontal scrolling w/ wrapping!
escher
Posts: 138
I know video is pretty beaten to death on the Propeller, but I'm feeling really satisfied by what I've been able to accomplish with the P8X, basically from scratch:
My system is a dual-Propeller setup: a CPU and a GPU. I'm using the classic tile-sprite paradigm for my graphics. The CPU holds the game code as well as the tile map, color palettes, sprite attribute table, and other variables, which are sent to the GPU via a high-speed serial link (thanks Marko Lukat!). The GPU uses one cog to receive the data, six cogs to render (each one handling every sixth scanline), and one cog to fetch the rendered data from Hub RAM and display it in full 8-bit color at 640x480, upscaled from 320x240.
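As a rough illustration of how six render cogs might share the 240 source scanlines, here is a minimal Python sketch. The round-robin mapping (scanline modulo 6) and the names `render_cog_for` and `lines_for_cog` are assumptions made for illustration; the post only says that each cog renders one of every six scanlines.

```python
NUM_RENDER_COGS = 6   # from the post: six cogs do the rendering
VISIBLE_LINES = 240   # from the post: 320x240 internal resolution


def render_cog_for(scanline):
    """Assumed round-robin split: cog k renders lines k, k+6, k+12, ..."""
    return scanline % NUM_RENDER_COGS


def lines_for_cog(cog):
    """All source scanlines a given render cog is responsible for."""
    return list(range(cog, VISIBLE_LINES, NUM_RENDER_COGS))
```

Under this assumed split each cog renders 40 lines per frame and has roughly six scanline periods to finish each one before the display cog needs it.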
Though the render and display code for the graphics driver is all 100% from scratch, it draws on a lot of inspiration and guidance from this community for which I am extremely grateful.
With smooth scrolling enabled, I can now make side-scrolling games, which is a huge milestone for my project. And with wrapping, I can eventually implement real-time loading of levels outside the visible area from external RAM.
Next I am implementing vertical scrolling, as well as some sprite sizing.
Comments
How many sprites (per line) do you get with your driver? 6 cogs can go a long way.
I recall implementing horizontal scrolling in my "JET Engine" driver was a headache. (The rendering only scrolls in 16-pixels steps, the rest is handled in the output cog by messing with VSCL). Vertical scrolling was significantly easier.
So when are you going to do a retro game machine for the P2? (without all the wires and breadboard)
Before scrolling support I was hitting 16 full sprites per scanline (spl), but with it I'm down to ~14. I'm considering splitting up my feature set into the classic "modes", where e.g. you can have Mode 1 with no scrolling and 16 spl, horizontal or vertical scrolling and 12 spl, bidirectional scrolling and 8 spl, etc.
I am absolutely certain however that I am not maximizing my efficiency with how I'm rendering my tiles and sprites, and potentially a lot of performance can be recovered from using some more intuitive PASM. I was actually considering putting out a $100 bounty to whoever could get me 16 spl with horizontal scrolling haha. But for now I think I'll wait until all of my features are fleshed out before banging my head against that wall. For anyone interested though, you can see my WIP rendering code here.
And I empathize with you on the scrolling. At first I was trying to solve the problem from a memory standpoint, i.e. rendering everything normally and then simply "shifting" the memory the appropriate amount somehow to spoof the scroll. Due to the `long`-aligned nature of the P1's memory however, this wasn't doable. Ultimately I approached the problem from a more mathematical standpoint, during real-time rendering: using a variable representing the left-most boundary of the visible screen, and dividing and flooring and modulo'ing it with the bit-width of my tiles to get everything in the right place. The biggest realization for me was that, from a rendering standpoint, the only janky bit is the very first tile you render, because it's the only one where you might start rendering partway through. Every subsequent tile, you can start rendering at its normal 0 index. The bit of overhead it DOES introduce however is that you could transition into a new tile on any given pixel, so performing a check for that condition and calling a routine to load the next tile does tax some cycles. You can see my thought process on solving these problems in the issue itself for the feature.
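The divide/floor/modulo approach described above can be sketched roughly as follows. This is a Python model rather than PASM, and it is not the actual driver code: the 8-pixel tile width and the names `render_scanline`, `tile_row`, and `tile_pixels` are assumptions chosen for illustration.

```python
TILE_W = 8  # assumed tile width in pixels


def render_scanline(tile_row, tile_pixels, scroll_x, screen_w):
    """Render one scanline of a horizontally scrolled, wrapping tile map.

    tile_row    -- tile ids across the whole map for this row of tiles
    tile_pixels -- dict: tile id -> TILE_W color bytes for this tile line
    scroll_x    -- left-most visible pixel, in map coordinates
    """
    map_w = len(tile_row)
    col = (scroll_x // TILE_W) % map_w  # first visible tile column, wrapped
    px = scroll_x % TILE_W              # only the first tile starts partway in
    tile = tile_pixels[tile_row[col]]
    out = []
    for _ in range(screen_w):
        out.append(tile[px])
        px += 1
        if px == TILE_W:                # crossed a tile boundary
            px = 0                      # later tiles start at index 0
            col = (col + 1) % map_w     # wrap around the map edge
            tile = tile_pixels[tile_row[col]]
    return out
```

The per-pixel boundary check in the loop is the overhead mentioned above: it runs for every pixel even though a new tile is only loaded once every `TILE_W` pixels.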
And @"Peter Jakacki", the very instant the P2 is in silicon, I'm jumping on it. The power of a GPU/CPU P2 system would be outstanding. And for the record, my ultimate goal with my project is to develop a final PCB that's compatible with the standard arcade edge connector format.
You can read about my project and development log here.
Wow that would be outstanding! I'd be happy to help foot the bill as well!
A lot of my driver's performance comes from a rather novel pixel encoding, where all the masking can be taken care of by the hardware and a pixel can be poked out every 32 cycles. It only works for 4-color tiles though. The version for sprites has an extra compare between the palette and a copy of it before any rotation, to mask the wrbyte.
I had a brief look at your code and I see there's a lot of If you "invert" pxindex (so it starts at usually-8 and counts down), I think you could optimize it to Which, instead of always being 12 cycles, is 4 cycles when the branch is taken. Also fewer instructions.
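To illustrate the count-down idea in general terms, here is a small Python model (not the PASM under discussion). It checks that a counter running down from the tile width visits the same pixel positions as one counting up, and tallies a per-tile cost using the figures quoted above: 12 cycles per check when counting up, 4 cycles when the count-down branch is taken. The 8-pixel tile width, the 8-cycle cost for the not-taken branch, and all function names are assumptions for illustration.

```python
TILE_W = 8  # assumed tile width in pixels


def walk_counting_up(n_pixels, start=0):
    """Index counts up and is compared against TILE_W on every pixel."""
    px, order = start, []
    for _ in range(n_pixels):
        order.append(px)
        px += 1
        if px == TILE_W:       # explicit compare, then reset
            px = 0
    return order


def walk_counting_down(n_pixels, start=0):
    """Counter holds pixels left in the tile; hitting zero is the boundary."""
    remaining, order = TILE_W - start, []
    for _ in range(n_pixels):
        order.append(TILE_W - remaining)
        remaining -= 1
        if remaining == 0:     # zero test comes for free with the decrement
            remaining = TILE_W
    return order


def cycles_per_tile_up(check=12):
    return TILE_W * check


def cycles_per_tile_down(taken=4, not_taken=8):  # not_taken is an assumption
    return (TILE_W - 1) * taken + not_taken
```

With these assumed costs, the boundary checks for one 8-pixel tile drop from 96 cycles to 36.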
Your pixel poking solution is definitely awesome, however I'm working with full 8-bit colors for all of my graphics.
Your solution to pxindx tracking is great - exactly the kind of cycle-shaving I'm working towards.