Prop2 FPGA files!!! - Updated 2 June 2018 - Final Version 32i

Cluso99 · 2018-04-12 02:41

Dave,
Here is the code for testing the SD cards.
Postedit: Sorry, wrong code attached (now deleted). See a few posts below for correct code.

You will need to have PST or a similar serial terminal connected to P31/P30. My code will output at 115,200 baud some 5 seconds after the SD card is initialised (or fails). Capture and post the details.

You may need to change the location of the SD pins depending on your configuration. Note that if they are below P32, you will also need to change the pin definitions below that. I also have LEDs on some pins; just change these pin definitions to unused pins if required.

' microSD CV-A9 pins
  pinCS    = 61   '39
  pinCLK   = 60   '41
  pinDO    = 59   '36
  pinDI    = 58   '40
  CS       = pinCS-32
  CLK      = pinCLK-32
  DO       = pinDO-32
  DI       = pinDI-32

' RGB LED pins (h=on)
  _redled   = 5
  _greenled = 9
  _blueled  = 7
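
As a rough illustration of the pinCS-32 style definitions above: P2 pins 32..63 sit on the second 32-bit port, which is presumably why the code subtracts 32 to get a bit position. The Python helper below (port_bit is a made-up name, not part of the test code) just does that split:

```python
def port_bit(pin: int) -> tuple[str, int]:
    """Split a pin number (0..63) into (port, bit within that 32-bit port)."""
    if not 0 <= pin <= 63:
        raise ValueError("pin must be in 0..63")
    # Pins 0..31 are on port A, pins 32..63 on port B.
    return ("B", pin - 32) if pin >= 32 else ("A", pin)

# The SD pin numbers from the definitions above.
for name, pin in (("CS", 61), ("CLK", 60), ("DO", 59), ("DI", 58)):
    print(name, port_bit(pin))
```

If the SD pins were moved below P32, the same split would return port A and the unmodified pin number, which is the case where the "-32" definitions would need changing.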

Forgot this: change the following to a valid filename on your card. A text file with recognisable text is best. Note how the letters within each long are written in reverse order because of little-endian storage (i.e. the filename here is _BOOT_P1.BIX, without the ".").

_fname          long    ("_" + "B"<<8 + "O"<<16 + "O"<<24)  '\\ filename...
                long    ("T" + "_"<<8 + "P"<<16 + "1"<<24)  '||   8.3 +$00
                long    ("B" + "I"<<8 + "X"<<16 + $00<<24)  '//
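
As a quick cross-check of the byte order, the three longs above can be rebuilt in Python. This is only a sketch; fname_longs and pack_fname are hypothetical helper names, not part of the test code. Each long is stored least-significant byte first, so the bytes end up in memory in reading order:

```python
import struct

def fname_longs(name11: str) -> list[int]:
    """Build three longs from an 11-character 8.3 name (no dot) plus $00."""
    raw = name11.encode("ascii") + b"\x00"
    if len(raw) != 12:
        raise ValueError("expected 8 name characters + 3 extension characters")
    longs = []
    for i in range(0, 12, 4):
        b0, b1, b2, b3 = raw[i:i + 4]
        # Python's + binds tighter than <<, so the shifts are parenthesized.
        longs.append(b0 + (b1 << 8) + (b2 << 16) + (b3 << 24))
    return longs

def pack_fname(name11: str) -> bytes:
    """Store the three longs little-endian, as they would sit in memory."""
    return struct.pack("<3I", *fname_longs(name11))

print([hex(v) for v in fname_longs("_BOOT_P1BIX")])
print(pack_fname("_BOOT_P1BIX"))
```

To try a different file, pass its 8.3 name padded to 11 characters with the dot removed and copy the resulting characters into the three long lines.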

If there are still problems, I have older code that has more debug info.

Dave Hein · 2018-04-12 03:16

Thanks, I'll give it a try in the morning. I figured out what the problem was with my code. It didn't handle SDHC cards correctly. Most of the cards that I use with the Prop are 2GB uSD cards. I hadn't tested my P2 code with an SDHC card until I tried the 4GB in the DE2-115 card slot. I fixed the problem and it works fine now.

Cluso99 · 2018-04-12 04:51

Dave Hein wrote: »

Thanks, I'll give it a try in the morning. I figured out what the problem was with my code. It didn't handle SDHC cards correctly. Most of the cards that I use with the Prop are 2GB uSD cards. I hadn't tested my P2 code with an SDHC card until I tried the 4GB in the DE2-115 card slot. I fixed the problem and it works fine now.

Excellent news.
I haven't been able to buy standard SD cards for years although I have quite a few from old cell phones.

Dave Hein · 2018-04-12 17:52

Cluso, how do I load your test code? It starts at $FC000. Shouldn't there be some code at location 0 that starts up the test code?

Cluso99 · 2018-04-12 20:02

Dave,
I just use pnut.exe to compile and load the fpga.

Not sure where you are seeing $FC000 though.

The code starts at cog $000, which is loaded by Chip's boot ROM from hub $00000. IIRC it's called entry, and in v121a it starts the 80MHz clock using hubset. This code is for debugging the SD.

Dave Hein · 2018-04-12 21:01

The beginning of the code looks like this

DAT
'
'
'*******************************************
'*  Cog init - overwritten by SPI program  *
'*******************************************
'
		orgh	$FC000
		org
'
'
' Move code into position
'
		setq	#cog_end-cog_code-1	'move cog code into position
		rdlong	cog_start,##@cog_code

The first $FC000 bytes of the binary file are zero. After pnut loads the binary into hub RAM, doesn't the P2 load the boot program from ROM into $FC000, which will overwrite your program? To run your code I would have to somehow copy your code into an FPGA image, and then program it into the FPGA. What am I missing?

Cluso99 · 2018-04-12 21:11

Oh Smile!
Sorry, I posted the wrong code.
Here is the correct code

Dave Hein · 2018-04-15 13:56

I'm working on updating spinsim to match the latest version of the FPGA, and I am having problems with blnpix and mixpix. The documentation says that the pixels are mixed as follows:

D[31:24] = ((D[31:24] * DMIX + S[31:24] * SMIX + $FF) >> 8 ) max $FF
D[23:16] = ((D[23:16] * DMIX + S[23:16] * SMIX + $FF) >> 8 ) max $FF
D[15:08] = ((D[15:08] * DMIX + S[15:08] * SMIX + $FF) >> 8 ) max $FF
D[07:00] = ((D[07:00] * DMIX + S[07:00] * SMIX + $FF) >> 8 ) max $FF

For blnpix, DMIX is !V, and SMIX is V. If I use values of D=$FF, S=$00 and V=$1 I calculate a result of $FE from the formula. This is the result that spinsim and older FPGA images produce, but the latest FPGA image produces a result of $FF. So it appears that the formula has changed slightly.
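
For reference, the documented formula as quoted above can be modelled in a few lines of Python. This is a sketch of the formula only, not of the FPGA logic; it assumes "max $FF" means "limit the result to at most $FF" and that !V is the 8-bit complement of V. The function names are made up for illustration:

```python
def mix_byte(d: int, s: int, dmix: int, smix: int) -> int:
    """One 8-bit lane: ((D * DMIX + S * SMIX + $FF) >> 8), limited to $FF."""
    return min((d * dmix + s * smix + 0xFF) >> 8, 0xFF)

def blnpix(d: int, s: int, v: int) -> int:
    """Apply the lane formula to all four bytes with DMIX = !V and SMIX = V."""
    dmix = ~v & 0xFF  # assumed: 8-bit complement of V
    smix = v & 0xFF
    result = 0
    for shift in (24, 16, 8, 0):
        lane = mix_byte((d >> shift) & 0xFF, (s >> shift) & 0xFF, dmix, smix)
        result |= lane << shift
    return result

# The case from the post: D=$FF, S=$00, V=$01.
print(hex(blnpix(0xFF, 0x00, 0x01)))  # 0xfe
```

With these inputs the lane works out to (255 * 254 + 255) >> 8 = 65025 >> 8 = 254, which is the $FE quoted above.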

Chip, what is the current formula that's used for blnpix and mixpix?

Dave Hein · 2018-04-15 19:32

After looking at it some more, it appears that the blnpix instruction isn't changing the value of the D register. Has anybody else used the blnpix instruction with the latest FPGA image?

EDIT: I found the problem. I hadn't realized that the code values for SETPIV and SETPIX were swapped since the last time I looked at an FPGA update. I was using the code value for SETPIX instead of SETPIV. I changed it in spinsim and my test program. I still need to change it in my assembler and disassembler.

Roy Eltham · 2018-04-16 02:12

So it's been a while since I updated my A9 board, and I wanted to get this latest one on it for testing. However, I can't find any directions for doing that. I looked in the zip, I looked in docs (linked in first post here), I tried searching the forums, but the search is kind of worthless for single sub forum searching and I'm not hitting the right combination of search terms to get anything useful.

So can someone refresh me on it, or point me to where the directions are? I think they should probably be in the zip file.

ozpropdev · 2018-04-16 02:46

For example on COM6

Switch to "PGM" then

px Prop123_A9_Prop2_8cogs_v32b.rbf /6 /p

then switch to "RUN"

Roy Eltham · 2018-04-16 03:06

Thanks ozpropdev, got it updated and running. The all-cogs blink and both VGA demos worked fine.

Now to play a bit...

cgracey · 2018-04-16 03:12

Dave Hein wrote: »

After looking at it some more, it appears that the blnpix instruction isn't changing the value of the D register. Has anybody else used the blnpix instruction with the latest FPGA image?

EDIT: I found the problem. I hadn't realized that the code values for SETPIV and SETPIX were swapped since the last time I looked at an FPGA update. I was using the code value for SETPIX instead of SETPIV. I changed it in spinsim and my test program. I still need to change it in my assembler and disassembler.

Whew! The gate is closing on when we can make changes.

Cluso99 · 2018-04-16 12:27

Chip,
Will there be any way to find out how much hub ram has been implemented without testing for multiple address mapping?
Is it likely that the ROM code might be identical or different in various future P2 versions?

Guess I am asking might it be worthwhile to have a specific ROM address that will be copied into the top 16KB of Hub.

The reason I ask, is that when loading a file from SD, perhaps it's preferable to limit the load to the maximum Hub ram less the 16KB. I have to ensure code that needs to copy the hub to cog and then jump to cog is not clobbered by a file being loaded. Otherwise I guess I need to load a little code into LUT space for this.

cgracey · 2018-04-16 13:27

Cluso99 wrote: »

Chip,
Will there be any way to find out how much hub ram has been implemented without testing for multiple address mapping?
Is it likely that the ROM code might be identical or different in various future P2 versions?

Guess I am asking might it be worthwhile to have a specific ROM address that will be copied into the top 16KB of Hub.

The reason I ask, is that when loading a file from SD, perhaps it's preferable to limit the load to the maximum Hub ram less the 16KB. I have to ensure code that needs to copy the hub to cog and then jump to cog is not clobbered by a file being loaded. Otherwise I guess I need to load a little code into LUT space for this.

Yes, the ROM could have a few locations where such things are stored.

I will be getting into the ROM code tomorrow.

Dave Hein · 2018-04-16 13:28

cgracey wrote: »

Whew! The gate is closing on when we can make changes.

I figured the problem was due to something in my code. It just took me a while to figure it out. The good news is that the 175 instructions that I check in my test programs all match up with the documentation and spinsim.

cgracey · 2018-04-16 18:11

Dave Hein wrote: »

cgracey wrote: »

Whew! The gate is closing on when we can make changes.

I figured the problem was due to something in my code. It just took me a while to figure it out. The good news is that the 175 instructions that I check in my test programs all match up with the documentation and spinsim.

Dave, thank you so much for testing all that stuff. I'm feeling like it's probably all okay, but sometimes issues pop up.

Roy Eltham · 2018-04-16 21:14

Spent some time understanding the video examples. Modified the 256x192 NTSC one to be 320x240 (was pretty trivial to do).

Does anyone have an example doing BG tiles and/or sprites? That needs to work significantly differently from the simple streamer examples for displays.
Since we have so little HUB memory, we have to resort to tiles for higher resolution, and of course you gotta have sprites in any case.

potatohead · 2018-04-16 22:53

I only did a 2 color character driver. Ran the streamer at 32 bytes per line. Haven't tried smaller defined bursts. Pretty sure one can just trigger it again however, with a new address, before it gets done to scan different addresses back to back. Seems like I tried that, but will have to go look.

I know I did live, mid scanline, changes to LUT entries, as well as color depth, like from 2 color to 4 color, and those both worked after Chip settled a timing issue. Racing the beam type things are possible. Mixed mode displays, essentially. At the time, we were talking about that kind of thing, text, with a graphics area, or multiple graphics areas, different depths.

Haven't tried it P1 style, short defined bursts, but the COG runs fast enough to assemble basic tiles while streaming a scanline at TV frequencies. So I just did the whole line. 2 and 4 color.

That's probably enough for good, multi color, maybe multi palette tiles. Sprites probably end up in another COG P1 style, though we've added a lot of cool instructions since then.

Never did a Sprite engine for this one, only HOT. I was just running a single buffer bitmap, drawing to that.

Was quite a while ago. I'll see if I can find that code. Around that time Andy did a multi color char driver. That may be helpful too.

As far as I know, nobody has tested the pixel mixer. Was hoping to get to it, but the startup I'm knee deep into right now has gotten in the way.

Roy Eltham · 2018-04-16 23:19

potatohead,
The existing NTSC demos use the 8bit RFLONG LUT mode, so your granularity would be 4 pixels. Reading the docs it looks like you could do a RFBYTE LUT mode that does one byte at a time instead of LONG, so you could get pixel granularity if you are using 8bit pixels (Edit: never mind, there does not appear to be a RFBYTE LUT mode). Otherwise you are looking at 2 or 4 pixel granularity if you use 4bit or 2bit color modes. You still have to read both the BG pixels and the sprite pixels and mix them to do sprites properly on the fly.

I think you could manually read BG and sprite pixels, combine them (not using the mix instructions since that requires expanding to 24/32bit color), and then feed the output a long at a time. You can probably get up to reasonably high resolution doing this on the final hardware running at 160MHz+. The hangup on resolution would be the random hub access speed (vs sequential streaming). Perhaps you could store sprites in LUT memory to overcome that, but then you are severely limited on the number. Of course, all of this is assuming you want to do this all on the fly "racing the beam" as it were.

I think it'll be easier to just composite a buffer separately from displaying it.

What I would really like to be able to do is a 640x480 (or 720x480) display with 8bit color BG, and many arbitrary sized sprites that are 8bit color. To fit in memory and allow for other things going on with it, that pretty much means BG made of tiles.
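
To put rough numbers on that, here is a small Python sketch of the memory arithmetic. The 8x8 tile size and 2-byte tile-map entries are assumptions picked purely for illustration, not something settled in this thread:

```python
def framebuffer_bytes(width: int, height: int, bits_per_pixel: int = 8) -> int:
    """Bytes needed for a full bitmap at the given resolution and depth."""
    return width * height * bits_per_pixel // 8

def tilemap_bytes(width: int, height: int, tile: int = 8,
                  bytes_per_entry: int = 2) -> int:
    """Bytes for a tile map only (tile pixel data is extra).
    tile and bytes_per_entry are hypothetical values."""
    return (width // tile) * (height // tile) * bytes_per_entry

for w, h in ((640, 480), (720, 480)):
    print(w, h, framebuffer_bytes(w, h), tilemap_bytes(w, h))
```

A full 8bit bitmap comes to 307,200 bytes at 640x480 and 345,600 bytes at 720x480, while the assumed tile map is on the order of 10 KB plus whatever the tile graphics themselves take.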

Roy Eltham · 2018-04-16 23:52

Chip,
Was there a reason not to have RFBYTE and RFWORD LUT modes for the streamer? You only have LUT modes for RFLONG and Immediate (using the whole immediate long). So one always has to construct a complete LONG to feed it with pixels that use the LUT.
Not asking you to change anything, please don't, just curious about the choice.

On a related note, I assume that if I want to expand 8bit pixels via LUT for use with the pixel mixing instructions, I need to do the lookup into the LUT manually, right?
Sorry for what might be simple/dumb questions, there are just so many instructions and the docs are still pretty sparse on some details.

potatohead · 2018-04-16 23:53

I think it'll be easier to just composite a buffer separately from displaying it.

So do I.

Wasn't really intending to drive it that way for general purpose sprites and tiles. Sorry, wasn't clear. Was more about what the thing can do, and was musing about P1 style tiles mostly.

Agreed on 720(640)x480. Seems a nice sweet spot, given resources.

Roy Eltham · 2018-04-17 01:03

I modified the NTSC demo such that for each line it does a loop reading a byte at a time, then a rdlut, then a xcont for 1 pixel using immediate 32bit mode. This works.
I then tried to add in a blnpix and that stopped working for 1 pixel at a time. I guess the rfbyte, rdlut, blnpix combo takes too long and so the timing gets messed up, and my monitor can no longer display the image.
So I changed it to 2 pixels at a time of the same blended value (I read 2 bytes, tossing one, then do the xcont for 2 pixels). That works, and I was able to test addpix, mulpix, and blnpix. All worked as expected.

Here's a little video of the blnpix test: https://photos.app.goo.gl/lfB4u24yKbVPiaX43
just blending the green channel with a value going from 0 to 255 and wrapping.

ozpropdev · 2018-04-17 01:56

@Roy,@potatohead Re: Sprites

A while back I made a new Invaders version that uses sprites.
It uses 4 cogs to generate the video scanlines on the fly.
1 for sprites and 2 for large and small text.
It's all sync'd using COGATN and also uses WMLONG.
I just updated the files for the V32b FPGA image.

See here
https://forums.parallax.com/discussion/165289/p2-invaders-2-0-their-back

Roy Eltham · 2018-04-17 02:14

I will dig into the code to try and understand it all, but can you do a simple description of how you get multiple cogs all driving video together? Or am I misunderstanding?

Edit: I believe I misunderstood before, looks like you have the interrupt driven video code as the only thing actually driving the VGA output. The sprite/text engines update the buffer lines as needed.

ozpropdev · 2018-04-17 03:23

Roy Eltham wrote: »

I believe I misunderstood before, looks like you have the interrupt driven video code as the only thing actually driving the VGA output. The sprite/text engines update the buffer lines as needed.

That's right Roy.

Cluso99 · 2018-04-19 08:20

Anyone know what happened to the setcz instruction?
I found modcz but I don't understand the required operands.

Peter Jakacki · 2018-04-19 08:59

Cluso99 wrote: »

Anyone know what happened to the setcz instruction?
I found modcz but I don't understand the required operands.

This is how I use it:

modcz	0,0 wc		' clear C for some operations that use this to determine behaviour

The 0,0 are the states for C and Z while wz wc specify which ones to set. So therefore, to set the carry:

modcz	1,0 wc		' Set C

MODCZ   c,z      {WC/WZ/WCZ}	Modify C and Z according to cccc and zzzz. C = cccc[{C,Z}], Z = zzzz[{C,Z}].

ozpropdev · 2018-04-19 09:09

From instructions_v32.txt

MODC    c               =       MODCZ   c,0         {WC}
MODZ    z               =       MODCZ   0,z         {WZ}


---------------
MODCZ constants
---------------

_CLR                    =       %0000
_NC_AND_NZ              =       %0001
_NZ_AND_NC              =       %0001
_GT                     =       %0001
_NC_AND_Z               =       %0010
_Z_AND_NC               =       %0010
_NC                     =       %0011
_GE                     =       %0011
_C_AND_NZ               =       %0100
_NZ_AND_C               =       %0100
_NZ                     =       %0101
_NE                     =       %0101
_C_NE_Z                 =       %0110
_Z_NE_C                 =       %0110
_NC_OR_NZ               =       %0111
_NZ_OR_NC               =       %0111
_C_AND_Z                =       %1000
_Z_AND_C                =       %1000
_C_EQ_Z                 =       %1001
_Z_EQ_C                 =       %1001
_Z                      =       %1010
_E                      =       %1010
_NC_OR_Z                =       %1011
_Z_OR_NC                =       %1011
_C                      =       %1100
_LT                     =       %1100
_C_OR_NZ                =       %1101
_NZ_OR_C                =       %1101
_C_OR_Z                 =       %1110
_Z_OR_C                 =       %1110
_LE                     =       %1110
_SET                    =       %1111


Examples:

MODCZ   _CLR, _Z_OR_C   WCZ     'C = 0, Z |= C
MODCZ   _NZ,0           WC      'C = !Z
MODCZ   0,_SET          WZ      'Z = 1

MODC    _NZ_AND_C       WC      'C = !Z & C
MODZ    _Z_NE_C         WZ      'Z = Z ^ C

Cluso99 · 2018-04-19 09:53

Thanks guys.
Modcz does not work for me as I have 4 labels for each of the possibilities. I need to use these labels to set the c and z flags and then return.
The old way with setcz allowed me to add #1 to a field and then setcz before returning.

I will need to set up 4 sets of 2 instruction pairs, a modcz followed by a ret.

I am working on getting the monitor section of my code running. I did this back on the old P2 in 2013 using LMM.
