@Ltech said:
Which one are you trying to build?
You can not use the FlexProp graphical IDE, because the build needs to compile multiple files with special settings and merge them together, and the IDE just can't do that.
Therefore you need to run build.sh from the terminal. (This requires a certain quantum of computer literacy, a rare resource these days, I know.)
On Windows you might need to rename the file to .bat if you don't have Bash available or otherwise want to use the built-in command prompt.
If it can't find the flexspin executable, you need to add the bin folder in your FlexProp installation to PATH. To do this temporarily for the current terminal session (of course replace the path with your actual install location):
Mac/Linux/Etc: export PATH=$PATH:/somewhere/FlexProp/bin
Windows command prompt: set PATH=%PATH%;C:\somewhere\FlexProp\bin
Windows Bash: export PATH=$PATH:/c/somewhere/FlexProp/bin (note the drive letter changing into a Unix-style path)
Then you just ./build.sh (or .\build.bat if applicable).
Then the file that ends in _lower.binary is what you want to load into the chip.
@rogloh there's a big annoying thing with sending the clock packet more often: the encoding needs to change during VSync, and under the current model there isn't time to encode another packet on-the-fly, I think. So I'd need to lose 32 LUT longs to store a pre-encoded VSync version. I can squeeze it in for now, but I think if that doesn't end up improving anything, I'll throw it out again.
Relatedly, I added RGB with CSync, Sync-on-Green and YPbPr modes and they work (currently 480p only). Inexplicably my ASUS PA248QV monitor does apparently support CSync and Sync-on-Green, completely undocumented (by extension YPbPr shows up but there's no setting that lets me fix the colors). They're all fine on the VisionRGB.
@Wuerfel_21 I'm going to try it... But, what is "DIGITAL_BASEPIN"?
I was going to use P32 for PSRAM...
@Rayman
DIGITAL_BASEPIN is where your HDMI port is. (The distinction is introduced because you can do DVI+VGA dual modes, though the VGA portion thereof is somewhat poorly compatible.)
If you use pure DVI/HDMI modes, ANALOG_BASEPIN is ignored.
@rogloh try the hdmi-clock-regen-test branch I just pushed. This transmits the clock packets at spec-compliant intervals (which ends up being exactly every 50 samples). Code quality is dubious, I just hacked this together real quick.
@Wuerfel_21 Ok, figured out digital_basepin after a few minutes of darkness...
I got video now, but a directory error... Seems it now needs a folder, like the others...
"Directory error" actually can mean any number of things (including the card not being inserted, IIRC). I don't think I ever changed NeoYume's directory layout, so if you had it working previously, it should still be fine. Make sure the SD card pins are correct and the card is properly inserted.
It works! Just tried crossed swords and HDMI audio is working. There was however, some major video corruption while waiting for me to hit the start button. After that, seemed all good.
The corruption looked like random 8x8 tiles all over..
This was with hyperram by the way
Can you give me a picture of that and test if it happens with the old version on the master branch?
Tried a couple more times and it didn't come back... must have been a fluke.
Ah my favorite. I actually dug up my HyperRAM to see if that got broken (I very rarely test HyperRAM - For neoyume work I always use the 96MB PSRAM board)
Maybe it just needs to warm up a bit? No idea, never seen it before and it didn't come back...
Just tested with SimpleP2 with PSRAM in 16-bit mode. Crossed Swords and Metal Slug worked perfectly.
Congrats! I think you have it.
Neat.
There are still some bugs there, though. UniBIOS support is currently busted, it seems. I think the values read from the scanline counter during blanking are incorrect (or rather, less correct than before).
EDIT: Bug squashed (on both video-nextgen and hdmi-clock-regen-test).
Yeah I had the same thought when I woke up this morning. Will try it in a bit...
Ok cool, will try that out too on the plasma and AVR to see if anything is fixed. Be awesome if it fixes it.
Nope, no luck with the test. Plasma displays nothing now, and the projector direct connection reports an invalid signal with Vsync 474Hz and Hsync 31.5kHz. You might be right about the quick hack thing. Something seems to be broken now with sync.
Uhhhhh what? That really shouldn't happen, the video is fine on my end and should really not be affected at all. Just for ref, what flexspin version are you running? Mine is Version 7.0.0-alpha-v6.9.7-52-gada40c8f Compiled on: Jul 2 2024
There may have been an issue if your device is picky about preambles. I just pushed a new commit, see if that one fixes it.
No luck with the plasma - blank screen, and the projector is also blank. For a sanity test I went back to the other branch and ran it direct to the projector, and was able to see a stable 640x480 picture when I used the FAST option for the low logic level driver instead of 1K5. This may be due to a long cable run to the projector in the ceiling. I did get an image with the 1K5 option, but it had a lot of red pixel noise. No sound obviously, because there is no speaker in the projector.
One thing I need to check is whether it may be crashing when I plug in the HDMI cable. It's messy: I'm loading up a JonnyMac board from my Mac with your code, then removing the proplug and walking it into the other room powered by a 5V USB battery pack, and plugging in the HDMI while it is running. The other version doesn't crash with this, but I just don't know if this one does. I see the LEDs on P39 and P38 change as I plug in the cable, and it may be a reset; not sure. I'll have to drag my laptop out, upgrade the tools to whatever is required now, power it with that, and load your image once it is plugged into the HDMI port, to be sure.
Oh, that is a janky setup...
Also, feel free to diff the branches to look for any obvious goofs. I'm going to sleep now.
No worries, thanks for the efforts. I'll keep plugging away...
Update: Nope definitely bad after a direct laptop download with HDMI already plugged into the projector. Reporting no signal info at all.
Oh man, I've just been trying to grok your video driver code today @Wuerfel_21 . There's a lot of interesting things in there. I like the heavy overlay use and more extensive parameterization of code and data - if I rewrite my video driver one day, there are a few things I can learn from this. My driver uses overlays only for syncs, so it's much more limited in taking advantage of what that can offer. Certainly a lot of setup work though to get this all working correctly. Tricky stuff.
I can't say I like needing the audio polling and pixel doubling render code to be executed at such fine grain throughout the code path. That does make a mess of the code and puts lots of call instructions everywhere, although it can certainly work with care. I'm wondering if there's a better way you can make use of the streamer & FIFO to get your pixels output and doubled, although I think you said you couldn't do it when the video output pin is in REPO mode. Can the pixel doubling also be achieved that way with NCO and the streamer/FIFO? Maybe I can make use of it for my own pixel doubling code, not sure. Might be too bit-depth dependent.
If there was a better way to read audio samples at the end of the video line, you might avoid some of the frequent audio poll calls at least. I had an idea to use a small FIFO buffer to hold several pending audio samples (e.g. 4 to 8 samples) before the video driver reads them in per scan line to be encoded and sent out in the islands. The extra complexity is that the writer side needs to manage memory and wrap its pointers etc., which is not nearly as simple as writing to a smart pin in REPO mode, although it could be done. You don't ever want to block the writer side, though, and you can never let this FIFO overflow, or audio artefacts will occur. Another idea is to issue a COGATN per audio sample (or group of samples), which might be faster to check on.
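The small-FIFO idea above can be sketched generically (in C here purely for illustration; the names, the 8-slot size, and the drop-on-full policy are assumptions, not taken from either driver):

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-producer/single-consumer sample FIFO, sized a
   power of two so the pointer wrap is a cheap mask operation. */
#define AFIFO_SIZE 8
#define AFIFO_MASK (AFIFO_SIZE - 1)

typedef struct {
    uint32_t    buf[AFIFO_SIZE]; /* one packed L/R 16-bit pair per slot */
    atomic_uint head;            /* written only by the producer        */
    atomic_uint tail;            /* written only by the consumer        */
} afifo_t;

/* Producer (sound driver side): must never block. On overflow we drop
   the sample and report it, rather than stalling audio generation. */
static bool afifo_push(afifo_t *f, uint32_t sample)
{
    unsigned h = atomic_load_explicit(&f->head, memory_order_relaxed);
    unsigned t = atomic_load_explicit(&f->tail, memory_order_acquire);
    if (h - t == AFIFO_SIZE)
        return false;            /* full: caller can count overflows */
    f->buf[h & AFIFO_MASK] = sample;
    atomic_store_explicit(&f->head, h + 1, memory_order_release);
    return true;
}

/* Consumer (video driver, once per scan line): drain what is pending. */
static int afifo_pop_all(afifo_t *f, uint32_t *out, int max)
{
    unsigned t = atomic_load_explicit(&f->tail, memory_order_relaxed);
    unsigned h = atomic_load_explicit(&f->head, memory_order_acquire);
    int n = 0;
    while (t != h && n < max)
        out[n++] = f->buf[t++ & AFIFO_MASK];
    atomic_store_explicit(&f->tail, t, memory_order_release);
    return n;
}
```

Because the writer only ever touches head and the reader only tail, neither side blocks the other; the cost is exactly the wrap-and-mask bookkeeping described above.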
I don't see where you are sending an audio InfoFrame packet out, only the AVI InfoFrames. I think this may need to be sent too (I sent it once per frame during Vsync in my code IIRC).
If I can get motivated to relearn my FPGA stuff again: I found this link on reddit which seems to imply you could do HDMI capture in a cheap FPGA (i.e. without a serdes, so free tools might be possible). This would be a handy way to capture an HDMI bitstream to analyze/compare how audio works for devices that have issues with the P2 HDMI audio code. It even seems to mention operating with 27MHz pixel clocks, which is what we'd need. We don't need to decode 10b to 8b in HW, just capture raw deserialized data and log it for post-processing afterwards. It may need a PSRAM or other memory interface for storage - or better yet, clock the 3x10b data @27MHz into a P2 directly, which could do the capture work, write to PSRAM, and read it back via a serial port or write some TMDS log files to SD. Although I have to say I really don't want to get sidetracked by this right now; the learning/bringup curve is huge. Plus so many other things on the go.
https://www.reddit.com/r/FPGA/comments/17fj8p5/8b10b_with_cdr_in_hdl_on_low_cost_fpgas/
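For the post-processing side, the 10b-to-8b step for video-data symbols is simple enough in software; a sketch following the standard DVI decode rules (bit 9 = inversion flag, bit 8 = XOR/XNOR flag) might look like:

```c
#include <stdint.h>

/* Decode one captured 10-bit TMDS video-data symbol back to its 8-bit
   pixel byte, per the standard DVI encoding. Control symbols and data
   island (TERC4) symbols would need separate lookup handling. */
static uint8_t tmds_decode(uint16_t q)
{
    uint8_t d = q & 0xFF;
    if (q & 0x200)                 /* bit 9 set: byte was inverted */
        d ^= 0xFF;
    uint8_t out = d & 1;           /* bit 0 passes through */
    for (int i = 1; i < 8; i++) {
        uint8_t b = ((d >> i) ^ (d >> (i - 1))) & 1;
        if (!(q & 0x100))          /* bit 8 clear: XNOR chain was used */
            b ^= 1;
        out |= (uint8_t)(b << i);
    }
    return out;
}
```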
The pixel doubling thing is an inherent limitation of the TMDS hardware. It just can't do it. The audio polling is really not a problem, I just call it frequently to be on the safe side. The mailbox method is specifically designed to be as uninvasive to the sound driver as possible, which is usually already crammed to the max. It takes like 3 extra instructions to pack the data and write it. For a different scenario (i.e. not an emulator) I would consider a memory FIFO, but in that case I'd actually have the video cog run the analog DACs, which needs polling once again (because then the audio generation doesn't have to run continuously - could chunk it into larger blocks or even do the entire audio for a frame during vblank in a rendering cog)
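The mailbox scheme described here can be modeled roughly as a one-slot slot-plus-tag handshake (a sketch only; the packing layout, the 64-bit slot, and the sequence tag are assumptions, the real driver's hub layout may well differ):

```c
#include <stdint.h>

/* Producer: pack a stereo pair plus a wrapping sequence tag into one
   aligned write. The tag must start nonzero so the first sample is seen. */
static void mailbox_write(volatile uint64_t *mb, int16_t l, int16_t r,
                          uint32_t seq)
{
    *mb = ((uint64_t)seq << 32)
        | ((uint64_t)(uint16_t)r << 16)
        | (uint16_t)l;
}

/* Poller: a changed tag means a fresh sample pair. Returns 1 if new. */
static int mailbox_poll(volatile uint64_t *mb, uint32_t *last_seq,
                        int16_t *l, int16_t *r)
{
    uint64_t v = *mb;
    uint32_t seq = (uint32_t)(v >> 32);
    if (seq == *last_seq)
        return 0;                  /* nothing new since last poll */
    *last_seq = seq;
    *l = (int16_t)(v & 0xFFFF);
    *r = (int16_t)((v >> 16) & 0xFFFF);
    return 1;
}
```

The point of the design, as described above, is that the producer's cost is just packing and one write, while the poller can be called as often as convenient and cheaply detects "no news".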
There is no audio infoframe. The spec calls for it, but it seems entirely superfluous and there isn't memory to hold it. The regen timing thing has a lot more legs. I have no idea why the sync is messed up on that for you. On my end it works fine on the asus and the visionrgb. Will try it on the TV later. The only meaningful change is to the hsync/vsync functions (and the packets they send out)
Maybe for some devices designed to use it, it is needed, especially if the spec calls for it to be present...
You can try changing the AVI infoframe into an audio infoframe in the init code (the former is not a hard spec requirement - though some idiot device may try to go limited range without it). But really, the sync thing needs looking into.
Perhaps you could put pre-computed (static) AVI and audio InfoFrame data into hub RAM, and read it into a common COG buffer space prior to output? You may have time to read this during VSYNC, perhaps. It's not a lot of longs to read in a block transfer.
Agree the sync is important.
I wish I could capture your output for analysis with my decoder program to see what is wrong with the video bitstream, but this mailbox thing makes a mess of it and it can't be logged.
You don't even need a block transfer, regular streamer DMA will do it. That's what my experimental code did. The real problem (and this is NeoYume specific) is that the lower code will obliterate all our upper hub memory as soon as the game starts. It all has to be in cog/lut.