HDMI compatible audio with P2
rogloh
in Propeller 2
This thread is started to discuss HDMI compatible audio as it might be cluttering up the P2-EVAL rev B thread.
Comments
I am using a Saleae logic analyzer to capture the TMDS bus at much lower than normal frequencies. It is clocked at 11.9MHz, giving a pixel rate of 1.19MHz. The sampling is done at 12MHz (USB rate), which gives me almost one TMDS sample every clock. Every now and again I get a duplicate because of this mismatch, but I'm living with it.
(In this case, channel 2 is red, channel 5 is green, channel 7 is blue, and channel 7 is TMDS clock - I lost the labels I had before I captured the pic for some reason).
I wrote a rudimentary decoder in C to process the data exported by the Logic software and to display the 10b codes on each channel aligned with the clock edge and also showing a debug bit on the rightmost column. Red is first, green second, blue third, then clock, then a debug pin. If it matches a TERC4 encoded 10b value the nibble mapped to that value is shown too.
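For anyone wanting to try the same kind of lookup, here is a minimal sketch in Python (not the C decoder described above). The table holds the TERC4 10-bit codes as I understand them from the HDMI 1.x specification, written with bit 9 on the left; please check them against the spec before relying on them. The names `TERC4_CODES`, `bits_to_symbol` and `terc4_nibble` are made up for this example, and it assumes the captured bits arrive LSB first, as TMDS serializes them.

```python
# TERC4 codes indexed by nibble value 0..15, written as q_out[9:0].
# Assumption: values as recalled from the HDMI 1.x spec; verify before use.
TERC4_CODES = [
    0b1010011100, 0b1001100011, 0b1011100100, 0b1011100010,
    0b0101110001, 0b0100011110, 0b0110001110, 0b0100111100,
    0b1011001100, 0b0100111001, 0b0110011100, 0b1011000110,
    0b1010001110, 0b1001110001, 0b0101100011, 0b1011000011,
]

# Reverse map: 10-bit symbol -> nibble.
TERC4_DECODE = {code: nibble for nibble, code in enumerate(TERC4_CODES)}


def bits_to_symbol(bits):
    """Pack ten captured bits (first bit on the wire first) into a 10-bit value.

    TMDS sends the least significant bit first, so bits[0] becomes bit 0.
    """
    if len(bits) != 10:
        raise ValueError("expected exactly 10 bits, got %d" % len(bits))
    value = 0
    for position, bit in enumerate(bits):
        value |= (bit & 1) << position
    return value


def terc4_nibble(symbol):
    """Return the nibble for a TERC4 symbol, or None if it is not a TERC4 code."""
    return TERC4_DECODE.get(symbol)
```

A symbol that returns `None` is either video/control data or one of the samples spoiled by the capture clock mismatch.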
So with these tools I can now see what is being sent out on the bus. It's not pretty, but it helps a lot. Every now and again (1 bit in 120), the clock brings in another sample due to the mismatch in timing (I really need a logic analyzer with a clock input). Ozpropdev's version could get this data spot on. I just mark those with a * so I can ignore the data.
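The "1 bit in 120" figure falls straight out of the two clock rates: 12MHz sampling against an 11.9MHz bit clock leaves 0.1MHz of surplus samples, and 12 / 0.1 = 120. A small Python sketch of that arithmetic follows; the function name and the idealised model (perfectly steady clocks, sampling exactly on integer ticks) are assumptions for illustration only.

```python
def count_duplicate_samples(sample_rate_hz, bit_rate_hz, num_samples):
    """Count samples that land on a bit that was already sampled.

    Idealised model: sample i is taken at time i / sample_rate_hz and sees
    bit number floor(i * bit_rate_hz / sample_rate_hz).
    """
    duplicates = 0
    previous_bit = None
    for i in range(num_samples):
        bit_index = (i * bit_rate_hz) // sample_rate_hz
        if bit_index == previous_bit:
            duplicates += 1
        previous_bit = bit_index
    return duplicates


# 12MHz sampling of an 11.9MHz TMDS bit clock, as in the capture above.
dupes = count_duplicate_samples(12_000_000, 11_900_000, 1200)
```

In this model 1200 samples contain 10 duplicates, i.e. one extra sample per 120, which is why a regular run of 11-bit symbols shows up in the dump.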
This dump shows the start of the first packet (null packet in this case) I send in the data island. I temporarily made the guard band 2 pixels of all zeros so I could notice it easily on the logic analyzer. It's not meant to be using those values but I find it easier to debug.
Seems to me, given that small error, a few samples can be dropped over time with low impact.
Perhaps that can be done by looking at them. Find two that are very similar in size, maybe within some level threshold of each other, and just average them into one sample.
Compute the error, how many frames until an overrun happens, divide, and let the watcher just mooch a few samples out of the stream as needed to keep it sane.
The other thing to watch might be quiet regions. Apply same idea, but watch for near zero. Maybe that case is the fall back catch up case too. Multiple samples averaged into a few will have low impact.
This may not even be heard, but for pathological cases.
All one needs is frame level accuracy. Drop the right number at whole island intervals, and the error becomes hours, not minutes or seconds, or is gone, depending. More than good enough, IMHO.
If you had a source producing samples at 44.1kHz, you could make a tiny FIFO to buffer them and encode/transmit them as HDMI opportunities occur.
Or, you could just run a 32-bit NCO in software, updated twice on each scan line, to know when to encode/transmit new samples.
It would be interesting to know if the vertical blanking time is usable for audio without any at horizontal blanking time.
This tool took about 15mins to whip up and could potentially still be buggy, hope not, but YMMV.
To build and run:
Cheers,
Roger.
That's kind of what it is doing. On each scanline of 858 pixels at 480p resolution (H_TOTAL) I add AUDIO_INC = 28*858 to an accumulator, and if the number is greater than LIMIT, I am allowed to pull a sample from the audio FIFO and send it. I keep doing this until I either have an accumulator value below the LIMIT or I reach the 4 sub packet limit. Basically like an NCO in software. The factor of 28 keeps all values as integers with no fractions, and there are perfect relationships for all basic sample rates and the P2 clock.
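A rough model of that per-scanline accumulator, in Python rather than PASM2, might look like the following. The post does not give the value of LIMIT, so the formula in `limit_for` (28 * pixel clock / sample rate) is my assumption, as are the function names and the use of "greater than or equal" for the comparison.

```python
H_TOTAL = 858                    # pixels per scanline at 480p
SCALE = 28                       # integer scaling factor from the post
AUDIO_INC = SCALE * H_TOTAL      # added to the accumulator once per scanline
PIXEL_CLOCK_HZ = 27_000_000      # 480p pixel clock
MAX_SUBPACKETS = 4               # at most four samples per audio packet


def limit_for(sample_rate_hz):
    """Assumed LIMIT: accumulator units consumed by one audio sample."""
    numerator = SCALE * PIXEL_CLOCK_HZ
    if numerator % sample_rate_hz:
        raise ValueError("assumed LIMIT formula is not an integer for this rate")
    return numerator // sample_rate_hz


def samples_per_scanline(limit, num_lines):
    """Return how many samples the accumulator releases on each scanline."""
    accumulator = 0
    counts = []
    for _ in range(num_lines):
        accumulator += AUDIO_INC
        sent = 0
        while accumulator >= limit and sent < MAX_SUBPACKETS:
            accumulator -= limit
            sent += 1
        counts.append(sent)
    return counts


# Five 525-line frames at 48kHz.
counts = samples_per_scanline(limit_for(48_000), 5 * 525)
```

With these numbers each scanline carries one or two samples and five frames carry 4004 samples in total, which is 48000 / 59.94 per frame with no long-term drift.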
1. Ensuring outgoing audio is sent via HDMI at the exact freq (e.g. 32/44.1/48 kHz) for different sysclocks (e.g. 252/270 MHz).
2. Ensuring incoming audio is sampled at the exact freq (e.g. 32/44.1/48 kHz) for different sysclocks (e.g. 252/270 MHz) then output via HDMI.
As mentioned in this post
http://forums.parallax.com/discussion/comment/1478781/#Comment_1478781
issue 1 is not a problem. Provided N and CTS are integers, the audio clock can be recreated from the video clock by the HDMI sink (aka receiver) with no errors, approximations or resampling. Audio Clock Regeneration packets contain the values of N and CTS so the sink knows what to use.
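As a worked illustration of the N and CTS relationship: the sink regenerates 128 * fs as f_TMDS * N / CTS, so for a chosen N the source needs CTS = f_TMDS * N / (128 * fs). The Python sketch below just evaluates that; the function names are made up, and the N values used (4096, 6272, 6144 for 32, 44.1 and 48 kHz) are the commonly recommended ones as far as I know.

```python
from fractions import Fraction


def cts_for(tmds_clock_hz, sample_rate_hz, n):
    """Return CTS for a given TMDS clock, sample rate and N.

    Raises ValueError when the result is not an integer, in which case the
    sink could not recreate the audio clock exactly from these values.
    """
    cts = Fraction(tmds_clock_hz * n, 128 * sample_rate_hz)
    if cts.denominator != 1:
        raise ValueError("CTS is not an integer for these parameters")
    return int(cts)


def regen_rate_hz(sample_rate_hz, n):
    """Rate at which N cycles of the 128*fs clock elapse (128 * fs / N)."""
    return Fraction(128 * sample_rate_hz, n)


# 27MHz pixel clock (480p) at 48kHz with N = 6144.
cts_48k = cts_for(27_000_000, 48_000, 6144)
```

For a 27MHz clock this gives CTS = 27000 at 32 and 48 kHz and CTS = 30000 at 44.1 kHz, all integers, and the 128 * fs / N rate works out to 1kHz (32 and 48 kHz) or 900Hz (44.1 kHz).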
Issue 2 has nothing to do with HDMI per se.
The new smart pin mode for bitstream capturing with external clock should help a great deal with issue 2.
I found the issue in the code, I had V and H sync pulses inverted in one last place where they shouldn't have been. That's what was breaking it.
The audio samples that in theory I am encoding are not playing yet on my TV but that might be a problem either in the packet header data, the encoding or getting packets from the fifo. Will have to dig further there...
I've also been enhancing my decoder program a bit too to decipher each sample further and print more details. The regular invalid (11 bit) sized samples do get annoying, but that's only an artefact of my sampling method. Still tidying this up and trying to align things nicer. It will be useful in general I think going into the future. It would be rather good for it to decode the island packets too and compute/validate the ECC bits, which should be possible once the data is robustly captured.
This fact alone is great because for the HDTV to be able to decipher this packet it proves that the encoding I am doing is working properly across all bit lanes of the TMDS channels and the ECC bits are computing ok too, unless the Pioneer TV ignored them which it shouldn't. I've been looking at this code all week and can't see anything wrong with it since I fixed that sync problem, so unless it is subtle or staring me in the face and I'm missing it, my code might actually be ok.
I recall when I used the P1V+HDMI audio FPGA stuff I put together some time ago with my HDTV I did have some issue in that some sample rates worked and some didn't, so I am now going to try a few other clocking combinations to see if I get some sound output in at least one of them. There are 2 different video pixel frequencies, 27MHz and 25.2MHz, to try along with 3 sample rates, 32/44.1/48kHz, plus different N and CTS values to use. Going to have a play with that some more... I might just have been using a combination (32kHz+27MHz) that it doesn't like.
I know, I might have to take you up on that offer next time I stop by...
Thanks. Looks good on the TV, but doesn't "sound" good yet.
There is one way I know I could get this going: if I resurrect my previously working FPGA P1V project on the FleaOhm, underclock it and redirect these HDMI bits to its Raspi compatible header pins for logic analyzer capture, then compare with my own data stream, I will find the difference between the working/non-working versions. I've been avoiding doing that until now as it is quite a bit of work to do and a big mental context switch from P2 asm back to Verilog coding, but if I get desperate soon enough I might have to go that far. That should find the problem for sure.
I just thought of something. In YCbCr, just dynamically encoding Y makes a nice, monochrome signal.
Would not having to encode so many bits improve on the ability to do higher resolutions? Only the 8 Y bits need to change.
Secondly, one could define static color regions, pre encoded perhaps.
No, the pixel rate is still the same at ~27MHz, so while you could have less hub memory used for a monochrome signal if you just used the Y channel, you still wouldn't increase the output resolution this way. To do that you'd need to overclock the P2 further, e.g. to 400MHz for 800x600 at 60Hz, perhaps somewhat less for devices supporting lower frame rates like some small LCDs, or for other monitors if they can handle reduced blanking.
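The arithmetic behind those numbers, as a small Python sketch: each pixel is a 10-bit TMDS symbol per lane, and this assumes the P2 shifts out one TMDS bit per system clock, so the system clock has to be ten times the pixel clock. The function names are made up, and the 40MHz figure is the standard VESA pixel clock for 800x600 at 60Hz.

```python
from fractions import Fraction

BITS_PER_TMDS_SYMBOL = 10  # one 10-bit symbol per pixel on each lane


def pixel_clock_hz(h_total, v_total, frame_rate_hz):
    """Pixel clock from total (active plus blanking) timing and frame rate."""
    return h_total * v_total * Fraction(frame_rate_hz)


def sysclk_needed_hz(pixel_clock):
    """System clock needed, assuming one TMDS bit is sent per system clock."""
    return BITS_PER_TMDS_SYMBOL * pixel_clock


# 480p: 858 x 525 total at 59.94Hz (60000/1001) is exactly 27MHz.
clk_480p = pixel_clock_hz(858, 525, Fraction(60000, 1001))
sysclk_480p = sysclk_needed_hz(clk_480p)        # 270MHz
sysclk_800x600 = sysclk_needed_hz(40_000_000)   # 400MHz
```

Reduced blanking or a lower frame rate shrinks the pixel clock, and with it the system clock required, which is the only way to fit a larger mode under a given P2 clock.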
I imagine high resolution monochrome modes could be very appealing in some applications, for that perhaps component or VGA is the way to go.
I plan to use component in these ways myself.
I fixed it but was disappointed to find that is not the end of it. However now I am hearing some clicking sounds from the TV from my supposed 2kHz square wave test data. Prior to this I had nothing. So that's something at least but it's still broken though. Painfully slow to debug with just pure PASM right now and my fairly limited debug environment at this stage, but I'm working on it. I'm adding another COG to collect and output serial data items of choice to see if there is something else corrupted.
:frown:
Did you scope the clicks to see if they were maybe part or whole packets? Does the spacing of them give any useful info?
I don't know if this would apply to HDMI. It shouldn't because HDMI should signal the type of data being sent. But I suppose there is a chance that a TV could decide that a perfect squarewave was invalid data. Maybe a sawtooth would help with troubleshooting buffering issues.
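If a sawtooth is wanted as test data, something like the Python sketch below would generate one. The function name and the 16-bit signed range are assumptions for illustration. The attraction for debugging is that every step within a ramp is nearly the same size, so a dropped, repeated or reordered sample stands out as a step of the wrong size.

```python
def sawtooth_samples(sample_rate_hz, freq_hz, count, amplitude=0x7FFF):
    """Generate signed integer sawtooth samples.

    The period is sample_rate_hz // freq_hz samples; each period ramps
    linearly from -amplitude up to +amplitude and then wraps.
    """
    period = sample_rate_hz // freq_hz
    if period < 2:
        raise ValueError("need at least two samples per period")
    samples = []
    for n in range(count):
        phase = n % period
        samples.append(-amplitude + (2 * amplitude * phase) // (period - 1))
    return samples


# Two periods of a 2kHz sawtooth at 48kHz (24 samples per period).
test_data = sawtooth_samples(48_000, 2_000, 48)
```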
Well, the two packet headers are different in each, but I guess anything is possible if there is data corruption somewhere and if it ignores the CRC checks. I still want to test intentionally corrupting a CRC of the currently working video info frame to prove it ignores corrupted packets and stops changing into YCbCr mode, for example.
I currently send audio in the second packet of the two packet data island keeping the first packet free for either info frames or the 900Hz/1kHz clock regen packets. I will also try reversing the send sequence to see if that helps.
One thing I read in someone's FPGA implementation is they said some TVs are sensitive to having Vsync and Hsync edges synchronized. I will check this too, as I don't think my own PASM2 code is actually doing that. That is one possible difference between Verilog and PASM2 that could be part of it. (By the way, this is Roger, ozpropdev is Brian.)
I haven't yet, but they appear to be a slowly repeating sequence, maybe over about 10-15s or so, and take a few seconds to get started. It seems to sound the same for different sampling rates too. The click pattern is at a lowish volume, and didn't seem to change when I changed my audio FIFO sequence to nulls. Almost as if it was getting data from another area of hub memory. And as if it plays one single sample at a time, then waits a bit. Clock regeneration issue perhaps? When I am sending audio samples and regen packets I added a pin toggle output on the P2 and it seems to be sending samples at the right rate, ~32kHz or 48kHz etc, though I need to set up a smartpin to count pulses over a full second to prove my rates are precise, because it does fluctuate per scan line between one and two samples. It looks reasonable on the logic analyser right now.
I am still playing around...
I have this in my code
What happens if the streamer is in the middle of sending out a bunch of pixels and you change the value of the syncs register above while it is streaming? Is it ok to do? Is this S register parameter value latched once at the start when the xcont instruction is issued and reused for the entire duration when sending immediate data, or does this S register's data need to remain stable for the entire duration of the streaming (in this case above, for H_ACTIVE transfers)?
The D and S values in XINIT/XZERO/XCONT are latched by the streamer when the instruction executes. So, D and S can change within your code before and after with no ill effect. It only matters what the values are at the time the instruction executes, which may be a while before the actual streamer command executes, as it is buffered.