HDMI compatible audio with P2

This thread is started to discuss HDMI compatible audio as it might be cluttering up the P2-EVAL rev B thread.
Comments
I am using a Saleae logic analyzer to capture the TMDS bus at much lower than normal frequencies. It is clocked at 11.9MHz giving a pixel rate of 1.19MHz. The sampling is done at 12MHz (USB rate) and that gives me almost one TMDS sample every clock. Every now and again I get a duplicate because of this mismatch, but I'm living with it.
(In this case, channel 2 is red, channel 5 is green, channel 6 is blue, and channel 7 is TMDS clock - I lost the labels I had before I captured the pic for some reason).
I wrote a rudimentary decoder in C to process the data exported by the Logic software and to display the 10b codes on each channel aligned with the clock edge and also showing a debug bit on the rightmost column. Red is first, green second, blue third, then clock, then a debug pin. If it matches a TERC4 encoded 10b value the nibble mapped to that value is shown too.
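For anyone wanting to try the same thing, here is a minimal sketch of the first stage of such a decoder, written in Python purely for illustration (it is not the C tool described above). It assumes one captured byte per logic analyzer sample, with each TMDS lane on its own bit, and it starts a new word on every falling edge of the clock bit. The function name split_tmds_words and the falling-edge choice are assumptions made for this sketch.

```python
def split_tmds_words(samples, red_bit, green_bit, blue_bit, clk_bit):
    """Group one-bit-per-sample captures into per-channel bit strings.

    samples is a bytes object, one byte per logic analyzer sample.
    A new word starts on each falling edge of the clock bit (an assumption).
    """
    words = []
    red = green = blue = ""
    prev_clk = None
    for byte in samples:
        clk = (byte >> clk_bit) & 1
        # Falling clock edge: close the word collected so far.
        if prev_clk == 1 and clk == 0 and red:
            words.append((red, green, blue))
            red = green = blue = ""
        red += str((byte >> red_bit) & 1)
        green += str((byte >> green_bit) & 1)
        blue += str((byte >> blue_bit) & 1)
        prev_clk = clk
    if red:
        # Keep the trailing (possibly partial) word as well.
        words.append((red, green, blue))
    return words
```

A word that comes back with 11 bits instead of 10 is one where the sampling mismatch slipped in a duplicate sample, so it can be flagged and skipped.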
So with these tools I can now see what is being sent out on the bus. It's not pretty but it helps a lot. Every now and again (1 bit in 120, since sampling at 12MHz against an 11.9MHz bit clock slips one sample every 12/(12-11.9) = 120 bits), the clock brings in another sample due to the mismatch in timing (I really need a logic analyzer with a clock input). Ozpropdev's version could get this data spot on. I just mark those with a * so I can ignore the data.
This dump shows the start of the first packet (null packet in this case) I send in the data island. I temporarily made the guard band 2 pixels of all zeros so I could notice it easily on the logic analyzer. It's not meant to be using those values but I find it easier to debug.
Seems to me, given that small error, a few samples can be dropped over time with low impact.
Perhaps that can be done by looking at them. Find two adjacent samples that are very similar in size, say within some level threshold of each other, and just average them into one sample.
Compute the error, how many frames until an overrun happens, divide, and let the watcher just mooch a few samples out of the stream as needed to keep it sane.
The other thing to watch might be quiet regions. Apply same idea, but watch for near zero. Maybe that case is the fall back catch up case too. Multiple samples averaged into a few will have low impact.
This may not even be audible, except in pathological cases.
All one needs is frame level accuracy. Drop the right number at whole island intervals, and the error becomes hours, not minutes or seconds, or is gone, depending. More than good enough, IMHO.
If you had a source producing samples at 44.1kHz, you could make a tiny FIFO to buffer them and encode/transmit them as HDMI opportunities occur.
Or, you could just run a 32-bit NCO in software, updated twice on each scan line, to know when to encode/transmit new samples.
It would be interesting to know if the vertical blanking time is usable for audio without any at horizontal blanking time.
This tool took about 15mins to whip up and could potentially still be buggy, hope not, but YMMV.
To build and run:
RLs-MacBook-Pro:Documents roger$ gcc -o tm tm.c
RLs-MacBook-Pro:Documents roger$ ./tm
tm: decoder for TMDS binary data
Usage: tm <red_bit> <green_bit> <blue_bit> <clk_bit> [<dbg_bit>]
where each <xxx_bit> indicates which bit of the byte contains the data for channel xxx
and optional <dbg_bit> can be used for helping identifying events like vsync etc
RLs-MacBook-Pro:Documents roger$ ./tm 2 5 6 7 1 < dump2-all.bin
Decoding TMDS bits on binary input bytes:
sample bits      Red channel 2  Green channel 1  Blue channel 0  Clock  Debug
 1 *  2 ---> 10 01 00 00 11
 2   10 ---> 0111110000 1011110000 0111110000 0000011111 1111111111
 3   10 ---> 0111110000 1011110000 0111110000 0000011111 1111111111
 4   10 ---> 0111110000 1011110000 0111110000 0000011111 1111111111
 5   10 ---> 0111110000 1011110000 0111110000 0000011111 1111111111
 6   10 ---> 0111110000 1011110000 0111110000 0000011111 1111111111
 7   10 ---> 0111110000 1011110000 0111110000 0000011111 1111111111
 8   10 ---> 0111110000 1011110000 0111110000 0000011111 1111111111
 9   10 ---> 0111110000 1011110000 0111110000 0000011111 1111111111
10 * 11 ---> 01111100000 10111100000 01111100000 00000111111 11111111111
11   10 ---> 0111110000 1011110000 0111110000 0000011111 1111111111
12   10 ---> 0111110000 1011110000 0111110000 0000011111 1111111111
Cheers,
Roger.
That's kind of what it is doing. On each scanline of 858 pixels (H_TOTAL) at 480p resolution I add AUDIO_INC = 28*858 to an accumulator, and if the value is greater than LIMIT, I am allowed to pull a sample from the audio fifo and send it, subtracting LIMIT each time. I keep doing this until I either have an accumulator value below the LIMIT or I reach the 4 sub packet limit. Basically like an NCO in software. The factor of 28 keeps all values as integers with no fractions, and there are perfect relationships for all the basic sample rates and the P2 clock.
PROP_FREQ = (PIXEL_CLK * 10) ' TODO: compute PLL frequency from this?
' Audio parameters
SAMPLE_RATE = 32_000
'SAMPLE_RATE = 44_100
'SAMPLE_RATE = 48_000
AUDIO_MULT = 28
AUDIO_INC = (AUDIO_MULT * H_TOTAL)
LIMIT = (PIXEL_CLK * AUDIO_MULT / SAMPLE_RATE)
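To make the accumulator scheme concrete, here is a rough Python model of it (the real thing is PASM2; this is only a sketch). It assumes PIXEL_CLK = 27MHz, H_TOTAL = 858 and the 32kHz sample rate selected above, treats an accumulator value equal to LIMIT as also releasing a sample, and uses a hypothetical function name samples_per_line.

```python
PIXEL_CLK = 27_000_000      # assumed 480p pixel clock
H_TOTAL = 858               # pixels per scanline, from the post above
SAMPLE_RATE = 32_000
AUDIO_MULT = 28
AUDIO_INC = AUDIO_MULT * H_TOTAL                    # 24_024 added per scanline
LIMIT = PIXEL_CLK * AUDIO_MULT // SAMPLE_RATE       # 23_625 for these values
MAX_PER_LINE = 4            # the 4 sub packet limit

def samples_per_line(num_lines):
    """Return (samples released on each line, final accumulator value)."""
    acc = 0
    counts = []
    for _ in range(num_lines):
        acc += AUDIO_INC
        sent = 0
        # Release one sample per LIMIT's worth of accumulated phase.
        while acc >= LIMIT and sent < MAX_PER_LINE:
            acc -= LIMIT
            sent += 1
        counts.append(sent)
    return counts, acc
```

With these numbers every line releases either one or two samples, and after LIMIT scanlines the accumulator is back at zero with exactly AUDIO_INC samples sent, so the long-term average has no rounding drift.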
1. Ensuring outgoing audio is sent via HDMI at the exact freq (e.g. 32/44.1/48 kHz) for different sysclocks (e.g. 252/270 MHz).
2. Ensuring incoming audio is sampled at the exact freq (e.g. 32/44.1/48 kHz) for different sysclocks (e.g. 252/270 MHz) then output via HDMI.
As mentioned in this post
http://forums.parallax.com/discussion/comment/1478781/#Comment_1478781
issue 1 is not a problem. Provided N and CTS are integers, the audio clock can be recreated from the video clock by the HDMI sink (aka receiver) with no errors, approximations or resampling. Audio Clock Regeneration packets contain the values of N and CTS so the sink knows what to use.
Issue 2 has nothing to do with HDMI per se.
The new smart pin mode for bitstream capturing with external clock should help a great deal with issue 2.
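As a quick check on the N and CTS point, the audio clock regeneration relation is 128 * fs = f_TMDS * N / CTS, so CTS can be computed from the pixel clock once N is chosen. The sketch below is Python for illustration only; the function names are made up, and the N values used in the example (4096, 6272, 6144 for 32/44.1/48 kHz) are the commonly recommended ones rather than something stated in this thread.

```python
from fractions import Fraction

def cts_for(tmds_clock_hz, sample_rate_hz, n):
    """Return CTS such that 128 * fs = f_TMDS * N / CTS, or raise if not an integer."""
    cts = Fraction(tmds_clock_hz * n, 128 * sample_rate_hz)
    if cts.denominator != 1:
        raise ValueError("CTS is not an integer for this clock / rate / N combination")
    return int(cts)

def regen_packet_rate_hz(sample_rate_hz, n):
    """Rate at which CTS counts complete: 128 * fs / N."""
    return Fraction(128 * sample_rate_hz, n)
```

For example, cts_for(27_000_000, 32_000, 4096) gives 27000 and cts_for(27_000_000, 44_100, 6272) gives 30000, and the corresponding regeneration rates come out at 1kHz and 900Hz.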
I found the issue in the code, I had V and H sync pulses inverted in one last place where they shouldn't have been. That's what was breaking it.
The audio samples that in theory I am encoding are not playing yet on my TV but that might be a problem either in the packet header data, the encoding or getting packets from the fifo. Will have to dig further there...
I've also been enhancing my decoder program a bit too to decipher each sample further and print more details. The regular invalid (11 bit) sized samples do get annoying, but that's only an artefact of my sampling method. Still tidying this up and trying to align things nicer. It will be useful in general I think going into the future. It would be rather good for it to decode the island packets too and compute/validate the ECC bits, which should be possible once the data is robustly captured.
Decoding TMDS bits on binary input bytes:
sample  bits   Type                         Red(ch2)   Green(ch1) Blue(ch0)
....
82250 * 11 :-> RGB (FF_FE_FF)               01111100000 10111100000 0111110000
82251   10 :-> RGB (FE_01_FE)               0111110000 1011110000 0111110000
82252   10 :-> Blanking VH=3 +V +H          1101010100 1101010100 1010101011
82253   10 :-> Blanking VH=3 +V +H          1101010100 1101010100 1010101011
82254   10 :-> Blanking VH=3 +V +H          1101010100 1101010100 1010101011
82255   10 :-> Blanking VH=3 +V +H          1101010100 1101010100 1010101011
82256   10 :-> Blanking VH=3 +V +H          1101010100 1101010100 1010101011
82257   10 :-> Blanking VH=3 +V +H          1101010100 1101010100 1010101011
82258   10 :-> Blanking VH=3 +V +H          1101010100 1101010100 1010101011
82259   10 :-> Blanking VH=3 +V +H          1101010100 1101010100 1010101011
82260   10 :-> DataPreamble +V +H           0010101011 0010101011 1010101011
82261 * 11 :-> RGB (FF_FF_FE)               00110101011 00110101011 1111010101
82262   10 :-> DataPreamble +V +H           0010101011 0010101011 1010101011
82263   10 :-> DataPreamble +V +H           0010101011 0010101011 1010101011
82264   10 :-> DataPreamble +V +H           0010101011 0010101011 1010101011
82265   10 :-> DataPreamble +V +H           0010101011 0010101011 1010101011
82266   10 :-> DataPreamble +V +H           0010101011 0010101011 1010101011
82267   10 :-> DataPreamble +V +H           0010101011 0010101011 1010101011
82268   10 :-> DataGuard +V H-              0100110011 0100110011 0101100011
82269   10 :-> DataGuard +V H-              0100110011 0100110011 0101100011
82270   10 :-> TERC4 0000_0000_0010 +V H-   1010011100 1010011100 1011100100
82271   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82272   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82273 * 11 :-> RGB (FE_FE_FF)               10100111100 10100111100 0110001110
82274   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82275   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82276   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82277   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82278   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82279   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82280   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82281   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82282   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82283   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82284   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82285 * 11 :-> RGB (FE_FE_FF)               10100111100 10100111100 0110011110
82286   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82287   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82288   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82289   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82290   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82291   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82292   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82293   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82294   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82295   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82296   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82297 * 11 :-> RGB (FE_FE_FF)               10100111000 10100111000 0110011100
82298   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82299   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82300   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82301   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82302   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82303   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82304   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82305   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82306   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82307   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82308   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82309 * 11 :-> RGB (FE_FF_FF)               10100111000 10100111001 0110011100
82310   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82311   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82312   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82313   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82314   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82315   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82316   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82317   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82318   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82319   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82320 * 11 :-> RGB (FE_FE_FE)               10100011100 10100011100 0011001110
82321   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82322   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82323   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82324   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82325   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82326   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82327   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82328   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82329   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82330   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82331   10 :-> TERC4 0000_0000_1010 +V H-   1010011100 1010011100 0110011100
82332 * 11 :-> RGB (FE_FE_FE)               10100111100 10100111100 1011000011
82333   10 :-> TERC4 0000_0000_1011 +V +H   1010011100 1010011100 1011000110
82334   10 :-> DataGuard +V +H              0100110011 0100110011 1011000011
82335   10 :-> DataGuard +V +H              0100110011 0100110011 1011000011
82336   10 :-> Blanking VH=3 +V +H          1101010100 1101010100 1010101011
82337   10 :-> Blanking VH=3 +V +H          1101010100 1101010100 1010101011
82338   10 :-> Blanking VH=3 +V +H          1101010100 1101010100 1010101011
82339   10 :-> Blanking VH=3 +V +H          1101010100 1101010100 1010101011
82340   10 :-> Blanking VH=3 +V +H          1101010100 1101010100 1010101011
82341   10 :-> Blanking VH=3 +V +H          1101010100 1101010100 1010101011
82342   10 :-> Blanking VH=3 +V +H          1101010100 1101010100 1010101011
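The classification step in a dump like this one boils down to a table lookup on each 10-bit word. The Python sketch below is illustrative only and is not the actual decoder: its tables contain just the codes that can be read off the dump above, so they are deliberately partial, and the names CONTROL, DATA_GUARD, TERC4 and classify are invented for the example.

```python
# Partial lookup tables, limited to codes visible in the dump above.
CONTROL = {
    "1101010100": 0,   # control value 0 (red/green lanes during blanking)
    "0010101011": 1,   # control value 1 (red/green lanes in the data preamble)
    "1010101011": 3,   # control value 3 (blue lane with +V +H)
}
DATA_GUARD = "0100110011"   # data island guard band on the red/green lanes
TERC4 = {
    "1010011100": 0b0000,
    "1011100100": 0b0010,
    "0110011100": 0b1010,
    "1011000110": 0b1011,
    "0101100011": 0b1110,   # also appears as the blue lane guard with +V H-
    "1011000011": 0b1111,   # also appears as the blue lane guard with +V +H
}

def classify(word):
    """Return (kind, value) for one TMDS word given as a string of bits."""
    if len(word) != 10:
        return ("invalid", None)      # e.g. an 11-bit word from a duplicated sample
    if word in CONTROL:
        return ("control", CONTROL[word])
    if word == DATA_GUARD:
        return ("guard", None)
    if word in TERC4:
        return ("terc4", TERC4[word])
    return ("other", None)            # video data or a code missing from the tables
```

Telling a blue lane guard band apart from an ordinary TERC4 nibble needs position context (the two words right after the preamble and right after the last packet), which this lookup alone does not provide.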
This fact alone is great because for the HDTV to be able to decipher this packet it proves that the encoding I am doing is working properly across all bit lanes of the TMDS channels and the ECC bits are computing ok too, unless the Pioneer TV ignored them which it shouldn't. I've been looking at this code all week and can't see anything wrong with it since I fixed that sync problem, so unless it is subtle or staring me in the face and I'm missing it, my code might actually be ok.
I recall when I used the P1V+HDMI audio FPGA stuff I put together some time ago with my HDTV I did have some issue in that some sample rates worked and some didn't, so I am now going to try a few other clocking combinations to see if I get some sound output in at least one of them. There are two different video pixel frequencies, 27MHz and 25.2MHz, to try along with three sample rates, 32/44.1/48kHz, plus different N and CTS values to use. Going to have a play with that some more... I might just have been using a combination (32kHz+27MHz) that it doesn't like.
I know, I might have to take you up on that offer next time I stop by...
Thanks. Looks good on the TV, but doesn't "sound" good yet.
There is one way I know I could get this going: if I resurrect my previously working FPGA P1V project on the FleaOhm, underclock it and redirect these HDMI bits to its Raspi compatible header pins for logic analyzer capture, then compare with my own data stream, I will find the difference between the working/non-working versions. I've been avoiding doing that until now as it is quite a bit of work to do and a big mental context switch from P2 asm back to Verilog coding, but if I get desperate soon enough I might have to go that far. That should find the problem for sure.
I just thought of something. In YCbCr, just dynamically encoding Y makes a nice, monochrome signal.
Would not having to encode so many bits improve on the ability to do higher resolutions? Only the 8 Y bits need to change.
Secondly, one could define static color regions, pre encoded perhaps.
No, the pixel rate is still the same at ~27MHz so while you could use less hub memory for a monochrome signal if you just used the Y channel, you still wouldn't increase the output resolution this way. To do that you'd need to overclock the P2 further. E.g., to 400MHz for 800x600 at 60Hz, perhaps somewhat less for devices supporting lower frame rates like some small LCDs, or for other monitors if they can handle reduced blanking.
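The clock figures follow from the pixel clock being total pixels per frame times refresh rate, with the P2 running at 10x the pixel clock (one sysclock per TMDS bit, as in PROP_FREQ = PIXEL_CLK * 10 earlier in the thread). A small Python sketch of that arithmetic, for illustration only: the function names are made up, and the 525-line total for 480p and the 1056x628 total for standard 800x600 timing are assumptions not stated in this thread.

```python
from fractions import Fraction

def pixel_clock_hz(h_total, v_total, refresh_hz):
    """Pixel clock = total pixels per line * total lines * frames per second."""
    return Fraction(h_total) * v_total * Fraction(refresh_hz)

def p2_sysclk_hz(pixel_clk_hz, bits_per_pixel=10):
    """P2 clock needed when each of the 10 TMDS bits takes one sysclock."""
    return pixel_clk_hz * bits_per_pixel

# 480p: 858 x 525 total at 60000/1001 Hz is exactly 27MHz, so 270MHz sysclock.
# 800x600: a 40MHz pixel clock over 1056 x 628 total is about 60.3Hz, so 400MHz.
```

Lowering the refresh rate or the blanking totals is the only way to bring the required sysclock down for a given active resolution, which is the point made above about small LCDs and reduced blanking.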
I imagine high resolution monochrome modes could be very appealing in some applications, for that perhaps component or VGA is the way to go.
I plan to use component in these ways myself.
I fixed it but was disappointed to find that is not the end of it. However, now I am hearing some clicking sounds from the TV from my supposed 2kHz square wave test data. Prior to this I had nothing. So that's something at least, but it's still broken. Painfully slow to debug with just pure PASM right now and my fairly limited debug environment at this stage, but I'm working on it. I'm adding another COG to collect and output serial data items of choice to see if there is something else corrupted.
Did you scope the clicks to see if they were maybe part or whole packets? Does the spacing of them give any useful info?
I don't know if this would apply to HDMI. It shouldn't because HDMI should signal the type of data being sent. But I suppose there is a chance that a TV could decide that a perfect squarewave was invalid data. Maybe a sawtooth would help with troubleshooting buffering issues.
Well, the two packet headers are different in each, but I guess anything is possible if there is data corruption somewhere and if it ignores the CRC checks. I still want to test intentionally corrupting a CRC of the currently working video info frame to prove it ignores corrupted packets and stops changing into YCbCr mode, for example.
I currently send audio in the second packet of the two packet data island keeping the first packet free for either info frames or the 900Hz/1kHz clock regen packets. I will also try reversing the send sequence to see if that helps.
One thing I read in someone's FPGA implementation is they said some TVs are sensitive to having Vsync and Hsync edges synchronized. I will check this too, as I don't think my own PASM2 code is actually doing that. That is one possible difference between Verilog and PASM2 that could be part of it. (By the way, this is Roger, ozpropdev is Brian.)
I haven't yet, but they appear to be a slowly repeating sequence, maybe over about 10-15s or so, and take a few seconds to get started. Seems to sound the same for different sampling rates too. Click pattern is at a lowish volume, and didn't seem to change when I changed my audio fifo sequence to nulls. Almost as if it was getting data from another area of hub memory. And as if it plays one single sample at a time, then waits a bit. Clock regeneration issue perhaps? When I am sending audio samples and regen packets I added a pin toggle output on the P2 and it seems to be sending samples at the right rate, ~32kHz or 48kHz etc, though I need to set up a smartpin to count pulses over a full second to prove my rates are precise, because it does fluctuate per scan line between one and two samples. It looks reasonable on the logic analyser right now.
I am still playing around...
I have this in my code
xcont m_blank, syncs 'do the blank portion of the line
...
' 10b TMDS data patterns:
syncs long %1101010100_1101010100_1010101011_10 'inactive H&V syncs
' The blank line streamer mode sends immediate data for the duration of the
' entire active line, and the HUB RAM is fully available in this interval.
m_blank long $7F810000+H_ACTIVE + (HDMI_GROUP << 20) 'blank line portion
What happens if the streamer is in the middle of sending out a bunch of pixels and you change the value of the syncs register above while it is streaming? Is it ok to do? Is this S register parameter value latched once at the start when the xcont instruction is issued and reused for the entire duration when sending immediate data, or does this S register's data need to remain stable for the entire duration of the streaming (in this case above, for H_ACTIVE transfers)?
The D and S values in XINIT/XZERO/XCONT are latched by the streamer when the instruction executes. So, D and S can change within your code before and after with no ill effect. It only matters what the values are at the time the instruction executes, which may be a while before the actual streamer command executes, as it is buffered.