As things stand, there is going to be an awful lot of this:
rdlut lut_data,lut_ptr ' or wrlut
Would it be possible to post-increment PTRA/PTRB when S = $1F8/$1F9 in RDLUT/WRLUT? The above code would then become:
rdlut lut_data,ptra ' or wrlut/ptrb
For the software-only 640x480x256 HDMI, this would cut a loop from 33 cycles to 31, a saving of about 6%, and shorter loops would benefit more. The TMDS encoding for rev B will be done in hardware, but that won't eliminate the wish or need to save every possible cycle when using the two LUT instructions.
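To put rough numbers on "shorter loops would benefit more", here is a quick sketch in Python. It assumes the saving is a fixed 2 cycles per loop iteration, as in the 33-to-31 case; the 16- and 8-cycle loops are hypothetical examples, not measured code, and `saving_percent` is just an illustrative helper.

```python
def saving_percent(loop_cycles, cycles_saved=2):
    """Percentage of loop time saved when cycles_saved cycles are removed.

    cycles_saved=2 is an assumption taken from the 33 -> 31 cycle example.
    """
    return 100.0 * cycles_saved / loop_cycles


# 33 is the HDMI loop from the post; 16 and 8 are hypothetical shorter loops.
for cycles in (33, 16, 8):
    print(f"{cycles:2d}-cycle loop: {saving_percent(cycles):.1f}% saved")
```

Under that assumption the 33-cycle loop saves about 6%, while an 8-cycle loop would save a quarter of its time.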