Two pieces of code which should do the same, but they don't ?!?
Patrick1ab
Posts: 136
Hi everyone,
I'm scratching my head over why these two pieces of PASM code behave differently, although they are supposed to do exactly the same thing. My intention is to read the color information from a 960-byte array located at address t1 and convert it into another format, so I can send it to my LCD.
For example:
source bbbbbbbb_gggggggg_rrrrrrrr ...
result rrrrrrgg_ggggbbbb_bb000000 ...
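For readers who want to check the bit layout away from the hardware, here is a minimal Python model of the conversion, built from the formula given in the comment of the first listing. The names pack_pixel and convert are made up for this sketch and are not part of the original code.

```python
def pack_pixel(blue, green, red):
    """Hypothetical helper: keep the top 6 bits of each colour byte and
    place them at bits 31..14, as the formula in listing 1 does."""
    return ((blue & 0xFC) << 12) | ((green & 0xFC) << 18) | ((red & 0xFC) << 24)


def convert(valuechain):
    """Turn a packed b,g,r byte sequence into one 32-bit value per pixel."""
    return [pack_pixel(*valuechain[i:i + 3])
            for i in range(0, len(valuechain) - 2, 3)]
```

Printing format(pack_pixel(b, g, r), "032b") for a few test colours shows which bits each component lands in, which is handy for comparing against register dumps written down by hand.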
1.
:loop         rdbyte  arg0, t1              ' create a long out of 3 bytes of valuechain...
              and     arg0, #$FC            ' ...using the following pattern:
              shl     arg0, #12             ' ((valuechain[x*3+0]&$FC)<<12)|((valuechain[x*3+1]&$FC)<<18)|((valuechain[x*3+2]&$FC)<<24)
              mov     rgb, arg0             ' where x is the current pixel
              add     t1, #1
              rdbyte  arg0, t1
              and     arg0, #$FC
              shl     arg0, #18
              or      rgb, arg0
              add     t1, #1
              rdbyte  arg0, t1
              and     arg0, #$FC
              shl     arg0, #24
              or      rgb, arg0
              mov     arg0, rgb             ' write this long sized value into arg0
              call    #fast_data_out_       ' and pass it on to fast_data_out
              add     t1, #1                ' increment t1 to get to the first byte of the next pixel
              djnz    t3, #:loop            ' decrement the number of pixels by one and jump if there are further pixels
This one is working fine, but it's not very efficient since I have three rdbytes (worst case 3*22 clock cycles) for every pixel. So I created this one instead:
2.
:loop         rdlong  rgb, t1
              and     rgb, mask24bit
              shr     rgb, #7
              movi    arg0, rgb
              shr     rgb, #8
              rol     arg0, #6
              movi    arg0, rgb
              shr     rgb, #8
              rol     arg0, #6
              movi    arg0, rgb
              ror     arg0, #12
              call    #fast_data_out_
              add     t1, #3
              djnz    t3, #:loop
              ...
mask24bit     long    $FCFCFC00
When I try to use this code, the picture gets a grayish look and there are some ugly vertical stripes in it. As if something was missing or shifted to the wrong position.
I already tried to figure it out by writing down the register contents after every step. I had no success so far.
Maybe it's because I'm cheating several times... reading long values from a byte array, reading one byte twice all the time?
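On the question of reading longs from a byte array: on the Propeller 1, rdlong ignores the two lowest bits of the hub address, so stepping the pointer by 3 does not step the data by 3 bytes. The Python sketch below models only that addressing behaviour. The function name rdlong and the hub buffer are stand-ins for illustration, not real hardware access.

```python
import struct


def rdlong(hub, addr):
    """Model of a Propeller 1 RDLONG: the two lowest address bits are
    ignored and the four bytes are assembled little-endian."""
    aligned = addr & ~3
    return struct.unpack_from("<I", hub, aligned)[0]


def addresses_used(pixel_count):
    """Pairs of (requested address, address actually read) for a pointer
    that is advanced by 3 per pixel, as in listing 2."""
    return [(addr, addr & ~3) for addr in range(0, pixel_count * 3, 3)]
```

With this model, addresses_used(4) shows that pixels 0 and 1 both fetch the long at address 0, so the second listing never sees the bytes in the positions it expects.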
Comments
Re: hub sync, you could arrange your code so that the second and third rdbyte are in sync (it uses an extra arg1):
thanks for your reply.
I think I understand... So, now I'm in trouble, I guess.
I could do a rdword and a rdbyte afterwards, but that reduces the efficiency again.
Another possibility would be to keep reading longs and store the rest of the long in case another pixel follows. Since this remainder contains a different number of bytes on each loop iteration, this will become a real mess.
Not necessarily. Either you sync your code similar to what I listed above or you could unroll your loop a bit (space permitting) in that you use 3 rdlongs in a row with slightly different extraction code which would give you 4 pixels per loop cycle (as you pointed out, it doesn't have to be messy though).
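As a rough illustration of the unrolled variant, here is a Python sketch of how three consecutive longs (12 bytes) split into four 24-bit pixels, each with its own slightly different extraction. It assumes the longs are read from long-aligned addresses and assembled little-endian. split_four is a hypothetical name, and the PASM equivalent would use shifts and masks in the same places.

```python
import struct


def split_four(l0, l1, l2):
    """Split three little-endian longs into four 24-bit b,g,r groups.
    Each result holds blue in bits 7..0, green in 15..8, red in 23..16."""
    p0 = l0 & 0xFFFFFF                      # bytes 0, 1, 2
    p1 = (l0 >> 24) | ((l1 & 0xFFFF) << 8)  # byte 3 plus bytes 4, 5
    p2 = (l1 >> 16) | ((l2 & 0xFF) << 16)   # bytes 6, 7 plus byte 8
    p3 = l2 >> 8                            # bytes 9, 10, 11
    return [p0, p1, p2, p3]


def longs_from_bytes(data):
    """Read 12 bytes as three little-endian longs (stand-in for 3 rdlongs)."""
    return struct.unpack("<3I", data)
```

Only the middle two pixels need bytes from two different longs, so the extra bookkeeping stays confined to two short code paths.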
The problem with this solution is that if the number of pixels in the picture is not divisible by 4, I will get several lines with "random" colors. So I would have to insert three additional abort mechanisms.
You could also load the picture with padding (line by line). Your call.
One loop reading 4 pixels at once and another loop calling fast_data_out_ according to the number of pixels left. So maybe I read too many values, but they are not being sent to the display ;-)
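A small Python sketch of that idea, under the assumption that reading a few bytes past the end of the array is harmless (modelled here by zero padding): pixels are always fetched in groups of four, but only as many as are actually left get sent. convert_line and pack_pixel are names invented for this example, and the returned list stands in for the calls to fast_data_out_.

```python
def pack_pixel(blue, green, red):
    """Hypothetical helper: 6 bits per colour, packed into bits 31..14."""
    return ((blue & 0xFC) << 12) | ((green & 0xFC) << 18) | ((red & 0xFC) << 24)


def convert_line(data, pixel_count):
    """Fetch pixels four at a time, send only the ones that exist."""
    # Assumption: over-reading is harmless, modelled by padding to 12 bytes.
    padded = data + bytes(-len(data) % 12)
    sent = []
    for base in range(0, pixel_count * 3, 12):
        group = [padded[base + 3 * i: base + 3 * i + 3] for i in range(4)]
        remaining = pixel_count - base // 3
        for blue, green, red in group[:min(4, remaining)]:
            sent.append(pack_pixel(blue, green, red))
    return sent
```

For the 960-byte array from the first post (320 pixels) the leftover count is zero, so the second loop would only matter for other picture widths.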
I wrote a third piece of code which isn't working correctly. Address calculation should be fine now, because I'm incrementing the pointer by 4 and I'm only reading longs.
My idea was to change the order of the bits while pushing them out, instead of doing these "and, shl/r , rol/r, movi" register operations all the time.
Now I'm getting some kind of ghost images, where red, green and blue are on different lines instead of being unified into one pixel.
I'm just checking because it looks slightly different from your initial rdbyte example, where you read the first pixel's components from addr+0, addr+1, addr+2. Now (with rdlong) you assume the pixel components are stored at addr+1, addr+2, addr+3.
bbbbbbbb gggggggg rrrrrrrr 00000000
This data is converted into big-endian (with some other modifications like cutting off the lowest bits, etc.). In the initial version it was converted into a format like this, before sending it:
rrrrrrgg ggggbbbb bb000000
Now I'm not doing the conversion anymore, but I'm changing the bit order while sending the data. This is why I wrote a new "color_data_out" method.
The output should still be rrrrrrgg ggggbbbb bb.
Packed I assume, i.e. the 00000000 should be the blue component of the next pixel (and addresses are increasing from left to right)?
Anyway, in your rdbyte implementation you read blue, green and red in that order (first pixel). Now with rdlong you end up with a value like this (using above data): And in your code, after sending it out you shift everything left by #24, i.e. you keep the blue component of pixel 0 and throw away blue of pixel 1.
Oops, yes, the 00000000 should be bbbbbbbb of the next pixel.
I think I'm getting it now. The Propeller is working in big-endian. This has no effect if I just read byte by byte, but when it comes to reading longs I have to swap most and least significant bytes. Okay, I'll try to modify the code tomorrow.
Thanks for your help ;-)
The prop is little-endian. The register format is bit 31 left, bit 0 right (byte 3 (MSB) left, byte 0 (LSB) right), and the byte read from the lowest address (addr+0) ends up in the lowest byte (bits 7..0).
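To make the byte order concrete, here is a short Python model using made-up sample values: four bytes at increasing addresses, read as one little-endian long. It also shows which bytes the mask $FCFCFC00 from the second listing keeps. The names hub, value and components are illustrative only.

```python
import struct

# Sample data at addr+0 .. addr+3: blue0, green0, red0, blue1 (made-up values)
hub = bytes([0xB0, 0x60, 0x30, 0xB1])

# Little-endian read: the byte from addr+0 lands in bits 7..0
value = struct.unpack("<I", hub)[0]


def components(long_value):
    """Return (blue, green, red) of the pixel stored at addr+0..addr+2."""
    return (long_value & 0xFF,
            (long_value >> 8) & 0xFF,
            (long_value >> 16) & 0xFF)


kept_by_listing2_mask = value & 0xFCFCFC00   # drops addr+0, keeps addr+1..addr+3
kept_for_first_pixel = value & 0x00FCFCFC    # keeps addr+0..addr+2 instead
```

Under this model, a mask aimed at the first pixel would be $00FCFCFC rather than $FCFCFC00, which matches the observation above that the rdlong version looks at addr+1..addr+3.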