signed saturation

ManAtWork Posts: 2,176
edited 2020-05-06 12:35 in Propeller 2
What is the best method to limit the range of a 32 bit signed value to 16 bit signed? I'm looking for the reverse of the SIGNX instruction. Of course, I could do
  FLES x,##$00007FFF
  FGES x,##$FFFF8000
but that would require 4 longwords and 8 clocks. No big deal if only needed once. But I'll probably need this a lot (digital filtering).
  TESTB x,#31 wc
  BITC x,#16<<5+15
is shorter and faster. ...EDIT: and doesn't work, unfortunately. :(

Is there a more elegant method?

Comments

  • evanh Posts: 15,916
    The true reverse of SIGNX is just basic truncation. What you want is clipping or scaling. Clipping gets faster if the limits are preloaded into registers and addressed directly, which avoids the ## prefix on each instruction, e.g.:
      ...
      FLES x,limit16sp
      FGES x,limit16sn
      ...
    
    limit16sp  long $00007FFF
    limit16sn  long $FFFF8000
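
As a cross-check of the clipping behaviour, here is a minimal Python model of the FLES/FGES pair. It is only a sketch of the arithmetic, not P2 code; the helper names `to_signed32` and `clamp_s16` are made up for illustration.

```python
def to_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def clamp_s16(x):
    """Model of FLES x,limit16sp followed by FGES x,limit16sn."""
    x = to_signed32(x)
    x = min(x, 0x7FFF)                   # FLES: force x <= $00007FFF (signed)
    x = max(x, to_signed32(0xFFFF8000))  # FGES: force x >= $FFFF8000 (signed)
    return x
```

For example, `clamp_s16(32780)` yields 32767 and `clamp_s16(-40000)` yields -32768, while values already inside the 16 bit range pass through unchanged.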
    

    Scaling instead takes a little more guesswork to find suitable scale factors, but it does improve things. Only do the final clipping at the output stage, so that intermediate values can't wrap around due to forced truncation. I'd be inclined to try using the cordic and stick with 32-bit working data as much as possible. But for the fastest option, SCAS might be what you want, although that does require starting from 16-bit source data in the first place.
  • evanh Posts: 15,916
    Oh! A really basic scaling is a crafted low-bit-truncation (instead of high-bit-truncation which causes the dreaded wraparound) - for unsigned integers at least. Basically just a shift-right of a predetermined amount that will always bring the most significant "1" bit within the desired 16 bits. Obviously only provides powers of two scaling though.
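
The power-of-two scaling described above can be sketched in Python as follows. This is an illustration for unsigned values only; `shift_for_16_bits` and `scale_u16` are hypothetical names, and the largest expected input is assumed to be known in advance.

```python
def shift_for_16_bits(max_value):
    """Smallest right shift that brings max_value within 16 bits (unsigned)."""
    return max(0, max_value.bit_length() - 16)


def scale_u16(x, shift):
    """Drop the low bits instead of the high bits, so nothing wraps around."""
    return x >> shift
```

With a predicted maximum of 1,000,000 the shift is 4, and `scale_u16(1_000_000, 4)` gives 62500, which fits in 16 bits. The cost is that the four lowest bits of every sample are lost.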

  • I think you could test the high bit and then mux that into the upper 17 bits, but that's going to take at least 2 instructions as well. FLES/FGES with predefined register limits is probably clearer.

  • ersmith wrote:
    I think you could test the high bit and then mux that into the upper 17 bits, but that's going to take at least 2 instructions as well. FLES/FGES with predefined register limits is probably clearer.

    That's what the TESTB, BITC sequence does, but it doesn't deliver the desired result:
    If the 32 bit result is 32780 or $0000800C then the output is 12 or $0000000C rather than the desired 32767 or $00007FFF.

    My follow-up questions are: What is the maximum predicted excursion from 16 bit range? What can be done earlier to prevent excursions, and are the consequences worth the trade-off?
  • Here is a trick for signed to unsigned scaled conversion. May not be useful if you want signed output. If the subtraction result is less than 0, c is set. The scaling will be skipped and the output will be specified by a literal or other register.
    		fles    a3,posclamp             '5 prevent overflow of pixel
    		subs    a3,blacklevel  wc       '6 correct black level offset 
    	if_nc	scas    a3,az                   '7 if non-negative, scale pixel  
    		setbyte dy,#0,#3                '8 store scas result or 0
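
A rough Python model of the four instructions above, for readers who want to check the arithmetic. It assumes, as far as I understand the P2 instruction documentation, that SCAS computes the signed 16x16 product shifted right by 14 (so $4000 means 1.0) and substitutes it for the next instruction's source operand, and that all values fit in 16 bits. The function name `scaled_pixel` is hypothetical.

```python
def scaled_pixel(a3, posclamp, blacklevel, az):
    """Clamp, remove the black level, then scale; returns the byte to store."""
    a3 = min(a3, posclamp)           # fles: limit the pixel to posclamp
    a3 = a3 - blacklevel             # subs ... wc: C is set if the result < 0
    if a3 < 0:
        return 0                     # scas skipped, setbyte stores the literal 0
    return ((a3 * az) >> 14) & 0xFF  # scas result, low byte stored by setbyte
```

For instance, `scaled_pixel(100, 255, 16, 0x4000)` gives 84 (scale factor 1.0), and any pixel below the black level gives 0.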
    

  • Ok thanks. So the
      FLES x,limit16sp
      FGES x,limit16sn
    
    seems to be the best solution to my original question. But it's always good to see different solutions to different problems. As the documentation of the P2 instructions is still very sparse, I don't understand all of them and I'm never sure whether I've missed something. So any hints I can learn from are welcome.

    I'm still not sure which approach I'll use for the filtering. MUL/ADD is fastest but I don't know if 16 bits of resolution is sufficient. 32 bits of resolution using the cordic might have some advantages, but the CORDIC unit doesn't (directly) support signed multiplication. If one factor is constant, I could replace the multiplication with a rotation by arcsin(factor), which does support negative numbers.

    The P2 might even have enough power to use (software) floating point math. In this case I don't have to worry about scaling, overflows and clipping at all and only have to limit the final output.

    BTW, a nice trick is that FLE also works on floating point numbers but only if they are positive.
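
Why this works can be seen in a short Python sketch: the bit patterns of positive IEEE-754 single-precision floats sort in the same order as unsigned integers, while negative floats have the sign bit set and therefore look like very large unsigned values. The helper names are made up, and `fle_model` only mimics FLE as an unsigned minimum.

```python
import struct


def float_bits(value):
    """Raw 32-bit pattern of an IEEE-754 single-precision float."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_float(bits):
    """Inverse of float_bits."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def fle_model(d, s):
    """Model of FLE d,s: keep the smaller of two unsigned 32-bit words."""
    return min(d & 0xFFFFFFFF, s & 0xFFFFFFFF)
```

Limiting 2.5 to 1.0 this way gives 1.0 as expected, but limiting -3.0 to 1.0 also gives 1.0, which is wrong, so the trick only holds while both operands are positive.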