
How to compute 1/(1+exp(x))? CORDIC maybe? Solution Found!

Bean Posts: 8,129
edited 2012-05-31 14:52 in General Discussion
I'm working on a project where I need to compute 1/(1+exp(x))*65536 for a given x in 16.16 fixed point format.
Is it possible to use CORDIC to compute this? If not, what other methods might work?
We need as much precision as possible.
This is for the SX chip, so I can't use the Propeller log tables either.

Any ideas guys...?

[edit] Solution found. See post on page 2

Bean

Comments

  • plainsteve Posts: 33
    edited 2012-02-10 18:54
    What is the range and precision of x?
  • Leon Posts: 7,620
    edited 2012-02-11 03:03
    Calculating exp(x) as a series is a good technique; I've used it for other functions with Analog Devices fixed-point DSP chips. It should only take about five terms with a 16-bit processor. Here is the relevant chapter from one of their books:

    http://www.leonheller.com/ADI/Chapter_4.pdf

    It was easier to upload it than it was to find it on ADI's web site.
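    As a rough illustration of the series idea (a sketch in C rather than SX code, assuming 16.16 fixed point and an argument already range-reduced to |x| < 1, since the raw series is only practical near zero):

    #include <stdint.h>

    // Five-term Maclaurin series for exp(x); x and result in 16.16 fixed point.
    // Only reasonable for |x| < 1; larger arguments need range reduction first.
    // The 64-bit product is for clarity; a 16-bit chip would do it in pieces.
    int32_t exp_series_q16(int32_t x)
    {
        int32_t term = 1 << 16;                  // current term, starts at 1.0
        int32_t sum  = 1 << 16;                  // running sum, starts at 1.0
        for (int n = 1; n <= 4; n++) {
            term = (int32_t)(((int64_t)term * x) >> 16) / n;   // term *= x/n
            sum += term;
        }
        return sum;                              // ~ exp(x) in 16.16
    }

    For 1/(1+exp(x)) you would still need range reduction and a fixed-point divide on top of this.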
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2012-02-11 09:26
    Bean,

    What's X's domain (min to max)?

    -Phil
  • DavidSmith Posts: 36
    edited 2012-02-12 19:30
    Precision or accuracy?
  • Bean Posts: 8,129
    edited 2012-02-13 04:15
    Phil Pilgrim (PhiPi) wrote: »
    Bean,

    What's X's domain (min to max)?

    -Phil

    Phil,
    The "x" value is could be anything in the positive range (the negative range is just a mirror image around 0.5). So that's zero to 32767.999984.

    Bean
  • Bean Posts: 8,129
    edited 2012-02-13 04:17
    DavidSmith wrote: »
    Precision or accuracy?

    I'm thinking that precision is more important. Accuracy is not, as long as the result is 100% repeatable. We can deal with accuracy on the PC side.

    Bean
  • Bean Posts: 8,129
    edited 2012-02-13 04:21
    Just to pique your interest, this will be used in a neural network experiment.
    In its basic form it is doing a curve fit.
    We have two (or more) input variables, so I guess that would be a 2D curve fit.

    Bean
  • Kye Posts: 2,200
    edited 2012-02-13 07:12
    Lookup table!
  • Mark_T Posts: 1,981
    edited 2012-02-13 09:50
    Kye wrote: »
    Lookup table!
    Are you really suggesting a word-sized lookup table of size 2^32?
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2012-02-13 10:06
    Don't forget: this is an SX we're talking about. It doesn't even do 16-bit math natively.

    -Phil
  • DavidSmith Posts: 36
    edited 2012-02-13 10:31
    Bean,

    This sounds like a classic engineering design problem. Fun.

    First, we need better specs.

    Why are you using the SX? Interface, cost (meaningless on a one-off), space, weight, etc.

    Significant places of your input. Wanting maximum precision or significant places is nice, but meaningless if your input/hardware has a limit.

    What is your turnover requirement (is that the right word guys?). If you need a million a second, forget the SX and start looking for a (more expensive) specialized chip. If you only need a response every second or so, calculation by series is great and you have the time to carry it out to any level of precision you can conceive.

    Next, if I read your description right, you will have input values of 1 to 32,000 with Y decimal places after that. Y may be specifying an accuracy that even the commercial lookup tables can't manage. (PARALLAX, how many entries in the Propeller log tables and how many places?)

    Unless you have a reason for using ONLY the SX and adding no additional hardware, I would say your best bet is a series calculation. Look it up on the web. I found several sites discussing different algorithms for calculating logs. Some must be smaller/more efficient than others.

    Also, be warned. Most hardware has an inherent limit on significant digits. Years ago I worked on a satellite problem that required 20 decimals of accuracy. We found out the hard way that it is NOT possible to double precision a value you have already double precisioned in FORTRAN. It required some special programming outside of the internal math functions.
  • ilovepi Posts: 9
    edited 2012-02-13 10:41
    Bean wrote: »
    Just to pique your interest, this will be used in a neural network experiment.

    Perhaps this may help: the Newton's handwriting recognizer neural networks were trained and then quantized. Their paper suggests that non-bias weights can be quantized down to a byte:

    "Combining Neural Networks and Context-Driven Search for On-Line, Printed Handwriting Recognition"
    http://aaaipress.org/ojs/index.php/aimagazine/article/download/1355/1255
  • Bean Posts: 8,129
    edited 2012-02-13 11:43
    David,
    The target device size is a 14-pin DIP in a sealed metal can. The Propeller is way too large. Cost is not really a factor (within reason).
    Speed is not a big concern either, about 500 exp calcs per second.
    I'm working on something that seems to work, but it needs more precision. I'll keep you all updated.
    Sorry, I cannot give more detail, but this is a work project.

    Bean
  • Tracy Allen Posts: 6,664
    edited 2012-02-13 14:44
    If accuracy is not as important as precision, repeatability, and easy implementation, maybe the bitExp function would work. It is the inverse of bitLog. There is a treatment of it in Jack Crenshaw's Math Toolkit for Real-Time Programming. I've implemented it on the BASIC Stamp, where the starting point is the NCD or DCD operator. That would give the exponential as an intermediate step, and then you are left with computing the function of that, 1/(1+e^x), which ranges between 0 and 1 and for large values of x quickly becomes indistinguishable from e^-x. How large an input argument will be necessary? e^11 is about 16 bits. The output range of 0 to 1 can be scaled as a sixteen-bit binary fraction with the radix point implied to the left of the msb.
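    A minimal sketch in C of the bit-log/bit-exp idea (not Crenshaw's exact code, just the scheme: the integer part of log2 comes from the position of the top set bit, and the fraction is taken linearly from the three bits just below it):

    #include <stdint.h>

    // Approximate 8*log2(x) for x > 0: the MSB position supplies the integer part,
    // the next three bits supply a linear (interpolated) fraction.
    uint16_t bitlog(uint32_t x)
    {
        int b = 31;
        while (b > 0 && !(x & (1UL << b)))       // find the most significant set bit
            b--;
        uint32_t frac = (b >= 3) ? (x >> (b - 3)) & 7 : (x << (3 - b)) & 7;
        return (uint16_t)((b << 3) | frac);
    }

    // Inverse: rebuild an approximation of 2^(L/8) from the same representation.
    uint32_t bitexp(uint16_t L)
    {
        int b = L >> 3;                          // integer part of log2
        uint32_t m = 8 | (L & 7);                // implied leading 1 plus three fraction bits
        return (b >= 3) ? (m << (b - 3)) : (m >> (3 - b));
    }

    On the Stamp the NCD operator does the top-bit search in one step; everything else is shifts and masks, which is what makes this attractive on small chips.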
  • Kye Posts: 2,200
    edited 2012-02-14 06:06
    @Phil & Mark

    Was not serious. :-)
  • Tracy Allen Posts: 6,664
    edited 2012-02-14 09:55
    It's not a bad idea, Kye, combined with interpolation. The bitexp and bitlog functions rely on a strategic interpolation and the exponential nature of the positional representation of numbers. An interpolation can be counted on for monotonicity and repeatability, and this function is quite well behaved.
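    For what it's worth, a sketch in C of what that combination could look like: a 129-entry table over x = 0..8 with linear interpolation between entries. The step size and table length are arbitrary choices for the sketch, and the table is built with math.h here only for illustration; on the SX it would be precomputed and stored as constants.

    #include <stdint.h>
    #include <math.h>

    #define SIG_STEPS  128                       // table covers x = 0..8 in 1/16 steps
    #define SIG_XMAX   (8 << 16)                 // 8.0 in 16.16 format

    static uint16_t sig_tab[SIG_STEPS + 1];

    // Build the table once at startup: sig_tab[i] = 65536/(1+exp(-x)) at x = i/16.
    void sig_init(void)
    {
        for (int i = 0; i <= SIG_STEPS; i++) {
            uint32_t v = (uint32_t)(65536.0 / (1.0 + exp(-(double)i / 16.0)) + 0.5);
            sig_tab[i] = (v > 65535) ? 65535 : (uint16_t)v;
        }
    }

    // 1/(1+exp(-x)) for x in 16.16, result in 16.16: lookup plus linear
    // interpolation, so it is monotone and exactly repeatable.
    uint32_t sigmoid_lut(int32_t x)
    {
        uint32_t ax = (x < 0) ? (uint32_t)(-(int64_t)x) : (uint32_t)x;
        if (ax >= SIG_XMAX)                      // beyond x = 8 the curve is nearly flat
            return (x < 0) ? 65536 - sig_tab[SIG_STEPS] : sig_tab[SIG_STEPS];
        uint32_t idx  = ax >> 12;                // which 1/16-wide segment
        uint32_t frac = ax & 0xFFF;              // position within the segment
        uint32_t y0 = sig_tab[idx], y1 = sig_tab[idx + 1];
        uint32_t y  = y0 + (((y1 - y0) * frac) >> 12);
        return (x < 0) ? 65536 - y : y;          // mirror about 0.5 for negative x
    }

    Halving the step size quarters the interpolation error, so the table size can be traded directly against the error budget.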

    Here is a graph of the function, along with graphs of similar functions arctan and tanh scaled to the 0 to 1 range.

    [attached graph graphs_3.png: the sigmoid alongside arctan and tanh scaled to the 0-to-1 range]


    A series representation of 1/(1+exp(x)) converges slowly for larger values of the argument. Here is the Maclaurin expansion around x=0. At large x (meaning anything much greater than 1), it involves tiny differences between very large numbers. I haven't pursued it, but I question if it converges at all for x>1.

    [attached image Maclaurin.png: the Maclaurin expansion of 1/(1+exp(x)) about x=0]
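    (For reference, the first few terms of that expansion are 1/(1 + e^x) = 1/2 - x/4 + x^3/48 - x^5/480 + ..., so it is only practical for small x.)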
  • Bean Posts: 8,129
    edited 2012-02-14 11:30
    Tracy,
    I'd like to try using ATAN(x) instead of 1/(1+exp(x)), but the cordic ATAN routine needs two parameters, as it computes ATAN2(y, x).
    Is there a way to use the ATAN2 cordic routine with a single parameter?

    Bean
  • Tracy Allen Posts: 6,664
    edited 2012-02-14 12:22
    It is computing based on the ratio, Y/X, so you simply need to choose a convenient value for X, and then have Y range up to say 10 or 15 times X, depending on how close you need to take it to the limit within the precision allotted. As you know from your own Prop CORDIC for ATN, the accuracy is best when both Y and X are "large" numbers, so the function usually has a wrapper to scale up small numbers. For example, something like (X,Y) = (1,1) is scaled up to, say, (X,Y) = (5000, 5000). The answer (45°) is the same in either case. The length of the vector is different of course, but who cares? That choice for X would allow Y to range up to 10*X within a 16-bit framework, with input steps of 1/5000.
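    In other words, a small sketch in C, using the library atan2 just to show the call pattern (on the SX the same two arguments would go to the fixed-point CORDIC routine, and 5000 is just the convenient scale mentioned above):

    #include <math.h>

    // atan of a single argument r from a two-argument ATAN2(Y, X) routine:
    // fix X and feed Y = r*X.  Scaling both up keeps the CORDIC inputs large,
    // where its accuracy is best; the length of the vector is simply ignored.
    double atan_via_atan2(double r)
    {
        const double X = 5000.0;
        return atan2(r * X, X);                  // same angle as atan(r)
    }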
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2012-02-14 12:41
    I ran across an interesting paper that shows how to approximate the sigmoid function 1/(1 + e^-x) using just additions and shifts:

    (Please note that the sign of x in the equation for use in neural nets is supposed to be negative.)

    It produces a piecewise linear approximation, where the number of pieces grows as a power of the number of iterations. So I put it to the test, using the following Spin program:
    CON
    
      _clkmode      = xtal1 + pll16x
      _xinfreq      = 5_000_000 
    
    OBJ
    
      sio   :       "FullDuplexSerial"
    
    PUB  start | x, y1, y2, g, gp, h, delta
    
      sio.start(31, 30, 0, 38400)
      repeat x from -constant(10 << 16) to constant(10 << 16) step constant(20 << 9)
        sio.dec(x)
        sio.tx(" ")
        sio.dec(sigmoid(x))
        sio.tx(13)
    
    PUB sigmoid(x) : g | y1, y2, gp, h, delta, sx
    
        sx := x
        x := -||x
        delta := 17288              '0.26380 * 65536
        g~
        y1~
        h := y2 := 32768 + (x ~> 2)
        repeat 4
          gp := g #> h
          h := (g + h + delta) ~> 1
          g := gp
          delta >>= 2
        g #>= h
        if (sx > 0)
          g := 65536 - g
    

    This code assumes a 16.16 fixed-point scale. I then exported the data from the program to Excel (file attached), normalized the values to floating-point, and created a graph that plots the approximate value and the "real" value:

    [attached chart: the approximation plotted against the exact sigmoid]

    The agreement looks pretty good for small absolute values of x, but quickly reaches its limiting values outside the interval [-4, 4]. I think the method could be improved somewhat over what was presented in the paper, but I'm not sure how just yet.

    -Phil
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2012-02-15 11:06
    Here's another, non-recursive algo this time, that yields a piecewise linear approximation using only adds, subtracts, and shifts. It came from this paper:

    It seems to have better behavior than the previous algo in the tail regions. Here's my Spin code:
    CON
    
      _clkmode      = xtal1 + pll16x
      _xinfreq      = 5_000_000 
    
    OBJ
    
      sio   :       "FullDuplexSerial"
    
    PUB  start | x, y1, y2, g, gp, h, delta
    
      sio.start(31, 30, 0, 38400)
      repeat x from -constant(10 << 16) to constant(10 << 16) step constant(20 << 9)
        sio.dec(x)
        sio.tx(" ")
        sio.dec(sigmoid(x))
        sio.tx(13)
    
    PUB sigmoid(x) : y | sx, xi, xf
    
      sx := x                                           ' remember the sign of x
      ||x                                               ' work with |x|
      xi := x.word[1]                                   ' integer part of |x| (16.16)
      xf := x.word[0]                                   ' fractional part of |x|
      y := 65536 + ((xf >> 1) - 65536) ~> (||xi + 1)    ' y = 1 - (1 - frac/2) * 2^-(int+1)
      if (sx < 0)
        y := 65536 - y                                  ' mirror about 0.5 for negative x
    

    And here's a chart produced from the above Excel file that compares the approximation to the real thing:

    [attached chart: the approximation plotted against the exact sigmoid]

    -Phil
  • Bean Posts: 8,129
    edited 2012-02-15 16:22
    Phil, that graph looks pretty good. I'll have to try that code.
    We have pretty much decided to use atan(x) and see how that works out.
    Bean
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2012-02-15 16:36
    Bean,

    Any continuous sigmoid-shaped function should work. There's certainly nothing magic about either the exponential or the atan form. You just want something that's relatively smooth and fast to compute. BTW, I'm pretty sure I can make the above approximation smoother yet. One moment, please ...

    -Phil
  • Tracy Allen Posts: 6,664
    edited 2012-02-15 19:41
    Back-propagation depends on the first derivative, so it is important to have that available as easily as the function itself.

    The second paper Phil found points out that the first derivative of the classic sigmoid is convenient, in the sense that if you have the fixed-point binary value for f(x), then the 1st derivative is simply the original function times its own twos complement:
    f'(x) = f(x) * (1 - f(x))

    For atan(x), the first derivative is 1/(1 + x^2), which could be harder, except that if you are doing it with CORDIC, sqrt(1 + x^2) drops out of the algorithm as the length of the vector (which turns out to be important after all!), and you are left with squaring that, taking the inverse, and scaling, which is manageable but more work than the sigmoid would be.

    That paper also pointed out that the first derivative of the hyperbolic tangent has a simple relation to the function, that is, f'(x) = 1 - f^2(x): the twos complement of the square of the function itself at x. CORDIC can calculate tanh, but I think it does so by first finding sinh and cosh, followed by the quotient.
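    As a one-line illustration of why that is convenient in fixed point (a sketch in C, assuming the sigmoid value f is already held in 16.16 format):

    #include <stdint.h>

    // Derivative of the logistic sigmoid from its own value, in 16.16:
    // f'(x) = f(x) * (1 - f(x)), i.e. f times its complement with respect to 1.0.
    uint32_t sigmoid_deriv_q16(uint32_t f)       // f in [0, 65536]
    {
        return (uint32_t)(((uint64_t)f * (65536 - f)) >> 16);
    }

    At f = 0.5 this gives 0.25, the peak of the derivative, with no table or transcendental work needed during back-propagation.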
  • User Name Posts: 1,451
    edited 2012-02-16 12:42
    Just as a note of encouragement, Bean: Anything the HP-35 could do, the SX can do better. And the HP-35 would have computed this with considerable accuracy and repeatability.
  • Bean Posts: 8,129
    edited 2012-05-31 14:52
    Just in case anyone is still interested in this... here is my solution.
    ' CORDIC Demo program to compute 1/(1+Exp(-x))
    ' Values are in 16.16 format (1/65536 units).
    '
    
    CON
      _clkmode = xtal1 + pll16x
      _xinfreq = 5_000_000
    
    
    OBJ
      Debug : "FullDuplexSerial"
    
    VAR
      LONG X
        
    
    PUB Start
      Debug.Start(31, 30, 0, 115200)
      WaitCnt(clkfreq * 4 + cnt) ' Wait four seconds for user to start PST
    
      Debug.Str(string(13, "CORDIC Exp Test", 13))
    
      X := 204800 ' 3.125 in 16.16 format
    
      Debug.Str(String(" X = "))
      Dec1616(X)
    
      ' Perform X=Exp(-X)
      X := Exp(-X)
    
      ' X=X+1
      IF X < $7FFF_0000
        X := X + 65536 ' +1
      ELSE
        X := $7FFF_FFFF
    
      ' X = 1/X
      X := (($1000_0000 / X) << 4) + (($1000_0000 // X) << 4) / X
    
      Debug.Str(String(13, "1/(1+Exp(-X)) = ")) 
      Dec1616(X)
      Debug.Tx(13)
             
    
    PUB Dec1616(given) | temp ' Display a 16.16 fixed point value
      if given < 0
        given := -given
        Debug.Tx("-")
      temp := given ~> 16
      Debug.Dec(temp)
      temp := given & 65535
      temp := temp * 10000
      temp := temp / 65536
      Debug.Tx(".")
      if temp < 1000
        Debug.Tx("0")
        if temp < 100
          Debug.Tx("0")
          if temp < 10
            Debug.Tx("0")
      Debug.Dec(temp)    
    
    
    PUB Exp(given) : expValue | i  ' Return Exp(given) as a 16.16 fixed point value
      expValue := 1 * 65536                        ' Make result 1.0 to start
    
      IF given < 0                                 ' If given is negative then...
        -given                                     ' Use absolute value                                   
        -expValue                                  ' Make result -1.0 to start                       
    
      IF given > 681391                            ' If given is too large then...
        IF expValue > 0
          expValue := $7FFF_FFFF                     ' Result is maximum value
        ELSE
          expValue := $8000_0000  
    
      ELSE
        given <<= 8                                ' Shift given value because table is in 8.24 fixed point format (for better resolution)  
        REPEAT i FROM 0 TO 24                      ' For each table entry...
          REPEAT WHILE given => Ln_Table[i]        ' I'm pretty sure only the first entry is ever repeated, but just to be safe
            given -= Ln_Table[i]                   ' Adjust given value
            expValue += (expValue ~> i)            ' Adjust result value                
    
      IF expValue < 0                              ' If this was Exp of a negative value then...
        -expValue                                  ' Use absolute value of result
         expValue := (($1000_0000 / expValue) << 4) + (($1000_0000 // expValue) << 4) / expValue
    
    
    DAT
      ' Ln_Table is the value of Round(Ln(1+2^-i)*2^24) [i=0,1,2,3,...]
      Ln_Table LONG 11629080,6802576,3743728,1976071,1017112,516263,260117,130563,65408,32736,16376,8190,4096,2048,1024,512,256,128,64,32,16,8,4,2,1
    
    

    Bean
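    For anyone who wants the same method in C (on the SX or elsewhere), here is a rough, untested sketch of the shift-and-add core above, handling positive x only; negative x is the reciprocal, exactly as the Spin code does:

    #include <stdint.h>

    // Table of round(ln(1 + 2^-i) * 2^24) for i = 0..24, copied from the DAT block above.
    static const uint32_t ln_tab[25] = {
        11629080, 6802576, 3743728, 1976071, 1017112, 516263, 260117, 130563,
        65408, 32736, 16376, 8190, 4096, 2048, 1024, 512, 256, 128, 64, 32,
        16, 8, 4, 2, 1
    };

    // exp(x) for x >= 0 in 16.16 format; result in 16.16, saturated on overflow.
    uint32_t exp_q16(uint32_t x)
    {
        if (x > 681391)                          // ln(32768)*65536: result would overflow 16.16
            return 0x7FFFFFFF;
        uint32_t y   = 1u << 16;                 // result starts at 1.0
        uint32_t rem = x << 8;                   // promote the argument to 8.24, like the table
        for (int i = 0; i < 25; i++) {
            while (rem >= ln_tab[i]) {           // peel off ln(1 + 2^-i) from the argument...
                rem -= ln_tab[i];
                y  += y >> i;                    // ...and multiply the result by (1 + 2^-i)
            }
        }
        return y;                                // ~ exp(x) in 16.16
    }

    1/(1+exp(-x)) then follows the same steps as the Start method: add 1.0 and take the 16.16 reciprocal.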