PDA

View Full Version : Solved How to computing 1/(1+exp(x)) ? cordic maybe ? Solution Found!



Bean
02-10-2012, 07:23 PM
I'm working on a project where I need to compute 1/(1+exp(x))*65536 for a given x in 16.16 fixed point format.
Is it possible to use cordic to compute this ? If not what other methods might work.
We need as much precision as possible.
This is for the SX chip, so I can't use the propeller log tables either.

Any ideas guys...?

[edit] Solution found. See post on page 2

Bean

plainsteve
02-11-2012, 02:54 AM
What is the range and precision of x?

Leon
02-11-2012, 11:03 AM
Calculating exp(x) as a series is a good technique, I've used it for other functions with Analog Devices fixed-point DSP chips. It should only take about five terms with a 16-bit processor. Here is the relevant chapter from one of their books:

http://www.leonheller.com/ADI/Chapter_4.pdf

It was easier to upload it than it was to find it on ADI's web site.

Phil Pilgrim (PhiPi)
02-11-2012, 05:26 PM
Bean,

What's X's domain (min to max)?

-Phil

DavidSmith
02-13-2012, 03:30 AM
Precision or accuracy?

Bean
02-13-2012, 12:15 PM
Bean,

What's X's domain (min to max)?

-Phil

Phil,
The "x" value is could be anything in the positive range (the negative range is just a mirror image around 0.5). So that's zero to 32767.999984.

Bean

Bean
02-13-2012, 12:17 PM
Precision or accuracy?

I'm thinking that precision is more important. Accuracy is not as long as it is 100% repeatable. We can deal with accuracy on the PC side.

Bean

Bean
02-13-2012, 12:21 PM
Just to pique your interest, this will be used in a neural network experiment.
In it's basic form it is doing a curve fit.
We have two (or more) input variables that are used. So I guess that would be a 2D curve fit.

Bean

Kye
02-13-2012, 03:12 PM
Lookup table!

Mark_T
02-13-2012, 05:50 PM
Lookup table! Are you really suggesting a wordsized lookup table of size 2^32?

Phil Pilgrim (PhiPi)
02-13-2012, 06:06 PM
Don't forget: this is an SX we're talking about. It doesn't even do 16-bit math natively.

-Phil

DavidSmith
02-13-2012, 06:31 PM
Bean,

This sounds like a classic engineering design problem. Fun.

First, we need better specs.

Why are you using the SX? Interface, cost (meaningless on a oneoff), space, weight, etc.

Significant places of your input. Wanting maximum precision or significant places is nice, but meaningless if your input/hardware has a limit.

What is your turnover requirement (is that the right word guys?). If you need a million a second, forget the SX and start looking for a (more expensive) specialized chip. If you only need a response every second or so, calculation by series is great and you have the time to carry it out to any level of precision you can conceive.

Next, if I read your description right, you will have input value of 1 - 32,000 and Y decimal places after that. Y may be specifying an accuracy that even the commercial lookup tables can't manage. (PARALLAX, how many entries in the Propeller log tables and how many places?)

Unless you have a reason for using ONLY the SX and adding no additional hardware, I would say your best bet is a series calculation. Look it up on the web. I found several sites discussing different algorithms for calculating logs. Some must be smaller/more efficient that others.

Also, be warned. Most hardware has an inherent limit on significant digits. Years ago I worked on a satellite problem that required 20 decimals of accuracy. We found out the hard way that it is NOT possible to double precision a value you have already double precisioned in FORTRAN. It required some special programming outside of the internal math functions.

ilovepi
02-13-2012, 06:41 PM
Just to pique your interest, this will be used in a neural network experiment.


Perhaps this may help: the Newton's handwriting recognizer neural networks were trained and then quantized. Their paper suggests that non-bias weights can be quantized down to a byte:

"Combining Neural Networks and Context-Driven Search for On-Line, Printed Handwriting Recognition"
http://aaaipress.org/ojs/index.php/aimagazine/article/download/1355/1255

Bean
02-13-2012, 07:43 PM
David,
The target device size is a 14 pin dip in a sealed metal can. The Propeller is way too large. Cost is not really a factor (within reason).
Speed is not a big concern either, about 500 exp calcs per second.
I'm working on something that seem to work, but needs more precision. I'll keep you all updated.
Sorry, I cannot give more detail, but this is a work project.

Bean

Tracy Allen
02-13-2012, 10:44 PM
If accuracy is not as important as precision, repeatability and easy implementation, maybe the bitExp function might work. It is the inverse of the bitLog. There is a treatment of it in Jack Crenshaw's Math Toolkit for Real Time Programming. I've implemented (http://www.emesystems.com/BS2math3.htm#bitlog) it on the BASIC Stamp, where the starting point is the NCD or DCD operator. That would give the exponential as an intermediate step and then you are left with computing the function of that, 1/(1+ex), which ranges between 0 and 1, and for large values of x quickly becomes indistinguishable from e-x[sup]. How large an input argument will be necessary? e[sup]11 is about 16 bits. The output range of 0 to 1 can be scaled as a sixteen bit binary fraction with the radix implied to the left of the msb.

Kye
02-14-2012, 02:06 PM
@Phil & Mark

Was not serious. :-)

Tracy Allen
02-14-2012, 05:55 PM
It's not a bad idea, Kye, combined with interpolation. The bitexp and bitlog function rely on a strategic interpolation and the exponential nature of the positional representation of numbers. An interpolation can be counted on for monotonicity and repeatability, and this function is quite well behaved.

Here is a graph of the function, along with graphs of similar functions arctan and tanh scaled to the 0 to 1 range.

89584

A series representation of 1/(1+exp(x)) converges slowly for larger values of the argument. Here is the Maclaurin expansion around x=0. At large x (meaning anything much greater than 1), it involves tiny differences between very large numbers. I haven't pursued it, but I question if it converges at all for x>1.

89583

Bean
02-14-2012, 07:30 PM
Tracy,
I'd like to try using ATAN(x) instead of 1/(1+exp(x)), but the cordic ATAN routine needs two parameters as it computes ATAN2(y, x)
Is there a way to use the ATAN2 cordic routine with a single parameter ?

Bean

Tracy Allen
02-14-2012, 08:22 PM
It is computing based on the ratio, Y/X, so you simply need to choose a convenient value for X, and then have Y range up to say 10 or 15 times X, depending on how close you need to take it to the limit within the precision allotted. As you know from your own Prop CORDIC for ATN, the accuracy is best when both Y and X are "large" numbers, so the function usually has a wrapper to scale up small numbers. For example, something like (X,Y) = (1,1) is scaled up up to say X,Y = (5000, 5000). The answer (45) is the same in either case. The length of the vector is different of course, but who cares? That choice for X would allow Y to range up to 10*X within a 16 bit framework, with input steps of 1/5000.

Phil Pilgrim (PhiPi)
02-14-2012, 08:41 PM
I ran across an interesting paper that shows how to approximate the sigmoid function 1/(1 + e-x) using just additions and shifts:


www.wseas.us/e-library/conferences/tenerife2001/papers/483.pdf

(Please note that the sign of x in the equation for use in nerual nets is supposed to be negative.)

It produces a piecewise linear approximation, where the number of pieces increases as the power of the number of iterations. So I put it to the test, using the following Spin program:



CON

_clkmode = xtal1 + pll16x
_xinfreq = 5_000_000

OBJ

sio : "FullDuplexSerial"

PUB start | x, y1, y2, g, gp, h, delta

sio.start(31, 30, 0, 38400)
repeat x from -constant(10 << 16) to constant(10 << 16) step constant(20 << 9)
sio.dec(x)
sio.tx(" ")
sio.dec(sigmoid(x))
sio.tx(13)

PUB sigmoid(x) : g | y1, y2, gp, h, delta, sx

sx := x
x := -||x
delta := 17288 '0.26380 * 65536
g~
y1~
h := y2 := 32768 + (x ~> 2)
repeat 4
gp := g #> h
h := (g + h + delta) ~> 1
g := gp
delta >>= 2
g #>= h
if (sx > 0)
g := 65536 - g


This code assumes a 16.16 fixed-point scale. I then exported the data from the program to Excel (file attached), normalized the values to floating-point, and created a graph that plots the approximate value and the "real" value:

http://forums.parallax.com/attachment.php?attachmentid=89585&d=1329252049

The agreement looks pretty good for small absolute values of x, but quickly reaches its limiting values outside the interval [-4, 4]. I think the method could be improved somewhat over what was presented in the paper, but I'm not sure how just yet.

-Phil

Phil Pilgrim (PhiPi)
02-15-2012, 07:06 PM
Here's another, non-recursive algo this time, that yields a piecewise linear approximation using only adds, subtracts, and shifts. It came from this paper:


http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.116.7932&rep=rep1&type=pdf (p. 53)

It seems to have better behavior than the previous algo in the tail regions. Here's my Spin code:



CON

_clkmode = xtal1 + pll16x
_xinfreq = 5_000_000

OBJ

sio : "FullDuplexSerial"

PUB start | x, y1, y2, g, gp, h, delta

sio.start(31, 30, 0, 38400)
repeat x from -constant(10 << 16) to constant(10 << 16) step constant(20 << 9)
sio.dec(x)
sio.tx(" ")
sio.dec(sigmoid(x))
sio.tx(13)

PUB sigmoid(x) : y | sx, xi, xf

sx := x
||x
xi := x.word[1]
xf := x.word[0]
y := 65536 + ((xf >> 1) - 65536) ~> (||xi + 1)
if (sx < 0)
y := 65536 - y


And here's a chart produce from the above Excel file that compares the approximation to the real thing:

http://forums.parallax.com/attachment.php?attachmentid=89614&d=1329332760

-Phil

Bean
02-16-2012, 12:22 AM
Phil, that graph looks pretty good. I'll have to try that code.
We have pretty much decided to use atan(x) and see how that works out.
Bean

Phil Pilgrim (PhiPi)
02-16-2012, 12:36 AM
Bean,

Any continuous sigmoid-shaped function should work. There's certainly nothing magic about either the exponential or the atan form. You just want something that's relatively smooth and fast to compute. BTW, I'm pretty sure I can make the above approximation smoother yet. One moment, please ...

-Phil

Tracy Allen
02-16-2012, 03:41 AM
Back-propagation depends on the first derivative, so it is important to have that as easily as the function itself.

The second paper Phil found points out that first derivative of the classic sigmoid is convenient, in the sense that if you have the fixed point binary value for f(x), then the 1st derivative is simply the original function times its own twos complement:
f'(x) = f(x) * (1 - f(x))

For atan(x), the first derivative is 1/sqrt(1 + x2), which could be harder, except if you are doing it with CORDIC, sqrt(1+x2) drops out of the algorithm as the length of the vector (which turns out to be important after all!), and you are left with finding the inverse and scaling, which is manageable but more work than would be the sigmoid.

That paper also pointed out that the first derivative of the hyperbolic tangent has a simple relation to the function, that is f'(x) = (1 - f2(x)), the twos complement of the square of the function itself at x. CORDIC can calculate tanh, but I think it does so by first finding sinh and cosh, followed by the quotient.

User Name
02-16-2012, 08:42 PM
Just as a note of encouragement, Bean: Anything the HP-35 could do, the SX can do better. And the HP-35 would have computed this with considerable accuracy and repeatability.

Bean
05-31-2012, 10:52 PM
Just in case anyone is still interested in this...Here is my solution.



' CORDIC Demo program to compute 1/(1+Exp(-x))
' Values are in 16.16 format (1/65536 units).
'

CON
_clkmode = xtal1 + pll16x
_xinfreq = 5_000_000


OBJ
Debug : "FullDuplexSerial"

VAR
LONG X


PUB Start
Debug.Start(31, 30, 0, 115200)
WaitCnt(clkfreq * 4 + cnt) ' Wait four seconds for user to start PST

Debug.Str(string(13, "CORDIC Exp Test", 13))

X := 204800 ' 3.125 in 16.16 format

Debug.Str(String(" X = "))
Dec1616(X)

' Perform X=Exp(-X)
X := Exp(-X)

' X=X+1
IF X < $7FFF_0000
X := X + 65536 ' +1
ELSE
X := $7FFF_FFFF

' X = 1/X
X := (($1000_0000 / X) << 4) + (($1000_0000 // X) << 4) / X

Debug.str(String(13, "1/1+Exp(-X) ="))
Dec1616(X)
Debug.Tx(13)


PUB Dec1616(given) | temp ' Display a 16.16 fixed point value
if given < 0
given := -given
Debug.Tx("-")
temp := given ~> 16
Debug.Dec(temp)
temp := given & 65535
temp := temp * 10000
temp := temp / 65536
Debug.Tx(".")
if temp < 1000
Debug.Tx("0")
if temp < 100
Debug.Tx("0")
if temp < 10
Debug.Tx("0")
Debug.Dec(temp)


PUB Exp(given) : expValue | i ' Return Exp(given) as a 16.16 fixed point value
expValue := 1 * 65536 ' Make result 1.0 to start

IF given < 0 ' If given is negative then...
-given ' Use absolute value
-expValue ' Make result -1.0 to start

IF given > 681391 ' If given is too large then...
IF expValue > 0
expValue := $7FFF_FFFF ' Result is maximum value
ELSE
expValue := $8000_0000

ELSE
given <<= 8 ' Shift given value because table is in 8.24 fixed point format (for better resolution)
REPEAT i FROM 0 TO 24 ' For each table entry...
REPEAT WHILE given => Ln_Table[i] ' I'm pretty sure only the first entry is ever repeated, but just to be safe
given -= Ln_Table[i] ' Adjust given value
expValue += (expValue ~> i) ' Adjust result value

IF expValue < 0 ' If this was Exp of a negative value then...
-expValue ' Use absolute value of result
expValue := (($1000_0000 / expValue) << 4) + (($1000_0000 // expValue) << 4) / expValue


DAT
' Ln_Table is the value of Round(Ln(1+2^-i)*2^24) [i=0,1,2,3,...]
Ln_Table LONG 11629080,6802576,3743728,1976071,1017112,516263,26 0117,130563,65408,32736,16376,8190,4096,2048,1024, 512,256,128,64,32,16,8,4,2,1



Bean