Very strange goings on in Excel??

skylight · 2013-02-09 03:36

A friend has sent me an expenditure sheet that they have made in excel, it's very simple and uses a formula to compare two ranges of sums and if equal to output the sum of one of the ranges to a cell, if not equal then print "ERROR" using the formula =IF(SUM(D52:M52)=SUM(N2:N51),SUM(D52:M52),"ERROR")
as you can see from the attached file the cell N52 is showing "ERROR" now some detective work has brought up the most weirdest thing i've seen using excel.

I've narrowed down the problem to two cells D35 and D41 both containing the data value 0.06 now this is where it starts to get weird...

delete either value of 0.06 and the error goes, substitute any value that doesn't end in the value 0.06 to 0.62 and the error dissapears! so it seems that excel doesn't like any value in those cells that are within the range 0.06 to 0.62 although saying that, value 2.45 works??
Other values will equal out, Is this some form of rounding going on or the fact negative numbers are being used in the sheet, does excel not like mixes of signed unsigned numbers?

To cap it all this only affects the Jan tabbed sheet, the other sheets work ok.

Just thought this might be of interest to some of you and if you have a solution please let me know what is going on here
thanks

Heater. · 2013-02-09 04:01

No idea what you have there but floating point numbers are funny things.
In general you can never assume that two values will ever be equal. Even if they look like they shoud be.

For example in IEEE floating point, as used by most programming languages now, we find that:
0.1 + 0.2 == 0.3
is false.

So I guess your test for equality of sums can fail in a lot of cases.

skylight · 2013-02-09 04:49

Thanks Heater, That doesn't leave a lot of confidence in all those spreadsheets around the world doing financial calculations :frown:
Perhaps financial institutions that use excel have discovered these discrepencies and adjust for the errors?
Strange how it's the range 0.06 to 0.62 only that's affected.
I'm so baffled by this and also have lost confidence in excel somewhat, though I presume it affects other makes of spreadsheets also and is a mathmatical computational quirk rather than the softwares fault?

Mike G · 2013-02-09 04:58

You are making an assumption about the IF function in Excel. I attached how I would accomplish the task. Essentially cells hold variables used in calculations. This makes troubleshooting issues much easier in Excel. The cell can exist anywhere and they can be protected if you like.

skylight · 2013-02-09 05:17

Hi Mike, As I said it's a friend who sent me this to try to work out what was happening, funnily enough he did mention that he'd tried a differential way to sort it like you did with zero meaning both balance but said he had to factor something like 0.001 into the formula to make it work (sometimes) obviously he agreed it wasn't a very robust way of calculating his expenditure, but anyway you have shown the difference between using a comparison of the ranges compared to looking at the difference.
I will throw this back at him and see if he can break it and if so will let you know, thanks for your help.

The one thing strange in this is he said that only the Jan sheet was affected and if he copied and renamed Feb's sheet for instance then no problem.

Mike G · 2013-02-09 05:46

funnily enough he did mention that he'd tried a differential way to sort it like you did with zero meaning both balance but said he had to factor something like 0.001 into the formula to make it work (sometimes) obviously he agreed it wasn't a very robust way of calculating his expenditure,

Significant figures... Use the ROUND function. One reason to use cells to hold intermediate calculations so you can find the problem.

=IF(ROUND(SUM(D52:M52), 2) = ROUND(SUM(N2:N51), 2),SUM(D52:M52),"ERROR")

The one thing strange in this is he said that only the Jan sheet was affected and if he copied and renamed Feb's sheet for instance then no problem.

The number is formatted differently or one of the source numbers affected significant figures. A copy in Excel 2003 also copies the formatting.

localroger · 2013-02-09 09:24

The problem is that the fraction 1 / 10 is, in binary, infinitely repeating, just as 1 / 3 is the infinitely repeating 0.33333... in decimal. Since your CPU does not have an infinitely wide ALU you always lose some precision when you use decimal fractions in binary floating point. Usually this doesn't affect the precision of your result noticeably, but it frequently affects equality of results that should be the same but underwent different degrees (of lack of) rounding. If you are testing equality you should always round both figures being compared immediately before doing the comparison; that will usually work. But sometimes even that fails depending on the implementation. I've gotten into the habit of always checking for closeness within a tolerance (which can be very small) instead of equality.

skylight · 2013-02-09 16:41

It's interesting that something you'd always taken to be reliable ( calculations in excel) are not so reliable, in a way I'm glad it's only financial calculations we are talking of here and not something life saving like working out drug doseage or something serious. How do space programs rely so heavily on computer calculations where a slight error could send a space craft way off course for instance?
Thanks for all the interesting comments so far.

Mike Green · 2013-02-09 17:38

The space programs and other such things pay attention to how this stuff is calculated. You can usually determine ahead of time how "inaccurate" the results of any calculation could be and that's figured into the whole process. A result value is given as a value with error bounds (plus or minus some factor). When calculations have to be more precise, they are computed using multiple precision arithmetic using more bits for the value or using scaled integer arithmetic, again with error bounds. Generally, when some kind of catastrophic error occurs (like with some of the Mars orbiters or landers), a really stupid gross error has been made (like using the wrong units), not the accumulation of small computational errors.

LoopyByteloose · 2013-02-09 21:52

Generallly, when I do speadsheets for taxes... I do any calculations aside from sums and subtractions off to the side and plug in a number in just dollar and cents. One can waste huge amounts of time trying to sort out half penny issues that creep in if everything is subject to division in floating point.

Of course, if you just don't want to pay for Excel any more Libre Office has a free spreadsheet. But the same hazards apply.

In other words...
When using a word processor, one still has to know how to spell intelligently, and
When using a spreadsheet, one still has to know how to calculate intelligently.

If you really look at how the IRS forms do things on paper, they try to avoid creating similar errors that just waste time.

Heater. · 2013-02-10 04:37

skylight,

It's interesting that something you'd always taken to be reliable ( calculations in excel) are not so reliable...

Sorry to say but you are a couple decades behind the times. People have been tripping themselves up over these real number calculations ever since there was an Excel or perhaps any spreadsheet. Programmers in other languages have the same issues.

It's not that the caclulations are unreliable, assuming no actual bugs in your spreadsheet software, it's just the users misunderstanding of numbers and arithmetic.

Did you know for example that 0.9999... recurring forever is actually the same number as 1.0?

Well, the above is a bit of a weird mathematical fact but when you throw in problems of representing real numbers in binary, or other base, on a finite sized machine it gets worse. We can't write even down 1/3 or the square root of two on paper accurately so why do we expect a CPU to be able to do it?

It's safest to assume that directly comparing for the equality of real (floating point) numbers in a computer is a meaningless operation.

Actually I'm starting to wonder why all such comparisons don't just return FALSE in programming languages or bomb out with an erro, like dividing by zero.

You should also be aware that simpler arthmetic can fail, like this in JavaScript:
console.log (1E16 + 1);
produces:
10000000000000000

Clearly working with numbers of vastly different sizes will get you into trouble.

Do read here for a summary of the issues:
http://floating-point-gui.de/

...in a way I'm glad it's only financial calculations we are talking of here and not something life saving like working out drug doseage or something serious.

Actually I see it the other way around. If the calculation for my meds or strength of a bridge does something like (0.1 + 0.2) and gets 0.30000000000000004, as happens in many langages, that is not going to have any measurable effect on my health or the strength of the bridge. On the other handI have known accountants spend all day chasing down a 0.00001 cent error in the books,

Loopy,

When using a spreadsheet, one still has to know how to calculate intelligently.

Exactly.

It's even worse than that, as soon as you use more than one cell in a spread sheet you are no longer just using a calculator. You are creating a program for a computer. You become a programmer.

I have see surveys that show that most spread sheets contain errors (bugs). Seems to me most users of spreadsheets are unaware they have become programers and are not prepared for the minefield they have entered. That leads to them arriving at opinions llike "Excel calculations are unreliable". The Zen of programming would have you look at yourself first:)

skylight,

I'm not getting at you. These are issues that have cought out generations of programmers.

localroger · 2013-02-10 15:02

Here is another bit of trivia that's gotten lost over the generations:

Single precision floating point routines are so imprecise that they use extra, hidden bits to do quality rounding after every operation. In a single precision float the mantissa is only 24 bits, so if you are incrementing a variable by 1 somewhere between 16 and 17 million it will stop accumulating because the precision becomes worse than 1. It's very easy to get in trouble if you don't realize that. Because single precision is so bad the people who wrote the libraries did as much as they could to keep you out of trouble.

But nowadays lots of folks, apparently including the folks who wrote Excel, think "computers are fast, I'll just throw double precision at everything and then I'll have lots of precision!" And it's true that you have lots of precision with 8-byte floats, and modern CPU's crunch them nicely in the coprocessor. But what gets forgotten is that all these math libraries were written in the 1970's when it was common to pay for your computer usage by the clock cycle. Double precision takes four times the clock cycles as single on most hardware, so the double precision libraries, in the interest of economy, don't automatically round after every operation. The authors figured you would probably rather round once after the calculation was done, saving a few bucks.

However, if you don't know that you NEED to round, you get behavior like that seen in the OP. Double precision is so precise people easily fall into the trap of thinking DP floats are "more like real numbers" than singles, but because of the way the libraries are written they're actually more dangerous in some ways. I once took possession of a beta evaluation unit of a weigh scale indicator which, when we programmed it to count by 0.02 as scales often need to, it would occasionally throw up an odd number in the LSD. And this was an instrument that had gone through and passed some pretty intensive testing for type compliance. It took them months to chase all the rounding bugs out of the firmware.

Very strange goings on in Excel??

Comments