Rounding error in Stdev function result.

David K · 01-09-2006, 02:10 PM

the function
=STDEV(1.4434,1.4434,1.4434)
gives
2.98023E-08
(at least on my computer using Excel 2002, sp3)

Is this just rounding error due to IEEE double precision. It seems pretty
large.

-David

JE McGimpsey · 01-09-2006, 02:20 PM

It is pretty large, but it's not simply due to rounding...

XL04:

=STDEV(1.4434,1.4434,1.4434) ===> 2.71947991102104E-16

(I don't have XL03 available right now, but I suspect that the result
would be the same).

The improvement was undoubtedly part of the overhaul of stats functions
for XL03/04:

http://support.microsoft.com/kb/828888/en-us

In article <[email protected]>,
"David K" <David [email protected]> wrote:

> the function
> =STDEV(1.4434,1.4434,1.4434)
> gives
> 2.98023E-08
> (at least on my computer using Excel 2002, sp3)
>
> Is this just rounding error due to IEEE double precision. It seems pretty
> large.

[email protected] · 01-10-2006, 12:50 AM

"David K" wrote:
> the function =STDEV(1.4434,1.4434,1.4434) gives
> 2.98023E-08 (at least on my computer using Excel
> 2002, sp3)
> Is this just rounding error due to IEEE double precision.

Yes. But the numerical error caused by the binary
representation is exacerbated by the variance formula
apparently used in Excel 2002 (sp3), which seems to
be different from Excel 2003.

Mathematically, the sample variance can be computed
using either the following equivalent formulas:

1. var = SUM((x - mean)^2, for all x) / (n - 1)

2. var = (SUM(x^2, for all x) - SUM(x, for all x)^2 / n) / (n - 1)

In both cases, std dev = SQRT(var).

When we compute var manually using #2, where
x1 = x2 = x3 = 1.4434 (n = 3), we get:

var = 8.88178419700125E-16
sd = 2.98023223876953E-08

That matches your results. But when we use #1,
we get:

var = 7.39557098644699E-32
sd = 2.71947991102104E-16

That matches the STDEV(1.4434,1.4434,1.4434)
result using Excel 2003.

Note that in both cases, VAR() and STDEV() are not
zero, as you might have expected, even though
AVERAGE(1.4434,1.4434,1.4434) displays 1.44340000000000E+00.
I believe the displayed result of AVERAGE() belies the
fact that there are non-zero bits in the remaining 2-3
binary bits that Excel must round in order to display a
decimal result. I believe that is evidenced by the fact
that (1.4434 - AVERAGE(...))^2 yields 7.39557098644699E-32.

(But I am mystified by the fact that this numerical error
is not evident when I program #1 in VBA, which does
display var = 0 -- a pleasant surprise. Without access
to the VBA binary representations, VBA compiled code
and internal Excel algorithms, I can only speculate wildly
about the disparity.)

[email protected] · 01-10-2006, 02:45 PM

I wrote:
> I am mystified by the fact that this numerical error
> is not evident when I program #1 in VBA, which does
> display var = 0 -- a pleasant surprise. Without access
> to the VBA binary representations, VBA compiled code
> and internal Excel algorithms, I can only speculate wildly
> about the disparity.

My suspicion was confirmed by a response by Martin
Brown to my inquiry in excel.programming and by some
more experimentation on my part. Apparently, even the
retooled Excel 2003 STDEV() implementation does not
take full advantage of the floating point coprocessor, as
the VBA compiler seems to. I'm surprised.

Harlan Grove · 01-10-2006, 04:10 PM

[email protected] wrote...
>I wrote:
>>I am mystified by the fact that this numerical error
>>is not evident when I program #1 in VBA, which does
>>display var = 0 -- a pleasant surprise. Without access
>>to the VBA binary representations, VBA compiled code
>>and internal Excel algorithms, I can only speculate wildly
>>about the disparity.
>
>My suspicion was confirmed by a response by Martin
>Brown to my inquiry in excel.programming and by some
>more experimentation on my part. Apparently, even the
>retooled Excel 2003 STDEV() implementation does not
>take full advantage of the floating point coprocessor, as
>the VBA compiler seems to. I'm surprised.

It's not the STDEV function, per se, it's the mean. The mean of x, x
and x isn't exactly x. That is,

=SUMSQ(1.4434-AVERAGE(1.4434,1.4434,1.4434),
1.4434-AVERAGE(1.4434,1.4434,1.4434),
1.4434-AVERAGE(1.4434,1.4434,1.4434))

isn't zero. This even defeats DEVSQ, which is usually pretty good, but

=DEVSQ(1.4434,1.4434,1.4434)

returns 1.47911E-31 rather than zero. More to the point, while the
formula

=1.4434-AVERAGE(1.4434,1.4434,1.4434)

returns zero, the formula

=(1.4434-AVERAGE(1.4434,1.4434,1.4434))

returns 2.22045E-16.

Ain't Excel fun!

As with all floating point calculations, use explicit rounding for
final results. In this case,

=ROUND(STDEV(1.4434,1.4434,1.4434),5)

where the 5 is the maximum significant digits in STDEV's arguments.

[email protected] · 01-10-2006, 04:30 PM

"Harlan Grove" wrote:
> It's not the STDEV function, per se, it's the mean.

I understand. But the implementation of STDEV() could
easily compute the mean internally, taking advantage of
the 80-bit FP registers. It does not. Nor does AVERAGE(),
as you point out. But the OP asked about STDEV(), not
AVERAGE(). I tailored my comments to the OP's context.

Harlan Grove · 01-10-2006, 05:20 PM

[email protected] wrote...
....
>I understand. But the implementation of STDEV() could
>easily compute the mean internally, taking advantage of
>the 80-bit FP registers. It does not. Nor does AVERAGE(),
>as you point out. But the OP asked about STDEV(), not
>AVERAGE(). I tailored my comments to the OP's context.

80 bits wouldn't necessarily help in this case. If the mean is
calculated by summing all the values first then dividing by the count,
the sum itself could cause problems that dividing by the count wouldn't
necessarily solve. In an ideal world, cleverness would also be applied
to calculating the mean.

k = 0
n = 0
For Each v In d
n = n + 1
If n = 1 Then m = v Else m = m * (k / n) + v / n
k = n
Next v

Enter 1.4434 in A1:A3. Enter 1 in B1, 0 in C1 and =A1 in D1. Enter
=B1+1 in B2, =B1 in C2, =D1*(C2/B2)+A2/B2 in D2. Select B2:D2 and fill
down into B3:D3. Cell D3 contains the mean. Enter =AVERAGE(A1:A3) in
D4, and =(D3-D4) in D5. D5 evaluates to -2.22045E-16.

Enter =SUMSQ(A1-D4,A2-D4,A3-D4) in D6. It evaluates to 1.47911E-31. Now
enter =SUMSQ(A1-D3,A2-D3,A3-D3) in D7. It evaluates to 0.

Imagine that! Rounding error elimiated without using 80-bit FPU
registers!

Further, even calculating the mean as =(1.4434/3+1.4434/3+1.4434/3)
would eliminate the rounding error. Try
=SUMPRODUCT((A1:A3-SUMPRODUCT(A1:A3/3))^2).

The error is due *EXCLUSIVELY* to rounding error in the calculation of
the mean; however, it's possible to eliminate the rounding error
without using more bits of precision. For completeness, it's necessary
to understand that the order of operations is *ESSENTIAL* in floating
point arithmetic. While floating point addition and multiplication are
commutative, neither is associative, and the distributive law doesn't
hold (that is, all the time; there are fortuitous exceptions). In
floating point arithmetic it's usually best to perform multiplication
first, then addition if accuracy is more important than execution speed.

Dana DeLouis · 01-10-2006, 05:50 PM

> ... But the implementation of STDEV() could
> easily compute the mean internally, taking advantage of
> the 80-bit FP registers. It does not.

I think Harlan is correct. For an alternative opinion, Math programs that
DO take advantage of the FP processor show that the average of:
(1.4434 + 1.4434 + 1.4434)/3
is:
1.4433999999999998

We just can not see the last two digits with Excel.
It shows that an Excel 2003 Worksheet is doing it correctly. Vba rounded
differently. An argument could be made either way if this is good or bad.

StandardDeviation[{1.4434, 1.4434, 1.4434}]
Returns:
2.7194799110210365*^-16

Which is the same as Excel 2003.
It appears to me that an Excel 2003 worksheet is doing it correctly at the
math coprocessor level.
--
Dana DeLouis
Win XP & Office 2003

"[email protected]" <[email protected]> wrote
in message news:[email protected]...
> "Harlan Grove" wrote:
>> It's not the STDEV function, per se, it's the mean.
>
> I understand. But the implementation of STDEV() could
> easily compute the mean internally, taking advantage of
> the 80-bit FP registers. It does not. Nor does AVERAGE(),
> as you point out. But the OP asked about STDEV(), not
> AVERAGE(). I tailored my comments to the OP's context.
>

[email protected] · 01-10-2006, 10:20 PM

"Harlan Grove" wrote:
> 80 bits wouldn't necessarily help in this case.

I agree that 80 bits of precision instead of 52 bits does not
always lead to "exact" results. I did not intend to imply that.

But the fact is: it does in this case (viz., average of
1.4434 three times). Or at least, that was the explanation
give, and I accepted ti. See my posting in excel.programming
for details. This discussion should really be part of that
thread, not this one. See "Why do VBA and Excel floating
point results differ?".

> The error is due *EXCLUSIVELY* to rounding error in
> the calculation of the mean

Yes. And having (54%) more bits of precision will always
diminish those effects -- at least until the (intermediate)
result is stored into lower-precision variables.

> For completeness, it's necessary to understand that the
> order of operations is *ESSENTIAL* in floating point
> arithmetic.

Which is why I initially said, in this thread, that anything I
would say about the explanation would be wild speculation
without my knowing the internal implementations of Excel
and VBA. I should have left well enough alone -- in this
thread.

If you have that kind of insight, I would appreciate it if you
would contribute to the thread in excel.programming.

Jerry W. Lewis · 01-10-2006, 10:25 PM

XL2003 also returns 2.71947991102104E-16, but per
http://support.microsoft.com/kb/826349/
XL2003 was the first version that changed the STDEV calculation to

=SQRT(SUMSQ(1.4434-AVERAGE(1.4434,1.4434,1.4434),1.4434-AVERAGE(1.4434,1.4434,1.4434),1.4434-AVERAGE(1.4434,1.4434,1.4434))/2)

Prior to XL2003, STDEV was calculated as

=SQRT((SUMSQ(1.4434,1.4434,1.4434)-SUM(1.4434,1.4434,1.4434)^2/3)/2)

which returns the value that the OP reported.

These two formulas are mathematically but not numerically equivalent.

A third approach would use a one-pass updating algorithm

http://groups.google.com/group/micro...6ee0c636ad016a

which would correctly return zero for this standard deviation, even though
1.4434 cannot be exactly represented in binary.

Jerry

"JE McGimpsey" wrote:

> It is pretty large, but it's not simply due to rounding...
>
> XL04:
>
> =STDEV(1.4434,1.4434,1.4434) ===> 2.71947991102104E-16
>
> (I don't have XL03 available right now, but I suspect that the result
> would be the same).
>
> The improvement was undoubtedly part of the overhaul of stats functions
> for XL03/04:
>
> http://support.microsoft.com/kb/828888/en-us
>
>
> In article <[email protected]>,
> "David K" <David [email protected]> wrote:
>
> > the function
> > =STDEV(1.4434,1.4434,1.4434)
> > gives
> > 2.98023E-08
> > (at least on my computer using Excel 2002, sp3)
> >
> > Is this just rounding error due to IEEE double precision. It seems pretty
> > large.
>

[email protected] · 01-10-2006, 11:40 PM

I wrote:
> having (54%) more bits of precision will always
> diminish those effects -- at least until the (intermediate)
> result is stored into lower-precision variables.

Well, that's not correct. First, the mantissa is only 23%
more bits of precision (64 bits). Second, I forget that the
original 52-bit mantissa is effectively simply zero-extended.
So the 80-bit FP result is not necessarily better than the
64-bit result. But I don't believe it will ever be worse.

Jerry W. Lewis · 01-11-2006, 01:10 AM

80-bit reals have 64-bit mantissas vs. effectively 53-bit mantissas in 64-bit
reals, so there would be some improvement, but not as much as from using a
better algorithm. VBA offers very limited control over register arithmetic.
The following code will use register arithmetic within the Abs()

Function SD(x As Double) As Double
SD = Sqr(Abs((x * x + x * x + x * x - (x + x + x) * (x + x + x) / 3) / 2))
End Function

but SD(1.4434) returns 4.65661287307739E-10; better than XL2002's
2.98023223876953E-08, but nowhere near as good as XL2003's
2.71947991102104E-16 or the exact 0 that an updating algorithm would give

http://groups.google.com/group/micro...6ee0c636ad016a

Jerry

"[email protected]" wrote:

> I wrote:
> > having (54%) more bits of precision will always
> > diminish those effects -- at least until the (intermediate)
> > result is stored into lower-precision variables.
>
> Well, that's not correct. First, the mantissa is only 23%
> more bits of precision (64 bits). Second, I forget that the
> original 52-bit mantissa is effectively simply zero-extended.
> So the 80-bit FP result is not necessarily better than the
> 64-bit result. But I don't believe it will ever be worse.
>

Jerry W. Lewis · 01-11-2006, 09:35 AM

1.4433999999999998 is the result of calculating the average using 64-bit
reals. You can display additional figures as
=D2D(AVERAGE(1.4434,1.4434,1.4434)) using my D2D function from
http://groups.google.com/group/micro...06871cf92f8465

64-bit reals can exactly represent all 15 digit integers, but require 17
digits to uniquely characterize a floating point number.

80-bit reals can exactly represent all 19 digit integers

The 80-byte representation of 1.4434 and of the average of three of them are
both equal to 1.44339999999999999997E0 (21 digits required to distinguish
from 1.4434), so the XL2003 algorithm would return 0 if it used 80-byte
calculations.

In this case, you could even use intermediate rounding to 64-bit reals and
still get 0 with a partial use of register precision, as the following VBA
code shows

Function DSq(Optional x As Double = 1.4434) As Double
Dim ave As Double
' each line is calculated in 80-bit register, then stored in 64-bit variable
ave = (x + x + x) / 3
DSq = (x - ave) * (x - ave) + (x - ave) * (x - ave) + (x - ave) * (x -
ave)
End Function

Jerry

"Dana DeLouis" wrote:

> > ... But the implementation of STDEV() could
> > easily compute the mean internally, taking advantage of
> > the 80-bit FP registers. It does not.
>
> I think Harlan is correct. For an alternative opinion, Math programs that
> DO take advantage of the FP processor show that the average of:
> (1.4434 + 1.4434 + 1.4434)/3
> is:
> 1.4433999999999998
>
> We just can not see the last two digits with Excel.
> It shows that an Excel 2003 Worksheet is doing it correctly. Vba rounded
> differently. An argument could be made either way if this is good or bad.
>
> StandardDeviation[{1.4434, 1.4434, 1.4434}]
> Returns:
> 2.7194799110210365*^-16
>
> Which is the same as Excel 2003.
> It appears to me that an Excel 2003 worksheet is doing it correctly at the
> math coprocessor level.
> --
> Dana DeLouis
> Win XP & Office 2003
>
> "[email protected]" <[email protected]> wrote
> in message news:[email protected]...
> > "Harlan Grove" wrote:
> >> It's not the STDEV function, per se, it's the mean.
> >
> > I understand. But the implementation of STDEV() could
> > easily compute the mean internally, taking advantage of
> > the 80-bit FP registers. It does not. Nor does AVERAGE(),
> > as you point out. But the OP asked about STDEV(), not
> > AVERAGE(). I tailored my comments to the OP's context.
> >
>
>
>

Dana DeLouis · 01-11-2006, 11:55 AM

Hi. Thanks Jerry. As a side note to the op, if you just need the STDEV of
3 cells, it "appears" that an equation will returns zero (0) more often if
all 3 cells are equal.
Perhaps use this:
=SQRT((A1^2+B1^2-B1*C1+C1^2-A1*(B1+C1))/3)
instead of this:
=STDEV(A1:C1)

--
Dana DeLouis
Win XP & Office 2003

"Jerry W. Lewis" <post_a_reply@no_e-mail.com> wrote in message
news:[email protected]...
> 1.4433999999999998 is the result of calculating the average using 64-bit
> reals. You can display additional figures as
> =D2D(AVERAGE(1.4434,1.4434,1.4434)) using my D2D function from
> http://groups.google.com/group/micro...06871cf92f8465
>
> 64-bit reals can exactly represent all 15 digit integers, but require 17
> digits to uniquely characterize a floating point number.
>
> 80-bit reals can exactly represent all 19 digit integers
>
> The 80-byte representation of 1.4434 and of the average of three of them
> are
> both equal to 1.44339999999999999997E0 (21 digits required to distinguish
> from 1.4434), so the XL2003 algorithm would return 0 if it used 80-byte
> calculations.
>
> In this case, you could even use intermediate rounding to 64-bit reals and
> still get 0 with a partial use of register precision, as the following VBA
> code shows
>
> Function DSq(Optional x As Double = 1.4434) As Double
> Dim ave As Double
> ' each line is calculated in 80-bit register, then stored in 64-bit
> variable
> ave = (x + x + x) / 3
> DSq = (x - ave) * (x - ave) + (x - ave) * (x - ave) + (x - ave) * (x -
> ave)
> End Function
>
> Jerry
>
> "Dana DeLouis" wrote:
>
>> > ... But the implementation of STDEV() could
>> > easily compute the mean internally, taking advantage of
>> > the 80-bit FP registers. It does not.
>>
>> I think Harlan is correct. For an alternative opinion, Math programs
>> that
>> DO take advantage of the FP processor show that the average of:
>> (1.4434 + 1.4434 + 1.4434)/3
>> is:
>> 1.4433999999999998
>>
>> We just can not see the last two digits with Excel.
>> It shows that an Excel 2003 Worksheet is doing it correctly. Vba rounded
>> differently. An argument could be made either way if this is good or
>> bad.
>>
>> StandardDeviation[{1.4434, 1.4434, 1.4434}]
>> Returns:
>> 2.7194799110210365*^-16
>>
>> Which is the same as Excel 2003.
>> It appears to me that an Excel 2003 worksheet is doing it correctly at
>> the
>> math coprocessor level.
>> --
>> Dana DeLouis
>> Win XP & Office 2003
>>
>> "[email protected]" <[email protected]>
>> wrote
>> in message news:[email protected]...
>> > "Harlan Grove" wrote:
>> >> It's not the STDEV function, per se, it's the mean.
>> >
>> > I understand. But the implementation of STDEV() could
>> > easily compute the mean internally, taking advantage of
>> > the 80-bit FP registers. It does not. Nor does AVERAGE(),
>> > as you point out. But the OP asked about STDEV(), not
>> > AVERAGE(). I tailored my comments to the OP's context.
>> >
>>
>>
>>

Harlan Grove · 01-11-2006, 01:00 PM

[email protected] wrote...
....
>If you have that kind of insight, I would appreciate it if you
>would contribute to the thread in excel.programming.

Too lazy to crosspost this myself. Nothing prevents you from putting a
link to this thread in a follow-up in the other ng.

Jerry W. Lewis · 01-12-2006, 12:25 AM

Your alternat formula does seem to be a little better numerically than
pre-2003 STDEV; but it is still squaring terms before subtracting, so it will
be nowhere near as numerically stable as =SQRT(DEVSQ(A1:C1)/2), which is used
by xl2003.

Jerry

"Dana DeLouis" wrote:

> Hi. Thanks Jerry. As a side note to the op, if you just need the STDEV of
> 3 cells, it "appears" that an equation will returns zero (0) more often if
> all 3 cells are equal.
> Perhaps use this:
> =SQRT((A1^2+B1^2-B1*C1+C1^2-A1*(B1+C1))/3)
> instead of this:
> =STDEV(A1:C1)
>
> --
> Dana DeLouis
> Win XP & Office 2003
>
>
> "Jerry W. Lewis" <post_a_reply@no_e-mail.com> wrote in message
> news:[email protected]...
> > 1.4433999999999998 is the result of calculating the average using 64-bit
> > reals. You can display additional figures as
> > =D2D(AVERAGE(1.4434,1.4434,1.4434)) using my D2D function from
> > http://groups.google.com/group/micro...06871cf92f8465
> >
> > 64-bit reals can exactly represent all 15 digit integers, but require 17
> > digits to uniquely characterize a floating point number.
> >
> > 80-bit reals can exactly represent all 19 digit integers
> >
> > The 80-byte representation of 1.4434 and of the average of three of them
> > are
> > both equal to 1.44339999999999999997E0 (21 digits required to distinguish
> > from 1.4434), so the XL2003 algorithm would return 0 if it used 80-byte
> > calculations.
> >
> > In this case, you could even use intermediate rounding to 64-bit reals and
> > still get 0 with a partial use of register precision, as the following VBA
> > code shows
> >
> > Function DSq(Optional x As Double = 1.4434) As Double
> > Dim ave As Double
> > ' each line is calculated in 80-bit register, then stored in 64-bit
> > variable
> > ave = (x + x + x) / 3
> > DSq = (x - ave) * (x - ave) + (x - ave) * (x - ave) + (x - ave) * (x -
> > ave)
> > End Function
> >
> > Jerry
> >
> > "Dana DeLouis" wrote:
> >
> >> > ... But the implementation of STDEV() could
> >> > easily compute the mean internally, taking advantage of
> >> > the 80-bit FP registers. It does not.
> >>
> >> I think Harlan is correct. For an alternative opinion, Math programs
> >> that
> >> DO take advantage of the FP processor show that the average of:
> >> (1.4434 + 1.4434 + 1.4434)/3
> >> is:
> >> 1.4433999999999998
> >>
> >> We just can not see the last two digits with Excel.
> >> It shows that an Excel 2003 Worksheet is doing it correctly. Vba rounded
> >> differently. An argument could be made either way if this is good or
> >> bad.
> >>
> >> StandardDeviation[{1.4434, 1.4434, 1.4434}]
> >> Returns:
> >> 2.7194799110210365*^-16
> >>
> >> Which is the same as Excel 2003.
> >> It appears to me that an Excel 2003 worksheet is doing it correctly at
> >> the
> >> math coprocessor level.
> >> --
> >> Dana DeLouis
> >> Win XP & Office 2003
> >>
> >> "[email protected]" <[email protected]>
> >> wrote
> >> in message news:[email protected]...
> >> > "Harlan Grove" wrote:
> >> >> It's not the STDEV function, per se, it's the mean.
> >> >
> >> > I understand. But the implementation of STDEV() could
> >> > easily compute the mean internally, taking advantage of
> >> > the 80-bit FP registers. It does not. Nor does AVERAGE(),
> >> > as you point out. But the OP asked about STDEV(), not
> >> > AVERAGE(). I tailored my comments to the OP's context.
> >> >
> >>
> >>
> >>
>
>
>

Rounding error in Stdev function result.

LinkBack

Thread Tools

Rate This Thread

Display

Rounding error in Stdev function result.

Re: Rounding error in Stdev function result.

RE: Rounding error in Stdev function result.

RE: Rounding error in Stdev function result.

Re: Rounding error in Stdev function result.

Re: Rounding error in Stdev function result.

Re: Rounding error in Stdev function result.

Re: Rounding error in Stdev function result.

Re: Rounding error in Stdev function result.

Re: Rounding error in Stdev function result.

Re: Rounding error in Stdev function result.

Re: Rounding error in Stdev function result.

Re: Rounding error in Stdev function result.

Re: Rounding error in Stdev function result.

Re: Rounding error in Stdev function result.

Re: Rounding error in Stdev function result.

Thread Information

Users Browsing this Thread

Bookmarks

Bookmarks

Posting Permissions