Closed Thread
Results 1 to 4 of 4

STDEV

  1. #1
    Kimo
    Guest

    STDEV

    For some numbers (e.g. 1.35, 2.8, 11.73) the standard deviation of the three
    same numbers do not result to 0. Why? I tried it on four different computers.

  2. #2
    Biff
    Guest

    Re: STDEV

    Hi!

    The most probable explanation is that the values are calculated and only
    appear to be the same. The true underlying values are different. You can
    check by selecting the cells and changing the format to number and
    increasing the decimal place.

    Biff

    "Kimo" <[email protected]> wrote in message
    news:[email protected]...
    > For some numbers (e.g. 1.35, 2.8, 11.73) the standard deviation of the
    > three
    > same numbers do not result to 0. Why? I tried it on four different
    > computers.




  3. #3

    Re: STDEV

    "Biff" wrote:
    > "Kimo" <[email protected]> wrote:
    > > For some numbers (e.g. 1.35, 2.8, 11.73) the standard
    > > deviation of the three same numbers do not result to 0.
    > > Why? I tried it on four different computers.

    >
    > The most probable explanation is that the values are
    > calculated and only appear to be the same. The true
    > underlying values are different.


    That is certainly one possible explanation, but not necessarily
    "the most probable". Based on another thread on the same
    subject, I suspect the OP is entering the numbers as constants.
    (I wonder if this question is part of a class assignment.)

    Anyway, the answer is somewhat the same. Binary computers
    generally cannot represent decimal fractions exactly. When
    arithmetic is performed on these numbers, precision may be
    lost.

    (Also, before Excel 2003, STDEV() used an algorithm that
    exacerbated the problem with limited binary precision.)

    It is interesting to compare the results of STDEV() with the
    following VBA functions:

    Function mystdev(x As Double) As Double
    ' enter =mystdev(number) in spreadsheet
    Dim avg As Double
    Dim var As Double
    avg = (x + x + x) / 3
    var = ((x - avg) ^ 2 + (x - avg) ^ 2 + (x - avg) ^ 2) / 2
    mystdev = Sqr(var)
    End Function

    Function mystdev2(x As Double) As Double
    ' enter =mystdev2(number) in spreadsheet
    mystdev2 = mystdev3(x, x, 1)
    End Function

    Private Function mystdev3(x As Double, sum As Double, cnt As Integer) As
    Double
    ' internal; do not use in spreadsheet
    Dim avg As Double
    Dim var As Double
    If cnt < 3 Then
    mystdev3 = mystdev3(x, sum + x, cnt + 1)
    Else
    avg = sum / 3
    var = ((x - avg) ^ 2 + (x - avg) ^ 2 + (x - avg) ^ 2) / 2
    mystdev3 = Sqr(var)
    End If
    End Function

    For the OP's three examples (and others), mystdev() does
    result in zero, whereas mystdev2() has the same result as
    STDEV(). The most likely explanation is that compiled VBA
    code for mystdev() takes advantage of internal registers that
    have more precision for intermediate computation. mystdev2()
    is written to prevent such optimizations.

    > You can check by selecting the cells and changing the
    > format to number and increasing the decimal place.


    Specifically, format as Scientific with 14 decimal places.


  4. #4
    Jerry W. Lewis
    Guest

    RE: STDEV

    There is an extensive recent discussion of this phenomenon in the "Rounding
    error in Stdev function result" thread of the worhsheet functions newsgroup
    http://groups.google.com/group/micro...4a0544a0e6d1cc

    The short answer is that most decimal fractions (including all three you
    mention and 1.4434 which the other thread discusses) have no exact binary
    representation (much as 1/3 has no exact decimal representation) and hence
    must be approximated. When you do math with approximate inputs it shoud be
    no surprise that the output is only approximate.

    Given that, the accuracy of the approximate final result will vary with the
    algorithm used.

    The pre-2003 algorithm actually does give 0 for three identical values of
    1.35 and will give the least possible error when all inputs are integers of
    moderate size in a sample of moderate size. It was discarded for the newer
    algorithm because when it goes wrong, its error is much larger than that of
    the 2003 algorithm.

    The old algoirthm is roughly equivalent to
    =SQRT(ABS(SUMSQ(data)-SUM(data)^2/COUNT(data))/(COUNT(data)-1))
    in that this seems to agree with the old algorithm whenever the old formula
    is nonzero (oddly, this formula is nonzero for 1.35 even though the old
    formula is zero).

    The new algoritm is equivalent to
    =SQRT(DEVSQ(data)/(COUNT(data)-1))

    The old formula sums squares of big numbers then subtracts another big
    number. Most of the available precision is taken up in representing those
    big sums of squares, resulting in less precision for the result of the
    subtraction. The new formula (see help for DEVSQ) first calculates the
    average, then squares deviations from that average. Much more precision
    survives the subtraction, hence the better worst-case behavior.

    This also shows where the error must come in for your special case; try the
    following formula (keep the outer parentheses!)
    =(x-AVERAGE(x,x,x))
    When AVERAGE(x,x,x) is not exactly identical to the identical numbers being
    averaged, then STDEV will give a non-zero result. This can only happen
    because the numbers are non-terminating binary fractions, and the sum of the
    approximations is different than the approximation to the sum.

    I prefer yet a third approach to the standard deviation calculation
    http://groups.google.com/group/micro...6ee0c636ad016a
    One or the other of Excel's approaches may do a little better for a specific
    set of numbers, but the worst case properties for these updating algorithms
    are much better than either of Excel's approaches. Moreover these updating
    algorithms are guaranteed to recognize the situation where all input numbers
    are identical, and return exactly the input number as the mean, and zero as
    the standard deviation.

    If you want to learn more about internal binary approximations to numbers,
    you might find the functions at
    http://groups.google.com/group/micro...06871cf92f8465
    to be useful.

    Jerry

    "Kimo" wrote:

    > For some numbers (e.g. 1.35, 2.8, 11.73) the standard deviation of the three
    > same numbers do not result to 0. Why? I tried it on four different computers.


Closed Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1