+ Reply to Thread
Results 1 to 9 of 9

Explanation of SUMPRODUCT

  1. #1
    Toppers
    Guest

    Explanation of SUMPRODUCT

    Could someone please explain how the following works:

    =SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A1))

    I understand this calculates the number of unique values in a list so if
    A1:A5 contains 1 and A6:A10 contains 2 the answer is 2 (items). For example,
    I don't understand how COUNTIF works in this situation i.e where the
    comparator is an array.

    In a recent question, Bob Phillips used SUMPRODUCT to convert an alpha
    string e.g. abcde to a numbers (12345) and summed them. Any explanation of
    how this works would also be appreciated.

    I have looked at the XLD site article on SUMPRODUCT and understand the
    basics but these more advanced uses are baffling me!

    Many thanks in advance.

    John


  2. #2
    Bob Phillips
    Guest

    Re: Explanation of SUMPRODUCT

    John,

    Here goes with my attempt.

    Let's start by defining the range A1:A10 to talk specifics.

    Bob,John,Bob,Bob,John,Jon,Bob,Bill,Bill,Max

    In the example that you show, which should be (at least)
    =SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A10))) by the way, the COUNTIF with an
    array value is being used to count the number of each of times the value in
    each cell is repeated.
    So in this case, A1 holds Bob which is repeated 4 times, so that part of the
    COUNTIF returns 4.
    A2 holds John, so COUNTIF returns 3 for A2.
    But A3 also holds Bob, which also returns 4.
    And so on, with return values of 4,3,3,4,2,1 (I'll leave that to you to work
    through).
    So the COUNTIF returns an array of {4,3,4,4,3,3,4,2,1}.
    The array results of the COUNTIF are then divided into 1 to get a fractional
    value of each element of the array. This is the part that effectively does
    the counting, as the 4 instances of Bob each return an array element of 4,
    which when divided into 1, each give 0.25, which when added together gives
    1. Voila.
    So the array returned by 1/(COUNTIF(A1:A10,A1:A10)) is
    {0.25;0.33333;0.25;0.25;0.33333;0.33333;0.25;0.5;0.5;1}
    SUMPRODUCT then adds these up to come up with the number of unique entries,
    4 in this case, because each separate value in the test range sums to 1.

    I don't recall the case that you mention, but I do recall a formula where I
    used SUMPRODUCT to add the values at the end of a mixed text/number field,
    like Mike 10. The formula in question was

    =SUMPRODUCT(--(MID(A1:A10,FINDÂ*("~",SUBSTITUTE(A1:A10,"%%%","~",LEN(A1:A10)-
    LEN(SUBSTITUÂ*TE(A1:A10," ",""))))+1,99)))

    Not that the %%% should be a single space, but I put %%% in as the NG wraps
    at that point and can lose it.

    What this is doing is simply using a tried and trusted formula,
    MID(A1,FINDÂ*("~",SUBSTITUTE(A1,"","~",LEN(A1)-LEN(SUBSTITUÂ*TE(A1,"
    ",""))))+1,99), but this time on a range not a single cell, to get the last
    part of a delimited string, delimited by space in this case. So in Mike 10,
    it returns 10, In Text String 18 it returns 18. SUMPRODUCT is then used just
    to sum the results.

    BTW, to get a better understanding of what goes on in these things, use the
    F9 to evaluate the formula. In the formula bar, select the part of the
    formula you wish to evaluate, press F9, and you see the results. Hit Esc to
    exit.

    --

    HTH

    RP
    (remove nothere from the email address if mailing direct)


    "Toppers" <[email protected]> wrote in message
    news:[email protected]...
    > Could someone please explain how the following works:
    >
    > =SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A1))
    >
    > I understand this calculates the number of unique values in a list so if
    > A1:A5 contains 1 and A6:A10 contains 2 the answer is 2 (items). For

    example,
    > I don't understand how COUNTIF works in this situation i.e where the
    > comparator is an array.
    >
    > In a recent question, Bob Phillips used SUMPRODUCT to convert an alpha
    > string e.g. abcde to a numbers (12345) and summed them. Any explanation

    of
    > how this works would also be appreciated.
    >
    > I have looked at the XLD site article on SUMPRODUCT and understand the
    > basics but these more advanced uses are baffling me!
    >
    > Many thanks in advance.
    >
    > John
    >




  3. #3
    Toppers
    Guest

    Re: Explanation of SUMPRODUCT

    Bob,
    Many thannks for the prompt reply and excellent explanation (plus F9
    tip). All is much clearer now!

    I never cease to be amazed by the ingenuity shown by the experts like
    yourself on using the features/functions in Excel ... but it's great to have
    you all available online.

    Again many thanks - your help is really appreciated.

    John


    "Bob Phillips" wrote:

    > John,
    >
    > Here goes with my attempt.
    >
    > Let's start by defining the range A1:A10 to talk specifics.
    >
    > Bob,John,Bob,Bob,John,Jon,Bob,Bill,Bill,Max
    >
    > In the example that you show, which should be (at least)
    > =SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A10))) by the way, the COUNTIF with an
    > array value is being used to count the number of each of times the value in
    > each cell is repeated.
    > So in this case, A1 holds Bob which is repeated 4 times, so that part of the
    > COUNTIF returns 4.
    > A2 holds John, so COUNTIF returns 3 for A2.
    > But A3 also holds Bob, which also returns 4.
    > And so on, with return values of 4,3,3,4,2,1 (I'll leave that to you to work
    > through).
    > So the COUNTIF returns an array of {4,3,4,4,3,3,4,2,1}.
    > The array results of the COUNTIF are then divided into 1 to get a fractional
    > value of each element of the array. This is the part that effectively does
    > the counting, as the 4 instances of Bob each return an array element of 4,
    > which when divided into 1, each give 0.25, which when added together gives
    > 1. Voila.
    > So the array returned by 1/(COUNTIF(A1:A10,A1:A10)) is
    > {0.25;0.33333;0.25;0.25;0.33333;0.33333;0.25;0.5;0.5;1}
    > SUMPRODUCT then adds these up to come up with the number of unique entries,
    > 4 in this case, because each separate value in the test range sums to 1.
    >
    > I don't recall the case that you mention, but I do recall a formula where I
    > used SUMPRODUCT to add the values at the end of a mixed text/number field,
    > like Mike 10. The formula in question was
    >
    > =SUMPRODUCT(--(MID(A1:A10,FINDĂ‚Â*("~",SUBSTITUTE(A1:A10,"%%%","~",LEN(A1:A10)-
    > LEN(SUBSTITUĂ‚Â*TE(A1:A10," ",""))))+1,99)))
    >
    > Not that the %%% should be a single space, but I put %%% in as the NG wraps
    > at that point and can lose it.
    >
    > What this is doing is simply using a tried and trusted formula,
    > MID(A1,FINDĂ‚Â*("~",SUBSTITUTE(A1,"","~",LEN(A1)-LEN(SUBSTITUĂ‚Â*TE(A1,"
    > ",""))))+1,99), but this time on a range not a single cell, to get the last
    > part of a delimited string, delimited by space in this case. So in Mike 10,
    > it returns 10, In Text String 18 it returns 18. SUMPRODUCT is then used just
    > to sum the results.
    >
    > BTW, to get a better understanding of what goes on in these things, use the
    > F9 to evaluate the formula. In the formula bar, select the part of the
    > formula you wish to evaluate, press F9, and you see the results. Hit Esc to
    > exit.
    >
    > --
    >
    > HTH
    >
    > RP
    > (remove nothere from the email address if mailing direct)
    >
    >
    > "Toppers" <[email protected]> wrote in message
    > news:[email protected]...
    > > Could someone please explain how the following works:
    > >
    > > =SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A1))
    > >
    > > I understand this calculates the number of unique values in a list so if
    > > A1:A5 contains 1 and A6:A10 contains 2 the answer is 2 (items). For

    > example,
    > > I don't understand how COUNTIF works in this situation i.e where the
    > > comparator is an array.
    > >
    > > In a recent question, Bob Phillips used SUMPRODUCT to convert an alpha
    > > string e.g. abcde to a numbers (12345) and summed them. Any explanation

    > of
    > > how this works would also be appreciated.
    > >
    > > I have looked at the XLD site article on SUMPRODUCT and understand the
    > > basics but these more advanced uses are baffling me!
    > >
    > > Many thanks in advance.
    > >
    > > John
    > >

    >
    >
    >


  4. #4
    Shawn O'Donnell
    Guest

    Re: Explanation of SUMPRODUCT

    "Bob Phillips" wrote:
    > In the example that you show, which should be (at least)
    > =SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A10)))


    In this version of the formula, I don't think you need SUMPRODUCT. SUM will
    do, since you're dealing with a single array and you aren't doing any
    dot-producting. I don't even think you need to enter it as an array formula.


    But neither the SUMPRODUCT or SUM version will work if one of the cells is
    blank.

    For the purposes of exploring this SUMPRODUCT construct a little further,
    let's say we take Bill's example and erase the contents of the cell that has
    "Jon" in it. The data in A1:A10 is then Bob,John,Bob,Bob,John,
    ,Bob,Bill,Bill,Max

    Here's a parsing of another popular version of this formula, one that
    ignores blank cells:

    =SUM((A1:A10<>"")/COUNTIF(A1:A10,A1:A10&""))

    Piece by piece, here's what that says:

    COUNTIF(A1:A10,A1:A10), as Bob explained, produces a 10-element array where
    each element is the number of times the corresponding cell value appears in
    the range A1:A10.

    The extra &"" at the end is a string concatenation. You're adding a
    zero-length string onto whatever was in each cell. That converts empty
    cells, which Excel interprets as the number 0, to empty strings. Why do you
    want to do that? Because if you represent empty cells as 0, then you'll end
    up with zero in the denominator of the 1/COUNTIF() expression. And you know
    what happens when you do that. So the COUNTIF(A1:A10.A1:A10&"") part of the
    expression evaluates to: {4;2;4;4;2;1;4;2;2;1}. If we didn't have the &"",
    the array would have been {4;2;4;4;2;0;4;2;2;1}. Notice the zero.

    Why is that zero there? Because COUNTIF promotes an empty cell to 0 if the
    cell is in the 'criteria' argument, but not if the empty cell is in the
    'range' argument position. Try this: put the number 0 in B1 and B2. Put
    COUNTIF(B1,B2) in cell B3. Then try deleting the zeros in B1 and B2 one at a
    time. Watch how it affects B3. (Let me know if you find the documentation
    on this.)

    Now, back to the numerator of our array-over-array fraction: we've got
    (A1:A10<>"") We're checking whether each cell in A1:A10 is not an empty
    cell. That evaluates to an array of Booleans:
    {TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;TRUE;TRUE;TRUE;TRUE}

    In arithmetic expressions, Excel interprets TRUE as 1 and FALSE as 0. You
    will sometimes see Booleans preceded by a double minus, --, which will
    explicitly force a Boolean-to-integer conversion.

    So for each entry representing a non-empty cell, we've got

    1/(the number of cells with that value in them)

    and for empty cells we've got

    0/(the number of empty cells).

    Add up the resulting array of fractions and you end up with a count of
    unique non-empty values in the range A1:A10.


    I find that I can get away without using an array formula if I use
    SUMPRODUCT to add the array, but not if I use SUM. If I use SUM, I have to
    enter the formula as an array formula. I suppose SUMPRODUCT knows how to
    handle an array divided by an array piecewise, while SUM doesn't.



  5. #5
    Bob Phillips
    Guest

    Re: Explanation of SUMPRODUCT

    "Shawn O'Donnell" <[email protected]> wrote in message
    news:[email protected]...
    > "Bob Phillips" wrote:
    > > In the example that you show, which should be (at least)
    > > =SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A10)))

    >
    > In this version of the formula, I don't think you need SUMPRODUCT. SUM

    will
    > do, since you're dealing with a single array and you aren't doing any
    > dot-producting. I don't even think you need to enter it as an array

    formula.

    Yes, we know this, but in this case SUM is an array formula SUMPRODUCT
    isn't.

    > But neither the SUMPRODUCT or SUM version will work if one of the cells is
    > blank.
    >
    > For the purposes of exploring this SUMPRODUCT construct a little further,
    > let's say we take Bill's example and erase the contents of the cell that

    has
    > "Jon" in it. The data in A1:A10 is then Bob,John,Bob,Bob,John,
    > ,Bob,Bill,Bill,Max
    >
    > Here's a parsing of another popular version of this formula, one that
    > ignores blank cells:
    >
    > =SUM((A1:A10<>"")/COUNTIF(A1:A10,A1:A10&""))


    Yes we know all this too, and it works just as well with SUMPRODUCT, which
    is till not an array formula.As I recall, this technique was first suggested
    with SP not SUM .

    > Piece by piece, here's what that says:
    >
    > COUNTIF(A1:A10,A1:A10), as Bob explained, produces a 10-element array

    where
    > each element is the number of times the corresponding cell value appears

    in
    > the range A1:A10.
    >
    > The extra &"" at the end is a string concatenation. You're adding a
    > zero-length string onto whatever was in each cell. That converts empty
    > cells, which Excel interprets as the number 0, to empty strings. Why do

    you
    > want to do that? Because if you represent empty cells as 0, then you'll

    end
    > up with zero in the denominator of the 1/COUNTIF() expression. And you

    know
    > what happens when you do that. So the COUNTIF(A1:A10.A1:A10&"") part of

    the
    > expression evaluates to: {4;2;4;4;2;1;4;2;2;1}. If we didn't have the

    &"",
    > the array would have been {4;2;4;4;2;0;4;2;2;1}. Notice the zero.


    You are not adding a "" onto whatever was ikn each cell, buut simply on to
    the criteria values.

    > Why is that zero there? Because COUNTIF promotes an empty cell to 0 if

    the
    > cell is in the 'criteria' argument, but not if the empty cell is in the
    > 'range' argument position. Try this: put the number 0 in B1 and B2. Put
    > COUNTIF(B1,B2) in cell B3. Then try deleting the zeros in B1 and B2 one

    at a
    > time. Watch how it affects B3. (Let me know if you find the

    documentation
    > on this.)


    I assume tyhat you mean =COUNTIF(B1:B2,B2)

    >snip


    > I find that I can get away without using an array formula if I use
    > SUMPRODUCT to add the array, but not if I use SUM. If I use SUM, I have

    to
    > enter the formula as an array formula. I suppose SUMPRODUCT knows how to
    > handle an array divided by an array piecewise, while SUM doesn't.


    It is not SUM that is the problem here but COUNTIF. SUM can work on arrays.
    SUMPRODUCT can work on arrays. COUNTIF expects to work on a range array, but
    a criteria value. To work on an array where a single value, be that a
    hard-coded value or a celle reference, you have to use an array formula.



  6. #6
    Shawn O'Donnell
    Guest

    Re: Explanation of SUMPRODUCT

    "Bob Phillips" wrote:
    > I assume tyhat you mean =COUNTIF(B1:B2,B2)


    Thanks for the reply.

    I meant COUNTIF(B1,B2), since that's the easiest way to show how COUNTIF
    treats empty cells differently, depending on whether the emtpy cell is in the
    'array' or 'criteria' argument. It's instructive to look at the four
    possibilities for B1 & B2, then compare the results with the same arguments
    but using COUNTIF(B1,B2&"").

    > It is not SUM that is the problem here but COUNTIF. SUM can work
    > on arrays. SUMPRODUCT can work on arrays. COUNTIF expects
    > to work on a range array, but a criteria value. To work on an array
    > where a single value, be that a hard-coded value or a celle reference,
    > you have to use an array formula.


    I don't follow you there. How is it that SUMPRODUCT gets to be an array
    formula regardless of how you enter it, but SUM has to be entered with
    Ctl-Shift-Enter to become an array formula?

    All that differs in the two versions is the context provided by SUM &
    SUMPRODUCT. Both functions accept arrays, but SUM doesn't seem to provide an
    array context for evaluating functions in its arguments, while SUMPRODUCT
    does.

    Here's another example. Say you wanted to know the total number of letters
    in A1:A10.

    =SUM(LEN(A1:A10)) will work only if entered as an array formula, but
    =SUMPRODUCT(LEN(A1:A10)) will work either way.

    If, in the SUM example, you select LEN(A1:A10) and hit F9 to evaluate the
    LEN() function, you create an array that SUM can digest, even in a non-array
    formula. But SUM by itself isn't able to interpret the LEN() as an array
    without the whole formula being entered as an array formula. SUMPRODUCT can.

    Do you know of any documentation concerning this array-producing context for
    evaluating scalar functions on ranges of cells (without having to enter the
    formula as an array formula)?

    Is SUMPRODUCT the only function that provides this context?

    I can imagine using this feature in user-defined functions (if possible,)
    but it would be nice if I could learn more about the phenomenon first.


  7. #7
    Bob Phillips
    Guest

    Re: Explanation of SUMPRODUCT


    "Shawn O'Donnell" <[email protected]> wrote in message
    news:[email protected]...

    Hi Shawn,

    > I meant COUNTIF(B1,B2), since that's the easiest way to show how COUNTIF
    > treats empty cells differently, depending on whether the emtpy cell is in

    the
    > 'array' or 'criteria' argument. It's instructive to look at the four
    > possibilities for B1 & B2, then compare the results with the same

    arguments
    > but using COUNTIF(B1,B2&"").


    Well as I said, with COUNTIF(B1,B2) I didn't get the point you were making,
    but with COUNTIF(B1:B2,B2) I thought I did.

    > > It is not SUM that is the problem here but COUNTIF. SUM can work
    > > on arrays. SUMPRODUCT can work on arrays. COUNTIF expects
    > > to work on a range array, but a criteria value. To work on an array
    > > where a single value, be that a hard-coded value or a celle reference,
    > > you have to use an array formula.

    >
    > I don't follow you there. How is it that SUMPRODUCT gets to be an array
    > formula regardless of how you enter it, but SUM has to be entered with
    > Ctl-Shift-Enter to become an array formula?


    What I mean is that SUM will work on an array anyway, COUNTIF will not work
    on an array criteria (directly that is). So it SUM doesn;y need to be array
    entered nor does SUMPRODUCT, but if you want the COUNTIF criteria to be a
    range, the whole thing gets array entered.

    > All that differs in the two versions is the context provided by SUM &
    > SUMPRODUCT. Both functions accept arrays, but SUM doesn't seem to provide

    an
    > array context for evaluating functions in its arguments, while SUMPRODUCT
    > does.


    No, again you are looking at the wrongt function.

    > Here's another example. Say you wanted to know the total number of

    letters
    > in A1:A10.
    >
    > =SUM(LEN(A1:A10)) will work only if entered as an array formula, but
    > =SUMPRODUCT(LEN(A1:A10)) will work either way.
    >
    > If, in the SUM example, you select LEN(A1:A10) and hit F9 to evaluate the
    > LEN() function, you create an array that SUM can digest, even in a

    non-array
    > formula. But SUM by itself isn't able to interpret the LEN() as an array
    > without the whole formula being entered as an array formula. SUMPRODUCT

    can.

    Yes but try =SUM(IF(LEFT(A1:A10,1)="a",LEN(A1:A10))). That doesn't work
    unless you array enter it. Again, it is not SUM that causes this but IF.

    =SUMPRODUCT(--(LEFT(A1:A10,1)="a"),LEN(A1:A10)) works non-array entered.

    > Do you know of any documentation concerning this array-producing context

    for
    > evaluating scalar functions on ranges of cells (without having to enter

    the
    > formula as an array formula)?


    Bob Umlas wrote the best paper that I have read. You can get it at
    http://www.emailoffice.com/excel/arrays-bobumlas.html

    > Is SUMPRODUCT the only function that provides this context?


    Depends upon what you mean. SUMPRODUCT is not the only function that can
    work upon ranges/arrays, but it is the only non-array function capable of
    multiple condition tests. Read more about it at

    http://www.xldynamic.com/source/xld.SUMPRODUCT.html



  8. #8
    Madiya
    Guest

    Re: Explanation of SUMPRODUCT

    Hi Bob,
    The link you provided is very usefull. I learnt a lot. Not only
    SUMPRODUCT but other functions also.
    Thanks.

    Bob Phillips wrote:
    > "Shawn O'Donnell" <[email protected]> wrote in

    message
    > news:[email protected]...
    >
    >
    > Bob Umlas wrote the best paper that I have read. You can get it at
    > http://www.emailoffice.com/excel/arrays-bobumlas.html
    >
    >


    > http://www.xldynamic.com/source/xld.SUMPRODUCT.html



  9. #9
    Bob Phillips
    Guest

    Re: Explanation of SUMPRODUCT

    Hi Madiya,

    Glad it helped.

    Bob


    "Madiya" <[email protected]> wrote in message
    news:[email protected]...
    > Hi Bob,
    > The link you provided is very usefull. I learnt a lot. Not only
    > SUMPRODUCT but other functions also.
    > Thanks.
    >
    > Bob Phillips wrote:
    > > "Shawn O'Donnell" <[email protected]> wrote in

    > message
    > > news:[email protected]...
    > >
    > >
    > > Bob Umlas wrote the best paper that I have read. You can get it at
    > > http://www.emailoffice.com/excel/arrays-bobumlas.html
    > >
    > >

    >
    > > http://www.xldynamic.com/source/xld.SUMPRODUCT.html

    >




+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1