Closed Thread
Results 1 to 14 of 14

Weighted Median

  1. #1
    Registered User
    Join Date
    01-21-2016
    Location
    Portland, Oregon
    MS-Off Ver
    Mac 2011
    Posts
    4

    Weighted Median

    Is it really the case that Excel can't calculate a weighted median? I need to calculate the median wage for various job positions in an industry. For each job position, wages are in one column, and the number of people who are paid that wage are counted in another column. For example,

    Col.A.......Col.B
    4.........$11.00
    3.........$12.00
    8.........$10.00
    2.........$12.00
    1.........$17.00
    5.......... $9.25

    I have about 700 entries, so there is no way I am going to create helper columns for this, and I probably shouldn't have to. A weighted median seems like a very basic and commonly needed calculation that Excel should easily be able to do. Right?

  2. #2
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678

    Re: Weighted Median

    Maybe ...

    Row\Col
    A
    B
    C
    D
    E
    1
    Qty
    Wage
    2
    4
    $11.00
    $10.00
    D2: {=MEDIAN(IF(COLUMN(INDIRECT("1:" & MAX(A2:A7))) <= A2:A7, B2:B7))}
    3
    3
    $12.00
    4
    8
    $10.00
    5
    2
    $12.00
    6
    1
    $17.00
    7
    5
    $9.25
    Entia non sunt multiplicanda sine necessitate

  3. #3
    Forum Guru Pete_UK's Avatar
    Join Date
    12-31-2011
    Location
    Warrington, England
    MS-Off Ver
    Office 2019 (still learning)
    Posts
    24,704

    Re: Weighted Median

    Try this:

    =SUMPRODUCT(A1:A6,B1:B6)/SUM(A1:A6)

    Adjust the ranges to suit your real data.

    Hope this helps.

    Pete

  4. #4
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678

    Re: Weighted Median

    That's a weighted average, Pete.

  5. #5
    Registered User
    Join Date
    01-21-2016
    Location
    Portland, Oregon
    MS-Off Ver
    Mac 2011
    Posts
    4

    Re: Weighted Median

    The array in column E above appears to work. I can't thank you enough!
    Last edited by jhinkel; 02-01-2016 at 02:50 PM. Reason: clarity

  6. #6
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678

    Re: Weighted Median

    You're welcome.

  7. #7
    Forum Expert
    Join Date
    05-01-2014
    Location
    California, US
    MS-Off Ver
    Excel 2010
    Posts
    1,795

    Re: Weighted Median

    Quote Originally Posted by jhinkel View Post
    Is it really the case that Excel can't calculate a weighted median? I need to calculate the median wage for various job positions in an industry. [....]
    I have about 700 entries, so there is no way I am going to create helper columns for this
    Quote Originally Posted by shg View Post
    Maybe [....]
    D2: {=MEDIAN(IF(COLUMN(INDIRECT("1:" & MAX(A2:A7))) <= A2:A7, B2:B7))}
    Quote Originally Posted by jhinkel View Post
    The array in column E [sic] above appears to work.
    Only if you want to calculate the actual median, not the weighted median.

    Shg's formula calculates the actual median. Of course, that is more accurate than the weighted median. But the correctness depends on what your assignment requires.

    PS.... The weighted mean makes more sense when we have data grouped in ranges; for example, 5 between 0 and 9.25, 8 between 9.25 and 10, 4 between 10 and 11, etc. Then we cannot calculate an actual mean. We must interpolate the range that contains the 50 %ile.

    Moreover, shg's formula seems unnecessarily inefficient. And it works in an "xls" file (XL2003 compatible) only if the frequency of data (column A) is not more than 256.

    First, for your example, the formula effectively calculates COLUMN(1:8). That is unnecessary, since that is the same as COLUMN(1:1). And COLUMN(1:1) returns 256 in an "xls" file and 16,384 in an "xlsx" or similar file (XL2007 and later).

    Consequently, for your actual data ("about 700 entries"), IF(COLUMN(1:1)<=A2:A7,B2:B7) would construct a 700-by-256 array (179,200 elements) or a 700-by-16384 array (11,468,800 elements!).

    Secondly, because shg's formula uses INDIRECT, Excel recalculates the array formula every time it (re)calculates anything. That can slow down data entry, especially in an "xlsx" file.

    That said, there are advantages to using INDIRECT. In particular, it protects the formula from unexpected changes when the worksheet design is changed. But IMHO, it is not worth potentially 11.5 million incessant recalculations.

    The following demonstrates these points....

    For demonstration purposes only, I simplified the COLUMN expression, since the INDIRECT is not really needed. Also, I replaced the two instances of $12.00 with $12.01 and $12.02 so we can see where each is in the array created by the formula.

    A B C D E F G H I
    1 freq amt
    2 4 $11.00
    3 3 $12.01
    4 8 $10.00
    5 2 $12.02
    6 1 $17.00
    7 5 $9.25
    8
    9 med $10.00 B9: {=MEDIAN(IF(COLUMN($1:$1)<=A2:A7,B2:B7))}
    10
    11 rows 6 B11: {=ROWS(IF(COLUMN($1:$1)<=A2:A7,B2:B7))}
    12 cols 16384 C11: {=COLUMNS(IF(COLUMN($1:$1)<=A2:A7,B2:B7))}
    13
    14 A15:I21: {=IF(COLUMN($1:$1)<=A2:A7,B2:B7)}
    15 11 11 11 11 FALSE FALSE FALSE FALSE FALSE
    16 12.01 12.01 12.01 FALSE FALSE FALSE FALSE FALSE FALSE
    17 10 10 10 10 10 10 10 10 FALSE
    18 12.02 12.02 FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    19 17 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    20 9.25 9.25 9.25 9.25 9.25 FALSE FALSE FALSE FALSE
    21 #N/A
    #N/A #N/A #N/A #N/A #N/A #N/A #N/A #N/A


    Formulas displayed with curly brackets {...} are array-entered, to wit: type the formulas without the curly brackets, then press ctrl+shift+Enter instead of just Enter. Excel displays the curly brackets in the Formula Bar to indicate that the formula is array-entered.

    Row 21 (#N/A) is outside the array. But column I is part of the array (not #N/A), presumably unintentionally. Forgive me for not displaying all 256 or 16384 columns.

    I suspect shg intended the following, which creates just MAX(A2:A7) columns:

    {=MEDIAN(IF(TRANSPOSE(ROW(INDIRECT("1:" & MAX(A2:A7)))) <= A2:A7, B2:B7))}

    That formula still limits the frequency of data (column A), but to 65,565 in "xls" files and 1,048,576 in "xlsx" files. As large as that might sound, it might not be large enough for "job positions in an industry".

    Moreover, that formula still uses INDIRECT. So it is still recalculated incessantly.

    The following demonstrates how to avoid INDIRECT by using INDEX.

    More importantly, it demonstrates how to calculate the weighted median, in principle. See the discussion that follows.

    [.... Contintued in the next posting, due to text size limitations in this forum ....]
    Last edited by joeu2004; 02-02-2016 at 01:56 PM. Reason: PS; cosmetic

  8. #8
    Forum Expert
    Join Date
    05-01-2014
    Location
    California, US
    MS-Off Ver
    Excel 2010
    Posts
    1,795

    Re: Weighted Median

    [.... Continued from ....]
    Quote Originally Posted by joeu2004 View Post
    The following demonstrates how to avoid INDIRECT by using INDEX.

    More importantly, it demonstrates how to calculate the weighted median, in principle. See the discussion that follows.
    A B C D E F G H I
    1 freq amt wt cuml wt
    2 5 $9.25 21.74% 21.74% C2: =A2/SUM($A$2:$A$7)
    3 8 $10.00 34.78% 56.52% D2: =SUM($C$2:C2)
    4 4 $11.00 17.39% 73.91%
    5 3 $12.01 13.04% 86.96%
    6 2 $12.02 8.70% 95.65%
    7 1 $17.00 4.35% 100.00%
    8
    9 wt med $9.86
    B9: =FORECAST(0.5,B2:B3,D2:D3)
    10 med $10.00 B10: {=MEDIAN(IF(TRANSPOSE(ROW(A1:INDEX(A:A,MAX(A2:A7))))<=A2:A7,B2:B7))}
    11
    12 rows 6 B12: {=ROWS(IF(TRANSPOSE(ROW(A1:INDEX(A:A,MAX(A2:A7))))<=A2:A7,B2:B7))}
    13 cols 8 C12: {=COLUMNS(IF(TRANSPOSE(ROW(A1:INDEX(A:A,MAX(A2:A7))))<=A2:A7,B2:B7))}
    14
    15 A16:I22: {=IF(TRANSPOSE(ROW(A1:INDEX(A:A,MAX(A2:A7))))<=A2:A7,B2:B7)}
    16 9.25 9.25 9.25 9.25 9.25 FALSE FALSE FALSE #N/A
    17 10 10 10 10 10 10 10 10 #N/A
    18 11 11 11 11 FALSE FALSE FALSE FALSE #N/A
    19 12.01 12.01 12.01 FALSE FALSE FALSE FALSE FALSE #N/A
    20 12.02 12.02 FALSE FALSE FALSE FALSE FALSE FALSE #N/A
    21 17 FALSE FALSE FALSE FALSE FALSE FALSE FALSE #N/A
    22 #N/A
    #N/A #N/A #N/A #N/A #N/A #N/A #N/A #N/A


    IMHO, the weighted median calculation is anything but "very basic". I don't know if it a "commonly needed calculation". But I would agree it should be implemented in a function: a VBA UDF, if not an Excel built-in or Data Analysis function.

    The data should be sorted by the numeric categories ("amt" in column B), and the weight factors (column C) and cumulative weight factors (column D) should be calculated. The weight factor is the category frequency divided by the sum of the frequencies, so that the sum of the weight factors is 1.

    Then the weighted median is the interpolated "amt" that corresponds to a weight factor of 50%. In this example, it is between B2 and B3.

    Where there are only 2 data points, FORECAST correctly interpolates along the line between them. That is, FORECAST(0.5,B2:B3,D2:D3) returns B2 + (B3-B2)*(0.5-D2)/(D3-D2).

    I realize this does not meet your objective of avoiding helper cells. And it is incomplete insofar as determining the B2 and B3.

    Perhaps some adventurous soul will provide a (torturous) formula to do all this. But as I said, I think it is easier and better implemented in a function.
    Last edited by joeu2004; 02-02-2016 at 02:05 PM. Reason: cosmetic; explain FORECAST

  9. #9
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678

    Re: Weighted Median

    shg's formula seems unnecessarily inefficient
    joeu's post seems unnecessarily verbose.

  10. #10
    Registered User
    Join Date
    06-01-2013
    Location
    Peru
    MS-Off Ver
    Excel 2010
    Posts
    2

    Re: Weighted Median

    thought I had a solution. not yet.
    Last edited by here to help; 06-08-2017 at 02:09 PM.

  11. #11
    Registered User
    Join Date
    03-10-2014
    Location
    France
    MS-Off Ver
    Excel 2019
    Posts
    11

    Re: Weighted Median

    I've worked with Excel quite a lot, but I'm baffled by the formula above in E2. Could someone please explain what the formula is doing, especially the first part of the IF command?

    Thanks for any help.

  12. #12
    Forum Moderator AliGW's Avatar
    Join Date
    08-10-2013
    Location
    Retired in Ipswich, Suffolk, but grew up in Sawley, Derbyshire (England)
    MS-Off Ver
    MS 365 Subscription Insider Beta Channel v. 2404 (Windows 11 22H2 64-bit)
    Posts
    80,410

    Re: Weighted Median

    Administrative Note:

    We are happy to help, however whilst you feel your request is similar to this thread, experience has shown that things soon get confusing when answers refer to particular cells/ranges/sheets which are unique to your post and not relevant to the original.

    Please see Forum Rule #4 about hijacking and start a new thread for your query.

    If you are not familiar with how to start a new thread see the FAQ: How to start a new thread
    Ali


    Enthusiastic self-taught user of MS Excel who's always learning!
    Don't forget to say "thank you" in your thread to anyone who has offered you help.
    You can reward them by clicking on * Add Reputation below their user name on the left, if you wish.

    Forum Rules (updated August 2023): please read them here.

  13. #13
    Registered User
    Join Date
    03-10-2014
    Location
    France
    MS-Off Ver
    Excel 2019
    Posts
    11

    Re: Weighted Median

    Sorry. I wasn't trying to post a new question. I just was asking for a clarification of a solution that had been posted. I was able to copy and paste the solution, and I could see that it worked, but I didn't understand how it worked or what it was doing. Should I copy the solution, paste it to a new thread, and ask for an explanation of the formula?

    Again, I apologize.

  14. #14
    Forum Moderator AliGW's Avatar
    Join Date
    08-10-2013
    Location
    Retired in Ipswich, Suffolk, but grew up in Sawley, Derbyshire (England)
    MS-Off Ver
    MS 365 Subscription Insider Beta Channel v. 2404 (Windows 11 22H2 64-bit)
    Posts
    80,410

    Re: Weighted Median

    Nonetheless, you still need to open your own thread, please, as explained in my post above.

    This is an old thread and is now being closed.

Closed Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Calculating a Weighted Median
    By kaybee in forum Excel Programming / VBA / Macros
    Replies: 5
    Last Post: 08-14-2019, 06:35 AM
  2. How to do a weighted median?
    By bnelson333 in forum Excel Formulas & Functions
    Replies: 13
    Last Post: 02-01-2016, 01:59 PM
  3. Weighted Median
    By Sammz in forum Excel Formulas & Functions
    Replies: 5
    Last Post: 09-01-2015, 07:56 AM
  4. [SOLVED] Need a UDF to calculate 'weighted median'
    By alice2011 in forum Excel Programming / VBA / Macros
    Replies: 14
    Last Post: 07-29-2015, 09:59 AM
  5. Median Indirect: Find median in range and bring back adjacent cells
    By Keelin in forum Excel Formulas & Functions
    Replies: 12
    Last Post: 06-27-2014, 08:31 AM
  6. [SOLVED] Median Ifs, need to find median $ amount per deal for each year
    By xenomorph8472 in forum Excel Formulas & Functions
    Replies: 3
    Last Post: 11-13-2012, 02:01 PM
  7. Weighted Median with a condition (group by)
    By jaganmskcc in forum Excel Programming / VBA / Macros
    Replies: 1
    Last Post: 03-27-2012, 06:07 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1