Originally Posted by
mccskey
If my quantities were equally distributed at the 50% percentile than my average price would be the weighted average price of $77.48.
Perhaps. I'm not bothering to vet your assertion about any dependency on the distribution.
But the point is: in fact, we know nothing about the distribution.
For example, for transaction #1 with an average $100 for a quantity of 100,000, the individual prices might be as low as $1 and as high as 9,900,001. So even if the n-th percentile were in transation #1, we cannot say what the "average price" of the n-th percentile is.
So with the data that you provide, the best that we can do is treat each of the 100,000 items in transaction #1 as having a price of $100, the average.
-----
That said, there are many things wrong with your question.
First, 50%ile is the median, not the average (mean), necessarily.
Second, you already calculate the weighted average -- or the best that we can do -- in C11.
So, either you have the answer that you require already. Or we should interpret the problem as: determine the median "average price".
-----
To that end, percentiles are based on an order set of data.
So the 10%ile is not among the prices in transaction #1 because 10% of 654476 (total) is 65447.6, which is less than 100,000.
Instead, we must create the following table, conceptually or actually, sorting the original data based on "average price".
|
B |
C |
D |
16 |
Cuml Qty |
Qty |
Avg Price |
17 |
0 |
171,984 |
$50.00 |
18 |
171,985 |
250,000 |
$75.00 |
19 |
421,985 |
28,854 |
$90.00 |
20 |
450,839 |
88,500 |
$99.00 |
21 |
539,339 |
100,000 |
$100.00 |
22 |
639,339 |
6,793 |
$110.00 |
23 |
646,132 |
8,345 |
$150.00 |
Since 65447.6 is less than 171,984, the 10%ile "average price" is $50, since we must treat each item as having the same $50 "average price", based on my first comment.
Likewise, since 130,895.2 (20% of 654,476) is also less than 171,984, the 20%ile "average price" is also $50.
So the formulas in G2:G12 are of the form (in G2):
=VLOOKUP(F2*$C$10, $B$17:$D$23, 3)
PS.... If the n-th percentile were between 171,984 and 171,985, for example, we should interpolate between $50 and $75. IMHO, it's a complication that is not worth the bother, since the probability seems unlikely.
-----
Normally, with "grouped data" like what you have, we would interpolate the lower and upper limits of the group in order to determine the n-th percentile.
But again, we do not have that data.
Bookmarks