+ Reply to Thread
Results 1 to 12 of 12

Matching a value that appears multiple times

  1. #1
    Registered User
    Join Date
    04-26-2006
    Posts
    7

    Matching a value that appears multiple times

    I have 2 columns with data in them, basically representing a gaussian distribution. Column A has the "X-axis" values and so is uniformly ascending with no duplicates. Column B has the "Y-axis" values and increases up to a maximum and then decreases again (this is data from an instrument and so its not completely smooth but is close). An example is below.

    0 4
    1 8
    2 16
    3 27
    4 50
    5 27
    6 16
    7 8
    8 4

    What I would like to do is get the 2 Column A values where the corresponding column B value is half of the max (in the case above, 25 is not available so the closest is 27). I am trying to calculate the difference between these values, so in the example, I would have 5-3. Is there a way to do this?

    Thanks.

  2. #2
    Domenic
    Guest

    Re: Matching a value that appears multiple times

    In your example, 27 is the closest value, and two such values exist.
    Will the two closest values always be the same or can they differ? For
    example, can your data contain the following?

    0 4
    1 8
    2 16
    3 27 <-----
    4 50
    5 23 <-----
    6 16
    7 8
    8 4

    In article <[email protected]>,
    BKGT <[email protected]> wrote:

    > I have 2 columns with data in them, basically representing a gaussian
    > distribution. Column A has the "X-axis" values and so is uniformly
    > ascending with no duplicates. Column B has the "Y-axis" values and
    > increases up to a maximum and then decreases again (this is data from
    > an instrument and so its not completely smooth but is close). An
    > example is below.
    >
    > 0 4
    > 1 8
    > 2 16
    > 3 27
    > 4 50
    > 5 27
    > 6 16
    > 7 8
    > 8 4
    >
    > What I would like to do is get the 2 Column A values where the
    > corresponding column B value is half of the max (in the case above, 25
    > is not available so the closest is 27). I am trying to calculate the
    > difference between these values, so in the example, I would have 5-3.
    > Is there a way to do this?
    >
    > Thanks.


  3. #3
    Registered User
    Join Date
    04-26-2006
    Posts
    7
    Thanks for the reply. Yes, the two values could be different as in your example. I would like to do something like vertical lookup but I don't want an error returned if an exact match is not found.
    Last edited by BKGT; 04-26-2006 at 02:03 PM.

  4. #4
    Domenic
    Guest

    Re: Matching a value that appears multiple times

    After taking another look at this, what result would you expect under
    the next two scenarios?

    First scenario...

    0 4
    1 8
    2 16
    3 48
    4 50
    5 18
    6 16
    7 8
    8 4

    Second scenario...

    0 4
    1 8
    2 18
    3 20
    4 50
    5 48
    6 18
    7 8
    8 4

    In article <[email protected]>,
    BKGT <[email protected]> wrote:

    > Thanks for the reply. Yes, the two values could be different as in your
    > example.


  5. #5
    Registered User
    Join Date
    04-26-2006
    Posts
    7
    Hi Domenic,

    Ideally, I would like to get 5 - 2 from the first scenario and 6 - 3 from the second. I believe the point you were getting at is the rounding up or down problem if the next value is much larger or much lower than half of the max value. The real data I have to work with probably won't be largely effected by this because there are alot of data points so the error should be minimal. I don't suppose there is a way to interpolate between two points if an exact match is not found.

  6. #6
    Harlan Grove
    Guest

    Re: Matching a value that appears multiple times

    Domenic wrote...
    >After taking another look at this, what result would you expect under
    >the next two scenarios?
    >
    >First scenario...
    >
    >0 4
    >1 8
    >2 16
    >3 48
    >4 50
    >5 18
    >6 16
    >7 8
    >8 4

    ....

    I think you'd have to assume the distribution should be symmetric and
    that the max value is the median as well as the mode. If so, and if the
    first column were named X and the second Y, you could use the array
    formula

    =INDEX(X,MATCH(MIN(ABS(Y-MAX(Y)/2)),ABS(Y-MAX(Y)/2),0))-INDEX(X,
    MOD(COUNT(Y)-MATCH(MIN(ABS(Y-MAX(Y)/2)),ABS(Y-MAX(Y)/2),0),COUNT(Y))+1)


  7. #7
    Registered User
    Join Date
    04-26-2006
    Posts
    7
    The distribution should be symmetric so that is a good assumption. However, the max value may not necessarily be the median. I will try your suggestion. Thanks.
    Last edited by BKGT; 04-26-2006 at 02:38 PM.

  8. #8
    Domenic
    Guest

    Re: Matching a value that appears multiple times

    In article <[email protected]>,
    "Harlan Grove" <[email protected]> wrote:

    > I think you'd have to assume the distribution should be symmetric...


    Is that because we're dealing with a 'gaussian distribution'? I
    suspected as much, but couldn't be sure since I'm not familiar with it.

    > If so, and if the first column were named X and the second Y, you could use the array
    > formula


    > =INDEX(X,MATCH(MIN(ABS(Y-MAX(Y)/2)),ABS(Y-MAX(Y)/2),0))-INDEX(X,
    > MOD(COUNT(Y)-MATCH(MIN(ABS(Y-MAX(Y)/2)),ABS(Y-MAX(Y)/2),0),COUNT(Y))+1)


    Is there any reason why MOD is included in the formula? I've analyzed
    it, but can't seem to find one...

  9. #9
    MartinW
    Guest

    Re: Matching a value that appears multiple times


    Hi BKGT,

    One way to get an exact match is to use an x,y scatter
    graph with 2 data series in it.
    Assuming your data is in columns A + B and the following
    0 4
    1 8
    2 16
    3 27
    4 50
    5 23
    6 16
    7 8
    8 4

    Data series 1 would be
    0 4
    1 8
    2 16
    3 27
    4 50 i.e. (A1:B5)

    And data series 2 would be
    4 50
    5 23
    6 16
    7 8
    8 4 i.e. (A5:B9)

    You can then add a polynomial (4th order) trendline to
    each series and check 'show equation on chart'.

    The equation for the first series is

    y = 0.4167x4 - 2.6667x3 + 7.0833x2 - 0.8333x + 4

    And the second series is

    y = 1.0833x4 - 27.333x3 + 256.42x2 - 1067.2x + 1688

    Now ALL?? you have to do is plug your y value into
    each equation (MAX/2 or 25 in this example) and you
    can calculate the value of x. Unfortunately my
    mathematical skills aren't up to the task at this bit. I can
    extract the equations from linear regression trendlines and
    use them in formulae but polynomials just leave me
    blubbering. Any one else care to step up to the plate and
    show how to extract these equations into usable formula
    that will reduce the process to
    1.Data Input 2.Read off results

    Regards
    Martin



  10. #10
    Registered User
    Join Date
    04-26-2006
    Posts
    7
    Hi Martin,

    Thanks for the suggestions. I was not aware that regression equations could be extracted to use in formulas. Could you provide info on how this can be done?

    The goal of this spreadsheet is for me to just paste the data that I receive from the instrument I am using and have excel instantly calculate all the information that I am looking for without further effort on my part . I have come up with an alternative way to calculate what I'd like if I can somehow set a range based on the value in another cell. I have been able to write the formula that picks the x-value (column A) where the maximum y-value (column B) occurs. If I could use that information to set the range for the MATCH function, I could get around the problem I was having with it and just pick the two values that are above and below the value I was looking for and interpolate to get the exact number I want (I have done this in my spreadsheet by manually setting the range based on my calculation for the x-value where the y-value is max). Is this possible?

  11. #11
    MartinW
    Guest

    Re: Matching a value that appears multiple times

    Hi BKGT,

    I'm sure what you're after is possible and I'm fairly sure
    that the approach should be similar to the way I use linear
    regression.

    The way I use linear regression is as follows.

    Insert the following data
    Column A Column B
    0 0.1
    1 0.5
    2 1.1
    3 1.8
    4 2.3
    5 3.2

    Now select A1 to B6
    Go to Insert>Chart
    Select "xy scatter" subtype no lines
    Then click finish
    Right click on one of the points on the chart
    and select add trendline
    Select linear click options tab and check show equation
    on chart then click ok.

    (forgive me for being too fundamental here you're
    probably 10 steps ahead of this but the written word
    is not as easy as sitting next someone)

    You now have an xy chart showing the data points, the
    line of best fit and the equation for the line in 'y=Mx +C'
    format where M= Gradient + C= Y-Intercept.

    In any blank cell say G2 enter =SLOPE(B1:B6,A1:A6)
    and G3 enter =INTERCEPT(B1:B6,A1:A6) these
    cells should agree with the values on the chart.

    Now G2=M and G3=C so, using normal
    equation solving procedures you can plug a known
    y value in to return an unknown x or vice versa.

    Of course the chart and the trendline are really obsolete
    so there is no real need to use them but sometimes I do
    go back to using them when something looks screwy and
    I can't work out why, then delete them when I'm done.

    I'm sure the same procedure can be used in your case.
    The trouble is getting that polynomial equation into your
    formulas. I can only get as far as the chart with the correct
    equations showing.

    HTH
    Martin













  12. #12
    Jerry W. Lewis
    Guest

    Re: Matching a value that appears multiple times

    =LINEST(B1:B5,A1:A5^{1,2,3,4})
    will return the coefficients of a 4th degree polynomial. Alternately,
    Tushar Mehta has enhanced code by David Braden to extract coefficients
    directly from a chart trendline
    http://groups.google.com/group/micro...da30f29434786d
    Note that for the chart trendline, you should format the equation to display
    scientific notation with 14 decimal places.

    In general, you should evaluate whether you can tolerate the wiggle of a
    polynomial between the fitted points. It doesn't look too bad here.

    Jerry

    "BKGT" wrote:

    >
    > Hi Martin,
    >
    > Thanks for the suggestions. I was not aware that regression equations
    > could be extracted to use in formulas. Could you provide info on how
    > this can be done?
    >
    > The goal of this spreadsheet is for me to just paste the data that I
    > receive from the instrument I am using and have excel instantly
    > calculate all the information that I am looking for without further
    > effort on my part . I have come up with an alternative way to
    > calculate what I'd like if I can somehow set a range based on the value
    > in another cell. I have been able to write the formula that picks the
    > x-value (column A) where the maximum y-value (column B) occurs. If I
    > could use that information to set the range for the MATCH function, I
    > could get around the problem I was having with it and just pick the two
    > values that are above and below the value I was looking for and
    > interpolate to get the exact number I want (I have done this in my
    > spreadsheet by manually setting the range based on my calculation for
    > the x-value where the y-value is max). Is this possible?
    >
    >
    > --
    > BKGT
    > ------------------------------------------------------------------------
    > BKGT's Profile: http://www.excelforum.com/member.php...o&userid=33862
    > View this thread: http://www.excelforum.com/showthread...hreadid=536387
    >
    >


+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1