+ Reply to Thread
Results 1 to 5 of 5

Trendline that emphasizes certain points

  1. #1
    Registered User
    Join Date
    02-06-2023
    Location
    The Netherlands
    MS-Off Ver
    Version 2301
    Posts
    2

    Trendline that emphasizes certain points

    Hello,

    I am making a trendline in Excel 365, and I wish to make the line emphasize certain points more. I have a dataset with average gas consumption of houses. However, certain values are measured more often (for instance the average of 820 cubic metres is measured 3000 times on average whereas the value 1860 cubic metres is measured 275 times). I only have the one datapoint and a value for the amount of times it is measured. Currently I tried finding a ratio between the different amount of measurements and then manually copy pasted my datapoints to get a full set, which I then put in a graph to obtain the trendline from. However, I was hoping if anyone might have a method in which I can just make the trendline prioritize the points that occur more over the others. Is this possible?

    I attached a small part of the dataset to show how I currently have done it.

    Kind regards and thanks in advance,
    Lucas
    Attached Files Attached Files

  2. #2
    Forum Guru Kaper's Avatar
    Join Date
    12-14-2013
    Location
    Warsaw, Poland
    MS-Off Ver
    most often: Office 365 in Windows environment
    Posts
    8,620

    Re: Trendline that emphasizes certain points

    Well, for a large dataset such approach will be indeed tough to implement and pretty time consuming.

    I think that 2 approaches could be easy:
    1) use Power Query to multiplicate the results (to automatically prepare content of your columns P and Q
    2) find the best fitting line using Solver and the same math (minimize sum of squared differences) which is used by excel trendline.

    The second approach is presented in a copy of your original sheet

    the calculated Average gas consumption as a linear function of Average surface area is in column I. The slope and constant are first taken from the initial fit (either from your graph, or from linest formula - calculated in G12:H12 and values copied to G16:H16. these will be variables for solver. The aim for solver will be to minimize squared differences between calculated and real-data avg gas consumption in column J.
    But this squared difference shall be multiplied (weighted) by a number of cases (column K).
    And now the weighted sum is calculated (G18). We will use solver (make sure to uncheck "set non-negative values for all variables") to minimize this sum.

    Now the fitted line (Col I vs Col G) can be added to the plot.
    Attached Files Attached Files
    Best Regards,

    Kaper

  3. #3
    Forum Guru
    Join Date
    04-13-2005
    Location
    North America
    MS-Off Ver
    2002/XP and 2007
    Posts
    15,827

    Re: Trendline that emphasizes certain points

    To my knowledge, Excel does not have built in support for weighted least squares regression. I see four possible paths forward on a problem like this:

    1) As the sign on my statistician coworker's door says, "Friends don't let friends use Excel for statistics." I would expect that a more robust stats package (like R or Minitab) would have built in support for weighted least squares. Would you be allowed to abandon Excel completely for this and seek out a more robust statistical package?

    2) Install a 3rd party add-in to Excel that knows how to do weighted least squares. When it comes to statistics in Excel, this is my go-to source: https://real-statistics.com/multiple...ar-regression/ He not only explains how the statistics works for a variety of statistical topics, but his add-in provides a wide array of functions that you can use in your spreadsheets.

    If you are required to use Excel, but cannot use outside utilities, then you are left programming your own weighted least squares.
    3) You can do like you are currently doing, where you use the information in the original data to build a new input matrix with weight copies of each data point. If this is your preferred approach, I expect that Power Query will be the easiest tool for reshaping your x,y,weight data into a suitable input matrix for the LINEST() function or the chart trendline utility (whichever you prefer. at present you seem to prefer the chart trendline utility. I usually prefer the LINEST() function).

    4) (Mathophobe warning: talk of matrix algebra ahead) If you do the research, I expect that weighted least squares (like ordinary least squares) reduces to a simple sequence of matrix operations. Excel has built in support for matrix algebra. Once one understands the sequence of matrix operations, I would expect it to be fairly simple to program those matrix operations into the spreadsheet using Excel's built in matrix functions [MMULT(), MINVERSE(), TRANSPOSE(), and so on].

    As the programmer, you have choices. Which approach to this problem do you like? How can we help you implement that solution?
    Quote Originally Posted by shg
    Mathematics is the native language of the natural world. Just trying to become literate.

  4. #4
    Forum Guru Kaper's Avatar
    Join Date
    12-14-2013
    Location
    Warsaw, Poland
    MS-Off Ver
    most often: Office 365 in Windows environment
    Posts
    8,620

    Re: Trendline that emphasizes certain points

    PS. As your data is already categorized, the quality of fit (for instance R^2) is just indicative. Obviously it would be the best to have access to full dataset with just required columns, may be you can have access to it?

    And just a side comment: The interesting issue is (as quite often with linear regression) the meaning of the intercept value. Slope is quite clear - it shows how quickly gas consumption increases with increasing area. Quite natural. But the intercept? If the area is 0, the consumption shall be 0 too. Or may be if there is any activity even in very small housing, certain amount is used - probably that's it. Or may be non-linear function shall be used for approximation? This could be better visible if not-categorized (full data) set is used.

  5. #5
    Registered User
    Join Date
    02-06-2023
    Location
    The Netherlands
    MS-Off Ver
    Version 2301
    Posts
    2

    Re: Trendline that emphasizes certain points

    Hello,

    First of all, thanks Kaper and MrShorty for your replies. To answer some of your questions/remarks:

    Kaper: I unfortunately cannot access the full dataset due to some privacy reasons, therefore I only have the dataset in which the data is already grouped in ranges for surface areas and amounts that a certain average value is measured. Your mentioning of the R^2 value just being indicative has been helpful since I was a bit fixated on that part eventhough I might not even need it since I have a different method of 'proving' that the formula I end up with is useable. Regarding your remark of the intercept, aside from the data I had put in the attachment file there is also data regarding the energy label and construction year of the different building types, so I see the intercept as a kind of baseline minimum gasconsumption for a certain buildingtype, construction year and energy label.

    MrShorty: Thanks for the multiple different suggestions. I am not required to use excel, however I end up using it out of habit since I usually do not need to do anything more than basic things. I ended up going to the website you linked and installing the add-in to run a weighted linear regression. When I ran it I the values for the intercept and slope were almost identical as when I tried my method with the ratio so I was happy to see that! The difference in values is probably because my ratio is rounded off whereas the add-in uses the actual ratio (I used 11 myself whereas it is more close to 10.9). I am planning to use that method since it is very straightforward and I can keep using excel.

    Going forward I will be using the weighted least squared regression addin from the real-statistics website. Thanks again to the both of you, especially for mentioning the weighted least squared regression since I did not know that existed. Have a nice day.

    Kind regards,
    Lucas

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Extracting Data points from trendline
    By jsuarez199 in forum Excel Formulas & Functions
    Replies: 4
    Last Post: 10-25-2013, 10:21 AM
  2. Replies: 1
    Last Post: 10-11-2011, 12:52 AM
  3. trendline for arrays with missing data points
    By melnik in forum Excel Formulas & Functions
    Replies: 5
    Last Post: 02-10-2011, 01:25 PM
  4. Replies: 7
    Last Post: 05-11-2006, 04:00 AM
  5. Replies: 7
    Last Post: 05-11-2006, 04:00 AM
  6. Getting the values of the points in a trendline
    By [email protected] in forum Excel Programming / VBA / Macros
    Replies: 7
    Last Post: 05-11-2006, 04:00 AM
  7. Trendline points
    By Blondie in forum Excel Programming / VBA / Macros
    Replies: 1
    Last Post: 03-31-2006, 04:20 PM
  8. [SOLVED] [SOLVED] Trendline excluding last 2 data points of series
    By Mary Ann in forum Excel Charting & Pivots
    Replies: 2
    Last Post: 06-15-2005, 09:05 AM

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1