+ Reply to Thread
Results 1 to 6 of 6

Re: histogram x-axis data (BINs) are not accounting for all the data entries

  1. #1
    Registered User
    Join Date
    12-09-2013
    Location
    cork
    MS-Off Ver
    Excel 2010
    Posts
    15

    Re: histogram x-axis data (BINs) are not accounting for all the data entries

    For ease of explanation, attached is the data in question and the three histograms based on the data in columns B and column J (BINs).

    Problem statement: with a total of 14 entries in column B, one expects to see 14 entries as an output in the histogram. However, it only shows 13?

    Question: why does the above occur? (ignore the colour formatting)

    Second problem but not as big an issue is how to prevent the "more" occurring in the x-axis especially in the middle. (I am aware this can be removed with the BIN when produced by manual edit) is there another method?

    Aside: I have about 5000 data entries of type number (transaction amount as per heading in column B), sample data pulled out as a spot check.

    Attempted actions: playing around with different BIN ranges, playing around with different histogram types.

    Result: always the data produced is one less than the data I have at hand (in this sample data provided).

    thank you in advance,
    Attached Files Attached Files
    Last edited by Mr.Magoo; 11-26-2014 at 02:00 AM. Reason: addition of two further attachments

  2. #2
    Forum Guru
    Join Date
    04-13-2005
    Location
    North America
    MS-Off Ver
    2002/XP and 2007
    Posts
    15,819

    Re: histogram x-axis data (BINs) are not accounting for all the data entries

    Are you certain you included B2 in the "input range" field of the histogram tool? It appears to me that all of your attempts are failing to include this point in the counts. When I try to recreate your histograms, it correctly counts this point in the totals. My best guess at this point is that you failed to include B2 in the input. Other than that, I cannot recreate your problem. (One of the problems with using the histogram tool is that Excel does not record your inputs, so we have no way of verifying ourselves what you used for inputs unless you tell us). Verify what you have put in the input fields of the histogram utility and include them here so we can more accurately recreate exactly what you have done that is giving erroneous results.

    One thought -- especially to get rid of the "more" label entry. The core part of the histogram tool is to use the built in FREQUENCY() function to do the counting (https://support.office.com/en-us/art...8-a138581492f8 ) Using the FREQUENCY() function will not automatically put the "more" in the last entry.

    I find that I prefer to use the worksheet function rather than the data analysis utility to perform the counts. Building the chart is straightforward from there. The main advantage to using the worksheet function is that the calculations (and resulting chart) will update automatically rather than waiting for me to invoke the data analysis utility.
    Quote Originally Posted by shg
    Mathematics is the native language of the natural world. Just trying to become literate.

  3. #3
    Registered User
    Join Date
    12-09-2013
    Location
    cork
    MS-Off Ver
    Excel 2010
    Posts
    15

    Re: histogram x-axis data (BINs) are not accounting for all the data entries

    many thanks for advise and prompt response - much appreciated.

    I have added two further attachments - data (1) and alternative.

    Yes, certain data points are accounted from B2 cell. screen shot shown for validation. BUt thanks for pointing it out.

    I will look at the frequency function.
    In passing this raises another question which seeks to use the "alternative" work books attached for reference.

    Note how the frequency is created by i.e. count(X1) - Count(X0), where the X variables are the respective BIN entry points (lower range of each BIN entry) to arrive a the correct output.

    PROBLEM STATEMENT/QUESTION: as you will note each entry in the frequency column has a manual entry required. How can be be modified such that these cells are automated to pick up the BIN range elements (lower range).

    I.e. this is not required for the sample data set provided, but in this real case, there are about 5000 row with BINS ranging from 100 to 70000. (note these are monetary amounts.

    PLease advise

  4. #4
    Registered User
    Join Date
    12-09-2013
    Location
    cork
    MS-Off Ver
    Excel 2010
    Posts
    15

    Re: histogram x-axis data (BINs) are not accounting for all the data entries

    actually, it turns out its not its not completely impractical - however if there is a smarter way - keen to understand.

  5. #5
    Forum Guru
    Join Date
    04-13-2005
    Location
    North America
    MS-Off Ver
    2002/XP and 2007
    Posts
    15,819

    Re: histogram x-axis data (BINs) are not accounting for all the data entries

    From testdata(1) -- I note that your screenshot of the "histogram" dialog shows that you have checked the "labels" checkbox. This checkbox tells Excel that the first entry in the "input range" is a simple text entry that is NOT to be included in counting frequency. When your input range is B2:B15 and this box is checked, Excel will not include B2 in the frequency count -- effectively leaving only 13 data points. Solution would be a) leave input range as is and uncheck the labels checkbox or b) leave labels box checked and set input range to B1:B15.

    From alternative -- the main problem I see with this approach is that you are entering something like "aa-bb" as your bin, which is a text string consisting of two pieces of information. This means you will need to use the various text manipulation functions to extract each piece of information and parse it to the appropriate part of your function -- an approach I personally despise. I believe spreadsheet programming is a lot easier when each cell contains one piece of information -- in this case, have a cell for the "low" value for that bin, and a 2nd cell for the "high" value for the bin. For a histogram, one would normally expect that the "low" value for a bin is identical to the "high" value for the previous bin, so there is usually no need to duplicate this information. Using a COUNTIF() function like you are doing, I would probably set it up the exact same way I set up the bins for the FREQUENCY() function or histogram utility -- simply a column of numbers that represents the "high" value for that bin. My countif() function could then be =countif(input_range,"<=this row's bin value") - countif(input_range,"<=previous row's bin value"). In the end, though, the only reason I could see for using this kind of "countif()" combination is if I really could not figure out how to use the FREQUENCY() function.
    Last edited by MrShorty; 11-26-2014 at 04:36 PM.

  6. #6
    Registered User
    Join Date
    12-09-2013
    Location
    cork
    MS-Off Ver
    Excel 2010
    Posts
    15

    Re: histogram x-axis data (BINs) are not accounting for all the data entries

    brilliant and perfect answer - thank you very much. I clearly had overlooked the "labels" aspect. It unfortunately did not occur to me!
    thank you for enlightening me.

    I have got the frequency function to work out. Thank you for sharing!

    this problem appears to be solved with you insights!

    thanks.
    MM.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. How do I build a histogram with two bins?
    By flyboy54 in forum Excel General
    Replies: 0
    Last Post: 10-10-2012, 05:44 PM
  2. Replies: 10
    Last Post: 09-18-2012, 10:33 AM
  3. Automatically Regenerate Histogram Bins By/Via Data Analysis Toolpak
    By splendidus in forum Excel Programming / VBA / Macros
    Replies: 5
    Last Post: 08-06-2012, 06:58 AM
  4. Using BINS in Excel 2007 (not for histogram)
    By jon.kuhn in forum Excel General
    Replies: 2
    Last Post: 05-08-2012, 06:06 AM
  5. [SOLVED] Histogram Bins Created From Sum Of Two Cells
    By pdberger in forum Excel Programming / VBA / Macros
    Replies: 3
    Last Post: 06-20-2005, 03:05 PM

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1