+ Reply to Thread
Results 1 to 9 of 9

Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

  1. #1
    Registered User
    Join Date
    10-11-2013
    Location
    St. Louis, MO
    MS-Off Ver
    Excel 2010
    Posts
    5

    Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

    I took a visual basic class in high school and pretty much don't remember how to write any code. I can record a macro fine, but when it comes to a blank module screen, yeah right. I have a data analysis that I'm trying to perform on two sets of data. I have attached the excel sheet here to make it easier.

    Basically, Data 1 and Data two need to be analyzed and displayed on a "results" tab. Both data sets will be analyzed the same way. I need to compare the first three columns, Confidence Level, Sequence, and Protein Group Accessions. For instance, if A2 = A3, B2 = B3, and C2 = C3, then that would be considered a "hit" for my research. I need to find any duplicates within these three columns. I do NOT care about singles. If it is a single, it might as well be erased from my sheet.

    Once the duplicates are found, possibly 5 or more matches, I want to average the Isolation Inerference, RT[min], and 113 column. Then, I would like to report this data on a results tab listing the duplicate value once with the average of all the duplicates in the set.

    This data analysis would be done twice, once for Data 1 and once for Data 2. The results tab would have 9 columns - the first three columns, and results for data 1 (3 columns), and results for data 2 (another 3 columns).

    I could be doing this for up to 15000 data points. Basically, I know how to accomplish this with formulas such as =COUNTIF("":"","") = 1, then sorting, then using AVERAGEIF, then VLOOKUP to match them on the final page. I just want to try to get some code for this because I need to do this data analysis with different sets of data for the next 4 years, so have a one button completes all would be nice.

    Let me know if you have questions! Any sample of code, partial solution, or even full solution of code (which would be awesome!) would be much appreciated.test.xlsx

  2. #2
    Forum Expert MickG's Avatar
    Join Date
    11-23-2007
    Location
    Banbury,Oxfordshire
    Posts
    2,650

    Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

    Try this:-
    Not sure why you have 9 columns for results ???
    Please Login or Register  to view this content.
    Regards Mick

  3. #3
    Registered User
    Join Date
    10-11-2013
    Location
    St. Louis, MO
    MS-Off Ver
    Excel 2010
    Posts
    5

    Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

    Ok this looks like an awesome start, but Data 2 in results tab isn't filled out. The averages of the Isolation Interference, RT, and 113 should only happen within data sets, and these averages need to be compared side by side for the two data sets on the results tab.


    The 9 columns in the results page will display the first three columns in the Data 1 and Data 2 tabs, "Confidence Level", "Sequence", "Protein Group Accessions". The other six will be "Isolation Inlet", "RT [min]", and "113", one for data set 1 (3 columns) and one for data set 2 (3 columns) totaling six columns.

    The idea is that if Data 1 has duplicates and Data 2 also has duplicates, then display what they both had in common (remembering that the Confidence Level, Sequence, and Protein Group Accessions have to match for it to be in common) and then compare the averages on the results tab...

    Does that make sense?

  4. #4
    Registered User
    Join Date
    10-11-2013
    Location
    St. Louis, MO
    MS-Off Ver
    Excel 2010
    Posts
    5

    Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

    Ok this looks like an awesome start, but Data 2 in results tab isn't filled out. The averages of the Isolation Interference, RT, and 113 should only happen within data sets, and these averages need to be compared side by side for the two data sets on the results tab.


    The 9 columns in the results page will display the first three columns in the Data 1 and Data 2 tabs, "Confidence Level", "Sequence", "Protein Group Accessions". The other six will be "Isolation Inlet", "RT [min]", and "113", one for data set 1 (3 columns) and one for data set 2 (3 columns) totaling six columns.

    The idea is that if Data 1 has duplicates and Data 2 also has duplicates, then display what they both had in common (remembering that the Confidence Level, Sequence, and Protein Group Accessions have to match for it to be in common) and then compare the averages on the results tab...

    Does that make sense?

  5. #5
    Forum Expert MickG's Avatar
    Join Date
    11-23-2007
    Location
    Banbury,Oxfordshire
    Posts
    2,650

    Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

    Try this:-
    Please Login or Register  to view this content.
    Regards Mick

  6. #6
    Registered User
    Join Date
    10-11-2013
    Location
    St. Louis, MO
    MS-Off Ver
    Excel 2010
    Posts
    5

    Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

    Ok, when I ran this, it looks like on the results tab it shows blanks sometimes for Isolation Interference, RT, or 113. If all three of the columns are blank for Data 1, but they aren't blank for data 2, the whole row should be deleted. Same thing if Data 2 has blanks, but Data 1 shows values, the whole row should be deleted. I think I solved this problem (I can do simple code, by adding the following rungs at the bottom of the routine:

    Please Login or Register  to view this content.
    That seems to fix that problem on the results tab. If Colum D is empty, so will column E and F. Same thing for Column G. Another problem I'm having is this - I tried to add items on the Data 1 and Data 2 tab to double check that it would pick them up. I typed in "Hello" in the Isolation, RT, and 113 column and then 5 in the other three columns. I did it twice on each Data 1 and Data 2 sheet. This would mean that it should show up on the results tab one, with every value for Data 1 and Data 2 being 5. It didn't show up, but I'm not sure why. When you set the range as "A2" and do the rowscount, I thought it would pick up new items. I plan on copying in new data and using this same sheet to analyze it, but sometimes my data length rows varies to up to 15000 items. How would I make the code account for that?

  7. #7
    Forum Expert MickG's Avatar
    Join Date
    11-23-2007
    Location
    Banbury,Oxfordshire
    Posts
    2,650

    Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

    The code should account for any new rows added from "A2" on.
    Ref Your Test:-
    The reason that your test data did not show is, the code is organised to sort the Results so that all the "High" appear first then the "Low" then the "Medium", if your column "A" test data is not one of those then it will not show in the results.

    I have Tried "Test" rows on both sheets that show on sheet "Results"

    Try:- Modified code below with your "Delete empty rows" code added and code to Clear sheet "Results" before running of code.
    Please Login or Register  to view this content.
    Regards Mick

  8. #8
    Registered User
    Join Date
    10-11-2013
    Location
    St. Louis, MO
    MS-Off Ver
    Excel 2010
    Posts
    5

    Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

    Okay, I retested it and it works perfectly! You have been so awesome, thank you for you help with this issue!

  9. #9
    Forum Expert MickG's Avatar
    Join Date
    11-23-2007
    Location
    Banbury,Oxfordshire
    Posts
    2,650

    Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

    I'm not sure that code is giving you the right results. I think its possible giving you cases where row data "A,B,C" only appears onces on one of the sheets.
    Below is a differents code that hopefully stops that happening, perhaps you would like to try it.
    Results on sheet "New Results"

    Please Login or Register  to view this content.
    Regards Mick
    Last edited by MickG; 10-14-2013 at 07:30 AM.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Filter data based on columns, display unique row data
    By djarcadian in forum Excel General
    Replies: 14
    Last Post: 05-18-2013, 07:18 PM
  2. Replies: 3
    Last Post: 07-04-2012, 11:15 PM
  3. Replies: 0
    Last Post: 07-21-2010, 01:43 PM
  4. Sorting unique data from more columns
    By testaredescript in forum Excel Formulas & Functions
    Replies: 8
    Last Post: 05-14-2008, 12:03 PM
  5. comparing two columns of data to find common values
    By patman in forum Excel General
    Replies: 2
    Last Post: 07-25-2006, 10:10 AM

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1