# Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

1. ## Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

I took a visual basic class in high school and pretty much don't remember how to write any code. I can record a macro fine, but when it comes to a blank module screen, yeah right. I have a data analysis that I'm trying to perform on two sets of data. I have attached the excel sheet here to make it easier.

Basically, Data 1 and Data two need to be analyzed and displayed on a "results" tab. Both data sets will be analyzed the same way. I need to compare the first three columns, Confidence Level, Sequence, and Protein Group Accessions. For instance, if A2 = A3, B2 = B3, and C2 = C3, then that would be considered a "hit" for my research. I need to find any duplicates within these three columns. I do NOT care about singles. If it is a single, it might as well be erased from my sheet.

Once the duplicates are found, possibly 5 or more matches, I want to average the Isolation Inerference, RT[min], and 113 column. Then, I would like to report this data on a results tab listing the duplicate value once with the average of all the duplicates in the set.

This data analysis would be done twice, once for Data 1 and once for Data 2. The results tab would have 9 columns - the first three columns, and results for data 1 (3 columns), and results for data 2 (another 3 columns).

I could be doing this for up to 15000 data points. Basically, I know how to accomplish this with formulas such as =COUNTIF("":"","") = 1, then sorting, then using AVERAGEIF, then VLOOKUP to match them on the final page. I just want to try to get some code for this because I need to do this data analysis with different sets of data for the next 4 years, so have a one button completes all would be nice.

Let me know if you have questions! Any sample of code, partial solution, or even full solution of code (which would be awesome!) would be much appreciated.test.xlsx

2. ## Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

Try this:-
Not sure why you have 9 columns for results ???
``Please Login or Register  to view this content.``
Regards Mick

3. ## Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

Ok this looks like an awesome start, but Data 2 in results tab isn't filled out. The averages of the Isolation Interference, RT, and 113 should only happen within data sets, and these averages need to be compared side by side for the two data sets on the results tab.

The 9 columns in the results page will display the first three columns in the Data 1 and Data 2 tabs, "Confidence Level", "Sequence", "Protein Group Accessions". The other six will be "Isolation Inlet", "RT [min]", and "113", one for data set 1 (3 columns) and one for data set 2 (3 columns) totaling six columns.

The idea is that if Data 1 has duplicates and Data 2 also has duplicates, then display what they both had in common (remembering that the Confidence Level, Sequence, and Protein Group Accessions have to match for it to be in common) and then compare the averages on the results tab...

Does that make sense?

4. ## Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

Ok this looks like an awesome start, but Data 2 in results tab isn't filled out. The averages of the Isolation Interference, RT, and 113 should only happen within data sets, and these averages need to be compared side by side for the two data sets on the results tab.

The 9 columns in the results page will display the first three columns in the Data 1 and Data 2 tabs, "Confidence Level", "Sequence", "Protein Group Accessions". The other six will be "Isolation Inlet", "RT [min]", and "113", one for data set 1 (3 columns) and one for data set 2 (3 columns) totaling six columns.

The idea is that if Data 1 has duplicates and Data 2 also has duplicates, then display what they both had in common (remembering that the Confidence Level, Sequence, and Protein Group Accessions have to match for it to be in common) and then compare the averages on the results tab...

Does that make sense?

5. ## Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

Try this:-
``Please Login or Register  to view this content.``
Regards Mick

6. ## Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

Ok, when I ran this, it looks like on the results tab it shows blanks sometimes for Isolation Interference, RT, or 113. If all three of the columns are blank for Data 1, but they aren't blank for data 2, the whole row should be deleted. Same thing if Data 2 has blanks, but Data 1 shows values, the whole row should be deleted. I think I solved this problem (I can do simple code, by adding the following rungs at the bottom of the routine:

``Please Login or Register  to view this content.``
That seems to fix that problem on the results tab. If Colum D is empty, so will column E and F. Same thing for Column G. Another problem I'm having is this - I tried to add items on the Data 1 and Data 2 tab to double check that it would pick them up. I typed in "Hello" in the Isolation, RT, and 113 column and then 5 in the other three columns. I did it twice on each Data 1 and Data 2 sheet. This would mean that it should show up on the results tab one, with every value for Data 1 and Data 2 being 5. It didn't show up, but I'm not sure why. When you set the range as "A2" and do the rowscount, I thought it would pick up new items. I plan on copying in new data and using this same sheet to analyze it, but sometimes my data length rows varies to up to 15000 items. How would I make the code account for that?

7. ## Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

The code should account for any new rows added from "A2" on.
The reason that your test data did not show is, the code is organised to sort the Results so that all the "High" appear first then the "Low" then the "Medium", if your column "A" test data is not one of those then it will not show in the results.

I have Tried "Test" rows on both sheets that show on sheet "Results"

Try:- Modified code below with your "Delete empty rows" code added and code to Clear sheet "Results" before running of code.
``Please Login or Register  to view this content.``
Regards Mick

8. ## Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

Okay, I retested it and it works perfectly! You have been so awesome, thank you for you help with this issue!

9. ## Re: Data Analysis: Comparing 3 columns, sorting, removing unique values, display data

I'm not sure that code is giving you the right results. I think its possible giving you cases where row data "A,B,C" only appears onces on one of the sheets.
Below is a differents code that hopefully stops that happening, perhaps you would like to try it.
Results on sheet "New Results"

``Please Login or Register  to view this content.``
Regards Mick

##### Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts

## X vBulletin 4.1.8 Debug Information

• Page Generation 0.07648 seconds
• Memory Usage 9,116KB
• Queries Executed 16 (?)
Template Usage (34):
• (5)bbcode_code
• (1)footer
• (1)forumrules
• (1)gobutton
• (9)memberaction_dropdown
• (1)navbar
• (4)navbar_noticebit
• (6)option
• (9)postbit_legacy
• (9)postbit_onlinestatus
• (9)postbit_wrapper
• (1)spacer_close
• (1)spacer_open
• (1)tagbit_wrapper

Phrase Groups Available (6):
• global
• inlinemod
• postbit
• posting
• reputationlevel
Included Files (39):
• ./vbseo.php
• ./env.php
• ./vbseo/includes/functions_vbseo.php
• ./vbseo/includes/functions_vbseo_pre.php
• ./vbseo/includes/functions_vbseo_url.php
• ./vbseo/includes/functions_vbseo_createurl.php
• ./vbseo/includes/functions_vbseo_db.php
• ./vbseo/includes/functions_vbseo_vb.php
• ./vbseo/includes/functions_vbseo_seo.php
• ./vbseo/includes/functions_vbseo_misc.php
• ./vbseo/includes/functions_vbseo_crr.php
• ./vbseo/includes/functions_vbseo_cache.php
• ./vbseo/includes/functions_vbseo_hook.php
• ./vbseo/includes/functions_vbseo_startup.php
• ./includes/config.php
• ./global.php
• ./includes/class_bootstrap.php
• ./includes/init.php
• ./includes/class_core.php
• ./includes/functions.php
• ./includes/class_friendly_url.php
• ./includes/class_hook.php
• ./includes/functions_cforum.php
• ./includes/functions_bigthree.php
• ./includes/class_postbit.php
• ./includes/class_bbcode.php
• ./includes/functions_reputation.php
• ./includes/class_bootstrap_framework.php
• ./vb/vb.php
• ./vb/phrase.php
• ./packages/vbattach/attach.php
• ./vb/types.php
• ./vb/cache.php
• ./vb/cache/db.php
• ./vb/cache/observer/db.php
• ./vb/cache/observer.php
• ./includes/functions_notice.php

Hooks Called (49):
• init_startup
• friendlyurl_resolve_class
• database_pre_fetch_array
• database_post_fetch_array
• global_bootstrap_init_start
• global_bootstrap_init_complete
• cache_permissions
• fetch_foruminfo
• global_state_check
• global_bootstrap_complete
• global_start
• style_fetch
• global_setup_complete
• friendlyurl_redirect_canonical
• bbcode_fetch_tags
• bbcode_create
• postbit_factory
• postbit_display_start
• cache_templates
• template_register_var
• parse_templates
• notices_check_start
• notices_noticebit
• process_templates_complete
• reputation_image
• postbit_imicons
• bbcode_parse_start
• bbcode_parse_complete_precache
• bbcode_parse_complete
• postbit_display_complete
• memberaction_dropdown
• tag_fetchbit
• tag_fetchbit_complete