+ Reply to Thread
Results 1 to 2 of 2

Filtering Each Duplicate by Following Chronological Rules

  1. #1
    Registered User
    Join Date
    11-05-2016
    Location
    Texas, United States
    MS-Off Ver
    MS Office for Mac; 2016
    Posts
    2

    Filtering Each Duplicate by Following Chronological Rules

    Hello everybody! Hopefully somebody could shed some light upon me:

    I have been searching around for a while now, and found something some similar solutions to what I am trying to achieve.

    So, I have a excel spread sheet with thousands of duplicates based on the "m/z" column. And what I want to do is use either a function, VBA code, or Macro that can highlight the entire row that does not follow the rules I have stated below, so that I don't have to keep scrolling through 10,000+ rows of duplicate sets (What I mean by duplication set includes but not limited to: a duplication set containing 2 duplicates; 4 duplicates; and even 12 duplicates) manually applying these rules in my head, which potentially will lead to errors.

    What I want to do to these columns is filter out each set of rows based on the "m/z" duplicates by using a set of rules that must be followed in chronological order until one is remaining. To further explain, if I have 3 "m/z" values that are duplicates and all 3 of them pass the first rule, all 3 will be tested for the next rule until 2 of those 3 "m/z" values does not pass the rule, then filter the two that failed the rules first, leaving one row remaining. (I will explain more in depth later)

    The Rules to be put through every duplication set: (each duplicate set MUST follow each rule in chronological order!)
    1. Color every row red, within the duplicate set, if "N/O" value is more than 2 (if "N/O" value is less than or equal to 2, then it passes)
    2. Color every row red, within the duplicate set, if "(N+S)/O" value is more than 1 (If "(N+S)/O" value is less than or equal to 1, it passes).
    3. Color every row red, within the duplicate set, if each number in the column "N", "O","S", and "P" is greater than zero. (if any of these 4 columns contain a zero within the row, it passes)
    4. Color every row red, within the duplicate set, EXCEPT the highest value in the H/C column that is less than or equal to 2.25. (e.g.: there are 3 "m/z" duplicates; and of those duplicates, the "H/C" values are: 0.50, 2.00, and 2.26. Therefore, the row with the 2.00 "H/C" value will be the only one remaining, because its higher than 0.50, but less than or equal to 2.25)

    However, I may get some duplicates that pass all 4 rules. And in the case of this occurring, I would put them through 2 tie breaking rules (also following chronological order.

    These Tie-breakers include:
    1. If the value under "P" column is greater than zero, then the "H/C" value has to be between 1.5<= H/C <=2.25 (greater than or equal to 1.5, but also less than or equal to 2.25)
    2. The "winner" should have it's "stddev2" value that is closest to zero than the other row(s) (e.g.: if there are 3 rows that have passed all of the previous rules and their "stddev2" values are: 0.4, –0.3, and –0.1 the one that would remain is the –0.1 because that is the one closest to zero.)

    In addition, if there are duplicate sets of "m/z" that pass every single rule and they have equal H/C values in rule 4, and equal "stddev2" values in the tie-breaker rule #2, then the whole duplication set gets colored red.

    And for those who are thinking (if you haven't already stopped reading lol) "why doesn't he just type those commands in and filter out the ones highlighted, or even make a macro doing it?".... That is a good suggestion, which I myself thought that too, but that way won't work. To further explain, I mentioned earlier that if I am testing a duplicate set with 3 duplicates, and 2 of those fail and 1 passes, the 1 that did pass should not be further investigated to test whether it fails anymore tests. (For Example: Take a look at the duplicate set highlighted in yellow). So based on the rules, row 68 failed rule #1, but if you look further in the "(N+S)/O" column in row 67, it fails one of the 4 rules as well! However, because row 68 failed a rule before row 67 did, row 67 is the "winner," and therefore does not get colored red. So in simple terms, as soon as you come to a single "winner", don't test for any further rules.

    This is why if I were to just make a function for these rules and put them through all of the duplicates, it wouldn't follow the principle of putting each duplicate set through the rules in a chronological order.

    (Additional Information):
    Columns C-M are values only
    "N/O" column formula used
    "(N+S)/O" column formula used
    "N, O, S, P Test" column formula used (this was just to simply my life by me not having to look at each N, O, S, and P column to see if all of them were present, instead I made it to where if there is a zero present, return 1, if not return 0.)
    "H/C" column formula used


    Does anybody have any kind of suggestions of what I could do to be a lot more efficient than scrolling through each individual duplication set and put them through the rules one by one?

    Anything would be appreciated, Thanks!
    Last edited by maupinsmason; 11-05-2016 at 11:23 PM. Reason: It submitted the thread without me finishing

  2. #2
    Forum Contributor
    Join Date
    06-28-2016
    Location
    Pennsylvania, USA
    MS-Off Ver
    2013
    Posts
    308

    Re: Filtering Each Duplicate by Following Chronological Rules

    I'm going to go with you need to put each rule into a helper column.

    Then you can use a countif or whatever at the bottom of each helper column to see how many results return as true, and when you find one with a 1, return that result.

    I'm thinking you'd have to use index match match or something similar where you are capable of looking at rows and columns and finding the matching cell off that. I don't think you are able to do this with conditional formatting (in my mind) but I could be wrong. I definitely think dividing each rule into its own helper column is the way to go, as it will give you the best results.

    I'd suggest uploading a sample workbook if you need further assistance, this can be done by editing your post (or making a new one) and clicking the go advanced option.
    Scroll down to manage attachments and click it, then browse to find your workbook. Remember to desensitize your data if there is anything you don't want to be seen or whatever.

    Hope this was helpful.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Filtering for the correct duplicate
    By yuenk in forum Excel Programming / VBA / Macros
    Replies: 4
    Last Post: 11-20-2014, 02:09 PM
  2. Filtering for unique and 1 duplicate
    By frankee_gee in forum Excel General
    Replies: 3
    Last Post: 08-11-2009, 03:51 PM
  3. filtering duplicate entries
    By EAMOG in forum Excel General
    Replies: 2
    Last Post: 12-17-2007, 07:48 PM
  4. Filtering Duplicate Entrys
    By s0347688 in forum Excel General
    Replies: 1
    Last Post: 08-24-2007, 07:58 AM
  5. finding & filtering out duplicate cells
    By Jackie in forum Excel General
    Replies: 4
    Last Post: 10-21-2005, 02:05 AM
  6. RE: filtering duplicate enteries
    By David Hepner in forum Excel General
    Replies: 0
    Last Post: 08-29-2005, 05:05 PM
  7. filtering duplicate enteries
    By chris in forum Excel General
    Replies: 0
    Last Post: 08-29-2005, 05:05 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1