+ Reply to Thread
Results 1 to 38 of 38

Removing Duplicate Rows based on 2 columns

  1. #1
    Registered User
    Join Date
    09-21-2011
    Location
    England
    MS-Off Ver
    Excel 2003
    Posts
    21

    Removing Duplicate Rows based on 2 columns

    Hi,

    I have a database with around 8 columns for people who have entered a competition with the results ordered as such:

    A = Name
    B = email address
    C, D, E = Various other fields
    F = Postcode
    G = Member (always contains either "Yes" or "No")


    Because the same people will enter the competition over and over my databases downloaded from the website contain duplicates. If they are logged in as members the G column will automatically asign a yes or no depending on their status.

    What I want to do is remove the duplicate rows based on whether both columns A and B (name and email) both match as this is how I am identifying unique people, NOT if the whole row matches (to avoid excluding people based on typos in addresses etc - name and email is accurate enough for what I need). I want the duplicate rows to be deleted in their entirety, preserving the A-G column structure for the uniques that remain.

    This is the code I am using as found on an Excel resource website...

    Please Login or Register  to view this content.
    --

    The code above works but my problem is that sometimes people will enter a competition when they are logged in as a member, and then again when they are not logged in, and thus I get duplicates which are idential across name and email but has different Yes or No values in column G.

    I thought the macro preserved the last entry in the list (deleting all duplicated before it) but I have noticed that one time it did not (which is how I discovered the Yes and No duplicate entries for someone).

    Can someone please explain how to modify the code to make sure its as solid and reliable as possible, and can we add in something to say "if there are Yes and No values in column G for duplicated rows based on name and email, I want to keep the YES value in my summarised list once the macro has finished".

    I hope that makes sense and that I have included everything needed for someone with more VBA experience than me to assist with!!


    Susie x
    Attached Files Attached Files
    Last edited by metalpoker; 10-07-2011 at 12:40 PM. Reason: I didnt include code tags by mistake - sorry!!

  2. #2
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    First, please put your code within code tags.

    2nd, please attach a workbook and we can help you out.

  3. #3
    Forum Expert NBVC's Avatar
    Join Date
    12-06-2006
    Location
    Mississauga, CANADA
    MS-Off Ver
    2003:2010
    Posts
    34,898

    Re: Removing Duplicate Rows based on 2 columns

    Your post does not comply with Rule 3 of our Forum RULES. Use code tags around code. Posting code without them makes your code hard to read and difficult to be copied for testing. Highlight your code and click the # at the top of your post window. For more information about these and other tags, found here
    Where there is a will there are many ways.

    If you are happy with the results, please add to the contributor's reputation by clicking the reputation icon (star icon) below left corner

    Please also mark the thread as Solved once it is solved. Check the FAQ's to see how.

  4. #4
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    I have a solution ready for you. Once you comply with the rules, i can provide it to you.

  5. #5
    Registered User
    Join Date
    09-21-2011
    Location
    England
    MS-Off Ver
    Excel 2003
    Posts
    21

    Re: Removing Duplicate Rows based on 2 columns

    Hi guys,

    Apologies for not using code tags I just posted in a hurry

    So I have attached a duplicate-test.xls file to my original post, which has some sample fictional data on it in the same format as my real spreadsheets. I have not saved this with the macro included though as thought it best to leave that off!

    catherine paice, francis healey and wendy price all appear twice, so these duplicates need to go, however only columns A = name, B = email and G = Yes/No (for member) need to be considered. All the other data is copied across but as the macro should not compare address, city and phone it shouldnt matter what is inside these cells right?

    What the macro should do is remove the duplicates based on both Name and Email together (which it does already) but if collective duplicates contain different Yes or No values in G, it needs to display Yes for that column in the final list.

    Wendy Price is the one in the test file that has "yes" in one row and "no" in the other.

    I hope that helps!!!


    Susie x

  6. #6
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    Instead of going through your code, i found it easier () to do my own:

    Please Login or Register  to view this content.
    I have tested it and it works. You can check it and let me know.

  7. #7
    Registered User
    Join Date
    09-21-2011
    Location
    England
    MS-Off Ver
    Excel 2003
    Posts
    21

    Re: Removing Duplicate Rows based on 2 columns

    Many thanks for your reply!
    I am getting an error message with this section of code highlighted:

    Please Login or Register  to view this content.
    Also it will it work with just the active worksheet as I may chop and change where it is used?

  8. #8
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    Oops..ok the reason you got the error is because i use 2007 and u use 2003.

    Replace
    Please Login or Register  to view this content.
    with

    Please Login or Register  to view this content.
    If you have a predefined sheet, just replace "Sheet1" with the name of the sheet.

  9. #9
    Registered User
    Join Date
    09-21-2011
    Location
    England
    MS-Off Ver
    Excel 2003
    Posts
    21

    Re: Removing Duplicate Rows based on 2 columns

    Hi,

    Yes it works amazingly - thank you so much!! I thought I had it wrapped up with my original code but as it turned out I had not considered the Yes/No thing...

    This now leads onto the next stage of this project which is more complex, so I'm going to post a new thread and will PM you the link as you might be able to help with that if you want something more challenging!!

    Thanks again so much!!


    Susie

  10. #10
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    Thank you so much.

    Yes, i would surely be game for something more challenging.

    Besides, if you found this post helpful, please rate it accordingly.

  11. #11
    Registered User
    Join Date
    09-21-2011
    Location
    England
    MS-Off Ver
    Excel 2003
    Posts
    21

    Re: Removing Duplicate Rows based on 2 columns

    Hi arlu1201,

    Just preparing a sample file for my next thread, but just in reference to this one is there a way of reducing the screen flicker while it removes the duplicates? The actual lists in question are on average 3000 - 5000 rows and even though it worked perfectly there was a lot of screen flicker?

    Many thanks!

    Ps - I added to your reputation - you really are an expert!!

  12. #12
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    Below
    Please Login or Register  to view this content.
    Just add this line:

    Please Login or Register  to view this content.
    At the end of the code, before End Sub, you can add

    Please Login or Register  to view this content.

  13. #13
    Registered User
    Join Date
    09-21-2011
    Location
    England
    MS-Off Ver
    Excel 2003
    Posts
    21

    Re: Removing Duplicate Rows based on 2 columns

    Hi,

    I just noticed a problem with the remove_dups macro (see attached file) - when there are multiple listings for the same user, in this case Wendy Price who I have duplicated several more times with both Yes and No values under the VIP category.

    After running the macro there are still a number of rows remaining with Wendy Price for both "yes" and "no"?

    When the macro is finished I only want 1 row per unique person (based on name and email address combined) and thus Wendy Price would only remain once, and her listing should say "Yes" under the VIP row as she had at least one Yes in the original database before cleaning.

    Not sure if I have broken it or if it needs fixing, so if you could have a look at it and run the test again with the new database I have attached I would really appreciate it!

    Many thanks,


    Susie x
    Attached Files Attached Files

  14. #14
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    I have tested it a bit but its still retaining several Wendy's. There should be a workaround. Let me look over it and give you an answer by tomorrow.

  15. #15
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    Hi,

    I tried 2 more strategies but its getting all the more complicated. I tried using a count to count how many rows are there for one name and then proceed but that was not working properly.

    Then i tried using autofilter but that too got complicated. Maybe we can bring in some experts to help us with this. What you can do is (since you have already marked this as closed, you may not get more replies from others), post your question again but this time you can provide the code that i gave and say that it takes care of the main deleting of duplicates, but when the records are more in number, the logic doesnt work. So you need their help to work around that small bit.

    Sorry about this. I really tried a lot but couldn't think of anything to get it to work.

  16. #16
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    Hey,

    I had a brain-wave last night on how i could solve your problem .

    Try this code:
    Please Login or Register  to view this content.

  17. #17
    Registered User
    Join Date
    09-21-2011
    Location
    England
    MS-Off Ver
    Excel 2003
    Posts
    21

    Re: Removing Duplicate Rows based on 2 columns

    hi arlu1201,

    Thanks for the continued help, and don't worry about not being able to fix the issue I will repost the thread and see if I have more luck!

    Many thanks!!

    Susie

  18. #18
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    I did fix the issue Susie...see my today's post with the fresh code. Try it out and let me know if it works.

  19. #19
    Registered User
    Join Date
    09-21-2011
    Location
    England
    MS-Off Ver
    Excel 2003
    Posts
    21

    Re: Removing Duplicate Rows based on 2 columns

    Quote Originally Posted by arlu1201 View Post
    I did fix the issue Susie...see my today's post with the fresh code. Try it out and let me know if it works.
    Yeah sorry I just saw your post!!
    My brain is all over the place...

    I have tested your new code on the file I sent you and it seems to work, let me just test it something bigger as my real docs are like 4k rows long

    Did you still want to take on the bigger challenge I now have if this works?


    Susie

  20. #20
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    Yeah sure...just post it and let me try.

  21. #21
    Registered User
    Join Date
    09-21-2011
    Location
    England
    MS-Off Ver
    Excel 2003
    Posts
    21

    Re: Removing Duplicate Rows based on 2 columns

    Ok so I have run a proper test now with a file containing about 5000 rows and what seems to be happening now is that I am getting 2 rows for each multiple user? They are both returning yes, which is correct, but it's still 2 rows when there should only be 1.

    Any ideas?

    I could then run the list through my original duplicate filter which would return the list I need, but I thought it best to see if we can resolve it properly?

    I really do appreciate your help, I hope you don't mind the challenge I know its a tough cookie to crack and I would have NO IDEA how to do what you guys do!!


    Susie


    EDIT

    Is it something to do with the length of the list? It worked fine on the test version but now with the super long lists its got this doubling issue? The orginals go through a cleaning process which puts them into the format seen in the example but the originals are not in alphatbetical order - not sure if that makes any difference?
    Last edited by metalpoker; 10-10-2011 at 06:56 AM.

  22. #22
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    I usually sort the list before i begin the calculations. You said both are "Yes", let me look through the code and see if i can add an additional condition.

  23. #23
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    Try this out.
    Please Login or Register  to view this content.

  24. #24
    Registered User
    Join Date
    09-21-2011
    Location
    England
    MS-Off Ver
    Excel 2003
    Posts
    21

    Re: Removing Duplicate Rows based on 2 columns

    Quote Originally Posted by arlu1201 View Post
    I usually sort the list before i begin the calculations. You said both are "Yes", let me look through the code and see if i can add an additional condition.
    Yeah so in my full databases I've got things like:

    Wendy Price | Email | Phone | Address | Postcode | YES
    Wendy Price | Email | Phone | Address | Postcode | YES

    It's the correct result just duplicated twice, and only on the ones with Yes we don't get any dups on the ones that are correctly filtered as no.

    Susie

  25. #25
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    Try the code above and let me know.

  26. #26
    Registered User
    Join Date
    09-21-2011
    Location
    England
    MS-Off Ver
    Excel 2003
    Posts
    21

    Re: Removing Duplicate Rows based on 2 columns

    Hi,

    This is very strange, the first time I ran the code it worked, and now every time it still returns the duplicate Yes results on 2 rows - weird!!

    Sorry about this, maybe Excel just doesnt like me lol!!

    Susie

  27. #27
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    Can you run it again and let me know. Ensure you maintain a copy of the test data so you can just replace it when the code messes it up.

  28. #28
    Registered User
    Join Date
    09-21-2011
    Location
    England
    MS-Off Ver
    Excel 2003
    Posts
    21

    Re: Removing Duplicate Rows based on 2 columns

    1st time I get the duplicates of the YES ones that have been filtered with your code..
    2nd time these are removed and the list is as it should be!!

    Just need to figure out why that doesnt happen all at once in the beginning??

  29. #29
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    Are you running the same dataset the 1st time and then replacing again when you run the 2nd time? Can you attach the whole file if possible, so i can test?

  30. #30
    Registered User
    Join Date
    09-21-2011
    Location
    England
    MS-Off Ver
    Excel 2003
    Posts
    21

    Re: Removing Duplicate Rows based on 2 columns

    I'm running the macro twice on the same dataset, the first time I see duplicates, always 2 rows for the same "Yes" user if they were filtered due to the Yes/No condition in column G, then when I re-run it for a 2nd time those duplicates are removed to display the list perfectly as it should be.

    I can't attach the whole file due to the sensitive information I'm afraid - sorry

    Susie


    EDIT

    What I mean is I have to run the macro twice one after another on the same dataset to see the final result as it should be.
    Last edited by metalpoker; 10-10-2011 at 08:33 AM. Reason: Additional information

  31. #31
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    Then since you are not replacing the dataset while running it the 2nd time, it means your macro still has the problem of the 2 "yes". Let me work something out.

  32. #32
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    I tried different stuff but still it doesnt remove them. In my case, i am getting one yes and one no. Maybe it will help if you can run the macro 2 times for now and in the meantime i will try and work something out.

    Maybe i should get another brainwave like this one soon... Maybe i need to sleep more to get them

  33. #33
    Registered User
    Join Date
    09-21-2011
    Location
    England
    MS-Off Ver
    Excel 2003
    Posts
    21

    Re: Removing Duplicate Rows based on 2 columns

    Ok I will run them twice - many thanks and I look forward to your next brainwave

    Many thanks


    Susie

  34. #34
    Registered User
    Join Date
    09-21-2011
    Location
    England
    MS-Off Ver
    Excel 2003
    Posts
    21

    Re: Removing Duplicate Rows based on 2 columns

    someone added this on another forum to ammend the code and make it work - just thought you'd like to see so you could find out what was causing the problem.

    Thanks again for your help!! Susie

    You have to loop backwards when deleting. Try like this (other changes to your code may be needed)
    Please Login or Register  to view this content.

  35. #35
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    Oh yeah....i actually didnt think of it. I am usually used to coding forwards. I will have to try out the code and see.

  36. #36
    Forum Guru
    Join Date
    03-12-2010
    Location
    Canada
    MS-Off Ver
    2010 and 2013
    Posts
    4,418

    Re: Removing Duplicate Rows based on 2 columns

    Just as a heads-up.

    Please Login or Register  to view this content.
    will only work correctly if there are no empty spaces or rows in Column A. Its generally better to use:

    Please Login or Register  to view this content.
    Since it goes to the last cell in Row A and moves up to the last non-empty row.

    Hope this helps.

    abousetta
    Please consider:

    Thanking those who helped you. Click the star icon in the lower left part of the contributor's post and add Reputation.
    Cleaning up when you're done. Mark your thread [SOLVED] if you received your answer.

  37. #37
    Forum Contributor arlu1201's Avatar
    Join Date
    09-09-2011
    Location
    Bangalore, India
    MS-Off Ver
    Excel 2003 & 2007
    Posts
    19,166

    Re: Removing Duplicate Rows based on 2 columns

    I have worked out the code starting from the last row up. Just try it out and let me know if this works:

    Please Login or Register  to view this content.

  38. #38
    Forum Guru
    Join Date
    03-12-2010
    Location
    Canada
    MS-Off Ver
    2010 and 2013
    Posts
    4,418

    Re: Removing Duplicate Rows based on 2 columns

    Here's a shortened version that should run faster because it doesn't readjust the lastrow since it doesn't matter after the first loop:

    Please Login or Register  to view this content.
    Hope this helps.

    abousetta

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1