+ Reply to Thread
Results 1 to 10 of 10

Delete duplicate rows with same data in certain columns for million rows

  1. #1
    Forum Contributor
    Join Date
    01-26-2012
    Location
    Mumbai
    MS-Off Ver
    Excel 2013
    Posts
    154

    Delete duplicate rows with same data in certain columns for million rows

    Dear All ,

    I have million rows ,

    I have columns A to N ...the N may vary and i wish macro to self find the range using the range function and columns.count x1left function ,


    Second in each of the rows ,i want macro to check A,B,C,D,F .if they are same ,i want them to be deleted using the following logic

    If there are 5 rows X1,X2,X3,X4,X5 with identical data in A,B,C,D,F i want E column to be checked for each of this 5 rows ...keep the one with the largest string length ,i want rest all to be deleted .....in case data in cell E is same and repeats in all 5 entries ,i want all except 1 to be retained ...

    While rows get deleted ,i want them to be deleted in entirety and not in range of A:F or etc ,i wish order to be maintained while tis process is carried out and that is why i have kept the last column as row number ,

    i would wish someone to use loop and solve it for million entries ......if anyone can capture in ceryain cell rows getting deleted and duplicate entries ,it will be highly appreciated

    i Hope that i have been clear enough outlying my requirement ?


    Thanks ,


    Amlan Dutta
    Attached Files Attached Files

  2. #2
    Forum Contributor
    Join Date
    01-26-2012
    Location
    Mumbai
    MS-Off Ver
    Excel 2013
    Posts
    154

    Re: Delete duplicate rows with same data in certain columns for million rows

    Hiya ,


    If someone can take some time out and respond to this query ,i would really be very very grateful....i mean i have tried very hard but something or the other fails and i dunno how to solve this ...i mean ,i am yet learning .....when i do a sort and compare and delete duplicate row .i dunno why the macro fails to delete all duplicate rows .....this kinda makes me feel sad ,

    Will appreciate some expert trying to solve this ,

    Thanks in advance ,


    Hugs ,

    Amlan Dutta

  3. #3
    Forum Expert jaslake's Avatar
    Join Date
    02-21-2009
    Location
    Atwood Lake in Mid NE Ohio...look it up.
    MS-Off Ver
    Excel 2010 2019
    Posts
    12,749

    Re: Delete duplicate rows with same data in certain columns for million rows

    Hi Amlan
    This will take a while to run on a million rows. I wanted to check this
    in case data in cell E is same and repeats in all 5 entries ,i want all except 1 to be retained
    Is that correct...you want 4 retained and only 1 deleted? Don't know if I can manage that.
    John

    If you have issues with Code I've provided, I appreciate your feedback.

    In the event Code provided resolves your issue, please mark your Thread as SOLVED.

    If you're satisfied by any members response to your issue please use the star icon at the lower left of their post.

  4. #4
    Forum Contributor
    Join Date
    01-26-2012
    Location
    Mumbai
    MS-Off Ver
    Excel 2013
    Posts
    154

    Re: Delete duplicate rows with same data in certain columns for million rows

    Dear John ,

    Thanks for asking that bit ,i went horribly wrong explaining (delete 4 and keep 1 is what i want in case of what you asked ),

    Also i would like you to know that i had posted this thread initially at http://www.ozgrid.com/forum/showthre...goto=newpost** where rbrhodes had been very kind to help me and we had a huge huge discussion on this thread and we were awesomely close to get a solution before i had to depart for some work ,last i knew the solution still had problems with duplicate rows in which the parent row would also get deleted along with the duplicate row

    First ,i will reexplain what i intended the macro to do ,


    simply put compare rows ,check whether A TO F are identical for any rows ,if yes ,keep the one with the highest length of text in cell E of such rows and delete the rest of the rows

    If there are 5 rows X1,X2,X3,X4,X5 with identical data in A,B,C,D,F i want E column to be checked for each of this 5 rows ...keep the one with the largest string length ,i want rest all to be deleted ...say x1,x2,x3,x4,x5 had string length 100,120,70,60,50 respectively then i just want x2 to be retained and all other entries deleted .........


    In case x1,x2,x3,x4,x5 have identical data in A,B,C,D,F and have even identical data(in terms of text length and type ) in cell E .This shall mean duplicate rows ,in such case ,i wwould wish to retain just one and delete the rest ,perhaps keeping the one(latest one by date would be best )......

    Once done ,i would wish to have another macro to show me the original result on the same active sheet so that i can compare and see whether results are correct ...if that's difficult the output can simply be directed to sheet 2 or sheet 3 ..i would wish to have two buttons "Delete " and original on same sheet so that pressing delete deletes the duplicate rows based on criteria explained above and pressing original brings back the input data ,

    I am attaching a sample input and sample output sheet to explain how i would want delete duplicate to operate ,i purposely keep row numbers to the right so that incase you decide to do sorting would it be possible for the macro to unsort and keep results in order since the origial data come's from a report which is arranged datewise ....so if row nos are 2,3,4,5,6,7 and sorting makes it 5,2,4,3,7 post which deletion takes place to make it 5,4,3 ,i want that at the end the results become unsorted as 3,4,5


    Humble Regards ,


    Amlan Dutta
    Attached Files Attached Files

  5. #5
    Forum Expert jaslake's Avatar
    Join Date
    02-21-2009
    Location
    Atwood Lake in Mid NE Ohio...look it up.
    MS-Off Ver
    Excel 2010 2019
    Posts
    12,749

    Re: Delete duplicate rows with same data in certain columns for million rows

    Hi Amlan
    This code is in the attached. It first makes a copy of the original data to Sheet2. Then it creates 3 new column fields, 1 row for each record. I'm using Columns T, U and V for this. If this needs to be moved further to the right let me know. I'm NOT using your Column N.

    In Column T I'm placing the String Length of Column E.
    In Column U I'm placing the original sort sequence (the Row Number if you will)
    In Column V I'm placing a concatenation of Columns A, B, C, D and F.

    The only way I could accommodate this
    i would wish to have another macro to show me the original result on the same active sheet so that i can compare and see whether results are correct
    was to make a copy of the original data on Sheet2.

    There are notes in the code that explain what it's doing. Let me know of issues.

    Please Login or Register  to view this content.
    Attached Files Attached Files

  6. #6
    Forum Contributor
    Join Date
    01-26-2012
    Location
    Mumbai
    MS-Off Ver
    Excel 2013
    Posts
    154

    Re: Delete duplicate rows with same data in certain columns for million rows

    Dear John ,

    Firstly thanks a tonne ,john ,you are simply too good ........secondly sorry for the delayed response ,actually last night when i wrote the thread it was 1.30 am in the night and i immediately fell asleep without self realising,so am deeply sorry that i could not inform post writing that thread last night .....
    I woke up 5 minutes back and saw that you have send a response ,

    I am deeply thankful ,let me check this and get back with preliminary analysis in 10-15 minutes and detailed analysis in a little more time .....i say detailed because this is a tricky case to attempt solving ,

    In any case ,i dunno how to thank you ,your responses mean a lot .....as regards output in sheet 2 ,it's perfect that way i know my actual data and can see the output data ,so it suits me perfectly ,

    Gimme some 10 minutes to get back with output sheet ,

    Humble Regard's,

    Amlan Dutta

  7. #7
    Forum Expert jaslake's Avatar
    Join Date
    02-21-2009
    Location
    Atwood Lake in Mid NE Ohio...look it up.
    MS-Off Ver
    Excel 2010 2019
    Posts
    12,749

    Re: Delete duplicate rows with same data in certain columns for million rows

    Hi Amlan
    Don't rush...I'm pretty much done for today.

  8. #8
    Forum Contributor
    Join Date
    01-26-2012
    Location
    Mumbai
    MS-Off Ver
    Excel 2013
    Posts
    154

    Re: Delete duplicate rows with same data in certain columns for million rows

    Dear John ,

    That's okay ....lets do this case slowly ,john ...this case brings back tough memories ...i tried your code ...unfortunately it didn't show up results i want but like i said i deeply value your efforts ,so thanks a tonne dude ,

    I attach the input sheet ,output sheet like i thought should have appeared on the screen ,


    I then attach "what actually happened " sheet to show the real output.

    John ,the concatenation doesn't help ,i wish that the results are as they are since i use another macro post your macro execution to send mails to column B reciptents ....

    what i note is that ,in your sheet attached it actually looks fine though i didn't check it much in detail post i did a study in the sheets attached cause i would wish a generic solution which works on all sheets ..........

    There is a lot happening here as well....you see in the whats happened here sheet ,i see that results in subsequent columns get mixed up and the input sheet entries also change .....maybe the prior is due to concatenation and the later happens in my sheet but not your's ,

    Also ,if you see the attached input and desired output sheet ,i would like that the original order is retained post execution ,this case is complex .....i had tried very hard earlier without success ,

    You have asked whether the column would end at N ,actually i am looking at the columncount function to handle it if possible because the input sheet is one amongst any wherein there are columns till N but i get sheets wherein the only things fixed are A to F and then connected random columns,

    I think i am failing to express my case here ,i am kinda sorry ,

    Humble Regard's,


    Amlan Dutta

  9. #9
    Forum Contributor
    Join Date
    01-26-2012
    Location
    Mumbai
    MS-Off Ver
    Excel 2013
    Posts
    154

    Re: Delete duplicate rows with same data in certain columns for million rows

    Dear John ,

    Let this case pass ,please .I say this because rhodes and i were trying to attempt very hard to solve this case and this case has a very long thread with huge feedback at ozgrid ,so i don't want you to exert much on this case since we are actually nearing up a solution there and it is beyond the scope of thsi thread to attempt and describe what minor issues crop up when the macro runs ...the problem doesn't sound hard but actually is way tough than it appears .....

    You may see the complete thread and attachments and how it took 18 permutations to finally get a partial working solution ,so i wouldn't want you to exert much on this especially now when i am getting closer to teh solution ,

    Once Rhodes does it ,will obviously post the solution here for everyones benifit with rhode's consent ,

    Thanks a tonne ,John ....just wanna be transparent here ,finally am a student looking to learn and appreciate all you senior's help ,i dunno how to thank enough everyone for taking their valuable time out .....just know that we love you so so so much ,

    Thanks a tonne for even considering replying to the thread ,i need to make a move to office now


    The thred still active at ozgrid is this link ,

    http://www.ozgrid.com/forum/showthre...d=1#post594643

    Humble Regards ,


    Amlan Dutta

  10. #10
    Forum Expert jaslake's Avatar
    Join Date
    02-21-2009
    Location
    Atwood Lake in Mid NE Ohio...look it up.
    MS-Off Ver
    Excel 2010 2019
    Posts
    12,749

    Re: Delete duplicate rows with same data in certain columns for million rows

    Hi Amlan

    OK, we be done with this issue.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1