+ Reply to Thread
Results 1 to 12 of 12

Delete duplicates by requesting to keep a specific row from the bunch of duplicates

  1. #1
    Registered User
    Join Date
    06-27-2011
    Location
    US
    MS-Off Ver
    Excel 2007
    Posts
    7

    Delete duplicates by requesting to keep a specific row from the bunch of duplicates

    Hi

    I've attached a file for your reference.
    I'm trying to remove the duplicates on column B ('Amount') and E ('SId') but also want to keep a specific row from all the duplicates. I want to keep the row that has 'EN' in Column G ('Reference qualifier').

    I found a way to remove the duplicates (from the 'remove duplicates' function) and realize that Excel always keep the first row of the bunch of duplicates. But I can't figure out a way to keep a specific/designated row. Hope this make sense.
    It's something I will have to do on a regular basis. So an easy solution that I can use regurlarly will be appreciate.

    Thanks in advance
    Attached Files Attached Files
    Last edited by yolide; 07-07-2011 at 04:48 PM.

  2. #2
    Forum Guru (RIP) Marcol's Avatar
    Join Date
    12-23-2009
    Location
    Fife, Scotland
    MS-Off Ver
    Excel '97 & 2003/7
    Posts
    7,216

    Re: Delete duplicates by requesting to keep a specific row from the bunch of duplicat

    Hi yolide

    Welcome to the Forum ....

    Try this workbook

    Create a number of Dynamic Named Ranges "DataTable", "Amount", "Bill_Id", "Reference_qualifier", and "Helper_Column"
    Note that all the the range heights are based on the height of "DataTable"
    e.g.
    "DataTable"
    Refers to:
    Please Login or Register  to view this content.
    "Bill_Id"
    Please Login or Register  to view this content.

    Then in Column K ("Helper Column") Row 2
    Please Login or Register  to view this content.
    Drag/Fill Down

    In L2 ("Remove Duplicates")
    Please Login or Register  to view this content.
    Drag/Fill Down until a blank is returned

    In M2 this array function
    Please Login or Register  to view this content.
    Confirm with Ctrl+Shift+Enter not just Enter
    Drag Down to last row in Column L then Across to Column S

    Column T is not required and only left to prove the formula results.

    You could do this without named ranges but array functions are best kept to a minimum size to avoid wasteful calculations and dynamic named ranges will do this for you.

    Hope this helps.
    Attached Files Attached Files
    Last edited by Marcol; 06-28-2011 at 07:11 PM.
    If you need any more information, please feel free to ask.

    However,If this takes care of your needs, please select Thread Tools from menu above and set this topic to SOLVED. It helps everybody! ....

    Also
    اس کی مدد کرتا ہے اگر
    شکریہ کہنے کے لئے سٹار کلک کریں
    If you are satisfied by any members response to your problem please consider using the small Star icon bottom left of their post to show your appreciation.

  3. #3
    Registered User
    Join Date
    06-27-2011
    Location
    US
    MS-Off Ver
    Excel 2007
    Posts
    7

    Re: Delete duplicates by requesting to keep a specific row from the bunch of duplicat

    Thanks for this. I'm not an excel expert so I will try to go through and see if it answer. Meanwhile, is it possible to have this as a macro so that I can pin it on my toolbar and click on the icon each time I want to clean the data? Thanks.

  4. #4
    Forum Guru (RIP) Marcol's Avatar
    Join Date
    12-23-2009
    Location
    Fife, Scotland
    MS-Off Ver
    Excel '97 & 2003/7
    Posts
    7,216

    Re: Delete duplicates by requesting to keep a specific row from the bunch of duplicat

    Macros are not always acceptable, or even permitted in some situations, using native formula is usually a better option where possible.

    Try using the workbook as a template.

    Paste any new data into Columns B:H then drag Columns K:S to suit.

    Then copy Columns M:S Paste Special > Values to a suitable destination if required.

  5. #5
    Registered User
    Join Date
    06-27-2011
    Location
    US
    MS-Off Ver
    Excel 2007
    Posts
    7

    Re: Delete duplicates by requesting to keep a specific row from the bunch of duplicat

    Sorry but I can't understand all the formula and replicate them in my file. Could you pls set it up for the new file I've attached. That's how my the actual file will look like.
    As a reminder I want to remove all duplicates on column AC and AH and keep the row with 'EN' in column AN. If no 'EN' then any row can be kept.
    Thanks again.
    Attached Files Attached Files

  6. #6
    Registered User
    Join Date
    06-27-2011
    Location
    US
    MS-Off Ver
    Excel 2007
    Posts
    7

    Re: Delete duplicates by requesting to keep a specific row from the bunch of duplicat

    [Update] I understand the formula a little bit better now. I started applying it to my actual file and it takes a lot of time to run it. I have about 10000 rows. Is there a faster wayt to do this? Like I said this is something I will have to do on a weekly basis. Thanks.

  7. #7
    Forum Guru (RIP) Marcol's Avatar
    Join Date
    12-23-2009
    Location
    Fife, Scotland
    MS-Off Ver
    Excel '97 & 2003/7
    Posts
    7,216

    Re: Delete duplicates by requesting to keep a specific row from the bunch of duplicat

    Okay. The principle is the same but because of the size of your data table it is perhaps easier to use a seperate sheet to get the desired result.
    In this workbook, I have called the Sheets "Data" & "Result", once the setup is complete these sheet names can be changed and the various formulae will automatically update.

    1/. Create the dynamic named ranges. (Formulas > Name Manager)
    "DataTable"
    Refers to:
    Please Login or Register  to view this content.
    Syntax
    OFFSET(reference,rows,cols,height,width)
    reference:= Data!$A$2 (This is the top left cell of the table, therefore the next two values are zero, i.e. no offset)
    rows:= 0
    cols:= 0
    height:= COUNTA(Data!$AH:$AH)-1 (This is the number of rows the table will have, the -1 removes the header from the count)
    width:= COLUMN(Data!AQ$1) (this returns the last column number in the table, in this case 41)

    Data!$AH:$AH is the column that is most likely to have continuous data that defines the height of the table. This is then used to control the height of the other named ranges.

    e.g.
    "Amount"
    Refers to:
    Please Login or Register  to view this content.

    For more on named ranges see this link Contextures - Excel Names -- Excel Named Ranges

    and for more techniques see Chip Pearson - Defined Names

    2/. With Sheet "Data" in the "Helper Column" (Column AQ) concatinate "Amount by Line Item" & "Shipment Id"
    In AQ2
    Please Login or Register  to view this content.
    Drag/Fill Down to the last row in your table.
    In AR2 "Remove Duplicates"
    Please Login or Register  to view this content.
    Drag/Fill down until a blank is returned.
    This will remove the duplicate strings.
    This only works for Text values. Numbers and mixed data require different formulae, but as we have concatinated the helper column then the results are text strings.

    3/. With Sheet "Result"
    Either paste the headers, or in A1
    Please Login or Register  to view this content.
    Drag/Fill Right to Column AO (i.e. - the last Column in Sheet "Data")

    In A2 this array formula, given you are using 2007
    Please Login or Register  to view this content.
    Confirm with Ctrl+Shift+Enter not just Enter.
    Drag this down until a blank cell is returned, then Right to Column AO.

    Zeros are returned for empty and zero cells with the sample you have provided, these can be suppressed with Number Formatting where appropriate.
    I have formatted the first three columns of Sheet "Result" to demonstrate the result
    Select the appropriate columns then
    Home > Format Cells... Number > Custom
    Type:
    Please Login or Register  to view this content.
    4/. Finally copy Sheet "Result" and Paste > Paste Special > Values to a suitable destination if required.

    All this can of course be done with a macro, but you will, in my opinion, learn more by understanding native formula, rather than using VBa.
    Changes are easier with this template using native methods, it can often be an endless task using VBa.

    To use Paste your new data into Sheet "Data" and drag the formulae to suit.

    Sorry about the rant, but it should stand you in good stead if you can follow and understand the steps.

    [EDIT]
    With 10,000 rows it will take some time to calculate, but so probably will a VBa solution.
    For a task that only has to be done once a week is this really a problem?

    Once the formulae have been expanded to cover your typical table size and retained as a template, it should just be a matter of pasting and having a coffee ... ...
    Attached Files Attached Files
    Last edited by Marcol; 06-29-2011 at 07:09 AM.

  8. #8
    Registered User
    Join Date
    06-27-2011
    Location
    US
    MS-Off Ver
    Excel 2007
    Posts
    7

    Re: Delete duplicates by requesting to keep a specific row from the bunch of duplicat

    Thank you Marcol. It takes more than 2 coffees to run through (about 20min). However, the solution you provided solves the problem. I just really really wish it was faster. Thank you for your help/advice throughout.
    I definitely agree with your point of providing me with native formula. I'm certainly more 'Excel knowledgeable' at the end of this.
    Yet, would you mind giving me the macro that I can pin on my toolbar. So that each time I want to use it, I would just have to click on that button? Thanks.

  9. #9
    Forum Guru (RIP) Marcol's Avatar
    Join Date
    12-23-2009
    Location
    Fife, Scotland
    MS-Off Ver
    Excel '97 & 2003/7
    Posts
    7,216

    Re: Delete duplicates by requesting to keep a specific row from the bunch of duplicat

    Hmm?
    I didn't notice this possibility before.

    What is the full list of values in "Reference qualifier" like?
    Is "EN" alphabetically the first in this list?

    If so then Sort the Data By:=
    "Amount by Line Item" Order:= smallest to largest.
    Add a level
    Then by
    "Shipment Id" Order:= smallest to largest.
    Add a level
    Then by
    "Reference qualifier" Order:= A-Z

    Then Data > Remove Duplicates
    Select and check "Amount by Line Item" and "Shipment Id"

    If this is possible it will beat any formula or VBa hands down!!!

  10. #10
    Registered User
    Join Date
    06-27-2011
    Location
    US
    MS-Off Ver
    Excel 2007
    Posts
    7

    Re: Delete duplicates by requesting to keep a specific row from the bunch of duplicat

    I thought about that but 'EN' is in the middle of the list. There are 'CO','MB', 'OL'. Plus, the list is not static: codes can be added on a need basis at anytime, so I don't know what will be the list in the future.

  11. #11
    Forum Guru (RIP) Marcol's Avatar
    Join Date
    12-23-2009
    Location
    Fife, Scotland
    MS-Off Ver
    Excel '97 & 2003/7
    Posts
    7,216

    Re: Delete duplicates by requesting to keep a specific row from the bunch of duplicat

    Okay, just a thought, it will be tomorrow before I can get back to this.

  12. #12
    Registered User
    Join Date
    06-27-2011
    Location
    US
    MS-Off Ver
    Excel 2007
    Posts
    7

    Re: Delete duplicates by requesting to keep a specific row from the bunch of duplicat

    OK. Thank you.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1