+ Reply to Thread
Results 1 to 12 of 12

Removing Duplicate Data All At One Time for Power Query

  1. #1
    Registered User
    Join Date
    07-19-2020
    Location
    NH, US
    MS-Off Ver
    2016
    Posts
    19

    Removing Duplicate Data All At One Time for Power Query

    I'm working with HR data for about 25k employees. This data was spread across 15 different spreadsheets with about 5 tabs per spreadsheet which I've consolidated. The information was labeled only with their employee ID. I currently have about 225k rows of information (some duplicates, some not) for about 25k employees that I plan on consolidating with power query. The names had to be in a specific format (vertically instead of horizontally) so I've ran a formula and I've got exactly what I need. But I need to remove the values in red in order to remove the duplicate error.

    Any formulas, chart tricks/hacks, anything to automatically remove the reds?
    Attached Files Attached Files

  2. #2
    Forum Expert
    Join Date
    05-30-2012
    Location
    The Netherlands
    MS-Off Ver
    Office 365
    Posts
    14,987

    Re: Removing Duplicate Data All At One Time for Power Query

    I2 =IF(COUNTA(B2:E2)=4,"save","delete")

    After that you can filter on column I and choose for delete.

    You have to realise that you not have all data in the same row.
    Notice my main language is not English.

    I appreciate it, if you reply on my solution.

    If you are satisfied with the solution, please mark the question solved.

    You can add reputation by clicking on the star * add reputation.

  3. #3
    Registered User
    Join Date
    07-19-2020
    Location
    NH, US
    MS-Off Ver
    2016
    Posts
    19

    Re: Removing Duplicate Data All At One Time for Power Query

    Thanks for your reply. I can't do that - because then I'd be losing the other fields of information. Because the stuff was so scattered, line 2 may have contained one name + SSN, line 4 may have contained a different variation of the name with the DOB. Eventually this will all be sandwiched together into one horizontal line.

  4. #4
    Forum Expert bebo021999's Avatar
    Join Date
    07-22-2011
    Location
    Vietnam
    MS-Off Ver
    Excel 2016
    Posts
    8,711

    Re: Removing Duplicate Data All At One Time for Power Query

    In J2, drag down to display unique ID of name in A2:

    Please Login or Register  to view this content.
    In K2, to get the last info of that ID in column B:

    Please Login or Register  to view this content.
    Drag down

    Copy column K to the right.

    Now you can copy/paste special/value to get the full information.
    Attached Files Attached Files

  5. #5
    Administrator alansidman's Avatar
    Join Date
    02-02-2010
    Location
    Steamboat Springs
    MS-Off Ver
    MS Office 365 Version 2301 (Build 16012.20000) Win 11 Home 64 Bit
    Posts
    22,314

    Re: Removing Duplicate Data All At One Time for Power Query

    Try this vba code:
    Please Login or Register  to view this content.
    Alan


    Change an Ugly Report with Power Query
    Database Normalization
    Complete Guide to Power Query
    Man's Mind Stretched to New Dimensions Never Returns to Its Original Form

  6. #6
    Forum Guru Bo_Ry's Avatar
    Join Date
    09-10-2018
    Location
    Thailand
    MS-Off Ver
    MS 365
    Posts
    7,090

    Re: Removing Duplicate Data All At One Time for Power Query

    Please try
    Open Power Query/Get and Transform. Click on New Query.
    Open blank query in the editor, launch Advanced Editor and paste in the following code.
    Please Login or Register  to view this content.
    Attached Files Attached Files

  7. #7
    Registered User
    Join Date
    07-19-2020
    Location
    NH, US
    MS-Off Ver
    2016
    Posts
    19

    Re: Removing Duplicate Data All At One Time for Power Query

    This worked well for the most part. Any idea on how to deal with extra data? For example - one column contains phone numbers. Some people have two phone numbers. So when I do this it just gives me the first phone number listed but not the second one.

  8. #8
    Registered User
    Join Date
    07-19-2020
    Location
    NH, US
    MS-Off Ver
    2016
    Posts
    19

    Re: Removing Duplicate Data All At One Time for Power Query

    Quote Originally Posted by bebo021999 View Post
    In J2, drag down to display unique ID of name in A2:

    Please Login or Register  to view this content.
    In K2, to get the last info of that ID in column B:

    Please Login or Register  to view this content.
    Drag down

    Copy column K to the right.

    Now you can copy/paste special/value to get the full information.
    This worked well for the most part. Any idea on how to deal with extra data? For example - one column contains phone numbers. Some people have two phone numbers. So when I do this it just gives me the first phone number listed but not the second one.

  9. #9
    Registered User
    Join Date
    07-19-2020
    Location
    NH, US
    MS-Off Ver
    2016
    Posts
    19

    Re: Removing Duplicate Data All At One Time for Power Query

    Quote Originally Posted by Bo_Ry View Post
    Please try
    Open Power Query/Get and Transform. Click on New Query.
    Open blank query in the editor, launch Advanced Editor and paste in the following code.
    Please Login or Register  to view this content.
    I will try this shortly. In my comment below to another user I asked how this deals with extra data? For example - one column contains phone numbers. Some people have two phone numbers. So when I do this it just gives me the first phone number listed but not the second one.

  10. #10
    Administrator alansidman's Avatar
    Join Date
    02-02-2010
    Location
    Steamboat Springs
    MS-Off Ver
    MS Office 365 Version 2301 (Build 16012.20000) Win 11 Home 64 Bit
    Posts
    22,314

    Re: Removing Duplicate Data All At One Time for Power Query

    Did you look at my post #5 that deletes the data in Red using VBA. Then once you have cleared the data, you can bring your table/range into PQ for whatever other issues you need to work on. Probably will then need to Unpivot your data to get it into a format that you will require subsequently.

  11. #11
    Forum Expert bebo021999's Avatar
    Join Date
    07-22-2011
    Location
    Vietnam
    MS-Off Ver
    Excel 2016
    Posts
    8,711

    Re: Removing Duplicate Data All At One Time for Power Query

    In R2 for phone number 1 in column I
    Please Login or Register  to view this content.
    **Array formulas are not entered in the same way as 'standard' formulas. Instead of pressing just ENTER, you first hold down CTRL and SHIFT, and only then press ENTER. If you've done it correctly, you'll notice Excel puts curly brackets {} around the formula (though do not attempt to manually insert these yourself).

    In S2 for phone number 2

    Please Login or Register  to view this content.
    Enter only.

    Drag down
    Attached Files Attached Files

  12. #12
    Forum Guru Bo_Ry's Avatar
    Join Date
    09-10-2018
    Location
    Thailand
    MS-Off Ver
    MS 365
    Posts
    7,090

    Re: Removing Duplicate Data All At One Time for Power Query

    Quote Originally Posted by row2000 View Post
    how this deals with extra data? For example - one column contains phone numbers.
    This combines all distinct data separate by comma.

    Please Login or Register  to view this content.
    Attached Files Attached Files

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Power Query - removing embedded filters after export
    By Lauris K in forum Excel General
    Replies: 0
    Last Post: 06-25-2019, 07:25 AM
  2. Power Query - Remove Specific Duplicate Rows (Based on sort criteria)
    By kersplash in forum Excel Formulas & Functions
    Replies: 6
    Last Post: 03-22-2019, 03:30 AM
  3. [SOLVED] use formulas or power query to add/duplicate linked rows with data based on column
    By loganpmgoa in forum Excel Formulas & Functions
    Replies: 1
    Last Post: 02-14-2019, 01:46 PM
  4. Power Query - removing duplicate data, but keeping second set?
    By IronCladRooster in forum Excel General
    Replies: 0
    Last Post: 10-18-2018, 07:24 PM
  5. Sub-Forum for Excel Power Tools (Power Query, Power Pivot & Power BI)
    By chullan88 in forum Suggestions for Improvement
    Replies: 10
    Last Post: 06-28-2018, 02:25 PM
  6. Duplicate filters in Power Query
    By leukep in forum Excel Programming / VBA / Macros
    Replies: 1
    Last Post: 06-20-2017, 05:06 AM
  7. Finding and deleting 1 duplicate in power query
    By stephme55 in forum Excel General
    Replies: 1
    Last Post: 09-07-2016, 10:09 AM

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1