+ Reply to Thread
Results 1 to 3 of 3

Combining duplicate companies and data

  1. #1
    Registered User
    Join Date
    03-09-2008
    Posts
    8

    Combining duplicate companies and data

    I have successfully combined two spreadsheets company specific data which came from different sources. I just cut and pasted but made sure they both had the same exact "26" column headers. After cleaning up some data within cells and columns, I now have a "small" mess of "duplicate" companies. I would say 20-30% overlap. However, out of 7000, that is a lot.

    Problem: Collectively between the two data sources, they provide 26 fields of company specific information. Thus I have constructed up front an excel spreedsheet that accomodates 26 columns for this information. However, the two data sources only have about 11 fields/columns of common data. Whence, the two data sources need to be combined at the company level so I will have ONE company (not two) which combines all 26 fields at the company level.

    Can this be done in Excel?

  2. #2
    Forum Moderator - RIP Richard Buttrey's Avatar
    Join Date
    01-14-2008
    Location
    Stockton Heath, Cheshire, UK
    MS-Off Ver
    Office 365, Excel for Windows 2010 & Excel for Mac
    Posts
    29,464
    Quote Originally Posted by Watchdog
    I have successfully combined two spreadsheets company specific data which came from different sources. I just cut and pasted but made sure they both had the same exact "26" column headers. After cleaning up some data within cells and columns, I now have a "small" mess of "duplicate" companies. I would say 20-30% overlap. However, out of 7000, that is a lot.

    Problem: Collectively between the two data sources, they provide 26 fields of company specific information. Thus I have constructed up front an excel spreedsheet that accomodates 26 columns for this information. However, the two data sources only have about 11 fields/columns of common data. Whence, the two data sources need to be combined at the company level so I will have ONE company (not two) which combines all 26 fields at the company level.

    Can this be done in Excel?
    Almost certainly yes. But you'll need to consider for those companies that are duplicated whether all the fields are the same. If they're not, you'll need to decide which of the two should be combined.

    The approach I'd take is to derive a unique list of companies. Take the company name/code field from each source and stack one set underneath the other in a third sheet. Now use Data Filter Advanced with the Unique operator, to extract a unique list of companies.

    Now it's simply just a question of using VLOOKUP() to grab the 26 fields of data from the source lists. In the case of the 11 common fields it won't matter which source you use, but for the other 15 fields you'll need to use one or other of the two source lists.

    HTH

  3. #3
    Registered User
    Join Date
    03-26-2008
    Posts
    45
    Also, you could probably use a pivot table to determine the unique data.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1