+ Reply to Thread
Results 1 to 25 of 25

Finding duplicate data and merging cells

  1. #1
    Registered User
    Join Date
    11-05-2007
    Posts
    32

    Finding duplicate data and merging cells

    Hello all. I'm going to be incredibly selfish, as this is my first post in this forum and it will involve me asking for help. However, time is of the essence for me, and so I don't have much choice.

    What I have is a massive table of data (15k+ entries), which consists of 4 different tables merged together. I have a lot of duplicate entries. However, most of these entries have information in one or another columns that its duplicate does not.

    What I need to do is:

    Identify the duplicate entries (by part number, which is one of the columns).

    Merge the duplicate entries. In the merging process, I need to carry information over, taking data that is currently in 2 or more rows and merging it into one row.

    Here's an example:

    column1 column2 column3 column4
    row1 a . . d
    row2 . b
    row3 . . c

    needs to turn into:

    column1 column2 column3 column4
    row1 a b c d


    If I'm not explaining this well, I will do my best to clarify. My main question is, is there any way to automate this process (even partially)? Any help would be greatly appreciated.

    Thanks.

  2. #2
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678
    I'm going to be incredibly selfish, this is my first post in this forum and it will involve me asking for help
    A help forum where no one asked for help would be a Zen experience.

    Yes, it's possible to automate. What happens when overlapping data exists in both records?

  3. #3
    Registered User
    Join Date
    11-05-2007
    Posts
    32
    With overlapping data, it would be best if I could use the data from the first column found with the relevant part number. So, for instance, if row1 was part number x, with description xyz, and row 2 was part number x, with description abc, it would work best if the description for the merged column became xyz.

    The reason is because there are some duplicate part numbers that are actually for different parts. I'm combining tables with the part numbers and information from a bunch of different manufacturers, and some of those manufacturers use the same part number. Losing that information in the merging process is something I'm willing to deal with, as I have to have this thing done by January 1st.

  4. #4
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678
    there are some duplicate part numbers that are actually for different parts. I'm combining tables with the part numbers and information from a bunch of different manufacturers, and some of those manufacturers use the same part number.
    Right, but if the manufacturer is in each record, then the combination of mfr and part number are unique (right?).

    Can you post a representative sample of data -- enough that it captures all the possibilities?

  5. #5
    Registered User
    Join Date
    11-05-2007
    Posts
    32
    Sure thing.

    Edit: parsing into this forum isn't working, so I'll try attaching the spreadsheet.

  6. #6
    Registered User
    Join Date
    11-05-2007
    Posts
    32
    Hope this works...
    Attached Images Attached Images

  7. #7
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678
    A spreadsheet -- zip and attach.

  8. #8
    Registered User
    Join Date
    11-05-2007
    Posts
    32
    Again, hope this works...
    Attached Files Attached Files

  9. #9
    Registered User
    Join Date
    11-05-2007
    Posts
    32
    I was just wondering if anyone thought they might be able to help with this. I've started merging them manually, and it's about to make me tear my hair out :/

  10. #10
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678
    1. What rule should be applied to decide if two part numbers are the same; e.g.,

    is 00007 (bearing) the same as 007 (1/2" blank)?
    is 00029 (yoke) the same as 0029 (motor yoke)?

    2. What determines which data should be used when there is different info in the same columns for the same part?

  11. #11
    Registered User
    Join Date
    11-05-2007
    Posts
    32
    The part numbers should be exactly the same. So in the two examples you posted, they should have seperate listings. Unfortunately, the 2nd example you listed is actually the same part, but it's listed under the wrong part number in one of our locations, and therefore it's acting as a second part. So the greater good is served by only merging part numbers that are exactly the same.

    One possible exception I would like to make to that, if it's possible, is to have any -'s or spaces ignored, as there's hundreds of part numbers that are the same, but show up as different because someone wanted to add a -. Periods (.) are a seperate thing from that, as they are used to seperate parts that are different but that have the same part number.

    The second question is also going to be a greater good answer as well. The best answer that I can come up with is to always use the information that's in a higher row, since most of those came off a list that I believe to be more accurate than the other three. So really, everything in a higher row should override the other rows, unless the first row doesn't have anything entered in that column.

    Does that make sense?

  12. #12
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678
    Higher row meaning it appears higher in the table (i.e., has a lower row number), or meaning a higher row number (i.e, appears lower in the table)?

    If the higher precedence row is blank in a certain column, and the lower precedence row has a value, what's the output?

  13. #13
    Registered User
    Join Date
    11-05-2007
    Posts
    32
    Higher in the table, so a lower row number should take precedence.

    The only way the row taking precedence (the one higher in the table) should accept data is if it doesn't already have a value for that column. So, to answer your question, if the higher row has nothing in the column, then it takes the value from the lower colum. I'll try to do an example.

    Row 1 | value a | value a | no value | value a|
    Row 2 | value b | value b | value b | value b|

    should, after merging, look like:

    Row 1 |value a | value a | value b | value a |

  14. #14
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678
    That's a much better explanation, thank you.

    I'll take a look at it later today if no one else does so first.

  15. #15
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678
    Take a look at the attachment; do Alt+F8, and run macro OBA. Only minimal testing.

    I did make one change to the input data before running this: I cleared all the data in the cost column that was zero, assuming that any price that appeared for a given part was better than no price.

    To do this, I selected col G, and did Edit > Find > 0, Find All. That gives you a big list of locations. Select the first line in the list, scroll to the bottom of the list, hold the Shift key down, and select the bottom line in the list. Now just hit the Delete key, and they're all gone in one fell swoop.
    Last edited by shg; 01-07-2009 at 12:10 PM.

  16. #16
    Registered User
    Join Date
    11-05-2007
    Posts
    32
    From the cursory inspection I gave the output, this thing is bleeping beautiful! Thanks a ton man, if this works the way it looks like it does, you just saved me from a very hellish 1.5 months!

  17. #17
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678
    Well that's bleeping wonderful. Glad it worked for you.

    BTW, I didn't address the issue of a trailing s or period -- that explanation wasn't clear to me, and I didn't see examples in the sample.
    Last edited by shg; 11-08-2007 at 01:51 PM.

  18. #18
    Registered User
    Join Date
    11-05-2007
    Posts
    32
    I was more concerned with the part numbers that are identical other than dashes or spaces.

    Such as

    C403 0292
    C403-0292

    These are the same part, with basically the same part number, but not close enough that the merging macro recognizes them and merges them. Periods and S's generally denote different parts, so those should stay.

    I feel like a heel asking for anything more, but is it possible to have the macro merge parts that are only different by a dash or a space?

    If not, no worries, there's only a few hundred of those, so I can do them manually.

    Thanks again for all your help!

  19. #19
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678
    Can all spaces and hyphens be deleted? Or all spaces replaced with hyphens?

  20. #20
    Registered User
    Join Date
    11-05-2007
    Posts
    32
    Either one of those solutions would be fine. I think I'd prefer that all the spaces were replaced by hyphens, but anything would be better than what I've got now.
    Last edited by oneblueaugust; 11-08-2007 at 07:05 PM.

  21. #21
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678
    Try this ...
    Attached Files Attached Files

  22. #22
    Registered User
    Join Date
    11-05-2007
    Posts
    32
    Can you make the first macro work with this spreadsheet? The one in the post immediately above this didn't seem to work. If it's as easy as changing a value here and there, I'd be more than happy to do it myself if it's explained to me.

    If not, that's ok. I'm already a lot farther than I ever thought I'd be with this thing.
    Attached Files Attached Files
    Last edited by oneblueaugust; 12-20-2007 at 12:51 PM.

  23. #23
    Registered User
    Join Date
    07-23-2012
    Location
    United States
    MS-Off Ver
    Excel 2010
    Posts
    1

    Re: Finding duplicate data and merging cells

    Hi, Im looking for similiar macro and didnt see the attachment mentioned on the previous page.. thanks

  24. #24
    Forum Guru
    Join Date
    03-12-2010
    Location
    Canada
    MS-Off Ver
    2010 and 2013
    Posts
    4,417

    Re: Finding duplicate data and merging cells

    Hello and welcome to the forum,

    Please start a new thread as this thread is very old and attach a sample of your workbook. You can replace the actual data with dummy values but leave the structure the same.

    Thank you.

    abousetta
    Please consider:

    Thanking those who helped you. Click the star icon in the lower left part of the contributor's post and add Reputation.
    Cleaning up when you're done. Mark your thread [SOLVED] if you received your answer.

  25. #25
    Registered User
    Join Date
    06-14-2013
    Location
    Indonesia
    MS-Off Ver
    Excel 2007
    Posts
    1

    Re: Finding duplicate data and merging cells

    I like it!!

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1