+ Reply to Thread
Results 1 to 13 of 13

Thread: Delete Duplicates and Concatenate Unique Cells

  1. #1
    Registered User
    Join Date
    12-16-2010
    Location
    Phnom Penh
    MS-Off Ver
    Excel 2010
    Posts
    7

    Question Delete Duplicates and Concatenate Unique Cells

    Hello,
    This is my first post here - I have been searching around for quite a while for a solution, and I have found posts that are close, but because I don't know VB I am unable to modify them to work for my exact needs.

    What I need is like this: http://www.excelforum.com/excel-prog...nd-delete.html
    and
    http://www.excelforum.com/excel-work...en-delete.html

    Here's my data (I also attached a worksheet):
    What I HAVE:
    Term POS1 POS2 POS3 POS4 POS5 POS6
    find NUM
    fun ROY NNP
    hello NN VB
    hello NN JJ
    try JJ
    try NN


    WHAT I WANT:
    Term POS1
    find NUM
    fun ROY NNP
    hello NN VB JJ
    try JJ NN



    The data is words with their part of speech labels, so I only want one entry per word, but need to concatenate the unique part of speech labels with each word.

    I hope that makes sense - I would appreciate your help!

    Thank you ,
    Nathan
    Attached Files Attached Files
    Last edited by sungkhum; 12-16-2010 at 05:46 AM. Reason: undo

  2. #2
    Forum Guru, retired Admin royUK's Avatar
    Join Date
    11-18-2003
    Location
    Derbyshire,UK
    MS-Off Ver
    Xp; 2007; 2010
    Posts
    25,640

    Re: Delete Duplicates and Concatenate Unique Cells

    Select a cell in the table. From the Data tab select "Remove Duplicates" for Column A

    Not sure what you mean by the rest
    Last edited by royUK; 12-16-2010 at 04:23 AM.
    Hope that helps.

    RoyUK
    --------
    If you are pleased with a member's answer then use the Star icon to rate it, if you are pleased enough to part with cash consider a donation to Children in Need

    For Excel Tips & Solutions, free examples and tutorials why not check out my downloads

    New members please read & follow the Forum Rules

    Remember to mark your questions Solved and rate the answer(s)

  3. #3
    Registered User
    Join Date
    12-16-2010
    Location
    Phnom Penh
    MS-Off Ver
    Excel 2010
    Posts
    7

    Re: Delete Duplicates and Concatenate Unique Cells

    Yes, it is a little more complex than that.
    What I have is two identical terms with extra columns that have unique data that I want to concatenate.



    For example:
    hello JJ VB
    hello NN ROY VB

    Would become:
    hello JJ VB NN ROY

    Is that a little clearer?
    Last edited by sungkhum; 12-16-2010 at 05:46 AM. Reason: undo

  4. #4
    Registered User
    Join Date
    12-16-2010
    Location
    Phnom Penh
    MS-Off Ver
    Excel 2010
    Posts
    7

    Re: Delete Duplicates and Concatenate Unique Cells

    I found code that comes close to what I need for two columns, but I am not sure how to modify it for say, 5 or 6 columns (or however many I need) as well as remove duplicates. Can someone help?

    Sub ConcatData()
    Dim X As Double
    Dim DataArray(5000, 2) As Variant
    Dim NbrFound As Double
    Dim Y As Double
    Dim Found As Integer
    Dim NewWks As Worksheet
    
    Cells(1, 1).Select
    Let X = ActiveCell.Row
    Do While True
    If Len(Cells(X, 1).Value) = Empty Then
    Exit Do
    End If
    If NbrFound = 0 Then
    NbrFound = 1
    DataArray(1, 1) = Cells(X, 1)
    DataArray(1, 2) = Cells(X, 2)
    Else
    For Y = 1 To NbrFound
    Found = 0
    If DataArray(Y, 1) = Cells(X, 1).Value Then
    DataArray(Y, 2) = DataArray(Y, 2) & ", " & Cells(X, 2)
    Found = 1
    Exit For
    End If
    Next
    If Found = 0 Then
    NbrFound = NbrFound + 1
    DataArray(NbrFound, 1) = Cells(X, 1).Value
    DataArray(NbrFound, 2) = Cells(X, 2).Value
    End If
    End If
    X = X + 1
    Loop
    
    Set NewWks = Worksheets.Add
    NewWks.Name = "SummarizedData"
    Cells(1, 1).Value = "Code"
    Cells(1, 2).Value = "Colors Found"
    X = 2
    For Y = 1 To NbrFound
    Cells(X, 1).Value = DataArray(Y, 1)
    Cells(X, 2).Value = DataArray(Y, 2)
    X = X + 1
    Next
    Beep
    MsgBox ("Summary is done!")
    
    
    
    End Sub
    Last edited by sungkhum; 12-16-2010 at 06:06 AM. Reason: updated info about code

  5. #5
    Forum Guru
    Join Date
    08-26-2007
    Location
    London
    Posts
    3,788

    Re: Delete Duplicates and Concatenate Unique Cells

    Here is another approach. Puts results in K1.
    Sub x()
    
    Dim oDic As Object, vData As Variant, vOut, n As Long, i As Long, j As Long
    
    Set oDic = CreateObject("Scripting.Dictionary")
    
    vData = Range("A1").CurrentRegion.Value
    ReDim vOut(1 To UBound(vData, 1), 1 To 2)
    
    With oDic
        For i = 1 To UBound(vData, 1)
            If Not IsEmpty(vData(i, 1)) And Not .Exists(vData(i, 1)) Then
                n = n + 1
                .Add vData(i, 1), n
                vOut(n, 1) = vData(i, 1)
                For j = 2 To UBound(vData, 2)
                    If Len(vData(i, j)) = 0 Then Exit For
                    vOut(n, 2) = vOut(n, 2) & " " & vData(i, j)
                Next j
            ElseIf .Exists(vData(i, 1)) Then
                For j = 2 To UBound(vData, 2)
                    If Len(vData(i, j)) = 0 Then Exit For
                    If InStr(vOut(.Item(vData(i, 1)), 2), " " & vData(i, j) & " ") = 0 Then
                        vOut(.Item(vData(i, 1)), 2) = vOut(.Item(vData(i, 1)), 2) & " " & vData(i, j)
                    End If
                Next j
            End If
        Next i
    End With
    
    Range("K1").Resize(n, 100) = vOut
    Columns("L:L").TextToColumns Destination:=Range("L1"), DataType:=xlDelimited, _
                   TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=True, Tab:=False, _
                   Semicolon:=False, Comma:=False, Space:=True, Other:=False
    Columns("L:L").Delete
    
    End Sub
    EDIT: amended slightly, ignore my previous comment about 100 limit, completely wrong!
    Last edited by StephenR; 12-16-2010 at 09:06 AM.

  6. #6
    Registered User
    Join Date
    12-16-2010
    Location
    Phnom Penh
    MS-Off Ver
    Excel 2010
    Posts
    7

    Re: Delete Duplicates and Concatenate Unique Cells

    Sweet - that seems to do the trick! Thanks! You saved me a ton of time!

    -nathan

  7. #7
    Registered User
    Join Date
    12-16-2010
    Location
    Phnom Penh
    MS-Off Ver
    Excel 2010
    Posts
    7

    Re: Delete Duplicates and Concatenate Unique Cells

    Actually doing some more testing, there seems to be a problem with duplicates (it almost works every time). Here is a situation where duplicates are allowed:
    test VB NN NN JJ
    test VB

    The result is:
    test VB NN NN JJ

    Is there a way you can remove the duplicates (so there is only one NN for instance).

    Thanks so much!
    -Nathan

  8. #8
    Forum Guru
    Join Date
    08-26-2007
    Location
    London
    Posts
    3,788

    Re: Delete Duplicates and Concatenate Unique Cells

    I think this is better code:
    Sub x()
    Dim oDic1 As Object, oDic2 As Object
    Dim i As Long, j As Long
    Dim s As String
    Dim vData As Variant, v1 As Variant, v2 As Variant
    
    Set oDic1 = CreateObject("Scripting.Dictionary")
    vData = Range("A1").CurrentRegion.Value
    ReDim vOut(1 To UBound(vData, 1), 1 To 2)
    
    With oDic1
        For i = 1 To UBound(vData, 1)
            If Not IsEmpty(vData(i, 1)) And Not .Exists(vData(i, 1)) Then
                Set oDic2 = CreateObject("Scripting.Dictionary")
                For j = 2 To UBound(vData, 2)
                    If Len(vData(i, j)) = 0 Then Exit For
                    If Not oDic2.Exists(vData(i, j)) Then oDic2.Add vData(i, j), vData(i, j)
                Next j
                .Add vData(i, 1), oDic2
            ElseIf .Exists(vData(i, 1)) Then
                Set oDic2 = .Item(vData(i, 1))
                For j = 2 To UBound(vData, 2)
                    If Len(vData(i, j)) = 0 Then Exit For
                    If Not oDic2.Exists(vData(i, j)) Then
                        oDic2.Add vData(i, j), vData(i, j)
                    End If
                Next j
            End If
        Next i
    End With
    
    With oDic1
        Range("K1").Resize(UBound(.Keys) + 1, 1) = Application.Transpose(.Keys)
        v1 = .Keys
        For i = 0 To UBound(v1)
            s = ""
            Set oDic2 = .Item(v1(i))
            v2 = oDic2.Keys
            For j = 0 To oDic2.Count - 1
                s = s & " " & v2(j)
            Next j
            Cells(i + 1, "L") = Mid(s, 2)
        Next i
    End With
    
    End Sub

  9. #9
    Registered User
    Join Date
    12-16-2010
    Location
    Phnom Penh
    MS-Off Ver
    Excel 2010
    Posts
    7

    Re: Delete Duplicates and Concatenate Unique Cells

    Thanks, that seems to do it!

    Would it be possible to put the results in separate columns like before though? That will make it easier when I import more to the data set later.

    Thanks so much - this has been a huge help (I have a list of about 40,000 terms I was going through - and this did it in like a second!).
    -nathan

  10. #10
    Forum Guru
    Join Date
    08-26-2007
    Location
    London
    Posts
    3,788

    Re: Delete Duplicates and Concatenate Unique Cells

    Oops, slipped off the radar. Dictionary is a very powerful tool.
    Sub x()
    
    Dim oDic1 As Object, oDic2 As Object
    Dim i As Long, j As Long
    Dim s As String
    Dim vData As Variant, v1 As Variant, v2 As Variant
    
    Set oDic1 = CreateObject("Scripting.Dictionary")
    vData = Range("A1").CurrentRegion.Value
    ReDim vOut(1 To UBound(vData, 1), 1 To 2)
    
    With oDic1
        For i = 1 To UBound(vData, 1)
            If Not IsEmpty(vData(i, 1)) And Not .Exists(vData(i, 1)) Then
                Set oDic2 = CreateObject("Scripting.Dictionary")
                For j = 2 To UBound(vData, 2)
                    If Len(vData(i, j)) = 0 Then Exit For
                    If Not oDic2.Exists(vData(i, j)) Then oDic2.Add vData(i, j), vData(i, j)
                Next j
                .Add vData(i, 1), oDic2
            ElseIf .Exists(vData(i, 1)) Then
                Set oDic2 = .Item(vData(i, 1))
                For j = 2 To UBound(vData, 2)
                    If Len(vData(i, j)) = 0 Then Exit For
                    If Not oDic2.Exists(vData(i, j)) Then
                        oDic2.Add vData(i, j), vData(i, j)
                    End If
                Next j
            End If
        Next i
    End With
    
    With oDic1
        Range("K1").CurrentRegion.ClearContents
        Range("K1").Resize(UBound(.Keys) + 1, 1) = Application.Transpose(.Keys)
        v1 = .Keys
        For i = 0 To UBound(v1)
            s = ""
            Set oDic2 = .Item(v1(i))
            v2 = oDic2.Keys
            For j = 0 To oDic2.Count - 1
                s = s & " " & v2(j)
            Next j
            Cells(i + 1, "L") = Mid(s, 2)
        Next i
    End With
    
    Columns("L:L").TextToColumns Destination:=Range("L1"), DataType:=xlDelimited, Space:=True
    
    End Sub
    Do you hang out at the FCC in PP?

  11. #11
    Registered User
    Join Date
    12-16-2010
    Location
    Phnom Penh
    MS-Off Ver
    Excel 2010
    Posts
    7

    Re: Delete Duplicates and Concatenate Unique Cells

    Awesome - thanks!

    Yes, I do go to FCC occasionally It's a nice spot to hang out. Have you been to Cambodia?
    Actually, the reason I was asking for this macro is because I am working on a grammar checker for Khmer and I am merging part of speech data from a lot of different sources.

    Thanks again,
    Nathan

  12. #12
    Forum Guru
    Join Date
    08-26-2007
    Location
    London
    Posts
    3,788

    Re: Delete Duplicates and Concatenate Unique Cells

    My pleasure.

    Yes visited a friend teaching English there a couple of times, getting on for 10 years ago now so no doubt the city has changed a lot. Very interesting place.

  13. #13
    Forum Guru snb's Avatar
    Join Date
    05-09-2010
    Location
    VBA
    MS-Off Ver
    Redhat
    Posts
    5,151

    Re: Delete Duplicates and Concatenate Unique Cells

    Maybe this gives the same result:

    Sub snb()
      sq = Cells(1).CurrentRegion
    
      For j = 1 To UBound(sq)
        c02 = ""
        For jj = 2 To UBound(sq, 2)
          c02 = c02 & "|" & sq(j, jj)
        Next
                  
        If InStr(c01, sq(j, 1)& "|") = 0 Then
          c01 = c01 & vbCr & sq(j, 1) & c02
        Else
          c01 = Replace(c01, Join(Filter(Split(c01, vbCr), sq(j, 1) & "|"), ""), Join(Filter(Split(c01, vbCr), sq(j, 1) & "|"), "") & c02)
        End If
      Next
    
      cells(1,10).resize(ubound(split(c01,vbcr)))=application.transpose(split(mid(c01,2),vbcr)
    End Sub



+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.2.0