Good day, all...
I'm trying to find a way to eliminate duplicate values in a comma separated string. For example, in cell "F1" I have the following text:
EMAIL, BULLETIN BOARDS, EMAIL, MESSAGING, RETRACT MESSAGES, SCROLLING MESSAGES, RETRACT MESSAGES
And I'd like to end up with
EMAIL, BULLETIN BOARDS, MESSAGING, RETRACT MESSAGES, SCROLLING MESSAGES
** the second instance of "EMAIL" and "RETRACT MESSAGES" have been removed.
I have nearly 3,750 rows to go through, ranging from 2 to 88 phrases.
I suppose I could do a text to columns, and have 1 phrase per cell, then alphabatize horizontally, then run a loop through each of the cells, but that seems horribly inefficient.
If any of you could lend a hand, I'd appreciate whatever guidance you're offering.
Thanks!
Last edited by JP Romano; 03-24-2011 at 05:10 PM. Reason: SOLVED
hi, JP Romano, can you provide a sample workbook with data?
Okay, took a while for some reason, but I added the attachment to my original post... it's just a small subset of the data. Thanks again for any guidance!
This code will provide the cleaned up text in column G
Private Sub Cleanup() Dim c As Range Dim a As Variant Dim i As Long Dim result As String Set c = Range("F2") Do While c <> "" a = Split(c, ",") result = "," For i = LBound(a) To UBound(a) If InStr(1, result, a(i)) = 0 Then result = result & a(i) & "," End If Next i ' Lop off leading and trailing commas result = Mid(result, 2, Len(result) - 2) c.Offset(0, 1) = result Set c = c.Offset(1, 0) Loop End Sub
Making the world a better place one fret at a time
||||||
If someone helped you, please click on the star icon at the bottom of their post
If your problem is solved, please update the first post:
EDIT, Go Advanced button, set Prefix to SOLVED
[code]
' Enclose code in tags like this
[/code]
Don't attach a screenshot--just attach your Excel file! It's easier and will let us experiment with your data, formulas, and code.
6String - that did it. Thank you so much! I'll try to get a better feel for how it does what it does so I can learn more about it. Very cool stuff...thank you!
Watersev - thanks a million for your time!
hi, again, I know the thread is marked as solved and the offered code is good I would not say it is efficient in terms of time to process test file data attached, please check attachment and run code "test".
Time to process attached file example - 5751 rows of data:
Offered code: 26,7 seconds
My code: 2,8 seconds
If you are interested in analyzing the offered codes you'll find only two small things (apart from the main part - finding unique values) not used in one of them but they do make a big difference at the end. If to apply them to 6StringJazzer code you will get better results than both results provided above. To keep the intrigue I will leave codes analysis to you.
Last edited by watersev; 03-24-2011 at 07:23 PM.
watersev uses some very astute techniques worth noting.
The code turns off Application.ScreenUpdating. This is responsible for most of the time savings. Adding this to my code, which should have been there in the first place, reduces a huge amount of I/O overhead.
The code loads data into an array from the worksheet, and dumps it back out again from an array. This also saves I/O time vs. reading and writing each cell individually.
The code adds each keyword to a collection, which causes an error if a duplicate is added. The error is handled by simply not including that duplicate keyword. This may be faster than using InStr to scan for an existing keyword. I suppose that the collection uses indexing or hashing, so this could be faster than searching an entire string, but I'm not sure if it's a significant savings.
There are currently 1 users browsing this thread. (0 members and 1 guests)
Bookmarks