+ Reply to Thread
Page 1 of 2 1 2 LastLast
Results 1 to 15 of 20

Remove duplicates from col.A and merge col.B

  1. #1
    Forum Contributor
    Join Date
    11-12-2007
    Location
    Germany
    MS-Off Ver
    2007
    Posts
    472

    Remove duplicates from col.A and merge col.B

    Hello,

    i have a wordlist containing about 0.5 Million words in col.A and Millions of meanings in Col.B. Now in Col. A there are many duplicates. I would like to make one cell of all duplicates in col.A and put their meanings together in B with seperator ";". Its very important that there are no duplicates in Col. B.

    For example:
    Col. A Col.B
    w1 m1;m2;m3
    w1 m1;m2;m5
    w2 m23;m24
    w2 m50

    Now after running the macro the list should look like:

    Col.A Col.B
    w1 m1;m2;m3;m5
    w2 m23;m24;m50

    The words in col. A are alphabetically ordered and all the duplicates are followed by each other.

    Thanks for your help in advance.

  2. #2
    Forum Moderator Leith Ross's Avatar
    Join Date
    01-15-2005
    Location
    San Francisco, Ca
    MS-Off Ver
    2000, 2003, & 2010
    Posts
    23,259
    Hello wali,

    I am working on a solution. Given the amount of data, I am testing different methods to obtain the fastest method. Hopefully, I will have a solution for you in few hours.

    Sincerely,
    Leith Ross

  3. #3
    Forum Contributor
    Join Date
    11-12-2007
    Location
    Germany
    MS-Off Ver
    2007
    Posts
    472
    Hi Leith Ross,
    thank you very much. I will be looking forward for your solution. Its quite a big list. I hope that it will work.

  4. #4
    Forum Guru mikerickson's Avatar
    Join Date
    03-30-2007
    Location
    Davis CA
    MS-Off Ver
    Excel 2011
    Posts
    6,155
    This Sub/UDF combination should work.
    Run ConsolidateData after changing the sheet name to match your situation.
    Please Login or Register  to view this content.
    _
    ...How to Cross-post politely...
    ..Wrap code by selecting the code and clicking the # or read this. Thank you.

  5. #5
    Forum Contributor
    Join Date
    11-12-2007
    Location
    Germany
    MS-Off Ver
    2007
    Posts
    472
    Hello Mikerickson,
    thank you very much for your code. But unfortunately it removes all my data. The sheet1 is blank after running the code.
    There is only one sheet named "sheet1" and i run your code and each time it delets all data. I dont know what i am doing worng. Or has it something to do with my excel language version. I am using German Excel 2007. And normaly the english formulas dont work for german version.
    Last edited by wali; 08-10-2008 at 05:38 PM.

  6. #6
    Forum Guru mikerickson's Avatar
    Join Date
    03-30-2007
    Location
    Davis CA
    MS-Off Ver
    Excel 2011
    Posts
    6,155
    Can you post a copy of your workbook? (it shouldn't be doing that)

    If its not possible to attach your workbook, try editing out these two line.
    Please Login or Register  to view this content.
    and reporting back the results of running that code. (That won't fix things but it will help diagnose.)

  7. #7
    Forum Contributor
    Join Date
    11-12-2007
    Location
    Germany
    MS-Off Ver
    2007
    Posts
    472
    Hi,
    here is a sample of my data.
    Now i installed english version of Excel but it didnt help. I also removed those two lines of code but they still give wrong result.
    As i tried it with smaller number of words, it didnt delet them but the second column doesnt give the wanted results. Please refer to the screenshots.
    Attached Images Attached Images
    Attached Files Attached Files
    Last edited by wali; 08-11-2008 at 02:11 AM.

  8. #8
    Forum Guru mikerickson's Avatar
    Join Date
    03-30-2007
    Location
    Davis CA
    MS-Off Ver
    Excel 2011
    Posts
    6,155
    Could you post a copy of the data before running the code? Attaching an Excel workbook would be more informative than a screen image.

    I'm thinking of two possible issues.
    a) the list in A is not sorted
    b) the delimiter might not be being read as ";"

    This version sorts the data first, bringing all of column A's duplicate entries together, and introduces a Delimiter variable to the routine, that can be changed to meet your situation. (The UDF is still needed)
    Please Login or Register  to view this content.

  9. #9
    Forum Contributor
    Join Date
    11-12-2007
    Location
    Germany
    MS-Off Ver
    2007
    Posts
    472
    Hello,
    i have uploaded the excel file containing the data. In that file macro is not yet run. So its the original file. Please refer to my older message "c.zip" thats the excel file containing the data.

    Before running the macro i have sorted the data. So that cant be the reason.

    Now i tried your new code and that doesnt work either.

    What is UDF? Am i doing something wrong in running the code? Is there something else to be done eccept running the code?

    I pasted the code of yours in the code area and run it. It runs but gives the wrong results. Then i installed the english version of office and it didnt work either.

    Thanks
    Last edited by wali; 08-11-2008 at 04:02 PM.

  10. #10
    Forum Guru mikerickson's Avatar
    Join Date
    03-30-2007
    Location
    Davis CA
    MS-Off Ver
    Excel 2011
    Posts
    6,155
    I can't get any of the Arabic to concatenate. (which is crucial to all of the routine.)
    All the arabic characters return as CHAR(95) in my version.
    Let me see if I can find someone who has experience dealing with Arabic text.

  11. #11
    Forum Moderator Leith Ross's Avatar
    Join Date
    01-15-2005
    Location
    San Francisco, Ca
    MS-Off Ver
    2000, 2003, & 2010
    Posts
    23,259
    Hello wali,

    Sorry for the delay, I have been working off site this week. The macro has run well and quickly for me (don't let the size put you off). It condenses the words and their definitions on the same page in columns "A" and "B". You can set the starting row, starting column, and the worksheet name to what you are using. I used "A" as the starting column, 1 as the starting row, and "Sheet2" as the worksheet. These variables are marked in red. I ran quite a variations of code and timed them, and this was the best. I don't have Excel 2007, but it should work. Make a copy of your worksheet before you run the macro. Let me know how it works.
    Please Login or Register  to view this content.
    Sincerely,
    Leith Ross

  12. #12
    Forum Contributor
    Join Date
    11-12-2007
    Location
    Germany
    MS-Off Ver
    2007
    Posts
    472
    Hi Leith Ross,
    first of all thank you very much that you took the time to solve my problem. I am really very thankful for that.

    Unfortunately the code is not giving the wanted results. I am sending screenshots of how the results look like now and how they should be. I am also sending an example excel file.

    Thank you very much
    Attached Images Attached Images
    Attached Files Attached Files

  13. #13
    Forum Moderator Leith Ross's Avatar
    Join Date
    01-15-2005
    Location
    San Francisco, Ca
    MS-Off Ver
    2000, 2003, & 2010
    Posts
    23,259
    Hello wali,

    Here is the amended code. I caught the error you posted when I rechecked the code. This should be error free.
    Please Login or Register  to view this content.
    Sincerely,
    Leith Ross

  14. #14
    Forum Contributor
    Join Date
    11-12-2007
    Location
    Germany
    MS-Off Ver
    2007
    Posts
    472
    Hi,
    its working correctly now. It cant deal with big list but that is not a problem. As i ran the code for 0.5 Millions entires in col. A, it deleted all the data. Then i reduced it to 200 000 words and it didnt function. I kept the number reducing till it worked. Now it works for about 20 000 entries very well. Now i have splitted the whole wordlist into 20t blocks and am processing it step by step. I will let you know if there are some irregularaties.

    Can you please do me another favour? Can you modify the code in such a way that the following columns like col. C, D, E, F and G are processed the same way as column B. I have to process also wordlists which contain word-->grammer-->phonetic-->meaning--->usage. So there are columns involved upto column G. Is it possible to merge the data of other columns the same way as in column B, if a duplicate in column A exist?

    thanks

  15. #15
    Forum Moderator Leith Ross's Avatar
    Join Date
    01-15-2005
    Location
    San Francisco, Ca
    MS-Off Ver
    2000, 2003, & 2010
    Posts
    23,259
    Hello wali,

    Can you post a sample worksheet I could use to develop the code with?

    Sincerely,
    Leith Ross

+ Reply to Thread
Page 1 of 2 1 2 LastLast

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1