+ Reply to Thread
Results 1 to 11 of 11

Count non-repeated words in cell

  1. #1
    Registered User
    Join Date
    02-04-2007
    Posts
    10

    Unhappy Count non-repeated words in cell

    Hi,

    I want to count the nr of words in cell without counting the repeated words.
    For example, in "this is a ball and that is a square", the total nr of words is 9, and the total nr of non-repeated words is 7, because "is" and "a" are repeated twice, and I just want to count it once.

    Please help!

    Thanks,
    mxc668
    Last edited by mxc668; 10-14-2009 at 06:12 PM.

  2. #2
    Forum Expert NBVC's Avatar
    Join Date
    12-06-2006
    Location
    Mississauga, CANADA
    MS-Off Ver
    2003:2010
    Posts
    34,898

    re: Count non-repeated words in cell

    Assuming sentence is in A1, try:

    Please Login or Register  to view this content.
    confirmed with CTRL+SHIFT+ENTER not just ENTER. You will see { } brackets appear around the formula.

    This formula allows for up to 100 words in the cell...

    Only replace the red A1's to reference another cell...

    If you have more than 100 words adjust the occurrences $A$100 to suit.
    Last edited by NBVC; 10-11-2009 at 09:49 PM.
    Where there is a will there are many ways.

    If you are happy with the results, please add to the contributor's reputation by clicking the reputation icon (star icon) below left corner

    Please also mark the thread as Solved once it is solved. Check the FAQ's to see how.

  3. #3
    Registered User
    Join Date
    02-04-2007
    Posts
    10

    Re: Count non-repeated words in cell

    Thanks, but if you try "um a pink circle with a line across it rolled across the grey rectangle and um nudge the blue square to the end of the grey rectangle" in cell A1, then paste the fomula in cell b1, you get 0 in B1. Please help me a bit more on this.

    Best,
    MXC668

  4. #4
    Registered User
    Join Date
    02-04-2007
    Posts
    10

    Re: Count non-repeated words in cell

    By the way, I am using excell 2003.

  5. #5
    Registered User
    Join Date
    02-04-2007
    Posts
    10

    Re: Count non-repeated words in cell

    Sorry. I know how to use the fomula now. Thanks a lot! But maybe I did not make it clear that sometimes a word can be repeated more than once. Using real data, in cell A1 I have "um a pink circle with a line across it rolled across the grey rectangle and um nudge the blue square to the end of the grey rectangle", by using the fomula, I got 21. Actually, the number should be 19, because "the" has appeared four times in stead of twice.

    Could you please tell me how to deal with it?

    Thanks,
    mxc668
    Last edited by mxc668; 10-12-2009 at 03:30 PM.

  6. #6
    Forum Expert daddylonglegs's Avatar
    Join Date
    01-14-2006
    Location
    England
    MS-Off Ver
    2016
    Posts
    14,675

    Re: Count non-repeated words in cell

    In fact NBVC's formula is designed to only count the same word once, however many times it appears.

    In this case the failure is because using 100 as the multiplier and the number of spaces means that they will progressively get out of sync due to the word lengths....in this case that meant that the second "rectangle" wasn't extracted in its entirety but as "r" and then "ectangle", thus counting two words that don't really exist, hence a count of 21 rather than 19.

    Increasing the 100 multiplier by the average word length, e.g. to 104 for instance, would rectify the problem in most cases, i.e.

    =COUNT(1/FREQUENCY(IF(TRIM(MID(SUBSTITUTE(" "&A1," ",REPT(" ",100)),ROW($A$1:$A$100)*104,100))<>"",MATCH(TRIM(MID(SUBSTITUTE(" "&A1," ",REPT(" ",100)),ROW($A$1:$A$100)*104,100)),TRIM(MID(SUBSTITUTE(" "&A1," ",REPT(" ",100)),ROW($A$1:$A$100)*104,100)),0)),ROW($A$1:$A$100)))
    I'd be inclined to try for a more robust solution....but I haven't found it yet.....

  7. #7
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2010
    Posts
    40,678

    Re: Count non-repeated words in cell

    Comfortable with a VBA solution?
    Entia non sunt multiplicanda sine necessitate

  8. #8
    Forum Guru DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Northumberland, UK
    MS-Off Ver
    O365
    Posts
    21,531

    Re: Count non-repeated words in cell

    FWIW I'm with shg - this has UDF written all over it given sheer complexity required in native approach.

    I suspect shg has some tricks up his sleeve but here's one possible approach using the Dictionary Object

    Please Login or Register  to view this content.
    The above would be stored in a Module in VBE (in Macro Enabled file) and called from a cell along lines of

    =UNIQUEWORDS(A1)

    A default delimiter of space is assumed, if that's not the case then pass optional parameter of delimiter, eg:

    =UNIQUEWORDS(A1,":")

    would count unique words split by colon

  9. #9
    Forum Expert NBVC's Avatar
    Join Date
    12-06-2006
    Location
    Mississauga, CANADA
    MS-Off Ver
    2003:2010
    Posts
    34,898

    Re: Count non-repeated words in cell

    From PM:

    Hi NBVC,

    You kindly gave me the formula "=COUNT(1/FREQUENCY(IF(TRIM(MID(SUBSTITUTE(" "&A1," ",REPT(" ",100)),ROW($A$1:$A$100)*100,100))<>"",MATCH(TRIM(MID(SUBSTITUTE(" "&A1," ",REPT(" ",100)),ROW($A$1:$A$100)*100,100)),TRIM(MID(SUBSTITUTE(" "&A1," ",REPT(" ",100)),ROW($A$1:$A$100)*100,100)),0)),ROW($A$1:$A$100)))" to count the nr of non-repeated words in a cell.

    Thanks a lot! But I did not make it clear that sometimes a word can be repeated more than once. Using real data, in cell A1 I have "um a pink circle with a line across it rolled across the grey rectangle and um nudge the blue square to the end of the grey rectangle", by using the fomula, I got 21. Actually, the number should be 19, because "the" has appeared four times in stead of twice.
    Btw, the maximum of of words in a cell for my data is 500.

    Could you please tell me how to deal with it? I really need it to be done for my project.

    Thanks,
    mxc668
    OP has added that he can may have up to 500 words in the cell...

    in that case, this formula should do it...

    Please Login or Register  to view this content.

  10. #10
    Forum Expert NBVC's Avatar
    Join Date
    12-06-2006
    Location
    Mississauga, CANADA
    MS-Off Ver
    2003:2010
    Posts
    34,898

    Re: Count non-repeated words in cell

    Hi again mxc668,

    thanks to daddylonglegs who has kindly alerted me that my last formula may give you potentially the same issue as before as you approach those 500 words...

    I mistakingly tested the formula for 500 characters and it seemed ok... but 500 words will undoubtedly take you to well over 1000 or 2000 chars and I am not certain the formula will always return the correct result... you may come across the phenomenon pointed out by daddylonglegs earlier.

    ... so, in short, I think you better look into DonkeyOte's (or shg's...if he has an alternative) UDF approaches.

  11. #11
    Registered User
    Join Date
    02-04-2007
    Posts
    10

    Re: Count non-repeated words in cell

    Well, many thanks to all of you, DonkeyOte, NBVC and daddylonglegs. DonkeyOte's method give me the result with no mistakes.

    This forum is really a nice place, and thanks again to all of you.

    Best,
    Mxc668

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1