Closed Thread

Thread: Dictionary Object

  1. #1
    Registered User
    Join Date
    10-10-2006
    Posts
    18

    Dictionary Object

    Hi, I am looking for a way to build a list of words from some straightforward prose using a VBA algorithm. I have attached an example; it's a .txt file with the data in lines (for some reason I can't upload an .xls today).

    I'm looking to create a huge list of the words in the poem, as I have started to do manually further down (the plan was to put this on a separate sheet). I discussed a similar algorithm with Leith Ross on this forum a few months ago, and something called a "dictionary scripting object" was mentioned. I'm sad to say I haven't been able to find out what this is. Any help anyone can offer would be greatly appreciated.

    Adam

  2. #2
    Forum Guru
    Join Date
    01-15-2007
    Location
    Brisbane, Australia
    MS-Off Ver
    2007
    Posts
    5,359
    Adam

    Here are a couple of ways, one using a Collection and the other using a Dictionary.

    Run each macro and view the output. Both use a space as the word separator. The Dictionary treats "the" and "The" as two different words, while the Collection treats them as the same word.

     
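    Sub aaa()
      ' Unique words from A1:A12 via a Collection; its keys ignore case
      Dim nodupes As Collection, arr As Variant
      Dim i As Long, j As Long
      Set nodupes = New Collection
      For i = 1 To 12
        arr = Split(Cells(i, 1), " ")
        For j = LBound(arr) To UBound(arr)
          ' Adding a key that already exists raises an error; skip it
          On Error Resume Next
          nodupes.Add Item:=arr(j), key:=arr(j)
          On Error GoTo 0
        Next j
      Next i
      ' Write the unique words to column G, then sort them
      Range("G:I").ClearContents
      For i = 1 To nodupes.Count
        Cells(i, "G") = nodupes(i)
      Next i
      Range("G:G").Sort key1:=Range("G1"), order1:=xlAscending, header:=xlNo
    End Sub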
     
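    Sub bbb()
      ' Unique words from A1:A12 via a Dictionary; keys are case-sensitive by default
      Dim dic As Object, arr As Variant, ce As Variant
      Dim i As Long, j As Long
      Set dic = CreateObject("scripting.dictionary")
      For i = 1 To 12
        arr = Split(Cells(i, 1), " ")
        For j = LBound(arr) To UBound(arr)
          If Not dic.exists(arr(j)) Then
            dic.Add Item:=arr(j), key:=arr(j)
          End If
        Next j
      Next i
      ' Append each word below the last used cell in column I, then sort
      For Each ce In dic.items
        Cells(Rows.Count, 9).End(xlUp).Offset(1, 0).Value = ce
      Next ce
      Range("I:I").Sort key1:=Range("I1"), order1:=xlAscending, header:=xlNo
    End Sub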

    rylo
    Last edited by VBA Noob; 01-07-2008 at 06:49 PM.

  3. #3
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2007, 2010
    Posts
    25,777
    Dictionary has a CompareMode property that allows case-insensitive compares.
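
    A minimal sketch of that property, assuming late binding through CreateObject as in rylo's bbb. The function name UniqueWordsNoCase is hypothetical. CompareMode has to be set while the dictionary is still empty.

    ```vba
    ' Hypothetical helper: returns the distinct words of txt, ignoring case.
    Function UniqueWordsNoCase(ByVal txt As String) As Variant
      Dim dic As Object, arr As Variant, j As Long
      Set dic = CreateObject("Scripting.Dictionary")
      dic.CompareMode = vbTextCompare   ' set before any key is added
      arr = Split(txt, " ")
      For j = LBound(arr) To UBound(arr)
        If Len(arr(j)) > 0 Then          ' skip empty pieces from double spaces
          If Not dic.Exists(arr(j)) Then dic.Add arr(j), arr(j)
        End If
      Next j
      UniqueWordsNoCase = dic.Keys       ' zero-based array of the keys
    End Function
    ```

    With this setting, "the" and "The" land on the same key, and the spelling that was seen first is the one that is kept.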

  4. #4
    Valued Forum Contributor
    Join Date
    11-12-2007
    Location
    Germany
    MS-Off Ver
    2007
    Posts
    389
    Hello AdamDay,
    Thank you for the nice thread. I needed such a macro as well. :-)

    Hello SHG,
    I want to collect vocabulary from texts as well. The macro you have written for AdamDay is almost what I need too. I would be very thankful if you could modify the macro to fulfill my needs as well. Thank you very much in advance! :-)

    1) Each time the macro is run, it should not delete the older word list in column G. That is, if I run the macro over a text on the first day, it will make me a word list in column G. The next time I run the macro over another text, the older word list in column G should not be deleted. Only the new words that don't already exist should be added to column G, and there should still be no duplicates in column G.
    2) Column G should not be sorted alphabetically. The order can remain the order in which the text is processed. Each time a new text is processed, the words should simply be put in the next empty cells in column G.
    3) Your macro currently leaves signs such as ":", ",", ";" or "." attached to some words. It outputs, for example, "hand:", "Jabberwock," and "wabe:". Is it possible to add a function to the macro so that it ignores specific signs? For example, if it finds the word "hand:", it should put only "hand" in column G and not the sign ":".
    4) The macro should be modified so that even bigger texts can be processed, and so that it can handle the job even if there are already thousands of words in column G.
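
    The four points above could be sketched roughly as follows. This is only an illustration under assumptions: the names StripSigns and AppendNewWords are hypothetical, the text is taken to be in A1:A12 as in rylo's macros, and "The" and "the" are treated as one word. Looking each word up in a Dictionary, rather than scanning column G, is what should keep the check quick when the list already holds thousands of words.

    ```vba
    ' Hypothetical helper for point 3: removes the listed signs from a word.
    Function StripSigns(ByVal w As String) As String
      Dim s As Variant
      For Each s In Array(":", ",", ";", ".")
        w = Replace(w, s, "")
      Next s
      StripSigns = w
    End Function

    ' Hypothetical macro for points 1, 2 and 4: appends to column G only the
    ' words from A1:A12 that are not there yet, without clearing or sorting.
    Sub AppendNewWords()
      Dim dic As Object, arr As Variant, w As String
      Dim i As Long, j As Long, lastRow As Long, nextRow As Long

      Set dic = CreateObject("Scripting.Dictionary")
      dic.CompareMode = vbTextCompare   ' assumption: case is ignored

      ' Load the existing list so words from earlier runs are not repeated
      lastRow = Cells(Rows.Count, "G").End(xlUp).Row
      For i = 1 To lastRow
        w = CStr(Cells(i, "G").Value)
        If Len(w) > 0 Then dic(w) = True
      Next i
      ' If column G is empty, End(xlUp) lands on G1, which is still free
      If Len(CStr(Cells(lastRow, "G").Value)) = 0 Then
        nextRow = lastRow
      Else
        nextRow = lastRow + 1
      End If

      For i = 1 To 12
        arr = Split(CStr(Cells(i, 1).Value), " ")
        For j = LBound(arr) To UBound(arr)
          w = StripSigns(arr(j))
          If Len(w) > 0 Then
            If Not dic.Exists(w) Then
              dic.Add w, True
              Cells(nextRow, "G").Value = w   ' next empty cell, text order
              nextRow = nextRow + 1
            End If
          End If
        Next j
      Next i
    End Sub
    ```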

    AdamDay,

    Through the modified macro one could enhance an existing dictionary. I would, for example, put all the target words of my existing dictionary in column G. Then I would enter a text and run the macro. The macro would then add to column G only those words that don't already exist in my older list. The newly suggested words could also include some that are nonsense. I would not even delete them from G, but would put a specific sign in front of them in another column. That way, the next time the macro is run over another text, it won't put the same useless word in column G, because it will already be there. That would save the time of correcting the same mistake each time. And one could always fish out the useful words as well, because they won't be marked as useless.

  5. #5
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2007, 2010
    Posts
    25,777
    Hello SHG,
    i want to collect vocabulary from texts as well. Well the macro you have written for Adamday is almost what i needed too. ...
    I think you're crediting me for rylo's post.

    However, posting a question in someone else's thread violates a cardinal rule of this and most other forums. Please start your own thread, and provide a link to this one for context as necessary.

    Thanks.

  6. #6
    Valued Forum Contributor
    Join Date
    11-12-2007
    Location
    Germany
    MS-Off Ver
    2007
    Posts
    389

    Apology

    oops!!
    I hope that rylo and AdamDay accept my apology. I had read the rules long ago but had forgotten that important one.

    2). Never post a question in the Thread of another member. You MUST ALWAYS start your own New Thread.

    SHG
    To avoid such an embarrassing thing again, I will read the rules each time before I post in future.

    soooooooooooooorry!! :-)
    Last edited by wali; 01-10-2008 at 01:48 AM.

  7. #7
    Registered User
    Join Date
    10-10-2006
    Posts
    18
    Thank you for the contribution Wali. Infact I think your program maybe of use to me in another project. Thank you for bringing it to my attention.

    Thanks also to shg and rylo. I am still getting back into excel after not using for some time and your contributions have been most valuable.

    What I am hoping to do is similar to wali's project, however I really need to make the transition from just a list of words to a usable dictionary much more step-by-step so that I can alter the ways in which the dictionary is created and also view the dictionary in a number of different orders.

    First of all I really just need a program that can make the list of words using " " as the delimiter between any two words. I also need the repeats to appear separately.
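
    A minimal sketch of this first step, assuming the text arrives as one string. The function name WordList is hypothetical. Repeats are kept in their original order, and runs of spaces are skipped so they do not produce empty entries.

    ```vba
    ' Hypothetical helper: returns every word of txt in order, repeats included.
    Function WordList(ByVal txt As String) As Variant
      Dim parts As Variant, out() As String
      Dim j As Long, n As Long
      parts = Split(txt, " ")
      If UBound(parts) < 0 Then          ' empty input gives an empty array
        WordList = parts
        Exit Function
      End If
      ReDim out(0 To UBound(parts))
      For j = 0 To UBound(parts)
        If Len(parts(j)) > 0 Then        ' skip empty pieces from double spaces
          out(n) = parts(j)
          n = n + 1
        End If
      Next j
      If n = 0 Then
        WordList = Array()
      Else
        ReDim Preserve out(0 To n - 1)   ' trim the unused tail
        WordList = out
      End If
    End Function
    ```

    Calling it once per row, for example on Cells(i, 1).Value, and writing each element to the next free row of another sheet would give the raw list.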

    Secondly I need to take the results of the first program and then start to remove the superfluous characters like "." and ":" as wali suggested for his program.
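
    A minimal sketch of this second step. The function name CleanWord is hypothetical, and it assumes that only the characters A to Z and a to z count as letters, so digits and accented letters at either end of a word are trimmed as well. Characters inside a word, such as the apostrophe in "I'm", are left alone.

    ```vba
    ' Hypothetical helper: trims non-letter characters from both ends of a word.
    Function CleanWord(ByVal w As String) As String
      ' Drop leading characters until a letter is found
      Do While Len(w) > 0
        If Left$(w, 1) Like "[A-Za-z]" Then Exit Do
        w = Mid$(w, 2)
      Loop
      ' Drop trailing characters until a letter is found
      Do While Len(w) > 0
        If Right$(w, 1) Like "[A-Za-z]" Then Exit Do
        w = Left$(w, Len(w) - 1)
      Loop
      CleanWord = w
    End Function
    ```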

    Third, I need a third program to perform the de-duplication of entries. This program must also count the number of times a word appears. The result will be a list of words with a list of numbers (the count of duplications) alongside. The frequency of words in the poem will be the most important thing I gain from this.
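
    A minimal sketch of this third step using the Scripting.Dictionary discussed earlier in the thread. The names CountWords and ListCounts are hypothetical, and the output columns G and H are an assumption. Keys are case-sensitive by default, so "The" and "the" are counted separately here.

    ```vba
    ' Hypothetical helper: maps each distinct word to how often it occurs.
    Function CountWords(ByVal words As Variant) As Object
      Dim dic As Object, j As Long
      Set dic = CreateObject("Scripting.Dictionary")
      For j = LBound(words) To UBound(words)
        ' Reading a missing key yields Empty, so Empty + 1 starts the count at 1
        If Len(words(j)) > 0 Then dic(words(j)) = dic(words(j)) + 1
      Next j
      Set CountWords = dic
    End Function

    ' Hypothetical usage: counts the words in A1 and lists them in G and H.
    Sub ListCounts()
      Dim dic As Object, k As Variant, r As Long
      Set dic = CountWords(Split(CStr(Cells(1, 1).Value), " "))
      For Each k In dic.Keys
        r = r + 1
        Cells(r, "G").Value = k        ' the word
        Cells(r, "H").Value = dic(k)   ' its frequency
      Next k
    End Sub
    ```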

    Finally, it will be possible to write another list of words which will be removed from the dictionary.
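
    A minimal sketch of this last step, assuming the word counts are already held in a Scripting.Dictionary and the words to drop are supplied as an array. The name RemoveWords is hypothetical.

    ```vba
    ' Hypothetical helper: deletes every word in stopWords from the dictionary.
    Sub RemoveWords(ByVal dic As Object, ByVal stopWords As Variant)
      Dim j As Long
      For j = LBound(stopWords) To UBound(stopWords)
        ' Remove raises an error for a missing key, so check first
        If dic.Exists(stopWords(j)) Then dic.Remove stopWords(j)
      Next j
    End Sub
    ```

    For example, RemoveWords dic, Array("the", "of") would drop those two entries and leave the rest of the counts untouched.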

    This is why I'm a little unsure if the dictionary object is the best thing for the job. I really need the process to be step by step as described.

    once again - thanks for all your help!

  8. #8
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2007, 2010
    Posts
    25,777
    The attachment will histogram text taken from worksheet cells. I copied your post, for example, and it gave:
            ------A------ -B--
        1       Word      Freq
        2   the             21
        3   of              13
        4   to              13
        5   I               11
        6   a                9
        7   for              6
        8   need             6
        9   program          6
       10   words            6
       11   and              5
       12   be               5
       13   dictionary       5
       14   list             5
       15   also             4
       16   in               4
       17   is               4
       18   step             4
       19   will             4
       20   as               3
       21   from             3
       22   really           3
       23   your             3
       24   all              2
       25   am               2
       26   another          2
       27   by               2
       28   can              2
       29   count            2
       30   it               2
       31   just             2
       32   make             2
       33   most             2
       34   number           2
       35   project          2
       36   Thank            2
       37   that             2
       38   thing            2
       39   using            2
       40   which            2
       41   you              2
       42   after            1
       43   again            1
       44   alongside        1
       45   alter            1
       46   any              1
       47   appear           1
       48   appears          1
       49   attention        1
       50   back             1
       51   been             1
       52   best             1
       53   between          1
       54   bringing         1
       55   characters       1
       56   contribution     1
       57   contributions    1
       58   created          1
       59   de               1
       60   delimiter        1
       61   described        1
       62   different        1
       63   do               1
       64   duplication      1
       65   duplications     1
       66   entries          1
       67   excel            1
       68   Finally          1
       69   first            1
       70   frequency        1
       71   gain             1
       72   getting          1
       73   have             1
       74   help             1
       75   his              1
       76   hoping           1
       77   however          1
       78   if               1
       79   I'm              1
       80   important        1
       81   Infact           1
       82   into             1
       83   job              1
       84   like             1
       85   little           1
       86   maybe            1
       87   me               1
       88   more             1
       89   much             1
       90   must             1
       91   my               1
       92   not              1
       93   numbers          1
       94   object           1
       95   once             1
       96   orders           1
       97   perform          1
       98   poem             1
       99   possible         1
      100   process          1
      101   remove           1
      102   removed          1
      103   repeats          1
      104   result           1
      105   results          1
      106   rylo             1
      107   Secondly         1
      108   separately       1
      109   shg              1
      110   similar          1
      111   so               1
      112   some             1
      113   start            1
      114   still            1
      115   suggested        1
      116   superfluous      1
      117   take             1
      118   thanks           1
      119   then             1
      120   think            1
      121   third            1
      122   this             1
      123   time             1
      124   times            1
      125   transition       1
      126   two              1
      127   unsure           1
      128   usable           1
      129   use              1
      130   valuable         1
      131   view             1
      132   wali             1
      133   wali's           1
      134   ways             1
      135   What             1
      136   why              1
      137   with             1
      138   word             1
      139   write            1
    Last edited by shg; 01-07-2009 at 10:47 AM.

  9. #9
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2007, 2010
    Posts
    25,777
    Added a few new interfaces, so you can histogram the text on the clipboard, delete words, ...
    Last edited by shg; 01-07-2009 at 10:47 AM.

  10. #10
    Registered User
    Join Date
    10-10-2006
    Posts
    18

    Thank you

    Thanks for this. I am having a look at the data now.

  11. #11
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2007, 2010
    Posts
    25,777
    Here's a little more to play with.
    Last edited by shg; 01-07-2009 at 10:47 AM.

  12. #12
    Registered User
    Join Date
    10-10-2006
    Posts
    18
    Hi! These are excellent. Really handy programs. The largest dataset I'm using has several thousand words. Unfortunately, my computer stalls when I try to run the histogram with this large a set. This is really why I need to have the program perform each step one at a time, so that I can see where any flaws like this occur. Is it easy enough to swap the code around like this?

    It really is brilliant stuff by the way - it would have taken me ages to program this!

    Many thanks.

    Adam

  13. #13
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2007, 2010
    Posts
    25,777
    Try this version.
    Last edited by shg; 01-07-2009 at 10:47 AM.

  14. #14
    Registered User
    Join Date
    10-10-2006
    Posts
    18
    Yes, this is perfect. Thanks so much! It does have a bit of a problem with really HUGE sets, but I think that may actually be my old computer! Unfortunately, I now have my work cut out with my project!

    Thanks again!

    Adam

  15. #15
    Forum Guru shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2003, 2007, 2010
    Posts
    25,777
    I found a bug in that version when the text from the clipboard is larger than 32K characters. This version fixes that; I tested it to about 250K. Also added a log.
    Last edited by shg; 01-07-2009 at 10:47 AM.
