Macro to remove 650 stop words from excel text

  1. #1
    Registered User
    Join Date
    05-22-2014
    Posts
    11

    Macro to remove 650 stop words from excel text

    Hello!

    I'm working on a classifier model for a text prediction project and would like to remove the stop words before I stem the document to get the important topics.

    I found the thread that Stanley solved really useful. However, I have many more stop words to remove, and I couldn't get the code from that thread to work with the longer list (I'm completely new to this!).

    Below is a link to the document that I'm working on. Any advice, help, or tips would be greatly appreciated.
    Many thanks,
    Pia
    Attached Files

  2. #2
    Forum Expert
    Join Date
    07-31-2010
    Location
    California
    MS-Off Ver
    Excel 2007
    Posts
    4,070

    Re: Macro to remove 650 stop words from excel text

    I don't see the word "stop" anywhere in your file.

  3. #3
    Registered User
    Join Date
    05-22-2014
    Posts
    11

    Re: Macro to remove 650 stop words from excel text

    Hello,

    I was using the code below to remove common words such as 'the', 'a', 'and' and 'that' (full list below), but I think there were too many words. The code seemed to remove the words on the first line of the list but not those on the following lines.

    I was thinking that there might be another way to do this, i.e. a lookup that takes the words in one column and checks them against the sentences in the next column, but I thought a macro might be the best approach.

    Best,
    Pia

    Option Explicit
    Sub RemoveSeparators()
    ' stanleydgromjr, 07/06/2012
    ' http://www.excelforum.com/excel-gene...-in-excel.html
    Dim s, b
    Dim i As Long, ii As Long, lr As Long, h As String
    s = Array(
    "s ", "ll ", "ve ", "a ", "i ", "m ", "am ", "able ", "about ", "above ", "abroad ", "according ", "accordingly ", "across ", "actually ", "adj ", "after ", "afterwards ", "again ", "against ", "ago ", "ahead ", "aint ", "aint ", "all ", "allow ", "allows ", "almost ", "alone ", "along ", "alongside ", "already ", "also ", "although ", "always ", "am ", "amid ", "amidst ", "among ", "amongst ", "an ", "and ", "another ", "any ", "anybody ", "anyhow ", "anyone ", "anything ", "anyway ", "anyways ", "anywhere ", "apart ", "appear ", "appreciate ", "appropriate ", "are ", "arent ", "arent ", "around ", "as ", "as ", "a ", "aside ", "ask ", "asking ", "associated ", "at ", "available ", "away ", "awfully ", "back ", "backward ", "backwards ", "be ", "became ", "because ", "become ", "becomes ", "becoming ", "been ", "before ", "beforehand ", "begin ", "behind ", "being ", "believe ", "below ", "beside ", "besides ", "best ", "better ", "between ", "beyond ", "both ", "brief ", "but ", "by ", "came ",
    "can ", "cannot ", "cant ", "cant ", "cant ", "caption ", "cause ", "causes ", "certain ", "certainly ", "changes ", "clearly ", "cmon ", "cmon ", "co ", "com ", "come ", "comes ", "concerning ", "consequently ", "consider ", "considering ", "contain ", "containing ", "contains ", "corresponding ", "could ", "couldnt ", "course ", "cs ", "currently ", "dare ", "darent ", "darent ", "definitely ", "described ", "despite ", "did ", "didnt ", "didnt ", "different ", "directly ", "do ", "does ", "doesnt ", "doing ", "done ", "dont ", "down ", "downwards ", "during ", "each ", "edu ", "eg ", "eight ", "eighty ", "either ", "else ", "elsewhere ", "end ", "ending ", "enough ", "entirely ", "especially ", "et ", "etc ", "even ", "ever ", "evermore ", "every ", "everybody ", "everyone ", "everything ", "everywhere ", "ex ", "exactly ", "example ", "except ", "fairly ", "far ", "farther ", "few ", "fewer ", "fifth ", "first ", "five ", "followed ", "following ", "follows ", "for ", "forever ", "former ", "formerly ",
    "forth ", "forward ", "found ", "four ", "from ", "further ", "furthermore ", "get ", "gets ", "getting ", "given ", "gives ", "go ", "goes ", "going ", "gone ", "got ", "gotten ", "greetings ", "had ", "hadnt ", "hadnt ", "half ", "happens ", "hardly ", "has ", "hasnt ", "hasn t ", "have ", "havent ", "haven t ", "having ", "he ", "hed ", "he d ", "hell ", "he ll ", "hello ", "help ", "hence ", "her ", "here ", "hereafter ", "hereby ", "herein ", "heres ", "heres ", "hereupon ", "hers ", "herself ", "hes ", "hes ", "hi ", "him ", "himself ", "his ", "hither ", "hopefully ", "how ", "howbeit ", "however ", "hundred ", "id ", "id ", "ie ", "if ", "ignored ", "ill ", "ill ", "im ", "im ", "immediate ", "in ", "inasmuch ", "inc ", "indeed ", "indicate ", "indicated ", "indicates ", "inner ", "inside ", "insofar ", "instead ", "into ", "inward ", "is ", "isnt ", "isnt ", "it ", "itd ", "itd ", "itll ", "itll ", "its ", "its ", "it s ", "itself ", "ive ", "ive ", "just ", "keep ", "keeps ", "kept ", "know ",
    "known ", "knows ", "last ", "lately ", "later ", "latter ", "latterly ", "least ", "less ", "lest ", "let ", "lets ", "lets ", "like ", "liked ", "likely ", "likewise ", "little ", "look ", "looking ", "looks ", "low ", "lower ", "ltd ", "made ", "mainly ", "make ", "makes ", "many ", "may ", "maybe ", "maynt ", "maynt ", "me ", "mean ", "meantime ", "meanwhile ", "merely ", "might ", "mightnt ", "mightnt ", "mine ", "minus ", "miss ", "more ", "moreover ", "most ", "mostly ", "mr ", "mrs ", "much ", "must ", "mustnt ", "mustnt ", "my ", "myself ", "name ", "namely ", "nd ", "near ", "nearly ", "necessary ", "need ", "neednt ", "neednt ", "needs ", "neither ", "never ", "neverf ", "neverless ", "nevertheless ", "new ", "next ", "nine ", "ninety ", "no ", "nobody ", "non ", "none ", "nonetheless ", "noone ", "nor ", "normally ", "not ", "nothing ", "notwithstanding ", "novel ", "now ", "nowhere ", "obviously ", "of ", "off ", "often ", "oh ", "ok ", "okay ", "old ", "on ", "once ", "one ", "ones ", "ones ", "only ", "onto ", "opposite ", "or ", "other ", "others ", "otherwise ", "ought ", "oughtnt ", "oughtnt ", "our ", "ours ", "ourselves ", "out ", "outside ", "over ", "overall ", "own ", "particular ", "particularly ", "past ", "per ", "perhaps ", "placed ", "please ", "plus ", "possible ", "presumably ", "probably ", "provided ", "provides ", "que ", "quite ", "qv ", "rather ", "rd ", "re ", "really ", "reasonably ", "recent ", "recently ", "regarding ", "regardless ", "regards ", "relatively ", "respectively ", "right ", "round ", "said ", "same ", "saw ", "say ", "saying ", "says ", "second ", "secondly ", "see ", "seeing ", "seem ", "seemed ", "seeming ", "seems ", "seen ", "self ", "selves ", "sensible ", "sent ", "serious ", "seriously ", "seven ", "several ", "shall ", "shant ", "shant ", "she ", "shed ", "shed ", "shell ", "shes ", "shell ", "shes ", "should ", "shouldnt ", "shouldnt ", "since ", "six ", "so ", "some ", "somebody ", "someday ", 
    "somehow ", "someone ", "something ", "sometime ",
    "sometimes ", "somewhat ", "somewhere ", "soon ", "sorry ", "specified ", "specify ", "specifying ", "still ", "sub ", "such ", "sup ", "sure ", "take ", "taken ", "taking ", "tell ", "tends ", "th ", "than ", "thank ", "thanks ", "thanx ", "that ", "thatll ", "thatll ", "thats ", "thats ", "thats ", "thatve ", "thatve ", "the ", "their ", "theirs ", "them ", "themselves ", "then ", "thence ", "there ", "thereafter ", "thereby ", "thered ", "thered ", "therefore ", "therein ", "therell ", "therell ", "therere ", "therere ", "theres ", "theres ", "theres ", "thereve ", "thereupon ", "thereve ", "these ", "they ", "theyd ", "theyll ", "theyre ", "theyve ", "theyd ", "theyll ", "theyre ", "theyve ", "thing ", "things ", "think ", "third ", "thirty ", "this ", "thorough ", "thoroughly ", "those ", "though ", "three ", "through ", "throughout ", "thru ", "thus ", "till ", "to ", "together ", "too ", "took ", "toward ", "towards ", "tried ", "tries ", "truly ", "try ", "trying ", "ts ", "twice ", "two ", "un ",
    "under ", "underneath ", "undoing ", "unfortunately ", "unless ", "unlike ", "unlikely ", "until ", "unto ", "up ", "upon ", "upwards ", "us ", "use ", "used ", "useful ", "uses ", "using ", "usually ", "v ", "value ", "various ", "versus ", "very ", "via ", "viz ", "vs ", "want ", "wants ", "was ", "wasnt ", "wasnt ", "wed ", "way ", "we ", "wed ", "welcome ", "well ", "well ", "well ", "were ", "werent ", "were ", "weve ", "went ", "were ", "were ", "werent ", "weve ", "what ", "whatever ", "whatll ", "whats ", "whatve ", "whatll ", "whatve ", "whats ", "when ", "whence ", "whenever ", "where ", "whereafter ", "whereas ", "whereby ", "wherein ", "wheres ", "whereupon ", "wherever ", "whether ", "which ", "whichever ", "while ", "whilst ", "whither ", "who ", "whod ", "whoever ", "whole ", "wholl ", "wholl ", "whom ", "whomever ", "whos ", "whos ", "whose ", "why ", "will ", "willing ", "wish ", "with ", "within ", "without ", "wonder ", "wont ", "would ", "wouldnt ", "yes ", "yet ", "you ", "youd ",
    "youll ", "your ", "youre ", "yours ", "yourself ", "yourselves ", "youve ", "zero ",)
    lr = Cells(Rows.Count, 2).End(xlUp).Row
    b = Range("B2:C" & lr).Value
    For i = LBound(b, 1) To UBound(b, 1)
    h = Trim(b(i, 1))
    If Right(h, 1) = "." Then h = Left(h, Len(h) - 1)
    For ii = LBound(s) To UBound(s)
    h = Replace(h, s(ii), "")
    Next ii
    b(i, 2) = h
    Next i
    Range("B2:C" & lr).Value = b
    End Sub
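
    One way around the long Array(...) literal above is to keep the words on a worksheet and read them at run time. As far as I know, VBA needs a space-underscore continuation at the end of every broken line, allows at most 24 continuations per statement, and limits a line to 1,023 characters, so 650 quoted words are awkward to hard-code. A minimal sketch, assuming a sheet named StopWords with one word per cell in column A from A1 down (the sheet layout and the LoadStopWords name are assumptions, not taken from the attachments):

```vba
Option Explicit

' Sketch: read the stop-word list from a worksheet instead of Array(...).
' Assumes a sheet named "StopWords" with one word per cell in column A,
' starting at A1. The layout is an assumption, not taken from the attachment.
Function LoadStopWords() As Variant
    Dim ws As Worksheet
    Dim lastRow As Long, i As Long
    Dim words() As String

    Set ws = ThisWorkbook.Worksheets("StopWords")
    lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row

    ' Copy each cell as text; trailing spaces in the cells are preserved.
    ReDim words(1 To lastRow)
    For i = 1 To lastRow
        words(i) = CStr(ws.Cells(i, 1).Value)
    Next i

    LoadStopWords = words
End Function
```

    In the macro above, s = LoadStopWords() would then stand in for the s = Array(...) statement. The inner loop already runs from LBound(s) to UBound(s), so the 1-based array works unchanged.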

  4. #4
    Forum Expert
    Join Date
    10-10-2008
    Location
    Northeast Pennsylvania, USA
    MS-Off Ver
    Excel 2007
    Posts
    2,387

    Re: Macro to remove 650 stop words from excel text

    PiaHarrison,

    You never sent me a Private Message, so I decided to see if you had created a new thread, and you had.

    My attached workbook contains 4 sheets:
    1. forstemmingdfirstseventhou, the sheet that the macro runs on.
    2. StopWords, which contains 635 stop words, each with a trailing space character (your original list contained duplicates).
    3. Instructions.
    4. forstemmingdfirstseventhou_orig, a copy of worksheet forstemmingdfirstseventhou that contains your original raw data.
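
    The duplicates mentioned in item 2 (for example, "aint " appears twice in the posted list) can also be stripped in code rather than by hand. A minimal sketch, assuming Windows (it late-binds Scripting.Dictionary) and a hypothetical helper name DedupeWords; this is not the code in the attached workbook:

```vba
' Sketch: drop duplicate entries from a stop-word array.
' DedupeWords is a hypothetical helper name. It uses a late-bound
' Scripting.Dictionary, which is available on Windows only.
Function DedupeWords(words As Variant) As Variant
    Dim dict As Object
    Dim i As Long

    Set dict = CreateObject("Scripting.Dictionary")
    dict.CompareMode = vbTextCompare   ' treat "The " and "the " as the same key

    For i = LBound(words) To UBound(words)
        If Len(words(i)) > 0 Then
            ' Keep only the first occurrence of each word.
            If Not dict.Exists(words(i)) Then dict.Add words(i), True
        End If
    Next i

    DedupeWords = dict.Keys   ' zero-based Variant array of the unique words
End Function
```

    For example, DedupeWords(Array("aint ", "aint ", "all ")) returns a two-element array holding "aint " and "all ".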


    Detach/open workbook RemoveStopWords sarray - PiaHarrison - EF1013514 - SDG15.xlsm and run the RemoveStopWords macro.
    Have a great day,
    Stan

    Windows 10, Excel 2007, on a PC.


  5. #5
    Registered User
    Join Date
    05-22-2014
    Posts
    11

    Re: Macro to remove 650 stop words from excel text

    Hi Stanley,

    Thanks so much! You're a whiz at this. I haven't had internet for 2 days, so apologies for the delay.

    I've just downloaded the file so will have a play with it now. I'll let you know how I go.
    Many thanks for the help.

    Thanks again. Have a great day!
    All the best,
    Pia

  6. #6
    Registered User
    Join Date
    05-22-2014
    Posts
    11

    Re: Macro to remove 650 stop words from excel text

    Hi Stanley,

    I sent you a private message regarding this. As discussed, here is the file. Interested to get your take on this.
    Many thanks,
    Pia

  7. #7
    Forum Expert
    Join Date
    10-10-2008
    Location
    Northeast Pennsylvania, USA
    MS-Off Ver
    Excel 2007
    Posts
    2,387

    Re: Macro to remove 650 stop words from excel text

    PiaHarrison,

    Thanks for the three Private Messages.

    "Interested to get your take on this."
    I will check out your attached workbook and the latest macro.

  8. #8
    Forum Expert
    Join Date
    10-10-2008
    Location
    Northeast Pennsylvania, USA
    MS-Off Ver
    Excel 2007
    Posts
    2,387

    Re: Macro to remove 650 stop words from excel text

    PiaHarrison,

    Detach/open workbook RemoveStopWordsV2 sarray - PiaHarrison - EF1013514 - SDG15.xlsm

    In the attached workbook, please see sheet Compare:

    Row 2 is the raw data from sheet forstemmingdfirstseventhou, cell B2.

    Row 3 is what the macro produces from that text using your StopWords list (cell C2).

    You will have to adjust the StopWords sheet to correct the problem.

  9. #9
    Registered User
    Join Date
    05-22-2014
    Posts
    11

    Re: Macro to remove 650 stop words from excel text

    Wow!! Very impressed.

    Excellent pickup! Thanks for spotting that and for taking the time to explain it to me. I really appreciate it.

    Thanks again!
    Pia

  10. #10
    Forum Expert
    Join Date
    10-10-2008
    Location
    Northeast Pennsylvania, USA
    MS-Off Ver
    Excel 2007
    Posts
    2,387

    Re: Macro to remove 650 stop words from excel text

    PiaHarrison,

    Thanks for the feedback.

    You are very welcome. Glad I could help.

    And, come back anytime.

  11. #11
    Registered User
    Join Date
    05-17-2016
    Location
    Kalimantan timur, indonesia
    MS-Off Ver
    Microsoft Office 2010
    Posts
    1

    Hi Stanley,

    I've used your macro, but I found that it can't tell "you" apart from "your": it removes "you" even when it is part of "your". Could you give me a hand with replacing only whole words, that is, words bounded by spaces, and not matches inside a longer word? I'm very much a beginner with macros.
    Thanks for your time.
    Last edited by superikha; 05-18-2016 at 03:58 PM.
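
    Replace works on substrings, so any stop word that also occurs inside or at the end of a longer word gets cut out of it; with the trailing-space list from post #3, for instance, "a " also matches the end of "data ". One way to restrict removal to whole words is to split the sentence on spaces and drop only the tokens that equal a stop word. A minimal sketch, assuming Windows (late-bound Scripting.Dictionary); RemoveWholeWords is a hypothetical helper, not part of Stanley's workbook, and punctuation attached to a word (such as "you,") is not handled:

```vba
' Sketch: remove stop words only when they match a whole word.
' RemoveWholeWords is a hypothetical helper. stopWords is any array of
' words; trailing spaces are trimmed before comparison.
Function RemoveWholeWords(ByVal sentence As String, stopWords As Variant) As String
    Dim dict As Object
    Dim tokens() As String, kept() As String
    Dim i As Long, n As Long
    Dim w As String

    ' Build a case-insensitive lookup of the stop words.
    Set dict = CreateObject("Scripting.Dictionary")
    dict.CompareMode = vbTextCompare
    For i = LBound(stopWords) To UBound(stopWords)
        w = Trim$(stopWords(i))
        If Len(w) > 0 Then dict(w) = True
    Next i

    tokens = Split(Trim$(sentence), " ")
    If UBound(tokens) < 0 Then Exit Function   ' empty input returns ""

    ReDim kept(0 To UBound(tokens))
    For i = 0 To UBound(tokens)
        ' Skip empty tokens (from double spaces) and exact stop-word matches.
        If Len(tokens(i)) > 0 Then
            If Not dict.Exists(tokens(i)) Then
                kept(n) = tokens(i)
                n = n + 1
            End If
        End If
    Next i

    If n = 0 Then Exit Function                ' every word was a stop word
    ReDim Preserve kept(0 To n - 1)
    RemoveWholeWords = Join(kept, " ")
End Function
```

    RemoveWholeWords("you lost your keys", Array("you ", "the ")) returns "lost your keys", leaving "your" intact. In the macro from post #3, b(i, 2) = RemoveWholeWords(h, s) could stand in for the inner For ii loop.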
