+ Reply to Thread
Results 1 to 14 of 14

Extract all the words from a string for processing, while leaving string intact.

  1. #1
    Registered User
    Join Date
    12-07-2012
    Location
    London, England
    MS-Off Ver
    Office 2010
    Posts
    81

    Question Extract all the words from a string for processing, while leaving string intact.

    Hi,

    Take a string like the following. Notice it has all sorts of punctuation.

    'This is a test-string', he said; the answer was going to be somewhere between 'hard' and 'very hard'.

    I need to be able to go through the string and take each word in turn to do some processing to it (replacing the word with another), but leave the punctuation, etc. as it is.

    In other words I need to end up with 2 versions of it:

    1. the original string
    2. another 'output' string, which has all the punctuation in the same places, but with 'processed' words.

    Everything I've tried has been hard to do, which either means I'm doing it wrong or it really is hard!

    I've been doing things like going along the line, looking for non-alpha characters add them to a variable until I get to an alpha char, process the word, then add the non-alpha chars back to wherever they came from. This has been not very successful.

    Is there a better way?

    Help appreciated.

    Staggers

  2. #2
    Forum Guru romperstomper's Avatar
    Join Date
    11-04-2008
    Location
    A1
    MS-Off Ver
    Most
    Posts
    12,302

    Re: Extract all the words from a string for processing, while leaving string intact.

    You can do this with regular expressions. Can you give an example of the sort of processing you want to do?
    Remember what the dormouse said
    Feed your head

  3. #3
    Forum Expert
    Join Date
    01-23-2013
    Location
    USA
    MS-Off Ver
    Microsoft 365 aka Office 365
    Posts
    3,863

    Re: Extract all the words from a string for processing, while leaving string intact.

    Hi Staggers,

    You may want to try using VBA function split() to start your parsing. I don't know what else to suggest, since you haven't described what your output string is supposed to look like.

    Sample code follows (tested and working in Excel 2003 - Assumes Option Base 0 for array indices):
    Please Login or Register  to view this content.
    Lewis

  4. #4
    Registered User
    Join Date
    12-07-2012
    Location
    London, England
    MS-Off Ver
    Office 2010
    Posts
    81

    Re: Extract all the words from a string for processing, while leaving string intact.

    Not really. The processing is 'simply!' to get the words out from all the other stuff. It will simply be a case of replacing one word with the (maybe) the same word spelled differently (think US v. UK spelling, for example).

    So really I need to make a copy of the whole thing (which might be several sentences), leave the original as it is and then on the copy 'output' version do the word replacements.

    My problem is that I haven't been able reliably to extract just the words from all the punctuation, process them, and put them back.

  5. #5
    Forum Guru romperstomper's Avatar
    Join Date
    11-04-2008
    Location
    A1
    MS-Off Ver
    Most
    Posts
    12,302

    Re: Extract all the words from a string for processing, while leaving string intact.

    Here's an example. It will process all the selected cells and replace the words with "foo" in the cell to the right:
    Please Login or Register  to view this content.

  6. #6
    Registered User
    Join Date
    12-07-2012
    Location
    London, England
    MS-Off Ver
    Office 2010
    Posts
    81

    Re: Extract all the words from a string for processing, while leaving string intact.

    @Romperstomper.

    Well, this is obviously the way to go. However, I should explain that I don't need you code to do the actual replacing of words.

    What I need is for it to find each word in turn - which yours obviously does very well, so I can then replace it (or not) and move on to the next word until the sentence(s) are complete.

  7. #7
    Forum Guru romperstomper's Avatar
    Join Date
    11-04-2008
    Location
    A1
    MS-Off Ver
    Most
    Posts
    12,302

    Re: Extract all the words from a string for processing, while leaving string intact.

    How do you plan to replace it? If manually, you could just alter the code to pop up an inputbox.

  8. #8
    Registered User
    Join Date
    12-07-2012
    Location
    London, England
    MS-Off Ver
    Office 2010
    Posts
    81

    Re: Extract all the words from a string for processing, while leaving string intact.

    No, its done in code.

    It will be looking up in what is currently a 40k word table.

    So a user pastes in a paragraph or single sentence and hits a button.

    The only way I've been able to make it work so far is to strip almost all punctuation except the periods at the end of sentences and then look up each word. So the result is that you have a cell with the original sentence, and another cell with the new version.

    But that obviously isn't ideal, because I want to leave the original punctuation there in both cases.

    So that's the thing. I need the code to find each word in turn, and then the replace with bit would be the result of the table look-up my code does. I don't know if this can be done with regex?


    Edit: Will probably be away from keyboard for a while now, but will reply to any response asap.
    Last edited by staggers47; 10-10-2014 at 10:11 AM.

  9. #9
    Forum Guru romperstomper's Avatar
    Join Date
    11-04-2008
    Location
    A1
    MS-Off Ver
    Most
    Posts
    12,302

    Re: Extract all the words from a string for processing, while leaving string intact.

    In my code you'd want to look up match.submatches(2) and then put the return value in place of "foo" in the concatenated string. So something like:
    Please Login or Register  to view this content.

  10. #10
    Registered User
    Join Date
    12-07-2012
    Location
    London, England
    MS-Off Ver
    Office 2010
    Posts
    81

    Re: Extract all the words from a string for processing, while leaving string intact.

    @ romperstomper

    Well I'll be damned.

    I really don't know my way around regex, so I modified your code a bit for testing purposes. I added a counter Cnt to simulate each lookup result and increased it for each match so I'd know I had the loop right, which it seems I do. Also, it isn't actually a vlookup, so I've used 'replace' instead.

    Please Login or Register  to view this content.

    Now here's what I get.

    Input: 'This is a test-string', he said; the answer was going to be somewhere between 'hard' and 'very hard'.
    Output: TEST0 TEST1 TEST2 TEST3-TEST4', TEST5 TEST6; TEST7 TEST8 TEST9 TEST10 TEST11 TEST12 TEST13 TEST14 'TEST15' TEST16 'TEST17 TEST18'.

    Now I've done that there's a tiny problem that I didn't spot at first. The original single quote is missing. Have you any idea what I've done to cause that?

    Regards

    Staggers
    Last edited by staggers47; 10-10-2014 at 12:44 PM. Reason: Added info

  11. #11
    Forum Guru romperstomper's Avatar
    Join Date
    11-04-2008
    Location
    A1
    MS-Off Ver
    Most
    Posts
    12,302

    Re: Extract all the words from a string for processing, while leaving string intact.

    Unfortunately a leading apostrophe in Excel is treated as an indicator of text formatting, not part of the cell value.

  12. #12
    Registered User
    Join Date
    12-07-2012
    Location
    London, England
    MS-Off Ver
    Office 2010
    Posts
    81

    Re: Extract all the words from a string for processing, while leaving string intact.

    Oh well, you've still made it vastly easier for me.

    Just one final thing: you have rCell in a loop, but in my application it will only ever be a single named cell. How do I change the code accordingly?

    Regards

    Staggers

  13. #13
    Registered User
    Join Date
    12-07-2012
    Location
    London, England
    MS-Off Ver
    Office 2010
    Posts
    81

    Re: Extract all the words from a string for processing, while leaving string intact.

    Oh well, you've still made it vastly easier for me.

    Just one final thing: you have rCell in a loop, but in my application it will only ever be a single named cell. How do I change the code accordingly?

    Regards

    Staggers

  14. #14
    Forum Guru romperstomper's Avatar
    Join Date
    11-04-2008
    Location
    A1
    MS-Off Ver
    Most
    Posts
    12,302

    Re: Extract all the words from a string for processing, while leaving string intact.

    Something like this:
    Please Login or Register  to view this content.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Extract Last Two Words of a string
    By VIJEXCEL in forum Excel Formulas & Functions
    Replies: 12
    Last Post: 12-30-2013, 10:40 AM
  2. [SOLVED] Routine to remove brackets around words, leaving the format intact. Doesn't always work.
    By staggers47 in forum Excel Programming / VBA / Macros
    Replies: 13
    Last Post: 04-30-2013, 09:02 AM
  3. Extract n words from string
    By holmes123 in forum Excel Formulas & Functions
    Replies: 4
    Last Post: 08-07-2012, 10:29 PM
  4. UDF to extract first three words from string
    By stodge in forum Excel Formulas & Functions
    Replies: 1
    Last Post: 01-24-2011, 07:25 AM
  5. Extract last two words from string
    By dimid in forum Excel General
    Replies: 5
    Last Post: 11-28-2008, 11:07 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1