+ Reply to Thread
Results 1 to 4 of 4

Extracting variable-length substrings from a cell based on variable starting points

  1. #1
    Registered User
    Join Date
    11-20-2012
    Location
    St. Louis, Missouri
    MS-Off Ver
    Excel 2010
    Posts
    7

    Extracting variable-length substrings from a cell based on variable starting points

    I am working with drug data, and the data entry conventions are ... unconventional. Multiple spacing, single spacing, hyphens, slashes ... I've got it all.

    My problem is that I can identify a small handful of patterns that follow the data I want to extract (drug strength), but nothing unique that comes before it. To compound matters, there are some entries that contain spaces in the middle of the drug strength and there's no standard length to the substring I want to extract.

    For instance, I might have "PROGESTERONE 175 MG SL TABLET" or "WP ESTRADIOL 1MG/0.1ML HRT CREAM" ... in these two cases, I want "175 MG" from the first and "1MG/0.1ML" from the second, and "MG" and "ML" are each followed by a trailing space, which means I can locate those particular substrings in the field. Finding the start and stop points, however, is going to be a bear of a chore.

    Is there anything I can use to make this more doable? Or am I relegated to attempting to do this by brute force and/or nested IF statements?

  2. #2
    Valued Forum Contributor
    Join Date
    09-10-2012
    Location
    Los Angeles, CA
    MS-Off Ver
    Excel 2010
    Posts
    929

    Re: Extracting variable-length substrings from a cell based on variable starting points

    Hi Don,

    I've used a combination of two formulas (credit to martindwilson!) to get your results...

    You will have to list a series of substrings with a trailing space in col. H

    Let me know if this is not working...
    Attached Files Attached Files

  3. #3
    Registered User
    Join Date
    11-20-2012
    Location
    St. Louis, Missouri
    MS-Off Ver
    Excel 2010
    Posts
    7

    Re: Extracting variable-length substrings from a cell based on variable starting points

    Another problem.

    Some of the drug names have numbers in them as well.

    Here are more data points and the values that should be caught.

    Please Login or Register  to view this content.

  4. #4
    Valued Forum Contributor
    Join Date
    09-10-2012
    Location
    Los Angeles, CA
    MS-Off Ver
    Excel 2010
    Posts
    929

    Re: Extracting variable-length substrings from a cell based on variable starting points

    Hi Don,

    What are you trying to do to me?!? Just kidding...

    Please take a look at the attachment to see what I did...

    Several things to look out for...
    1. The functions are arrays, so you must use array-enter (CTRL-SHIFT_ENTER) to get the magical curly brackets, instead of just ENTER
    2. I had to use a helper column B just to make the formula in column C more readable (you can copy any references to column B and copy over the formula, but it will get too big and unreadable)
    3. Created two lookup tables in column H and I, notice that I had to "trick" finding numbers with a preceding space and finding the measurements with a trailing space (add more measurements if needed, just make sure it's still being captured in the formula)

    Let me know what you think...

    Happy Thanksgiving!

    Dennis
    Attached Files Attached Files

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1