+ Reply to Thread
Results 1 to 10 of 10

Tricky text string extraction

  1. #1
    Forum Contributor
    Join Date
    08-21-2014
    Location
    London
    MS-Off Ver
    Office 365
    Posts
    232

    Tricky text string extraction

    Hi,

    I have a file with image file names in column A.

    The text strings have a SIMILAR format in that they start with the ID;

    Then have a sales AGENT - i.e. AMAZON

    Then they have EITHER a brand withing the AGENT - in this case TICKETS or they have CHANNEL - i.e. email, social (FB for Facebook, TW for Twitter, etc), website, ETC

    Then they have a 6 DIGIT DATE

    Then, some (not all) have a letter after the 6 digit date - this is indicative or more than one occurrence of this activity on that date (i.e. they tweeted more than once on 12 June)

    The inconsistency of how many elements there are in the string is something I cannot change BUT the following are consistent.

    - after ID there is always an underscore _
    - there is one further underscore before a 6 digit date
    - the elements between ID and DATE are always split by an - character
    - the indicator of more than one occurrence is ALWAYS a letter, A being the first, B being the second, etc

    WHAT I NEED TO DO;

    1. extract the 6 digits that occur as a 6 digit block. I googled and found many examples on this and other sites - various formula which populate several rows of column D of the attached. D2 and D3 looks closest but fall over when there is more than one zero in the 6 digit number. Can someone help with a version that ALWAYS give the 6 digit extract and also help with how this can be adapted should I ever need to expand to 7 or 8 digits, or fewer.

    2. extract the single LETTER after a 6 DIGIT DATE only where it exists, as in most cases the 6 digit date is following by a . and a file extension

    3. a formula to extract the text between first and second -, between second and third -, etc, in a way I can edit it to look at different positions in the string

    OR even better a way (possibly VBA?) to split all elements into columns, with date always in the most right hand column at one touch.

    That may be too tricky, so I can build something with columns to do each part if I can figure out extracting the above elements from the string.

    I can then lookup on the words extracted to populate CHANNEL, BRAND, MACRO_CHANNEL, etc.

    Have highlighted on the attached, in YELLOW where I have manually typed in what I'd expect to extract.

    Even if someone can nail 1 and 2 for me, I can probably - even manually (sigh) manage the rest. Or run a file rename software to standardise the middle text blocks between -

    Hope someone can assist please?

    Ian
    Attached Files Attached Files

  2. #2
    Forum Expert 63falcondude's Avatar
    Join Date
    08-22-2016
    Location
    USA
    MS-Off Ver
    365
    Posts
    6,266

    Re: Tricky text string extraction

    Here's some to get you started:

    D2 (startdate) =DATE(20&MID(SUBSTITUTE(RIGHT(A2,11),"_",""),5,2),MID(SUBSTITUTE(RIGHT(A2,11),"_",""),3,2),LEFT(SUBSTITUTE(RIGHT(A2,11),"_",""),2))
    E2 (agent) =MID(A2,FIND("_",A2)+1,FIND("-",A2)-FIND("_",A2)-1)
    F2 (brand) =IF(LEN(A2)-LEN(SUBSTITUTE(A2,"-",""))>1,TRIM(MID(SUBSTITUTE(A2,"-",REPT(" ",100)),100,100)),"")

  3. #3
    Forum Expert
    Join Date
    05-05-2015
    Location
    UK
    MS-Off Ver
    Microsoft Excel for Microsoft 365 MSO (Version 2402 Build 16.0.17328.20068) 64-bit
    Posts
    28,284

    Re: Tricky text string extraction

    This will separate out the text into "Components"

    in D2


    =TRIM(MID(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE($A2,".","-"),"_","-"),"-",REPT(" ",250)),(COLUMNS($A:A)-1)*250+1,250))

    Copy across

    This will make it easier to then extract your desired data although varying number of components still needs some "manual" adjustment.
    Attached Files Attached Files

  4. #4
    Forum Contributor
    Join Date
    08-21-2014
    Location
    London
    MS-Off Ver
    Office 365
    Posts
    232

    Re: Tricky text string extraction

    Thanks both, I'm testing now

  5. #5
    Forum Contributor
    Join Date
    08-21-2014
    Location
    London
    MS-Off Ver
    Office 365
    Posts
    232

    Re: Tricky text string extraction

    @63Falcondude - 1 and 2 work awesome, 3 falls over on - for example - DOID29D_ENCORE-FB-COMP_200218.jpg, where it returns FB into the result, whereas the previous one DOID29D_ENCORE-EMAIL_230218.jpg (and most others) are blank. Was unsure how it distinguished over what was left blank under BRAND, as it's right to not drop in EMAIL but yet it drops in FB.

  6. #6
    Forum Contributor
    Join Date
    08-21-2014
    Location
    London
    MS-Off Ver
    Office 365
    Posts
    232

    Re: Tricky text string extraction

    @JohnTopley at the very least, that's super useful - only thing it doesn't handle is the split of the single letter after the 6 digit date, which is one thing that manual/lookups of all the other data split out won't do.

  7. #7
    Forum Expert
    Join Date
    05-05-2015
    Location
    UK
    MS-Off Ver
    Microsoft Excel for Microsoft 365 MSO (Version 2402 Build 16.0.17328.20068) 64-bit
    Posts
    28,284

    Re: Tricky text string extraction

    It is easily resolve by formula:

    For date in A1

    =LEFT(A1,6)

    for Letter

    =IF(LEN(A1)=7,RIGHT(A1),"")


    And of course VBA could do the lot!

  8. #8
    Forum Expert 63falcondude's Avatar
    Join Date
    08-22-2016
    Location
    USA
    MS-Off Ver
    365
    Posts
    6,266

    Re: Tricky text string extraction

    Response to post #5:

    I took a guess at what should and shouldn't have been pulled over, since you didn't include that in your sample.

    The F2 formula from post #2 pulls the data in between the first and second -. If there is only one - then it is left blank.

    What would you expect to be in column F for DOID29D_ENCORE-EMAIL_230218.jpg ?

  9. #9
    Forum Contributor
    Join Date
    08-21-2014
    Location
    London
    MS-Off Ver
    Office 365
    Posts
    232

    Re: Tricky text string extraction

    used in combo with your first one, that's superb, thank you

  10. #10
    Forum Contributor
    Join Date
    08-21-2014
    Location
    London
    MS-Off Ver
    Office 365
    Posts
    232

    Re: Tricky text string extraction

    Response to post #8 - if column F was BRAND, then it'd be BLANK - but there's no way for you to have known that of course.

    What would be good is if I make a named list of what would be in column G (email, website, social, etc) and have the column F formula check that.

    This is where the inconsistency of the string comes into play... but hadn't occurred that your column F formula was doing that 1st and 2nd - until now.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Replies: 1
    Last Post: 08-07-2015, 04:01 PM
  2. Number Extraction from String
    By quibilty in forum Excel Formulas & Functions
    Replies: 1
    Last Post: 06-22-2015, 03:10 PM
  3. [SOLVED] String extraction
    By pytheus in forum Excel Formulas & Functions
    Replies: 4
    Last Post: 09-17-2014, 07:19 PM
  4. [SOLVED] Text Extractions From String (Tricky One)
    By keen2xl in forum Excel Programming / VBA / Macros
    Replies: 4
    Last Post: 09-30-2013, 11:16 PM
  5. String Extraction
    By qcity in forum Excel Programming / VBA / Macros
    Replies: 13
    Last Post: 10-27-2011, 10:58 PM
  6. Tricky Extraction Problem
    By Spreadsheet in forum Excel Formulas & Functions
    Replies: 5
    Last Post: 11-11-2006, 03:22 AM
  7. [SOLVED] String Extraction...
    By iceberg27 in forum Excel Formulas & Functions
    Replies: 2
    Last Post: 10-15-2005, 12:05 AM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1