+ Reply to Thread
Results 1 to 4 of 4

Extracting Data from varying Lenths of Data

  1. #1
    Registered User
    Join Date
    06-20-2013
    Location
    Atlanta, GA
    MS-Off Ver
    365
    Posts
    26

    Extracting Data from varying Lenths of Data

    Hello All,

    I'm trying to extract certain information from the Peer Review column (A) and inserting it to the corresponding columns Peer Review Date, Peer Reviewer, and Peer Reviewer Group (B:D).

    I have the extraction formula in Row 4 that worked in extracting the correct information, but if the data varies in length (A5 and A6), my formula does not carry over correctly. I believe the data only works if all cells in Peer Review were exactly the same.

    Please help to find a formula I extract the information easier across multiple varying lengths of data.
    Doing it manually for tons of rows would be difficult.

    Attached is the sample Excel sheet.
    Review__Extraction.xlsx
    Cells highlighted in Blue are correct, and the cells in red are what I need help on.

    Regards,
    Andy

  2. #2
    Forum Guru benishiryo's Avatar
    Join Date
    03-25-2011
    Location
    Singapore
    MS-Off Ver
    Excel 2013
    Posts
    5,147

    Re: Extracting Data from varying Lenths of Data

    hi Andy, welcome to the forum. maybe in B4:
    =MID($A4,FIND("^",SUBSTITUTE($A4,"-","^",LEN($A4)-LEN(SUBSTITUTE($A4,"-",""))))-20,19)

    in C4:
    Formula: copy to clipboard
    Please Login or Register  to view this content.


    in D4:
    Formula: copy to clipboard
    Please Login or Register  to view this content.

    Thanks, if you have clicked on the * and added our rep.

    If you're satisfied with the answer, click Thread Tools above your first post, select "Mark your thread as Solved".

    "Contentment is not the fulfillment of what you want, but the realization of what you already have."


    Tips & Tutorials I Compiled | How to Get Quick & Good Answers

  3. #3
    Registered User
    Join Date
    06-20-2013
    Location
    Atlanta, GA
    MS-Off Ver
    365
    Posts
    26

    Re: Extracting Data from varying Lenths of Data

    Hi Benishiryo,

    Thank you for the guidance. It worked perfectly!
    If it's not too much to ask, do you mind helping me to break up the formula and explain how you built the formula please?
    It would be really helpful to understand each part so I can learn how to extract other parts of the cell if I needed.

    I didn't really understand how you were able to come up with the position of the words you subsitituted and extracted.

    Thanks again!!

    Regards,
    Andy

  4. #4
    Forum Guru benishiryo's Avatar
    Join Date
    03-25-2011
    Location
    Singapore
    MS-Off Ver
    Excel 2013
    Posts
    5,147

    Re: Extracting Data from varying Lenths of Data

    sorry for the late reply:

    you can refer to this tutorial i did here to find the last occurrence of a certain character. it's at trick number 14.
    http://www.excelforum.com/tips-and-t...e-learned.html

    so for the first formula i provided you:
    =MID($A4,FIND("^",SUBSTITUTE($A4,"-","^",LEN($A4)-LEN(SUBSTITUTE($A4,"-",""))))-20,19)
    the red portion helps me find where's the last "-". the Date & Time stamp is a standard length of 19 characters. but to go from the last "-" to the Date/Time Stamp also includes a space, so it's actually -19 -1 (-20). the 19 characters then can be used for the MID formula's num_chars portion as well.

    the 2nd & 3rd formula are similar, but just looking for different characters. this time, when i find the last "-", i want to start 2 characters later, where the name is.
    Formula: copy to clipboard
    Please Login or Register  to view this content.


    to know the length of the name, i then i need to find when is the last occurrence of "Approval history" minus the last "-" position & minus 4 characters (basically 1 to go back to the character before "Approval history", 2 for the spaces after "-" & after the name & 1 more for the open bracket.
    Formula: copy to clipboard
    Please Login or Register  to view this content.


    1 slight difference here is the last argument of the SUBSTITUTE formula where the instance_num is. notice my first formula is:
    LEN($A4)-LEN(SUBSTITUTE($A4,"-",""))
    if you read my tutorial, it's to find how many times the paricular character appeared. i basically find how long is the full text in A4 minus the length of it when i substitute "-" with blanks. so it's 437 - 428. hence, there are 9 "dashes".

    but when i'm looking for characters like "Approval history", where there is more than 1 character, then it makes no sense. the length when i substitute the characters would be skewed. say for eg i have this text:
    apples are apples.
    if you do this formula:
    =LEN($A4)-LEN(SUBSTITUTE($A4,"apples",""))
    it will give you 12. but "Apples" only appeared twice. so we need to do:
    =(LEN(A5)-LEN(SUBSTITUTE($A5,"apples","")))/LEN("apples")
    6 is the length of the text "apples". so similarly, what i did in the formula is:
    (LEN($A4)-LEN(SUBSTITUTE($A4,"Approval history","")))/LEN("Approval history")

    the 3rd formula is just using "approval for" instead of "-". and "approved by" instead of "Approval history". hope that helps

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1