+ Reply to Thread
Results 1 to 5 of 5

Formula for classifying data

  1. #1
    Registered User
    Join Date
    09-17-2013
    Location
    United States
    MS-Off Ver
    Excel 2013
    Posts
    4

    Formula for classifying data

    Hi everyone! I'm doing some preliminary work for an academic project. I'm trying to measure the diversity of social networks in a part of the world where ethnicity can be identified by surname. I'm using publicly available data from Facebook. I've been doing this for a while using a cumbersome method, but I suspect that there may be a simpler solution using excel. Unfortunately, I haven't found an approach that works yet, hence the post here. Let me briefly explain what I'm trying to do -- I've attached a file here as an example.

    I have one spreadsheet for each study subject. The country I'm doing this work in has three main ethnic groups. In column A, I have the 100+ most common surnames from ethnic group A. In column B, I have the 100+ most common surnames from ethnic group B, and in column C, the 100+ most common surnames from ethnic group C.

    In column D, I paste the names of the subject's social network (in other words, everyone on their friend list). What I want to do is classify the ethnicity of each friend. To minimize manual work, the first cut would be to determine whether the names in column D appear in columns A, B, or C.

    In short, I need a formula that will mark a "1" (or something similar) when a name in column D appears in column A (I'll use the same formula to check for matches with columns B and C), and "0" when it doesn't. The slight complicating factor is that columns A, B, and C have only surnames, while D will have surnames, given names, and nicknames. So the formula should code a match if one of the names in column D is match with A, B, or C (in other words, the cell contents will never be a complete match, because column D will never have only surnames).

    The formula should produce the following using the sample file:

    D1 should be a column A match
    D2 a column C match
    D3 no match
    D4 a column B match
    D5 a column A match

    I'd really appreciate any help on this. I've spent quite a while looking for solutions, but nothing has worked reliably. Thanks very much in advance!
    Attached Files Attached Files

  2. #2
    Forum Guru Pete_UK's Avatar
    Join Date
    12-31-2011
    Location
    Warrington, England
    MS-Off Ver
    Office 2019 (still learning)
    Posts
    24,728

    Re: Formula for classifying data

    You need to be more consistent with your definition of surname - this formula in E1 will work for the first 3:

    =IF(ISNUMBER(MATCH(TRIM(RIGHT(D1,LEN(D1)-FIND(" ",D1))),A:A,0)),1,IF(ISNUMBER(MATCH(TRIM(RIGHT(D1,LEN(D1)-FIND(" ",D1))),B:B,0)),2,IF(ISNUMBER(MATCH(TRIM(RIGHT(D1,LEN(D1)-FIND(" ",D1))),C:C,0)),3,0)))

    but it fails on the 4th name as you have 2 spaces (and thus 3 names) and it fails on the 5th name as it is the first of the 3 names.

    Anyway, hope this helps.

    Pete

  3. #3
    Registered User
    Join Date
    09-17-2013
    Location
    United States
    MS-Off Ver
    Excel 2013
    Posts
    4

    Re: Formula for classifying data

    Thanks for the quick reply, Pete.

    Unfortunately I can't be more consistent with the order of names -- we'll be going through thousands of them, so the manual work would be overwhelming. In addition, it may often not be clear which of the names is the surname.

    Is there anyway that we can code a match is any of the names from column A, B, or C appear in D? That should be sufficient and shouldn't produce many errors.

    Thanks again!

  4. #4
    Forum Guru Pete_UK's Avatar
    Join Date
    12-31-2011
    Location
    Warrington, England
    MS-Off Ver
    Office 2019 (still learning)
    Posts
    24,728

    Re: Formula for classifying data

    Put this formula in E1:

    =IF(ISNUMBER(MATCH(MID(TRIM(D1),FIND("·",SUBSTITUTE(TRIM(D1)," ","·",LEN(TRIM(D1))-LEN(SUBSTITUTE(D1," ",""))))+1,256),A:A,0)),1,IF(ISNUMBER(MATCH(MID(TRIM(D1),FIND("·",SUBSTITUTE(TRIM(D1)," ","·",LEN(TRIM(D1))-LEN(SUBSTITUTE(D1," ",""))))+1,256),B:B,0)),2,IF(ISNUMBER(MATCH(MID(TRIM(D1),FIND("·",SUBSTITUTE(TRIM(D1)," ","·",LEN(TRIM(D1))-LEN(SUBSTITUTE(D1," ",""))))+1,256),C:C,0)),3,0))) + IF(ISNUMBER(MATCH(LEFT(D1,FIND(" ",D1)-1),A:A,0)),1,IF(ISNUMBER(MATCH(LEFT(D1,FIND(" ",D1)-1),B:B,0)),2,IF(ISNUMBER(MATCH(LEFT(D1,FIND(" ",D1)-1),C:C,0)),3,0)))

    then copy down. This will check the last name in column D to see if it exists in either A, B or C, and then it will check the first name - the two parts of the formula are separated by the + sign.

    Hope this helps.

    Pete
    Last edited by Pete_UK; 09-23-2013 at 04:21 AM.

  5. #5
    Forum Expert RobertMika's Avatar
    Join Date
    06-22-2009
    Location
    Haverhill, UK
    MS-Off Ver
    Excel 2003-13
    Posts
    1,530

    Re: Formula for classifying data

    =IF(SUM(--ISNUMBER(SEARCH($A$1:$A$5,D1)))=1,1,IF(SUM(--ISNUMBER(SEARCH($B$1:$B$5,D1)))=1,2,IF(SUM(--ISNUMBER(SEARCH($C$1:$C$5,D1)))=1,3,0)))
    Confirm COntrol+Shift+Enter
    If you are http://www.excelforum.com/image.php?type=sigpic&userid=125481&dateline=1392355029happy with the results, please add to the contributor's
    reputation by clicking the reputation icon (star icon).




    If you are satisfied with the solution(s) provided, please mark your thread as Solved.
    Select Thread Tools-> Mark thread as Solved.
    To undo, select Thread Tools-> Mark thread as Unsolved.
    http://www.excelaris.co.uk

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. formula for classifying data from facebook
    By kaiost in forum Excel Formulas & Functions
    Replies: 2
    Last Post: 09-17-2013, 02:57 PM
  2. A Formula for classifying groups of numbers into a letter.
    By surfer1 in forum Excel Formulas & Functions
    Replies: 14
    Last Post: 07-05-2013, 09:48 AM
  3. Classifying entries
    By henryrrjames in forum Excel General
    Replies: 2
    Last Post: 04-28-2012, 09:05 AM
  4. Classifying Data by use of an IF AND Statement
    By analyst1 in forum Excel General
    Replies: 6
    Last Post: 09-15-2010, 03:37 PM
  5. [SOLVED] Classifying Data
    By Mad Dog in forum Excel General
    Replies: 1
    Last Post: 10-06-2005, 09:05 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1