+ Reply to Thread
Results 1 to 5 of 5

Counting Text Pairs In a String of Text (ie "AA", "GC" etc)

  1. #1
    Registered User
    Join Date
    03-31-2010
    Location
    LA
    MS-Off Ver
    Excel 2003
    Posts
    2

    Counting Text Pairs In a String of Text (ie "AA", "GC" etc)

    Hi,

    I'm new to this forum so I'm not sure if this is the best place for this question. If I should post it under a different heading let me know.

    I'm trying to get excel to count how many times a pair of letters appears in a string. I'm working with DNA so I'm only using A, G, C and T.

    For example in a sequence GTA AAA CGA there is 1 GT pair, 1 TA pair, 3 AA pair, 1 AC pair, 1 CG pair, and 1 GA pair.

    I set up columns for every possible combination and have them counted with this formula
    =((LEN(C3)-LEN(SUBSTITUTE(C3,"AA","")))/2)
    There was probably an easier way to do it but this worked for me.

    My problem happens when there is a series of poly letters (like in the sequence above).
    Any help would be appreciated! Thank you!
    Last edited by smccann920; 03-31-2010 at 06:39 PM.

  2. #2
    Forum Expert martindwilson's Avatar
    Join Date
    06-23-2007
    Location
    London,England
    MS-Off Ver
    office 97 ,2007
    Posts
    19,320

    Re: Counting Text Pairs In a String of Text (ie "AA", "GC" etc)

    why is it 3 aaa and come to that where is ac or are the spaces misleading?
    "Unless otherwise stated all my comments are directed at OP"

    Mojito connoisseur and now happily retired
    where does code go ?
    look here
    how to insert code

    how to enter array formula

    why use -- in sumproduct
    recommended reading
    wiki Mojito

    how to say no convincingly

    most important thing you need
    Martin Wilson: SPV
    and RSMBC

  3. #3
    Registered User
    Join Date
    03-31-2010
    Location
    LA
    MS-Off Ver
    Excel 2003
    Posts
    2

    Re: Counting Text Pairs In a String of Text (ie "AA", "GC" etc)

    Sorry the spaces are irrelevant, it's just how DNA is usually listed. That should be one string of text with 9 characters.

  4. #4
    Registered User
    Join Date
    04-29-2008
    Location
    Tennessee
    MS-Off Ver
    2007
    Posts
    37

    Re: Counting Text Pairs In a String of Text (ie "AA", "GC" etc)

    Attached is one possible solution. It involves some helper cells (you can hide the columns if needed).
    Attached Files Attached Files

  5. #5
    Forum Guru DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Northumberland, UK
    MS-Off Ver
    O365
    Posts
    21,531

    Re: Counting Text Pairs In a String of Text (ie "AA", "GC" etc)

    If your string is in A1 and your combinations are say listed in D1:D16 then you can calculate the results in E1:E16 using:

    E1: =SUMPRODUCT(--(MID(SUBSTITUTE($A$1," ",""),ROW($1:$8),2)=$D1))
    copied down

    That would return the outlined results for sample string.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1