+ Reply to Thread
Results 1 to 4 of 4

Conditional identification of duplicates

  1. #1
    Registered User
    Join Date
    11-21-2018
    Location
    Singapore
    MS-Off Ver
    Professional Plus 2013
    Posts
    9

    Conditional identification of duplicates

    Dear Community,

    I'm working with event attendance records and would like to identify all events that were attended by at least three participants who previously attended the same event or events together.

    Participants are assigned unique IDs, as are events. In the attached example, we have five events which were attended by five participants. However, some participants attended several events.

    I do not simply want to count individual duplicate event attendances (e.g., A1844Q attended two events, and so on).

    Rather, I would like to flag up all events attended by at least three people who previously attended another event (these three people would have had to be in the same event previously).

    For example, individuals A1844Q, B1665X and C1253Y attended Event 1 as well as Event 2. And individuals A1731Y, B1454Z and C1253Y attended Events 3 and 5.

    In terms of desired output, I would like to get a list of all Event IDs and corresponding Participant IDs to which this condition applies - however without counting the first occurrence, if possible.

    The assumption would be that the first occurrence would be valid, whereas subsequent occurrences were due to deliberate data falsification rather than genuine event attendances.

    I would be most grateful for any ideas on how to approach this problem.

    Best,

    J0ha
    Attached Files Attached Files

  2. #2
    Valued Forum Contributor
    Join Date
    11-02-2016
    Location
    NY
    MS-Off Ver
    2010
    Posts
    459

    Re: Conditional identification of duplicates

    I'm not sure if you would like a Pivot table to help 'see'/view the data? See attached. Let me know what you think
    Attached Files Attached Files

  3. #3
    Registered User
    Join Date
    11-21-2018
    Location
    Singapore
    MS-Off Ver
    Professional Plus 2013
    Posts
    9

    Re: Conditional identification of duplicates

    Hi Queuesef,

    I have tried a Pivot Table, however, my actual data-set is quite large which makes it not really practicable.

    My initial idea was to:

    1. Using an 'IF' statement to label the first occurrence of an attendance code as ‘0’ and subsequent occurrences as ‘1’, ‘2’, etc;
    2. Filtering out the '0' occurrences;
    3. Potentially using an 'INDEX'-'MATCH' combination or 'VLOOKUP' to identify events with questionable participant codes.

    Do you have any comments or suggestions on how to operationalise this approach?

    Best,

    J0ha

  4. #4
    Forum Guru
    Join Date
    08-28-2014
    Location
    USA
    MS-Off Ver
    Excel 2019
    Posts
    17,647

    Re: Conditional identification of duplicates

    Here are a couple of options.
    On Sheet2 is a formula proposal that doesn't lend itself to filtering:
    1. Columns B:F are populated using: =SUMPRODUCT((Sheet1!$A$2:$A$26=$A2)*(Sheet1!$B$2:$B$26=B$1))
    2. Columns G:K are populated using: =IF(B2=1,SUM($B2:B2)-1,"")
    On Sheet3 is a pivot table proposal that depends on a helper column appending the original data on Sheet1:
    1. The helper column (E) is populated using: =COUNTIFS(A$2:A2,A2)-1
    2. The Pivot table may be filtered so that it displays only the participants that have attended multiple events.
    Note that to see the whole list of participants the filter needs to be cleared from the 'Unique participant ID' column.
    Let us know if you have any questions.
    Attached Files Attached Files
    Consider taking the time to add to the reputation of everybody that has taken the time to respond to your query.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. De-identification of Dataset
    By brishtee in forum Excel Formulas & Functions
    Replies: 2
    Last Post: 08-15-2018, 09:17 AM
  2. [SOLVED] Identification of duplicates numbers in a column and print as duplicates with that number
    By kswapnadevi in forum Excel Programming / VBA / Macros
    Replies: 3
    Last Post: 10-16-2017, 05:25 AM
  3. Variables identification
    By Hassan1977 in forum Excel Programming / VBA / Macros
    Replies: 2
    Last Post: 02-11-2016, 05:39 PM
  4. Replies: 11
    Last Post: 05-16-2013, 09:57 AM
  5. Number identification
    By HelpExc in forum Excel Programming / VBA / Macros
    Replies: 10
    Last Post: 12-14-2012, 02:15 PM
  6. Identification and use of a particular box
    By Gawel in forum Excel Formulas & Functions
    Replies: 1
    Last Post: 03-20-2012, 06:52 AM
  7. Object Identification
    By JSMITHXX in forum Excel Programming / VBA / Macros
    Replies: 2
    Last Post: 03-19-2012, 07:36 AM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1