# Detect matching values in very large lists

1. ## Detect matching values in very large lists

Hi,

I'm trying match values (and set a Yes / No result) from values in two very large lists.

List 1 (approx 170,000 rows) contains the 'Find What' values
List 2 (approx 980,000 rows) contains the 'In What' values

Values in list 1 will be unique in list 2, but not all values in list 1 will appear in list 2.

The values in each list are all 16 character stings

This is a one-off otherwise I'd probably import into a AccessDB

Using formulas is taking an absolute age to calculate using MATCH function, so I'm wondering if coding is the better angle.

Any ideas/insight/methods welcome.

(Using XL07, on Core2Duo)

2. ## Re: Detect matching values in very large lists

Hello JBentley,

You should provide a representative sample of the data. That can then be used to create a macro solution. Remember to remove any confidential or sensitive information before posting the data. You should also include a copy in Excel 2003. The more people who can view your code, the better.

3. ## Re: Detect matching values in very large lists

Sorting your "look in" data allows you to use much more efficient approaches.

If you sort List 2 in Ascending order you will find that using:

List1
=IF(INDEX(Sheet2!A:A,MATCH(A1,Sheet2!A:A))=A1,"yes","no")

will work much much faster than any unsorted approach.

Why ?

Well, with data sorted you can use MATCH with return_type of 1 (default) which employs the Binary Search algorithm - when the return_type is 0 the MATCH has to conduct an exact match which means it has to look in each cell in the range - with Binary Search given the data is sorted it need only check a much smaller subset of records - thus on large datasets the differences are immense.

Here even though you're conducting an exact match we can do that by using INDEX to return the value generated by the MATCH and compare that to the original - if the same then "yes" else "no" - this is still much quicker than using

=MATCH(A1,Sheet2!A:A,0)

I should add I tested the above approach on data sets of your size and the sheet calculated in a few seconds.

4. ## Re: Detect matching values in very large lists

Hi,

@Leith - Good advice, thankyou, I normally would upload a representative desensitised sample, but I wasn't confident that a sample (when tested for calculation speed) would reflect what I might see IRL.

@DonkeyOte - Thanks, your solution was perfect. Difference in calculation is incredible, from over 1/2 hour down to 3 seconds (guessing - the hourglass barely flickered).

Justin

##### Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1