# Partial Match VLOOKUP Alternative

1. ## Partial Match VLOOKUP Alternative

Hi All,

I'm struggling with trying to do a VLOOKUP with a partial value and looking for some guidance. As you can see in the attached Tab-1 I have a list of values. The data highlighted in red is what I am interested in, the rest of the information is redundant and in my real file spans over 10000 lines.

On Tab-2 I have a cleansed data list with an associated value. These values match the text in red on Tab-1. I therefore want to try and lookup the partial value from each cell in Tab-1 and pick up the info from tab 2.

I had tried the standard partial VLOOKUP("*"&A2&"*") but this won't work because the additional information is actually on Tab-1 and not Tab-2. I can't do the VLOOKUP on Tab-2 because there is multiple versions of the data in red with different suffix and prefix's.

Also key to note throughout this dataset the suffix and prefix could be different lengths as could the data I need in Red.

I have tried lots of different formulas including VLOOKUP, INDEX MATCH etc etc but I think the dataset could need cleansed before I proceed.

Any ideas?

Thanks

2. ## Re: Partial Match VLOOKUP Alternative

12345 will match a few times as shown
will there always be a space at the end of the number you are trying to match and a - at the front
BUT
HDS-12345 A
&
ABC-12345 ZZZ
will match 12345

so how do you identify which is the correct 12345 to match

3. ## Re: Partial Match VLOOKUP Alternative

The suffix and prefixes aren't important. 12345 should both match with 12345 on Tab-2 and produce the same value.

The prefix will always be 3 characters at the start with a dash. The challenge comes with the suffix which could be any amount of characters and may or may not include a space between the red text.

4. ## Re: Partial Match VLOOKUP Alternative

so how would you know
12345619
is 12345 and 619 is the suffix
OR that
123456 is the number and 19 the suffix
we can cleanup and extract the prefix , using

=RIGHT(A2,LEN(A2)-4)
and we could search for the space and extract that number - BUT i dont know how to deal with a suffix that may not have a delimiter between the number/text you want

5. ## Re: Partial Match VLOOKUP Alternative

Good point. The true data set isn't as severe as this example, it is a combination of letters and numbers and is less likely to fall into this issue. Definitely take that on board though and if I can find a solution I may just need to take liberties with the odd outlier that may appear.

The most frequent suffix is a date code so it may be 2020 or (20) but there is a large amount of data and differing examples of lengths so unfortunately would not be easy to use the above formula for the other side of the text. Sometimes there is a space and sometimes there is a dash before the suffix aswell so it isn't even like we could use that as the starting point for the formula to work from.

I guess naively I was hoping I could use some form of match or search function to find the text in red and then perform a VLOOKUP but this is far beyond my skillset of trying to merge formulas

6. ## Re: Partial Match VLOOKUP Alternative

not sure about a lookup as 12345 will be found in 123456 or 221234588
length may not matter

when you say (20) do you mean with brackets or will you have

1234520
123452020
BUT 1234520 - is the number you actually want , as 20 in that case is the suffix

the problem with the search is that
12345 will be found in many examples - BUT may not be the full number you want
1234567AA
1234562020
12345 ZZ

so that would be 3 different numbers

i assume next year it would be 2021 or (21)

2020 and (20) are 4 characters
whats the chances of the suffix in the majority of cases being 4
and use a trim to get rid of any spaces

Left( A2, Len(a2)-4 ) and the right() to extract the middle section

7. ## Re: Partial Match VLOOKUP Alternative

It changes all the time, nothing is easy

I may have (20) or I may have 2020 or I could have a different identifier such as ABC20. The length varies.

I had considered using a combination of left and trim to remove the vast majority of extra information. I could then have just done a further formula to remove the ones that had even more characters. The problem with this approach is that some of the cells are actually correct without any manipulation, therefore I would be removing characters when there is no need.

I can't use formulas to remove text after a comma or after a '-' or after a space as it varies with the cell for each value.

I do understand the challenge you are outlaying that data may be found several times but it is actually an unlikely possibility with the true dataset

8. ## Re: Partial Match VLOOKUP Alternative

Am i correct the prefix is always 4 characters

=MATCH(TRUE,ISERROR(VALUE(MID(RIGHT(A2,LEN(A2)-4),ROW(INDIRECT("1:"&LEN(A2)-4)),1))),0)+4
will find the start number in the string of the first NON number value after the 4 digit prefix
so that covers everything except the 2020
or can you have 1234 and that be the suffix
AAA-54321671234

so using that info, i have modified the spreadsheet to pull out the number , change to a number rather than text *1
and then use a vlookup

We could add an IF to test if the last 4 characters =2020
something like
IF( Right(A2,4)="2020", Mid(A2,5,len(a2)-8, "other formula below")

anywhere close

Correct sequence
=MID(A2,5,MATCH(TRUE,ISERROR(VALUE(MID(RIGHT(A2,LEN(A2)-4),ROW(INDIRECT("1:"&LEN(A2)-4)),1))),0)-1)*1

including the 2020 suffix into the IF statement we have
=IF(RIGHT(A2,4)="2020",MID(A2,5,LEN(A2)-8),MID(A2,5,MATCH(TRUE,ISERROR(VALUE(MID(RIGHT(A2,LEN(A2)-4),ROW(INDIRECT("1:"&LEN(A2)-4)),1))),0)-1))*1

then a normal lookup for sequence value
=VLOOKUP(C2,'Tab-2'!A:B,2,FALSE)

There are currently 1 users browsing this thread. (0 members and 1 guests)

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1