I am using Excel 2007 and I have a workbook with 12 different sheets. I have attached a sample of the field I need help with.
I need to extract the City/Town out of the branch field. The only pattern I can spot is that each branch begins with a number which is up to 4 numeric characters long, followed by a space and then the first part of a City/Town. This is followed by a space and either the second part of the town or another piece of information which I don't need.
I appreciate any help anyone can give me, either a formulae or VBA.
Last edited by dcaraher; 12-15-2010 at 09:13 AM.
One way, in cell B2:
=IF(ISERROR(FIND(" ",A2,FIND(" ",A2)+1)),MID(A2,FIND(" ",A2)+1,LEN(A2)),MID(A2,FIND(" ",A2)+1,FIND(" ",A2,FIND(" ",A2)+1)-FIND(" ",A2)-1))
This is basically looking for the two spaces and calculating and extracting the number of characters in between.
The first part, "IF(ISERROR(FIND(" ",A2,FIND(" ",A2)+1))" is checking to see if there is a *second* space; if not, you will get an error which is picked up by the ISERROR function.
If there is an error, the second part (the TRUE part of the IF statement), "MID(A2,FIND(" ",A2)+1,LEN(A2))", gives you the characters after the only space.
The third part (the FALSE part of the IF statement), "MID(A2,FIND(" ",A2)+1,LEN(A2)),MID(A2,FIND(" ",A2)+1,FIND(" ",A2,FIND(" ",A2)+1)-FIND(" ",A2)-1)", gives you the characters between the two spaces.
Drag down to cell B103.
Regards
Last edited by TMShucks; 12-15-2010 at 04:53 AM.
TM, thanks for your help with this. It is very close to what I need, and perhaps I am asking for the impossible.
Your formulae only extracts the first part which is fine for rows 1 , 2 & 3 but in row 4 it picks up Glasgow-Graham. I guess this is because the delimiter is a "-".
Also in row 21 it produces a result of High where I need it to produce High Wycombe.
Again many thanks foryour help so far, and it may be that what I want is not possible.
I thought it did what you described. I perhaps misinterpreted "the second part of the town". You may need to undertake a data cleansing exercise, at least for the "-" part of the problem. With regard to the name of a town, I'm not sure what you're going to do about that.The only pattern I can spot is that each branch begins with a number which is up to 4 numeric characters long, followed by a space and then the first part of a City/Town. This is followed by a space and either the second part of the town or another piece of information which I don't need.
I suspect that you would have the same issue with a VBA solution.
Maybe someone else can help; I will be interested in the answer.
Regards
TM having re-read my initialrequest I can see how I didn't make it clear.
I think this may be too difficult because even if I replace the "-" delimiter with a space, I am still asking for a solution to decide whether the second part of the string is required or not.
data is not consistent eg kings norton how would you know thats not just kings
you could compare each against a list of uk towns cities but even kings norton isnt in those lists as its an area of birmingham so youd need cities/towns/ district /borough lists
if you had the associated post codes it might be easier to extract from that but I'm not sure.
another example is glasgow hillington so why not london penge?
Last edited by martindwilson; 12-15-2010 at 05:30 AM.
"Unless otherwise stated all my comments are directed at OP"
Mojito connoisseur and a dabbler in Cisco
where does code go ?
look here
how to insert code
how to enter array formula
why use -- in sumproduct
recommended reading
wiki Mojito
how to say no convincingly
most important thing you need
Martin Wilson: SPV
and RSMBC
I've attached the solution to the spread sheet you provided. Quite simple really. I used the TEXT functions: LEN and MID.
@ LL1980 all that does is remove the leading digits! and even then would fail if there were more than 4 or less than 3
the same thing you show can be accomplished with just
=MID(A2,FIND(" ",A2)+1,255)
it doesnt parse 144 Lytham St Annes-Graham to just Lytham St Annes
Last edited by martindwilson; 12-15-2010 at 05:56 AM.
"Unless otherwise stated all my comments are directed at OP"
Mojito connoisseur and a dabbler in Cisco
where does code go ?
look here
how to insert code
how to enter array formula
why use -- in sumproduct
recommended reading
wiki Mojito
how to say no convincingly
most important thing you need
Martin Wilson: SPV
and RSMBC
i think perhaps as a prelimary move
just find/replace *graham* with nothing first
then find replace (* with nothing
then find replace *p&h* with nothing
then have a look whats whats left
"Unless otherwise stated all my comments are directed at OP"
Mojito connoisseur and a dabbler in Cisco
where does code go ?
look here
how to insert code
how to enter array formula
why use -- in sumproduct
recommended reading
wiki Mojito
how to say no convincingly
most important thing you need
Martin Wilson: SPV
and RSMBC
Hi dcaraher
maybe...you can probably do the same with out regular expresionFunction PatternExtractTwo(rngString As Range) Dim oRgx As Object Set oRgx = CreateObject("VBScript.RegExp") With oRgx .IgnoreCase = True .Global = True .Pattern = "(-)" If .test(rngString) = True Then .Pattern = "([A-Za-z\s])+" PatternExtractTwo = .Execute(rngString)(0) Else .Pattern = "([A-Za-z]+)" PatternExtractTwo = .Execute(rngString)(0) End If End With End Function
regards pike
If the solution helped please donate here to the RSPCA
Sites worth visiting;
J&R Solutions - royUK
AJP Excel Information - Andy Pope
Spreadsheet Toolbox
VBA for smarties - snb
TMShucks
looks like the address names are like surnames with too many variations to able to filter all of them correctly
regards pike
If the solution helped please donate here to the RSPCA
Sites worth visiting;
J&R Solutions - royUK
AJP Excel Information - Andy Pope
Spreadsheet Toolbox
VBA for smarties - snb
This UDF is a compromise but does thin the list fairly drastically.
Function RemoveMaxPossible(rngString As Range) Dim n As Integer Dim strData As String Dim arrCompanies() As Variant strData = Mid(rngString, InStr(1, rngString, " ") + 1) arrCompanies = Array("GRAHAM", "Grahams", "Graham", "Jewson", "NGB", "P&H", "P & H", "NJB", "Plumbing & Heatn", "Contrac") For n = 0 To UBound(arrCompanies) strData = WorksheetFunction.Substitute(strData, arrCompanies(n), "") Next If InStr(1, strData, "-") > 0 Then strData = Left(strData, InStr(1, strData, "-") - 1) If InStr(1, strData, "(") > 0 Then strData = Left(strData, InStr(1, strData, "(") - 1) If InStr(1, strData, Space(2)) > 0 Then strData = Left(strData, InStr(1, strData, Space(2)) - 1) RemoveMaxPossible = Trim(strData) End Function
Add to the array of "company names" as examples are found.
And possibly add if statements to cover other delimiters as they reveal themselves.
The end result should allow you to choose whether the text is an address or a town name.
This is by no means ideal but might help some.
It might help with your geography if nothing else.
If you need any more information, please feel free to ask.
However, if this takes care of your needs, please click EDIT in your original post, click GO ADVANCED and set the PREFIX box to SOLVED. It helps everybody! ....
Also
If you are satisfied by any members response to your problem please consider using the small Star icon botom left of thier post to show your appreciation.
Guys thanks very much for all of your help, this is a perfect example of why this board is so helpful to excel novices like me.
I found an alternative solution to my problem. I extracted teh branch number using the LEFT function. I then went to the originator of the file and asked for a branch list and used that in a VLOOKUP. Not an elegant solution but at least it fixed my immediate problem.
However looking through the suggested solutions has increased my knowledge so a double success.
Once again many thanks for your help.
Never give up........lol
Actually the UDF from Marcol worked very well, and I think with a little adaptation I could use it for other similar problems.
Thanks everyone.
There are currently 1 users browsing this thread. (0 members and 1 guests)
Bookmarks