+ Reply to Thread
Results 1 to 13 of 13

Close match ?

  1. #1
    Forum Contributor colddeck84's Avatar
    Join Date
    06-18-2016
    Location
    bergen, norway
    MS-Off Ver
    2016
    Posts
    254

    Close match ?

    Im scraping soccer information from 5 different sources for my fantasy league. Problem is that most team names can be written in many different ways with a slight difference. I've started on a very daunting task where I have in my data sheet Column AR whats supposed to be final rename of the team, then I use Vlookup in my script to try to find this team name in the data sheet. Here is a sample of my work

    Vitoria Guimaraes Guimaraes Vitoria Guimaraes FC Vitoria Guimaraes Guimaraes
    AEskilstuna AFC Utd Eskilstuna City AFC Eskilstuna Eskilstuna City
    Malmo FF Malmø FF Malmo Malmø Malmo
    Levante UD FC Levante UD Levante FC Levante Levante
    TSV Hartberg TSV Hartberg Hartberg Hartberg Hartberg
    Legia Warsaw Legia Warsaw Legia Warszawa Legia Warszawa Legia Warsaw
    Norwich City Norwich Norwich City FC Norwich City Norwich

    For the most part there are just some slight variation from site to site , although some names could be quite different
    Does anyone have a better solution/suggestion to help automate this process as much as possible, In my code I try to remove stuff like FC , FK, etc but even with all I get a ton of mismatches ..

    Any help would be most appreciated ,

    best regards

    frederik

  2. #2
    Forum Expert
    Join Date
    06-12-2012
    Location
    Ridgefield Park, New Jersey
    MS-Off Ver
    Excel 2003,2007,2010
    Posts
    10,241

    Re: Close match ?

    Can you post a sample worksheet? You can use VLOOK-UP with a wildcard or other methods which may help?

  3. #3
    Forum Contributor colddeck84's Avatar
    Join Date
    06-18-2016
    Location
    bergen, norway
    MS-Off Ver
    2016
    Posts
    254

    Re: Close match ?

    Hi thanks for your reply . Ive included a sample book (also the data) - as you see this is really hard to organize so any help speed up this process would be great.
    Attached Files Attached Files

  4. #4
    Forum Expert
    Join Date
    06-12-2012
    Location
    Ridgefield Park, New Jersey
    MS-Off Ver
    Excel 2003,2007,2010
    Posts
    10,241

    Re: Close match ?

    What information are you trying to pull and too where?

  5. #5
    Forum Contributor colddeck84's Avatar
    Join Date
    06-18-2016
    Location
    bergen, norway
    MS-Off Ver
    2016
    Posts
    254

    Re: Close match ?

    The main sheets with "sheet1" in the sample are generated from a few sites trough my web scraper script. when I generate the teams in Column E- G its done like this:

    Please Login or Register  to view this content.
    and it loops trough all the teams when it pulls them from the web.
    Last edited by colddeck84; 08-21-2017 at 09:34 AM.

  6. #6
    Forum Contributor colddeck84's Avatar
    Join Date
    06-18-2016
    Location
    bergen, norway
    MS-Off Ver
    2016
    Posts
    254

    Re: Close match ?

    I found a similar problem trough a google search and this code was provided, Ill attach my testbook , It managed to get 75% correct. This code is really brilliant but its designed I think to clear out typos so its not really designed for this but very close

    Please Login or Register  to view this content.
    This code is so beautiful I don't even know how to edit it. Ill attach my textbook it shows the problem
    Attached Files Attached Files

  7. #7
    Forum Guru
    Join Date
    08-15-2004
    Location
    Tokyo, Japan
    MS-Off Ver
    2013 O.365
    Posts
    22,588

    Re: Close match ?

    Seems the code I wrote years ago.

    What's the problem?

  8. #8
    Forum Contributor colddeck84's Avatar
    Join Date
    06-18-2016
    Location
    bergen, norway
    MS-Off Ver
    2016
    Posts
    254

    Re: Close match ?

    Yes that is correct. its the most beautiful code - I really like your work
    If you see in my test book im trying to get as close to 100% match when I ran your code it gave around 75% match which is pretty great I most admit.

    I have 5 sources (and more to come) that scrape soccer team info.
    problem is that most sites spell the teams differently here are some sample:

    West Ham
    West Ham United
    West Ham FC
    FC West Ham

    VfL Bochum
    Bochum
    VfL Bochum

    Brighton & Hove Albion U23
    Brighton & Hove Albion U23
    Brighton and Hove Albion U23

    Ayacucho FC
    Ayacucho
    Inti Gas Deportes

    The way im handling it for now it was a huge vlookup table with 6 columns but this i a gigantic task to sort everything. Because I need all teams to have same name.

    Also note team that have U21 U22 U23 U24 U19 U20 should not be comperead to other then Ux team
    and same with women teams W (W)


    All different teams:
    Liverpool
    Liverpool U18
    Liverpool U19
    Liverpool U20
    Liverpool U21
    Liverpool U22
    Liverpool U23
    Liverpool W

    So best to not let code do anything in those situations.
    (also included another test book with teams)

    If you could help in anyway with this I would be forever grateful
    Attached Files Attached Files

  9. #9
    Forum Guru
    Join Date
    08-15-2004
    Location
    Tokyo, Japan
    MS-Off Ver
    2013 O.365
    Posts
    22,588

    Re: Close match ?

    See if this is how you wanted
    =GetCommon(A2:F2)
    Please Login or Register  to view this content.
    Attached Files Attached Files

  10. #10
    Forum Contributor colddeck84's Avatar
    Join Date
    06-18-2016
    Location
    bergen, norway
    MS-Off Ver
    2016
    Posts
    254

    Re: Close match ?

    Hi I was just sniffing trough your code when I heard my phone pinging , I was very glad to see it was from here..
    This looks indeed very nice but How would I go when new team are added , I would still need to sort everything ? I kinda liked your first code better as only one list are needed but maybe that why I don't understand yet how your new code is working?

    Basically here is what i'm doing everyday:
    I load site 1 into my excel it generetes from 500 to 3000 names at the most.
    then I load site 2 on top of this
    site 2 tries to run the teams trough a vlookup , (and this is totally dependent upon me updating the data sheet to include all the new names )
    then I remove duplicates
    run site 3
    run site 4
    sort en rerun everything
    its almost become an endless process :/

    If you see the first page in that excel document you uploaded "sheet1". There I have sorted all the teams based on "Column E", - A trough Z ,
    but I have not run the "remove duplicate code yet."

    So question is how would I go to sort out all the teams on page "sheet1"
    would your newest code fix this as best as possible ?

    maybe a combination of both codes are the way to handle this problem? where one could use VLookLike trough master column H, and for better match finding also coloumn A-G could be a part of the criteria ? (if there are data in them)

    your original code produced 18 mistakes out of 77 rows when just searching through one column, then we could take those 17 mistakes and add them next to the search column , Then next time we run the code it will search 2 columns and compering both of them?

    I really appreciate the help Ive received thus far , I really hope you can help me solve this problem. I'm a little afraid a beast code would be necessary for this

  11. #11
    Forum Contributor colddeck84's Avatar
    Join Date
    06-18-2016
    Location
    bergen, norway
    MS-Off Ver
    2016
    Posts
    254

    Re: Close match ?

    also here is a part from one of the sites im pulling data from

    Please Login or Register  to view this content.
    I maked the part that handles the teams in red

  12. #12
    Forum Contributor colddeck84's Avatar
    Join Date
    06-18-2016
    Location
    bergen, norway
    MS-Off Ver
    2016
    Posts
    254

    Re: Close match ?

    Ok so ive been thinking alot about this problem and I have found a solution that I think will work very well.. I created a test sheet which i've included . It should be done in a few steps this problem

    First team name is stored as variable ateam then
    run your brilliant code vlooklike to get a script suggestion for a team name. then a code must be run to check if the suggestion is correct. In my test sheet I used vlookup which obviously will not work as this is dependent on everything being in the
    same row. So a code must be run to figure out which row number the suggestion comes from. then check if the suggested script name matches any of the names on that row - and it should alway return final "rename" from that row and column H
    if it can not find a match, the ateam string should be unaltered.

    From my test sheet:
    Column B is the import of ateam - vlooklike checks if it can find a match searching Column H (this should be expanded to coloum C:H)
    Then some crazy code must figure out which row the suggestion comes from ,

    there will be 3 different results (team found correctly, found incorrect, found incorrect because the team name is not stored in database=

    example: imported name: "Malaysia (W)" , script suggestion for rename "Malaysia Women"
    - "Malaysia Women" can be found in row 23 - now we need to check if both name is present in row 23 and yes its in "E23"- test passed. But as I stated earlier it Should return result from H23
    now it can move to next string

    sample2
    import name: "New Team Added" , script suggestion: "FK Energomash Belgorod" , which can be found in row 142 but scrpt suggestion "New Team Added" can not be found in C142:H142, suggestion fail ,
    it should return "New Team Added"

    sample3 (basically same as above)

    import name: "Ceres La Salle", script suggestion: "OSFK Sarisske Michalany", which can be found in row 51, but again Ceres La Salle is not present in cell "C51:H51" ,
    it should return "Ceres La Salle"

    Later I can manually add missing teams to the data base , but after a code like this there will be a lot less to do

    I include my test book, hope you have time to look at this,

    Also a little tweaking of the Vlookalike code for better matching would be great , it could ignore words like, AFC, afc, FA, MSK, MK, afc, FC, fc FK, fk, Fc, Fk, CD, Cd, cd, MSK, SV, sv "Ø" count as O Å as A

    Best regards

    frederik
    Attached Files Attached Files

  13. #13
    Forum Contributor colddeck84's Avatar
    Join Date
    06-18-2016
    Location
    bergen, norway
    MS-Off Ver
    2016
    Posts
    254

    Re: Close match ?

    here is a working sample made by an amatour coder like myself but gives a clear indication of what im trying to accomplish , think a little poolish of the vlooklike code and my code and this could be very great
    Attached Files Attached Files

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. [SOLVED] sumif which finds text, not an exact match, but a close match instead
    By Beefy1 in forum Excel Formulas & Functions
    Replies: 3
    Last Post: 02-19-2014, 08:29 PM
  2. If, Match, and Somewhere Close To
    By just_some_dude in forum Excel General
    Replies: 10
    Last Post: 12-02-2011, 12:31 PM
  3. Code req for returning results of a close match if two other columns match exactly
    By davidparkes in forum Excel Programming / VBA / Macros
    Replies: 5
    Last Post: 10-23-2010, 12:00 PM
  4. VLOOKUP Closest Match Not Close Enough
    By Panic in forum Excel Programming / VBA / Macros
    Replies: 1
    Last Post: 05-11-2008, 03:41 PM
  5. VLOOKUP Closest Match Not Close Enough
    By Ronster in forum Excel Programming / VBA / Macros
    Replies: 15
    Last Post: 05-07-2008, 01:14 PM
  6. Finding an alpha numeric close match
    By carsto in forum Excel General
    Replies: 6
    Last Post: 11-15-2006, 10:17 AM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1