
Fuzzy string comparison / detecting "similar" strings

  1. #1
    xirx
    Guest

    Fuzzy string comparison / detecting "similar" strings

    When dealing with real-life data, you often have some
    variation of minor errors in your data. E.g. I have
    two lists (databases) in which names differ slightly.

    Examples:

    "Clark Kent" vs "Clark Kent"
    "John P. Smith" vs "John Paul Smith"
    "Miller Limited" vs "Miller Ltd."
    "Peter Hammer" vs "Petre Hammer"

    I am looking for a way to handle this (semi-)automatically.
    My idea is to have a function f that takes two strings
    and delivers a measure of how much they are alike. E.g.
    f should be 1 if both arguments are identical, and it
    should be 0 if they are "completely" different.

    I am pretty sure that a lot of people have been thinking
    about such a thing already and there should be more
    than one solution for this.

    Any pointers?

  2. #2
    Harlan Grove
    Guest

    Re: Fuzzy string comparison / detecting "similar" strings

    xirx wrote...
    >When dealing with real-life data, you often have some
    >variation of minor errors in your data. E.g. I have
    >two lists (databases) in which names differ slightly.

    ....

    Read the two linked threads in

    http://groups-beta.google.com/group/...86dfc0974048ac

    (or http://makeashorterlink.com/?G12C11C7A ). You're correct that other
    people have discussed this before, so you should search the newsgroup
    archives before posting questions.
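
    For illustration only: the function f described in the question
    (1 for identical strings, 0 for "completely" different ones) could
    be sketched roughly as below. This is not taken from the linked
    threads. It assumes a normalized Levenshtein edit distance as the
    measure, and the names levenshtein and similarity, as well as the
    lowercasing and trimming step, are hypothetical choices made for
    this sketch.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop to save memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Score in [0, 1]: 1.0 for identical strings, 0.0 when every
    character position has to be edited."""
    # Assumption for this sketch: case and surrounding blanks do not matter.
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    pairs = [
        ("Clark Kent", "Clark Kent"),
        ("John P. Smith", "John Paul Smith"),
        ("Miller Limited", "Miller Ltd."),
        ("Peter Hammer", "Petre Hammer"),
    ]
    for left, right in pairs:
        print(f"{left!r} vs {right!r}: {similarity(left, right):.2f}")
```

    Plain edit distance counts a swapped pair of letters ("Peter" vs
    "Petre") as two edits, and it knows nothing about abbreviations
    such as "Ltd.", so any cutoff for calling two names "similar"
    would have to be tuned on your own data.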

