Handling variances in user-entered text data

**mwwoodm** · 11-01-2011, 08:45 AM

The database that I'm working with has some fields that are critical for my analysis (e.g. for VLOOKUPs) but come from user-entered free text, so I often need to match something like:
"Montezuma #10-W25"
with entries such as:
"muntezuma 10-W 25"
"Montezuma1025W"
"Monte Zuma 25W #10"
"MZA 10-W25"
To date, I've been handling errors case by case using TRIM, CLEAN, UPPER, LEFT, MID, etc., but the dataset is large and variations in data entry abound, so I'm looking for a more robust approach.

Any suggestions or recommendations on a neural network add-in or other solution that would promote automatic recognition/correction of these kinds of variances?

Thanks in advance,
Mike

**MarvinP** · 11-01-2011, 09:06 AM

Hi Mike,

I believe you are looking for a Fuzzy Logic match. Read
http://excellerando.blogspot.com/201...ng-to-get.html
to see if this topic helps.
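
For anyone who wants to see the general idea outside Excel, here is a minimal sketch of fuzzy matching in Python. It is not the method from the linked post; it assumes Python's standard-library `difflib.SequenceMatcher` as the similarity measure, and the helper names `normalize` and `best_match`, as well as the second canonical name "Cortez #3-E11", are made up for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Uppercase the text and drop everything except letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", text.upper())


def best_match(entry, canonical_names):
    """Return (name, score) for the canonical name most similar to entry.

    The score is SequenceMatcher's ratio on the normalized strings,
    a value between 0.0 (nothing in common) and 1.0 (identical).
    """
    key = normalize(entry)
    scored = [
        (name, SequenceMatcher(None, key, normalize(name)).ratio())
        for name in canonical_names
    ]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical list of "clean" names to match against.
canonical = ["Montezuma #10-W25", "Cortez #3-E11"]

for raw in ["muntezuma 10-W 25", "Montezuma1025W"]:
    name, score = best_match(raw, canonical)
    print(f"{raw!r} -> {name!r} (score {score:.2f})")
```

Normalizing first takes care of case, spacing, and punctuation differences, so the similarity score only has to absorb misspellings and reordered characters. Heavily abbreviated entries such as "MZA 10-W25" will probably score much lower, so a cutoff score plus a small manual alias list may still be needed for those.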

**Domski** · 11-01-2011, 09:08 AM

Hi,

I've not tried this yet but could be worth a look:

http://www.microsoft.com/download/en....aspx?id=15011

Dom

**mwwoodm** · 11-01-2011, 11:42 AM

Excellent feedback from both MarvinP and Dom. Thanks, guys.
