Does anyone know much about data mining and machine learning? I find both fascinating, but I'm not sure where one ends and the other begins. I've started to play around with data mining by getting frequencies of words and word segments from text, but I know these concepts are already well developed in the IT world. My final goal is to process a vast number of records by having the program screen them all and exclude the ones with the lowest probability of meeting my needs. To accomplish this, I would screen, for example, the first 1000 records by hand, indicating which ones I want and which ones I don't. The program should then analyze the individual words that make up each record and generate a pattern for inclusion vs. exclusion. So if I tended to include every record with the word "car" in it and exclude every record with the word "house", the program could then go through the remaining records and do that for me.
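To make the idea concrete, here is a rough Python sketch of the kind of screening described above. It is only an illustration under assumptions, not a finished method: it swaps in a simplified naive Bayes style scorer that looks only at which words are present in a record, and the function names (`tokenize`, `train`, `score`) and the four sample records are made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a record and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_records):
    """Count in how many wanted / unwanted records each word appears.

    labeled_records: iterable of (text, wanted) pairs, where wanted is
    True for records kept by hand and False for records rejected.
    """
    word_counts = {True: Counter(), False: Counter()}
    record_counts = {True: 0, False: 0}
    for text, wanted in labeled_records:
        record_counts[wanted] += 1
        # set() so a word repeated inside one record is counted once
        word_counts[wanted].update(set(tokenize(text)))
    return word_counts, record_counts


def score(text, model):
    """Return log-odds that a record is wanted; positive leans 'include'."""
    word_counts, record_counts = model
    # Start from the overall wanted/unwanted ratio (add-one smoothed).
    total = math.log((record_counts[True] + 1) / (record_counts[False] + 1))
    for word in set(tokenize(text)):
        # Smoothed share of wanted / unwanted records containing the word.
        p_wanted = (word_counts[True][word] + 1) / (record_counts[True] + 2)
        p_unwanted = (word_counts[False][word] + 1) / (record_counts[False] + 2)
        total += math.log(p_wanted / p_unwanted)
    return total


# Hypothetical hand-screened records (stand-ins for the first 1000).
labeled = [
    ("red car for sale", True),
    ("used car low mileage", True),
    ("house for rent", False),
    ("big house with garden", False),
]
model = train(labeled)

car_score = score("fast car", model)       # positive: leans include
house_score = score("small house", model)  # negative: leans exclude
```

The remaining records could then be sorted by score and the lowest-scoring ones excluded. Where to put the cutoff is a judgment call that depends on how costly it is to miss a wanted record.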

Anyone have any experience with these concepts?

abousetta