Hi everyone,
Let me begin by thanking everyone who has contributed to this wonderful community. I truly believe in the power of shared experiences and learning. A special thank you to shg, who has helped me out a lot and shown me avenues that I never thought were even possible with an everyday program like Excel. Keep up the good work, everyone.
OK, now my question... Plagiarism is a problem in many fields, but detecting it is not always simple and straightforward. In a recent thread, shg demonstrated how to compute the Levenshtein distance between two strings, s and t, and use it to measure how similar they are. This works very well, but with large strings it carries a major computing burden. Does anyone know of a way to check two strings for similarity based on their words rather than their characters? Maybe this can be an extension of shg's code that compares words instead of characters... Maybe it needs a completely different approach... I am very interested in learning more ways to compare strings efficiently, but have found very little on the internet on the subject.
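To make the word-based idea a bit more concrete, here is a rough sketch of what I have in mind. It is written in Python rather than VBA, it is not shg's code, and the function names (tokenize, word_levenshtein, word_similarity) are just ones I made up for illustration. It assumes that words can be compared after lowercasing and stripping punctuation, which may well be too crude for real plagiarism checks.

```python
import re


def tokenize(text):
    """Lowercase the text and return its words (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_levenshtein(a, b):
    """Levenshtein distance where one edit is a whole word, not a character."""
    s, t = tokenize(a), tokenize(b)
    # prev[j] holds the distance between the words of s seen so far and t[:j].
    prev = list(range(len(t) + 1))
    for i, s_word in enumerate(s, start=1):
        curr = [i]
        for j, t_word in enumerate(t, start=1):
            cost = 0 if s_word == t_word else 1
            curr.append(min(prev[j] + 1,          # delete a word from s
                            curr[j - 1] + 1,      # insert a word from t
                            prev[j - 1] + cost))  # keep or substitute a word
        prev = curr
    return prev[-1]


def word_similarity(a, b):
    """Score from 0.0 (nothing in common) to 1.0 (same word sequence)."""
    longest = max(len(tokenize(a)), len(tokenize(b)))
    if longest == 0:
        return 1.0  # assumption: two empty texts count as identical
    return 1.0 - word_levenshtein(a, b) / longest


print(word_levenshtein("the quick brown fox", "the quick red fox"))
print(word_similarity("the quick brown fox", "the quick red fox"))
```

Since the table is sized by word counts instead of character counts, it should be much smaller for long texts, but the work still grows with the product of the two word counts, so I am not sure this alone solves the speed problem.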
So the simple question is... how can we detect possible plagiarism between two groups of words, pieces of code, etc.?
All thoughts, ideas and suggestions are welcome.
Thanks.
abousetta