I'm trying to find out the rule for de-duplicating data. I am removing duplicates based on an identification number in a data set of about 6000 records, including the duplicates (some records appear about 4 times). Due to the nature of the data I'm working with, there are only a handful of records that are "true" duplicates, i.e. some of the records appear 4 times but there is a difference in terms of location, etc and some are true duplicates in that there is no difference.
I need to know how Excel removes duplicates - does it only keep the first line that it finds for that identification number?
Also, is there a way that I could create a rule for it to keep the record with the highest rate for example?
Bookmarks