I have over 500,000 data records. The fields of interest are amount, issue date and age. Many records are identical across these three fields, and I want to remove a fixed share of the records in each group of identical records.
For example, if I have 10 records with the same amount, issue date and age, I need to remove 20% of them (i.e. 2 records) so that only 8 records are left. The 20% is the scaling factor.
The code should therefore count the records in each group of identical records and remove 20% of the records in that group.
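To make the logic concrete, here is a rough sketch in Python rather than VBA. It is only an illustration of what I am after, not a finished solution. The function name `scale_down`, the field names and the sample values are made up for the example. Two things are assumptions on my part: when 20% of a group is not a whole number the count is rounded down, and the records removed are simply the first ones met in each group.

```python
import math
from collections import defaultdict

def scale_down(records, factor=0.2, key_fields=("amount", "issue_date", "age")):
    """Return the records left after dropping `factor` of each identical group."""
    # Count how many records share each (amount, issue_date, age) key.
    group_sizes = defaultdict(int)
    for rec in records:
        group_sizes[tuple(rec[f] for f in key_fields)] += 1

    # Number of records to drop per group. Rounding down is an assumption;
    # the tiny epsilon guards against floating-point results like 2.9999999.
    to_remove = {
        key: math.floor(size * factor + 1e-9)
        for key, size in group_sizes.items()
    }

    kept = []
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if to_remove[key] > 0:
            to_remove[key] -= 1   # drop this record (assumption: first ones go)
        else:
            kept.append(rec)
    return kept

# Hypothetical sample: 10 identical records plus a group of 3.
sample = (
    [{"amount": 100, "issue_date": "2015-01-31", "age": 40}] * 10
    + [{"amount": 250, "issue_date": "2015-02-28", "age": 35}] * 3
)
print(len(scale_down(sample)))  # 11: 8 left from the group of 10, all 3 kept
```

With rounding down, groups of fewer than 5 identical records lose nothing, which may or may not be what is wanted for small groups.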
Because of the size of the data I cannot do this manually, so I would really appreciate your help.
Please see the attached sample data. The original data is in columns B to G. After scaling down by 20%, I expect the data to look like columns I to N. Records with the same cell color are identical. There is also a table for checking the results in columns P to T.
Please note that I have also asked for help on MrExcel.com and ozgrid.com.
Thank you in advance,
Evans