I am working on a project that requires me to clean up a dataset. The relevant columns are 'Number', 'End_date' and 'Start_date'. The parts that require some programming are the following:
- Every row in my dataset belongs to a person, identified by the value in 'Number'. However, some people have multiple rows because they performed different functions
- For every person with multiple rows, I would like to end up with a single row
- That row should contain the earliest start date and the latest end date of that person
For example, if the person with number 4 has start dates 01-01-1990 and 01-01-2000, and end dates 01-01-2000 and 01-01-2010, I only need the start date 01-01-1990 and the end date 01-01-2010. I do not know exactly how to program this myself, so I was wondering whether anyone here could help me. Thanks in advance!
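To make the desired result concrete, here is a minimal sketch of the logic in plain Python. It is only an illustration under assumptions: the post does not say which language or tool the dataset lives in, the dates are assumed to be day-month-year strings, and the function name `collapse_rows` and the extra person with number 7 are hypothetical.

```python
from datetime import datetime

# Assumption: dates are written as day-month-year, e.g. 01-01-1990.
DATE_FORMAT = "%d-%m-%Y"


def collapse_rows(rows):
    """Return one row per 'Number' with the earliest start and latest end date."""
    merged = {}  # Number -> [earliest start, latest end]
    for row in rows:
        number = row["Number"]
        start = datetime.strptime(row["Start_date"], DATE_FORMAT)
        end = datetime.strptime(row["End_date"], DATE_FORMAT)
        if number not in merged:
            merged[number] = [start, end]
        else:
            # Compare as dates, not as strings, so the ordering is chronological.
            merged[number][0] = min(merged[number][0], start)
            merged[number][1] = max(merged[number][1], end)
    return [
        {
            "Number": number,
            "Start_date": start.strftime(DATE_FORMAT),
            "End_date": end.strftime(DATE_FORMAT),
        }
        for number, (start, end) in merged.items()
    ]


# Hypothetical sample data: person 4 from the example, plus a person with one row.
rows = [
    {"Number": 4, "Start_date": "01-01-1990", "End_date": "01-01-2000"},
    {"Number": 4, "Start_date": "01-01-2000", "End_date": "01-01-2010"},
    {"Number": 7, "Start_date": "01-06-1995", "End_date": "01-06-1999"},
]
result = collapse_rows(rows)
```

For person 4 this keeps only 01-01-1990 and 01-01-2010, and people with a single row pass through unchanged. If the data were in a pandas DataFrame with the date columns already parsed as dates, the same idea would probably be a `groupby("Number")` followed by `min` on 'Start_date' and `max` on 'End_date'.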