Dear all,

A little intro: my name is Robert, I study Computer Science and have an interest in Excel. I'm currently doing a summer job, and someone in the IT department here is stuck, got wind of my interest and has asked me for help. However, it is above my knowledge level, so I was wondering if anyone on here could point me in the right direction.

Basically, the company uses an online SQL interface that a group of researchers use to record details of certain forums, blogs etc. These records are imported into the Excel sheet in question.

The data is organised under several headings, including the type of website (forum, blog etc.), a preset ID (in numerical order), Parent ID, URL etc.

The problem with the data is that any URLs that have the same domain need to have the same parent ID, but currently do not. For example:

www.abc.blogspot.com

www.123.blogspot.com

These two URLs have the same domain (blogspot) but currently have different parent IDs. I need to figure out a way of processing the data (just once, it will not be updated) so that all URLs with the same domain end up with the same parent ID. At the moment, I am assuming the parent ID to use is the one on the first instance of each domain, so subsequent instances should be assigned that parent ID.
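To make the rule concrete, here is a rough sketch of what I think needs to happen, written in Python rather than Excel simply because that is what I know from my course. It is only an illustration under my own assumptions: the function names and the (ID, parent ID, URL) row layout are made up, and I am treating the "domain" as the part of the host name just before the final suffix (so "blogspot" in the examples above), which would not work for suffixes like .co.uk.

```python
from urllib.parse import urlparse


def domain_key(url):
    """Return a naive 'domain' for a URL, e.g. 'blogspot' for www.abc.blogspot.com."""
    # urlparse only recognises the host when a scheme is present, so add one if missing.
    if "://" not in url:
        url = "http://" + url
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    # Assumption: the suffix is a single label such as .com;
    # multi-part suffixes like .co.uk would need extra handling.
    return labels[-2] if len(labels) >= 2 else host


def assign_parent_ids(rows):
    """rows: (id, parent_id, url) tuples in sheet order (hypothetical layout).

    Every row gets the parent ID of the first row seen with the same domain.
    """
    first_parent = {}
    result = []
    for row_id, parent_id, url in rows:
        key = domain_key(url)
        # setdefault stores the first parent ID seen for this domain
        # and returns that stored value for every later row.
        parent = first_parent.setdefault(key, parent_id)
        result.append((row_id, parent, url))
    return result


# Made-up sample rows, purely for illustration.
rows = [
    (1, 10, "www.abc.blogspot.com"),
    (2, 11, "www.123.blogspot.com"),
    (3, 12, "www.example.com"),
]
fixed = assign_parent_ids(rows)
```

With these sample rows, the second row would take parent ID 10 from the first, and the third would keep its own. Because it is a single pass with a lookup table, the amount of data should not be a problem, but I do not know how best to do the equivalent inside Excel.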

For the record, there are close to 14,000 rows of URLs that need to be processed, with approx. 2,000 distinct domains.

I have looked into searching for text within text etc., which would be fine if there weren't so much data.

I appreciate that this is a massive shot in the dark, and I probably haven't explained it very well, but I would really appreciate it if someone could point me in the right direction, so that at least my research is concentrated in the right area! Of course my bosses wanted it done yesterday (I was only told about it this morning!)

Many thanks,

Robert