Hey,
Here's what I'm trying to do: basically I have a list of subdomains from which I want to extract the root domain.
Up until now I've used the following steps:
- Find the last "." occurrence and retrieve what's to the right of it (basically the TLD): "something.example.com" would retrieve ".com"
- Repeat step 1 for the string without the result in step 1: "something.example" would retrieve "example"
- Concatenate step 2 and step 1 to get root domain: "example.com"
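In Python, the three steps above would look roughly like this. It's only a sketch of the approach, and the function name naive_root_domain is one I made up for illustration:

```python
def naive_root_domain(hostname):
    # Step 1: treat everything after the last "." as the TLD.
    rest, _, tld = hostname.rpartition(".")
    # Step 2: repeat on what is left to get the label just before the TLD.
    _, _, name = rest.rpartition(".")
    # Step 3: concatenate the two pieces to form the root domain.
    return name + "." + tld


print(naive_root_domain("something.example.com"))
```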
Here's the problem I've encountered using this method (it's in step 1): for "something.example.co.uk", step 1 only picks up ".uk" instead of ".co.uk", so the result is "co.uk" rather than "example.co.uk".
I'm not sure if there's a quick way to fix that, other than actually using a list of possible TLDs (including second-level ones) and checking whether my string ends in one of the TLDs from the list.
Here's an example:
Subdomain               Result    TLD List
something.example.com   .com      .com
abc.letters.co.uk       .co.uk    .org
www.comparison.org      .org      .net
                                  .co.uk
                                  etc.
I want to stress that the string must end with the TLD, not just contain it. Otherwise, for the third example, it might retrieve ".com" from ".comparison" instead of ".org".
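For reference, the list-based check I have in mind would be something like the sketch below. The TLDS list is just a tiny sample, and root_domain is a made-up name. Longer suffixes are tried first, so ".co.uk" would win over a shorter entry such as ".uk" if both were in the list:

```python
# Sample list only (an assumption for illustration); a real list would be much longer.
TLDS = [".com", ".org", ".net", ".co.uk"]


def root_domain(hostname, tlds=TLDS):
    # Check the longest suffixes first so multi-part TLDs take priority.
    for tld in sorted(tlds, key=len, reverse=True):
        # endswith() only matches at the end of the string,
        # so ".com" does not match inside ".comparison.org".
        if hostname.endswith(tld):
            rest = hostname[: -len(tld)]
            # Keep only the label immediately before the TLD.
            name = rest.rpartition(".")[2]
            return name + tld
    # No known TLD matched.
    return None


for host in ["something.example.com", "abc.letters.co.uk", "www.comparison.org"]:
    print(host, "->", root_domain(host))
```

Anything that doesn't end in one of the listed TLDs comes back as None, which is exactly why I'd rather not depend on keeping such a list complete.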
So, any ideas on what I can use to solve this (if there's any way I can avoid using a TLD list, that'd be best)?
Cheers!