Re: extracting the root domain from a URL [message #171667 is a reply to message #171654] |
Fri, 14 January 2011 23:54 |
Thomas 'PointedEars'
Jonathan Stein wrote:
> On 13-01-2011 22:50, Mike wrote:
>> Given any valid URL, I'd like to extract the root domain like this:
>>
>> http://www.site.com = site.com
>> http://xxx.yyy.site.com = site.com
>> http://subdomain.site.com = site.com
>> http://www.site.com.tw = site.com.tw
>> http://xxx.yyy.site.com.asia = site.com.asia
>> http://subdomain.site.com.af = site.com.af
>
> If you need a fast lookup, you'll probably need to maintain a database
> with rules for each TLD you intend to support.
>
> Otherwise you could go for a series of "whois" lookups. If whois
> succeeds for "site.com.af" but fails for "subdomain.site.com.af", then
> "site.com.af" was probably what you were looking for.
WHOIS would be overkill here, and it is no longer universally supported
(for example, DENIC dropped WHOIS support a few years ago, except via their
website, because of misuse), so you would get false positives.
The proper Internet service to use here is DNS itself, of course. You would
make a connection to port 53/udp on a nameserver that does recursive DNS
lookups (unless you want to use the local host's DNS configuration) and
request the `A' (IPv4) or `AAAA' (IPv6) resource record for a part of the
domain-part, starting with its rightmost components. If the query fails,
prepend the next sub-level component and repeat until the query is
successful or the full domain-part is reached.
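A minimal sketch of that loop in Python, under a few assumptions that are not
part of the description above: it goes through the local host's resolver
(socket.getaddrinfo) instead of talking to port 53/udp on a chosen nameserver,
it starts with the two rightmost components, and the names resolves() and
root_domain() are made up for illustration. Note that a registered domain need
not have an address record at all, so treat the result as a guess.

```python
import socket
from urllib.parse import urlsplit


def resolves(name):
    """Return True if the local resolver finds an address for name."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False


def root_domain(url, lookup=resolves):
    """Return the shortest right-hand part of the host name that resolves.

    `lookup` is a hypothetical hook so that the DNS query can be replaced,
    for example by a direct query to a recursive nameserver.
    """
    host = urlsplit(url).hostname
    if not host:
        return None
    labels = host.rstrip(".").split(".")
    # Assumption: start with the two rightmost components (e.g. "com.tw"),
    # then prepend one sub-level component per round.
    for start in range(max(len(labels) - 2, 0), -1, -1):
        candidate = ".".join(labels[start:])
        if lookup(candidate):
            return candidate
    # Nothing resolved, not even the full domain-part.
    return None
```

Calling root_domain("http://www.site.com.tw") would then query "com.tw",
"site.com.tw" and "www.site.com.tw" in that order and return the first name
for which the lookup succeeds.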
PointedEars
--
Use any version of Microsoft Frontpage to create your site.
(This won't prevent people from viewing your source, but no one
will want to steal it.)
-- from <http://www.vortex-webdesign.com/help/hidesource.htm> (404-comp.)