tld¶
Extract the top level domain (TLD) from the URL given. The list of TLD names is taken from Mozilla.
Optionally raises exceptions on non-existing TLDs or silently fails (if the fail_silently argument is set to True).
Knows about active and inactive TLDs.
If only active TLDs are to be matched against, set the active_only argument to True (default: False).
Prerequisites¶
- Python 2.6, 2.7, 3.4, 3.5, 3.6 and PyPy
Installation¶
Latest stable version on PyPI:
pip install tld
Or latest stable version from GitHub:
pip install https://github.com/barseghyanartur/tld/archive/stable.tar.gz
Or latest stable version from BitBucket:
pip install https://bitbucket.org/barseghyanartur/tld/get/stable.tar.gz
Usage examples¶
Get the TLD name as string from the URL given:
from tld import get_tld
get_tld("http://www.google.co.uk")
# 'google.co.uk'
get_tld("http://www.google.idontexist", fail_silently=True)
# None
If you wish, you could get the TLD as an object:
from tld import get_tld
res = get_tld("http://some.subdomain.google.co.uk", as_object=True)
res
# 'google.co.uk'
res.subdomain
# 'some.subdomain'
res.domain
# 'google'
res.suffix
# 'co.uk'
res.tld
# 'google.co.uk'
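To make the relationship between the subdomain, domain, suffix and tld properties concrete, here is a minimal sketch of suffix-based splitting. It is not the library's implementation: the SUFFIXES set and the split_host helper are hypothetical names introduced for illustration, and the real Mozilla list is far larger and contains wildcard and exception rules that this sketch ignores.

```python
from urllib.parse import urlparse

# Hypothetical, tiny stand-in for the Mozilla suffix list (assumption).
SUFFIXES = {"uk", "co.uk", "com"}


def split_host(url):
    """Split a URL's host into (subdomain, domain, suffix).

    The longest suffix found in SUFFIXES wins.
    """
    host = urlparse(url).hostname
    if host is None:
        raise ValueError("no host in URL: %r" % url)
    labels = host.split(".")
    # Longest candidate suffix first; start at 1 so that at least one
    # label is left over to serve as the domain.
    for i in range(1, len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in SUFFIXES:
            domain = labels[i - 1]
            subdomain = ".".join(labels[:i - 1])
            return subdomain, domain, suffix
    raise LookupError("no known suffix for host: %r" % host)


subdomain, domain, suffix = split_host("http://some.subdomain.google.co.uk")
print(subdomain)              # some.subdomain
print(domain)                 # google
print(suffix)                 # co.uk
print(domain + "." + suffix)  # google.co.uk
```

Trying the longest candidate first is what lets co.uk win over uk, so the domain comes out as google rather than co.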
Get TLD name, ignoring the missing protocol:
from tld import get_tld
get_tld("www.google.co.uk", fix_protocol=True)
# 'google.co.uk'
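The reason a missing protocol matters can be seen with the standard library alone. The sketch below is an illustration, not the library's code: with_scheme is a hypothetical helper, and it only handles the missing-protocol case.

```python
from urllib.parse import urlparse


def with_scheme(url, default_scheme="https"):
    """Return url unchanged if it already starts with http:// or https://,
    otherwise prepend default_scheme (hypothetical helper)."""
    if url.startswith(("http://", "https://")):
        return url
    return "%s://%s" % (default_scheme, url.lstrip("/"))


# Without a scheme, urlparse treats the whole string as a path,
# so no host name is found.
print(urlparse("www.google.co.uk").hostname)               # None
# With a scheme prepended, the host name is parsed as expected.
print(urlparse(with_scheme("www.google.co.uk")).hostname)  # www.google.co.uk
```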
Update the list of TLD names¶
To update/sync the tld names with the most recent version run the following from your terminal:
update-tld-names
Or simply do:
from tld.utils import update_tld_names
update_tld_names()
Troubleshooting¶
If somehow domain names listed here are not recognised, make sure you have the most recent version of TLD names in your virtual environment:
update-tld-names
License¶
MPL 1.1/GPL 2.0/LGPL 2.1
Author¶
Artur Barseghyan <artur.barseghyan@gmail.com>
Documentation!¶
Contents:
tld package¶
Subpackages¶
Submodules¶
tld.conf module¶
tld.defaults module¶
tld.exceptions module¶
exception tld.exceptions.TldBadUrl(url)¶
Bases: exceptions.ValueError
TldBadUrl.
Supposed to be thrown when a bad URL is given.
tld.test module¶
tld.utils module¶
tld.utils.get_tld(url, active_only=False, fail_silently=False, as_object=False, fix_protocol=False)¶
Extract the top level domain.
Extract the top level domain based on Mozilla's effective TLD names dat file. Returns a string. May throw TldBadUrl or TldDomainNotFound exceptions if a bad URL is provided or no TLD match is found, respectively.
Parameters:
- url (str) – URL to get top level domain from.
- active_only (bool) – If set to True, only active patterns are matched.
- fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.
- as_object (bool) – If set to True, a tld.utils.Result object is returned, with domain, suffix and tld properties.
- fix_protocol (bool) – If set to True, a missing or wrong protocol is ignored (https is prepended instead).
Returns: String with top level domain (if the as_object argument is set to False) or a tld.utils.Result object (if the as_object argument is set to True); returns None on failure.
Return type: str
tld.utils.get_tld_names(fail_silently=False, retry_count=0)¶
Build the tlds list if empty. Recursive.
Parameters:
- fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.
- retry_count (int) – If greater than 1, we raise an exception in order to avoid infinite loops.
Returns: List of TLD names
Type: iterable
Module contents¶
tld.get_tld(url, active_only=False, fail_silently=False, as_object=False, fix_protocol=False)¶
Extract the top level domain.
Extract the top level domain based on Mozilla's effective TLD names dat file. Returns a string. May throw TldBadUrl or TldDomainNotFound exceptions if a bad URL is provided or no TLD match is found, respectively.
Parameters:
- url (str) – URL to get top level domain from.
- active_only (bool) – If set to True, only active patterns are matched.
- fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.
- as_object (bool) – If set to True, a tld.utils.Result object is returned, with domain, suffix and tld properties.
- fix_protocol (bool) – If set to True, a missing or wrong protocol is ignored (https is prepended instead).
Returns: String with top level domain (if the as_object argument is set to False) or a tld.utils.Result object (if the as_object argument is set to True); returns None on failure.
Return type: str
tld.get_tld_names(fail_silently=False, retry_count=0)¶
Build the tlds list if empty. Recursive.
Parameters:
- fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.
- retry_count (int) – If greater than 1, we raise an exception in order to avoid infinite loops.
Returns: List of TLD names
Type: iterable