tld

Extracts the top-level domain (TLD) from the given URL. The list of TLD names is taken from Mozilla.

By default, an exception is raised for a non-existent TLD; if the fail_silently argument is set to True, the lookup fails silently instead. The library distinguishes between active and inactive TLDs. To match against active TLDs only, set the active_only argument to True (default: False).

Prerequisites

  • Python 2.6, 2.7, 3.4, 3.5, 3.6 and PyPy

Installation

Latest stable version on PyPI:

pip install tld

Or latest stable version from GitHub:

pip install https://github.com/barseghyanartur/tld/archive/stable.tar.gz

Or latest stable version from BitBucket:

pip install https://bitbucket.org/barseghyanartur/tld/get/stable.tar.gz

Usage examples

Get the TLD name as a string from the given URL:

from tld import get_tld

get_tld("http://www.google.co.uk")
# 'google.co.uk'

get_tld("http://www.google.idontexist", fail_silently=True)
# None
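
The lookup behaviour shown above can be pictured with a small standalone sketch. It is not the library's implementation: the suffix set, the DomainNotFound class and the extract_registered_domain function are hypothetical names introduced for illustration, and the code assumes Python 3 (urllib.parse).

```python
from urllib.parse import urlparse

# Tiny illustrative suffix set; the real list taken from Mozilla is far larger.
SUFFIXES = {"com", "uk", "co.uk"}


class DomainNotFound(ValueError):
    """Hypothetical stand-in for the error raised when no suffix matches."""


def extract_registered_domain(url, fail_silently=False):
    """Return the domain plus its suffix, or None/raise when nothing matches."""
    host = urlparse(url).hostname
    parts = host.split(".") if host else []
    # Try the longest candidate suffix first; starting at index 1 always
    # leaves at least one label in front of the suffix for the domain.
    for i in range(1, len(parts)):
        if ".".join(parts[i:]) in SUFFIXES:
            return ".".join(parts[i - 1:])
    if fail_silently:
        return None
    raise DomainNotFound(url)


print(extract_registered_domain("http://www.google.co.uk"))
print(extract_registered_domain("http://www.google.idontexist", fail_silently=True))
```

Trying the longest candidate first is what lets "co.uk" win over "uk" in this sketch.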

Alternatively, get the TLD as an object:

from tld import get_tld

res = get_tld("http://some.subdomain.google.co.uk", as_object=True)

res
# 'google.co.uk'

res.subdomain
# 'some.subdomain'

res.domain
# 'google'

res.suffix
# 'co.uk'

res.tld
# 'google.co.uk'
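
To make the relationship between these attributes concrete, here is a minimal sketch of a container with the same shape. ResultSketch is a hypothetical class written for this illustration, not the library's Result; it only assumes that tld is the domain and suffix joined with a dot, as the outputs above suggest.

```python
class ResultSketch(object):
    """Illustrative container; not the library's Result class."""

    def __init__(self, subdomain, domain, suffix):
        self.subdomain = subdomain
        self.domain = domain
        self.suffix = suffix

    @property
    def extension(self):
        # Alias of suffix, as described in the API reference.
        return self.suffix

    @property
    def tld(self):
        # Domain and suffix joined with a dot.
        return "{0}.{1}".format(self.domain, self.suffix)

    def __str__(self):
        return self.tld


res = ResultSketch("some.subdomain", "google", "co.uk")
print(res.tld)
print(res.extension)
```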

Get TLD name, ignoring the missing protocol:

from tld import get_tld

get_tld("www.google.co.uk", fix_protocol=True)
# 'google.co.uk'
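
Why a missing protocol matters can be seen with the standard library alone. The sketch below assumes Python 3; ensure_protocol is a hypothetical helper that only covers the missing-protocol case and is not the library's own code.

```python
from urllib.parse import urlparse


def ensure_protocol(url):
    """Prepend https:// when no http(s) scheme is present (hypothetical helper)."""
    if url.startswith(("http://", "https://")):
        return url
    return "https://" + url


bare = "www.google.co.uk"
# Without a scheme, urlparse treats the whole string as a path, so no host is found.
host_before = urlparse(bare).hostname
# With the scheme added, the host is parsed as expected.
host_after = urlparse(ensure_protocol(bare)).hostname
print(host_before, host_after)
```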

Update the list of TLD names

To update (sync) the TLD names to the most recent version, run the following from your terminal:

update-tld-names

Or simply do:

from tld.utils import update_tld_names

update_tld_names()

Troubleshooting

If domain names listed in Mozilla's effective TLD names file are not recognised, make sure you have the most recent version of the TLD names in your virtual environment:

update-tld-names

Testing

Simply type:

./runtests.py

Or use tox:

tox

Or use tox to check specific env:

tox -e py36

License

MPL 1.1/GPL 2.0/LGPL 2.1

Support

For any issues contact me at the e-mail given in the Author section.

Author

Artur Barseghyan <artur.barseghyan@gmail.com>

Documentation

Contents:

tld package

Subpackages

tld.commands package
Submodules
tld.commands.update_tld_names module
tld.commands.update_tld_names.main()

Updates TLD names.

Example: python src/tld/commands/update_tld_names.py
Module contents

Submodules

tld.conf module

tld.defaults module

tld.exceptions module

exception tld.exceptions.TldBadUrl(url)

Bases: exceptions.ValueError

TldBadUrl.

Raised when a bad URL is given.

exception tld.exceptions.TldDomainNotFound(domain_name)

Bases: exceptions.ValueError

TldDomainNotFound.

Raised when the domain name is not found in (does not match) the local TLD policy.

exception tld.exceptions.TldIOError(msg=None)

Bases: exceptions.IOError

TldIOError.

Raised when problems with reading or writing occur.
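
The base classes listed above determine which except clause catches what. The sketch below uses local stand-in classes that mirror the documented bases so that it runs without the package installed; with tld installed, the real classes would be imported from tld.exceptions instead. The classify function is a hypothetical helper.

```python
# Stand-ins mirroring the documented base classes (assumption: only the
# inheritance matters for this illustration).
class TldBadUrl(ValueError):
    pass


class TldDomainNotFound(ValueError):
    pass


class TldIOError(IOError):
    pass


def classify(exc):
    """Return a short label for an exception raised during extraction."""
    if isinstance(exc, ValueError):
        # Covers both TldBadUrl and TldDomainNotFound.
        return "bad input"
    if isinstance(exc, IOError):
        return "read/write problem"
    return "unexpected"


print(classify(TldBadUrl("not a url")))
print(classify(TldIOError("cannot read names file")))
```

A single except ValueError clause therefore handles both URL-related errors, while read/write problems need their own clause.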

tld.helpers module

tld.helpers.project_dir(base)

Project dir.

tld.helpers.PROJECT_DIR(base)

Project dir.

tld.test module

tld.utils module

tld.utils.get_tld(url, active_only=False, fail_silently=False, as_object=False, fix_protocol=False)

Extract the top level domain.

Extract the top-level domain based on Mozilla's effective TLD names dat file. Returns a string by default. May raise TldBadUrl if a bad URL is provided, or TldDomainNotFound if no TLD match is found.

Parameters:
  • url (str) – URL to get top level domain from.
  • active_only (bool) – If set to True, only active patterns are matched.
  • fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.
  • as_object (bool) – If set to True, a tld.utils.Result object is returned, with domain, suffix and tld properties.
  • fix_protocol (bool) – If set to True, a missing or wrong protocol is ignored (https is prepended instead).
Returns:

String with top level domain (if as_object argument is set to False) or a tld.utils.Result object (if as_object argument is set to True); returns None on failure.

Return type:

str

tld.utils.get_tld_names(fail_silently=False, retry_count=0)

Build the tlds list if empty. Recursive.

Parameters:
  • fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.
  • retry_count (int) – If greater than 1, an exception is raised to avoid infinite loops.
Returns:

List of TLD names

Return type:

iterable
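
The role of retry_count can be illustrated with a generic sketch of a recursive loader that refreshes its data once and then gives up. This is an assumption about the general pattern, not the library's source; load_names, read_names, refresh and the cache argument are hypothetical names.

```python
def load_names(read_names, refresh, cache, retry_count=0):
    """Return cached names, refreshing and retrying if a read fails."""
    if cache:
        return cache
    if retry_count > 1:
        # Guard against endless recursion when refreshing does not help.
        raise IOError("names still unavailable after retrying")
    try:
        cache.extend(read_names())
    except IOError:
        refresh()
        return load_names(read_names, refresh, cache, retry_count + 1)
    return cache


# Simulate a missing local copy that appears after one refresh.
state = {"fetched": False}


def read_names():
    if not state["fetched"]:
        raise IOError("no local copy")
    return ["com", "co.uk"]


def refresh():
    state["fetched"] = True


names = load_names(read_names, refresh, [])
print(names)
```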

class tld.utils.Result(subdomain, domain, suffix)

Bases: object

Container.

domain

extension

Alias of suffix. Returns a str.

subdomain

suffix

tld

TLD.

tld.utils.update_tld_names(fail_silently=False)

Update the local copy of TLDs file.

Parameters:
  • fail_silently (bool) – If set to True, no exception is raised on failure; False is returned instead.
Returns:

True on success, False on failure.

Return type:

bool

Module contents

tld.get_tld(url, active_only=False, fail_silently=False, as_object=False, fix_protocol=False)

Extract the top level domain.

Extract the top-level domain based on Mozilla's effective TLD names dat file. Returns a string by default. May raise TldBadUrl if a bad URL is provided, or TldDomainNotFound if no TLD match is found.

Parameters:
  • url (str) – URL to get top level domain from.
  • active_only (bool) – If set to True, only active patterns are matched.
  • fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.
  • as_object (bool) – If set to True, a tld.utils.Result object is returned, with domain, suffix and tld properties.
  • fix_protocol (bool) – If set to True, a missing or wrong protocol is ignored (https is prepended instead).
Returns:

String with top level domain (if as_object argument is set to False) or a tld.utils.Result object (if as_object argument is set to True); returns None on failure.

Return type:

str

tld.get_tld_names(fail_silently=False, retry_count=0)

Build the tlds list if empty. Recursive.

Parameters:
  • fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.
  • retry_count (int) – If greater than 1, an exception is raised to avoid infinite loops.
Returns:

List of TLD names

Return type:

iterable

class tld.Result(subdomain, domain, suffix)

Bases: object

Container.

domain

extension

Alias of suffix. Returns a str.

subdomain

suffix

tld

TLD.

tld.update_tld_names(fail_silently=False)

Update the local copy of TLDs file.

Parameters:
  • fail_silently (bool) – If set to True, no exception is raised on failure; False is returned instead.
Returns:

True on success, False on failure.

Return type:

bool
