How to use the tld.exceptions.TldBadUrl exception in tld

To help you get started, we’ve selected a few tld examples, based on popular ways it is used in public projects.


GitHub: j0lv3r4 / dnscheck / index.py
def post_auth_ns():
    domain = request.forms.get("domain")

    if domain is None:
        return jsonify(status=400, message="Param domain missing.")

    try:
        nameserver_list = get_authoritative_nameserver(domain)
    except exceptions.TldDomainNotFound as e:
        print("TldDomainNotFound", e)
        return jsonify(status=400, message=str(e))
    except exceptions.TldBadUrl as e:
        print("TldBadUrl", e)
        return jsonify(status=400, message=str(e))

    return jsonify(status=200, message=nameserver_list)
GitHub: creativecommons / cccatalog-api / ingestion_server / ingestion_server / cleanup.py
def cleanup_url(url, tls_support):
    """
    Add protocols to the URI if they are missing, else return None.
    """
    parsed = urlparse(url)
    if parsed.scheme == '':
        try:
            _tld = get_tld('https://' + url, as_object=True)
            _tld = _tld.subdomain + '.' + _tld.domain + '.' + _tld.tld
            _tld = str(_tld)
        except TldBadUrl:
            _tld = 'unknown'
            log.info('Failed to parse url {}'.format(url))
        try:
            tls_supported = tls_support[_tld]
        except KeyError:
            tls_supported = TlsTest.test_tls_supported(url)
            tls_support[_tld] = tls_supported
            log.info('Tested domain {}'.format(_tld))

        if tls_supported:
            return "'https://{}'".format(url)
        else:
            return "'http://{}'".format(url)
    else:
        return None
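
The scheme check in the excerpt above depends on how `urlparse` treats a string with no scheme: the host lands in `path` and `netloc` stays empty. That is presumably why the excerpt prepends `'https://'` before calling `get_tld`. A minimal standard-library sketch of that behaviour follows; `split_host` is a hypothetical helper and the sample URLs are made up.

```python
from urllib.parse import urlparse


def split_host(url):
    """Return (scheme, netloc, path) as urlparse sees the input.

    Hypothetical helper for illustration only.
    """
    parsed = urlparse(url)
    return parsed.scheme, parsed.netloc, parsed.path


# Without a scheme, the whole string is treated as a path.
bare = split_host("example.com/image.jpg")
# With a scheme, the host is recognised as the network location.
full = split_host("https://example.com/image.jpg")

print(bare)  # ('', '', 'example.com/image.jpg')
print(full)  # ('https', 'example.com', '/image.jpg')
```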
GitHub: creativecommons / cccatalog-api / ccbot / crawl_planner / crawl_plan.py
Given a URL dump csv, associate each domain with a provider. This is
    necessary because image CDNs are often not on the same domain as the parent
    website.
    """
    provider_domains = defaultdict(set)
    num_urls = 0
    with open(filename, 'r') as url_file:
        reader = csv.DictReader(url_file)
        for row in reader:
            url = row['url']
            # Parse domain and TLD from the URL.
            try:
                parsed = get_tld(url, as_object=True)
                parsed = parsed.domain + '.' + parsed.tld
                parsed = str(parsed)
            except TldBadUrl:
                log.warning('Ignoring malformed url {}'.format(url))
                continue
            provider = row['provider']
            provider_domains[provider].add(parsed)
            num_urls += 1
    return provider_domains, num_urls

tld

Extract the top-level domain (TLD) from the URL given.

MPL-1.1 OR GPL-2.0-only OR LG…
Latest version published 1 year ago
