For example, suppose we want to extract the domain kingname.info from the URL https://www.kingname.info/2020/10/02/copy-from-ssh/.
Some people might write code like this:
url = 'https://www.kingname.info/2020/10/02/copy-from-ssh/'
domain = '.'.join(url.split('/')[2].split('.')[1:])
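The result is shown in the figure below:
[Figure: screenshot of the run]
However, this approach does not generalize. The URL may not start with https://. The host may be blog.exercise.kingname.com.cn, from which we want kingname.com.cn. Or we may want only kingname from kingname.info, or only google from google.com.hk.
Writing extraction rules by hand for all of these requirements would be very tedious.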
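To make these failure cases concrete, here is a small sketch that wraps the one-liner above in a function and runs it on the kinds of input just mentioned. The function name naive_domain and the sample URLs without a path are illustrative assumptions, not part of the original code.

```python
def naive_domain(url):
    """Hand-written rule from above: take the host, then drop its first label."""
    return '.'.join(url.split('/')[2].split('.')[1:])


# Works only when the host has exactly one subdomain and a one-part suffix.
print(naive_domain('https://www.kingname.info/2020/10/02/copy-from-ssh/'))  # kingname.info

# Two subdomain levels plus a two-part suffix: only the first label is dropped.
print(naive_domain('https://blog.exercise.kingname.com.cn/'))  # exercise.kingname.com.cn

# No subdomain at all: the domain itself is dropped.
print(naive_domain('https://kingname.info/'))  # info

# No protocol: index 2 is a path segment, so the result is an empty string.
print(repr(naive_domain('www.kingname.info/2020/10/02/copy-from-ssh/')))  # ''
```

Each of these cases would need its own special rule, which is exactly the kind of work a maintained suffix list takes off our hands.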
Instead, we can use the third-party library tld.
Let's install it first:
python3 -m pip install tld
Once it is installed, let's see how to use it:
>>> url = 'https://www.kingname.info/2020/10/02/copy-from-ssh/'
>>> from tld import get_tld
>>> result = get_tld(url, as_object=True)
>>> domain = result.domain
>>> print(domain)
kingname
>>> domain_with_suffix = result.fld
>>> print(domain_with_suffix)
kingname.info
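Because as_object=True is passed, get_tld returns an object. Its .domain attribute is the domain name without the suffix, and its .fld attribute is the domain name together with the suffix.
The result is shown in the figure below:
[Figure: screenshot of the run]
If the URL does not start with https, add the parameter fix_protocol=True.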