python 爬虫需要的库

pip install builtwit 
该模块将URL作为参数,下载该URL并对其进行分析,然后返回该网站使
用的技术。下面是使用该模块的-一个例子。
import builtwith
builtwith.parse('http://example.webscraping.com')
{'web-servers': ['Nginx'], 'web-frameworks': ['Web2py', 'Twitter Bootstrap'], 'programming-languages': ['Python'], 'javascript-frameworks': ['jQuery', 'Modernizr', 'jQuery UI']}
寻找网站所有者 pip install python-whois
import whois
print (whois.whois('http://example.webscraping.com/'))
{
  "domain_name": "WEBSCRAPING.COM",
  "registrar": "GoDaddy.com, LLC",
  "whois_server": "whois.godaddy.com",
  "referral_url": null,
  "updated_date": [
    "2013-08-20 08:08:30",
    "2013-08-20 08:08:29"
  ],
  "creation_date": "2004-06-26 18:01:19",
  "expiration_date": "2020-06-26 18:01:19",
  "name_servers": [
    "NS1.WEBFACTION.COM",
    "NS2.WEBFACTION.COM",
    "NS3.WEBFACTION.COM",
    "NS4.WEBFACTION.COM"
  ],
  "status": [
    "clientDeleteProhibited https://icann.org/epp#clientDeleteProhibited",
    "clientRenewProhibited https://icann.org/epp#clientRenewProhibited",
    "clientTransferProhibited https://icann.org/epp#clientTransferProhibited",
    "clientUpdateProhibited https://icann.org/epp#clientUpdateProhibited",
    "clientTransferProhibited http://www.icann.org/epp#clientTransferProhibited",
    "clientUpdateProhibited http://www.icann.org/epp#clientUpdateProhibited",
    "clientRenewProhibited http://www.icann.org/epp#clientRenewProhibited",
    "clientDeleteProhibited http://www.icann.org/epp#clientDeleteProhibited"
  ],
  "emails": "abuse@godaddy.com",
  "dnssec": "unsigned",
  "name": null,
  "org": null,
  "address": null,
  "city": null,
  "state": "Victoria",
  "zipcode": null,
  "country": "AU"
}

 

转载于:https://www.cnblogs.com/liangliangzz/p/10159980.html

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值