Getting the addresses of websites hosted on the same server with Python

Latest recommended article published on 2024-09-11 08:51:43

寒江雪语

Category: python. Tags: python, spider

Article link: https://blog.csdn.net/hjxyshell/article/details/39980647

Note: this program relies on the lookup results of http://s.tool.chinaz.com/same and uses a small Python script to scrape them.

First, run an arbitrary query on the site and capture the traffic to see what request the browser sends.

Then use Python to imitate the form POST and match the results in the returned page with a regular expression.
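
To show the matching step on its own, here is a minimal sketch in Python 3. It uses the same pattern as the script. The `sample` fragment and the helper name `extract_links` are made up for illustration; the fragment is modelled on the markup quoted in the script's comment and is not real output from the site.

```python
import re

# Same pattern as the script: capture the href that follows "</span> <a href='"
LINK_PATTERN = re.compile(r"</span> <a href='(.+?)'")

def extract_links(html):
    """Return every href captured by LINK_PATTERN, in page order."""
    return LINK_PATTERN.findall(html)

# Hypothetical fragment for illustration only, not real site output
sample = (
    "<li><span>1.</span> <a href='http://www.31hzp.com' target=_blank>www.31hzp.com</a></li>"
    "<li><span>2.</span> <a href='http://100ec.cn' target=_blank>100ec.cn</a></li>"
)

print(extract_links(sample))
```

The lazy `(.+?)` stops at the first closing quote, so each match yields only the URL and not the rest of the tag.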

The full script is as follows:

# -*- coding: utf-8 -*- 
import urllib
import urllib2
import re
import sys

#get url in the same ip
def get_url(url):
    #set header info
    headers = {  
               'User-Agent':'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.116 Safari/537.36',
               'Referer': 'http://s.tool.chinaz.com/same'  
               }
    postdata = urllib.urlencode({'s':url})
    req = urllib2.Request('http://s.tool.chinaz.com/same',postdata,headers)
    try:
        result = urllib2.urlopen(req)
    except Exception:
        print 'Failed to open url,you can try again...'
        return
    fweb = result.read()
    # sample of the markup to match: </span> <a href='http://www.31hzp.com'
    pattern = re.compile(r'</span> <a href=\'(.+?)\'')
    match = pattern.findall(fweb)
    # strip characters that are not allowed in file names
    filename = str(url).replace(':', '').replace('\\', '').replace('/', '')
    fp = open(filename+'.txt','w')
    if match:
        for m in match:
            fp.write(m)
            fp.write('\n')
            print m  
    else:
        print 'find nothing...'
    fp.close()
#usage
def usage(name):
    #www.31jmw.com
    print '%s www.xxx.com'%name
    sys.exit(1)
#entry point
if __name__ == '__main__':
    if len(sys.argv) != 2:
        usage(sys.argv[0])
    print 'start...'
    url = sys.argv[1]   # the domain given on the command line
    #print url
    get_url(url)
    print 'end...'
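
The listing above targets Python 2 (`urllib2`, `print` statements). The following is a rough Python 3 sketch of just the request step, and it makes several assumptions: that the endpoint and the `s` form field still behave as they did when the script was written, and that the page decodes as UTF-8. The names `QUERY_URL`, `build_request` and `fetch` are made up for this sketch, and the User-Agent string is shortened.

```python
import urllib.parse
import urllib.request

QUERY_URL = 'http://s.tool.chinaz.com/same'  # endpoint taken from the script above

def build_request(domain):
    """Build (but do not send) the form POST that the script issues."""
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64)',
        'Referer': QUERY_URL,
    }
    # Python 3 requires the POST body to be bytes, hence the encode()
    postdata = urllib.parse.urlencode({'s': domain}).encode('ascii')
    return urllib.request.Request(QUERY_URL, data=postdata, headers=headers)

def fetch(domain, timeout=10):
    """Send the request and return the page text; raises URLError on failure.

    Assumption: the response is UTF-8; undecodable bytes are replaced.
    """
    with urllib.request.urlopen(build_request(domain), timeout=timeout) as resp:
        return resp.read().decode('utf-8', errors='replace')

# Inspect the request without touching the network
req = build_request('www.31jmw.com')
print(req.get_method(), req.data)
```

Keeping the request construction separate from `fetch` means the form encoding can be checked offline; the text returned by `fetch` can then be passed to the same regular expression as in the script.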

The test output is as follows:

F:\mycode\python\pytest\src>ipsamescan.py www.31jmw.com
start...
http://www.31hzp.com
http://100ec.cn
http://ec100.cn
http://toocle.cn
http://www.31jmw.com
http://www.31expo.com
http://www.toocle.cn
http://561288.com
http://www.toocle.com.cn
http://www.31metals.com
http://31expo.com
http://www.100ec.cn
end...
