Python Directory Scanner

Latest recommended article published 2024-07-14 12:31:04

wxlblh

Tags: python, security

Article link: https://blog.csdn.net/weixin_44426869/article/details/104156683


Introduction

Uses Python's sys module to read the command-line arguments;
uses multiple threads to speed up the scan.

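How it works

To detect directories on a site, append each candidate directory path to the site's URL, request the resulting URL, and check the response status code. If the code indicates success (200), a redirect (3XX), or forbidden access (403), the directory probably exists. Reference: HTTP response status codes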
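The check described above can be sketched without any network access. This is a minimal illustration, not the article's code: the names `build_candidate_url` and `probably_exists` are hypothetical, and the sketch treats the whole 3XX range as a hit, as the prose says, whereas the script below lists only 301, 302 and 303.

```python
def build_candidate_url(base_url, path):
    """Join the site URL and a dictionary entry with exactly one slash."""
    return base_url.rstrip('/') + '/' + path.lstrip('/')


def probably_exists(status_code):
    """Return True for 200, any 3XX redirect, or 403."""
    return status_code == 200 or status_code == 403 or 300 <= status_code < 400


# Example: a dictionary entry "/admin/" tested against a site root
candidate = build_candidate_url('https://example.com/', '/admin/')
print(candidate, probably_exists(403), probably_exists(404))
```

One detail worth keeping in mind: requests follows redirects by default, so the status code it reports is usually that of the final response. Passing allow_redirects=False is one way to see the 3XX code itself.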

Implementation

import requests
import warnings
import threading
import random
import queue
import sys
import time

class DHCScan:
    # Raw string so the backslashes in the ASCII art are not read as escape sequences
    logo = r'''
 ____  _    _  ____  ____ 
|  _ \| |  | |/ ___)/  __| ____  ___ _____
| | \ | |__| | |   |  \__ /  __|/ _ |  _  |
| |_/ /  __  | |___ \__  |  |__| |_|| | | |
|____/|_|  |_|\____)____/ \____|\___|_| |_|  
            '''
    help = '''
usage  : python dhcscan.py [-u url] [--type scriptType] [-t threadNum(Default 5)]
example: python dhcscan.py -u https://www.baidu.com --type php
         python dhcscan.py -u https://www.baidu.com --type php -t 5
    '''
    class scanThread(threading.Thread):
        # Status codes that suggest the directory may exist
        code = [200,301,302,303,403]
        def __init__(self, url, userAgent, addUrlQueue):
            threading.Thread.__init__(self)
            self.url = url
            self.userAgent = userAgent
            self.addUrlQueue = addUrlQueue
        # Override run()
        def run(self):
            while True:
                try:
                    # Take one directory name without blocking; empty() followed by
                    # get() can block forever if another thread takes the last item
                    addToUrl = self.addUrlQueue.get_nowait()
                except queue.Empty:
                    break
                # Pick a random User-Agent from the list
                agent = {'User-Agent': random.choice(self.userAgent)}
                url = self.url + addToUrl
                try:
                    # timeout=5 gives up after 5 seconds; verify=False skips certificate verification
                    resp = requests.request(method='GET',headers=agent,url=url,timeout=5,verify=False)
                    if resp.status_code in self.code:
                        print('   %s\t\t%s'%(resp.status_code,url))
                except requests.RequestException:
                    continue  # Timed out or failed; skip this entry
    # Read the User-Agent values from the text file and return them as a list
    def getUserAgent(self):
        agentList = []
        with open(file='./UserAgent.txt',mode='r') as f:
            agents = f.readlines()
        for agent in agents:
            agentList.append(agent.replace('\n',''))
        return agentList

    # Load the directory dictionary for the given type and put every entry into the global dicQueue
    def getDic(self,type):
        with open(file='./%s.txt'%type,mode='r') as f:
            dicList = f.readlines()
        for i in dicList:
            dicQueue.put(i.replace('\n',''))

    # Create the scan threads
    def createThread(self,num,scanUrl,scanUA,scanDic):
        threads = []
        for i in range(int(num)):
            threads.append(DHCScan.scanThread(scanUrl,scanUA,scanDic))
        return threads

if __name__ == '__main__':
    warnings.filterwarnings('ignore')  # Suppress warning output
    threadNum = 5  # Default: 5 threads
    scan = DHCScan()
    print(scan.logo,scan.help)
    dicQueue = queue.Queue()  # Queue of directory names to test
    url = ''
    type = ''
    argvs = sys.argv
    if len(argvs) == 7 and argvs[5] == '-t' and argvs[3] == '--type' and argvs[1] == '-u':
        url = argvs[2]
        type = argvs[4]
        threadNum = argvs[6]
    elif len(argvs) == 5 and argvs[3] == '--type' and argvs[1] == '-u':
        url = argvs[2]
        type = argvs[4]
    else:
        print('ERROR: Please input correct command according to usage')
        sys.exit()
    # Check whether the URL is reachable (verify=False, as in the scan threads)
    try:
        requests.request(method='GET',url=url,timeout=10,verify=False)
    except requests.RequestException:
        print('Check your network or your entry')  # The URL may not exist
        sys.exit()
    scan.getDic(type)
    print('The dic size: %s'%dicQueue.qsize())
    start = time.time()
    agents = scan.getUserAgent()
    threads = scan.createThread(threadNum,'%s/'%url,agents,dicQueue)
    print('Scanning...')
    print('STATUS_CODE\t\tURL')
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print('Scan Over')
    end = time.time()
    print('Use time: %s'%(end-start))

Points that may be unclear

Setting the User-Agent

def getUserAgent(self):
    agentList = []
    with open(file='./UserAgent.txt',mode='r') as f:
        agents = f.readlines()
    for agent in agents:
        agentList.append(agent.replace('\n',''))
    return agentList

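Requests sent with Python's requests library carry a default User-Agent like this one (here from requests 2.22.0):
User-Agent: python-requests/2.22.0
This is easy to detect, so the scanner keeps several different User-Agent strings and picks one at random for every URL it requests.
The method above reads the text file of User-Agent strings into a list. Before each request, the scanner builds agent = {'User-Agent':'%s'%random.choice(self.userAgent)} and passes agent as the headers argument of the request call.
User-Agent reference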
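The same idea can be written as two small standalone functions. This is a sketch under assumptions rather than the article's method: `load_user_agents` and `random_headers` are hypothetical names, the file is assumed to hold one User-Agent per line, and blank lines are skipped so that an empty header is never chosen.

```python
import random


def load_user_agents(path):
    """Read one User-Agent string per line, skipping blank lines."""
    with open(path, mode='r', encoding='utf-8') as f:
        return [line.strip() for line in f if line.strip()]


def random_headers(user_agents):
    """Build a headers dict with a randomly chosen User-Agent."""
    return {'User-Agent': random.choice(user_agents)}
```

A request would then be sent with something like requests.request(method='GET', url=url, headers=random_headers(agents)), where agents is the list returned by load_user_agents('./UserAgent.txt').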

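How to inspect the requests Python sends
The User-Agent shown above was captured with Burp Suite. The requests call accepts a proxies parameter; set it to the address Burp Suite is listening on.
For example, suppose the Burp Suite proxy listener is 127.0.0.1:8080, the address used in the call below.

Pass that address to the request's proxies parameter.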

requests.request(method='GET',url='http://www.4399.com',proxies={'http':'http://127.0.0.1:8080'})

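The request now shows up in Burp Suite.
HTTPS traffic can be captured the same way. HTTPS requests are subject to certificate verification, so set the verify parameter to False to skip it. The proxies dictionary also needs an entry for HTTPS; the modified call looks like this: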

requests.request(method='GET',url='https://www.4399.com',proxies={'http':'http://127.0.0.1:8080','https':'http://127.0.0.1:8080'},verify=False)
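To avoid repeating the proxy address in every call, the arguments can be assembled in one place. The sketch below only builds dictionaries and sends nothing; `burp_proxies` and `request_kwargs` are hypothetical helper names, and 127.0.0.1:8080 is assumed to be the listener address.

```python
def burp_proxies(host='127.0.0.1', port=8080):
    """Return a proxies mapping that routes HTTP and HTTPS through one listener."""
    proxy = 'http://%s:%d' % (host, port)
    return {'http': proxy, 'https': proxy}


def request_kwargs(url, use_proxy=False):
    """Collect keyword arguments for requests.request(); nothing is sent here."""
    kwargs = {'method': 'GET', 'url': url, 'timeout': 5}
    if use_proxy:
        kwargs['proxies'] = burp_proxies()
        # The intercepting proxy presents its own certificate, so skip verification
        kwargs['verify'] = False
    return kwargs
```

With requests imported, the call becomes requests.request(**request_kwargs('https://www.4399.com', use_proxy=True)), and switching the proxy off for a real scan is a one-argument change.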

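What the warnings module is for
Setting verify=False skips certificate verification, but it also makes every request print a warning. Calling warnings.filterwarnings('ignore') suppresses those warnings so that only the scan results are printed.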
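Note that warnings.filterwarnings('ignore') silences every warning, not only the certificate one. A narrower filter is possible with the category argument. The sketch below demonstrates this with the standard library only: `DemoInsecureWarning` is a hypothetical stand-in for the warning raised on unverified HTTPS requests (in a real script the category would be urllib3.exceptions.InsecureRequestWarning), and `noisy_call` and `collect_warnings` exist purely for the demonstration.

```python
import warnings


class DemoInsecureWarning(Warning):
    """Hypothetical stand-in for the warning emitted when verify=False."""


def noisy_call():
    """Emit both the stand-in warning and an unrelated one."""
    warnings.warn('unverified HTTPS request', DemoInsecureWarning)
    warnings.warn('something else worth seeing', UserWarning)


def collect_warnings(ignore_category=None):
    """Run noisy_call() and return the categories of the warnings that surfaced."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        if ignore_category is not None:
            # Added after 'always', so this filter is checked first
            warnings.filterwarnings('ignore', category=ignore_category)
        noisy_call()
    return [w.category for w in caught]
```

Ignoring only the stand-in category leaves the unrelated UserWarning visible, while ignoring the base class Warning hides both, which is what the blanket call in the script does.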

Usage

usage  : python dhcscan.py [-u url] [--type scriptType] [-t threadNum(Default 5)]
example: python dhcscan.py -u https://www.baidu.com --type php
         python dhcscan.py -u https://www.baidu.com --type php -t 5
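The script checks sys.argv by position, so the options must appear in exactly the order shown. As an alternative (not what the article's code does), the same three options could be declared with the standard argparse module, which accepts them in any order and converts the thread count to an integer. `build_parser` and the dest names are hypothetical.

```python
import argparse


def build_parser():
    """Declare the -u, --type and -t options from the usage text."""
    parser = argparse.ArgumentParser(prog='dhcscan.py')
    parser.add_argument('-u', dest='url', required=True,
                        help='target URL, including http:// or https://')
    parser.add_argument('--type', dest='script_type', required=True,
                        help='dictionary name, e.g. php for php.txt')
    parser.add_argument('-t', dest='thread_num', type=int, default=5,
                        help='number of threads (default 5)')
    return parser


# Parse an explicit list here; a script would call parse_args() with no arguments
args = build_parser().parse_args(['-u', 'https://www.baidu.com', '--type', 'php'])
print(args.url, args.script_type, args.thread_num)
```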

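Notes:

The URL must be complete, including the leading http:// or https://;
scriptType is the server-side scripting language (php, asp, and so on). In practice it is just the dictionary name: name the dictionary file php.txt or asp.txt for a targeted scan, or prepare your own dictionary for a specific CMS;
threadNum is the number of threads, 5 by default. With many threads the script tends to hang, although other scanners do not seem to have this problem. I am not sure why and would welcome an explanation;
The text files are expected in the same directory as the .py file (the script opens them relative to the current working directory, so run it from there), or you can edit the source to change their locations.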
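One possible, unconfirmed contributor to hangs with many threads is how a shared queue is drained: checking empty() and then calling a blocking get() is not atomic, so a thread can pass the check, lose the last item to another thread, and then wait forever. The sketch below shows a non-blocking drain using only the standard library. `drain`, `run_workers` and the `handle` callback are hypothetical names, and `handle` stands in for the HTTP request.

```python
import queue
import threading


def drain(work_queue, handle, results, lock):
    """Worker loop: take items until the queue is empty, never blocking."""
    while True:
        try:
            item = work_queue.get_nowait()
        except queue.Empty:
            return
        outcome = handle(item)
        with lock:
            results.append(outcome)


def run_workers(items, handle, thread_num=5):
    """Process every item with thread_num threads and return the results."""
    work_queue = queue.Queue()
    for item in items:
        work_queue.put(item)
    results, lock = [], threading.Lock()
    threads = [threading.Thread(target=drain,
                                args=(work_queue, handle, results, lock))
               for _ in range(thread_num)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because the queue is filled before any thread starts, an Empty exception reliably means the work is done, and every thread exits however many are started. Results arrive in no particular order.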