史上最全的信息收集总结！！

程序员菠萝

已于 2024-01-07 15:42:16 修改

阅读量593

点赞数 1

文章标签：安全 web安全大数据学习网络

于 2023-11-10 11:00:00 首次发布

本文链接：https://blog.csdn.net/xnxqwzy/article/details/134311480

版权

信息收集篇

简介

渗透的本质是信息收集

信息收集也叫做资产收集

信息收集是渗透测试的前期主要工作，是非常重要的环节，收集足够多的信息才能方便接下来的测试，信息收集主要是收集网站的域名信息、子域名信息、目标网站信息、目标网站真实IP、敏感/目录文件、开放端口和中间件信息等等。通过各种渠道和手段尽可能收集到多的关于这个站点的信息，有助于我们更多的去找到渗透点，突破口。

信息收集的分类

服务器的相关信息（真实ip，系统类型，版本，开放端口，WAF等）
网站指纹识别（包括，cms，cdn，证书等） dns记录
whois信息，姓名，备案，邮箱，电话反查（邮箱丢社工库，社工准备等）
子域名收集，旁站，C段等
google hacking针对化搜索，word/电子表格/pdf文件，中间件版本，弱口令扫描等
扫描网站目录结构，爆后台，网站banner，测试文件，备份等敏感文件泄漏等
传输协议，通用漏洞，exp，github源码等

常见的方法有

whois查询

域名在注册的时候需要填入个人或者企业信息
如果没有设置隐藏属性可以查询出来通过备案号查询个人或者企业信息
也可以whois反查注册人邮箱电话机构反查更多得域名和需要得信息。
收集子域名

域名分为根域名和子域名

moonsec.com 根域名顶级域名

www.moonsec.com子域名也叫二级域名

www.wiki.moonsec.com 子域名也叫三级域名四级如此类推
端口扫描

服务器需要开放服务，就必须开启端口，常见的端口是tcp 和udp两种类型

范围 0-65535 通过扫得到的端口，访问服务规划下一步渗透。
查找真实ip

企业的网站，为了提高访问速度，或者避免黑客攻击，用了cdn服务，用了cdn之后真实服务器ip会被隐藏。
探测旁站及C段

旁站:一个服务器上有多个网站通过ip查询服务器上的网站

c段:查找同一个段服务器上的网站。可以找到同样网站的类型和服务器，也可以获取同段服务器进行下一步渗透。
网络空间搜索引擎

通过这些引擎查找网站或者服务器的信息，进行下一步渗透。
扫描敏感目录/文件

通过扫描目录和文件，大致了解网站的结构，获取突破点，比如后台，文件备份，上传点。
指纹识别

获取网站的版本，属于那些cms管理系统，查找漏洞exp，下载cms进行代码审计。

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-9HAGOPDQ-1669458115177)(https://xiaox1ao.github.io/images/信息收集-1664899453316.png)]

在线whois查询

通过whois来对域名信息进行查询，可以查到注册商、注册人、邮箱、DNS解析服务器、注册人联系电话等，因为有些网站信息查得到，有些网站信息查不到，所以推荐以下信息比较全的查询网站，直接输入目标站点即可查询到相关信息。

站长之家域名WHOIS信息查询地址 http://whois.chinaz.com/

爱站网域名WHOIS信息查询地址 https://whois.aizhan.com/

腾讯云域名WHOIS信息查询地址 https://whois.cloud.tencent.com/

美橙互联域名WHOIS信息查询地址 https://whois.cndns.com/

爱名网域名WHOIS信息查询地址 https://www.22.cn/domain/

易名网域名WHOIS信息查询地址 https://whois.ename.net/

中国万网域名WHOIS信息查询地址 https://whois.aliyun.com/

西部数码域名WHOIS信息查询地址 https://whois.west.cn/

新网域名WHOIS信息查询地址 http://whois.xinnet.com/domain/whois/index.jsp

纳网域名WHOIS信息查询地址 http://whois.nawang.cn/

中资源域名WHOIS信息查询地址 https://www.zzy.cn/domain/whois.html

三五互联域名WHOIS信息查询地址 https://cp.35.com/chinese/whois.php

新网互联域名WHOIS信息查询地址http://www.dns.com.cn/show/domain/whois/index.do

国外WHOIS信息查询地址 https://who.is/

在线网站备案查询

网站备案信息是根据国家法律法规规定，由网站所有者向国家有关部门申请的备案，如果需要查询企业备案信息（单位名称、备案编号、网站负责人、电子邮箱、联系电话、法人等），推荐以下网站查询

天眼查https://www.tianyancha.com/
ICP备案查询网http://www.beianbeian.com/
爱站备案查询https://icp.aizhan.com/
域名助手备案信息查询http://cha.fute.com/index

查询绿盟的whois信息

nsfocus.com.cn

Whois查询nsfocus.com.cn

通过反查注册人和邮箱得多更多得域名

{width=“5.75625in” height=“2.192361111111111in”}

收集子域名

子域名作用

收集子域名可以扩大测试范围，同一域名下的二级域名都属于目标范围。

常用方式

子域名中的常见资产类型一般包括办公系统，邮箱系统，论坛，商城，其他管理系统，网站管理后台也有可能出现子域名中。

首先找到目标站点，在官网中可能会找到相关资产（多为办公系统，邮箱系统等），关注一下页面底部，也许有管理后台等收获。

查找目标域名信息的方法有：

FOFA title=“公司名称”
百度 intitle=公司名称
Google intitle=公司名称
站长之家，直接搜索名称或者网站域名即可查看相关信息：http://tool.chinaz.com/
钟馗之眼 site=域名即可https://www.zoomeye.org/

找到官网后，再收集子域名，下面推荐几种子域名收集的方法，直接输入domain即可查询

域名的类型

A记录、别名记录(CNAME)、MX记录、TXT记录、NS记录：

A (Address) 记录：

是用来指定主机名（或域名）对应的IP地址记录。用户可以将该域名下的网站服务器指向到自己的web server上。同时也可以设置您域名的二级域名。

别名(CNAME)记录：

也被称为规范名字。这种记录允许您将多个名字映射到同一台计算机。通常用于同时提供WWW和MAIL服务的计算机。例如，有一台计算机名为"host.mydomain.com"（A记录）。它同时提供WWW和MAIL服务，为了便于用户访问服务。可以为该计算机设置两个别名（CNAME）：WWW和MAIL。这两个别名的全称就是"www.mydomain.com"和"mail.mydomain.com"。实际上他们都指向"host.mydomain.com"。同样的方法可以用于当您拥有多个域名需要指向同一服务器IP，此时您就可以将一个域名做A记录指向服务器IP然后将其他的域名做别名到之前做A记录的域名上，那么当您的服务器IP地址变更时您就可以不必麻烦的一个一个域名更改指向了，只需要更改做A记录的那个域名其他做别名的那些域名的指向也将自动更改到新的IP地址上了。

如何检测CNAME记录？

1、进入命令状态；（开始菜单 - 运行 - CMD[回车]）；

2、输入命令" nslookup -q=cname 这里填写对应的域名或二级域名"，查看返回的结果与设置的是否一致即可。

例如：nslookup -qt=CNAME www.baidu.com

MX（Mail Exchanger）记录：

是邮件交换记录，它指向一个邮件服务器，用于电子邮件系统发邮件时根据收信人的地址后缀来定位邮件服务器。例如，当Internet上的某用户要发一封信给user@mydomain.com时，该用户的邮件系统通过DNS查找mydomain.com这个域名的MX记录，如果MX记录存在，用户计算机就将邮件发送到MX记录所指定的邮件服务器上。

什么是TXT记录？：

TXT记录一般指为某个主机名或域名设置的说明，如：

1）admin IN TXT “jack, mobile:13800138000”；

2）mail IN TXT “邮件主机, 存放在xxx ,管理人：AAA”，Jim IN TXT"contact: abc@mailserver.com"也就是您可以设置 TXT ，以便使别人联系到您。

如何检测TXT记录？

1、进入命令状态；（开始菜单 - 运行 - CMD[回车]）；

2、输入命令" nslookup -q=txt
这里填写对应的域名或二级域名"，查看返回的结果与设置的是否一致即可。

什么是NS记录？

ns记录全称为Name Server
是一种域名服务器记录，用来明确当前你的域名是由哪个DNS服务器来进行解析的。

子域名查询

子域名在线查询1

https://phpinfo.me/domain/

子域名在线查询2

https://www.t1h2ua.cn/tools/

dns侦测

https://dnsdumpster.com/

IP138查询子域名

https://site.ip138.com/moonsec.com/domain.htm

FOFA搜索子域名

https://fofa.info/

语法：domain=“bfreebuf.com”

提示：以上两种方法无需爆破，查询速度快，需要快速收集资产时可以优先使用，后面再用其他方法补充。

Hackertarget查询子域名

https://hackertarget.com/find-dns-host-records/

注意：通过该方法查询子域名可以得到一个目标大概的ip段，接下来可以通过ip来收集信息。

360测绘空间

https://quake.360.cn/

domain:“*.freebuf.com”

Layer子域名挖掘机

SubDomainBrute

pip install aiodns

运行命令

subDomainsBrute.py baidu.com

subDomainsBrute.py baidu.com --full -o baidu 2.txt

Sublist3r

可能太老了，运行有问题，一直没解决，不用这个工具了

好像把sublist3r.py改成下面的就行了

#!/usr/bin/env python
# coding: utf-8
# Sublist3r v1.0
# By Ahmed Aboul-Ela - twitter.com/aboul3la

# modules in standard library
import re
import sys
import os
import argparse
import time
import hashlib
import random
import multiprocessing
import threading
import socket
import json
from collections import Counter

# external modules
from subbrute import subbrute
import dns.resolver
import requests

# Python 2.x and 3.x compatiablity
if sys.version > '3':
    import urllib.parse as urlparse
    import urllib.parse as urllib
else:
    from urllib.parse import urlparse
    import urllib

# In case you cannot install some of the required development packages
# there's also an option to disable the SSL warning:
try:
    #import requests.packages.urllib3
    requests.packages.urllib3.disable_warnings()
except:
    pass

# Check if we are running this on windows platform
is_windows = sys.platform.startswith('win')

# Console Colors
if is_windows:
    # Windows deserves coloring too :D
    G = '\033[92m'  # green
    Y = '\033[93m'  # yellow
    B = '\033[94m'  # blue
    R = '\033[91m'  # red
    W = '\033[0m'   # white
    try:
        import win_unicode_console , colorama
        win_unicode_console.enable()
        colorama.init()
        #Now the unicode will work ^_^
    except:
        print("[!] Error: Coloring libraries not installed, no coloring will be used [Check the readme]")
        G = Y = B = R = W = G = Y = B = R = W = ''

else:
    G = '\033[92m'  # green
    Y = '\033[93m'  # yellow
    B = '\033[94m'  # blue
    R = '\033[91m'  # red
    W = '\033[0m'   # white

def no_color():
    global G, Y, B, R, W
    G = Y = B = R = W = ''

def banner():
    print("""%s
                 ____        _     _ _     _   _____
                / ___| _   _| |__ | (_)___| |_|___ / _ __
                \___ \| | | | '_ \| | / __| __| |_ \| '__|
                 ___) | |_| | |_) | | \__ \ |_ ___) | |
                |____/ \__,_|_.__/|_|_|___/\__|____/|_|%s%s
                # Coded By Ahmed Aboul-Ela - @aboul3la
    """ % (R, W, Y))

def parser_error(errmsg):
    banner()
    print("Usage: python " + sys.argv[0] + " [Options] use -h for help")
    print(R + "Error: " + errmsg + W)
    sys.exit()

def parse_args():
    # parse the arguments
    parser = argparse.ArgumentParser(epilog='\tExample: \r\npython ' + sys.argv[0] + " -d google.com")
    parser.error = parser_error
    parser._optionals.title = "OPTIONS"
    parser.add_argument('-d', '--domain', help="Domain name to enumerate it's subdomains", required=True)
    parser.add_argument('-b', '--bruteforce', help='Enable the subbrute bruteforce module', nargs='?', default=False)
    parser.add_argument('-p', '--ports', help='Scan the found subdomains against specified tcp ports')
    parser.add_argument('-v', '--verbose', help='Enable Verbosity and display results in realtime', nargs='?', default=False)
    parser.add_argument('-t', '--threads', help='Number of threads to use for subbrute bruteforce', type=int, default=30)
    parser.add_argument('-e', '--engines', help='Specify a comma-separated list of search engines')
    parser.add_argument('-o', '--output', help='Save the results to text file')
    parser.add_argument('-n', '--no-color', help='Output without color', default=False, action='store_true')
    return parser.parse_args()

def write_file(filename, subdomains):
    # saving subdomains results to output file
    print("%s[-] Saving results to file: %s%s%s%s" % (Y, W, R, filename, W))
    with open(str(filename), 'wt') as f:
        for subdomain in subdomains:
            f.write(subdomain + os.linesep)

def subdomain_sorting_key(hostname):
    """Sorting key for subdomains
    This sorting key orders subdomains from the top-level domain at the right
    reading left, then moving '^' and 'www' to the top of their group. For
    example, the following list is sorted correctly:
    [
        'example.com',
        'www.example.com',
        'a.example.com',
        'www.a.example.com',
        'b.a.example.com',
        'b.example.com',
        'example.net',
        'www.example.net',
        'a.example.net',
    ]
    """
    parts = hostname.split('.')[::-1]
    if parts[-1] == 'www':
        return parts[:-1], 1
    return parts, 0

class enumratorBase(object):
    def __init__(self, base_url, engine_name, domain, subdomains=None, silent=False, verbose=True):
        subdomains = subdomains or []
        self.domain = urlparse.urlparse(domain).netloc
        self.session = requests.Session()
        self.subdomains = []
        self.timeout = 25
        self.base_url = base_url
        self.engine_name = engine_name
        self.silent = silent
        self.verbose = verbose
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36',
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
            'Accept-Language': 'en-US,en;q=0.8',
            'Accept-Encoding': 'gzip',
        }
        self.print_banner()

    def print_(self, text):
        if not self.silent:
            print(text)
        return

    def print_banner(self):
        """ subclass can override this if they want a fancy banner :)"""
        self.print_(G + "[-] Searching now in %s.." % (self.engine_name) + W)
        return

    def send_req(self, query, page_no=1):

        url = self.base_url.format(query=query, page_no=page_no)
        try:
            resp = self.session.get(url, headers=self.headers, timeout=self.timeout)
        except Exception:
            resp = None
        return self.get_response(resp)

    def get_response(self, response):
        if response is None:
            return 0
        return response.text if hasattr(response, "text") else response.content

    def check_max_subdomains(self, count):
        if self.MAX_DOMAINS == 0:
            return False
        return count >= self.MAX_DOMAINS

    def check_max_pages(self, num):
        if self.MAX_PAGES == 0:
            return False
        return num >= self.MAX_PAGES

    # override
    def extract_domains(self, resp):
        """ chlid class should override this function """
        return

    # override
    def check_response_errors(self, resp):
        """ chlid class should override this function
        The function should return True if there are no errors and False otherwise
        """
        return True

    def should_sleep(self):
        """Some enumrators require sleeping to avoid bot detections like Google enumerator"""
        return

    def generate_query(self):
        """ chlid class should override this function """
        return

    def get_page(self, num):
        """ chlid class that user different pagnation counter should override this function """
        return num + 10

    def enumerate(self, altquery=False):
        flag = True
        page_no = 0
        prev_links = []
        retries = 0

        while flag:
            query = self.generate_query()
            count = query.count(self.domain)  # finding the number of subdomains found so far

            # if they we reached the maximum number of subdomains in search query
            # then we should go over the pages
            if self.check_max_subdomains(count):
                page_no = self.get_page(page_no)

            if self.check_max_pages(page_no):  # maximum pages for Google to avoid getting blocked
                return self.subdomains
            resp = self.send_req(query, page_no)

            # check if there is any error occured
            if not self.check_response_errors(resp):
                return self.subdomains
            links = self.extract_domains(resp)

            # if the previous page hyperlinks was the similar to the current one, then maybe we have reached the last page
            if links == prev_links:
                retries += 1
                page_no = self.get_page(page_no)

        # make another retry maybe it isn't the last page
                if retries >= 3:
                    return self.subdomains

            prev_links = links
            self.should_sleep()

        return self.subdomains

class enumratorBaseThreaded(multiprocessing.Process, enumratorBase):
    def __init__(self, base_url, engine_name, domain, subdomains=None, q=None, silent=False, verbose=True):
        subdomains = subdomains or []
        enumratorBase.__init__(self, base_url, engine_name, domain, subdomains, silent=silent, verbose=verbose)
        multiprocessing.Process.__init__(self)
        self.q = q
        return

    def run(self):
        domain_list = self.enumerate()
        for domain in domain_list:
            self.q.append(domain)

class GoogleEnum(enumratorBaseThreaded):
    def __init__(self, domain, subdomains=None, q=None, silent=False, verbose=True):
        subdomains = subdomains or []
        base_url = "https://google.com/search?q={query}&btnG=Search&hl=en-US&biw=&bih=&gbv=1&start={page_no}&filter=0"
        self.engine_name = "Google"
        self.MAX_DOMAINS = 11
        self.MAX_PAGES = 200
        super(GoogleEnum, self).__init__(base_url, self.engine_name, domain, subdomains, q=q, silent=silent, verbose=verbose)
        self.q = q
        return

    def extract_domains(self, resp):
        links_list = list()
        link_regx = re.compile('<cite.*?>(.*?)<\/cite>')
        try:
            links_list = link_regx.findall(resp)
            for link in links_list:
                link = re.sub('<span.*>', '', link)
                if not link.startswith('http'):
                    link = "http://" + link
                subdomain = urlparse.urlparse(link).netloc
                if subdomain and subdomain not in self.subdomains and subdomain != self.domain:
                    if self.verbose:
                        self.print_("%s%s: %s%s" % (R, self.engine_name, W, subdomain))
                    self.subdomains.append(subdomain.strip())
        except Exception:
            pass
        return links_list

    def check_response_errors(self, resp):
        if (type(resp) is str or type(resp) is str) and 'Our systems have detected unusual traffic' in resp:
            self.print_(R + "[!] Error: Google probably now is blocking our requests" + W)
            self.print_(R + "[~] Finished now the Google Enumeration ..." + W)
            return False
        return True

    def should_sleep(self):
        time.sleep(5)
        return

    def generate_query(self):
        if self.subdomains:
            fmt = 'site:{domain} -www.{domain} -{found}'
            found = ' -'.join(self.subdomains[:self.MAX_DOMAINS - 2])
            query = fmt.format(domain=self.domain, found=found)
        else:
            query = "site:{domain} -www.{domain}".format(domain=self.domain)
        return query

class YahooEnum(enumratorBaseThreaded):
    def __init__(self, domain, subdomains=None, q=None, silent=False, verbose=True):
        subdomains = subdomains or []
        base_url = "https://search.yahoo.com/search?p={query}&b={page_no}"
        self.engine_name = "Yahoo"
        self.MAX_DOMAINS = 10
        self.MAX_PAGES = 0
        super(YahooEnum, self).__init__(base_url, self.engine_name, domain, subdomains, q=q, silent=silent, verbose=verbose)
        self.q = q
        return

    def extract_domains(self, resp):
        link_regx2 = re.compile('<span class=" fz-.*? fw-m fc-12th wr-bw.*?">(.*?)</span>')
        link_regx = re.compile('<span class="txt"><span class=" cite fw-xl fz-15px">(.*?)</span>')
        links_list = []
        try:
            links = link_regx.findall(resp)
            links2 = link_regx2.findall(resp)
            links_list = links + links2
            for link in links_list:
                link = re.sub("<(\/)?b>", "", link)
                if not link.startswith('http'):
                    link = "http://" + link
                subdomain = urlparse.urlparse(link).netloc
                if not subdomain.endswith(self.domain):
                    continue
                if subdomain and subdomain not in self.subdomains and subdomain != self.domain:
                    if self.verbose:
                        self.print_("%s%s: %s%s" % (R, self.engine_name, W, subdomain))
                    self.subdomains.append(subdomain.strip())
        except Exception:
            pass

        return links_list

    def should_sleep(self):
        return

    def get_page(self, num):
        return num + 10

    def generate_query(self):
        if self.subdomains:
            fmt = 'site:{domain} -domain:www.{domain} -domain:{found}'
            found = ' -domain:'.join(self.subdomains[:77])
            query = fmt.format(domain=self.domain, found=found)
        else:
            query = "site:{domain}".format(domain=self.domain)
        return query

class AskEnum(enumratorBaseThreaded):
    def __init__(self, domain, subdomains=None, q=None, silent=False, verbose=True):
        subdomains = subdomains or []
        base_url = 'http://www.ask.com/web?q={query}&page={page_no}&qid=8D6EE6BF52E0C04527E51F64F22C4534&o=0&l=dir&qsrc=998&qo=pagination'
        self.engine_name = "Ask"
        self.MAX_DOMAINS = 11
        self.MAX_PAGES = 0
        enumratorBaseThreaded.__init__(self, base_url, self.engine_name, domain, subdomains, q=q, silent=silent, verbose=verbose)
        self.q = q
        return

    def extract_domains(self, resp):
        links_list = list()
        link_regx = re.compile('<p class="web-result-url">(.*?)</p>')
        try:
            links_list = link_regx.findall(resp)
            for link in links_list:
                if not link.startswith('http'):
                    link = "http://" + link
                subdomain = urlparse.urlparse(link).netloc
                if subdomain not in self.subdomains and subdomain != self.domain:
                    if self.verbose:
                        self.print_("%s%s: %s%s" % (R, self.engine_name, W, subdomain))
                    self.subdomains.append(subdomain.strip())
        except Exception:
            pass

        return links_list

    def get_page(self, num):
        return num + 1

    def generate_query(self):
        if self.subdomains:
            fmt = 'site:{domain} -www.{domain} -{found}'
            found = ' -'.join(self.subdomains[:self.MAX_DOMAINS])
            query = fmt.format(domain=self.domain, found=found)
        else:
            query = "site:{domain} -www.{domain}".format(domain=self.domain)

        return query

class BingEnum(enumratorBaseThreaded):
    def __init__(self, domain, subdomains=None, q=None, silent=False, verbose=True):
        subdomains = subdomains or []
        base_url = 'https://www.bing.com/search?q={query}&go=Submit&first={page_no}'
        self.engine_name = "Bing"
        self.MAX_DOMAINS = 30
        self.MAX_PAGES = 0
        enumratorBaseThreaded.__init__(self, base_url, self.engine_name, domain, subdomains, q=q, silent=silent)
        self.q = q
        self.verbose = verbose
        return

    def extract_domains(self, resp):
        links_list = list()
        link_regx = re.compile('<li class="b_algo"><h2><a href="(.*?)"')
        link_regx2 = re.compile('<div class="b_title"><h2><a href="(.*?)"')
        try:
            links = link_regx.findall(resp)
            links2 = link_regx2.findall(resp)
            links_list = links + links2

            for link in links_list:
                link = re.sub('<(\/)?strong>|<span.*?>|<|>', '', link)
                if not link.startswith('http'):
                    link = "http://" + link
                subdomain = urlparse.urlparse(link).netloc
                if subdomain not in self.subdomains and subdomain != self.domain:
                    if self.verbose:
                        self.print_("%s%s: %s%s" % (R, self.engine_name, W, subdomain))
                    self.subdomains.append(subdomain.strip())
        except Exception:
            pass

        return links_list

    def generate_query(self):
        if self.subdomains:
            fmt = 'domain:{domain} -www.{domain} -{found}'
            found = ' -'.join(self.subdomains[:self.MAX_DOMAINS])
            query = fmt.format(domain=self.domain, found=found)
        else:
            query = "domain:{domain} -www.{domain}".format(domain=self.domain)
        return query

class BaiduEnum(enumratorBaseThreaded):
    def __init__(self, domain, subdomains=None, q=None, silent=False, verbose=True):
        subdomains = subdomains or []
        base_url = 'https://www.baidu.com/s?pn={page_no}&wd={query}&oq={query}'
        self.engine_name = "Baidu"
        self.MAX_DOMAINS = 2
        self.MAX_PAGES = 760
        enumratorBaseThreaded.__init__(self, base_url, self.engine_name, domain, subdomains, q=q, silent=silent, verbose=verbose)
        self.querydomain = self.domain
        self.q = q
        return

    def extract_domains(self, resp):
        links = list()
        found_newdomain = False
        subdomain_list = []
        link_regx = re.compile('<a.*?class="c-showurl".*?>(.*?)</a>')
        try:
            links = link_regx.findall(resp)
            for link in links:
                link = re.sub('<.*?>|>|<|&nbsp;', '', link)
                if not link.startswith('http'):
                    link = "http://" + link
                subdomain = urlparse.urlparse(link).netloc
                if subdomain.endswith(self.domain):
                    subdomain_list.append(subdomain)
                    if subdomain not in self.subdomains and subdomain != self.domain:
                        found_newdomain = True
                        if self.verbose:
                            self.print_("%s%s: %s%s" % (R, self.engine_name, W, subdomain))
                        self.subdomains.append(subdomain.strip())
        except Exception:
            pass
        if not found_newdomain and subdomain_list:
            self.querydomain = self.findsubs(subdomain_list)
        return links

    def findsubs(self, subdomains):
        count = Counter(subdomains)
        subdomain1 = max(count, key=count.get)
        count.pop(subdomain1, "None")
        subdomain2 = max(count, key=count.get) if count else ''
        return (subdomain1, subdomain2)

    def check_response_errors(self, resp):
        return True

    def should_sleep(self):
        time.sleep(random.randint(2, 5))
        return

    def generate_query(self):
        if self.subdomains and self.querydomain != self.domain:
            found = ' -site:'.join(self.querydomain)
            query = "site:{domain} -site:www.{domain} -site:{found} ".format(domain=self.domain, found=found)
        else:
            query = "site:{domain} -site:www.{domain}".format(domain=self.domain)
        return query

class NetcraftEnum(enumratorBaseThreaded):
    def __init__(self, domain, subdomains=None, q=None, silent=False, verbose=True):
        subdomains = subdomains or []
        self.base_url = 'https://searchdns.netcraft.com/?restriction=site+ends+with&host={domain}'
        self.engine_name = "Netcraft"
        super(NetcraftEnum, self).__init__(self.base_url, self.engine_name, domain, subdomains, q=q, silent=silent, verbose=verbose)
        self.q = q
        return

    def req(self, url, cookies=None):
        cookies = cookies or {}
        try:
            resp = self.session.get(url, headers=self.headers, timeout=self.timeout, cookies=cookies)
        except Exception as e:
            self.print_(e)
            resp = None
        return resp

    def should_sleep(self):
        time.sleep(random.randint(1, 2))
        return

    def get_next(self, resp):
        link_regx = re.compile('<a.*?href="(.*?)">Next Page')
        link = link_regx.findall(resp)
        url = 'http://searchdns.netcraft.com' + link[0]
        return url

    def create_cookies(self, cookie):
        cookies = dict()
        cookies_list = cookie[0:cookie.find(';')].split("=")
        cookies[cookies_list[0]] = cookies_list[1]
        # hashlib.sha1 requires utf-8 encoded str
        cookies['netcraft_js_verification_response'] = hashlib.sha1(urllib.unquote(cookies_list[1]).encode('utf-8')).hexdigest()
        return cookies

    def get_cookies(self, headers):
        if 'set-cookie' in headers:
            cookies = self.create_cookies(headers['set-cookie'])
        else:
            cookies = {}
        return cookies

    def enumerate(self):
        start_url = self.base_url.format(domain='example.com')
        resp = self.req(start_url)
        cookies = self.get_cookies(resp.headers)
        url = self.base_url.format(domain=self.domain)
        while True:
            resp = self.get_response(self.req(url, cookies))
            self.extract_domains(resp)
            if 'Next Page' not in resp:
                return self.subdomains
                break
            url = self.get_next(resp)
            self.should_sleep()

    def extract_domains(self, resp):
        links_list = list()
        link_regx = re.compile('<a class="results-table__host" href="(.*?)"')
        try:
            links_list = link_regx.findall(resp)
            for link in links_list:
                subdomain = urlparse.urlparse(link).netloc
                if not subdomain.endswith(self.domain):
                    continue
                if subdomain and subdomain not in self.subdomains and subdomain != self.domain:
                    if self.verbose:
                        self.print_("%s%s: %s%s" % (R, self.engine_name, W, subdomain))
                    self.subdomains.append(subdomain.strip())
        except Exception:
            pass
        return links_list

class DNSdumpster(enumratorBaseThreaded):
    def __init__(self, domain, subdomains=None, q=None, silent=False, verbose=True):
        subdomains = subdomains or []
        base_url = 'https://dnsdumpster.com/'
        self.live_subdomains = []
        self.engine_name = "DNSdumpster"
        self.q = q
        self.lock = None
        super(DNSdumpster, self).__init__(base_url, self.engine_name, domain, subdomains, q=q, silent=silent, verbose=verbose)
        return

    def check_host(self, host):
        is_valid = False
        Resolver = dns.resolver.Resolver()
        Resolver.nameservers = ['8.8.8.8', '8.8.4.4']
        self.lock.acquire()
        try:
            ip = Resolver.query(host, 'A')[0].to_text()
            if ip:
                if self.verbose:
                    self.print_("%s%s: %s%s" % (R, self.engine_name, W, host))
                is_valid = True
                self.live_subdomains.append(host)
        except:
            pass
        self.lock.release()
        return is_valid

    def req(self, req_method, url, params=None):
        params = params or {}
        headers = dict(self.headers)
        headers['Referer'] = 'https://dnsdumpster.com'
        try:
            if req_method == 'GET':
                resp = self.session.get(url, headers=headers, timeout=self.timeout)
            else:
                resp = self.session.post(url, data=params, headers=headers, timeout=self.timeout)
        except Exception as e:
            self.print_(e)
            resp = None
        return self.get_response(resp)

    def get_csrftoken(self, resp):
        csrf_regex = re.compile('<input type="hidden" name="csrfmiddlewaretoken" value="(.*?)">', re.S)
        token = csrf_regex.findall(resp)[0]
        return token.strip()

    def enumerate(self):
        self.lock = threading.BoundedSemaphore(value=70)
        resp = self.req('GET', self.base_url)
        token = self.get_csrftoken(resp)
        params = {'csrfmiddlewaretoken': token, 'targetip': self.domain}
        post_resp = self.req('POST', self.base_url, params)
        self.extract_domains(post_resp)
        for subdomain in self.subdomains:
            t = threading.Thread(target=self.check_host, args=(subdomain,))
            t.start()
            t.join()
        return self.live_subdomains

    def extract_domains(self, resp):
        tbl_regex = re.compile('<a name="hostanchor"><\/a>Host Records.*?<table.*?>(.*?)</table>', re.S)
        link_regex = re.compile('<td class="col-md-4">(.*?)<br>', re.S)
        links = []
        try:
            results_tbl = tbl_regex.findall(resp)[0]
        except IndexError:
            results_tbl = ''
        links_list = link_regex.findall(results_tbl)
        links = list(set(links_list))
        for link in links:
            subdomain = link.strip()
            if not subdomain.endswith(self.domain):
                continue
            if subdomain and subdomain not in self.subdomains and subdomain != self.domain:
                self.subdomains.append(subdomain.strip())
        return links

class Virustotal(enumratorBaseThreaded):
    def __init__(self, domain, subdomains=None, q=None, silent=False, verbose=True):
        subdomains = subdomains or []
        base_url = 'https://www.virustotal.com/ui/domains/{domain}/subdomains?relationships=resolutions'
        self.engine_name = "Virustotal"
        self.q = q
        super(Virustotal, self).__init__(base_url, self.engine_name, domain, subdomains, q=q, silent=silent, verbose=verbose)
        self.url = self.base_url.format(domain=self.domain)

        # Virustotal requires specific headers to bypass the bot detection:
        self.headers["X-Tool"] = "vt-ui-main"
        self.headers["X-VT-Anti-Abuse-Header"] = "hm"  # as of 1/20/2022, the content of this header doesn't matter, just its presence
        self.headers["Accept-Ianguage"] = self.headers["Accept-Language"]  # this header being present is required to prevent a captcha

        return

    # the main send_req need to be rewritten
    def send_req(self, url):
        try:
            resp = self.session.get(url, headers=self.headers, timeout=self.timeout)
        except Exception as e:
            self.print_(e)
            resp = None

        return self.get_response(resp)

    # once the send_req is rewritten we don't need to call this function, the stock one should be ok
    def enumerate(self):
        while self.url != '':
            resp = self.send_req(self.url)
            resp = json.loads(resp)
            if 'error' in resp:
                self.print_(R + "[!] Error: Virustotal probably now is blocking our requests" + W)
                break
            if 'links' in resp and 'next' in resp['links']:
                self.url = resp['links']['next']
            else:
                self.url = ''
            self.extract_domains(resp)
        return self.subdomains

    def extract_domains(self, resp):
        #resp is already parsed as json
        try:
            for i in resp['data']:
                if i['type'] == 'domain':
                    subdomain = i['id']
                    if not subdomain.endswith(self.domain):
                        continue
                    if subdomain not in self.subdomains and subdomain != self.domain:
                        if self.verbose:
                            self.print_("%s%s: %s%s" % (R, self.engine_name, W, subdomain))
                        self.subdomains.append(subdomain.strip())
        except Exception:
            pass

class ThreatCrowd(enumratorBaseThreaded):
    def __init__(self, domain, subdomains=None, q=None, silent=False, verbose=True):
        subdomains = subdomains or []
        base_url = 'https://www.threatcrowd.org/searchApi/v2/domain/report/?domain={domain}'
        self.engine_name = "ThreatCrowd"
        self.q = q
        super(ThreatCrowd, self).__init__(base_url, self.engine_name, domain, subdomains, q=q, silent=silent, verbose=verbose)
        return

    def req(self, url):
        try:
            resp = self.session.get(url, headers=self.headers, timeout=self.timeout)
        except Exception:
            resp = None

        return self.get_response(resp)

    def enumerate(self):
        url = self.base_url.format(domain=self.domain)
        resp = self.req(url)
        self.extract_domains(resp)
        return self.subdomains

    def extract_domains(self, resp):
        try:
            links = json.loads(resp)['subdomains']
            for link in links:
                subdomain = link.strip()
                if not subdomain.endswith(self.domain):
                    continue
                if subdomain not in self.subdomains and subdomain != self.domain:
                    if self.verbose:
                        self.print_("%s%s: %s%s" % (R, self.engine_name, W, subdomain))
                    self.subdomains.append(subdomain.strip())
        except Exception as e:
            pass

class CrtSearch(enumratorBaseThreaded):
    def __init__(self, domain, subdomains=None, q=None, silent=False, verbose=True):
        subdomains = subdomains or []
        base_url = 'https://crt.sh/?q=%25.{domain}'
        self.engine_name = "SSL Certificates"
        self.q = q
        super(CrtSearch, self).__init__(base_url, self.engine_name, domain, subdomains, q=q, silent=silent, verbose=verbose)
        return

    def req(self, url):
        try:
            resp = self.session.get(url, headers=self.headers, timeout=self.timeout)
        except Exception:
            resp = None

        return self.get_response(resp)

    def enumerate(self):
        url = self.base_url.format(domain=self.domain)
        resp = self.req(url)
        if resp:
            self.extract_domains(resp)
        return self.subdomains

    def extract_domains(self, resp):
        link_regx = re.compile('<TD>(.*?)</TD>')
        try:
            links = link_regx.findall(resp)
            for link in links:
                link = link.strip()
                subdomains = []
                if '<BR>' in link:
                    subdomains = link.split('<BR>')
                else:
                    subdomains.append(link)

                for subdomain in subdomains:
                    if not subdomain.endswith(self.domain) or '*' in subdomain:
                        continue

                    if '@' in subdomain:
                        subdomain = subdomain[subdomain.find('@')+1:]

                    if subdomain not in self.subdomains and subdomain != self.domain:
                        if self.verbose:
                            self.print_("%s%s: %s%s" % (R, self.engine_name, W, subdomain))
                        self.subdomains.append(subdomain.strip())
        except Exception as e:
            print(e)
            pass

class PassiveDNS(enumratorBaseThreaded):
    def __init__(self, domain, subdomains=None, q=None, silent=False, verbose=True):
        subdomains = subdomains or []
        base_url = 'https://api.sublist3r.com/search.php?domain={domain}'
        self.engine_name = "PassiveDNS"
        self.q = q
        super(PassiveDNS, self).__init__(base_url, self.engine_name, domain, subdomains, q=q, silent=silent, verbose=verbose)
        return

    def req(self, url):
        try:
            resp = self.session.get(url, headers=self.headers, timeout=self.timeout)
        except Exception as e:
            resp = None

        return self.get_response(resp)

    def enumerate(self):
        url = self.base_url.format(domain=self.domain)
        resp = self.req(url)
        if not resp:
            return self.subdomains

        self.extract_domains(resp)
        return self.subdomains

    def extract_domains(self, resp):
        try:
            subdomains = json.loads(resp)
            for subdomain in subdomains:
                if subdomain not in self.subdomains and subdomain != self.domain:
                    if self.verbose:
                        self.print_("%s%s: %s%s" % (R, self.engine_name, W, subdomain))
                    self.subdomains.append(subdomain.strip())
        except Exception as e:
            pass

class portscan():
    def __init__(self, subdomains, ports):
        self.subdomains = subdomains
        self.ports = ports
        self.lock = None

    def port_scan(self, host, ports):
        openports = []
        self.lock.acquire()
        for port in ports:
            try:
                s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
                s.settimeout(2)
                result = s.connect_ex((host, int(port)))
                if result == 0:
                    openports.append(port)
                s.close()
            except Exception:
                pass
        self.lock.release()
        if len(openports) > 0:
            print("%s%s%s - %sFound open ports:%s %s%s%s" % (G, host, W, R, W, Y, ', '.join(openports), W))

    def run(self):
        self.lock = threading.BoundedSemaphore(value=20)
        for subdomain in self.subdomains:
            t = threading.Thread(target=self.port_scan, args=(subdomain, self.ports))
            t.start()

def main(domain, threads, savefile, ports, silent, verbose, enable_bruteforce, engines):
    bruteforce_list = set()
    search_list = set()

    if is_windows:
        subdomains_queue = list()
    else:
        subdomains_queue = multiprocessing.Manager().list()

    # Check Bruteforce Status
    if enable_bruteforce or enable_bruteforce is None:
        enable_bruteforce = True

    # Validate domain
    domain_check = re.compile("^(http|https)?[a-zA-Z0-9]+([\-\.]{1}[a-zA-Z0-9]+)*\.[a-zA-Z]{2,}$")
    if not domain_check.match(domain):
        if not silent:
            print(R + "Error: Please enter a valid domain" + W)
        return []

    if not domain.startswith('http://') or not domain.startswith('https://'):
        domain = 'http://' + domain

    parsed_domain = urlparse.urlparse(domain)

    if not silent:
        print(B + "[-] Enumerating subdomains now for %s" % parsed_domain.netloc + W)

    if verbose and not silent:
        print(Y + "[-] verbosity is enabled, will show the subdomains results in realtime" + W)

    supported_engines = {'baidu': BaiduEnum,
                         'yahoo': YahooEnum,
                         'google': GoogleEnum,
                         'bing': BingEnum,
                         'ask': AskEnum,
                         'netcraft': NetcraftEnum,
                         'dnsdumpster': DNSdumpster,
                         'virustotal': Virustotal,
                         'threatcrowd': ThreatCrowd,
                         'ssl': CrtSearch,
                         'passivedns': PassiveDNS
                         }

    chosenEnums = []

    if engines is None:
        chosenEnums = [
            BaiduEnum, YahooEnum, GoogleEnum, BingEnum, AskEnum,
            NetcraftEnum, DNSdumpster, Virustotal, ThreatCrowd,
            CrtSearch, PassiveDNS
        ]
    else:
        engines = engines.split(',')
        for engine in engines:
            if engine.lower() in supported_engines:
                chosenEnums.append(supported_engines[engine.lower()])

    # Start the engines enumeration
    enums = [enum(domain, [], q=subdomains_queue, silent=silent, verbose=verbose) for enum in chosenEnums]
    for enum in enums:
        enum.start()
    for enum in enums:
        enum.join()

    subdomains = set(subdomains_queue)
    for subdomain in subdomains:
        search_list.add(subdomain)

    if enable_bruteforce:
        if not silent:
            print(G + "[-] Starting bruteforce module now using subbrute.." + W)
        record_type = False
        path_to_file = os.path.dirname(os.path.realpath(__file__))
        subs = os.path.join(path_to_file, 'subbrute', 'names.txt')
        resolvers = os.path.join(path_to_file, 'subbrute', 'resolvers.txt')
        process_count = threads
        output = False
        json_output = False
        bruteforce_list = subbrute.print_target(parsed_domain.netloc, record_type, subs, resolvers, process_count, output, json_output, search_list, verbose)

    subdomains = search_list.union(bruteforce_list)

    if subdomains:
        subdomains = sorted(subdomains, key=subdomain_sorting_key)

        if savefile:
            write_file(savefile, subdomains)

        if not silent:
            print(Y + "[-] Total Unique Subdomains Found: %s" % len(subdomains) + W)

        if ports:
            if not silent:
                print(G + "[-] Start port scan now for the following ports: %s%s" % (Y, ports) + W)
            ports = ports.split(',')
            pscan = portscan(subdomains, ports)
            pscan.run()

        elif not silent:
            for subdomain in subdomains:
                print(G + subdomain + W)
    return subdomains

def interactive():
    args = parse_args()
    domain = args.domain
    threads = args.threads
    savefile = args.output
    ports = args.ports
    enable_bruteforce = args.bruteforce
    verbose = args.verbose
    engines = args.engines
    if verbose or verbose is None:
        verbose = True
    if args.no_color:
        no_color()
    banner()
    res = main(domain, threads, savefile, ports, silent=False, verbose=verbose, enable_bruteforce=enable_bruteforce, engines=engines)

if __name__ == "__main__":
    interactive()

https://github.com/aboul3la/Sublist3r

pip install -r requirements.txt

git clone https://github.com/aboul3la/Sublist3r.git  #下载
cd Sublist3r 
pip install -r requirements.txt   #安装依赖项完成安装

提示：以上方法为爆破子域名，由于字典比较强大，所以效率较高。

帮助文档

D:\Desktop\tools\信息收集篇工具\Sublist3r-master>python sublist3r.py -h
usage: sublist3r.py [-h] -d DOMAIN [-b [BRUTEFORCE]] [-p PORTS] [-v [VERBOSE]] [-t THREADS] [-e ENGINES] [-o OUTPUT]
                    [-n]

OPTIONS:
  -h, --help            show this help message and exit
  -d DOMAIN, --domain DOMAIN
                        Domain name to enumerate it's subdomains
  -b [BRUTEFORCE], --bruteforce [BRUTEFORCE]
                        Enable the subbrute bruteforce module
  -p PORTS, --ports PORTS
                        Scan the found subdomains against specified tcp ports
  -v [VERBOSE], --verbose [VERBOSE]
                        Enable Verbosity and display results in realtime
  -t THREADS, --threads THREADS
                        Number of threads to use for subbrute bruteforce
  -e ENGINES, --engines ENGINES
                        Specify a comma-separated list of search engines
  -o OUTPUT, --output OUTPUT
                        Save the results to text file
  -n, --no-color        Output without color

Example: python sublist3r.py -d google.com

中文翻译

-h ：帮助

-d ：指定主域名枚举子域名

-b ：调用subbrute暴力枚举子域名

-p ：指定tpc端口扫描子域名

-v ：显示实时详细信息结果

-t ：指定线程

-e ：指定搜索引擎

-o ：将结果保存到文本

-n ：输出不带颜色

默认参数扫描子域名

python sublist3r.py -d baidu.com

使用暴力枚举子域名

python sublist3r -b -d baidu.com

python2.7.14 环境

;C:\Python27;C:\Python27\Scripts

OneForALL

pip3 install --user -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/

python3 oneforall.py --target baidu.com run /*收集*/

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-SuGhvJ4B-1669458115189)(https://xiaox1ao.github.io/images/image13-1667221746306.png)]

爆破子域名

Example：

brute.py --target domain.com --word True run

brute.py --targets ./domains.txt --word True run

brute.py --target domain.com --word True --concurrent 2000 run

brute.py --target domain.com --word True --wordlist subnames.txt run

brute.py --target domain.com --word True --recursive True --depth 2 run

brute.py --target d.com --fuzz True --place m.*.d.com --rule ‘[a-z]’ run

brute.py --target d.com --fuzz True --place m.*.d.com --fuzzlist subnames.txt run

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-R33qRISE-1669458115189)(https://xiaox1ao.github.io/images/image14-1667221746306.png)]

Wydomain

dnsburte.py -d aliyun.com -f dnspod.csv -o aliyun.log

wydomain.py -d aliyun.com

FuzzDomain

隐藏域名hosts碰撞

隐藏资产探测-hosts碰撞

https://mp.weixin.qq.com/s/fuASZODw1rLvgT7GySMC8Q

很多时候访问目标资产IP响应多为：401、403、404、500，但是用域名请求却能返回正常的业务系统（禁止IP直接访问），因

为这大多数都是需要绑定host才能正常请求访问的（目前互联网公司基本的做法）（域名删除了A记录，但是反代的配置未更新），那么我们就可以通过收集到的目标的域名和目标资产的IP段组合起来，以 IP段+域名的形式进行捆绑碰撞，就能发现很多有意思的东西。

在发送http请求的时候，对域名和IP列表进行配对，然后遍历发送请求（就相当于修改了本地的hosts文件一样），并把相应的title和响应包大小拿回来做对比，即可快速发现一些隐蔽的资产。

进行hosts碰撞需要目标的域名和目标的相关IP作为字典

域名就不说了

使用Hosts_scan

多线程

端口扫描

当确定了目标大概的ip段后，可以先对ip的开放端口进行探测，一些特定服务可能开起在默认端口上，探测开放端口有利于快速收集目标资产，找到目标网站的其他功能站点。

msscan端口扫描

sudo msscan -p 1-65535 192.168.19.151 --rate=1000

https://gitee.com/youshusoft/GoScanner/

Scanner.exe 192.168.19.151 1-65535 512

御剑端口扫描工具

nmap扫描端口和探测端口信息

常用参数，如：

nmap -sV 192.168.0.2

nmap -sT 92.168.0.2

nmap -Pn -A -sC 192.168.0.2

nmap -sU -sT -p0-65535 192.168.122.1

用于扫描目标主机服务版本号与开放的端口

如果需要扫描多个ip或ip段，可以将他们保存到一个txt文件中

nmap -iL ip.txt

来扫描列表中所有的ip。

Nmap为端口探测最常用的方法，操作方便，输出结果非常直观。

在线端口检测

http://coolaf.com/tool/port

端口扫描器

御剑，msscan，zmap等

御剑高速端口扫描器：

填入想要扫描的ip段（如果只扫描一个ip，则开始IP和结束IP填一个即可），可以选择不改默认端口列表，也可以选择自己指定端口。

渗透端口

21,22,23,1433,152,3306,3389,5432,5900,50070,50030,50000,27017,27018,11211,9200,9300,7001,7002,6379,5984,873,443,8000-9090,80-89,80,10000,8888,8649,8083,8080,8089,9090,7778,7001,7002,6082,5984,4440,3312,3311,3128,2601,2604,2222,2082,2083,389,88,512,513,514,1025,111,1521,445,135,139,53

渗透常见端口及对应服务

1.web类(web漏洞/敏感目录)

第三方通用组件漏洞struts thinkphp jboss ganglia zabbix

80 web

80-89 web

8000-9090 web

2.数据库类(扫描弱口令)

1433 MSSQL

1521 Oracle

3306 MySQL

5432 PostgreSQL

3.特殊服务类(未授权/命令执行类/漏洞)

443 SSL心脏滴血

873 Rsync未授权

5984 CouchDB http://xxx:5984/_utils/

6379 redis未授权

7001,7002 WebLogic默认弱口令，反序列

9200,9300 elasticsearch 参考WooYun: 多玩某服务器ElasticSearch命令执行漏洞

11211 memcache未授权访问

27017,27018 Mongodb未授权访问

50000 SAP命令执行

50070,50030 hadoop默认端口未授权访问

4.常用端口类(扫描弱口令/端口爆破)

21 ftp

22 SSH

23 Telnet

2601,2604 zebra路由，默认密码zebra

3389 远程桌面

5.端口合计详情

21 ftp

22 SSH

23 Telnet

80 web

80-89 web

161 SNMP

389 LDAP

443 SSL心脏滴血以及一些web漏洞测试

445 SMB

512,513,514 Rexec

873 Rsync未授权

1025,111 NFS

1433 MSSQL

1521 Oracle:(iSqlPlus Port:5560,7778)

2082/2083 cpanel主机管理系统登陆（国外用较多）

2222 DA虚拟主机管理系统登陆（国外用较多）

2601,2604 zebra路由，默认密码zebra

3128 squid代理默认端口，如果没设置口令很可能就直接漫游内网了

3306 MySQL

3312/3311 kangle主机管理系统登陆

3389 远程桌面

4440 rundeck 参考WooYun: 借用新浪某服务成功漫游新浪内网

5432 PostgreSQL

5900 vnc

5984 CouchDB http://xxx:5984/_utils/

6082 varnish 参考WooYun: Varnish HTTP accelerator CLI 未授权访问易导致网站被直接篡改或者作为代理进入内网

6379 redis未授权

7001,7002 WebLogic默认弱口令，反序列

7778 Kloxo主机控制面板登录

8000-9090 都是一些常见的web端口，有些运维喜欢把管理后台开在这些非80的端口上

8080 tomcat/WDCP主机管理系统，默认弱口令

8080,8089,9090 JBOSS

8083 Vestacp主机管理系统（国外用较多）

8649 ganglia

8888 amh/LuManager 主机管理系统默认端口

9200,9300 elasticsearch 参考WooYun: 多玩某服务器ElasticSearch命令执行漏洞

10000 Virtualmin/Webmin 服务器虚拟主机管理系统

11211 memcache未授权访问

27017,27018 Mongodb未授权访问

28017 mongodb统计页面

50000 SAP命令执行

50070,50030 hadoop默认端口未授权访问

常见的端口和攻击方法

Nmap,Msscan扫描等

例如：nmap -p 80,443,8000,8080 -Pn 192.168.0.0/24

常见端口表

21,22,23,80-90,161,389,443,445,873,1099,1433,1521,1900,2082,2083,2222,2601,2604,3128,3306,3311,3312,3389,4440,4848,5432,5560,5900,5901,5902,6082,6379,7001-7010,7778,8080-8090,8649,8888,9000,9200,10000,11211,27017,28017,50000,50030,50060,135,139,445,53,88

查找真实ip

[[绕过CDN找真实IP]]

如果目标网站使用了CDN，使用了cdn真实的ip会被隐藏，如果要查找真实的服务器就必须获取真实的ip，根据这个ip继续查询

旁站。

注意：很多时候，主站虽然是用了CDN，但子域名可能没有使用CDN，如果主站和子域名在一个ip段中，那么找到子域名的真

实ip也是一种途径。

多地ping确认是否使用CDN

http://ping.chinaz.com/

http://ping.aizhan.com/

{width=“5.767361111111111in” height=“3.925in”}

查询历史DNS解析记录

在查询到的历史解析记录中，最早的历史解析ip很有可能记录的就是真实ip，快速查找真实IP推荐此方法，但并不是所有网站都能查到。

DNSDB

https://dnsdb.io/zh-cn/

微步在线

https://x.threatbook.cn/

Ipip.net

https://tools.ipip.net/cdn.php

viewdns

https://viewdns.info/

phpinfo

如果目标网站存在phpinfo泄露等，可以在phpinfo中的SERVER_ADDR或_SERVER[“SERVER_ADDR”]找到真实ip

绕过CDN

绕过CDN的多种方法具体可以参考
https://www.cnblogs.com/qiudabai/p/9763739.html

旁站和C段

旁站往往存在业务功能站点，建议先收集已有IP的旁站，再探测C段，确认C段目标后，再在C段的基础上再收集一次旁站。

旁站是和已知目标站点在同一服务器但不同端口的站点，通过以下方法搜索到旁站后，先访问一下确定是不是自己需要的站点信息。

站长之家

同ip网站查询http://stool.chinaz.com/same

https://chapangzhan.com/

google hacking

https://blog.csdn.net/qq_36119192/article/details/84029809

网络空间搜索引擎

如FOFA搜索旁站和C段

该方法效率较高，并能够直观地看到站点标题，但也有不常见端口未收录的情况，虽然这种情况很少，但之后补充资产的时候可以用下面的方法nmap扫描再收集一遍。

在线c段 webscan.cc

webscan.cc

https://c.webscan.cc/

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-BHuurLQY-1669458115202)(https://xiaox1ao.github.io/images/image28-1667221746308.png)]

c段利用脚本

pip install requests

#coding:utf-8
import requests
import json

def get_c(ip):
    print("正在收集{}".format(ip))
    url="http://api.webscan.cc/?action=query&ip={}".format(ip)
    req=requests.get(url=url)
    html=req.text
    data=req.json()
    if 'null' not in html:
        with open("resulit.txt", 'a', encoding='utf-8') as f:
            f.write(ip + '\n')
            f.close()
        for i in data:
            with open("resulit.txt", 'a',encoding='utf-8') as f:
                f.write("\t{} {}\n".format(i['domain'],i['title']))
                print("     [+] {} {}[+]".format(i['domain'],i['title']))
                f.close()

def get_ips(ip):
    iplist=[]
    ips_str = ip[:ip.rfind('.')]
    for ips in range(1, 256):
        ipadd=ips_str + '.' + str(ips)
        iplist.append(ipadd)
    return iplist


ip=input("请你输入要查询的ip:")

ips=get_ips(ip)
for p in ips:
    get_c(p)

注意：探测C段时一定要确认ip是否归属于目标，因为一个C段中的所有ip不一定全部属于目标。

网络空间搜索引擎

如果想要在短时间内快速收集资产，那么利用网络空间搜索引擎是不错的选择，可以直观地看到旁站，端口，站点标题，IP等信息，点击列举出站点可以直接访问，以此来判断是否为自己需要的站点信息。FOFA的常用语法

1、同IP旁站：ip=“192.168.0.1”

2、C段：ip=“192.168.0.0/24”

3、子域名：domain=“baidu.com”

4、标题/关键字：title=“百度”

5、如果需要将结果缩小到某个城市的范围，那么可以拼接语句

title=“百度”&& region=“Beijing”

特征：body="百度"或header=“baidu”

扫描敏感目录/文件

扫描敏感目录需要强大的字典，需要平时积累，拥有强大的字典能够更高效地找出网站的管理后台，敏感文件常见的如.git文件

泄露，.svn文件泄露，phpinfo泄露等，这一步一半交给各类扫描器就可以了，将目标站点输入到域名中，选择对应字典类型，

就可以开始扫描了，十分方便。

御剑

https://www.fujieace.com/hacker/tools/yujian.html

7kbstorm

https://github.com/7kbstorm/7kbscan-WebPathBrute

bbscan

https://github.com/lijiejie/BBScan

在pip已经安装的前提下，可以直接：

pip install -r requirements.txt

使用示例：

1. 扫描单个web服务 www.target.com

python BBScan.py --host www.target.com

2. 扫描www.target.com和www.target.com/28下的其他主机

python BBScan.py --host www.target.com --network 28

3. 扫描txt文件中的所有主机

python BBScan.py -f wandoujia.com.txt

4. 从文件夹中导入所有的主机并扫描

python BBScan.py -d targets/

–network
参数用于设置子网掩码，小公司设为28_{30，中等规模公司设置26}28，大公司设为24~26

当然，尽量避免设为24，扫描过于耗时，除非是想在各SRC多刷几个漏洞。

该插件是从内部扫描器中抽离出来的，感谢 Jekkay Hu<34538980[at]qq.com>

如果你有非常有用的规则，请找几个网站验证测试后，再 pull request

脚本还会优化，接下来的事:

增加有用规则，将规则更好地分类，细化

后续可以直接从 rules\request 文件夹中导入HTTP_request

优化扫描逻辑

dirmap

pip install -r requirement.txt

https://github.com/H4ckForJob/dirmap

单个目标

python3 dirmap.py -i https://target.com -lcf

多个目标

python3 dirmap.py -iF urls.txt -lcf

dirsearch

https://gitee.com/Abaomianguan/dirsearch.git

unzip dirsearch.zip

python3 dirsearch.py -u http://m.scabjd.com/ -e *

gobuster

sudo apt-get install gobuster

gobuster dir -u https://www.servyou.com.cn/ -w usr/share/wordlists/dirbuster/directory-list-2.3-medium.txt -x php -t 50

dir -u 网址 w字典 -x 指定后缀 -t 线程数量

dir -u https://www.servyou.com.cn/ -w /usr/share/wordlists/dirbuster/directory-list-2.3-medium.txt -x “php,html,rar,zip” -d --wildcard -o servyou.log | grep ^“3402”

网站文件

[[常见敏感文件泄露]]

1. robots.txt

2. crossdomin.xml

3. sitemap.xml

4. 后台目录

5. 网站安装包

6. 网站上传目录

7. mysql管理页面

8. phpinfo

9. 网站文本编辑器

10. 测试文件

11. 网站备份文件(.rar、zip、.7z、.tar.gz、.bak)

12. DS_Store 文件

13. vim编辑器备份文件(.swp)

14. WEB—INF/web.xml文件

15 .git

16 .svn

https://www.secpulse.com/archives/55286.html

扫描网页备份

例如

config.php

config.php~

config.php.bak

config.php.swp

config.php.rar

conig.php.tar.gz

网站头信息收集

1、中间件：web服务【Web Servers】 apache iis7 iis7.5 iis8 nginx WebLogic tomcat

2.、网站组件： js组件jquery、vue 页面的布局bootstrap 通过浏览器获取

http://whatweb.bugscaner.com/look/

火狐的插件Wappalyzer

curl命令查询头信息

curl https://www.moonsec.com -i

敏感文件搜索

GitHub搜索

in:name test #仓库标题搜索含有关键字test

in:descripton test #仓库描述搜索含有关键字

in:readme test #Readme文件搜素含有关键字

搜索某些系统的密码

https://github.com/search?q=smtp+58.com+password+3306&type=Code

github 关键词监控

https://www.codercto.com/a/46640.html

谷歌搜索

site:Github.com sa password

site:Github.com root password

site:Github.com User ID=‘sa’;Password

site:Github.com inurl:sql

SVN 信息收集

site:Github.com svn

site:Github.com svn username

site:Github.com svn password

site:Github.com svn username password

综合信息收集

site:Github.com password

site:Github.com ftp ftppassword

site:Github.com 密码

site:Github.com 内部

https://blog.csdn.net/qq_36119192/article/details/99690742

http://www.361way.com/github-hack/6284.html

https://docs.github.com/cn/github/searching-for-information-on-github/searching-code

https://github.com/search?q=smtp+bilibili.com&type=code

Google-hacking

site:域名

inurl: url中存在的关键字网页

intext：网页正文中的关键词

filetype:指定文件类型

https://mp.weixin.qq.com/s/2UJ-wjq44lCF9F9urJzZAA

https://zhuanlan.zhihu.com/p/491708780

wooyun漏洞库

https://wooyun.website/

网盘搜索

凌云搜索 https://www.lingfengyun.com/

盘多多：http://www.panduoduo.net/

盘搜搜：http://www.pansoso.com/

盘搜：http://www.pansou.com/

社工库

名字/常用id/邮箱/密码/电话登录网盘网站邮箱找敏感信息

tg机器人

网站注册信息

www.reg007.com 查询网站注册信息

一般是配合社工库一起来使用。

js敏感信息

网站的url连接写到js里面
js的api接口里面包含用户信息比如账号和密码

jsfinder

https://gitee.com/kn1fes/JSFinder

python3 JSFinder.py -u https://www.mi.com

python3 JSFinder.py -u https://www.mi.com -d

python3 JSFinder.py -u https://www.mi.com -d -ou mi_url.txt -os mi_subdomain.txt

当你想获取更多信息的时候，可以使用-d进行深度爬取来获得更多内容，并使用命令-ou, -os来指定URL和子域名所保存的文件名

批量指定URL和JS链接来获取里面的URL。

指定URL：

python JSFinder.py -f text.txt

指定JS：

python JSFinder.py -f text.txt -j

Packer-Fuzzer

寻找网站交互接口授权key

随着WEB前端打包工具的流行，您在日常渗透测试、安全服务中是否遇到越来越多以Webpack打包器为代表的网站？这类打包

器会将整站的API和API参数打包在一起供Web集中调用，这也便于我们快速发现网站的功能和API清单，但往往这些打包器所生

成的JS文件数量异常之多并且总JS代码量异常庞大（多达上万行），这给我们的手工测试带来了极大的不便，Packer Fuzzer软

件应运而生。

本工具支持自动模糊提取对应目标站点的API以及API对应的参数内容，并支持对：未授权访问、敏感信息泄露、CORS、SQL注

入、水平越权、弱口令、任意文件上传七大漏洞进行模糊高效的快速检测。在扫描结束之后，本工具还支持自动生成扫描报告，

您可以选择便于分析的HTML版本以及较为正规的doc、pdf、txt版本。

sudo apt-get install nodejs && sudo apt-get install npm

git clone https://gitee.com/keyboxdzd/Packer-Fuzzer.git

pip3 install -r requirements.txt

python3 PackerFuzzer.py -u https://www.liaoxuefeng.com

SecretFinder

一款基于Python脚本的JavaScript敏感信息搜索工具

https://gitee.com/mucn/SecretFinder

python SecretFinder.py -i https://www.moonsec.com/ -e

cms识别

收集好网站信息之后，应该对网站进行指纹识别，通过识别指纹，确定目标的cms及版本，方便制定下一步的测试计划，可以用公开的poc或自己累积的对应手法等进行正式的渗透测试。

云悉

http://www.yunsee.cn/info.html

潮汐指纹

http://finger.tidesec.net/

CMS指纹识别

http://whatweb.bugscaner.com/look/

https://github.com/search?q=cms识别

whatcms

御剑cms识别

https://github.com/ldbfpiaoran/cmscan

https://github.com/theLSA/cmsIdentification/

非常规操作

1、如果找到了目标的一处资产，但是对目标其他资产的收集无处下手时，可以查看一下该站点的body里是否有目标的特征，然后利用网络空间搜索引擎（如fofa等）对该特征进行搜索，如：body="XX公司"或body="baidu"等。

该方式一般适用于特征明显，资产数量较多的目标，并且很多时候效果拔群。

2、当通过上述方式的找到test.com的特征后，再进行body的搜索，然后再搜索到test.com的时候，此时fofa上显示的ip大概率为test.com的真实IP。

3、如果需要对政府网站作为目标，那么在批量获取网站首页的时候，可以用上

http://114.55.181.28/databaseInfo/index

之后可以结合上一步的方法进行进一步的信息收集。

SSL/TLS证书查询

SSL/TLS证书通常包含域名、子域名和邮件地址等信息，结合证书中的信息，可以更快速地定位到目标资产，获取到更多目标资产的相关信息。

https://myssl.com/

https://crt.sh

https://censys.io

https://developers.facebook.com/tools/ct/

https://google.com/transparencyreport/https/ct/

SSL证书搜索引擎：

https://certdb.com/domain/github.com

https://crt.sh/?Identity=%.moonsec.com

https://censys.io/

GetDomainsBySSL.py

查找厂商ip段

http://ipwhois.cnnic.net.cn/index.jsp

移动资产收集

微信小程序支付宝小程序

现在很多企业都有小程序，可以关注企业的微信公众号或者支付宝小程序，或关注运营相关人员，查看朋友圈，获取小程序。

https://weixin.sogou.com/weixin?type=1&ie=utf8&query=%E6%8B%BC%E5%A4%9A%E5%A4%9A

app软件搜索

https://www.qimai.cn/

社交信息搜索

QQ群 QQ手机号

微信群

领英

https://www.linkedin.com/

脉脉招聘

boss招聘

js敏感文件

https://github.com/m4ll0k/SecretFinder

https://github.com/Threezh1/JSFinder

https://github.com/rtcatc/Packer-Fuzzer

github信息泄露监控

https://github.com/0xbug/Hawkeye

https://github.com/MiSecurity/x-patrol

https://github.com/VKSRC/Github-Monitor

防护软件收集

安全防护云waf、硬件waf、主机防护软件、软waf

社工相关

微信或者QQ
混入内部群，蹲点观测。加客服小姐姐发一些连接。进一步获取敏感信息。测试产品，购买服务器，拿去测试账号和密码。

物理接触

到企业办公层连接wifi，连同内网。丢一些带有后门的usb
开放免费的wifi截取账号和密码。

社工库

在tg找社工机器人查找密码信息
或本地的社工库查找邮箱或者用户的密码或密文。组合密码在进行猜解登录。

资产收集神器

ARL(Asset Reconnaissance Lighthouse)资产侦察灯塔系统

https://github.com/TophantTechnology/ARL

git clone https://github.com/TophantTechnology/ARL
cd ARL/docker/
docker volume create arl_db
docker-compose pull
docker-compose up -d

AssetsHunter

https://github.com/rabbitmask/AssetsHunter

一款用于src资产信息收集的工具

https://github.com/sp4rkw/Reaper

domain_hunter_pro

https://github.com/bit4woo/domain_hunter_pro

LangSrcCurise

https://github.com/shellsec/LangSrcCurise

网段资产

https://github.com/colodoo/midscan

工具

Fuzz字典推荐：https://github.com/TheKingOfDuck/fuzzDicts

BurpCollector(BurpSuite参数收集插件)：https://github.com/TEag1e/BurpCollector

Wfuzz：https://github.com/xmendez/wfuzz

LinkFinder：https://github.com/GerbenJavado/LinkFinder

PoCBox：https://github.com/Acmesec/PoCBox

最后

从时代发展的角度看，网络安全的知识是学不完的，而且以后要学的会更多，同学们要摆正心态，既然选择入门网络安全，就不能仅仅只是入门程度而已，能力越强机会才越多。

因为入门学习阶段知识点比较杂，所以我讲得比较笼统，大家如果有不懂的地方可以找我咨询，我保证知无不言言无不尽，需要相关资料也可以找我要，我的网盘里一大堆资料都在吃灰呢。

干货主要有：

①1000+CTF历届题库（主流和经典的应该都有了）

②CTF技术文档（最全中文版）

③项目源码（四五十个有趣且经典的练手项目及源码）

④ CTF大赛、web安全、渗透测试方面的视频（适合小白学习）

⑤ 网络安全学习路线图（告别不入流的学习）

⑥ CTF/渗透测试工具镜像文件大全

⑦ 2023密码学/隐身术/PWN技术手册大全

如果你对网络安全入门感兴趣，那么你需要的话可以点击这里👉网络安全重磅福利：入门&进阶全套282G学习资源包免费分享！

扫码领取

程序员菠萝

关注

1
点赞
踩
6

收藏

觉得还不错? 一键收藏
0
评论
史上最全的信息收集总结！！

渗透的本质是信息收集信息收集也叫做资产收集信息收集是渗透测试的前期主要工作，是非常重要的环节，收集足够多的信息才能方便接下来的测试，信息收集主要是收集网站的域名信息、子域名信息、目标网站信息、目标网站真实IP、敏感/目录文件、开放端口和中间件信息等等。通过各种渠道和手段尽可能收集到多的关于这个站点的信息，有助于我们更多的去找到渗透点，突破口。
复制链接

扫一扫