Main Features
Scans JavaScript files for sensitive data such as API keys, access tokens, authorization headers, and JWTs.
1. Installation
git clone https://github.com/m4ll0k/SecretFinder.git
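Before the first run, the Python dependencies need to be installed as well; as far as I recall the repository ships a requirements.txt listing them (requests, requests_file, jsbeautifier, lxml):
cd SecretFinder
python -m pip install -r requirements.txt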
2. Common Parameters
Parameter | Meaning | Usage |
---|---|---|
-i | input URL | A site URL (e.g. https://www.google.com/; in this case -e is required), a JS link (e.g. https://www.google.com/1.js), a folder (e.g. C:/Users/cong.chen/Desktop/aiswei2025), or a local file (e.g. C:/Users/cong.chen/Desktop/aiswei2025/file.js) |
-e | extract | Extract all JavaScript links found in the page and process them |
-o | output | Output file, output.html by default; if the value is cli, results are printed to the console |
-r | regex | Supply a custom regular expression |
-c | cookie | Add cookies to the request |
-g | ignore | Ignore JS URLs that contain the given string, e.g. JS served by external libraries |
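For example, the following invocation (hypothetical; the cookie value is made up) extracts a page's JS links, skips anything with jquery in the URL, sends a session cookie, and prints results to the console:
python SecretFinder.py -i https://www.example.com -e -g jquery -c "session=abc123" -o cli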
3. Examples
3.1 Analyze a whole domain and its JS files, printing results to the console
python SecretFinder.py -i https://www.baidu.com -e -o cli
3.2 Analyze a JS link, saving results to the file a.html
python SecretFinder.py -i https://pss.bdstatic.com/static/superman/js/lib/jquery-1-edb203c114.10.2.js -o a.html
3.3 Analyze a local JS file
python SecretFinder.py -i file:///C:/Users/cong.chen/Desktop/aiswei2025/test.js -o cli
or
-i C:\Users\cong.chen\Desktop\aiswei2025\test.js -o cli # supported by the modified code (see below)
-i C:/Users/cong.chen/Desktop/aiswei2025/test.js -o cli # also supported by the modified code
3.4 Analyze a local folder (supported by the modified code; see below)
python SecretFinder.py -i C:\Users\cong.chen\Desktop\aiswei2025\* -o cli
Restrict by file extension:
python SecretFinder.py -i C:\Users\cong.chen\Desktop\aiswei2025\*.js -o cli
3.5 Use your own regular expression
python SecretFinder.py -i https://pss.bdstatic.com/static/superman/js/lib/jquery-1-edb203c114.10.2.js -o cli -r password\s*[`=:\"]+\s*[^\s]
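To see what this custom pattern actually matches, here is a quick standalone check against a made-up JS fragment. Note that [^\s] grabs only a single character after the separator, while the tool's built-in possible_Creds rule uses [^\s]+ and captures the whole value:

```python
import re

# made-up JS fragment, for illustration only
content = 'var cfg = { password: "hunter2", user: "alice" };'

# the pattern passed via -r above; [^\s] matches one character
print(re.findall(r'password\s*[`=:\"]+\s*[^\s]', content))
# -> ['password: "']

# the built-in possible_Creds variant uses [^\s]+ and grabs the value
print(re.findall(r'password\s*[`=:\"]+\s*[^\s]+', content))
# -> ['password: "hunter2",']
```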
4. Notes on the Code Modifications
A record of what was changed, for future reference.
4.1 Regex changes
.+? requires at least one character on each side of the match, and since . does not match newlines, the original context lookup found nothing whenever the matched content sat on a line by itself; it was changed to .*?.
The matched string was also concatenated directly into a regular expression, so special characters such as () caused errors; the match is now wrapped in re.escape(). Both issues are demonstrated in the sketch below.
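A minimal demonstration of both fixes, using a fake token that sits alone on a line and contains parentheses:

```python
import re

# fake secret on a line of its own; the value is invented for the demo
content = 'var a = 1;\nFAKE_TOKEN(1)\nvar b = 2;'
m = 'FAKE_TOKEN(1)'

# original lookup: '.' never matches a newline, so '.+?' demands a
# character before and after the match on the same line, which is
# impossible when the match fills the whole line
print(re.findall('.+?%s.+?' % re.escape(m), content, re.IGNORECASE))  # []

# fixed lookup: '.*?' allows empty surroundings, and re.escape()
# keeps '(' and ')' from being parsed as a regex group
print(re.findall('.*?%s.*?' % re.escape(m), content, re.IGNORECASE))
# -> ['FAKE_TOKEN(1)']
```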
4.2 Fix for local folder input
When the input is a local folder path, the original code errored on Windows path separators, and the directory walk only covered files directly inside the folder, without recursing into subfolders. Both problems are fixed, and filtering by file extension was added; see example 3.4 and the sketch below.
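A minimal sketch of the recursive lookup used in parser_input in the code below (the Windows-style path is a placeholder):

```python
import os
import glob

input_path = r'C:\Users\cong.chen\Desktop\aiswei2025\*.js'  # placeholder

# '.js' when a suffix is given, '' for a bare '*'
extension = os.path.splitext(input_path)[1]
# '**' combined with recursive=True makes glob descend into subfolders
pattern = os.path.join(os.path.abspath(os.path.dirname(input_path)), '**', f'*{extension}')
for path in glob.glob(pattern, recursive=True):
    # forward slashes also work in file:// URLs on Windows
    print('file:///%s' % path.replace('\\', '/'))
```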
4.3 Fix for local file input
When the input is a local file, the original code errored on Windows path separators; this is fixed by normalizing the path into a file:// URL, as sketched below.
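Roughly, the fix boils down to:

```python
import os

input_path = r'C:\Users\cong.chen\Desktop\aiswei2025\test.js'  # placeholder

if os.path.exists(input_path):
    # requests_file reads file:// URLs; forward slashes work on Windows too
    print('file:///%s' % os.path.abspath(input_path).replace('\\', '/'))
```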
5. Code
The full modified code file:
#!/usr/bin/env python
# SecretFinder - Tool to discover apikeys/accesstokens and sensitive data in js files
# based on LinkFinder - github.com/GerbenJavado
# By m4ll0k (@m4ll0k2) github.com/m4ll0k
import os,sys
if not sys.version_info.major >= 3:
    print("[ + ] Run this tool with python version 3.+")
    sys.exit(0)
os.environ["BROWSER"] = "open"
import re
import glob
import argparse
import jsbeautifier
import webbrowser
import subprocess
import base64
import requests
import string
import random
from html import escape
import urllib3
import xml.etree.ElementTree
# disable warning
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
# to read local files via the file:// protocol
from requests_file import FileAdapter
from lxml import html
from urllib.parse import urlparse
# regex
_regex = {
    'google_api' : r'AIza[0-9A-Za-z-_]{35}',
    'firebase' : r'AAAA[A-Za-z0-9_-]{7}:[A-Za-z0-9_-]{140}',
    'google_captcha' : r'6L[0-9A-Za-z-_]{38}|^6[0-9a-zA-Z_-]{39}$',
    'google_oauth' : r'ya29\.[0-9A-Za-z\-_]+',
    'amazon_aws_access_key_id' : r'A[SK]IA[0-9A-Z]{16}',
    'amazon_mws_auth_token' : r'amzn\.mws\.[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}',
    'amazon_aws_url' : r's3\.amazonaws.com[/]+|[a-zA-Z0-9_-]*\.s3\.amazonaws.com',
    'amazon_aws_url2' : r"(" \
        r"[a-zA-Z0-9-\.\_]+\.s3\.amazonaws\.com" \
        r"|s3://[a-zA-Z0-9-\.\_]+" \
        r"|s3-[a-zA-Z0-9-\.\_\/]+" \
        r"|s3.amazonaws.com/[a-zA-Z0-9-\.\_]+" \
        r"|s3.console.aws.amazon.com/s3/buckets/[a-zA-Z0-9-\.\_]+)",
    'facebook_access_token' : r'EAACEdEose0cBA[0-9A-Za-z]+',
    'authorization_basic' : r'basic [a-zA-Z0-9=:_\+\/-]{5,100}',
    'authorization_bearer' : r'bearer [a-zA-Z0-9_\-\.=:_\+\/]{5,100}',
    'authorization_api' : r'api[key|_key|\s+]+[a-zA-Z0-9_\-]{5,100}',
    'mailgun_api_key' : r'key-[0-9a-zA-Z]{32}',
    'twilio_api_key' : r'SK[0-9a-fA-F]{32}',
    'twilio_account_sid' : r'AC[a-zA-Z0-9_\-]{32}',
    'twilio_app_sid' : r'AP[a-zA-Z0-9_\-]{32}',
    'paypal_braintree_access_token' : r'access_token\$production\$[0-9a-z]{16}\$[0-9a-f]{32}',
    'square_oauth_secret' : r'sq0csp-[ 0-9A-Za-z\-_]{43}|sq0[a-z]{3}-[0-9A-Za-z\-_]{22,43}',
    'square_access_token' : r'sqOatp-[0-9A-Za-z\-_]{22}|EAAA[a-zA-Z0-9]{60}',
    'stripe_standard_api' : r'sk_live_[0-9a-zA-Z]{24}',
    'stripe_restricted_api' : r'rk_live_[0-9a-zA-Z]{24}',
    'github_access_token' : r'[a-zA-Z0-9_-]*:[a-zA-Z0-9_\-]+@github\.com*',
    'rsa_private_key' : r'-----BEGIN RSA PRIVATE KEY-----',
    'ssh_dsa_private_key' : r'-----BEGIN DSA PRIVATE KEY-----',
    'ssh_ec_private_key' : r'-----BEGIN EC PRIVATE KEY-----',
    'pgp_private_block' : r'-----BEGIN PGP PRIVATE KEY BLOCK-----',
    'json_web_token' : r'ey[A-Za-z0-9-_=]+\.[A-Za-z0-9-_=]+\.?[A-Za-z0-9-_.+/=]*$',
    'slack_token' : r"\"api_token\":\"(xox[a-zA-Z]-[a-zA-Z0-9-]+)\"",
    'SSH_privKey' : r"([-]+BEGIN [^\s]+ PRIVATE KEY[-]+[\s]*[^-]*[-]+END [^\s]+ PRIVATE KEY[-]+)",
    'Heroku API KEY' : r'[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}',
    'possible_Creds' : r"(?i)(" \
        r"password\s*[`=:\"]+\s*[^\s]+|" \
        r"password is\s*[`=:\"]*\s*[^\s]+|" \
        r"pwd\s*[`=:\"]*\s*[^\s]+|" \
        r"passwd\s*[`=:\"]+\s*[^\s]+)",
}
_template = '''
<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <style>
        h1 {
            font-family: sans-serif;
        }
        a {
            color: #000;
        }
        .text {
            font-size: 16px;
            font-family: Helvetica, sans-serif;
            color: #323232;
            background-color: white;
        }
        .container {
            background-color: #e9e9e9;
            padding: 10px;
            margin: 10px 0;
            font-family: helvetica;
            font-size: 13px;
            border-width: 1px;
            border-style: solid;
            border-color: #8a8a8a;
            color: #323232;
            margin-bottom: 15px;
        }
        .button {
            padding: 17px 60px;
            margin: 10px 10px 10px 0;
            display: inline-block;
            background-color: #f4f4f4;
            border-radius: .25rem;
            text-decoration: none;
            -webkit-transition: .15s ease-in-out;
            transition: .15s ease-in-out;
            color: #333;
            position: relative;
        }
        .button:hover {
            background-color: #eee;
            text-decoration: none;
        }
        .github-icon {
            line-height: 0;
            position: absolute;
            top: 14px;
            left: 24px;
            opacity: 0.7;
        }
    </style>
    <title>LinkFinder Output</title>
</head>
<body contenteditable="true">
$$content$$
<a class='button' contenteditable='false' href='https://github.com/m4ll0k/SecretFinder/issues/new' rel='nofollow noopener noreferrer' target='_blank'><span class='github-icon'><svg height="24" viewbox="0 0 24 24" width="24" xmlns="http://www.w3.org/2000/svg">
<path d="M9 19c-5 1.5-5-2.5-7-3m14 6v-3.87a3.37 3.37 0 0 0-.94-2.61c3.14-.35 6.44-1.54 6.44-7A5.44 5.44 0 0 0 20 4.77 5.07 5.07 0 0 0 19.91 1S18.73.65 16 2.48a13.38 13.38 0 0 0-7 0C6.27.65 5.09 1 5.09 1A5.07 5.07 0 0 0 5 4.77a5.44 5.44 0 0 0-1.5 3.78c0 5.42 3.3 6.61 6.44 7A3.37 3.37 0 0 0 9 18.13V22" fill="none" stroke="#000" stroke-linecap="round" stroke-linejoin="round" stroke-width="2"></path></svg></span> Report an issue.</a>
</body>
</html>
'''
def parser_error(msg):
    print('Usage: python %s [OPTIONS] use -h for help'%sys.argv[0])
    print('Error: %s'%msg)
    sys.exit(0)
def getContext(matches,content,name,rex='.*?'):
    ''' get context '''
    items = []
    matches2 = []
    for i in [x[0] for x in matches]:
        if i not in matches2:
            matches2.append(i)
    for m in matches2:
        context = re.findall('%s%s%s'%(rex,re.escape(m),rex),content,re.IGNORECASE)
        item = {
            'matched' : m,
            'name' : name,
            'context' : context,
            'multi_context' : True if len(context) > 1 else False
        }
        items.append(item)
    return items
def parser_file(content,mode=1,more_regex=None,no_dup=1):
    ''' parser file '''
    if mode == 1:
        if len(content) > 1000000:
            content = content.replace(";",";\r\n").replace(",",",\r\n")
        else:
            content = jsbeautifier.beautify(content)
    all_items = []
    for regex in _regex.items():
        r = re.compile(regex[1],re.VERBOSE|re.I)
        if mode == 1:
            all_matches = [(m.group(0),m.start(0),m.end(0)) for m in re.finditer(r,content)]
            items = getContext(all_matches,content,regex[0])
            if items != []:
                all_items.append(items)
        else:
            items = [{
                'matched' : m.group(0),
                'context' : [],
                'name' : regex[0],
                'multi_context' : False
            } for m in re.finditer(r,content)]
            if items != []:
                all_items.append(items)
    if all_items != []:
        k = []
        for i in range(len(all_items)):
            for ii in all_items[i]:
                if ii not in k:
                    k.append(ii)
        if k != []:
            all_items = k
    if no_dup:
        all_matched = set()
        no_dup_items = []
        for item in all_items:
            if item != [] and type(item) is dict:
                if item['matched'] not in all_matched:
                    all_matched.add(item['matched'])
                    no_dup_items.append(item)
        all_items = no_dup_items
    filtered_items = []
    if all_items != []:
        for item in all_items:
            if more_regex:
                if re.search(more_regex,item['matched']):
                    filtered_items.append(item)
            else:
                filtered_items.append(item)
    return filtered_items
def parser_input(input):
    ''' Parser Input '''
    # method 1 - url
    schemes = ('http://','https://','ftp://','file://','ftps://')
    if input.startswith(schemes):
        return [input]
    # method 2 - view-source: url from the firefox/chrome inspector
    if input.startswith('view-source:'):
        return [input[12:]]
    # method 3 - Burp file
    if args.burp:
        jsfiles = []
        items = []
        try:
            items = xml.etree.ElementTree.fromstring(open(args.input,'r').read())
        except Exception as err:
            print(err)
            sys.exit()
        for item in items:
            jsfiles.append(
                {
                    'js': base64.b64decode(item.find('response').text).decode('utf-8','replace'),
                    'url': item.find('url').text
                }
            )
        return jsfiles
    # method 4 - folder with a wildcard
    if '*' in input:
        # grab the file extension from the input, e.g. '.js' ('' for a bare '*')
        extension = os.path.splitext(input)[1]
        # build a recursive search pattern rooted at the input's directory
        pattern = os.path.join(os.path.abspath(os.path.dirname(input)), '**', f'*{extension}')
        # find matching files recursively
        paths = glob.glob(pattern, recursive=True)
        paths = [path.replace('\\', '/') for path in paths]
        for index, path in enumerate(paths):
            paths[index] = "file:///%s" % path
        return (paths if len(paths) > 0 else parser_error('Input with wildcard does not match any files.'))
    # method 5 - local file
    path = "file:///%s" % os.path.abspath(input).replace('\\', '/')
    return [path if os.path.exists(input.replace('\\', '/')) else parser_error('file could not be found (maybe you forgot to add http/https).')]
def html_save(output):
    ''' html output '''
    try:
        text_file = open(args.output,"wb")
        text_file.write(_template.replace('$$content$$',output).encode('utf-8'))
        text_file.close()
        print('URL to access output: file://%s'%os.path.abspath(args.output))
        file = 'file:///%s'%(os.path.abspath(args.output))
        # silence stdout only while the browser/xdg-open starts, then restore it;
        # the original code redirected stdout before printing, swallowing the URL
        hide = os.dup(1)
        os.close(1)
        os.open(os.devnull,os.O_RDWR)
        try:
            if sys.platform == 'linux' or sys.platform == 'linux2':
                subprocess.call(['xdg-open',file])
            else:
                webbrowser.open(file)
        finally:
            os.dup2(hide,1)
    except Exception as err:
        print('Output can\'t be saved in %s due to exception: %s'%(args.output,err))
def cli_output(matched):
    ''' cli output '''
    for match in matched:
        print(match.get('name')+'\t->\t'+match.get('matched').encode('ascii','ignore').decode('utf-8'))
def urlParser(url):
    ''' urlParser '''
    parse = urlparse(url)
    urlParser.this_root = parse.scheme + '://' + parse.netloc
    # base directory of the page, used to resolve relative script src values
    urlParser.this_path = parse.scheme + '://' + parse.netloc + parse.path.rsplit('/',1)[0] + '/'
def extractjsurl(content,base_url):
    ''' JS url extract from html page '''
    soup = html.fromstring(content)
    all_src = []
    urlParser(base_url)
    for src in soup.xpath('//script'):
        src = src.xpath('@src')[0] if src.xpath('@src') != [] else []
        if src != []:
            if src.startswith(('http://','https://','ftp://','ftps://')):
                if src not in all_src:
                    all_src.append(src)
            elif src.startswith('//'):
                src = 'http://'+src[2:]
                if src not in all_src:
                    all_src.append(src)
            elif src.startswith('/'):
                src = urlParser.this_root + src
                if src not in all_src:
                    all_src.append(src)
            else:
                src = urlParser.this_path + src
                if src not in all_src:
                    all_src.append(src)
    # drop JS urls containing any of the ignore strings (-g string;string2)
    if args.ignore and all_src != []:
        return [src for src in all_src if not any(i in src for i in args.ignore.split(';'))]
    # keep only JS urls containing one of the given strings (-n string;string2)
    if args.only:
        return [src for src in all_src if any(i in src for i in args.only.split(';'))]
    return all_src
def send_request(url):
    ''' Send Request '''
    # read local file
    # https://github.com/dashea/requests-file
    if 'file://' in url:
        s = requests.Session()
        s.mount('file://',FileAdapter())
        return s.get(url).content.decode('utf-8','replace')
    # set headers and cookies
    headers = {}
    default_headers = {
        'User-Agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36',
        'Accept' : 'text/html, application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'Accept-Language' : 'en-US,en;q=0.8',
        'Accept-Encoding' : 'gzip'
    }
    if args.headers:
        for i in args.headers.split('\\n'):
            # strip spaces and split on the first ':' only, so values may contain ':'
            name,value = i.replace(' ','').split(':',1)
            headers[name] = value
    # add cookies
    if args.cookie:
        headers['Cookie'] = args.cookie
    # fill in defaults without overriding user-supplied headers
    for name,value in default_headers.items():
        headers.setdefault(name,value)
    # proxy
    proxies = {}
    if args.proxy:
        proxies.update({
            'http' : args.proxy,
            'https' : args.proxy,
            # ftp
        })
    try:
        resp = requests.get(
            url = url,
            verify = False,
            headers = headers,
            proxies = proxies
        )
        return resp.content.decode('utf-8','replace')
    except Exception as err:
        print(err)
        sys.exit(0)
if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-e","--extract",help="Extract all javascript links located in a page and process it",action="store_true",default=False)
    parser.add_argument("-i","--input",help="Input a: URL, file or folder",required=True,action="store")
    parser.add_argument("-o","--output",help="Where to save the file, including file name. Default: output.html",action="store", default="output.html")
    parser.add_argument("-r","--regex",help="RegEx for filtering purposes against found endpoint (e.g: ^/api/)",action="store")
    parser.add_argument("-b","--burp",help="Support burp exported file",action="store_true")
    parser.add_argument("-c","--cookie",help="Add cookies for authenticated JS files",action="store",default="")
    parser.add_argument("-g","--ignore",help="Ignore js url, if it contain the provided string (string;string2..)",action="store",default="")
    parser.add_argument("-n","--only",help="Process js url, if it contain the provided string (string;string2..)",action="store",default="")
    parser.add_argument("-H","--headers",help="Set headers (\"Name:Value\\nName:Value\")",action="store",default="")
    parser.add_argument("-p","--proxy",help="Set proxy (host:port)",action="store",default="")
    args = parser.parse_args()
    if args.input[-1:] == "/":
        # /aa/ -> /aa
        args.input = args.input[:-1]
    mode = 1
    if args.output == "cli":
        mode = 0
    # register the user-supplied regex
    if args.regex:
        # validate the regular expression
        try:
            r = re.search(args.regex,''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(random.randint(10,50))))
        except Exception as e:
            print('your python regex isn\'t valid')
            sys.exit()
        _regex.update({
            'custom_regex' : args.regex
        })
    if args.extract:
        content = send_request(args.input)
        urls = extractjsurl(content,args.input)
    else:
        # convert input to URLs or JS files
        urls = parser_input(args.input)
    # fetch each target and scan it
    output = ''
    for url in urls:
        if not args.burp:
            file = send_request(url)
        else:
            # burp input items are dicts holding the response body and the url
            file = url.get('js')
            url = url.get('url')
        print('[ + ] URL: '+url)
        matched = parser_file(file,mode)
        if args.output == 'cli':
            cli_output(matched)
        else:
            output += '<h1>File: <a href="%s" target="_blank" rel="nofollow noopener noreferrer">%s</a></h1>'%(escape(url),escape(url))
            for match in matched:
                _matched = match.get('matched')
                _named = match.get('name')
                header = '<div class="text">%s'%(_named.replace('_',' '))
                body = ''
                # the same match can appear in several contexts
                if match.get('multi_context'):
                    # remove duplicates
                    no_dup = []
                    for context in match.get('context'):
                        if context not in no_dup:
                            body += '</a><div class="container">%s</div></div>'%(context)
                            body = body.replace(
                                context,'<span style="background-color:yellow">%s</span>'%context)
                            no_dup.append(context)
                else:
                    context = match.get('context')[0] if len(match.get('context')) > 0 else ''
                    body += '</a><div class="container">%s</div></div>'%(context)
                    if context:
                        # highlight the matched context
                        body = body.replace(
                            context,'<span style="background-color:yellow">%s</span>'%context)
                output += header + body
    if args.output != 'cli':
        html_save(output)