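Chinese filename upload problem with secure_filename in Flask (werkzeug.utils)
Today, while using secure_filename from werkzeug.utils, I found that it has trouble with uploaded files whose names are in Chinese.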
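The project layout is:
-templates
    -upload.html    <- a simple HTML upload form
-upload             <- uploaded files are stored in this folder
    -XX.XX          <- an uploaded file ends up here
-root.py            <- the script that handles the upload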
upload.html is very simple: just an HTML5 file upload form whose file input is named filename.
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>
    <form action=""
          method="post"
          enctype="multipart/form-data">
        <input type="file" name="filename">
        <input type="submit" value="Upload">
    </form>
</body>
</html>
The upload folder is empty before anything has been uploaded.
root.py
# -*- coding:utf-8 -*-
from os import path
from unicodedata import normalize

from flask import Flask, request, render_template, url_for, redirect
from werkzeug.utils import secure_filename

app = Flask(__name__)


@app.route('/upload/', methods=["GET", "POST"])
def upload():
    if request.method == 'POST':
        # 'filename' is the name of the file input in upload.html
        f = request.files['filename']
        # Save into the upload folder that sits next to this file
        upload_path = path.join(path.abspath(path.dirname(__file__)), 'upload')
        f.save(upload_path + '/'
               + secure_filename(normalize('NFKD', f.filename).encode('utf-8', 'strict').decode('utf-8')))
        return redirect(url_for('upload'))
    return render_template('upload.html')


if __name__ == '__main__':
    app.run(debug=True)
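I copied this code from the web. It does save uploads into the upload folder, but when the file has a Chinese name, this is what happens after uploading:
the Chinese part of the filename is simply gone.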
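To see why the name vanishes, here is a small stdlib-only sketch that mirrors the ASCII steps of secure_filename as quoted later in this post. The helper name ascii_only_name and the sample filenames are made up for illustration, and the function only imitates the relevant steps; it is not Werkzeug's actual implementation.

```python
import re
from unicodedata import normalize

# Same character class as _filename_ascii_strip_re quoted later in this post
_ascii_strip_re = re.compile(r'[^A-Za-z0-9_.-]')


def ascii_only_name(filename):
    """Hypothetical helper imitating the ASCII-only cleanup steps."""
    # encode('ascii', 'ignore') silently drops every non-ASCII character
    filename = normalize('NFKD', filename).encode('ascii', 'ignore').decode('ascii')
    # Whitespace becomes '_', disallowed characters are removed,
    # and leading/trailing dots and underscores are stripped
    return _ascii_strip_re.sub('', '_'.join(filename.split())).strip('._')


print(ascii_only_name('测试文档.txt'))     # txt
print(ascii_only_name('report 2020.txt'))  # report_2020.txt
```

With a name made up entirely of Chinese characters, only the extension text survives, which matches what shows up in the upload folder.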
So I patched things a bit, and after that it worked.
Here is what I did.
In root.py, take the line
f.save(upload_path + '/' + secure_filename(normalize('NFKD', f.filename).encode('ascii', 'ignore').decode('ascii')))
and change the ascii encoding to utf-8, that is:
f.save(upload_path + '/' + secure_filename(normalize('NFKD', f.filename).encode('utf-8', 'ignore').decode('utf-8')))
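The effect of this change can be checked in isolation. The sketch below (the sample filename is an assumption for illustration) shows that the ascii/ignore round trip throws the Chinese characters away, while the utf-8 round trip returns the string unchanged. So this edit only stops root.py from dropping the characters itself; an unpatched secure_filename will still strip them.

```python
from unicodedata import normalize

# Sample filename, assumed for illustration
name = normalize('NFKD', '测试文档.txt')

# encode('ascii', 'ignore') silently drops the characters ASCII cannot represent
ascii_round_trip = name.encode('ascii', 'ignore').decode('ascii')

# UTF-8 can represent these characters, so the round trip changes nothing
utf8_round_trip = name.encode('utf-8', 'ignore').decode('utf-8')

print(repr(ascii_round_trip))   # '.txt'
print(utf8_round_trip == name)  # True
```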
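Next, locate werkzeug.utils
and open utils.py.
At line 30, after
_filename_ascii_strip_re = re.compile(r'[^A-Za-z0-9_.-]')
add one more line:
_filename_gbk_strip_re = re.compile(u"[^\u4e00-\u9fa5A-Za-z0-9_.-]")
This is a modified version of the regex that also allows the Unicode range of Chinese characters (U+4E00 to U+9FA5).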
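A quick way to get a feel for what this pattern keeps and removes is to run it on a few names. This is only a sketch: strip_disallowed is a hypothetical helper and the filenames are made up. Note that the range \u4e00-\u9fa5 covers Chinese characters only, so Japanese kana or full-width punctuation are still stripped.

```python
import re

# The pattern added to utils.py in this post
_filename_gbk_strip_re = re.compile(u"[^\u4e00-\u9fa5A-Za-z0-9_.-]")


def strip_disallowed(name):
    """Hypothetical helper: remove every character the pattern does not allow."""
    return _filename_gbk_strip_re.sub('', name)


# Chinese characters survive, ASCII parentheses are removed
print(strip_disallowed('测试文档(1).txt'))   # 测试文档1.txt
# Katakana lies outside U+4E00..U+9FA5, so it is removed
print(strip_disallowed('テスト.txt'))         # .txt
# The full-width comma is removed as well
print(strip_disallowed('文档，最终版.txt'))   # 文档最终版.txt
```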
Then find the secure_filename function.
At lines 280 and 282 there are two occurrences of ascii; change both to utf-8.
That is, change
if isinstance(filename, text_type):
    from unicodedata import normalize
    filename = normalize('NFKD', filename).encode('ascii', 'ignore')
    if not PY2:
        filename = filename.decode('ascii')
to
if isinstance(filename, text_type):
    from unicodedata import normalize
    filename = normalize('NFKD', filename).encode('utf-8', 'ignore')
    if not PY2:
        filename = filename.decode('utf-8')
Finally, on line 286, replace _filename_ascii_strip_re with _filename_gbk_strip_re, that is, change
filename = str(_filename_ascii_strip_re.sub('', '_'.join(
    filename.split()))).strip('._')
to
filename = str(_filename_gbk_strip_re.sub('', '_'.join(
    filename.split()))).strip('._')
After that, uploads with Chinese filenames work as expected.
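Edits made to an installed package are lost whenever Werkzeug is reinstalled or upgraded, so the same idea can also live in your own code. Below is a minimal sketch of that alternative, which is not the approach described above: secure_filename_cjk is a hypothetical helper name, and it only mirrors the steps shown in this post. It does not reproduce everything Werkzeug's secure_filename does, such as its handling of reserved Windows device names.

```python
import os
import re
from unicodedata import normalize

# Same pattern as the one added to utils.py in this post
_cjk_strip_re = re.compile(u"[^\u4e00-\u9fa5A-Za-z0-9_.-]")


def secure_filename_cjk(filename):
    """Hypothetical helper: clean a filename but keep Chinese characters."""
    filename = normalize('NFKD', filename)
    # Turn path separators into spaces so no directory part survives
    for sep in (os.path.sep, os.path.altsep):
        if sep:
            filename = filename.replace(sep, ' ')
    # Whitespace becomes '_', disallowed characters are removed,
    # and leading/trailing dots and underscores are stripped
    return _cjk_strip_re.sub('', '_'.join(filename.split())).strip('._')


print(secure_filename_cjk('测试文档.txt'))      # 测试文档.txt
print(secure_filename_cjk('../../etc/passwd'))  # etc_passwd
```

In root.py, the f.save call would then use secure_filename_cjk(f.filename) in place of secure_filename(...). The result can be an empty string when nothing in the name survives, so it is worth checking for that before saving.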