Project repository:
https://gitee.com/hocker/unzip_gbk.git
Usage:
git clone https://gitee.com/hocker/unzip_gbk.git
cd unzip_gbk
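python unzip_gkb.py <directory|zip file>
If the argument is a directory, the script extracts every zip file found directly inside it (subdirectories are not searched).
Notes:
I previously uploaded a Python 2 version, but Python 2 is rarely used these days, so I made a few changes so that it runs under Python 3.
The original version, modified for Python 3: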
#!/usr/bin/env python3
import os
import sys
import zipfile

print("processing File " + sys.argv[1])
file = zipfile.ZipFile(sys.argv[1], "r")
for name in file.namelist():
    print("Extracting " + name)
    # Create the parent directory of the entry if it does not exist yet.
    pathname = os.path.dirname(name)
    if not os.path.exists(pathname) and pathname != "":
        os.makedirs(pathname)
    data = file.read(name)
    # Never overwrite an existing file (this also skips directory entries).
    if not os.path.exists(name):
        fo = open(name, "wb")
        fo.write(data)
        fo.close()
file.close()
After one round of cleanup:
#!/usr/bin/env python3
import os
import sys
import zipfile


def unzip_file(zip_filename):
    print("processing File " + zip_filename)
    file = zipfile.ZipFile(zip_filename, "r")
    # Extract into a folder named after the archive, relative to the current directory.
    target_dir = os.path.basename(os.path.splitext(zip_filename)[0])
    for name in file.namelist():
        save_name = os.path.join(target_dir, name)
        pathname = os.path.dirname(save_name)
        if not os.path.exists(pathname) and pathname != "":
            os.makedirs(pathname)
        data = file.read(name)
        # Never overwrite an existing file (this also skips directory entries).
        if not os.path.exists(save_name):
            fo = open(save_name, "wb")
            fo.write(data)
            fo.close()
            print("[Extracting]:%s" % (name))
        else:
            print('[Exist]: %s' % (name))
    file.close()


if __name__ == '__main__':
    if len(sys.argv) != 2:
        print("usage: %s <zip file|dir>" % sys.argv[0])
        sys.exit(-1)
    zip_filename = os.path.abspath(sys.argv[1])
    if os.path.isfile(zip_filename):
        unzip_file(zip_filename)
    elif os.path.isdir(zip_filename):
        # Extract every *.zip file directly inside the directory (not recursive).
        files = os.listdir(zip_filename)
        for f in files:
            ft = os.path.join(zip_filename, f)
            if os.path.splitext(f)[1].lower() != '.zip' or not os.path.isfile(ft):
                continue
            unzip_file(ft)
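There is still room for improvement: feel free to tidy up the output and the user experience, then copy the script into /usr/local/bin to use it from anywhere.
Later I found that file names were still garbled. The cause turned out to be that the zip library decodes an entry name as UTF-8 when the entry's 0x800 flag is set, and as cp437 otherwise. What we need here is GBK. The blogger I was reading simply patched the zipfile library directly, as follows: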
if flags & 0x800:
    # UTF-8 file names extension
    filename = filename.decode('utf-8')
else:
    # Historical ZIP filename encoding
    filename = filename.decode('gbk')
And in one more place:
if zinfo.flag_bits & 0x800:
    # UTF-8 filename
    fname_str = fname.decode("utf-8")
else:
    fname_str = fname.decode('gbk')
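To make the rule in these two patches easier to see, here is a minimal standalone sketch of the same decision. The function name decode_zip_name and its fallback parameter are made up for illustration and are not part of zipfile.

```python
def decode_zip_name(raw, flag_bits, fallback="gbk"):
    """Decode the raw bytes of a zip entry name.

    Hypothetical helper that mirrors the patched logic: honour the
    UTF-8 flag (0x800) when it is set, otherwise use the fallback
    encoding instead of the library's default cp437.
    """
    if flag_bits & 0x800:
        return raw.decode("utf-8")
    return raw.decode(fallback)


raw = "中文.txt".encode("gbk")           # bytes as a GBK-based zip tool would store them
garbled = raw.decode("cp437")            # what the unpatched library produces
fixed = decode_zip_name(raw, 0)          # what the patched rule produces
print(garbled != fixed, fixed == "中文.txt")
```

Decoding with cp437 never fails, because every byte value maps to some character. That is why the unpatched library returns garbled names instead of raising an error.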
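If I find a cleaner fix later I will come back and update this post. For now, patching the library works, and that is good enough!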
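One possible way to get the same result without touching the library is sketched below. It assumes the library decoded the name as cp437, so encoding it back to cp437 recovers the original bytes, which can then be decoded as GBK. The names gbk_name and unzip_gbk are hypothetical, and the sketch does not sanitize entry paths, so it is only suitable for archives you trust.

```python
import os
import zipfile


def gbk_name(info):
    """Return the entry name, re-decoded as GBK when the UTF-8 flag is unset."""
    if info.flag_bits & 0x800:
        return info.filename              # already decoded as UTF-8 by zipfile
    try:
        # Undo the cp437 decoding done by zipfile, then decode as GBK.
        return info.filename.encode("cp437").decode("gbk")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return info.filename              # not GBK after all: keep the name as-is


def unzip_gbk(zip_filename, dest_dir):
    """Extract zip_filename into dest_dir using the corrected names."""
    with zipfile.ZipFile(zip_filename, "r") as zf:
        for info in zf.infolist():
            target = os.path.join(dest_dir, gbk_name(info))
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
            with zf.open(info) as src, open(target, "wb") as dst:
                dst.write(src.read())
```

For example, unzip_gbk("archive.zip", "archive") would unpack into a folder named archive. On Python 3.11 and later, zipfile.ZipFile also accepts a metadata_encoding argument when reading, which may make this round trip unnecessary.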