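I'm trying to use Python's zipfile library to unzip a split ZIP file by concatenating all of the parts and then unzipping the final product, but I keep running into a "Bad magic number for file header" error from the library.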
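I'm writing a Python script that usually receives a single ZIP file, but on rare occasions receives a ZIP file that has been split into multiple parts (e.g., zip.001, zip.002, etc.). As far as I can tell, there is no easy way to handle this if the script has to be bundled with its dependencies for a Docker container. However, I stumbled on this SO answer, which explains that the parts can be concatenated into a single file and treated as one ZIP file. So my game plan is to concatenate all of the split files into one big ZIP file and then unzip that file. I created a test case from a video file with the following command (using the Mac terminal):
$ zip -s 5m test ch4_3.mp4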
I then concatenated all of my files together into test_video.zip (with the .zip part last).
If I go to my terminal and run unzip test_video.zip, the output is as follows:
$ unzip test_video.zip
Archive: test_video.zip
warning [test_video.zip]: zipfile claims to be last disk of a multi-part archive;
attempting to process anyway, assuming all parts have been concatenated
together in order. Expect "errors" and warnings...true multi-part support
doesn't exist yet (coming soon).
warning [test_video.zip]: 15728640 extra bytes at beginning or within zipfile
(attempting to process anyway)
file #1: bad zipfile offset (local header sig): 15728644
(attempting to re-compensate)
inflating: ch4_3.mp4
It seems to hit a few bumps along the way, but it extracts the file successfully. However, when I try to run the following code:
if not os.path.exists('output'):
    os.mkdir('output')
with zipfile.ZipFile('tester/test_video.zip', 'r') as z:
    z.extractall('output')
I get the following error:
---------------------------------------------------------------------------
BadZipFile Traceback (most recent call last)
in ()
2 os.mkdir('output')
3 with zipfile.ZipFile('tester/test_video.zip', 'r') as z:
----> 4 z.extractall('output')
~/anaconda3/lib/python3.6/zipfile.py in extractall(self, path, members, pwd)
1499
1500 for zipinfo in members:
-> 1501 self._extract_member(zipinfo, path, pwd)
1502
1503 @classmethod
~/anaconda3/lib/python3.6/zipfile.py in _extract_member(self, member, targetpath, pwd)
1552 return targetpath
1553
-> 1554 with self.open(member, pwd=pwd) as source, \
1555 open(targetpath, "wb") as target:
1556 shutil.copyfileobj(source, target)
~/anaconda3/lib/python3.6/zipfile.py in open(self, name, mode, pwd, force_zip64)
1371 fheader = struct.unpack(structFileHeader, fheader)
1372 if fheader[_FH_SIGNATURE] != stringFileHeader:
-> 1373 raise BadZipFile("Bad magic number for file header")
1374
1375 fname = zef_file.read(fheader[_FH_FILENAME_LENGTH])
BadZipFile: Bad magic number for file header
If I try running it with the .zip file placed before the other files, this is what I run:
split_files = ['test.zip', 'test.z01', 'test.z02', 'test.z03']
with open('test_video.zip', 'wb') as f:
    for file in split_files:
        with open(file, 'rb') as zf:
            f.write(zf.read())
with zipfile.ZipFile('test_video.zip', 'r') as z:
    z.extractall('output')
The output is as follows:
---------------------------------------------------------------------------
BadZipFile Traceback (most recent call last)
in ()
1 if not os.path.exists('output'):
2 os.mkdir('output')
----> 3 with zipfile.ZipFile('test_video.zip', 'r') as z:
4 z.extractall('output')
~/anaconda3/lib/python3.6/zipfile.py in __init__(self, file, mode, compression, allowZip64)
1106 try:
1107 if mode == 'r':
-> 1108 self._RealGetContents()
1109 elif mode in ('w', 'x'):
1110 # set the modified flag so central directory gets written
~/anaconda3/lib/python3.6/zipfile.py in _RealGetContents(self)
1173 raise BadZipFile("File is not a zip file")
1174 if not endrec:
-> 1175 raise BadZipFile("File is not a zip file")
1176 if self.debug > 1:
1177 print(endrec)
BadZipFile: File is not a zip file
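For context, the traceback above shows zipfile raising this error when it cannot find an end-of-central-directory record (`endrec`), and it only looks for that record near the end of the file. With test.zip written first, the record probably ends up in the middle of the concatenated file. The following is a rough diagnostic sketch, not part of zipfile's public API: the helper name `find_eocd` and the returned dictionary keys are made up for illustration, and the struct layout is the fixed 22-byte end record from the ZIP format.

```python
import struct

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FORMAT = "<4s4H2LH"                  # fixed part of the end record
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)  # 22 bytes


def find_eocd(path):
    """Locate the last end-of-central-directory record in a file.

    Returns None if no signature is found. The signature could in principle
    also occur inside compressed data, so treat the result as a hint only.
    """
    with open(path, "rb") as f:
        data = f.read()  # whole file in memory; acceptable for a one-off check
    pos = data.rfind(EOCD_SIGNATURE)
    if pos < 0 or pos + EOCD_SIZE > len(data):
        return None
    (_sig, disk_no, cd_disk, entries_here, entries_total,
     cd_size, cd_offset, comment_len) = struct.unpack(
        EOCD_FORMAT, data[pos:pos + EOCD_SIZE])
    return {
        "offset": pos,                    # where the record starts in the file
        "disk_number": disk_no,           # nonzero for the last part of a split archive
        "central_dir_disk": cd_disk,
        "entries_total": entries_total,
        "central_dir_offset": cd_offset,  # relative to the start of its own part
        # bytes left over after the record and its comment
        "trailing_bytes": len(data) - (pos + EOCD_SIZE + comment_len),
    }
```

Calling `find_eocd('test_video.zip')` on each concatenation order should show the difference: a `trailing_bytes` value in the megabytes would mean the record is too far from the end for zipfile to find, and a nonzero `disk_number` would match unzip's "last disk of a multi-part archive" warning.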
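Using the answer from this SO question, I worked out that the header is b'PK\x07\x08', but I don't know why. I also used the testzip() function, and it pointed straight at the culprit: ch4_3.mp4.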
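For reference, the ZIP format specification uses b'PK\x07\x08' both as the data descriptor signature and as the marker written in the first 4 bytes of the first segment of a split or spanned archive, whereas a local file header starts with b'PK\x03\x04'. Below is a small sketch for checking which signature each part starts with. The helper name `leading_signature` and the lookup table are illustrative only, and the part names in the usage loop are the ones from the test case above.

```python
import os

# Four-byte signatures defined by the ZIP format.
KNOWN_SIGNATURES = {
    b"PK\x03\x04": "local file header",
    b"PK\x01\x02": "central directory file header",
    b"PK\x05\x06": "end of central directory record",
    b"PK\x06\x06": "zip64 end of central directory record",
    b"PK\x06\x07": "zip64 end of central directory locator",
    b"PK\x07\x08": "data descriptor / split-archive marker",
}


def leading_signature(path):
    """Return the first four bytes of a file and a label for them."""
    with open(path, "rb") as f:
        magic = f.read(4)
    return magic, KNOWN_SIGNATURES.get(magic, "unknown")


if __name__ == "__main__":
    # Part names assumed from the test case; missing files are skipped.
    for part in ["test.z01", "test.z02", "test.z03", "test.zip"]:
        if os.path.exists(part):
            print(part, *leading_signature(part))
```

Only the first part would be expected to start with a recognizable signature; later parts typically begin in the middle of the compressed data, so "unknown" there is not necessarily a sign of corruption.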
You can find the ZIP file in question at this link here. Any ideas on how to fix this?