前几天跑深度学习下载了一个数据集,结果是seq文件。百度如何转图片,尝试各种代码结果一直生成空文件夹,最后受一篇博客评论受启发,原来是分割字符错了。
python代码如下:
# Deal with .seq format for video sequence
# The .seq file is combined with images,
# so I split the file into several images with the image prefix
# "\xFF\xD8\xFF\xE0\x00\x10\x4A\x46\x49\x46".
import os.path
import fnmatch
import shutil
def open_save(file, savepath):
"""
read .seq file, and save the images into the savepath
:param file: .seq文件路径
:param savepath: 保存的图像路径
:return:
"""
# 读入一个seq文件,然后拆分成image存入savepath当中
f = open(file, 'rb+')
# 将seq文件的内容转化成str类型
string = f.read().decode('latin-1')
# splitstring是图片的前缀,可以理解成seq是以splitstring为分隔的多个jpg合成的文件
# splitstring就是seq文件的图片分割字符串
splitstring = "\x89\x50\x4E\x47\x0D\x0A\x1A\x0A"
# split函数做一个测试,因此返回结果的第一个是在seq文件中是空,因此后面省略掉第一个
"""
>>> a = ".12121.3223.4343"
>>> a.split('.')
['', '12121', '3223', '4343']
"""
# split .seq file into segment with the image prefix
strlist = string.split(splitstring)
f.close()
count = 0
# delete the image folder path if it exists
if os.path.exists(savepath):
shutil.rmtree(savepath)
# create the image folder path
if not os.path.exists(savepath):
os.makedirs(savepath)
# deal with file segment, every segment is an image except the first one
for img in strlist:
filename = str(count) + '.png'
filenamewithpath = os.path.join(savepath, filename)
# abandon the first one, which is filled with .seq header
if count > 0:
i = open(filenamewithpath, 'wb+')
i.write(splitstring.encode('latin-1'))
i.write(img.encode('latin-1'))
i.close()
count += 1
if __name__ == "__main__":
rootdir = "你的seq文件夹"
saveroot = "你的解压文件夹"
# walk in the rootdir, take down the .seq filename and filepath
for parent, dirnames, filenames in os.walk(rootdir):
for filename in filenames:
# check .seq file with suffix
# fnmatch 全称是 filename match,主要是用来匹配文件名是否符合规则的
if fnmatch.fnmatch(filename, '*.seq'):
# take down the filename with path of .seq file
thefilename = os.path.join(parent, filename)
# create the image folder by combining .seq file path with .seq filename
parent_path = parent
parent_path = parent_path.replace('\\', '/')
thesavepath = saveroot + '/' + parent_path.split('/')[-1] + '/' + filename.split('.')[0]
print("Filename=" + thefilename)
print("Savepath=" + thesavepath)
open_save(thefilename, thesavepath)
1、关于如何找分割字符串
分割字符代码段如下:
splitstring = "\x89\x50\x4E\x47\x0D\x0A\x1A\x0A"
这个分隔符需要下一个hex编辑器查看,我用的是Imhex。
这里就可以看见,这个seq文件是png格式的,分割代码 \x89\x50\x4E\x47\x0D\x0A\x1A\x0A。
修改后成功生成图片。
参考博客链接:
python3 处理seq文件_HockerF的博客-CSDN博客_python seq
在此也谢谢该作者。