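When image file formats cause problems during data preprocessing, there are two ways to convert them:
1. Command line
Install parallel and convert (convert ships with ImageMagick):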
$ sudo yum install parallel
$ sudo yum install ImageMagick
(1) png to jpg
$ parallel convert '{}' '{.}.jpg' ::: *.png
(2) jpg to png
$ parallel convert '{}' '{.}.png' ::: *.jpg
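For reference, GNU parallel runs the command once per input file, replacing '{}' with the file name and '{.}' with the file name minus its last extension. The sketch below only illustrates that naming rule; the helper name expand is hypothetical, and it does not call ImageMagick.

```python
import os


def expand(filename, new_ext):
    """Approximate parallel's '{}' and '{.}' for one input file.

    Returns the (source, target) pair that a command such as
    convert '{}' '{.}.jpg' would receive.
    """
    # '{.}' drops only the last extension, like os.path.splitext.
    stem, _ = os.path.splitext(filename)
    return filename, stem + '.' + new_ext


for name in ['cat.png', 'scan.v2.png']:
    print(expand(name, 'jpg'))
```

Note that the original files are kept; each command writes a new file next to its source.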
2. Python
The contents of convert.py are as follows:
import argparse
import os

import cv2

parser = argparse.ArgumentParser(description='convert')
parser.add_argument('--srcpath', default='./source_dir/',
                    help='Source data path')
parser.add_argument('--tgtpath', default='./target_dir/',
                    help='Target data path')
parser.add_argument('--from', dest='srctype', default='png',
                    help='Source image type: jpg/png')
parser.add_argument('--to', dest='tgttype', default='jpg',
                    help='Target image type: jpg/png')
args = parser.parse_args()

path = args.srcpath
picture_type = args.srctype
newpath = args.tgtpath

# Create the target directory (including parents) if it does not exist.
os.makedirs(newpath, exist_ok=True)

number = 0
for filename in os.listdir(path):
    portion = os.path.splitext(filename)
    # Only convert files whose extension matches --from.
    if portion[1].lower() != '.' + picture_type.lower():
        continue
    img = cv2.imread(os.path.join(path, filename))
    if img is None:
        # cv2.imread returns None when the file cannot be decoded.
        print('skipped unreadable file ' + filename)
        continue
    # os.path.join keeps absolute target paths working.
    cv2.imwrite(os.path.join(newpath, portion[0] + '.' + args.tgttype), img)
    number += 1
print('converted ' + str(number) + ' images to ' + newpath)
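Run it with:
$ python convert.py --srcpath path/to/source/dir --tgtpath path/to/target/dir
The defaults are --from png and --to jpg; pass --from jpg --to png for the reverse direction.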
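Before converting a large directory, it can help to preview which files would be written. The following is a minimal sketch, assuming source files are selected by an extension that matches --from, compared case-insensitively. plan_conversion is a hypothetical helper and not part of convert.py; it only builds paths and never reads or writes an image.

```python
import os
import tempfile


def plan_conversion(srcpath, tgtpath, srctype='png', tgttype='jpg'):
    """Return sorted (source, target) path pairs without touching any image."""
    pairs = []
    for filename in sorted(os.listdir(srcpath)):
        stem, ext = os.path.splitext(filename)
        if ext.lower() != '.' + srctype.lower():
            continue  # extension does not match the source type, skip it
        pairs.append((os.path.join(srcpath, filename),
                      os.path.join(tgtpath, stem + '.' + tgttype)))
    return pairs


if __name__ == '__main__':
    # Demo on a throwaway directory with empty placeholder files.
    with tempfile.TemporaryDirectory() as src:
        for name in ['a.png', 'b.PNG', 'notes.txt']:
            open(os.path.join(src, name), 'w').close()
        for source, target in plan_conversion(src, 'target_dir'):
            print(source, '->', target)
```

In the demo, notes.txt is skipped and the two png files are each mapped to a .jpg path under target_dir.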