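Preface
1. The previous post outlined the structure of the official VOC2007 dataset and what each directory is for. This post explains how to build your own dataset in VOC2007 format and use it for training.
2. Before you start, you need two things: images that contain the objects you want to train on, and LabelImg, the tool used to annotate them.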
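I. Create the directories
1. Create a VOC2007 directory, and inside it create three empty directories: Annotations, ImageSets, and JPEGImages. At this point VOC2007 contains only these three empty directories.
2. Create a directory named Main inside ImageSets.
3. Put all the images to be annotated into JPEGImages.
4. Rename the images in JPEGImages.
Use Python to rename every image in the directory with an increasing number. The script is below; change the path to your own and run it from a terminal.
Create a Python script in your home directory:
vim rename.py
Enter the following code, changing the path to your own: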
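import os

def rename():
    # Directory holding the images to rename; change this to your own path.
    path = "/home/matt/data/VOC2007/JPEGImages/"
    ex = 0             # fixed prefix placed in front of the counter
    filetype = ".jpg"  # only the name changes, so the files must already be JPEGs
    count = 1
    # Sort the listing so the numbering is the same on every run.
    for file in sorted(os.listdir(path)):
        olddir = os.path.join(path, file)
        if os.path.isdir(olddir):
            continue  # skip subdirectories
        # Zero-pad the counter to 5 digits: 000001.jpg, 000002.jpg, ...
        p = str(count).zfill(5)
        newdir = os.path.join(path, str(ex) + p + filetype)
        # os.rename silently replaces an existing file on Linux, so stop instead.
        if olddir != newdir and os.path.exists(newdir):
            raise FileExistsError(newdir)
        os.rename(olddir, newdir)
        count += 1

rename()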
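Save and exit, then run the script from a terminal. sudo is only needed if your user cannot write to the image directory:
sudo python ./rename.py
When it finishes, the file names look like the figure below (000001.jpg, 000002.jpg, and so on).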
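Steps 1 and 2 of this section can also be scripted instead of creating the directories by hand. The following is a minimal sketch, not part of the original workflow; the function name `create_voc_skeleton` and the `root` argument are made up for illustration.

```python
import os

def create_voc_skeleton(root):
    """Create the empty VOC2007 directory layout under `root` (hypothetical helper)."""
    voc_dir = os.path.join(root, "VOC2007")
    # Creating ImageSets/Main also creates ImageSets itself.
    subdirs = ["Annotations", os.path.join("ImageSets", "Main"), "JPEGImages"]
    for sub in subdirs:
        # exist_ok=True makes the call safe to repeat on an existing layout.
        os.makedirs(os.path.join(voc_dir, sub), exist_ok=True)
    return voc_dir
```

Calling `create_voc_skeleton("/home/matt/data")` would produce the same /home/matt/data/VOC2007 layout that the scripts in this post expect.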
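II. Annotate the data with LabelImg
1. Open LabelImg and load all the images under JPEGImages.
In the LabelImg directory, run this in a terminal:
./labelImg.py
Once LabelImg is open, choose Open Dir and select VOC2007/JPEGImages/.
LabelImg loads all the images in that directory.
2. Choose where the xml files are saved. Select the Annotations folder under VOC2007, and choose VOC as the annotation format.
3. Start annotating.
Open an image, click Create RectBox, and drag a box around the object you want to train on. When the box is finished, a label dialog pops up; enter the object's name. If the object is hard to recognize within the overall scene, tick difficult, then click OK.
Then save.
An xml file with the same name as the image appears in VOC2007/Annotations.
Opening the file shows its contents: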
<annotation>
<folder>JPEGImages</folder>
<filename>000000.jpg</filename>
<path>/home/linux/caffe/caffe_ssd/data/VOCdevkit/VOC2007/JPEGImages/000000.jpg</path>
<source>
<database>Unknown</database>
</source>
<size>
<width>700</width>
<height>504</height>
<depth>1</depth>
</size>
<segmented>0</segmented>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>156</xmin>
<ymin>109</ymin>
<xmax>168</xmax>
<ymax>130</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>177</xmin>
<ymin>150</ymin>
<xmax>191</xmax>
<ymax>170</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>220</xmin>
<ymin>134</ymin>
<xmax>243</xmax>
<ymax>144</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>270</xmin>
<ymin>101</ymin>
<xmax>291</xmax>
<ymax>113</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>317</xmin>
<ymin>100</ymin>
<xmax>336</xmax>
<ymax>112</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>486</xmin>
<ymin>127</ymin>
<xmax>499</xmax>
<ymax>153</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>551</xmin>
<ymin>143</ymin>
<xmax>573</xmax>
<ymax>157</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>591</xmin>
<ymin>162</ymin>
<xmax>603</xmax>
<ymax>182</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>521</xmin>
<ymin>163</ymin>
<xmax>535</xmax>
<ymax>181</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>82</xmin>
<ymin>213</ymin>
<xmax>96</xmax>
<ymax>234</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>128</xmin>
<ymin>228</ymin>
<xmax>148</xmax>
<ymax>240</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>188</xmin>
<ymin>247</ymin>
<xmax>205</xmax>
<ymax>266</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>243</xmin>
<ymin>281</ymin>
<xmax>267</xmax>
<ymax>292</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>391</xmin>
<ymin>241</ymin>
<xmax>407</xmax>
<ymax>270</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>416</xmin>
<ymin>214</ymin>
<xmax>427</xmax>
<ymax>233</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>457</xmin>
<ymin>236</ymin>
<xmax>482</xmax>
<ymax>250</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>577</xmin>
<ymin>292</ymin>
<xmax>598</xmax>
<ymax>304</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>616</xmin>
<ymin>306</ymin>
<xmax>632</xmax>
<ymax>327</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>543</xmin>
<ymin>306</ymin>
<xmax>559</xmax>
<ymax>331</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>211</xmin>
<ymin>296</ymin>
<xmax>227</xmax>
<ymax>322</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>282</xmin>
<ymin>297</ymin>
<xmax>298</xmax>
<ymax>319</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>115</xmin>
<ymin>313</ymin>
<xmax>136</xmax>
<ymax>343</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>523</xmin>
<ymin>254</ymin>
<xmax>535</xmax>
<ymax>277</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>416</xmin>
<ymin>372</ymin>
<xmax>429</xmax>
<ymax>393</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>414</xmin>
<ymin>414</ymin>
<xmax>428</xmax>
<ymax>435</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>349</xmin>
<ymin>438</ymin>
<xmax>373</xmax>
<ymax>452</ymax>
</bndbox>
</object>
<object>
<name>R</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>352</xmin>
<ymin>358</ymin>
<xmax>372</xmax>
<ymax>366</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>118</xmin>
<ymin>115</ymin>
<xmax>136</xmax>
<ymax>130</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>137</xmin>
<ymin>88</ymin>
<xmax>152</xmax>
<ymax>103</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>323</xmin>
<ymin>78</ymin>
<xmax>341</xmax>
<ymax>93</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>345</xmin>
<ymin>118</ymin>
<xmax>359</xmax>
<ymax>129</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>466</xmin>
<ymin>258</ymin>
<xmax>485</xmax>
<ymax>274</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>466</xmin>
<ymin>111</ymin>
<xmax>479</xmax>
<ymax>126</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>589</xmin>
<ymin>188</ymin>
<xmax>605</xmax>
<ymax>202</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>459</xmin>
<ymin>422</ymin>
<xmax>479</xmax>
<ymax>436</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>462</xmin>
<ymin>367</ymin>
<xmax>478</xmax>
<ymax>384</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>330</xmin>
<ymin>411</ymin>
<xmax>346</xmax>
<ymax>426</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>135</xmin>
<ymin>247</ymin>
<xmax>153</xmax>
<ymax>262</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>171</xmin>
<ymin>225</ymin>
<xmax>187</xmax>
<ymax>240</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>371</xmin>
<ymin>224</ymin>
<xmax>387</xmax>
<ymax>240</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>100</xmin>
<ymin>297</ymin>
<xmax>117</xmax>
<ymax>314</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>280</xmin>
<ymin>327</ymin>
<xmax>299</xmax>
<ymax>340</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>211</xmin>
<ymin>327</ymin>
<xmax>229</xmax>
<ymax>340</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>504</xmin>
<ymin>236</ymin>
<xmax>518</xmax>
<ymax>252</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>330</xmin>
<ymin>368</ymin>
<xmax>348</xmax>
<ymax>383</ymax>
</bndbox>
</object>
<object>
<name>J</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>187</xmin>
<ymin>82</ymin>
<xmax>254</xmax>
<ymax>131</ymax>
</bndbox>
</object>
<object>
<name>J</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>390</xmin>
<ymin>92</ymin>
<xmax>447</xmax>
<ymax>141</ymax>
</bndbox>
</object>
<object>
<name>J</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>532</xmin>
<ymin>84</ymin>
<xmax>587</xmax>
<ymax>133</ymax>
</bndbox>
</object>
<object>
<name>J</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>225</xmin>
<ymin>220</ymin>
<xmax>284</xmax>
<ymax>266</ymax>
</bndbox>
</object>
<object>
<name>J</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>557</xmin>
<ymin>226</ymin>
<xmax>614</xmax>
<ymax>280</ymax>
</bndbox>
</object>
<object>
<name>D</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>259</xmin>
<ymin>386</ymin>
<xmax>278</xmax>
<ymax>404</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>378</xmin>
<ymin>413</ymin>
<xmax>394</xmax>
<ymax>427</ymax>
</bndbox>
</object>
<object>
<name>C</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>377</xmin>
<ymin>369</ymin>
<xmax>395</xmax>
<ymax>383</ymax>
</bndbox>
</object>
</annotation>
For an explanation of the fields in this file, see the earlier post on the official dataset; it is not repeated here.
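To inspect an annotation file without reading the xml by eye, a short sketch like the one below can help. It uses Python's standard `xml.etree.ElementTree` module; the function name `summarize_annotation` is hypothetical, and the sketch assumes every object has a name and a bndbox with integer coordinates, as in the file above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def summarize_annotation(xml_path):
    """Return (filename, per-class object counts, list of (name, box)) for one VOC xml file."""
    root = ET.parse(xml_path).getroot()
    filename = root.findtext("filename")
    counts = Counter()
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bndbox = obj.find("bndbox")
        # Box order is (xmin, ymin, xmax, ymax), all in pixels.
        box = tuple(int(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
        counts[name] += 1
        boxes.append((name, box))
    return filename, counts, boxes
```

The returned counts make it easy to spot a mistyped class name, since a typo shows up as an extra class with only a few objects.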
Then move on to the next image and repeat the steps above until every image is annotated.
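With many images it is easy to miss one. As a rough check, the sketch below compares the file names in JPEGImages and Annotations and reports any that have no partner. The function name `find_unpaired` is hypothetical, and the sketch assumes the images end in .jpg and the annotations in .xml.

```python
import os

def find_unpaired(images_dir, annotations_dir):
    """Return (images without an xml file, xml files without an image) as sorted name stems."""
    image_stems = {os.path.splitext(f)[0] for f in os.listdir(images_dir)
                   if f.lower().endswith(".jpg")}
    xml_stems = {os.path.splitext(f)[0] for f in os.listdir(annotations_dir)
                 if f.lower().endswith(".xml")}
    # Set difference keeps only the stems that appear on one side.
    return sorted(image_stems - xml_stems), sorted(xml_stems - image_stems)
```

Both returned lists should be empty before moving on. Note that an image with no objects in it may legitimately have no annotation, depending on how you choose to handle such images.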
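III. Generate the txt files
1. Once all the images are annotated, use a Python script to generate the txt files in the Main directory that list the training and test samples. Change the paths to your own.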
import os
import random

# Change these two paths to your own.
xmlfilepath = r'/home/matt/data/VOC2007/Annotations/'
saveBasePath = r"/home/matt/data/"

trainval_percent = 0.8  # share of all samples in trainval; the rest goes to test
train_percent = 0.8     # share of trainval in train; the rest goes to val

# Count only .xml files as annotations, sorted so the output order is stable.
total_xml = sorted(f for f in os.listdir(xmlfilepath) if f.endswith('.xml'))
num = len(total_xml)
indices = range(num)  # named `indices` so the built-in `list` is not shadowed
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval_list = random.sample(indices, tv)
train = set(random.sample(trainval_list, tr))
trainval = set(trainval_list)

print("train and val size", tv)
print("train size", tr)

main_dir = os.path.join(saveBasePath, 'VOC2007/ImageSets/Main')
# The with block closes all four files even if a write fails.
with open(os.path.join(main_dir, 'trainval.txt'), 'w') as ftrainval, \
     open(os.path.join(main_dir, 'test.txt'), 'w') as ftest, \
     open(os.path.join(main_dir, 'train.txt'), 'w') as ftrain, \
     open(os.path.join(main_dir, 'val.txt'), 'w') as fval:
    for i in indices:
        name = total_xml[i][:-4] + '\n'  # drop the ".xml" extension
        if i in trainval:
            ftrainval.write(name)
            if i in train:
                ftrain.write(name)
            else:
                fval.write(name)
        else:
            ftest.write(name)
2. Run the script above. It creates four txt files in the Main directory: trainval.txt, test.txt, train.txt, and val.txt.
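With both percentages set to 0.8, 100 annotation files give 80 names in trainval.txt (int(100 * 0.8)), 64 of them in train.txt (int(80 * 0.8)), the other 16 in val.txt, and the remaining 20 in test.txt. The sketch below is one way to confirm that the generated files are consistent; `read_ids` and `check_split` are hypothetical helper names, not part of the original script.

```python
import os

def read_ids(path):
    """Read one image id per line, ignoring blank lines."""
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}

def check_split(main_dir):
    """Return the size of each split, raising ValueError if the splits are inconsistent."""
    ids = {name: read_ids(os.path.join(main_dir, name + ".txt"))
           for name in ("train", "val", "trainval", "test")}
    if ids["train"] & ids["val"]:
        raise ValueError("train and val overlap")
    if ids["trainval"] != ids["train"] | ids["val"]:
        raise ValueError("trainval is not the union of train and val")
    if ids["trainval"] & ids["test"]:
        raise ValueError("trainval and test overlap")
    return {name: len(members) for name, members in ids.items()}
```

Calling `check_split("/home/matt/data/VOC2007/ImageSets/Main")` returns a dictionary of split sizes that can be compared against the numbers printed by the script.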
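Closing notes
1. After these steps you have a dataset in VOC2007 format. The next step is to convert it to lmdb format, following the same procedure used earlier for the official VOC2007 data.
2. The images I used were collected and organized by a classmate, so I have not uploaded them. If you need them, you can join this group (487350510).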