Project location: Python
Code location: DataAnalysis.py
The data directory of the Python project holds the values to analyze.
file = './data/neutral.txt'
# These are the known categories
names = ['drawings', 'hentai', 'neutral', 'porn', 'sexy']
itemList = []
with open(file) as f:
    item = {}
    for line in f:
        line = line.strip()
        # Each line ending in '.jpg' marks the start of a new item
        if line.endswith('.jpg'):
            if item:
                itemList.append(item)
            item = {'pic': line}
        else:
            # Score lines look like: hentai (score = 0.82976)
            splits = line.split()
            if len(splits) > 3:
                key = splits[0].strip()
                # Store the score as a float so it is compared numerically
                item[key] = float(splits[3].strip(')'))
    # Append the last item to the list
    if item:
        itemList.append(item)
maxList = []
itemNumber = len(itemList)
for i in itemList:
    pic = i.get('pic')
    # Compare only the category scores, not the 'pic' path
    scores = {k: v for k, v in i.items() if k != 'pic'}
    if not scores:
        continue
    maxKey = max(scores, key=scores.get)
    # if maxKey == 'porn':
    #     print(pic + " : " + str(maxKey))
    maxList.append(maxKey)
for n in names:
    print('%8s : %d' % (n, maxList.count(n)))
    # print(n + ' : ' + str(maxList.count(n)))
The content to analyze looks like this:
./test/hentai/4b8023e0-c673-4a6f-8589-603ee9d93855.jpg
hentai (score = 0.82976)
drawings (score = 0.07327)
porn (score = 0.05042)
sexy (score = 0.04333)
neutral (score = 0.00321)
./test/hentai/2b383b71-b72b-4d71-8fce-a64cbc604eea.jpg
hentai (score = 0.47422)
drawings (score = 0.34337)
neutral (score = 0.12827)
porn (score = 0.03460)
sexy (score = 0.01954)
./test/hentai/331da44a-5cbf-40dc-b91e-8ba91e652875.jpg
hentai (score = 0.70981)
drawings (score = 0.22027)
porn (score = 0.03091)
sexy (score = 0.02160)
neutral (score = 0.01741)
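The layout above (a path ending in .jpg followed by `label (score = value)` lines) can also be parsed with a regular expression instead of splitting on spaces. The following is a minimal sketch, not the code from DataAnalysis.py: the names `SCORE_LINE` and `parse_results` are made up for illustration, and it assumes every score line matches the layout shown in the sample.

```python
import re

# Hypothetical pattern for score lines such as "hentai (score = 0.82976)".
SCORE_LINE = re.compile(r'^(\w+) \(score = (\d+(?:\.\d+)?)\)$')


def parse_results(text):
    """Parse classifier output into a list of (picture_path, {label: score})."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith('.jpg'):
            # A picture path starts a new record with an empty score table.
            records.append((line, {}))
            continue
        match = SCORE_LINE.match(line)
        if match and records:
            label, score = match.groups()
            # Scores are stored as floats so they compare numerically.
            records[-1][1][label] = float(score)
    return records


# Two records copied from the sample above.
sample = """./test/hentai/4b8023e0-c673-4a6f-8589-603ee9d93855.jpg
hentai (score = 0.82976)
drawings (score = 0.07327)
porn (score = 0.05042)
sexy (score = 0.04333)
neutral (score = 0.00321)
./test/hentai/2b383b71-b72b-4d71-8fce-a64cbc604eea.jpg
hentai (score = 0.47422)
drawings (score = 0.34337)
neutral (score = 0.12827)
porn (score = 0.03460)
sexy (score = 0.01954)
"""

for path, scores in parse_results(sample):
    top = max(scores, key=scores.get)
    print(path, top, scores[top])
```

Keeping the path separate from the score table means `max` only ever sees numeric values, so the top label cannot be confused with the `pic` entry.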
DataAnalysis.py prints:
drawings : 77
  hentai : 10
 neutral : 1783
    porn : 77
    sexy : 53
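This gives a rough picture of the classification: of the 2,000 images counted, 1,783 (about 89%) scored highest as neutral.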
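To read the counts as proportions, the tally can be done with `collections.Counter` and each count divided by the total. This is a small sketch under assumptions: `summarize` is a hypothetical helper, and the label list is rebuilt from the counts printed above purely for illustration, where the real script would pass its own `maxList`.

```python
from collections import Counter

names = ['drawings', 'hentai', 'neutral', 'porn', 'sexy']


def summarize(top_labels, names):
    """Return (label, count, share) rows for each known label."""
    counts = Counter(top_labels)
    total = sum(counts[n] for n in names)
    return [(n, counts[n], counts[n] / total if total else 0.0) for n in names]


# Illustration only: rebuild a label list from the counts reported above.
reported = {'drawings': 77, 'hentai': 10, 'neutral': 1783, 'porn': 77, 'sexy': 53}
top_labels = [label for label, count in reported.items() for _ in range(count)]

for label, count, share in summarize(top_labels, names):
    print('%8s : %4d (%.1f%%)' % (label, count, share * 100))
```

`Counter` returns 0 for labels that never appear, so a category with no hits still gets a row.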