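It's been a while since my last post; I feel like I've been slacking off lately QAQ...
The code for this tutorial comes from this repository: https://github.com/timesler/facenet-pytorch
The project includes a PyTorch implementation of the MTCNN face detection module, plus InceptionResnetV1 weights ported from the TensorFlow version of facenet. Below I'll show how to train a simple face recognition model on your own dataset. I probably won't say much about the network architecture or the theory.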
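1. This project is built on the open-source facenet-pytorch code, which can be installed in three ways. (I ran everything in Colab, so shell commands are prefixed with an exclamation mark; drop it if you're working in a different environment.)
(1) Install directly with pip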
!pip install facenet-pytorch
(2) Clone from GitHub
!git clone https://github.com/timesler/facenet-pytorch.git facenet_pytorch
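(3) Download the source from GitHub and install it manually. This makes it easy to modify the code, and it's the approach I used.
Put the downloaded code at a known path, change the working directory to that path, and then install the whole package.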
import os
os.chdir("drive/My Drive/Colab Notebooks/facenet-pytorch-master")
!pip install .
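A quick way to confirm that the install worked, whichever route you took, is to ask Python whether the package can be found. This is only a small sketch using the standard library; `is_installed` is a helper name made up for this example and is not part of facenet-pytorch.

```python
import importlib.util

def is_installed(module_name):
    """Return True if Python can locate the given top-level module."""
    return importlib.util.find_spec(module_name) is not None

# The facenet-pytorch package is imported under the name facenet_pytorch
print("facenet_pytorch installed:", is_installed("facenet_pytorch"))
```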
2. Next, run examples/infer.ipynb from the code package to detect the faces in some images.
(1) Import the required packages
from facenet_pytorch import MTCNN, InceptionResnetV1
import torch
from torch.utils.data import DataLoader
from torchvision import datasets
import numpy as np
import pandas as pd
import os
workers = 0 if os.name == 'nt' else 4
(2) Check whether we're running on the GPU or the CPU
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print('Running on device: {}'.format(device))
(3) Create the MTCNN detection network. I kept all the parameters at their original values.
mtcnn = MTCNN(
    image_size=160, margin=0, min_face_size=20,
    thresholds=[0.6, 0.7, 0.7], factor=0.709, post_process=True,
    device=device
)
(4) Load the InceptionResnetV1 network pretrained on the VGGFace2 face dataset
resnet = InceptionResnetV1(pretrained='vggface2').eval().to(device)
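(5) Define the dataset loading, plus the mapping from class index back to class name
def collate_fn(x):
    # The DataLoader below uses the default batch_size of 1, so unwrap the single sample
    return x[0]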
dataset = datasets.ImageFolder('./data/test_images')
dataset.idx_to_class = {i:c for c, i in dataset.class_to_idx.items()}
loader = DataLoader(dataset, collate_fn=collate_fn, num_workers=workers)
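(6) Check the probability that each image contains a face
aligned = []
names = []
for x, y in loader:
    x_aligned, prob = mtcnn(x, return_prob=True)
    if x_aligned is not None:
        print('Face detected with probability: {:8f}'.format(prob))
        aligned.append(x_aligned)
        names.append(dataset.idx_to_class[y])
This prints the face probability for each image; there are 5 test samples here.
(7) Then use the resnet network defined above to compute an embedding vector for each aligned face. The smaller the distance between two embeddings e1 and e2, the more likely they belong to the same person. The output is a table of pairwise distances. To be honest I don't fully understand this part myself, so if you've figured it out, please leave a comment explaining it.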
aligned = torch.stack(aligned).to(device)
embeddings = resnet(aligned).detach().cpu()
dists = [[(e1 - e2).norm().item() for e2 in embeddings] for e1 in embeddings]
print(pd.DataFrame(dists, columns=names, index=names))
The result looks like this, and that's the face detection code done.
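To make the distance table a bit more concrete: each face is turned into a vector (an embedding), and the table holds the Euclidean distance between every pair of vectors. The sketch below does the same computation on tiny made-up 3-dimensional vectors in plain Python. The vectors, the names and the threshold of 1.0 are all assumptions for illustration; real embeddings have many more dimensions, and a sensible threshold has to be tuned on your own data.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Made-up embeddings: two photos of person A, one photo of person B
embeddings = {
    "A_photo1": [0.0, 1.0, 0.0],
    "A_photo2": [0.0, 0.8, 0.6],
    "B_photo1": [1.0, 0.0, 0.0],
}

names = list(embeddings)
# Same nested-loop structure as the dists list in the tutorial code
dists = [[l2_distance(embeddings[n1], embeddings[n2]) for n2 in names] for n1 in names]

THRESHOLD = 1.0  # hypothetical cut-off, not a value taken from facenet-pytorch

def same_person(a, b, threshold=THRESHOLD):
    """Treat two embeddings as the same identity if they are close enough."""
    return l2_distance(a, b) < threshold

for name, row in zip(names, dists):
    print(name, ["{:.3f}".format(d) for d in row])
```

In this toy data the two photos of person A are closer to each other than either is to person B, which is the pattern to look for in the real table.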
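3. Next, run face_tracking.ipynb, the face tracking example. When I ran it something went wrong: I only got a still image as output and couldn't see the tracking effect ~-~
(1) This part needs mmcv installed first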
!pip install mmcv
(2) Import MTCNN
from facenet_pytorch import MTCNN
import torch
import numpy as np
import mmcv, cv2
from PIL import Image, ImageDraw
from IPython import display
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print('Running on device: {}'.format(device))
mtcnn = MTCNN(keep_all=True, device=device)
(3) Load the face video and see how many frames it has
video = mmcv.VideoReader('./examples/video.mp4')
frames = [Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)) for frame in video]
display.Video('./examples/video.mp4', width=640)
frames_tracked = []
for i, frame in enumerate(frames):
    print('\rTracking frame: {}'.format(i + 1), end='')
    # Detect faces
    boxes, _ = mtcnn.detect(frame)
    # Draw faces (boxes can be None when no face is found in the frame)
    frame_draw = frame.copy()
    draw = ImageDraw.Draw(frame_draw)
    if boxes is not None:
        for box in boxes:
            draw.rectangle(box.tolist(), outline=(255, 0, 0), width=6)
    # Add to frame list
    frames_tracked.append(frame_draw.resize((640, 360), Image.BILINEAR))
print('\nDone')
(4) Display the tracked frames. My IPython.display had no "Video", so I could only show the static detection result and didn't get the tracking effect. If you try this code, see what it looks like for you.
d = display.display(frames_tracked[0], display_id=True)
i = 1
try:
    while True:
        d.update(frames_tracked[i % len(frames_tracked)])
        i += 1
except KeyboardInterrupt:
    pass
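4. The next program is really the main one: put together a simple face dataset of your own and train a face recognition model. Just follow the data layout of /data/test_images in the source, with one sub-folder per person. If you don't want to use your own data, you can run it directly on that folder instead. The reference code is examples/finetune.ipynb
(1) First, load modules much like in the programs above and specify the dataset to use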
from facenet_pytorch import MTCNN, InceptionResnetV1, fixed_image_standardization, training
import torch
from torch.utils.data import DataLoader, SubsetRandomSampler
from torch import optim
from torch.optim.lr_scheduler import MultiStepLR
from torch.utils.tensorboard import SummaryWriter
from torchvision import datasets, transforms
import numpy as np
import os
#data_dir = './data/person'
data_dir = './data/test_images'
batch_size = 32
epochs = 8
workers = 0 if os.name == 'nt' else 8
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print('Running on device: {}'.format(device))
mtcnn = MTCNN(
    image_size=160, margin=0, min_face_size=20,
    thresholds=[0.6, 0.7, 0.7], factor=0.709, post_process=True,
    device=device
)
dataset = datasets.ImageFolder(data_dir, transform=transforms.Resize((512, 512)))
# Replace each label with the output path where the cropped face will be saved
dataset.samples = [
    (p, p.replace(data_dir, data_dir + '_cropped'))
    for p, _ in dataset.samples
]
loader = DataLoader(
    dataset,
    num_workers=workers,
    batch_size=batch_size,
    collate_fn=training.collate_pil
)
# Detect and crop the face in every image, saving the crops under data_dir + '_cropped'
for i, (x, y) in enumerate(loader):
    mtcnn(x, save_path=y)
    print('\rBatch {} of {}'.format(i + 1, len(loader)), end='')
# Remove mtcnn to reduce GPU memory usage
del mtcnn
# classify=True adds a classification layer with one output per person
resnet = InceptionResnetV1(
    classify=True,
    pretrained='vggface2',
    num_classes=len(dataset.class_to_idx)
).to(device)
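The `dataset.samples` trick above is easy to miss: it replaces each class label with an output path, so that MTCNN knows where to save each cropped face. Here is a small stand-alone sketch of that path mapping. The file names are made up, and `cropped_path` is a helper introduced only for illustration (it also limits the replacement to the first match, which the original one-liner does not).

```python
def cropped_path(path, data_dir):
    """Map an original image path to its location in the '<data_dir>_cropped' tree."""
    # Replace only the first occurrence, in case data_dir appears again later in the path
    return path.replace(data_dir, data_dir + '_cropped', 1)

data_dir = './data/test_images'
# Hypothetical file list in the ImageFolder layout: <data_dir>/<person>/<image>
paths = [
    './data/test_images/person_a/1.jpg',
    './data/test_images/person_b/1.jpg',
]

for p in paths:
    print(p, '->', cropped_path(p, data_dir))
```

The cropped tree keeps the same one-folder-per-person layout, which is why it can be loaded with ImageFolder again for training.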
(2) Next, define the optimizer, the train/validation split, and so on
optimizer = optim.Adam(resnet.parameters(), lr=0.001)
scheduler = MultiStepLR(optimizer, [5, 10])
trans = transforms.Compose([
    np.float32,
    transforms.ToTensor(),
    fixed_image_standardization
])
dataset = datasets.ImageFolder(data_dir + '_cropped', transform=trans)
img_inds = np.arange(len(dataset))
#print(img_inds)
np.random.shuffle(img_inds)
# 80% of the images for training, the remaining 20% for validation
train_inds = img_inds[:int(0.8 * len(img_inds))]
print(train_inds)
val_inds = img_inds[int(0.8 * len(img_inds)):]
print(val_inds)
train_loader = DataLoader(
    dataset,
    num_workers=workers,
    batch_size=batch_size,
    sampler=SubsetRandomSampler(train_inds)
)
val_loader = DataLoader(
    dataset,
    num_workers=workers,
    batch_size=batch_size,
    sampler=SubsetRandomSampler(val_inds)
)
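The 80/20 split above boils down to shuffling the indices and cutting the list in two. Below is a plain-Python sketch of the same idea, using `random` instead of NumPy and a fixed seed so the split is repeatable. The seed and the helper name `split_indices` are additions for this example, not something from the original notebook.

```python
import random

def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle range(n_samples) and split it into train and validation index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # local RNG, leaves the global one untouched
    cut = int(train_fraction * n_samples)
    return indices[:cut], indices[cut:]

train_inds, val_inds = split_indices(10)
print("train:", train_inds)
print("val:  ", val_inds)
```

With only 5 images, as in test_images, the same rule leaves 4 for training and 1 for validation, so the validation accuracy will be very coarse.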
(3) Specify the loss function and the accuracy metric, then start training and validating the model
loss_fn = torch.nn.CrossEntropyLoss()
metrics = {
    'fps': training.BatchTimer(),
    'acc': training.accuracy
}
writer = SummaryWriter()
writer.iteration, writer.interval = 0, 100
# Evaluate once before training to get a baseline
print('\n\nInitial')
print('-' * 10)
resnet.eval()
training.pass_epoch(
    resnet, loss_fn, val_loader,
    batch_metrics=metrics, show_running=True, device=device,
    writer=writer
)
for epoch in range(epochs):
    print('\nEpoch {}/{}'.format(epoch + 1, epochs))
    print('-' * 10)
    # Training pass
    resnet.train()
    training.pass_epoch(
        resnet, loss_fn, train_loader, optimizer, scheduler,
        batch_metrics=metrics, show_running=True, device=device,
        writer=writer
    )
    # Validation pass
    resnet.eval()
    training.pass_epoch(
        resnet, loss_fn, val_loader,
        batch_metrics=metrics, show_running=True, device=device,
        writer=writer
    )
writer.close()
The result after training looks like this
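One detail worth knowing about the settings used here: `MultiStepLR(optimizer, [5, 10])` multiplies the learning rate by `gamma` (0.1 by default in PyTorch) each time the epoch counter reaches a milestone. The sketch below reproduces that rule in plain Python, assuming the scheduler is stepped once per training epoch; `lr_at_epoch` is a made-up helper, not a PyTorch function.

```python
def lr_at_epoch(epoch, base_lr=0.001, milestones=(5, 10), gamma=0.1):
    """Learning rate used during the given 0-based epoch under a multi-step schedule."""
    passed = sum(1 for m in milestones if epoch >= m)  # milestones already reached
    return base_lr * gamma ** passed

epochs = 8
for epoch in range(epochs):
    print("epoch {}: lr = {:.6f}".format(epoch + 1, lr_at_epoch(epoch)))
```

Under that assumption, with epochs = 8 only the first milestone is ever reached, so the second drop at 10 never takes effect.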
(4) Save the trained model and run predictions on some images
torch.save(resnet, "face3.pth")
# On recent PyTorch versions, loading a whole pickled model may need torch.load("face3.pth", weights_only=False)
model = torch.load("face3.pth")
import matplotlib.pyplot as plt
from PIL import Image, ImageDraw, ImageFont
# Note: training used 160x160 crops with fixed_image_standardization, so this
# ImageNet-style preprocessing does not match it and the scores may suffer
image_transforms = {
    'test': transforms.Compose([
        transforms.Resize(size=256),
        transforms.CenterCrop(size=224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406],
                             [0.229, 0.224, 0.225])
    ])
}
dataset = 'data'
test_directory = os.path.join(dataset, 'test_images')
#test_directory = os.path.join(dataset, 'person')
data = {
    'test_images': datasets.ImageFolder(root=test_directory, transform=image_transforms['test'])
}
batch_size = 32
num_classes = len(data['test_images'].class_to_idx)
test_data_size = len(data['test_images'])
test_data = DataLoader(data['test_images'], batch_size=batch_size, shuffle=True)
idx_to_class = {v: k for k, v in data['test_images'].class_to_idx.items()}
def predict(model, test_image_name):
    transform = image_transforms['test']
    test_image = Image.open(test_image_name)
    draw = ImageDraw.Draw(test_image)
    # Add the batch dimension and move the tensor to the same device as the model
    test_image_tensor = transform(test_image).unsqueeze(0).to(device)
    with torch.no_grad():
        model.eval()
        out = model(test_image_tensor)
        # The model was trained with CrossEntropyLoss, so out holds raw logits;
        # softmax (rather than exp) turns them into probabilities
        ps = torch.softmax(out, dim=1)
        topk, topclass = ps.topk(1, dim=1)
        print("Prediction : ", idx_to_class[topclass.cpu().numpy()[0][0]], ", Score: ", topk.cpu().numpy()[0][0])
        text = idx_to_class[topclass.cpu().numpy()[0][0]] + " " + str(topk.cpu().numpy()[0][0])
        # Write the predicted name and score in the top-left corner of the image
        font = ImageFont.truetype('arial.ttf', 16)
        draw.text((0, 0), text, (255, 0, 0), font=font)
        #test_image.show()
        plt.imshow(test_image)
        print(topclass)
#predict(model, './data/four_cropped/he_jiong/he_jiong_0002.jpg')
predict(model, './data/test_images_cropped/angelina_jolie/1.jpg')
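The score in predict is meant to be a probability. For a model trained with CrossEntropyLoss, the outputs are raw scores (logits), and softmax is what turns them into probabilities that sum to 1. Here is a plain-Python sketch with made-up logits; the numbers and class names are invented for the example.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

idx_to_class = {0: "person_a", 1: "person_b", 2: "person_c"}  # hypothetical classes
logits = [2.0, 0.5, -1.0]  # hypothetical model output for one image

probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)  # index of the largest probability
print("Prediction:", idx_to_class[best], "Score: {:.4f}".format(probs[best]))
```

Softmax does not change which class wins, only the scale of the score, so the predicted name is the same either way.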
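The test output shows the name of the person predicted for the image. The code for displaying the classification result is borrowed from this blogger: https://blog.csdn.net/heiheiya/article/details/103031300
This gives an image display like the one below
You can also print the mapping between indices and names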
print(idx_to_class)
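(5) To display a person's name on the image, you need to install a font that can be drawn onto the bitmap. The steps are as follows; the download link can be swapped for another package, but the general flow is the same: download the archive, then unzip it. As an example I install a Korean font package here, and you can substitute another one. My connection is terrible today, so I'm not going to re-run this with a Chinese font package and will just use the one I set up earlier as the example.
First download the package; the same site hosts many other font packages too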
!wget "https://noto-website-2.storage.googleapis.com/pkgs/NotoSansCJKkr-hinted.zip"
Then unzip the package you downloaded
!unzip "NotoSansCJKkr-hinted.zip"
#!unzip "simhei.zip"
Move the font file you want to use from the unzipped files into this folder
!mv NotoSansCJKkr-Medium.otf /usr/share/fonts/truetype/
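Run a small plotting script to check that the installation worked. The prediction code above uses the arial.ttf font; just swap in this one when you use it.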
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.font_manager as fm
path = '/usr/share/fonts/truetype/NotoSansCJKkr-Medium.otf'
fontprop = fm.FontProperties(fname=path, size= 15)
plt.plot(range(50), range(50), 'r')
plt.title('차트 제목', fontproperties=fontprop)
plt.ylabel('y축', fontproperties=fontprop)
plt.xlabel('x축', fontproperties=fontprop)
plt.show()
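Since 'arial.ttf' may not be available on every machine, it can help to pick the font file from a list of candidates and only use one that actually exists. A small sketch follows; the candidate list and the helper name `first_existing` are assumptions for this example, so adjust the paths to whatever fonts you installed.

```python
import os

def first_existing(paths):
    """Return the first path that exists on disk, or None if none of them do."""
    for p in paths:
        if os.path.isfile(p):
            return p
    return None

candidates = [
    '/usr/share/fonts/truetype/NotoSansCJKkr-Medium.otf',  # the font installed above
    'arial.ttf',                                            # the font name used in predict
]

font_path = first_existing(candidates)
print("Using font:", font_path if font_path else "none found, fall back to PIL's default font")
```

If a path is found it can be passed to ImageFont.truetype(font_path, 16); otherwise ImageFont.load_default() gives a basic built-in font.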
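(6) If the image you want to predict is in .png format, it needs a simple conversion first (a PNG may carry an alpha channel, while the model expects 3-channel RGB input)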
import torch
import torch.nn as nn
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import os
from PIL import Image, ImageDraw, ImageFont
test_image = Image.open('./data1/me/1111.png').convert('RGB')
test_image.save('./data1/me/1111.jpg')
(7) If you want to check the model's accuracy, add the code below. I didn't save my results for this part so I can't show them, but the code does work.
def computeTestSetAccuracy(model, loss_function):
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    test_acc = 0.0
    test_loss = 0.0
    with torch.no_grad():
        model.eval()
        for j, (inputs, labels) in enumerate(test_data):
            inputs = inputs.to(device)
            labels = labels.to(device)
            outputs = model(inputs)
            loss = loss_function(outputs, labels)
            # Weight each batch by its size so a smaller last batch is handled correctly
            test_loss += loss.item() * inputs.size(0)
            ret, predictions = torch.max(outputs.data, 1)
            correct_counts = predictions.eq(labels.data.view_as(predictions))
            acc = torch.mean(correct_counts.type(torch.FloatTensor))
            test_acc += acc.item() * inputs.size(0)
            print("Test Batch Number: {:03d}, Test: Loss: {:.4f}, Accuracy: {:.4f}".format(
                j, loss.item(), acc.item()
            ))
    avg_test_loss = test_loss / test_data_size
    avg_test_acc = test_acc / test_data_size
    print("Test accuracy : " + str(avg_test_acc))
# Use the best model from the earlier training
model = torch.load('faceme.pth')
# The model outputs raw logits (it was trained with CrossEntropyLoss), so use that loss here instead of NLLLoss
loss_func = nn.CrossEntropyLoss()
computeTestSetAccuracy(model, loss_func)
#predict(model, './data1/me')
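The reason the function multiplies each batch's loss and accuracy by `inputs.size(0)` before summing is that the last batch is usually smaller than the others, so a plain average over batches would give it too much weight. A tiny plain-Python sketch with made-up batch results:

```python
def weighted_average(batch_values, batch_sizes):
    """Average per-batch means, weighting each batch by its number of samples."""
    total = sum(v * n for v, n in zip(batch_values, batch_sizes))
    return total / sum(batch_sizes)

# Hypothetical run: two full batches of 32 and a final batch of 4
batch_acc = [0.75, 0.75, 0.0]
batch_sizes = [32, 32, 4]

print("naive mean over batches: {:.4f}".format(sum(batch_acc) / len(batch_acc)))
print("sample-weighted mean:    {:.4f}".format(weighted_average(batch_acc, batch_sizes)))
```

The sample-weighted mean equals the fraction of all test images classified correctly, which is the number you actually want to report.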
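Haha, that's about it. I've picked up a lot of new followers recently, which makes me happy ^-^ I wonder which post brought you here? If you have any comments or suggestions about my articles, feel free to point them out in the comments or by private message!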