In this notebook, you will take the first steps toward developing an algorithm that could be used as part of a mobile or web app. At the end of this project, your code will accept any user-supplied image as input. If a dog is detected in the image, it will provide an estimate of the dog's breed. If a human face is detected, it will predict the dog breed that the face most resembles. The image below shows an example of the possible output of the finished project.
- Step 0: Import Datasets
- Step 1: Detect Humans
- Step 2: Detect Dogs
- Step 3: Create a CNN to Classify Dog Breeds (from Scratch)
- Step 4: Create a CNN to Classify Dog Breeds (using Transfer Learning)
- Step 5: Write Your Algorithm
- Step 6: Test Your Algorithm
Step 0: Import Datasets
Download the dog dataset, place it in the data folder, and unzip it.
Download the human dataset (a face dataset), place it in the data folder, and unzip it.
Step 1: Detect Humans
We use OpenCV's implementation of Haar feature-based cascade classifiers to detect human faces in images. OpenCV provides many pre-trained face detectors, stored as XML files on GitHub. We have downloaded one of these detectors and stored it in the haarcascades directory.
In the code cell below, we demonstrate how to use this detector to find human faces in a sample image.
Write a Face Detector
We can use this procedure to write a function that returns True if a human face is detected in an image and False otherwise. This function, called face_detector, takes the file path of an image as input and appears in the code block below.
Assess the Face Detector
Question 1: Use the code cell below to test the performance of the face_detector function.
- What percentage of the first 100 images in human_files have a detected human face?
- What percentage of the first 100 images in dog_files have a detected human face?
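Both questions reduce to counting how often a detector fires over a list of files. A small helper, written under the assumption that `human_files` and `dog_files` are plain lists of file paths:

```python
def detection_rate(detector, file_paths):
    """Fraction of images (0.0-1.0) for which detector(path) returns True."""
    hits = sum(1 for path in file_paths if detector(path))
    return hits / len(file_paths)

# intended usage, with face_detector / human_files / dog_files from the cells above:
# print('Faces in human_files[:100]: {:.0%}'.format(detection_rate(face_detector, human_files[:100])))
# print('Faces in dog_files[:100]:   {:.0%}'.format(detection_rate(face_detector, dog_files[:100])))
```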
Step 2: Detect Dogs
In this section, we use a pre-trained VGG-16 model to detect dogs in images.
Given an image, this pre-trained VGG-16 model returns a prediction (from the 1000 possible categories in ImageNet) for the object contained in the image.
Making Predictions with a Pre-trained Model
In the next code cell, you will write a function that accepts a path to an image as input and returns the index corresponding to the ImageNet class predicted by the pre-trained VGG-16 model. The output should always be an integer between 0 and 999, inclusive.
import torch
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image

# Set PIL to be tolerant of image files that are truncated.
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

# pre-trained VGG-16 model; move it to the GPU when one is available
VGG16 = models.vgg16(pretrained=True)
use_cuda = torch.cuda.is_available()
if use_cuda:
    VGG16 = VGG16.cuda()

def VGG16_predict(img_path):
    '''
    Use pre-trained VGG-16 model to obtain index corresponding to
    predicted ImageNet class for image at specified path

    Args:
        img_path: path to an image

    Returns:
        Index corresponding to VGG-16 model's prediction
    '''
    # ImageNet normalization statistics
    normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225])
    transform = transforms.Compose([transforms.Resize((224, 224)),
                                    transforms.ToTensor(),
                                    normalize])
    image = Image.open(img_path)
    image = transform(image)
    image.unsqueeze_(0)  # add a batch dimension
    if use_cuda:
        image = image.cuda()
    output = VGG16(image)
    class_index = output.data.cpu().numpy().argmax()
    return class_index  # predicted class index

# dog_files is loaded from the dataset in Step 0
VGG16_predict(dog_files[0])
Output:
252
Write a Dog Detector
While looking at this list of classes, you will notice that the categories corresponding to dogs appear at indices 151-268. Thus, to check whether the pre-trained model predicts that an image contains a dog, we need only check whether the VGG16_predict function above returns a value between 151 and 268 (inclusive).
We use these ideas to complete the dog_detector function below, which returns True if a dog is detected in an image and False otherwise.
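One way to sketch this, keeping the index check in its own helper so it is easy to verify; the `predict` argument stands in for the `VGG16_predict` function above:

```python
def is_dog_index(class_index):
    """ImageNet classes 151-268 (inclusive) are dog breeds."""
    return 151 <= int(class_index) <= 268

def dog_detector(img_path, predict):
    """Return True if the classifier maps the image to a dog class.

    `predict` is any callable with the VGG16_predict signature:
    it takes an image path and returns an ImageNet class index.
    """
    return is_dog_index(predict(img_path))
```

Passing the classifier in as a callable also makes it trivial to swap in another pre-trained network later without touching the detector logic.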
Question 2: Use the code cell below to test the performance of the dog_detector function.
- What percentage of the images in human_files_short have a detected dog?
- What percentage of the images in dog_files_short have a detected dog?
You are free to explore other pre-trained networks (such as Inception-v3, ResNet-50, etc.). Please use the code cell below to test other pre-trained PyTorch models. If you decide to pursue this optional task, report the performance on human_files_short and dog_files_short.
Step 3: Create a CNN to Classify Dog Breeds (from Scratch)
Now that we have functions for detecting humans and dogs in images, we need a way to predict breed from images. In this step, you will create a CNN that classifies dog breeds. You must create your CNN from scratch (you may not use transfer learning yet), and you must attain a test accuracy of at least 1%. In Step 4 of this project, you will have the opportunity to use transfer learning to create a CNN with greatly improved accuracy.
It is worth noting that classifying images of dogs is an exceptionally challenging task. Even a human would have trouble distinguishing between a Brittany and a Welsh Springer Spaniel.
It is not difficult to find other dog breed pairs with minimal inter-class variation (for instance, Curly-Coated Retrievers and American Water Spaniels).
Likewise, Labradors come in yellow, chocolate, and black. Your vision-based algorithm will have to overcome this high intra-class variation in order to classify all of these different shades as the same breed.
We also mention that random chance presents an exceptionally low bar: setting aside the fact that the classes are slightly imbalanced, a random guess will name the correct breed only about 1 time in 133, which corresponds to an accuracy of less than 1%.
Remember that practice is far ahead of theory in deep learning. Experiment with many different architectures and trust your intuition. And, of course, have fun!
Specify Data Loaders for the Dog Dataset
Question 3: Describe your chosen procedure for preprocessing the data.
- How does your code resize the images (by cropping, stretching, etc.)? What size did you pick for the input tensor, and why?
- Did you decide to augment the dataset? If so, how (through translations, flips, rotations, etc.)? If not, why not?
Specify Loss Function and Optimizer
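For a multi-class classifier over 133 breeds, cross-entropy loss with plain SGD is a reasonable starting point; the learning rate here is an assumption, not a tuned value:

```python
import torch.nn as nn
import torch.optim as optim

# cross-entropy is the standard loss for single-label multi-class classification
criterion_scratch = nn.CrossEntropyLoss()

def make_optimizer(model, lr=0.05):
    """SGD over all trainable parameters; the lr default is a starting guess."""
    return optim.SGD(model.parameters(), lr=lr)

# intended usage, once model_scratch is defined:
# optimizer_scratch = make_optimizer(model_scratch)
```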
Train and Validate the Model
import numpy as np
import torch

# the following import is required for training to be robust to truncated images
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

def train(n_epochs, loaders, model, optimizer, criterion, use_cuda, save_path):
    """returns trained model"""
    # initialize tracker for minimum validation loss
    valid_loss_min = np.Inf

    for epoch in range(1, n_epochs + 1):
        # initialize variables to monitor training and validation loss
        train_loss = 0.0
        valid_loss = 0.0

        ###################
        # train the model #
        ###################
        model.train()
        for batch_idx, (data, target) in enumerate(loaders['train']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
            optimizer.zero_grad()
            # find the loss and update the model parameters accordingly
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()
            # record the running average training loss
            train_loss += ((1 / (batch_idx + 1)) * (loss.data - train_loss))
            if (batch_idx + 1) % 40 == 0:
                print('Epoch: {} \tBatch: {} \tTraining Loss: {:.6f}'.format(epoch, batch_idx + 1, train_loss))

        ######################
        # validate the model #
        ######################
        model.eval()
        with torch.no_grad():
            for batch_idx, (data, target) in enumerate(loaders['valid']):
                # move to GPU
                if use_cuda:
                    data, target = data.cuda(), target.cuda()
                # update the running average validation loss
                output = model(data)
                loss = criterion(output, target)
                valid_loss += ((1 / (batch_idx + 1)) * (loss.data - valid_loss))
                if (batch_idx + 1) % 40 == 0:
                    print('Epoch: {} \tBatch: {} \tValidation Loss: {:.6f}'.format(epoch, batch_idx + 1, valid_loss))

        # print training/validation statistics
        print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
            epoch, train_loss, valid_loss))

        # save the model if validation loss has decreased
        if valid_loss <= valid_loss_min:
            print('Validation loss decreased ({:.6f} --> {:.6f}). Saving model ...'.format(
                valid_loss_min, valid_loss))
            torch.save(model.state_dict(), save_path)
            valid_loss_min = valid_loss

    # return trained model
    return model

n_epochs = 10

# train the model
model_scratch = train(n_epochs, loaders_scratch, model_scratch, optimizer_scratch,
                      criterion_scratch, use_cuda, 'model_scratch.pt')

# load the model that got the best validation accuracy
model_scratch.load_state_dict(torch.load('model_scratch.pt'))
Output:
Epoch: 1 Batch: 40 Training Loss: 4.889728
Epoch: 1 Batch: 80 Training Loss: 4.887787
Epoch: 1 Batch: 120 Training Loss: 4.887685
Epoch: 1 Batch: 160 Training Loss: 4.885648
Epoch: 1 Batch: 200 Training Loss: 4.881847
Epoch: 1 Batch: 240 Training Loss: 4.873278
Epoch: 1 Batch: 280 Training Loss: 4.862270
Epoch: 1 Batch: 320 Training Loss: 4.854340
Epoch: 1 Batch: 40 Validation Loss: 4.712207
Epoch: 1 Training Loss: 4.851301 Validation Loss: 4.704982
Validation loss decreased (inf --> 4.704982). Saving model ...
Epoch: 2 Batch: 40 Training Loss: 4.730308
Epoch: 2 Batch: 80 Training Loss: 4.719476
Epoch: 2 Batch: 120 Training Loss: 4.701708
Epoch: 2 Batch: 160 Training Loss: 4.695746
Epoch: 2 Batch: 200 Training Loss: 4.692133
Epoch: 2 Batch: 240 Training Loss: 4.675904
Epoch: 2 Batch: 280 Training Loss: 4.663143
Epoch: 2 Batch: 320 Training Loss: 4.650386
Epoch: 2 Batch: 40 Validation Loss: 4.488307
Epoch: 2 Training Loss: 4.643542 Validation Loss: 4.494160
Validation loss decreased (4.704982 --> 4.494160). Saving model ...
Epoch: 3 Batch: 40 Training Loss: 4.474283
Epoch: 3 Batch: 80 Training Loss: 4.501595
Epoch: 3 Batch: 120 Training Loss: 4.477735
Epoch: 3 Batch: 160 Training Loss: 4.488136
Epoch: 3 Batch: 200 Training Loss: 4.490754
Epoch: 3 Batch: 240 Training Loss: 4.487989
Epoch: 3 Batch: 280 Training Loss: 4.490090
Epoch: 3 Batch: 320 Training Loss: 4.481546
Epoch: 3 Batch: 40 Validation Loss: 4.262285
Epoch: 3 Training Loss: 4.479444 Validation Loss: 4.275317
Validation loss decreased (4.494160 --> 4.275317). Saving model ...
Epoch: 4 Batch: 40 Training Loss: 4.402790
Epoch: 4 Batch: 80 Training Loss: 4.372338
Epoch: 4 Batch: 120 Training Loss: 4.365306
Epoch: 4 Batch: 160 Training Loss: 4.367325
Epoch: 4 Batch: 200 Training Loss: 4.374326
Epoch: 4 Batch: 240 Training Loss: 4.369847
Epoch: 4 Batch: 280 Training Loss: 4.365964
Epoch: 4 Batch: 320 Training Loss: 4.363493
Epoch: 4 Batch: 40 Validation Loss: 4.249445
Epoch: 4 Training Loss: 4.364571 Validation Loss: 4.248449
Validation loss decreased (4.275317 --> 4.248449). Saving model ...
Epoch: 5 Batch: 40 Training Loss: 4.229365
Epoch: 5 Batch: 80 Training Loss: 4.267400
Epoch: 5 Batch: 120 Training Loss: 4.269664
Epoch: 5 Batch: 160 Training Loss: 4.257591
Epoch: 5 Batch: 200 Training Loss: 4.261866
Epoch: 5 Batch: 240 Training Loss: 4.247512
Epoch: 5 Batch: 280 Training Loss: 4.239336
Epoch: 5 Batch: 320 Training Loss: 4.230827
Epoch: 5 Batch: 40 Validation Loss: 4.043582
Epoch: 5 Training Loss: 4.231559 Validation Loss: 4.039588
Validation loss decreased (4.248449 --> 4.039588). Saving model ...
Epoch: 6 Batch: 40 Training Loss: 4.180193
Epoch: 6 Batch: 80 Training Loss: 4.140314
Epoch: 6 Batch: 120 Training Loss: 4.153989
Epoch: 6 Batch: 160 Training Loss: 4.140887
Epoch: 6 Batch: 200 Training Loss: 4.151268
Epoch: 6 Batch: 240 Training Loss: 4.153749
Epoch: 6 Batch: 280 Training Loss: 4.153314
Epoch: 6 Batch: 320 Training Loss: 4.156451
Epoch: 6 Batch: 40 Validation Loss: 3.940857
Epoch: 6 Training Loss: 4.149529 Validation Loss: 3.945810
Validation loss decreased (4.039588 --> 3.945810). Saving model ...
Epoch: 7 Batch: 40 Training Loss: 4.060485
Epoch: 7 Batch: 80 Training Loss: 4.065772
Epoch: 7 Batch: 120 Training Loss: 4.056967
Epoch: 7 Batch: 160 Training Loss: 4.068470
Epoch: 7 Batch: 200 Training Loss: 4.076772
Epoch: 7 Batch: 240 Training Loss: 4.087616
Epoch: 7 Batch: 280 Training Loss: 4.074337
Epoch: 7 Batch: 320 Training Loss: 4.080192
Epoch: 7 Batch: 40 Validation Loss: 3.860693
Epoch: 7 Training Loss: 4.078263 Validation Loss: 3.884382
Validation loss decreased (3.945810 --> 3.884382). Saving model ...
Epoch: 8 Batch: 40 Training Loss: 3.960585
Epoch: 8 Batch: 80 Training Loss: 3.983979
Epoch: 8 Batch: 120 Training Loss: 3.965129
Epoch: 8 Batch: 160 Training Loss: 3.965021
Epoch: 8 Batch: 200 Training Loss: 3.965830
Epoch: 8 Batch: 240 Training Loss: 3.976013
Epoch: 8 Batch: 280 Training Loss: 3.975547
Epoch: 8 Batch: 320 Training Loss: 3.978744
Epoch: 8 Batch: 40 Validation Loss: 3.784086
Epoch: 8 Training Loss: 3.980776 Validation Loss: 3.779312
Validation loss decreased (3.884382 --> 3.779312). Saving model ...
Epoch: 9 Batch: 40 Training Loss: 3.917738
Epoch: 9 Batch: 80 Training Loss: 3.967938
Epoch: 9 Batch: 120 Training Loss: 3.934165
Epoch: 9 Batch: 160 Training Loss: 3.917138
Epoch: 9 Batch: 200 Training Loss: 3.910391
Epoch: 9 Batch: 240 Training Loss: 3.909857
Epoch: 9 Batch: 280 Training Loss: 3.907439
Epoch: 9 Batch: 320 Training Loss: 3.901893
Epoch: 9 Batch: 40 Validation Loss: 3.826410
Epoch: 9 Training Loss: 3.903304 Validation Loss: 3.824970
Epoch: 10 Batch: 40 Training Loss: 3.910845
Epoch: 10 Batch: 80 Training Loss: 3.910709
Epoch: 10 Batch: 120 Training Loss: 3.915924
Epoch: 10 Batch: 160 Training Loss: 3.900949
Epoch: 10 Batch: 200 Training Loss: 3.883792
Epoch: 10 Batch: 240 Training Loss: 3.882242
Epoch: 10 Batch: 280 Training Loss: 3.875963
Epoch: 10 Batch: 320 Training Loss: 3.858594
Epoch: 10 Batch: 40 Validation Loss: 3.638186
Epoch: 10 Training Loss: 3.861175 Validation Loss: 3.647465
Validation loss decreased (3.779312 --> 3.647465). Saving model ...
<All keys matched successfully>
Test the Model
Try out your model on the test dataset of dog images. Use the code cell below to calculate and print the test loss and accuracy. Ensure that your test accuracy is greater than 10%.
Step 4: Create a CNN to Classify Dog Breeds (using Transfer Learning)
Using transfer learning can significantly reduce training time without sacrificing accuracy. In the following steps, you will use transfer learning to train your own CNN.
Model Architecture
Specify Loss Function and Optimizer
Train and Validate the Model
# the train function defined in Step 3 is reused here; re-enable
# tolerance of truncated image files for this cell as well
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

# train the model
n_epochs = 2
model_transfer = train(n_epochs, loaders_transfer, model_transfer, optimizer_transfer,
                       criterion_transfer, use_cuda, 'model_transfer.pt')

# load the model that got the best validation accuracy
model_transfer.load_state_dict(torch.load('model_transfer.pt'))
Test the Model
Predict Dog Breed with the Model
Write a function that takes an image path as input and returns the dog breed predicted by your model.
Step 5: Write Your Algorithm
Write an algorithm that accepts a file path to an image and first determines whether the image contains a human, a dog, or neither. Then:
- if a dog is detected in the image, return the predicted breed.
- if a human is detected in the image, return the resembling dog breed.
- if neither is detected in the image, provide an output that indicates an error.
You are welcome to write your own functions for detecting humans and dogs in images, and you are free to use the face_detector and dog_detector functions developed above. You are required to use your CNN from Step 4 to predict the dog breed.
A sample output of the algorithm is provided below, but feel free to design your own!
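The three-way dispatch above can be sketched as a small function. The detectors and the breed predictor are passed in as callables so the control flow is easy to test on its own; in the notebook they would be the face_detector, dog_detector, and breed-prediction functions developed earlier (the message strings here are illustrative):

```python
def run_app(img_path, face_detector, dog_detector, predict_breed):
    """Classify the image as dog / human / neither and report accordingly."""
    if dog_detector(img_path):
        return "This is a dog picture and it's " + predict_breed(img_path)
    if face_detector(img_path):
        return "This is a human picture who looks like " + predict_breed(img_path)
    return "Error: neither a dog nor a human face was detected in the image."
```

Checking for a dog before a human face means an image containing both is reported as a dog, which matches the sample outputs below.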
Step 6: Test Your Algorithm
In this section, you will take your new algorithm for a spin! What kind of dog does the algorithm think you look like? If you have a dog, does it predict your dog's breed accurately? If you have a cat, does it mistakenly think your cat is a dog?
Output:
This is a human picture who looks like Norwich terrier
This is a human picture who looks like Curly-coated retriever
This is a human picture who looks like Curly-coated retriever
This is a dog picture and it's English cocker spaniel
This is a dog picture and it's Bloodhound
This is a dog picture and it's Lowchen