Making Predictions with a Model in PyTorch
Object detection is a very popular task in computer vision: given an image, you predict (usually rectangular) boxes around the objects present in it and also recognize the type of each object. An image can contain multiple objects, and there are various state-of-the-art techniques and architectures for tackling this problem, such as Faster R-CNN and YOLOv3.
This article talks about the case when there is only one object of interest present in an image. The focus here is more on how to read an image and its bounding box, resize and perform augmentations correctly, rather than on the model itself. The goal is to have a good grasp of the fundamental ideas behind object detection, which you can extend to get a better understanding of the more complex techniques.
Here’s a link to the notebook consisting of all the code I’ve used for this article: https://jovian.ml/aakanksha-ns/road-signs-bounding-box-prediction
If you’re new to Deep Learning or PyTorch, or just need a refresher, this might interest you:
Problem Statement
Given an image consisting of a road sign, predict a bounding box around the road sign and identify the type of road sign.
There are four distinct classes these signs could belong to:
- Traffic Light
- Stop
- Speed Limit
- Crosswalk
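
To make the prediction target concrete, here is a minimal sketch of how one labelled image could be represented: an integer class index plus a bounding box. The index order, the `Target` and `make_target` names, and the `(x_min, y_min, x_max, y_max)` pixel convention are assumptions made for illustration, not necessarily what the notebook uses.

```python
from typing import NamedTuple, Tuple

# The four classes from the problem statement.
# The index order here is an arbitrary, assumed choice.
CLASS_NAMES = ["Traffic Light", "Stop", "Speed Limit", "Crosswalk"]
CLASS_TO_IDX = {name: idx for idx, name in enumerate(CLASS_NAMES)}


class Target(NamedTuple):
    """Hypothetical container for what the model should predict for one image."""
    class_idx: int                 # which kind of road sign
    bbox: Tuple[int, int, int, int]  # assumed (x_min, y_min, x_max, y_max) in pixels


def make_target(class_name: str, bbox: Tuple[int, int, int, int]) -> Target:
    """Encode a class name and a box into a Target, rejecting degenerate boxes."""
    x_min, y_min, x_max, y_max = bbox
    if x_min >= x_max or y_min >= y_max:
        raise ValueError(f"degenerate bounding box: {bbox}")
    return Target(CLASS_TO_IDX[class_name], (x_min, y_min, x_max, y_max))


# Made-up example values, purely for illustration.
example = make_target("Stop", (50, 30, 120, 100))
print(example)
```

Encoding the class as an integer index is convenient because classification losses such as PyTorch's `nn.CrossEntropyLoss` accept integer class indices as targets, while the four box coordinates can be treated as a separate regression target.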