OpenMV Image Recognition (Digits)

Latest recommended article published on 2024-08-14 14:11:08

即安莉

Categories: openmv, machine vision, deep learning. Tags: algorithms, computer vision, front end, python, image processing, deep learning

Article link: https://blog.csdn.net/2301_79913420/article/details/139774004

A quick note: the code in this post combines several approaches into the one that worked best in my tests.

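Hello, dear readers! We keep pushing at the edges of what technology can do, and today I'm excited to share a fun experiment I recently did with the OpenMV4.

As a developer who loves machine vision and intelligent recognition, I'm always looking for tools and methods that improve the performance and accuracy of my projects. With its capable image processing and flexible programming interface, the OpenMV4 has become a reliable companion on that journey.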

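In this post I'll walk you through recognizing digits and reporting their coordinates in the image. With the code below, the camera detects each digit and prints the center of its bounding box, and in my own tests the accuracy was above 90%.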

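Please note that for the best results the lighting needs to be suitable and the object needs to be at a reasonable distance from the camera. If recognition is poor, adjust the parameters in the code to suit your setup, for example the confidence threshold min_confidence, and try a different object distance.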

Enough talk, let's get straight to the point and see how the code works. (Tip: you need the OpenMV IDE to run it.)

1. Code initialization:

# Edge Impulse - OpenMV Object Detection Example

import sensor, image, time, os, tf, math, uos, gc

sensor.reset()                         # Reset and initialize the sensor.
sensor.set_pixformat(sensor.RGB565)    # Set pixel format to RGB565 (or GRAYSCALE)
sensor.set_framesize(sensor.QVGA)      # Set frame size to QVGA (320x240)
sensor.set_windowing((240, 240))       # Set 240x240 window.
sensor.skip_frames(time=2000)          # Let the camera adjust.
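
To make the windowing step concrete, here is a small sketch in plain Python (it runs on a desktop, not on the camera) of where a 240x240 window sits inside the 320x240 QVGA frame. It assumes the window is centered when only a width and height are passed, which is my understanding of sensor.set_windowing. centered_window is a hypothetical helper for illustration, not part of the OpenMV API.

```python
def centered_window(frame_w, frame_h, win_w, win_h):
    """Return (x, y, w, h) of a win_w x win_h window centered in the frame."""
    if win_w > frame_w or win_h > frame_h:
        raise ValueError("window must fit inside the frame")
    x = (frame_w - win_w) // 2  # columns dropped on the left
    y = (frame_h - win_h) // 2  # rows dropped at the top
    return (x, y, win_w, win_h)


QVGA = (320, 240)  # frame size selected by sensor.QVGA
roi = centered_window(QVGA[0], QVGA[1], 240, 240)
print(roi)
```

Under that assumption the window drops 40 columns on each side and keeps every row, so the network receives a square image.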

2. Digit detection:

colors = [ # Add more colors if you are detecting more than 7 types of classes at once.
    (255,   0,   0),
    (  0, 255,   0),
    (255, 255,   0),
    (  0,   0, 255),
    (255,   0, 255),
    (  0, 255, 255),
    (255, 255, 255),
]

clock = time.clock()
while(True):
    clock.tick()

    img = sensor.snapshot()

    # detect() returns all objects found in the image (already split out per class).
    # We skip class index 0, as that is the background, and then draw a circle at the
    # center of each detected object.

    for i, detection_list in enumerate(net.detect(img, thresholds=[(math.ceil(min_confidence * 255), 255)])):
        if (i == 0): continue # background class
        if (len(detection_list) == 0): continue # no detections for this class?

        print("********** %s **********" % labels[i])
        for d in detection_list:
            [x, y, w, h] = d.rect()
            center_x = math.floor(x + (w / 2))
            center_y = math.floor(y + (h / 2))
            print('x %d\ty %d' % (center_x, center_y))
            # Wrap the index so a model with more classes than colors does not raise IndexError.
            img.draw_circle((center_x, center_y, 12), color=colors[i % len(colors)], thickness=2)

    print(clock.fps(), "fps", end="\n\n")

Code walkthrough:

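Note: the model files (trained.tflite and labels.txt) are not included in this post. My code relies on a trained model, and it took about a week of training and many revisions before digits could be recognized even when tilted by 45 degrees, so I am selling the model on Xianyu (闲鱼).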

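colors: A list of RGB colors used to draw detections, chosen by class index. Each tuple is one color.
clock = time.clock(): Creates a clock object used to measure and report the script's frame rate (FPS).
while(True):: An infinite loop; the script keeps running until it is stopped externally.
clock.tick(): Called once per iteration to update the clock so the frame rate can be computed.
img = sensor.snapshot(): Captures one frame from the camera.
net.detect(img, thresholds=[(math.ceil(min_confidence * 255), 255)]): Runs the neural network on the frame. thresholds is a list holding one (low, high) tuple on a 0-255 scale: the low value is min_confidence scaled to that range (128 for min_confidence = 0.5) and the high value is 255, so detections below the minimum confidence are discarded. net and min_confidence are not defined in this excerpt; they are set up in the full code in section 3.
for i, detection_list in enumerate(...): Iterates over the detection results, which are grouped by class. enumerate yields the class index i together with that class's detection_list.
if (i == 0): continue: Skips class index 0, which represents the background in this kind of model.
if (len(detection_list) == 0): continue: Skips the class if nothing was detected for it in this frame.
print("********** %s **********" % labels[i]): Prints the name of the current class. labels is not defined in this excerpt; it is read from labels.txt in the full code in section 3.
for d in detection_list:: Iterates over every detection of the current class.
[x, y, w, h] = d.rect(): Unpacks the bounding box of the detection: top-left corner (x, y), width w, and height h.
center_x and center_y: The center of the bounding box, computed as x + w / 2 and y + h / 2 and rounded down to whole pixels.
print('x %d\ty %d' % (center_x, center_y)): Prints the center coordinates, separated by a tab.
img.draw_circle(...): Draws a circle with a radius of 12 pixels and a line thickness of 2 at the center of the detection, in the color picked from the colors list.
print(clock.fps(), "fps", end="\n\n"): Prints the current frame rate followed by a blank line.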
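
The two pieces of arithmetic in the walkthrough, the confidence threshold and the bounding-box center, can be tried on a desktop without a camera. The sketch below is plain Python that mirrors the expressions used in the loop; confidence_to_threshold and rect_center are hypothetical helper names introduced only for this illustration, and the sample rectangle is made up.

```python
import math


def confidence_to_threshold(min_confidence):
    """Scale a 0.0-1.0 confidence to the (low, high) pair on a 0-255 scale."""
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must be between 0.0 and 1.0")
    return (math.ceil(min_confidence * 255), 255)


def rect_center(rect):
    """Return the center of an (x, y, w, h) box, rounded down to whole pixels."""
    x, y, w, h = rect
    return (math.floor(x + w / 2), math.floor(y + h / 2))


print(confidence_to_threshold(0.5))   # same expression as in net.detect(...)
print(rect_center((10, 20, 30, 40)))  # made-up bounding box
```

Raising min_confidence narrows the accepted range toward 255, which trades missed digits for fewer false detections.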

3. Full code

# Edge Impulse - OpenMV Object Detection Example

import sensor, image, time, os, tf, math, uos, gc

sensor.reset()                         # Reset and initialize the sensor.
sensor.set_pixformat(sensor.RGB565)    # Set pixel format to RGB565 (or GRAYSCALE)
sensor.set_framesize(sensor.QVGA)      # Set frame size to QVGA (320x240)
sensor.set_windowing((240, 240))       # Set 240x240 window.
sensor.skip_frames(time=2000)          # Let the camera adjust.

net = None
labels = None
min_confidence = 0.5

try:
    # load the model, alloc the model file on the heap if we have at least 64K free after loading
    net = tf.load("trained.tflite", load_to_fb=uos.stat('trained.tflite')[6] > (gc.mem_free() - (64*1024)))
except Exception as e:
    raise Exception('Failed to load "trained.tflite", did you copy the .tflite and labels.txt file onto the mass-storage device? (' + str(e) + ')')

try:
    labels = [line.rstrip('\n') for line in open("labels.txt")]
except Exception as e:
    raise Exception('Failed to load "labels.txt", did you copy the .tflite and labels.txt file onto the mass-storage device? (' + str(e) + ')')

colors = [ # Add more colors if you are detecting more than 7 types of classes at once.
    (255,   0,   0),
    (  0, 255,   0),
    (255, 255,   0),
    (  0,   0, 255),
    (255,   0, 255),
    (  0, 255, 255),
    (255, 255, 255),
]

clock = time.clock()
while(True):
    clock.tick()

    img = sensor.snapshot()

    # detect() returns all objects found in the image (already split out per class).
    # We skip class index 0, as that is the background, and then draw a circle at the
    # center of each detected object.

    for i, detection_list in enumerate(net.detect(img, thresholds=[(math.ceil(min_confidence * 255), 255)])):
        if (i == 0): continue # background class
        if (len(detection_list) == 0): continue # no detections for this class?

        print("********** %s **********" % labels[i])
        for d in detection_list:
            [x, y, w, h] = d.rect()
            center_x = math.floor(x + (w / 2))
            center_y = math.floor(y + (h / 2))
            print('x %d\ty %d' % (center_x, center_y))
            # Wrap the index so a model with more classes than colors does not raise IndexError.
            img.draw_circle((center_x, center_y, 12), color=colors[i % len(colors)], thickness=2)

    print(clock.fps(), "fps", end="\n\n")
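
The full listing adds two steps that the walkthrough does not cover: deciding where to load the model and reading the class names. In the listing, uos.stat('trained.tflite')[6] is the file size in bytes and gc.mem_free() is the free heap, and the model goes to the frame buffer when loading it onto the heap would leave less than 64 KB free. The sketch below restates both steps in plain Python so they can be checked on a desktop. should_load_to_fb and parse_labels are hypothetical helpers, and the sizes and label names are made up for illustration.

```python
def should_load_to_fb(model_size, free_heap, reserve=64 * 1024):
    """True when the model should go to the frame buffer instead of the heap.

    Mirrors: uos.stat(path)[6] > (gc.mem_free() - (64*1024))
    """
    return model_size > (free_heap - reserve)


def parse_labels(text):
    """Split the contents of labels.txt into one class name per line."""
    return [line.rstrip("\n") for line in text.splitlines(True)]


# Made-up numbers: a 300 KB model does not fit next to a 64 KB reserve in a 200 KB heap.
print(should_load_to_fb(300 * 1024, 200 * 1024))
# A 50 KB model does, so it can stay on the heap.
print(should_load_to_fb(50 * 1024, 200 * 1024))
# Hypothetical labels.txt contents; index 0 is the background class.
print(parse_labels("background\n0\n1\n2\n"))
```

Because labels[i] is indexed with the class index from net.detect, the order of lines in labels.txt has to match the class order the model was trained with.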