Simple Arrow Direction Recognition in Images

卖小麦←_←

Last modified: 2022-07-13 15:17:32

First published: 2022-07-13 13:04:20

Article link: https://blog.csdn.net/weixin_42567071/article/details/125761332

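Preface:
When image recognition comes up, many people's first reaction is machine learning, deep learning, and firing up a convolutional neural network.
And that is only the start: you still have to choose among models, tune hyperparameters, and wait for training to finish.
It is tedious, and if the results are poor you are back to more painstaking tuning and another long wait for training.
That picture is accurate, but for a newcomer to image recognition it is a strong deterrent.
So when you are just starting out, rather than diving headfirst into a tedious large project, it is better to practice on a simple one, which builds confidence and deepens your understanding of how convolution is computed.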

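Task:
Recognize the direction in which the arrow in an image points.

The images to be recognized go in the images folder.
Method:
Enter the convolution kernels (9 x 9) for the 4 directions by hand.
Shrink the image before convolving: downscaling makes the main features stand out, and a large image would take much longer to process.
Convolve the image with each kernel and take the maximum of the resulting matrix as that kernel's score; the direction whose kernel scores highest of the four is taken as the direction of the arrow.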
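The scoring rule described above can be tried on a toy example before reading the full program. The sketch below is an illustration only: the 3 x 3 chevron templates, the 5 x 5 test image, and the helper name `max_response` are assumptions made for this example and are not the article's 9 x 9 kernels. It assumes NumPy 1.20 or later for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def max_response(binary_image, kernel):
    """Slide the kernel over the image and return the largest window score."""
    # windows has shape (out_h, out_w, k_h, k_w): one kernel-sized patch per position
    windows = sliding_window_view(binary_image, kernel.shape)
    # Multiply each patch by the kernel element-wise and sum over the patch axes
    responses = (windows * kernel).sum(axis=(-2, -1))
    return responses.max()


# Hypothetical 3 x 3 chevron templates, one per direction
toy_kernels = {
    'right': np.array([[1, 0, 0], [0, 1, 0], [1, 0, 0]]),
    'up':    np.array([[0, 1, 0], [1, 0, 1], [0, 0, 0]]),
    'left':  np.array([[0, 0, 1], [0, 1, 0], [0, 0, 1]]),
    'down':  np.array([[0, 0, 0], [1, 0, 1], [0, 1, 0]]),
}

# A 5 x 5 binary image containing a ">" shape
toy_image = np.array([[0, 0, 0, 0, 0],
                      [0, 1, 0, 0, 0],
                      [0, 0, 1, 0, 0],
                      [0, 1, 0, 0, 0],
                      [0, 0, 0, 0, 0]])

# One score per direction; the highest score decides the prediction
scores = {name: int(max_response(toy_image, k)) for name, k in toy_kernels.items()}
best = max(scores, key=scores.get)
print(scores)  # {'right': 3, 'up': 2, 'left': 2, 'down': 2}
print(best)    # right
```

Only the matching template overlaps all three pixels of the shape, so it gets the top score; the other templates overlap at most two.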
Result:
[Figure: the test images, each titled with its file name and the predicted direction]

The code is as follows:

import cv2
import numpy as np
import matplotlib.pyplot as plt
import os

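def conv(image, kernel, mode='same'):
    # Run the convolution. `mode` is accepted but unused: the output is
    # always the smaller "valid" size, because the image is not padded
    res = _convolve(image[:, :], kernel)
    return res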

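def normal(image, kernel):
    # np.multiply() multiplies the two arrays element by element (the result
    # has the same shape as the inputs); .sum() then adds up all the products
    res = np.multiply(image, kernel).sum()
    # Clamp the score to the uint8 range [0, 255]
    if res > 255:
        return 255
    elif res < 0:
        return 0
    else:
        return res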

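def _convolve(image, kernel):
    h_kernel, w_kernel = kernel.shape  # kernel height and width (rows, columns)
    h_image, w_image = image.shape  # height and width of the image to process
    # Output size: the kernel only visits positions where it fits entirely
    # inside the image, so the borders are not padded
    res_h = h_image - h_kernel + 1
    res_w = w_image - w_kernel + 1
    # Zero matrix that holds the result
    res = np.zeros((res_h, res_w), np.uint8)
    for i in range(res_h):
        for j in range(res_w):
            # Take the kernel-sized patch of the image at offset (i, j) and
            # score it against the kernel; i and j slide the kernel over the image
            res[i, j] = normal(image[i:i + h_kernel, j:j + w_kernel], kernel)

    return res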

kernel_r = np.array([[0,0,1,1,1,0,0,0,0],
                     [0,0,0,1,3,1,0,0,0],
                     [0,0,0,0,1,3,1,0,0],
                     [1,1,1,1,1,1,3,1,0],
                     [1,3,3,3,3,3,3,3,1],
                     [1,1,1,1,1,1,3,1,0],
                     [0,0,0,0,1,3,1,0,0],
                     [0,0,0,1,3,1,0,0,0],
                     [0,0,1,1,1,0,0,0,0],])

kernel_u = np.array([[0,0,0,0,1,0,0,0,0],
                     [0,0,0,1,3,1,0,0,0],
                     [0,0,1,3,3,3,1,0,0],
                     [0,1,3,1,3,1,3,1,0],
                     [1,3,1,1,3,1,1,3,1],
                     [1,1,0,1,3,1,0,1,1],
                     [1,0,0,1,3,1,0,0,1],
                     [0,0,0,1,3,1,0,0,0],
                     [0,0,0,1,1,1,0,0,0],])

kernel_l = np.array([[0,0,0,0,1,1,1,0,0],
                     [0,0,0,1,3,1,0,0,0],
                     [0,0,1,3,1,0,0,0,0],
                     [0,1,3,1,1,1,1,1,1],
                     [1,3,3,3,3,3,3,3,1],
                     [0,1,3,1,1,1,1,1,1],
                     [0,0,1,3,1,0,0,0,0],
                     [0,0,0,1,3,1,0,0,0],
                     [0,0,0,0,1,1,1,0,0],])

kernel_d = np.array([[0,0,0,1,1,1,0,0,0],
                     [0,0,0,1,3,1,0,0,0],
                     [1,0,0,1,3,1,0,0,1],
                     [1,1,0,1,3,1,0,1,1],
                     [1,3,1,1,3,1,1,3,1],
                     [0,1,3,1,3,1,3,1,0],
                     [0,0,1,3,3,3,1,0,0],
                     [0,0,0,1,3,1,0,0,0],
                     [0,0,0,0,1,0,0,0,0],])

if __name__ == '__main__':
    imgdir = 'images'
    imgpaths = [imgdir+'/'+filename for filename in os.listdir(imgdir) if '.jpg' in filename]
    kernel_dict = {'right': kernel_r, 'up': kernel_u, 'left': kernel_l, 'down': kernel_d}

    plt.figure()
    n = 0
    for imgpath in imgpaths:
        n += 1
        img = cv2.imread(imgpath)
        # img.shape is (rows, columns, channels): height comes first, then width
        img_h, img_w = img.shape[:2]
        # Center-crop to a square by trimming the longer side equally at both ends
        if img_h < img_w:
            diff = (img_w - img_h) // 2
            img2 = img[:, diff:img_w-diff, :]
        else:
            diff = (img_h - img_w) // 2
            img2 = img[diff:img_h-diff, :, :]
        # 224 x 224 RGB copy, used only for display
        img4 = cv2.resize(img2, (224, 224))
        img4 = cv2.cvtColor(img4, cv2.COLOR_BGR2RGB)
        # Grayscale, shrink to 27 x 27, binarize, then scale to 0/1 for scoring
        img3 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
        img5 = cv2.resize(img3, (27, 27))
        ret, img6 = cv2.threshold(img5, 127, 255, cv2.THRESH_BINARY)
        img7 = img6/255
        # Score every kernel by the maximum of its response map
        scores = {name: np.max(conv(img7, kernel)) for name, kernel in kernel_dict.items()}
        # The direction with the highest score is the prediction
        best = max(scores, key=scores.get)
        # The 4 x 4 grid holds at most 16 images
        plt.subplot(4, 4, n)
        plt.axis('off')
        plt.imshow(img4)
        plt.title(imgpath.replace(imgdir+'/', '')+'('+best+')')
    plt.show()
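A note on the sliding-window step: `_convolve` does not flip the kernel, so strictly speaking it computes a cross-correlation, which is also what CNN layers do. The double Python loop can be replaced by one vectorized NumPy expression. The sketch below is one possible alternative, not the author's code; the function name `correlate_valid` is hypothetical, and it assumes NumPy 1.20 or later for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def correlate_valid(image, kernel):
    """Vectorized equivalent of the loop-based sliding window (no padding)."""
    # One kernel-sized view per output position: shape (out_h, out_w, k_h, k_w)
    windows = sliding_window_view(image, kernel.shape)
    # Element-wise product with the kernel, summed over each window
    responses = np.einsum('ijkl,kl->ij', windows, kernel)
    # Same clamping to [0, 255] and uint8 storage as the loop version
    return np.clip(responses, 0, 255).astype(np.uint8)


# Example with a random 27 x 27 binary image and a random 9 x 9 kernel
rng = np.random.default_rng(0)
image = rng.integers(0, 2, size=(27, 27)).astype(float)
kernel = rng.integers(0, 4, size=(9, 9))
out = correlate_valid(image, kernel)
print(out.shape)  # (19, 19)
```

At 27 x 27 the loop version is already fast enough, so this mainly matters if you want to skip the downscaling step or process many images.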