Removing interfering background from character recognition images with the traditional vertical projection algorithm

Latest recommended article published on 2023-08-01 15:34:27

雪回

Tags: algorithm

Article link: https://blog.csdn.net/Mintary/article/details/130827337

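[Image]

This algorithm was written specifically for the scene in this picture. The picture is not public, so part of it is masked. The area in the red box on the left is the interfering background, and the characters on the right are the region that needs to be recognized by OCR.

Because of the white clod of soil on the left, text detection gets worse and picks up things that are not characters.
[Image]

So the interfering part has to be cut away.
The interference does not appear at the same position in every picture, so marking a fixed position and cropping there does not work.

That is why I decided to count white pixels with a vertical projection and let the counts choose the cut position automatically.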
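As a rough illustration of what a vertical projection is, the sketch below counts the white pixels in each column of a tiny binary image. It deliberately avoids OpenCV so it can be compiled on its own: `BinaryImage` and `verticalProjection` are hypothetical names used only for this example, and the nested vector stands in for a binarized single-channel `cv::Mat`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a binarized single-channel image: rows of 0/255 bytes.
// (Hypothetical type for illustration only, not part of OpenCV.)
using BinaryImage = std::vector<std::vector<std::uint8_t>>;

// Returns, for every column, how many pixels in that column equal 255 (white).
std::vector<int> verticalProjection(const BinaryImage &img)
{
    if (img.empty())
    {
        return {};
    }
    std::vector<int> counts(img[0].size(), 0);
    for (const auto &row : img)
    {
        // Guard against ragged rows so we never index past counts.
        for (std::size_t col = 0; col < row.size() && col < counts.size(); ++col)
        {
            if (row[col] == 255)
            {
                ++counts[col];
            }
        }
    }
    return counts;
}
```

In such a profile a large white object shows up as a run of columns with high counts, while the gap next to it shows up as columns with counts near zero, which is the signal the cropping relies on.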

#include <iostream>
#include <opencv2/opencv.hpp>

// Crops the interfering region off the left of the image using a vertical
// projection. Expects a binarized single-channel (CV_8UC1) image.
cv::Mat getVerProjImage(const cv::Mat &image)
{
	cv::Mat matTmp = image.clone();
	int height = matTmp.rows, width = matTmp.cols; // image height and width
	int tmp = 0; // number of white (255) pixels in the current column

	// Scan the columns from the left: cut1 is the first column that
	// contains more than 10 white pixels.
	int cut1 = 0;
	for (int col = 0; col < width; ++col)
	{
		tmp = 0;
		for (int row = 0; row < height; ++row)
		{
			if (matTmp.at<uchar>(row, col) == 255) // white pixel
			{
				++tmp;
			}
		}
		if (tmp > 10)
		{
			cut1 = col;
			break;
		}
	}

	// Continue to the right of cut1: cut2 is the first column that
	// contains fewer than 6 white pixels.
	int cut2 = 0;
	for (int col = cut1 + 1; col < width; ++col)
	{
		tmp = 0;
		for (int row = 0; row < height; ++row)
		{
			if (matTmp.at<uchar>(row, col) == 255) // white pixel
			{
				++tmp;
			}
		}
		if (tmp < 6)
		{
			cut2 = col;
			break;
		}
	}

	std::cout << "height: " << height << std::endl;
	std::cout << "width: " << width << std::endl;
	std::cout << "cut1: " << cut1 << std::endl;
	std::cout << "cut2: " << cut2 << std::endl;

	// Keep everything to the right of cut2. The width shrinks by cut2 so the
	// rectangle stays inside the image. If cut2 is still 0, nothing is cropped.
	cv::Mat imageROI = matTmp(cv::Rect(cut2, 0, width - cut2, height));
	return imageROI;
}

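[Image]
The function finds the first column where the number of white pixels exceeds 10 and marks it as cut1, then finds the first later column where the number of white pixels drops below 6 and marks it as cut2. Cropping the picture at cut2 removes the interfering background. The two thresholds are values I tuned by hand for my own pictures.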
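The two-threshold scan can also be looked at in isolation, operating on a list of per-column white-pixel counts instead of on the image. This is only a sketch: `findCuts` and its parameter names are made up for the example, and the defaults 10 and 6 are simply the values quoted above.

```cpp
#include <utility>
#include <vector>

// Returns {cut1, cut2} for a list of per-column white-pixel counts.
// cut1: first column whose count is greater than startThreshold.
// cut2: first column after cut1 whose count is less than endThreshold.
// Either value stays 0 when no column qualifies, as in getVerProjImage.
// (Hypothetical helper for illustration.)
std::pair<int, int> findCuts(const std::vector<int> &projection,
                             int startThreshold = 10,
                             int endThreshold = 6)
{
    const int width = static_cast<int>(projection.size());

    int cut1 = 0;
    for (int col = 0; col < width; ++col)
    {
        if (projection[col] > startThreshold)
        {
            cut1 = col;
            break;
        }
    }

    int cut2 = 0;
    for (int col = cut1 + 1; col < width; ++col)
    {
        if (projection[col] < endThreshold)
        {
            cut2 = col;
            break;
        }
    }
    return {cut1, cut2};
}
```

One thing this makes visible: when no column ever exceeds the first threshold, cut1 stays 0 and the second scan starts at column 1, so a picture without any interference can still lose its first column. Whether that matters depends on the pictures being processed.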
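Pay special attention when using cv::Rect(cut2, 0, width-cut2, height). The arguments are cv::Rect(x origin, y origin, rectangle width, rectangle height), and the last two must change together with the origin: if x moves right by cut2, the width has to shrink by cut2, otherwise the rectangle reaches past the edge of the image. I overlooked this and spent a very long time hunting for the bug.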
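The constraint behind this is that OpenCV requires a region of interest to lie entirely inside the image, so x + width must not exceed the number of columns. The check can be sketched without OpenCV as follows; `Roi`, `fitsInside` and `rightOf` are hypothetical names for this example, with `Roi` mirroring the (x, y, width, height) order of cv::Rect.

```cpp
// Hypothetical stand-in for cv::Rect with the same field order.
struct Roi
{
    int x;
    int y;
    int width;
    int height;
};

// True when the rectangle lies completely inside an image of the given size.
bool fitsInside(const Roi &roi, int imageWidth, int imageHeight)
{
    return roi.x >= 0 && roi.y >= 0 &&
           roi.width >= 0 && roi.height >= 0 &&
           roi.x + roi.width <= imageWidth &&
           roi.y + roi.height <= imageHeight;
}

// Rectangle covering everything to the right of column `cut`:
// the origin moves right by `cut`, so the width shrinks by `cut`.
Roi rightOf(int cut, int imageWidth, int imageHeight)
{
    return Roi{cut, 0, imageWidth - cut, imageHeight};
}
```

For a 100 x 40 image and a cut at column 30, `rightOf` yields a width of 70, which fits. Keeping the full width of 100 with x = 30 would end at column 130 and fail the check.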
Calling the function

  cv::Mat rgbaImage = cv::imread(img_path, cv::IMREAD_COLOR);
  cv::cvtColor(rgbaImage, rgbaImage, cv::COLOR_BGR2GRAY);
  cv::threshold(rgbaImage, rgbaImage, 40, 255, cv::THRESH_BINARY);
  rgbaImage = getVerProjImage(rgbaImage);

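When I first used the function I left out the second and third lines (the grayscale conversion and the thresholding) and passed the three-channel image straight in. On a three-channel image, matTmp.at<uchar>(row, col) does not return the pixel in that column. It reads single bytes out of the interleaved B, G, R data, so the counts did not correspond to the columns I expected. The cut landed in strange positions, and I spent a long time not knowing why before I realized the channels were the cause. After adding cv::cvtColor, which turns the three-channel image into a single-channel one, and cv::threshold, which binarizes it so every pixel is exactly 0 or 255, the cut came out correct.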
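To see why a three-channel image misleads a single-channel read, consider how one row of an 8-bit colour image loaded by cv::imread is laid out in memory: B0 G0 R0 B1 G1 R1 and so on. The sketch below models that row as a plain byte vector. `readAsSingleChannel` and `readChannel` are hypothetical helpers for this example and only imitate the indexing, they are not OpenCV calls.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Indexes the row as if every pixel were one byte wide, which is the
// indexing a uchar read performs. On interleaved 3-channel data, `col`
// then selects a byte, not a pixel. (Hypothetical helper.)
std::uint8_t readAsSingleChannel(const std::vector<std::uint8_t> &rowBytes,
                                 std::size_t col)
{
    return rowBytes[col];
}

// Reads one channel (0 = B, 1 = G, 2 = R) of the pixel at index `pixel`
// from interleaved 3-channel data. (Hypothetical helper.)
std::uint8_t readChannel(const std::vector<std::uint8_t> &rowBytes,
                         std::size_t pixel,
                         std::size_t channel)
{
    return rowBytes[pixel * 3 + channel];
}
```

With the row {10, 20, 30, 40, 50, 60, 70, 80, 90}, the single-channel read at col = 1 gives 20, the green value of pixel 0, while the blue value of pixel 1 is 40. A loop over col from 0 to width - 1 therefore only walks the first third of each row, mixing all three channels, which is consistent with cut positions that look arbitrary.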
With this step added, OCR recognition accuracy improved greatly.
[Image]