C++ OpenCV: table cell extraction


To extract table cells, I apply an AND operation to the mask images obtained in my previous posts, which gives a joints image marking where the table lines intersect. Processing the joints image yields the coordinates of those intersections, and the table is split into cells from these coordinates. I store the coordinates in a vector, remove duplicate values with sort, unique, and erase, and then apply a threshold to pick out the coordinates of the table lines. Each cell is then cropped from the image.
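The deduplicate-then-threshold step described above can be tried on its own, without OpenCV. The following is a minimal sketch and not the code from this post: it swaps in a simpler rule that groups sorted coordinates into runs separated by a gap and keeps the middle of each run. The function name `collapse_to_lines` and the parameter `min_gap` are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Collapse raw joint-pixel coordinates along one axis into one value per
// table line. `min_gap` is an assumed tuning parameter: neighbouring
// coordinates that differ by at most `min_gap` count as the same line.
std::vector<int> collapse_to_lines(std::vector<int> coords, int min_gap = 10)
{
    assert(min_gap >= 0);

    // Sort and drop duplicates, as in the post (sort + unique + erase).
    std::sort(coords.begin(), coords.end());
    coords.erase(std::unique(coords.begin(), coords.end()), coords.end());

    std::vector<int> lines;
    std::size_t start = 0; // index of the first coordinate in the current run
    for (std::size_t i = 1; i <= coords.size(); ++i)
    {
        // A run ends at the end of the data or where the gap is large.
        if (i == coords.size() || coords[i] - coords[i - 1] > min_gap)
        {
            // Keep the middle element of the run [start, i - 1].
            lines.push_back(coords[start + (i - 1 - start) / 2]);
            start = i;
        }
    }
    return lines;
}
```

For example, `collapse_to_lines({10, 11, 12, 100, 101, 102, 12, 200, 201})` returns `{11, 101, 200}`: three thick lines reduced to one coordinate each. Calling it once with the X coordinates and once with the Y coordinates gives the column and row lines of the grid.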

The core code is as follows:
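// joints, gray and save_Img come from the table-extraction code in the earlier posts.
// Filtered coordinates: one entry per table line.
vector<int> Variable_Pixel_White_Y_OK;
vector<int> Variable_Pixel_White_X_OK;
// Raw row (Y) and column (X) coordinates of every white pixel in joints.
vector<int> Variable_Pixel_White_Y;
vector<int> Variable_Pixel_White_X;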
int pixel_white = 0;
int white_pixel_Y_last = 0;
for (int y = 0; y < joints.rows; y++)//rows
{
	for (int x = 0; x < joints.cols; x++)//cols
	{
		pixel_white = joints.at<uchar>(y, x);
		if (pixel_white == 255)// position of a white pixel
		{
			Variable_Pixel_White_Y.push_back(y);//row
			Variable_Pixel_White_X.push_back(x);//col
		}
	}
}
if (Variable_Pixel_White_X.size() > 2 && Variable_Pixel_White_Y.size() > 2)
{
	//========================================================================================================
	sort(Variable_Pixel_White_X.begin(), Variable_Pixel_White_X.end());
	Variable_Pixel_White_X.erase(unique(Variable_Pixel_White_X.begin(), Variable_Pixel_White_X.end()), Variable_Pixel_White_X.end());
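	// i + 2 < size() avoids unsigned underflow when fewer than 3 values remain after unique
	for (size_t i = 0; i + 2 < Variable_Pixel_White_X.size(); i++)
	{
		// keep X[i + 1] when the gap after it is more than 10 px wider than the gap before it
		if ((Variable_Pixel_White_X[i + 2] - Variable_Pixel_White_X[i + 1]) - (Variable_Pixel_White_X[i + 1] - Variable_Pixel_White_X[i]) > 10)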
		{

			Variable_Pixel_White_X_OK.push_back(Variable_Pixel_White_X[i + 1]);
		}
	}
	Variable_Pixel_White_X_OK.push_back(Variable_Pixel_White_X[Variable_Pixel_White_X.size() - 1]);
	//========================================================================================================
	//========================================================================================================
	sort(Variable_Pixel_White_Y.begin(), Variable_Pixel_White_Y.end());
	Variable_Pixel_White_Y.erase(unique(Variable_Pixel_White_Y.begin(), Variable_Pixel_White_Y.end()), Variable_Pixel_White_Y.end());
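	// i + 2 < size() avoids unsigned underflow when fewer than 3 values remain after unique
	for (size_t i = 0; i + 2 < Variable_Pixel_White_Y.size(); i++)
	{
		// keep Y[i + 1] when the gap after it is more than 10 px wider than the gap before it
		if ((Variable_Pixel_White_Y[i + 2] - Variable_Pixel_White_Y[i + 1]) - (Variable_Pixel_White_Y[i + 1] - Variable_Pixel_White_Y[i]) > 10)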
		{

			Variable_Pixel_White_Y_OK.push_back(Variable_Pixel_White_Y[i + 1]);
		}
	}
	Variable_Pixel_White_Y_OK.push_back(Variable_Pixel_White_Y[Variable_Pixel_White_Y.size() - 1]);
	//========================================================================================================
	//========================================================================================================
	cout << "cols:" << Variable_Pixel_White_X_OK.size() - 1 << endl;
	cout << "rows:" << Variable_Pixel_White_Y_OK.size() - 1 << endl;
	//========================================================================================================
	//========================================================================================================
	//------------------------------------------->split into cells<-------------------------------------------
	int rect_x = 0, rect_y = 0;
	int d = 0, h = 0;// width and height between adjacent table lines
	for (size_t i = 0; i + 1 < Variable_Pixel_White_Y_OK.size(); i++)
	{
		for (size_t j = 0; j + 1 < Variable_Pixel_White_X_OK.size(); j++)
		{
			//
			d = Variable_Pixel_White_X_OK[j + 1] - Variable_Pixel_White_X_OK[j];
			h = Variable_Pixel_White_Y_OK[i + 1] - Variable_Pixel_White_Y_OK[i];
			rect_x = Variable_Pixel_White_X_OK[j];
			rect_y = Variable_Pixel_White_Y_OK[i];
			//(0 <= roi.x && 0 <= roi.width &&
			//roi.x + roi.width <= m.cols &&
			//0 <= roi.y && 0 <= roi.height &&
			//roi.y + roi.height <= m.rows)
			if (rect_x + 5 >= 0 &&
				rect_y + 5 >= 0 &&
				d - 10 >= 0 &&
				h - 10 >= 0 &&
				rect_x + 5 + d - 10 <= gray.cols &&
				rect_y + 5 + h - 10 <= gray.rows)
			{
				Rect rect(rect_x + 5, rect_y + 5, d - 10, h - 10);
				Mat ROI = gray(rect);
				string Img_Name = "./title_time/roi/" + to_string(save_Img) + ".jpg";
				save_Img++;
				imwrite(Img_Name, ROI);
			}
		}
	}
}
else
{
	// Too few joint pixels were found, so there is no grid to split.
}
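
The splitting loop above can also be checked without OpenCV by computing only the rectangles. The following is a minimal sketch under stated assumptions: `CellRect` is a hypothetical stand-in for `cv::Rect`, `grid_cells` and `margin` are illustrative names, and the 5-pixel inset mirrors the `+ 5` / `- 10` arithmetic in the code above. Unlike that code, this sketch also skips rectangles whose inner width or height is zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cv::Rect so the sketch compiles without OpenCV.
struct CellRect { int x, y, width, height; };

// Build one inset rectangle per pair of adjacent column lines (xs) and
// row lines (ys), keeping only rectangles that lie inside the image.
std::vector<CellRect> grid_cells(const std::vector<int>& xs,
                                 const std::vector<int>& ys,
                                 int image_cols, int image_rows,
                                 int margin = 5)
{
    assert(margin >= 0);
    std::vector<CellRect> cells;
    for (std::size_t i = 0; i + 1 < ys.size(); ++i)      // rows
    {
        for (std::size_t j = 0; j + 1 < xs.size(); ++j)  // columns
        {
            // Shrink the cell by `margin` on every side to drop the border lines.
            CellRect r{xs[j] + margin, ys[i] + margin,
                       xs[j + 1] - xs[j] - 2 * margin,
                       ys[i + 1] - ys[i] - 2 * margin};
            const bool inside = r.x >= 0 && r.y >= 0 &&
                                r.width > 0 && r.height > 0 &&
                                r.x + r.width <= image_cols &&
                                r.y + r.height <= image_rows;
            if (inside)
                cells.push_back(r);
        }
    }
    return cells;
}
```

With column lines at 0, 100, 200 and row lines at 0, 50 in a 200 x 50 image, this yields two cells of 90 x 40 pixels at x = 5 and x = 105. Each returned rectangle could then be turned into a `cv::Rect` and used to crop `gray` as in the code above.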

However, this method only works for regular tables. Tables that contain merged cells cannot be handled this way, and that part needs further study. I am working on a new method for extracting and splitting tables with merged cells, and that work will build on this code. To use this code, refer to the table-extraction posts I published a few days ago and add this part to that code. I won't list all of it here.

I hope this helps. If you have any questions, please comment on this post or send me a private message, and I will reply when I have time.
