OCR with the Components Bundled in Office 2003 (Source Code Included)

This article is excerpted from: http://www.codeproject.com/KB/office/modi.aspx

Source code download: Download source files - 34.1 KB (I don't know how to upload files here, so please use the download link in the original article.)

Introduction

Optical Character Recognition (OCR) extracts text and layout information from document images. With the help of Microsoft Office Document Imaging Library (MODI), which is contained in the Office 2003 package, you can easily integrate OCR functionality into your own applications. In combination with the MODI Document Viewer control, you will have complete OCR support with only a few lines of code.

Important note: MS Office XP does not contain MODI; MS Office 2003 is required.

Getting Started

Adding the Library

First of all, you need to add the library's reference to your project: Microsoft Office Document Imaging 11.0 Type Library (located in MDIVWCTL.DLL).

Create a Document Instance and Assign an Image File

Supported image formats are TIFF, multi-page TIFF, and BMP.

_MODIDocument = new MODI.Document();
_MODIDocument.Create(filename);
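
Since only the formats listed above are accepted, it can help to check the file name before handing it to Create. The following is a minimal sketch, not part of the original sample: the class ModiInput and the method IsSupportedImage are hypothetical names, and the check looks only at the extension, not at the file contents.

```csharp
using System;
using System.IO;

public static class ModiInput
{
    // Extensions for the formats the article lists (TIFF, multi-page TIFF, BMP).
    private static readonly string[] SupportedExtensions = { ".tif", ".tiff", ".bmp" };

    // Hypothetical helper: true when the file name carries a supported extension.
    public static bool IsSupportedImage(string filename)
    {
        if (string.IsNullOrEmpty(filename)) return false;
        string ext = Path.GetExtension(filename);
        return Array.Exists(SupportedExtensions,
            e => string.Equals(e, ext, StringComparison.OrdinalIgnoreCase));
    }
}
```

A caller could skip or convert the file when IsSupportedImage returns false instead of letting the Create call fail.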

Call the OCR Method

The OCR process is started by the MODIDocument.OCR method.

// The MODI call for OCR
_MODIDocument.OCR(_MODIParameters.Language,
    _MODIParameters.WithAutoRotation,
    _MODIParameters.WithStraightenImage);

With the Document.OCR call, all the contained pages of the document are processed. You can also call the OCR method for each page separately, by calling the MODIImage.OCR method in the very same way. As you can see, the OCR method has three parameters:

  • Language
  • AutoRotation
  • StraightenImages

The use of these parameters depends on your specific imaging scenario.

Screenshot - modiSettings.JPG

Tracking the OCR Progress

Since the whole recognition process can take a few seconds, you may want to keep an eye on the progress. The OnOCRProgress event serves this purpose.

// add event handler for progress visualisation
_MODIDocument.OnOCRProgress +=
    new MODI._IDocumentEvents_OnOCRProgressEventHandler(this.ShowProgress);

public void ShowProgress(int progress, ref bool cancel)
{
    statusBar1.Text = progress.ToString() + "% processed.";
}
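
The handler's second argument is passed by reference, which suggests that a handler can ask for the recognition to stop by setting it to true. The sketch below illustrates that pattern without MODI: ProgressDemo, OcrProgressHandler, SimulateOcr and CancelAt are hypothetical names, and the loop is only a stand-in for the engine, so how MODI itself honors the flag remains an assumption.

```csharp
using System;

public static class ProgressDemo
{
    // Same shape as the handler in the article: a percentage plus a by-ref cancel flag.
    public delegate void OcrProgressHandler(int progress, ref bool cancel);

    // Hypothetical stand-in for the OCR engine: reports progress in steps of 10
    // and stops as soon as the handler sets cancel to true.
    public static int SimulateOcr(OcrProgressHandler handler)
    {
        int last = 0;
        for (int progress = 0; progress <= 100; progress += 10)
        {
            bool cancel = false;
            handler(progress, ref cancel);
            last = progress;
            if (cancel) break;
        }
        return last; // last percentage that was reported
    }

    // Builds a handler that requests cancellation once the threshold is reached.
    public static OcrProgressHandler CancelAt(int threshold)
    {
        return delegate(int progress, ref bool cancel)
        {
            if (progress >= threshold) cancel = true;
        };
    }
}
```

In a real form, the threshold check would typically be replaced by a flag that a Cancel button sets.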

The Document Viewer

Together with the MODI document model comes the MODI viewer component AxMODI.AxMiDocView. The viewer is contained in the same library as the document model (MDIVWCTL.DLL). With a single statement, you can assign the document to the viewer. The viewer offers many operations, such as selection and panning.

axMiDocView1.Document = _MODIDocument;

To make the component available in Visual Studio, go to the Toolbox Explorer, open the context menu, select Add/Delete Elements..., and choose the COM Controls tab. Then search for Microsoft Office Document Imaging Viewer 11.0 and enable it.

Processing the Recognition Result

Working with the result structure is straightforward. If you just want the full text, you only need the image's Layout.Text property. As an example of further processing, here is a small statistics method:

private void Statistic()
{
    // Iterate through the document's structure and collect some statistics.
    string statistic = "";
    for (int i = 0; i < _MODIDocument.Images.Count; i++)
    {
        int numOfCharacters = 0;
        int charactersHeights = 0;
        MODI.Image image = (MODI.Image)_MODIDocument.Images[i];
        MODI.Layout layout = image.Layout;
        // Get the page's words.
        for (int j = 0; j < layout.Words.Count; j++)
        {
            MODI.Word word = (MODI.Word)layout.Words[j];
            // Get the word's character rectangles.
            for (int k = 0; k < word.Rects.Count; k++)
            {
                MODI.MiRect rect = (MODI.MiRect)word.Rects[k];
                charactersHeights += rect.Bottom - rect.Top;
                numOfCharacters++;
            }
        }
        float avHeight = (float)charactersHeights / numOfCharacters;
        statistic += "Page " + i + ": Average character height is: " +
            avHeight.ToString("0.00") + " pixel!" + "\r\n";
    }
    MessageBox.Show("Document Statistic:\r\n" + statistic);
}
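
One detail of the method above: on a page where no characters were recognized, the division is 0 / 0 in floating point, which yields NaN rather than throwing. The averaging step can be isolated and guarded as in the sketch below. HeightStatistics, Box and AverageHeight are hypothetical names; Box is a plain stand-in for the Top and Bottom values read from each rectangle, so the code runs without MODI.

```csharp
using System;
using System.Collections.Generic;

public static class HeightStatistics
{
    // Hypothetical plain-data stand-in for a rectangle's vertical extent.
    public struct Box
    {
        public int Top;
        public int Bottom;
        public Box(int top, int bottom) { Top = top; Bottom = bottom; }
    }

    // Returns the mean height, or 0 when the page produced no rectangles.
    public static float AverageHeight(IList<Box> boxes)
    {
        if (boxes == null || boxes.Count == 0) return 0f;
        long total = 0;
        foreach (Box b in boxes) total += b.Bottom - b.Top;
        return (float)total / boxes.Count;
    }
}
```

Keeping the arithmetic separate from the COM object model also makes it possible to test the statistics without Office installed.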

Searching

MODI also offers a full-featured built-in search. Since a document may contain several pages, you can use the search method to browse through the pages.

Screenshot - modiSearch.JPG

MODI offers several arguments to customize your search.

// convert our search dialog properties to corresponding MODI arguments
object PageNum = _DialogSearch.Properties.PageNum;
object WordIndex = _DialogSearch.Properties.WordIndex;
object StartAfterIndex = _DialogSearch.Properties.StartAfterIndex;
object Backward = _DialogSearch.Properties.Backward;
bool MatchMinus = _DialogSearch.Properties.MatchMinus;
bool MatchFullHalfWidthForm = _DialogSearch.Properties.MatchFullHalfWidthForm;
bool MatchHiraganaKatakana = _DialogSearch.Properties.MatchHiraganaKatakana;
bool IgnoreSpace =_DialogSearch.Properties.IgnoreSpace;

To use the search function, you need to create an instance of the type MiDocSearchClass and pass all search arguments to its Initialize method:

// initialize MODI search
MODI.MiDocSearchClass search = new MODI.MiDocSearchClass();
search.Initialize(
    _MODIDocument,
    _DialogSearch.Properties.Pattern,
    ref PageNum,
    ref WordIndex,
    ref StartAfterIndex,
    ref Backward,
    MatchMinus,
    MatchFullHalfWidthForm,
    MatchHiraganaKatakana,
    IgnoreSpace);

After the initialization call of the search instance, the process call itself is simple:

MODI.IMiSelectableItem SelectableItem = null;
// the one and only search call
search.Search(null,ref SelectableItem);

You will find the search results in the referenced SelectableItem argument. The MODI search has impressive features and works very well. Admittedly, it is restricted to plain-text search. In most real-world applications, you will need some kind of fuzzy searching, since your text results may be corrupted by single OCR errors. But for a few lines of integration code, it is impressive functionality.
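
The fuzzy searching mentioned above is not part of MODI. One common way to approximate it is to compare each recognized word with the pattern using the Levenshtein edit distance and accept words within a small error budget. The sketch below assumes the recognized words have already been copied into a list of strings; FuzzySearch, Distance and Find are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class FuzzySearch
{
    // Classic Levenshtein edit distance (insertions, deletions, substitutions).
    public static int Distance(string a, string b)
    {
        int[] prev = new int[b.Length + 1];
        int[] curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.Length];
    }

    // Returns the indices of all words within maxErrors edits of the pattern,
    // ignoring case.
    public static List<int> Find(IList<string> words, string pattern, int maxErrors)
    {
        List<int> hits = new List<int>();
        string needle = pattern.ToLowerInvariant();
        for (int i = 0; i < words.Count; i++)
        {
            if (Distance(words[i].ToLowerInvariant(), needle) <= maxErrors)
                hits.Add(i);
        }
        return hits;
    }
}
```

With maxErrors set to 1, a search for "invoice" would also match a misread word such as "lnvoice". Larger budgets find more damaged words at the cost of more false matches.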

MODI, Office 2007 and Vista

Good news: Office 2007 and Vista both support MODI! It is not installed by default, but you can easily add the package through the installation options of Office 2007. Just rerun setup.exe from your Office installation and choose the package as shown in the screenshot below.

Screenshot - modi_vista.jpg

About Document Processing

OCR is only one step in document processing. To get more structured access to the information in your paper-based documents, several steps and techniques are usually required:

Scanning

Before documents are available as images, they have to be digitized. This process is called 'scanning.' There are two important standards used for interacting with the scanning hardware: TWAIN and WIA. There are (at least) two good articles on CodeProject on how to use these APIs.

Image Processing

Although the scanning devices are getting better, a couple of methods can be used to increase the image quality. These pre-processing functions include noise reduction and angle correction, for instance.

OCR Itself

As a next step, OCR itself interprets pixel-based images into layout and text elements. OCR can be called the 'highest' bottom-up technology, where the system has little or no knowledge about the business context. Recognizing handwritten documents is often called ICR (Intelligent Character Recognition).

Document Classification

In most business cases, you have certain target structures you want to fill with the document information. That is called 'Document Classification and Detail Extraction.' For instance, you might want to process invoices, or you have certain table structures to fill. In Document Processing Part II, you can see how this kind of content knowledge can be used.

Beyond

After that, you might have an address database you want to match the document addresses with. Due to 'noisy' environments or disordered information, you need more sophisticated techniques than simple SQL. In the last step, the extracted information is passed to the client application (like an ERP backbone), where customized workflow activities are triggered. The industry coins new names for this every couple of months: ECM (Enterprise Content Management), DMS (Document Management System), IDP (Intelligent Document Processing), DLC (Document Life Cycle).


Reposted from: https://www.cnblogs.com/xuneng/archive/2008/08/01/1258391.html

