Tesseract Call Graph

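Page layout analysis in Tesseract OCR consists of three main steps: reading the image and setting up the internal structure (PIX), running page layout analysis, and obtaining the text orientation of each text block. When an image is read, images that carry an alpha channel are converted to RGBA format. Page layout analysis determines the text orientation and vertical alignment of text blocks, and uses ColumnFinder to find columns and so locate blocks such as text, images, rule lines and tables.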

Route 0 - Overall path

main[api/tesseractmain.cpp] -> 
TessBaseAPI::ProcessPages[api/baseapi.cpp] -> 
TessBaseAPI::ProcessPage[api/baseapi.cpp] -> 
{
    TessBaseAPI::Recognize [api/baseapi.cpp] -> 
    {
        TessBaseAPI::FindLines [api/baseapi.cpp] ->
        Tesseract::SegmentPage [ccmain/pagesegmain.cpp] ->
        Tesseract::AutoPageSeg [ccmain/pagesegmain.cpp]
        ...
    }

    TessResultRenderer::AddImage [api/renderer.cpp] --->
    TessTextRenderer::AddImageHandler [api/renderer.cpp] --->
    TessBaseAPI::GetUTF8Text [api/baseapi.cpp] ->
    TessBaseAPI::Recognize [api/baseapi.cpp] -> 
    {
        TessBaseAPI::FindLines [api/baseapi.cpp] ->
        Tesseract::SegmentPage [ccmain/pagesegmain.cpp] ->
        Tesseract::AutoPageSeg [ccmain/pagesegmain.cpp]
        ...
    }

}

Recognize(NULL) can be called by:

  • TessBoxTextRenderer::AddImageHandler -> TessBaseAPI::GetBoxText()
  • TessHOcrRenderer::AddImageHandler -> TessBaseAPI::GetHOCRText()
  • TessTextRenderer::AddImageHandler -> TessBaseAPI::GetUTF8Text()
  • TessUnlvRenderer::AddImageHandler -> TessBaseAPI::GetUNLVText()
  • TessBaseAPI::MeanTextConf() -> TessBaseAPI::AllWordConfidences()
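
Route 0 reaches Recognize twice inside ProcessPage: once directly, and once through the renderer's text getter. The toy model below sketches how such a call pattern can avoid recognizing the page twice, assuming the getters guard the call with a "recognition done" flag. That guard is an assumption about the implementation, not something the trace above shows, and ToyApi, GetText and ProcessPageModel are hypothetical names, not Tesseract code.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for TessBaseAPI. NOT Tesseract's actual code.
class ToyApi {
 public:
  // Stands in for Recognize(NULL): layout analysis plus recognition.
  int Recognize() {
    ++recognize_runs_;
    text_ = "recognized text";
    recognition_done_ = true;
    return 0;
  }

  // Stands in for GetUTF8Text() and the other getters listed above:
  // recognition is triggered only if it has not already been done
  // (assumed guard).
  std::string GetText() {
    if (!recognition_done_ && Recognize() < 0) return "";
    return text_;
  }

  int recognize_runs() const { return recognize_runs_; }

 private:
  bool recognition_done_ = false;
  int recognize_runs_ = 0;
  std::string text_;
};

// Same shape as ProcessPage in Route 0: an explicit Recognize call,
// followed by a renderer-style getter that fetches the text.
inline int ProcessPageModel(ToyApi& api, std::string* out) {
  assert(out != nullptr);
  if (api.Recognize() < 0) return -1;
  *out = api.GetText();
  return 0;
}
```

In this model the getter still works when nothing has called Recognize beforehand, which is the situation in which the callers listed above would invoke Recognize(NULL) themselves.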

Route 1 - Read image and setup internal structure (PIX)

main[api/tesseractmain.cpp] -> 
TessBaseAPI::ProcessPages[api/baseapi.cpp] -> 
TessBaseAPI::ProcessPagesInternal[api/baseapi.cpp] -> 
(i) **For TIFF**
{
    TessBaseAPI::ProcessPagesMultipageTiff[api/baseapi.cpp] ->  
}
(ii) **For non TIFF** 
    pixReadMem[liblept/src/readfile.c]
    (This is a variation of pixReadStream(), where the data is read from a memory buffer rather than a file.)
    (IFF_BMP, IFF_JFIF_JPEG, IFF_PNG, IFF_TIFF[#####], IFF_PNM, IFF_GIF, IFF_JP2, IFF_WEBP, IFF_SPIX, IFF_UNKNOWN)
    -> 
    pixReadMemPng [liblept/src/pngio.c]->
    pixReadStreamPng [liblept/src/pngio.c]->

TessBaseAPI::ProcessPage[api/baseapi.cpp] -> 
SetInputName[api/baseapi.cpp]
SetImage[api/baseapi.cpp]->
TessBaseAPI::InternalSetImage[api/baseapi.cpp]->
ImageThresholder::SetImage[ccmain/thresholder.cpp]
TessBaseAPI::Recognize [api/baseapi.cpp] -> 
TessBaseAPI::FindLines [api/baseapi.cpp] -> 
....
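
The TIFF / non-TIFF split in Route 1 depends on knowing the image format before decoding. A minimal sketch of that decision, assuming the format is recognized from the leading magic bytes of the in-memory buffer. SniffFormat and RouteFor are hypothetical helpers written for illustration; Leptonica has its own format detection, which is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

enum class SniffedFormat { kTiff, kPng, kJpeg, kOther };

// Hypothetical helper: classify a buffer by its first bytes.
inline SniffedFormat SniffFormat(const unsigned char* data, std::size_t size) {
  assert(data != nullptr || size == 0);
  // 8-byte PNG signature.
  static const unsigned char kPngSig[8] = {0x89, 'P', 'N', 'G',
                                           0x0D, 0x0A, 0x1A, 0x0A};
  if (size >= 8 && std::memcmp(data, kPngSig, 8) == 0) {
    return SniffedFormat::kPng;
  }
  // TIFF header: little-endian "II*\0" or big-endian "MM\0*".
  if (size >= 4 && (std::memcmp(data, "II*\0", 4) == 0 ||
                    std::memcmp(data, "MM\0*", 4) == 0)) {
    return SniffedFormat::kTiff;
  }
  // JPEG streams start with the SOI marker FF D8.
  if (size >= 2 && data[0] == 0xFF && data[1] == 0xD8) {
    return SniffedFormat::kJpeg;
  }
  return SniffedFormat::kOther;
}

// Maps the sniffed format onto the two branches named in Route 1.
inline const char* RouteFor(SniffedFormat format) {
  return format == SniffedFormat::kTiff ? "ProcessPagesMultipageTiff"
                                        : "pixReadMem";
}
```

Only TIFF takes the multipage branch; every other format goes through pixReadMem and then on to ProcessPage as a single image.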

Comments from liblept/src/pngio.c

spp == 2 (gray + alpha), spp == 3 (rgb), spp == 4 (rgba)
grayscale + alpha; convert to RGBA
do not support 2 spp PIX
Any image with alpha is converted to RGBA (spp = 4, with equal red, green and blue channels) on reading.
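
The gray + alpha conversion described in these comments can be sketched as follows. This is a standalone illustration, not the code from pngio.c: ComposeRgba and GrayAlphaToRgba are hypothetical helpers, and the 32-bit layout (red in the most significant byte, alpha in the least significant) is an assumption about how the pixels are packed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Packs four 8-bit samples into one 32-bit pixel.
// Assumed layout: R in bits 24-31, G in 16-23, B in 8-15, A in 0-7.
inline std::uint32_t ComposeRgba(std::uint8_t r, std::uint8_t g,
                                 std::uint8_t b, std::uint8_t a) {
  return (static_cast<std::uint32_t>(r) << 24) |
         (static_cast<std::uint32_t>(g) << 16) |
         (static_cast<std::uint32_t>(b) << 8) |
         static_cast<std::uint32_t>(a);
}

// Expands interleaved gray + alpha samples (spp == 2, two bytes per pixel)
// into RGBA pixels (spp == 4) with equal red, green and blue channels.
inline std::vector<std::uint32_t> GrayAlphaToRgba(
    const std::vector<std::uint8_t>& gray_alpha) {
  assert(gray_alpha.size() % 2 == 0);
  std::vector<std::uint32_t> rgba;
  rgba.reserve(gray_alpha.size() / 2);
  for (std::size_t i = 0; i + 1 < gray_alpha.size(); i += 2) {
    const std::uint8_t gray = gray_alpha[i];
    const std::uint8_t alpha = gray_alpha[i + 1];
    rgba.push_back(ComposeRgba(gray, gray, gray, alpha));
  }
  return rgba;
}
```

Copying the gray value into all three color channels is what gives the "equal red, green and blue channels" mentioned above, and the alpha sample is carried over unchanged.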
There are three important cases with alpha:

  • (a) grayscale-with-alpha (spp = 2), where bpp = 8, and each pixel has an associated alpha (transparency) value in the second component of the