The PageSegMode in Tesseract is REALLY, REALLY, REALLY confusing!!
Well, it all kind of makes sense in the end, once one understands the exact meaning of the terms used in the Tesseract documentation, such as column, block, line…
- The so-called "Automatic page segmentation" is supposedly checked by PSM_COL_FIND_ENABLED (an inline function, despite the macro-style name), which is true for modes 1, 2, and 3:
inline bool PSM_COL_FIND_ENABLED(int pageseg_mode)
{
    return pageseg_mode >= PSM_AUTO_OSD && pageseg_mode <= PSM_AUTO;
}
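To see which modes actually pass this check, here is a minimal standalone sketch. It assumes the PageSegMode values as I remember them from Tesseract's publictypes.h, copied into a hypothetical `sketch` namespace so it compiles without the Tesseract headers. `ColumnFindingModes` is a helper made up for illustration and is not part of Tesseract.

```cpp
#include <vector>

namespace sketch {

// Local copy of Tesseract's PageSegMode values (assumed to match
// publictypes.h; check against your installed version).
enum PageSegMode {
  PSM_OSD_ONLY = 0,
  PSM_AUTO_OSD = 1,
  PSM_AUTO_ONLY = 2,
  PSM_AUTO = 3,
  PSM_SINGLE_COLUMN = 4,
  PSM_SINGLE_BLOCK_VERT_TEXT = 5,
  PSM_SINGLE_BLOCK = 6,
  PSM_SINGLE_LINE = 7,
  PSM_SINGLE_WORD = 8,
  PSM_CIRCLE_WORD = 9,
  PSM_SINGLE_CHAR = 10,
  PSM_SPARSE_TEXT = 11,
  PSM_SPARSE_TEXT_OSD = 12,
  PSM_RAW_LINE = 13,
  PSM_COUNT
};

// Same range check as the function quoted above.
inline bool PSM_COL_FIND_ENABLED(int pageseg_mode) {
  return pageseg_mode >= PSM_AUTO_OSD && pageseg_mode <= PSM_AUTO;
}

// Hypothetical helper: collect every mode for which column finding is on.
inline std::vector<int> ColumnFindingModes() {
  std::vector<int> modes;
  for (int mode = PSM_OSD_ONLY; mode < PSM_COUNT; ++mode) {
    if (PSM_COL_FIND_ENABLED(mode)) {
      modes.push_back(mode);
    }
  }
  return modes;
}

}  // namespace sketch
```

Under these assumptions, `sketch::ColumnFindingModes()` yields only 1, 2 and 3, so mode 0 (OSD only) and everything from 4 (single column) upwards skip column finding.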
- When that check passes, it will invoke Tesseract::AutoPageSeg, then ColumnFinder::FindBlocks, and most importantly TabFind::FindTabVectors, which in theory should find at least one column.
- Otherwise ColumnFinder::MakeColumns later