Table OCR Resources [ICDAR] [Table Recognition] [Continuously Updated...]

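  • Definitions:
    • Table detection is the task of locating the regions of a page that contain tables.
    • Table structure recognition builds on the detected table regions and recovers each table's content and logical structure.
  • Code:
  • Datasets: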

| Name | Description | Content | Size | Link |
| --- | --- | --- | --- | --- |
| ICDAR2013 | PDF | US government and EU documents | | http://www.tamirhassan.com/html/dataset.html |
| ICDAR2017 page object detection (POD) | Page screenshots | | | |
| cTDaR 2019 | Two kinds of data: historical documents and modern documents | | | GitHub - cndplab-founder/ICDAR2019_cTDaR |
| TABLE2LATEX-450K | LaTeX | | 466,000 | https://github.com/bloomberg/TABLE2LATEX |
| DECO | Spreadsheets | | 1165 | DECO: A Dataset of Annotated Spreadsheets for Layout and Table Recognition (Database Systems Group) |
| Third-party personal dataset | Table detection in scanned English documents | | 403 | https://github.com/sgrpanchal31/table-detection-dataset |

About cTDaR 2019: the competition evaluates methods for table detection (Track A) and table recognition (Track B). Track A provides document images containing one or several tables. Track B has two subtracks: B.1 provides the table region, so only table structure recognition has to be performed; B.2 provides no a-priori information, so both the table region and the table structure have to be detected.
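To make the two task definitions at the top of this post more concrete, here is a minimal sketch of how their outputs could be represented: table detection yields a bounding box on the page, while table structure recognition yields logical cells with row and column spans. The names `Cell`, `Table`, and `to_grid`, as well as the coordinates, are hypothetical and chosen only for illustration; none of the datasets above necessarily uses this format.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed convention: (x_min, y_min, x_max, y_max) in page pixels.
BBox = Tuple[int, int, int, int]


@dataclass
class Cell:
    """One logical cell, as produced by table structure recognition."""
    row: int            # index of the first row the cell occupies
    col: int            # index of the first column the cell occupies
    row_span: int = 1   # number of rows the cell covers
    col_span: int = 1   # number of columns the cell covers
    text: str = ""


@dataclass
class Table:
    bbox: BBox          # output of table detection
    cells: List[Cell]   # output of table structure recognition

    def to_grid(self) -> List[List[str]]:
        """Expand spanning cells into a dense row-by-column grid of strings."""
        if not self.cells:
            return []
        n_rows = max(c.row + c.row_span for c in self.cells)
        n_cols = max(c.col + c.col_span for c in self.cells)
        grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
        for c in self.cells:
            # A spanning cell repeats its text in every grid slot it covers.
            for r in range(c.row, c.row + c.row_span):
                for k in range(c.col, c.col + c.col_span):
                    grid[r][k] = c.text
        return grid


# Hypothetical example: a header cell spanning two columns, then one data row.
table = Table(
    bbox=(50, 120, 560, 300),
    cells=[
        Cell(0, 0, col_span=2, text="Dataset"),
        Cell(1, 0, text="DECO"),
        Cell(1, 1, text="1165"),
    ],
)
grid = table.to_grid()
```

Keeping the spans explicit matters because, as noted for some of the papers in this post, cells that cross rows or columns are exactly what simpler structure models fail to handle.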

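  • Papers:
    • The ICDAR2019 conference included 16 papers related to table recognition
    • 5 of them address table detection
    • 8 address table structure recognition
    • 1 performs both table detection and structure recognition
    • 2 release new datasets for table recognition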

| Task | Paper | Notes | Authors | Code | Data |
| --- | --- | --- | --- | --- | --- |
| Recognition | A Genetic-based Search for Adaptive Table Recognition in Spreadsheets | Traditional image-based approach, applied to Excel screenshots | | | |
| Recognition | Deep Splitting and Merging for Table Structure Decomposition | State of the art on the dataset of the ICDAR2013 table competition's structure recognition subtask | Adobe Research | | |
| Recognition | DeepTabStR: Deep Learning based Table Structure Recognition | | | | |
| Recognition | ReS2TIM: Reconstruct Syntactic Structures from Table Images | F1 of 0.74 on ICDAR2013 | | | |
| Recognition | Rethinking Semantic Segmentation for Table Structure Recognition in Documents | Cannot handle cells spanning multiple rows or columns | | | |
| Recognition | Rethinking Table Recognition using Graph Neural Networks | Handles both bordered and borderless tables; no pretrained model provided | | GitHub - shahrukhqasim/TIES-2.0 (code for S.R. Qasim, H. Mahmood, and F. Shafait, 2019) | Synthetic; a data generation tool is provided |
| Recognition | Table Structure Extraction with Bi-directional Gated Recurrent Unit Networks | | | | |
| End-to-end detection and recognition | TableNet: Deep Learning Model for End-to-end Table Detection and Tabular Data Extraction from Scanned Document Images | F1 on ICDAR2013: 96.62% (detection) and 91.51% (recognition) | | | |
| Detection | A GAN-based Feature Generator for Table Detection | State of the art on ICDAR13/17 | Wangxuan Institute of Computer Technology, Peking University | | |
| Detection | A YOLO-based Table Detection Method | | | | |
| Detection | Faster R-CNN Based Table Detection Combining Corner Locating | | | | ICDAR2017 POD dataset |
| Detection | Table Detection in Invoice Documents by Graph Neural Networks | | | | Drawn from RVL-CDIP invoice data |
| End-to-end | CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents | | | GitHub - DevashishPrasad/CascadeTabNet | |
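Several entries above report F1 scores. As a rough illustration of what such a number measures for table detection, the sketch below matches predicted boxes to ground-truth boxes by intersection over union (IoU) and derives precision, recall, and F1. This is a generic simplification under assumed conventions (greedy matching, a 0.5 IoU threshold, hypothetical function names `iou` and `detection_f1`); the ICDAR competitions define their own evaluation protocols, which may differ from this.

```python
from typing import List, Tuple

# Assumed convention: (x_min, y_min, x_max, y_max).
BBox = Tuple[float, float, float, float]


def iou(a: BBox, b: BBox) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_f1(pred: List[BBox], truth: List[BBox],
                 threshold: float = 0.5) -> Tuple[float, float, float]:
    """Greedily match each prediction to its best unmatched ground-truth box."""
    matched = set()
    true_positives = 0
    for p in pred:
        best_j, best_score = -1, 0.0
        for j, t in enumerate(truth):
            if j in matched:
                continue
            score = iou(p, t)
            if score > best_score:
                best_j, best_score = j, score
        # Count a hit only when the best overlap reaches the threshold.
        if best_j >= 0 and best_score >= threshold:
            matched.add(best_j)
            true_positives += 1
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


# Hypothetical page with two true tables; one is found, one prediction is spurious.
truth_boxes = [(0, 0, 100, 50), (0, 100, 100, 150)]
pred_boxes = [(0, 0, 100, 40), (200, 200, 300, 300)]
precision, recall, f1 = detection_f1(pred_boxes, truth_boxes)
```

Because the result depends on the matching rule and the IoU threshold, F1 values from different papers are only comparable when they follow the same protocol on the same dataset.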
