Using PP-Structure for table recognition: save_structure_res only produces res_0.txt, no Excel file

Problem description:

I recently used YOLO to detect tables, then ran PP-Structure to recognize the text in each table and called save_structure_res to save the recognized table as an Excel file. I found that some tables were never saved as Excel: only res_0.txt was generated, and opening res_0.txt showed type='figure'.

Cause:

PP-Structure first performs layout analysis, and its default layout model distinguishes several region types (text, title, figure, table, etc.). A region that a human would call a table is not necessarily classified as a table by the model; when it is labeled 'figure', the table-recognition branch never runs, so no Excel file is written.

Solution:

1. Replace PP-Structure's default layout-analysis model with the table-only layout model.

import cv2
from paddleocr import PPStructure, save_structure_res

# Build the engine first, pointing the layout stage at the table-only model.
# (In the original snippet the call to table_engine appeared before its
# definition; the order is fixed here.)
table_engine = PPStructure(
    use_gpu=False,
    show_log=True,
    lang='ch',
    layout_model_dir='new_model/layout/picodet_lcnet_x1_0_fgd_layout_table_infer',
    # The original post passed this as layout_url; in PaddleOCR 2.x the
    # argument is named layout_dict_path.
    layout_dict_path='new_model/layout/layout_table_dict.txt',
)

table_name = file.split(".")[0]  # output name derived from the image file name
save_folder = table_folder_path
img = cv2.imread(table_path)
result = table_engine(img)
# Save the structure results; the Excel file is written here for table regions
save_structure_res(result, save_folder, table_name)

Set layout_model_dir to the picodet_lcnet_x1_0_fgd_layout_table_infer model, which can be downloaded from:

https://gitee.com/paddlepaddle/PaddleOCR/blob/release/2.7/ppstructure/docs/models_list.md

Set the layout dictionary (layout_url in the original snippet; named layout_dict_path in PaddleOCR 2.x) to layout_table_dict.txt, which can be downloaded from:

https://gitee.com/paddlepaddle/PaddleOCR/blob/release/2.7/ppocr/utils/dict/layout_dict/layout_table_dict.txt

2. After completing the step above, you may find that no tables at all are saved as Excel; you then also need to modify predict_system.py in your local PaddleOCR installation:

On line 127, change table to text.

On line 201, change table to text.

After that, every table is saved as an Excel file. I suspect some tables may still fail to be recognized, though I have not run into that case yet (this is because I had already used YOLO to detect and crop the tables beforehand).
