EMNLP 2023 - Pathological Graph-driven Cross-modal Alignment for Brain CT Report Generation

EMNLP 2023 - Oral Long Paper - Granularity Matters: Pathological Graph-driven Cross-modal Alignment for Brain CT Report Generation

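Preface

Since joining the research group, I have been looking for new breakthroughs in Medical Report Generation and have achieved some results. Below, I share a paper published at EMNLP 2023.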

[Oral Long Paper] “Granularity Matters: Pathological Graph-driven Cross-modal Alignment for Brain CT Report Generation”
Paper download: https://aclanthology.org/2023.emnlp-main.408/
For more information, see my video on Bilibili.
Link: My_Oral_Video

Introduction

Brain CT examination is widely used in diagnosing cranial diseases. However, writing reports can be time-consuming and error-prone for radiologists.

Automatic Brain CT report generation can improve the efficiency and accuracy of diagnosing brain diseases.

Current methods employ various cross-modal alignment mechanisms to improve the consistency of salient pathological features between the visual and textual modalities.
For instance, the Cross-modal Attention Mechanism concentrates on specific visual regions to mimic clinical observation during report generation.
Besides, the Cross-modal Memory Mechanism uses a memory matrix to model visual-textual relations.
Moreover, Cross-modal Contrastive Learning facilitates unsupervised feature alignment and has proven effective on our small-scale medical dataset.
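To make the idea of contrastive alignment concrete, here is a minimal sketch of a generic InfoNCE-style image-to-text loss in plain Python. It is an illustration of the general technique only, not the loss used in the paper; the function names, the temperature value, and the toy 2-D embeddings are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(visual, textual, temperature=0.1):
    """Mean image-to-text InfoNCE loss.

    visual[i] and textual[i] are assumed to be a matched pair;
    every other text in the batch acts as a negative.
    """
    losses = []
    for i, v in enumerate(visual):
        logits = [cosine(v, t) / temperature for t in textual]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (hypothetical): matched pairs point in the same direction.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
# Same vectors, but the texts are swapped, so each pair is mismatched.
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss is small when each visual embedding is closest to its own text and large when the pairs are mismatched, which is the signal that pulls matched visual and textual features together without extra labels.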

Challenges:

  1. Coarse-grained Supervision: training data in image-text format lacks the detailed supervision needed to recognize subtle abnormalities.
  2. Coupled Cross-modal Alignment: visual-textual alignment may be coupled in a coarse-grained manner, which can produce tangled feature representations for report generation.

Contribution:
[Figure: contributions]

Method

[Figure: framework overview]
Now, let’s delve into our framework.
At its core, we feed a series of Brain CT scans as input with the goal of generating a medical report.
Our model contains two parallel branches.
First off, there’s the Brain CT report generation branch.
Alongside that, we have the Pathological Graph-driven Cross-modal Alignment branch, designed to learn the consistency across different modalities of pathologies and improve the overall report generation process.
These two branches collaborate through shared visual and textual embedding layers.
Now, let’s dive deeper into the details of each of these branches.
[Figure: PGCA branch]
Here is our PGCA branch.
First, we organize a Pathological Graph to encompass clinically significant attributes, such as tissues represented in green and lesions in purple.
The intuition behind dividing the attributes into tissues and lesions is that together they reflect the backbone of a medical report. Examining the figure, you’ll notice that the fundamental structure of diagnostic sentences revolves around the relationship between tissues and lesions.
To capture this, we select key tissue and lesion entities as graph nodes, connecting them through intra-attribute edges fixed by expert knowledge to convey common medical understanding. Additionally, we establish inter-attribute edges that dynamically adapt based on actual tissue-lesion relations in reports, reflecting specific clinical observations. These edges effectively partition the graph into three distinct sub-graphs: the tissue graph, the lesion graph, and the tissue-lesion graph.
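The graph construction described above can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the tissue and lesion vocabularies, the fixed intra-attribute edges, and the rule of linking a tissue and a lesion when they appear in the same sentence are all assumptions made for the example.

```python
from itertools import product

# Hypothetical node vocabularies; the paper's actual entity lists are not given here.
TISSUES = ["basal ganglia", "lateral ventricle", "frontal lobe"]
LESIONS = ["high density", "low density", "compression"]

# Hypothetical fixed intra-attribute edges, standing in for expert knowledge.
INTRA_TISSUE_EDGES = {("basal ganglia", "lateral ventricle")}
INTRA_LESION_EDGES = {("high density", "compression")}


def inter_attribute_edges(report):
    """Link a tissue and a lesion when both are mentioned in the same sentence.

    Sentence co-occurrence is an assumed stand-in for the tissue-lesion
    relations extracted from reports.
    """
    edges = set()
    for sentence in report.lower().split("."):
        tissues = [t for t in TISSUES if t in sentence]
        lesions = [l for l in LESIONS if l in sentence]
        edges.update(product(tissues, lesions))
    return edges


def build_pathological_graph(report):
    """Return the three sub-graphs as edge sets."""
    return {
        "tissue_graph": set(INTRA_TISSUE_EDGES),               # fixed
        "lesion_graph": set(INTRA_LESION_EDGES),               # fixed
        "tissue_lesion_graph": inter_attribute_edges(report),  # per report
    }


report = (
    "High density is seen in the left basal ganglia. "
    "The lateral ventricle shows compression."
)
graph = build_pathological_graph(report)
```

Only the tissue-lesion sub-graph changes from report to report; the two intra-attribute sub-graphs stay the same for every sample, which mirrors the split between fixed and dynamically adapted edges.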


Experiment

[Figure: experimental results]

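Some conference photos

Conference registration
[Image]
Conference talk
[Image]

Picture with Ronan Le Bras, Allen Institute for AI
[Image]
Social dinner
[Image]
Poster session
[Image]
Conference event at Universal Studios
[Image]
At the best paper award selection
[Image]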

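Summary

Overall, the conference experience was excellent. I met many friends in the field and broadened my knowledge through discussions, so it was very rewarding. Thanks to EMNLP 2023 for this valuable opportunity to exchange ideas and learn. I will keep following the conference and strive to produce even better work!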
