EMNLP 2023 - Pathological Graph-driven Cross-modal Alignment for Brain CT Report Generation

Author: 二仙桥下钊半仙

Last modified: 2024-07-04 10:12:42

First published: 2023-12-12 00:08:40

Article link: https://blog.csdn.net/qq_44837861/article/details/134937836

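This post introduces a method published at EMNLP 2023 that addresses coarse-grained supervision and coupled cross-modal alignment in Brain CT report generation. It proposes a Pathological Graph-driven Cross-modal Alignment branch that improves report generation through shared visual and textual embedding layers. The author presented the framework details and experimental experience at the conference and found that the exchanges there broadened their research horizons.

(Summary generated automatically by CSDN.)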

EMNLP 2023 - Oral Long Paper - Granularity Matters: Pathological Graph-driven Cross-modal Alignment for Brain CT Report Generation

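Preface

Since joining my research group, I have been looking for new breakthroughs in Medical Report Generation and have achieved some results. Below, I share one of these works, published at EMNLP 2023.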

[Oral Long Paper] “Granularity Matters: Pathological Graph-driven Cross-modal Alignment for Brain CT Report Generation”
Paper download: https://aclanthology.org/2023.emnlp-main.408/
For more information, see my video on Bilibili.
Link: My_Oral_Video

Introduction

Brain CT examination is widely used in the diagnosis of cranial diseases. However, writing reports can be time-consuming and error-prone for radiologists.

Automatic Brain CT report generation can improve the efficiency and accuracy of diagnosing brain diseases.

Current methods employ various cross-modal alignment mechanisms to improve the consistency of salient pathological features between the visual and textual modalities.
For instance, the Cross-modal Attention Mechanism concentrates on specific visual regions to mimic clinical observation during report generation.
The Cross-modal Memory Mechanism uses a memory matrix to capture recurring visual-textual patterns.
Cross-modal Contrastive Learning enables unsupervised feature alignment and has proved effective on our small-scale medical dataset.
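
To make the contrastive idea concrete, here is a minimal sketch of an InfoNCE-style loss in plain Python. It is a generic illustration, not the formulation of any specific method mentioned above: the function name `info_nce`, the temperature value, and the toy feature vectors are all assumptions. Matched image-text pairs share an index, and every other text in the batch acts as a negative. Only the image-to-text direction is computed, and inputs are assumed to be non-zero vectors.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    # Scale a non-zero vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce(image_feats, text_feats, temperature=0.1):
    """Image-to-text InfoNCE loss; pair i is (image_feats[i], text_feats[i])."""
    images = [normalize(v) for v in image_feats]
    texts = [normalize(v) for v in text_feats]
    total = 0.0
    for i, image in enumerate(images):
        logits = [dot(image, text) / temperature for text in texts]
        # Numerically stable log-sum-exp over all texts in the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(z - peak) for z in logits))
        # Cross-entropy with the matched text as the correct class.
        total += log_denominator - logits[i]
    return total / len(images)


# Toy features (assumed): aligned pairs versus deliberately swapped pairs.
aligned_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the aligned pairs yield a much smaller loss than the swapped pairs, which is the signal that pulls matching visual and textual features together without extra labels.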

Challenges:

The first is Coarse-grained Supervision: training data in image-text format lacks the detailed supervision needed to recognize subtle abnormalities.
The second is Coupled Cross-modal Alignment: visual-textual alignment tends to be coupled in a coarse-grained manner, which can produce tangled feature representations for report generation.

Contribution:
[Figure: summary of contributions]

Method

[Figure: overview of the framework]
Now, let’s look at our framework.
The model takes a series of Brain CT scans as input and generates a medical report.
It contains two parallel branches.
The first is the Brain CT report generation branch.
The second is the Pathological Graph-driven Cross-modal Alignment (PGCA) branch, which learns cross-modal consistency of pathologies and thereby improves report generation.
The two branches collaborate through shared visual and textual embedding layers.
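
The sharing described above can be pictured with a toy sketch. This is not the paper's implementation: the classes `SharedEmbedding`, `GenerationBranch`, and `AlignmentBranch`, the dict-based weights, and the `update` method (a stand-in for a gradient step) are all hypothetical. The only point is that both branches hold references to the same embedding objects, so what the alignment branch learns is immediately visible to the generation branch.

```python
class SharedEmbedding:
    """Toy embedding layer: one weight vector per key, stored in a dict."""

    def __init__(self, keys, dim=2):
        self.weights = {key: [0.0] * dim for key in keys}

    def __call__(self, keys):
        return [self.weights[key] for key in keys]


class GenerationBranch:
    """Stand-in for the report generation branch."""

    def __init__(self, visual_embed, text_embed):
        self.visual_embed = visual_embed
        self.text_embed = text_embed


class AlignmentBranch:
    """Stand-in for the alignment branch."""

    def __init__(self, visual_embed, text_embed):
        self.visual_embed = visual_embed
        self.text_embed = text_embed

    def update(self, token, delta):
        # Hypothetical stand-in for a gradient step from the alignment objective.
        vector = self.text_embed.weights[token]
        for i, change in enumerate(delta):
            vector[i] += change


# Both branches receive the SAME embedding objects (keys are made up).
visual_embed = SharedEmbedding(["scan_0", "scan_1"])
text_embed = SharedEmbedding(["hematoma", "edema"])
generation = GenerationBranch(visual_embed, text_embed)
alignment = AlignmentBranch(visual_embed, text_embed)

alignment.update("hematoma", [0.5, -0.5])
seen_by_generation = generation.text_embed(["hematoma"])
```

After the update, `seen_by_generation` holds the modified vector, because the two branches never had separate copies of the embedding layer.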
Now, let’s dive deeper into the details of each of these branches.
[Figure: the PGCA branch and the Pathological Graph]

Here is our PGCA branch.
First, we build a Pathological Graph that covers clinically significant attributes, such as tissues (shown in green) and lesions (shown in purple).
The intuition behind dividing entities into tissues and lesions is that together they reflect the backbone of a medical report. In the figure, you can see that the basic structure of a diagnostic sentence revolves around the relationship between tissues and lesions.
To capture this, we select key tissue and lesion entities as graph nodes and connect them with intra-attribute edges, which are fixed by expert knowledge and convey common medical understanding. We also establish inter-attribute edges, which adapt dynamically to the actual tissue-lesion relations in each report and reflect specific clinical observations. These edges partition the graph into three distinct sub-graphs: the tissue graph, the lesion graph, and the tissue-lesion graph.
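
The graph construction can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the entity lists, the fixed edges, and in particular the rule that links a tissue and a lesion when they appear in the same sentence. The post does not say how the paper extracts tissue-lesion relations, so this co-occurrence rule is only a simple stand-in.

```python
from itertools import product

# Hypothetical node vocabularies; the actual entity lists are not given in this post.
TISSUES = ["basal ganglia", "lateral ventricle", "frontal lobe"]
LESIONS = ["hematoma", "edema", "compression"]

# Hypothetical fixed intra-attribute edges, standing in for expert knowledge.
TISSUE_EDGES = {("basal ganglia", "lateral ventricle")}
LESION_EDGES = {("hematoma", "edema")}


def inter_attribute_edges(report):
    """Link a tissue to a lesion when both are mentioned in the same sentence.

    Sentence co-occurrence is an assumed rule, chosen only to keep the sketch short.
    """
    edges = set()
    for sentence in report.lower().split("."):
        tissues = [t for t in TISSUES if t in sentence]
        lesions = [l for l in LESIONS if l in sentence]
        edges.update(product(tissues, lesions))
    return edges


def build_pathological_graph(report):
    """Return the edge sets of the three sub-graphs for one report."""
    return {
        "tissue": set(TISSUE_EDGES),                      # fixed across reports
        "lesion": set(LESION_EDGES),                      # fixed across reports
        "tissue_lesion": inter_attribute_edges(report),   # changes per report
    }


report = "A hematoma is seen in the basal ganglia. The lateral ventricle shows compression."
graph = build_pathological_graph(report)
```

For this made-up report, the tissue and lesion sub-graphs keep their fixed edges, while the tissue-lesion sub-graph contains exactly the two pairs the report describes. A different report would change only the third edge set.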

Experiment

[Figure: experimental results]

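Some conference photos

Conference registration
Conference talk

Picture with Ronan Le Bras, Allen Institute for AI
Social dinner

Poster session

Conference event at Universal Studios

Best paper selection session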

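Summary

Overall, the conference experience was excellent. I met many people in the field, and the exchanges broadened my knowledge, so I came away with a lot. Thanks to EMNLP 2023 for this valuable opportunity to exchange ideas and learn. I will keep following the conference and aim to produce better work!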
