Paper: BP-GPT: Auditory Neural Decoding Using fMRI-prompted LLM
Code: https://github.com/1994cxy/BP-GPT
The English here is typed entirely by hand; it is my summarizing and paraphrasing of the original paper. Spelling and grammar mistakes are hard to avoid, so if you spot any, feel free to point them out in the comments! This post leans toward personal notes, so read with caution.
1. Thoughts
(1) Sorry for digging this one up to read so soon, xd. I just happened to come across it, so take this as a bit of publicity; a few more GitHub stars certainly wouldn't hurt.
(2) It is only four pages so far, a light and pleasant read.
(3) A paper a day keeps my hair away.
2. Section-by-Section Close Reading of the Paper
2.1. Abstract
①Existing problem (as claimed): current LLM-based methods are not end-to-end when extracting semantics from fMRI????? That feels like an overgeneralization to me; I don't think it makes for a good limitation.
②They propose Brain Prompt GPT (BP-GPT) to decode fMRI into text by aligning the fMRI and text representations
2.2. Introduction
①I do appreciate this: opening with a famous quote. Only the young write like this, a genuine storybook rather than an eight-legged essay.
“The limits of my language mean the limits of my world” - Ludwig Wittgenstein.
If the authors believe that language is what brings understanding, there is a whiff of never-being-able-to-progress about it. In practice, coining new words happens all the time and our vocabulary keeps being updated, yet AI does not seem able to update itself automatically.
②The rate at which words are spoken does not match the much slower BOLD response
③Challenge: decoding multiple words within one repetition time (TR); with a TR of roughly 2 s, several words are spoken within a single fMRI frame. (Isn't this existing problem far more reasonable than that end-to-end one above???)
④Framework of BP-GPT:
(honestly, this figure could still be polished a bit more....)
2.3. Method
2.3.1. fMRI to Text Decoding
①Encode the fMRI by $p^{f} = E_{f}(x)$, where $E_{f}$ denotes the encoder and $x$ denotes the fMRI signal (sketched in code after this list)
②The fMRI encoder is trained with a BCE loss of the standard form $\mathcal{L}_{\mathrm{BCE}} = -\sum_{i}\big[y_{i}\log\hat{y}_{i} + (1-y_{i})\log(1-\hat{y}_{i})\big]$
③The similarity of the positive pair, i.e. the fMRI prompt $p^{f}_{i}$ and the text prompt $p^{t}_{i}$ of the same sample, is computed as $\exp\big(\mathrm{sim}(p^{f}_{i},p^{t}_{i})/\tau\big)$, where $\tau$ is a temperature hyperparameter
④Negative pairs come from different samples, and their similarity is computed as $\exp\big(\mathrm{sim}(p^{f}_{i},p^{t}_{j})/\tau\big)$ for $j\neq i$
⑤The contrastive loss then takes the standard InfoNCE form (see the PyTorch sketch below):
$\mathcal{L}_{\mathrm{con}} = -\log\dfrac{\exp\big(\mathrm{sim}(p^{f}_{i},p^{t}_{i})/\tau\big)}{\sum_{j}\exp\big(\mathrm{sim}(p^{f}_{i},p^{t}_{j})/\tau\big)}$
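A minimal PyTorch sketch of section 2.3.1 as I read it: an encoder that maps an fMRI window to a prompt sequence, plus an InfoNCE-style contrastive loss in which the matching fMRI/text prompts are the positive pair and the other samples in the batch are negatives. Everything here (the name FMRIEncoder, cosine similarity as sim, mean-pooling over time, the sizes) is my assumption, not the paper's code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FMRIEncoder(nn.Module):
    """Hypothetical E_f: maps an fMRI window x of shape (batch, time, voxels)
    to a sequence of prompt embeddings p^f. The 8-layer / 8-head Transformer
    mirrors the setup listed in section 2.4.2; the rest is my assumption."""
    def __init__(self, n_voxels: int, d_model: int = 768):
        super().__init__()
        self.proj = nn.Linear(n_voxels, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=8)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.encoder(self.proj(x))  # p^f = E_f(x)

def info_nce(fmri_prompt: torch.Tensor, text_prompt: torch.Tensor,
             tau: float = 0.07) -> torch.Tensor:
    """InfoNCE-style contrastive loss: the fMRI prompt and the text prompt of
    the same sample form the positive pair; the other samples in the batch act
    as negatives. tau is the temperature hyperparameter."""
    f = F.normalize(fmri_prompt.mean(dim=1), dim=-1)   # pool over time -> (B, d)
    t = F.normalize(text_prompt.mean(dim=1), dim=-1)
    logits = f @ t.T / tau                             # cosine similarity / tau
    labels = torch.arange(f.size(0), device=f.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)
```

Note that the diagonal of logits holds the positive-pair similarities, so cross-entropy over each row is exactly the InfoNCE fraction written above.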
2.3.2. Training
①The BCE loss is used for training the text prompt, while the decoder is trained with an objective that combines the text loss with the contrastive loss (see the sketch below)
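A hedged sketch of the training step as I understand section 2.3.2, reusing FMRIEncoder and info_nce from the sketch above. Using GPT-2 as the decoder LLM and adding the two losses with a weight lam are both my assumptions:

```python
import torch
from transformers import GPT2LMHeadModel

def training_step(fmri, text_ids, fmri_encoder, gpt2: GPT2LMHeadModel,
                  lam: float = 1.0):
    """Hypothetical combined objective: token-level loss on the decoded text
    plus the contrastive alignment term; lam is an assumed weight."""
    fmri_prompt = fmri_encoder(fmri)           # (B, T_f, 768), matches GPT-2 width
    text_emb = gpt2.transformer.wte(text_ids)  # (B, T_t, 768) token embeddings
    inputs = torch.cat([fmri_prompt, text_emb], dim=1)
    # Exclude the prompt positions from the LM loss (label -100 is ignored).
    prompt_pad = torch.full(fmri_prompt.shape[:2], -100,
                            dtype=torch.long, device=text_ids.device)
    labels = torch.cat([prompt_pad, text_ids], dim=1)
    out = gpt2(inputs_embeds=inputs, labels=labels)
    return out.loss + lam * info_nce(fmri_prompt, text_emb)
```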
2.3.3. Inference
①The length of the sentence does not line up with the fMRI windows: "The current solution in recent work utilizes a word-rate model to predict the number of words perceived by the participant. When the length of the generated text reaches the word count predicted by the word-rate model, the text generation process stops. While this approach can solve the problem, it does not fully exploit the characteristics of the LLM."
②So they insert a special split token $ into the real text at boundaries determined by the TR, letting the LLM itself decide when each window's text ends (see the sketch below)
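A pure-Python sketch of this split-token trick: interleave $ between the words of consecutive TRs in the training text, and at inference cut the generation at the first $ instead of consulting a word-rate model. The function names and the per-TR word grouping are my assumptions; only the $ token itself comes from the paper:

```python
SPLIT = "$"

def build_target_text(words_per_tr):
    """Insert the split token between the words belonging to consecutive TRs,
    so the LLM learns to emit '$' exactly when one TR's worth of text is done."""
    return f" {SPLIT} ".join(" ".join(words) for words in words_per_tr)

def cut_at_split(generated: str) -> str:
    """At inference, keep only the text before the first split token."""
    return generated.split(SPLIT, 1)[0].strip()

# Hypothetical example: three TRs of ground-truth words.
tr_words = [["i", "could", "hear"], ["the", "rain", "on"], ["the", "roof"]]
print(build_target_text(tr_words))   # i could hear $ the rain on $ the roof
print(cut_at_split("i could hear $ the rain"))  # i could hear
```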
2.4. Experiment
2.4.1. Dataset
①Dataset:
A. LeBel, L. Wagner, S. Jain, A. Adhikari-Desai, B. Gupta, A. Morgenthal, J. Tang, L. Xu, and A. G. Huth, "A natural language fMRI dataset for voxelwise encoding models," Scientific Data, vol. 10, no. 1, p. 555, 2023.
②Subjects: they choose 3 of the 8 subjects
③Setting: subjects passively listened to naturally spoken English stories, such as The Moth Radio Hour and the New York Times Modern Love podcast
2.4.2. Implementation Details
①Time-series window for the fMRI sequence and the corresponding text: 20 s, with no gap
②Length of the prompt:
③Input dimension of BERT: 512
④Transformer: 8 layers with 8 attention heads
⑤Optimizer: AdamW
⑥Batch size: 32
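For convenience, the hyperparameters listed above gathered into one hypothetical config object (the field names are mine; the values are the ones given in this section):

```python
from dataclasses import dataclass

@dataclass
class BPGPTConfig:
    """Hyperparameters as listed in section 2.4.2; field names are my own."""
    window_seconds: float = 20.0   # fMRI/text time-series window, no gap
    bert_input_dim: int = 512      # input dimension of BERT
    n_layers: int = 8              # Transformer layers
    n_heads: int = 8               # attention heads per layer
    optimizer: str = "AdamW"
    batch_size: int = 32
```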
2.4.3. Baseline and Evaluation Metrics
①Test set: story “Where There's Smoke”
2.4.4. Evaluation of the Text Prompt
①Performance:
2.4.5. Evaluation of fMRI to Text Decoding
①Performance table:
2.4.6. Ablation Study
①Contrastive module ablation:
②Fine-tuning ablation:
2.5. Conclusion
~
3. Reference
@article{chen2025bp,
  title={BP-GPT: Auditory Neural Decoding Using fMRI-prompted LLM},
  author={Chen, Xiaoyu and Du, Changde and Liu, Che and Wang, Yizhe and He, Huiguang},
  journal={arXiv preprint arXiv:2502.15172},
  year={2025}
}