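Both papers come from Stanford University, and the authors are the same as in [Paper Reading Notes 50] Weak supervision in medical research on electronic health records (Part 1).
Paper 2 (Nature Communications): Ontology-driven weak supervision for clinical entity classification in electronic health records
Title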
Reference: Fries, J.A., Steinberg, E., Khattar, S. et al. Ontology-driven weak supervision for clinical entity classification in electronic health records. Nat Commun 12, 2017 (2021). https://doi-org.stanford.idm.oclc.org/10.1038/s41467-021-22328-4
Paper: https://arxiv.org/pdf/2008.01972.pdf
https://github.com/som-shahlab/trove
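Problem addressed and contributions
Problem: in clinical entity classification (NER) tasks, privacy constraints make it hard to share labeled data.
Contributions:
Proposes Trove, a weakly supervised entity classification framework that uses medical ontologies and expert-written rules.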
Content
An example of automatic labeling:
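To make the idea of automatic labeling concrete, here is a minimal sketch in plain Python. It is not the Trove API. The toy ontologies, the numeric rule, and all function names are hypothetical, and the votes are combined by simple majority vote, which is a simpler stand-in for the Snorkel label model that the paper trains.

```python
from collections import Counter

# Label values: 1 = disorder mention, 0 = not a disorder, -1 = abstain.
POSITIVE, NEGATIVE, ABSTAIN = 1, 0, -1

# Hypothetical toy "ontologies" (term -> label); real ones such as UMLS are far larger.
TOY_ONTOLOGY_A = {"fever": POSITIVE, "cough": POSITIVE, "aspirin": NEGATIVE}
TOY_ONTOLOGY_B = {"fever": POSITIVE, "pneumonia": POSITIVE, "tablet": NEGATIVE}


def make_ontology_lf(ontology):
    """Build a labeling function that looks a token up in one ontology."""
    def lf(tokens, i):
        return ontology.get(tokens[i].lower(), ABSTAIN)
    return lf


def rule_lf_numeric(tokens, i):
    """Hypothetical expert rule: a purely numeric token is not a disorder."""
    return NEGATIVE if tokens[i].isdigit() else ABSTAIN


def majority_vote(votes):
    """Combine votes from all labeling functions; abstain on no votes or a tie."""
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


def label_tokens(tokens, lfs):
    """Apply every labeling function to every token and merge the votes."""
    return [majority_vote([lf(tokens, i) for lf in lfs]) for i in range(len(tokens))]


LFS = [make_ontology_lf(TOY_ONTOLOGY_A), make_ontology_lf(TOY_ONTOLOGY_B), rule_lf_numeric]
tokens = "Patient reports fever and cough , took aspirin 500 mg".split()
labels = label_tokens(tokens, LFS)

for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

In this sketch, "fever" and "cough" receive the positive label, "aspirin" and "500" receive the negative label, and every other token is left unlabeled (abstain). The resulting noisy labels are what a downstream model would then be trained on.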
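Experiments
Experiment 1: different label sources (ontologies, rules, hand-labeled)
== Guidelines, UMLS, Other, Rules, Hand-labeled ==
Experiment 2: Trove compared with other weak supervision methods
Experiment 3: checking the correctness (accuracy) of the learned label sources
Experiment 4: case study
Monitoring patients who were tested for COVID-19 in the emergency department: clinical notes are analyzed for presenting symptoms/disorders and risk factors.
Code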
Code availability Trove is written in Python v3.6, spaCy 2.3.4 was used for NLP preprocessing, and Snorkel v0.9.5 was used for training the label model. BioBERT-Base v1.1, Transformers v2.8 [70], and PyTorch v1.1.0 were used to train all discriminative models.
Trove is open source software and publicly available at https://github.com/som-shahlab/trove; https://doi.org/10.5281/zenodo.4497214 [71]
Summary: I did not fully understand the medical content and will read the paper again later. The method itself has real practical value.