Source
Title
FiNER: Financial Named Entity Recognition Dataset and Weak-Supervision Model
URL: https://arxiv.org/pdf/2302.11157.pdf
Authors & Affiliation
Agam Shah et al., Georgia Institute of Technology
The first author is a Computing PhD; the second and later authors are from the school of business
Publication
ACM SIGIR ’23: The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
Abstract
Omitted
Contribution
Dataset
Extending Snorkel’s existing weak-supervision framework for span-level labeling. This appears to be used for generating labeled data
FiNER-ORD
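Financial news articles, manually annotated (annotation errors may be present and need to be handled)
Count: 47,851 items
Source: webz.io (https://webz.io/free-datasets/financial-news-articles/)
Annotation tool: https://github.com/doccano/doccano
Entity labels: LOC, PER, and ORG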
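Issues:
Too few usable samples
Only 83 entities in the training set, 79 in the test set, and only 4 in the validation set
Human annotation errors
Limited entity types
Only three: LOC, PER, and ORG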
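Sketch for re-checking the per-split entity counts noted above. This is a minimal illustration, not the paper's code: it assumes the annotations are available as token-level BIO-style tags such as `B-PER` / `I-PER` / `O` (the actual tag format in FiNER-ORD may differ), and the function name `count_entities` and the toy tag sequence are made up for this note.

```python
from collections import Counter

def count_entities(tags):
    """Count entity spans per type in one BIO-tagged sequence (assumed tag format)."""
    counts = Counter()
    prev_type = None  # type of the entity span currently open, if any
    for tag in tags:
        if tag == "O":
            prev_type = None
            continue
        prefix, _, ent_type = tag.partition("-")
        # A new span starts on B-, or on an I- that does not continue the open span.
        if prefix == "B" or ent_type != prev_type:
            counts[ent_type] += 1
        prev_type = ent_type
    return counts

# Toy tag sequence (made up, not taken from FiNER-ORD).
tags = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC", "I-LOC", "O", "B-ORG"]
counts = count_entities(tags)
print(counts)
```

Running this over every sentence of a split and summing the counters would give the entity totals per label, which can then be compared across the train, validation, and test splits.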