Open-Source Knowledge Graph Data: Benchmark Datasets
- MSRA NER entity dataset https://github.com/GuocaiL/nlp_corpus/tree/main/open_ner_data/MSRA
- People's Daily entity dataset https://github.com/GuocaiL/nlp_corpus/tree/main/open_ner_data/people_daily
- Sina Weibo entity dataset https://github.com/GuocaiL/nlp_corpus/tree/main/open_ner_data/weibo
- CLUENER fine-grained entity dataset https://github.com/GuocaiL/nlp_corpus/tree/main/open_ner_data/cluener_public
- Yidu-S4K medical named entity recognition dataset https://github.com/GuocaiL/nlp_corpus/tree/main/open_ner_data/yidu-s4k
- Test-and-evaluation domain entity dataset https://www.biendata.xyz/competition/ccks_2020_8/
- BosonNLP entity dataset https://github.com/GuocaiL/nlp_corpus/tree/main/open_ner_data/boson
- Film, music, and book entity dataset https://github.com/GuocaiL/nlp_corpus/tree/main/open_ner_data/video_music_book_datasets
- Chinese electronic medical record entity dataset https://www.biendata.xyz/competition/CCKS2017_2
- Chinese résumé entity dataset https://github.com/GuocaiL/nlp_corpus/tree/main/open_ner_data/ResumeNER
- CoNLL 2003 dataset https://www.clips.uantwerpen.be/conll2003/ner/
- OntoNotes 5.0 dataset https://catalog.ldc.upenn.edu/ldc2013t19
- ACE entity relation dataset https://catalog.ldc.upenn.edu/byproject
- SemEval entity relation dataset https://github.com/thunlp/OpenNRE/blob/master/benchmark/download_semeval.sh
- FewRel entity relation dataset https://github.com/thunlp/OpenNRE/blob/master/benchmark/download_fewrel.sh
- Wiki80 entity relation dataset https://github.com/thunlp/OpenNRE/blob/master/benchmark/download_wiki80.sh
- NYT10 entity relation dataset https://github.com/thunlp/OpenNRE/blob/master/benchmark/download_nyt10.sh
- DuIE 2.0 entity relation dataset https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/information_extraction/DuIE
- COAE2016 entity relation dataset https://ccir2016.ccnl.scut.edu.cn/caoe_test.php
- Person entity relation dataset https://github.com/SUDA-HLT/IPRE
- Chinese Literature Text document-level entity relation dataset https://github.com/lancopku/Chinese-Literature-NER-RE-Dataset
- DocRED document-level entity relation dataset https://github.com/thunlp/DocRED
- ACE event extraction dataset https://github.com/nlpcl-lab/ace2005-preprocessing
- Medical event extraction dataset https://www.biendata.xyz/competition/ccks_2020_2_1/
- CCKS2020 financial-domain few-shot transfer event extraction dataset https://www.biendata.xyz/competition/ccks_2020_3/
- CCKS2020 financial-domain event subject extraction dataset https://www.biendata.xyz/competition/ccks_2020_4_1/data/
- CCKS2020 financial-domain document-level event extraction dataset https://www.biendata.xyz/competition/ccks_2020_4_2/data/
- CCKS2021 financial-domain document-level event extraction dataset https://www.biendata.xyz/competition/ccks_2021_task6_1/data/
- DuEE-Fin document-level event extraction dataset https://aistudio.baidu.com/aistudio/competition/detail/65/0/introduction
- DuEE Baidu Chinese sentence-level event extraction dataset https://aistudio.baidu.com/aistudio/projectdetail/1639964
- iFLYTEK open-domain event extraction dataset http://challenge.xfyun.cn/topic/info?type=hotspot
- CCKS2021 general fine-grained event detection dataset https://biendata.xyz/competition/ccks_2021_maven/data/
- CEC event extraction dataset https://codechina.csdn.net/mirrors/shijiebei2009/CEC-Corpus
- Financial-domain document-level event causality extraction dataset https://www.biendata.xyz/competition/ccks_2021_task6_2/data/
- SemEval/SCIF sentence-level causal event relation dataset https://ait.ocn.0semey1:201/ndex.phooid=tass
- FB15k knowledge representation dataset https://web.informatik.uni-mannheim.de/pi1/kge-datasets/fb15k.tar.gz
- FB15k-237 knowledge representation dataset https://web.informatik.uni-mannheim.de/pi1/kge-datasets/fb15k-237.tar.gz
- WN18 knowledge representation dataset https://web.informatik.uni-mannheim.de/pi1/kge-datasets/wn18.tar.gz
- WN18RR knowledge representation dataset https://web.informatik.uni-mannheim.de/pi1/kge-datasets/wnrr.tar.gz
- YAGO3-10 knowledge representation dataset https://web.informatik.uni-mannheim.de/pi1/kge-datasets/yago3-10.tar.gz
- ogbl-biokg knowledge representation dataset https://github.com/snap-stanford/ogb
- ogbl-wikikg2 knowledge representation dataset https://github.com/snap-stanford/ogb
- NLPCC 2013 Chinese Weibo entity linking dataset http://www.softcont.com/e/nlpcc2013/
- NLPCC 2014 entity linking dataset http://tcci.ccf.org.cn/conference/2014/pages/page04_tdata.html
- NLPCC 2015 entity linking dataset https://www.biendata.xyz/ccf_tcci2018/datasets/tcci_tag/2
- KBP 2017 entity linking dataset http://nlp.cs.rpi.edu/kbp/2017/
- KBP 2019 entity linking dataset http://nlp.cs.rpi.edu/kbp/2019/
- CCKS 2019 Chinese short-text entity linking dataset https://biendata.xyz/competition/ccks_2019_el/
- CCKS 2020 Chinese short-text entity linking dataset https://www.biendata.xyz/competition/ccks_2020_el/
- Knowledge Works (知识工厂) entity linking dataset https://github.com/lhiclh/chinese_entity_linking
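
Many of the entity datasets listed above are distributed as, or can be converted to, token sequences with BIO tags. As a minimal sketch of how such annotations are usually consumed, the snippet below groups tagged tokens into entity spans. It assumes the data is already in BIO form; the function name `bio_to_spans` and the sample sentence are hypothetical and not taken from any of the listed corpora.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str], sep: str = " ") -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs.

    Use sep="" for character-level Chinese corpora.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Close the entity that is currently open, if any.
        if current_tokens and current_type is not None:
            spans.append((sep.join(current_tokens), current_type))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity.
            flush()
            current_tokens, current_type = [], None
    flush()
    return spans


# Hypothetical sample sentence, for illustration only.
tokens = ["Alice", "Smith", "visited", "New", "York"]
tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]
print(bio_to_spans(tokens, tags))
```

Some corpora use other schemes such as BIOES or IOB1, so the tag handling would need adjusting for those.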
Open-Source Knowledge Graph Tools: Ontology Construction Tools
- Protégé https://protege.stanford.edu
- NeOn Toolkit http://neon-toolkit.org/wiki/Main_Page.html
- Altova SemanticWorks https://www.lesliesikos.com/
- TopBraid Composer http://www.topquadrant.com/
- Mind mapping (XMind) https://www.xmind.cn/xmind8-pro/
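Open-Source Knowledge Graph Tools: Annotation Tools
- YEDDA/SUTDAnnotator https://github.com/jiesutd/YEDDA (lightweight entity annotation, suited to individual experiments)
- Chinese-Annotator https://github.com/crownpku/Chinese-Annotator (suited to text classification annotation)
- Brat https://github.com/nlplab/brat (the most complete feature set, widely used in academia)
- doccano https://github.com/doccano (supports everything except entity relations, event arguments, and event relations)
- Marktool https://github.com/chosendai/MarkTool (actively maintained, the most complete feature set)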
Open-Source Knowledge Graph Tools: Knowledge Extraction Tools
- DeepKE https://github.com/zjunlp/deepke
- OpenNRE https://github.com/thunlp/OpenNRE.git
- DeepDive https://www.openkg.cn/dataset/cn-deepdive
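Open-Source Knowledge Graph Tools: Large-Scale Graph Storage Tools
- Principle: a graph database is not always the best choice; MongoDB is sometimes a popular option, and RDF is rarely used in industry
  - Choose according to the actual data scale and application scenario
  - Where no multi-hop queries are involved, a suitable relational database can be used
  - Where multi-hop queries, shortest paths, or reasoning and analysis are involved, consider an RDF database
- Graph databases commonly used in industry
  - Neo4j graph database https://neo4j.com
  - HugeGraph https://hugegraph.github.io/hugegraph-doc/
  - NebulaGraph https://github.com/vesoft-inc/nebula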
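
To make the multi-hop and shortest-path criteria in the principles above concrete, here is a minimal sketch of a shortest-path query over triples held in memory. It is plain breadth-first search, not the query engine of any listed database; the triples and the helper names `build_adjacency` and `shortest_path` are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def build_adjacency(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Index triples by head entity so each hop is a dictionary lookup."""
    adjacency: Dict[str, List[Tuple[str, str]]] = {}
    for head, relation, tail in triples:
        adjacency.setdefault(head, []).append((relation, tail))
    return adjacency


def shortest_path(adjacency: Dict[str, List[Tuple[str, str]]],
                  start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search following edges head -> tail; returns the entity path or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _relation, neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical toy graph, for illustration only.
triples = [
    ("Alice", "works_at", "AcmeCorp"),
    ("AcmeCorp", "located_in", "Berlin"),
    ("Berlin", "capital_of", "Germany"),
    ("Alice", "friend_of", "Bob"),
]
adjacency = build_adjacency(triples)
print(shortest_path(adjacency, "Alice", "Germany"))
```

In a relational schema each hop of this traversal would typically become another self-join on the triple table, which is one reason the number of hops matters when choosing a store.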
Open-Source Knowledge Graph Tools: Graph Algorithm and Computation Tools
- PyTorch Geometric (PyG) https://github.com/rusty1s/
- tf_geometric https://github.com/CrawlScript/tf_geometric
- Deep Graph Library (DGL) https://github.com/dmlc/dgl
- CogDL https://github.com/THUDM/cogdl
- GraphEmbedding https://github.com/shenweichen/GraphEmbedding
- Spark GraphX http://spark.apache.org/graphx/
- networkx https://networkx.org
- Plato https://github.com/tencent/plato
Open-Source Knowledge Graph Tools: Knowledge Fusion Tools
- Dedupe https://github.com/dedupeio/dedupe
- Falcon-AO http://ws.nju.edu.cn/falcon-ao/
- LIMES https://github.com/dice-group/LIMES
- OpenEA https://github.com/nju-websoft/OpenEA
- PRASEMap https://github.com/qizhyuan/PRASEMap
Open-Source Knowledge Graph Tools: Knowledge Representation Tools
- DGL-KE https://github.com/awslabs/dgl-ke
- OpenKE https://github.com/thunlp/OpenKE
- pykg2vec https://github.com/Sujit-O/pykg2vec
- GraphVite https://github.com/DeepGraphLearning/graphvite
- Pytorch-BigGraph https://github.com/facebookresearch/PyTorch-BigGraph
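
The toolkits above train embedding models over (head, relation, tail) triples. As an illustration of what such a model scores, the sketch below implements the TransE scoring function, one of the models these toolkits commonly offer, in plain Python. The two-dimensional vectors are hand-picked hypothetical values, not trained embeddings, and this is not the API of any listed library.

```python
import math
from typing import Dict, List


def transe_score(head: List[float], relation: List[float], tail: List[float]) -> float:
    """TransE plausibility: negative L2 distance between (head + relation) and tail.

    A score closer to zero means the triple fits the embedding space better.
    """
    squared = sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    return -math.sqrt(squared)


# Hypothetical hand-picked embeddings, for illustration only.
entity_vectors: Dict[str, List[float]] = {
    "Paris": [1.0, 0.0],
    "France": [1.0, 1.0],
    "Tokyo": [0.0, 3.0],
}
relation_vectors: Dict[str, List[float]] = {"capital_of": [0.0, 1.0]}

plausible = transe_score(entity_vectors["Paris"], relation_vectors["capital_of"], entity_vectors["France"])
implausible = transe_score(entity_vectors["Paris"], relation_vectors["capital_of"], entity_vectors["Tokyo"])
print(plausible > implausible)
```

Real toolkits learn these vectors by pushing scores of observed triples above scores of corrupted ones; only the scoring step is sketched here.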
Open-Source Knowledge Graph Tools: Graph Visualization Tools
- D3.js https://observablehq.com/@d3/gallery
- Vis.js https://visjs.github.io/vis-network/examples/
- Echarts https://echarts.apache.org
- AntV G6 https://www.yuque.com/antv/g6/intro
Open-Source Knowledge Graph Tools: Large-Scale Graph Search Tools
- Elasticsearch https://www.elastic.co/cn/
- FAISS https://github.com/facebookresearch/faiss
- SPTAG https://github.com/microsoft/SPTAG
- Vearch https://github.com/vearch/vearch
- Milvus https://milvus.io/
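
Several of the tools above (FAISS, SPTAG, Vearch, Milvus) are built for vector similarity search. The sketch below shows the brute-force baseline that such systems speed up with specialized indexes: score every stored vector against the query by cosine similarity and keep the top k. The vectors and the function names `cosine_similarity` and `top_k` are hypothetical and do not reflect any listed tool's API.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: List[float], index: Dict[str, List[float]], k: int = 2) -> List[Tuple[str, float]]:
    """Exhaustively score every stored vector and return the k most similar."""
    scored = [(name, cosine_similarity(query, vector)) for name, vector in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical entity embeddings, for illustration only.
index = {
    "apple": [1.0, 0.0],
    "pear": [0.9, 0.1],
    "car": [0.0, 1.0],
}
for name, score in top_k([1.0, 0.05], index, k=2):
    print(name, round(score, 3))
```

The exhaustive scan costs time linear in the number of stored vectors, which is why dedicated indexes become attractive at large scale.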