Natural Language Processing
zdcs
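Introduction to Cornell University's movie dialogue corpus: Cornell Movie-Dialogs Corpus
This public resource is cited by many open-source projects and papers related to natural language processing (NLP), so I read the README carefully and noted the key points. All files use " +++$+++ " as the field separator. - movie_titles_metadata.txt - contains information about each movie title - fields: - movieID, - movie title, Original · 2016-12-05 15:33:08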
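The separator convention described above can be sketched in Python as follows. This is a minimal illustration, not the corpus's official loader: the helper name `parse_line`, the key `movie_title`, and the sample line are made up for the example, and only the two fields named in the excerpt are used.

```python
# Field separator quoted in the corpus README excerpt.
SEP = " +++$+++ "


def parse_line(line, field_names):
    """Split one corpus line on the separator and pair values with names.

    zip() stops at the shorter sequence, so values beyond the supplied
    field names are silently dropped.
    """
    values = line.rstrip("\n").split(SEP)
    return dict(zip(field_names, values))


# Hypothetical line in the style of movie_titles_metadata.txt (not real data).
sample = "m0 +++$+++ example title"
record = parse_line(sample, ["movieID", "movie_title"])
print(record)
```

Passing the full field list from the README for each file would give one dictionary per line with every column named.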
Introduction to the TrecQA dataset
TrecQA: The TrecQA dataset is commonly used to evaluate answer selection for question answering (QA). It was published and organized by the following papers: + Wang et al. [What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA.](http://www.aclweb.org/anthology/D07-1003) *EMNLP-CoNLL 2007*. + He... Original · 2018-02-27 14:12:24
An insurance question-answering dataset
Address: https://github.com/shuzi/insuranceQA For research purposes only. If you use it, please cite the following paper: Applying Deep Learning to Answer Selection: A Study and An Open Task. Minwei Feng, Bing Xiang, Michael R. Glass, Lidan Wang, Bowen Zhou. ASRU 2... Original · 2018-02-27 14:32:00
Introduction to the SICK dataset
Official site: http://clic.cimec.unitn.it/composes/sick.html SICK is an acronym for Sentences Involving Compositional Knowledge. The SICK dataset contains ten thousand English sentence pairs drawn from two existing paraphrase datasets: one is the 8K ImageFlickr data set (http://nlp.cs.illinois.e... Original · 2018-02-18 21:51:18
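Introduction to the MSLR datasets
Microsoft released two fairly large learning-to-rank datasets. MSLR-WEB30K contains 30,000 queries; MSLR-WEB10K is formed by randomly sampling 10,000 of them. Description: queries and URLs are represented by IDs. The dataset contains feature vectors extracted from query-URL pairs together with relevance judgment labels. (1) Relevance judgments come from Microsoft Bing on a five-point scale, from 0 (irrelevant) to 4 (most relevant). (2) The features are... Original · 2018-02-18 22:41:25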
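To make the "relevance label plus feature vector per query-URL pair" description concrete, here is a small Python sketch. It assumes the LETOR-style text layout `label qid:<id> <index>:<value> ...`, which the excerpt above does not state; the function name `parse_letor_line` and the sample line are illustrative only and do not come from the dataset.

```python
def parse_letor_line(line):
    """Parse one 'label qid:<id> <index>:<value> ...' line.

    The layout is an assumption for illustration; check the dataset's
    own documentation before relying on it.
    """
    tokens = line.split()
    label = int(tokens[0])                    # relevance grade, 0 to 4
    query_id = tokens[1].split(":", 1)[1]     # text after "qid:"
    features = {}
    for token in tokens[2:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)   # feature index -> value
    return label, query_id, features


# Made-up line with two features (not taken from the dataset).
label, query_id, features = parse_letor_line("2 qid:10 1:0.5 2:3")
print(label, query_id, features)
```

Keeping the query ID next to each row matters for learning to rank, because ranking losses and metrics are computed per query rather than per row.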
Introduction to the Microsoft WikiQA corpus
It is simple enough that there is nothing much to translate. The WikiQA corpus is a new publicly available set of question and sentence pairs, collected and annotated for research on open-domain question answering. Last published: August 28, 2015.... Repost · 2018-02-18 23:06:21
Microsoft's MSR paraphrase dataset
5,800 sentence pairs, manually annotated. For the source of the corpus and the annotation method, see the README. A sample follows; it is very simple and clear. Download: https://www.microsoft.com/en-us/download/details.aspx?id=52398 Quality #1 ID #2 ID #1 String #2 String 1 702876 702977 Amrozi accused his brother, whom ... Original · 2018-02-19 00:25:59
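A short Python sketch of reading the five columns shown in the sample header. It assumes the file is tab-separated with that header row, which the flattened sample suggests but does not confirm; `read_pairs` is a hypothetical helper, and the sentences and IDs below are invented rather than taken from the corpus.

```python
import csv
import io


def read_pairs(handle):
    """Return (sentence1, sentence2, label) tuples from a tab-separated file.

    QUOTE_NONE keeps quotation marks inside sentences as literal text
    instead of treating them as CSV quoting.
    """
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        (row["#1 String"], row["#2 String"], int(row["Quality"]))
        for row in reader
    ]


# In-memory stand-in for the real file; the data row is made up.
sample = io.StringIO(
    "Quality\t#1 ID\t#2 ID\t#1 String\t#2 String\n"
    "1\t1\t2\tA cat sat on the mat.\tA cat was sitting on the mat.\n"
)
pairs = read_pairs(sample)
print(pairs)
```

If the real file begins with a byte-order mark, opening it with `encoding="utf-8-sig"` keeps the first header name clean.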
SST dataset
References: http://blog.csdn.net/ltochange/article/details/61194650 http://blog.csdn.net/yeyang911/article/details/54378716 Repost · 2018-02-19 00:29:24
Stanford Natural Language Inference (SNLI) and Multi-Genre NLI Corpus (MultiNLI) datasets
https://nlp.stanford.edu/projects/snli/ https://www.nyu.edu/projects/bowman/multinli/ MultiNLI is an upgraded version of SNLI: the format is the same and the size is comparable, but the former varies more... Original · 2018-02-19 10:45:01
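Notes on memory network papers (incomplete)
Hybrid computing using a neural network with dynamic external memory. The DNC architecture differs from the recently proposed neural memory frameworks Memory Networks and Pointer Networks in that DNC memory can be selectively written to and read from, which allows the memory contents to be modified iteratively. If the memory is thought of as the DNC's RAM, then the network, called the controller, is a differentiable CP... Original · 2018-02-27 10:37:29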
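AG and news topic classification datasets
AG is a collection of more than 1 million news articles that ComeToMyHead gathered from more than 2,000 different news sources over more than a year of effort. ComeToMyHead is an academic news search engine that started in July 2004. http://www.di.unipi.it/~gulli/AG_corpus_of_news_articles.html The dataset is provided by the academic community for research on classification, clustering, information retrieval (ranking, search) and other non-commercial activities. Two format versions... Original · 2018-02-27 10:18:38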
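Introduction to the Huawei Noah's Ark Lab Chinese conversation corpus
One of the few Chinese conversation corpora. I am recording the format details and posting samples for quick reference; judging from the samples, the text has clearly already been word-segmented. The following content comes mainly from the Readme for conversation_data_v1.1. The dataset has 5 files. 1. post.index contains post_id with its contents. First comes p Original · 2016-12-05 16:19:23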
Paper notes: Compact Bilinear Pooling
Compact Bilinear Pooling. Yang Gao1, Oscar Beijbom1, Ning Zhang2∗, Trevor Darrell1† 1EECS, UC Berkeley 2Snapchat Inc. {yg, obeijbom, trevor}@eecs.berkeley.edu {ning.zhang}@snapchat.com arXiv:1511.06 Translation · 2017-02-06 11:43:07
Paper notes: Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answeri
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering. Huijuan Xu and Kate Saenko, Department of Computer Science, UMass Lowell, USA hxu1@cs.uml.edu, saenko Original · 2017-02-05 18:55:26
Paper notes: Aligning where to see and what to tell: image caption with region-based attention ...
Aligning where to see and what to tell: image caption with region-based attention and scene factorization. arXiv:1506.06272v1 [cs.CV] 20 Jun 2015 Abstract: This paper proposes an image captioning system that exploits the parallel structure between images and sentences. In this model, Original · 2017-02-05 18:13:56
Paper notes: HADAMARD PRODUCT FOR LOW-RANK BILINEAR POOLING
HADAMARD PRODUCT FOR LOW-RANK BILINEAR POOLING. Jin-Hwa Kim, Interdisciplinary Program in Cognitive Science, Seoul National University, Seoul, 08826, Republic of Korea jhkim@bi.snu.ac.kr Kyoung-Woon On Sc Original · 2017-02-06 12:15:48
Paper notes: Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding. Akira Fukui*1,2 Dong Huk Park*1 Daylen Yang*1 Anna Rohrbach*1,3 Trevor Darrell1 Marcus Rohrbach1 1UC Berkeley EECS, CA, Original · 2017-02-06 12:25:18
Paper notes: Review Networks for Caption Generation
Review Networks for Caption Generation. Zhilin Yang, Ye Yuan, Yuexin Wu, Ruslan Salakhutdinov, William W. Cohen, School of Computer Science, Carnegie Mellon University {zhiliny,yey1,yuexinw,rsalakhu,wcohen}@c Original · 2017-02-06 14:13:10
Paper notes: Hierarchical Question-Image Co-Attention for Visual Question Answering
Hierarchical Question-Image Co-Attention for Visual Question Answering. Jiasen Lu∗, Jianwei Yang∗, Dhruv Batra∗†, Devi Parikh∗† ∗Virginia Tech, †Georgia Institute of Technology {jiasenlu, jw2yang, dbatra, pa Original · 2017-02-09 10:40:53
First Quora Dataset Release: Question Pairs
I really like datasets with a simple, clear format like this one: id qid1 qid2 question1 question2 is_duplicate 0 1 2 What is the step by step guide to invest in share market in india? What is the step by step guide to invest in share market? 0 1 3 4 ... Original · 2018-02-22 00:51:06
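The six columns listed above map naturally onto a small record type. The Python sketch below assumes the file is tab-separated with a header row, which the flattened sample does not show explicitly; the names `QuestionPair` and `read_question_pairs` are made up for the example, and the single data row is the one quoted in the excerpt.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class QuestionPair:
    pair_id: int
    qid1: int
    qid2: int
    question1: str
    question2: str
    is_duplicate: bool


def read_question_pairs(handle):
    """Read tab-separated rows with the six columns named in the header."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        QuestionPair(
            pair_id=int(row["id"]),
            qid1=int(row["qid1"]),
            qid2=int(row["qid2"]),
            question1=row["question1"],
            question2=row["question2"],
            is_duplicate=row["is_duplicate"] == "1",
        )
        for row in reader
    ]


# In-memory stand-in for the real file, using the row quoted in the excerpt.
sample = io.StringIO(
    "id\tqid1\tqid2\tquestion1\tquestion2\tis_duplicate\n"
    "0\t1\t2\t"
    "What is the step by step guide to invest in share market in india?\t"
    "What is the step by step guide to invest in share market?\t0\n"
)
question_pairs = read_question_pairs(sample)
print(question_pairs[0])
```

Converting `is_duplicate` to a boolean up front keeps later filtering (for example, selecting only duplicate pairs) free of string comparisons.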