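Original Publications: Research on Applying Deep Learning to Question Answering and Dialogue Systems, 2018 to 2019
Author's note: Below are the papers I published in 2018-2019 on applying deep learning to question answering and dialogue systems; interested readers are welcome to look them up. If you have questions about these papers, feel free to email me (yangliuyx@gmail.com) to discuss them, and I will do my best to reply. I had planned to write a detailed introduction to each paper, but CSDN's recent blog editor is hard to use, so for now I am simply listing the papers from the past two years here. When I have time I will update this post with more detailed write-ups....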
2019-06-08 01:32:19 2940 2
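Original Probabilistic Language Models and Their Variants (5): A Java Implementation of LDA Gibbs Sampling
This series of posts introduces common probabilistic language models and their variants, mainly summarizing PLSA, LDA, LDA variants, and parameter inference methods. The tentative plan is as follows. Part 1: PLSA and the EM algorithm. Part 2: LDA and Gibbs Sampling. Part 3: LDA variants (Twitter LDA, TimeUserLDA, ATM, Labeled-LDA, MaxEnt-LDA, etc.). Part 4: a categorized summary of papers based on LDA variants (bibliography). Part 5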
2013-01-28 17:41:09 44339 104
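Original Probabilistic Language Models and Their Variants (1): PLSA and the EM Algorithm
This series of posts introduces common probabilistic language models and their variants, mainly summarizing PLSA, LDA, LDA variants, and parameter inference methods. The tentative plan is as follows. Part 1: PLSA and the EM algorithm. Part 2: LDA and Gibbs Sampling. Part 3: LDA variants (Twitter LDA, TimeUserLDA, ATM, Labeled-LDA, MaxEnt-LDA, etc.). Part 4: a categorized summary of papers based on LDA variants. Part 5: LDA Gibbs Sa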
2012-12-20 23:31:33 76442 40
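Original Probabilistic Language Models and Their Variants (2): LDA and Gibbs Sampling
This series of posts introduces common probabilistic language models and their variants, mainly summarizing PLSA, LDA, LDA variants, and parameter inference methods. The tentative plan is as follows. Part 1: PLSA and the EM algorithm. Part 2: LDA and Gibbs Sampling. Part 3: LDA variants (Twitter LDA, TimeUserLDA, ATM, Labeled-LDA, MaxEnt-LDA, etc.). Part 4: a categorized summary of papers based on LDA variants. Part 5: LDA Gibbs Sa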
2012-12-17 13:08:30 69582 61
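Original Parameter Estimation for Text Language Models: Maximum Likelihood Estimation, MAP, and Bayesian Estimation
Text language models, represented by PLSA and LDA, are a hot topic in current statistical natural language processing research. Such models typically propose a probabilistic graphical model of how text is generated and then estimate the model parameters from the observed corpus. With a language model and its parameters in hand, many important applications become possible, such as dimensionality reduction of text features and topic analysis of text. This post introduces three classes of parameter estimation methods for text analysis: maximum likelihood estimation (MLE), maximum a posteriori estimation (MAP), and Bayesian estimation. 1. Maximum likelihood estimation (MLE). First, let us review Bayes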
2012-12-15 11:15:36 42024 19
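The excerpt above is cut off before any formulas, so the following is only a minimal sketch of how the three estimators differ, using a coin-flip (Bernoulli) model with a Beta prior rather than a text model. The function name `bernoulli_estimates` and the prior parameters `alpha` and `beta` are assumptions made for illustration and are not taken from the post. The posterior mean is used here as one common point summary of the Bayesian posterior.

```python
from fractions import Fraction


def bernoulli_estimates(heads, flips, alpha, beta):
    """Return (MLE, MAP, posterior mean) for a coin's heads probability.

    Assumes a Beta(alpha, beta) prior with alpha > 1 and beta > 1, so that
    the posterior mode used for MAP lies strictly inside (0, 1).
    """
    # MLE: the relative frequency that maximizes the likelihood.
    mle = Fraction(heads, flips)
    # MAP: mode of the Beta(heads + alpha, flips - heads + beta) posterior.
    map_est = Fraction(heads + alpha - 1, flips + alpha + beta - 2)
    # Bayesian point summary: mean of the same posterior.
    posterior_mean = Fraction(heads + alpha, flips + alpha + beta)
    return mle, map_est, posterior_mean


mle, map_est, post_mean = bernoulli_estimates(heads=7, flips=10, alpha=2, beta=2)
print(mle, map_est, post_mean)
```

With a symmetric prior, both MAP and the posterior mean are pulled from the raw frequency toward 1/2, which is the smoothing effect the prior contributes.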
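Repost An Open-Source Toolkit for Deep Text Matching (MatchZoo)
Blogger's note: Struggling to implement the many deep learning baselines? Frustrated by the lack of good open-source deep learning tools for NLP, IR, and QA tasks? Missing a platform for releasing and discussing your own deep text matching models? MatchZoo is highly recommended for anyone using deep learning for natural language processing, information retrieval, intelligent question answering, and similar tasks. MatchZoo provides benchmark datasets (the TREC MQ series, WikiQA, etc.) for development and testing, and integrates the currently most popular deep text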
2017-12-10 23:32:04 16411 5
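Original SIGIR 2017 Paper: Characterizing and Predicting Enterprise Email Reply Behavior
Summary: This paper models and analyzes user behavior in enterprise email systems. It first analyzes several classes of factors that influence users' email reply behavior, and then, based on that analysis, builds machine learning models that predict whether a user will reply and how long the reply will take. Experiments on the Avocado email dataset show that the proposed features and models predict email reply behavior substantially more accurately than previous baseline methods. Venue: SIGIR'17. Abstract: Email is still among the most popula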
2017-08-28 23:17:46 2284
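Original Gradient Tree Boosting (GBM, GBRT, GBDT, MART): Algorithm Analysis and Implementation with XGBoost/Scikit-learn
1. Overview. Gradient Tree Boosting (also known as GBM, GBRT, GBDT, or MART) is a widely used family of ensemble learning algorithms. It has repeatedly achieved the best performance on classification and regression tasks in many data mining competitions, including KDD Cup and those hosted by Kaggle. The LambdaMART algorithm that won the 2010 Yahoo Learning to Rank Challenge also belongs to this family. As a result, tree boosting algorithms, like deep learning models such as DNNs, CNNs, and RNNs, are very widely used in both industry and academia.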
2017-03-16 12:57:49 17395 3
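Original CIKM 2016 aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model
Summary: Current deep learning models, including those based on CNNs or LSTMs, have to be combined with traditional text matching features when applied to the answer sentence selection task. To address this problem, the paper proposes an attention-based neural matching model. The model proposes using a value-shared weighting scheme as well as attention-based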
2016-10-30 05:12:17 4032
Original ICTIR 2016 Analysis of the Paragraph Vector Model for Information Retrieval
Summary: This paper extends the earlier SIGIR'16 work. It analyzes in greater depth three problems that arise when the PV model is applied to IR tasks and proposes a corresponding improvement for each. Venue: ICTIR'16. Abstract: Previous studies have shown that semantically meaningful representations of words and text can
2016-10-30 04:59:41 2437
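Original SIGIR 2016 Improving Language Estimation with the Paragraph Vector Model for Ad-hoc Retrieval
Summary: This paper analyzes how to improve the ad-hoc retrieval task using the Paragraph Vector model, and proposes three improvements to the PV model tailored to the IR setting. Experiments show that retrieval with the improved model outperforms a language model (LM) enhanced with a topic model. Venue: SIGIR'16. Abstract: Incorporating topic level estimation into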
2016-10-30 04:44:32 1595
Original ICDM 2014 Paper ShellMiner: Mining Organizational Phrases in Argumentative Texts in Social Media
Summary: This paper proposes a probabilistic generative model, the Shell Topic Model (STM), to model organizational phrases and topical content in social forum text. The main applications are mining organizational phrases and document modeling. Venue: ICDM'14. Abstract: Threaded debate forums have become one of the major so
2016-07-03 07:18:49 5678
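Original ECIR 2016 Paper: Modelling User Interest for Zero-query Ranking
Summary: This paper studies the ranking of information cards in intelligent personal assistants (such as Google Now and Microsoft Cortana). From a user modeling perspective, it proposes three groups of ranking features: implicit feedback features, entity-based user interest features, and user demographic features. Among these, the entity features are extracted using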
2016-07-03 07:05:48 5164
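Original ECIR 2016 Paper: Beyond Factoid QA: Effective Methods for Non-factoid Answer Sentence Retrieval
Summary: This paper studies answer sentence retrieval for non-factoid questions. Within a learning-to-rank framework, it proposes features based on semantic matching and context information on top of traditional text matching features, and demonstrates that these features are effective for answer sentence retrieval. The paper uses the TREC GOV2 dataset and open-sources the code and the annotated dataset; see the footnotes of the paper for download links. Venue: ECIR'16. Abstract: Retrieving finer graine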
2016-07-03 06:56:46 5165
Original CIKM 2013 Paper: Modeling interaction features for debate side clustering
Summary: This paper studies how to statistically model and analyze user interactions in online forum discussions. Venue: CIKM'13. Abstract: Online discussion forums are popular social media platforms for users to express their opinions and discuss controversial issues with each othe
2015-12-23 23:19:03 4059
Original NAACL 2013 Paper: Mining User Relations from Online Discussions using Sentiment Analysis and PMF
Summary: This paper studies in depth how to mine user relations from online forum discussions using sentiment analysis and probabilistic matrix factorization. Venue: NAACL'13. Abstract: Advances in sentiment analysis have enabled extraction of user relations implied in online textual exchanges such as forum posts. Howev
2015-12-23 23:12:47 3102
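Original COLING 2014 Paper: Generating Supplementary Travel Guides from Social Media
Summary: Want to know how to generate travel guides from posts in the Yahoo question answering community? This paper presents the relevant statistical models and techniques. Venue: COLING'14. Abstract: In this paper we study how to summarize travel-related information in forum threads to generate supplementary travel guides. Such summarie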
2015-12-23 22:55:13 3133
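Original CIKM 2013 Paper CQARank: Jointly Model Topics and Expertise in Community Question Answering
Summary: This paper studies how to model and analyze users' topical interests and expertise in question answering communities, and proposes a statistical graphical model for this problem, the Topics Expertise Model. Venue: CIKM'13. Abstract: Community Question Answering (CQA) websites, where people share expertise on open platforms, have become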
2015-12-23 22:48:04 4113
Original LeetCode Unique Binary Search Trees
Given n, how many structurally unique BST's (binary search trees) that store values 1...n? For example, given n = 3, there are a total of 5 unique BST's. 1 3 3 2 1 \ /
2015-08-31 12:42:57 2849
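The excerpt above shows only the problem statement, so the following is a minimal sketch of one standard way to count the trees, not necessarily the approach taken in the post. It assumes a dynamic-programming formulation in which each value is tried as the root and the counts for the left and right subtrees are multiplied; `num_trees` is a hypothetical name.

```python
def num_trees(n):
    """Count structurally unique BSTs that store the values 1..n."""
    # counts[k] holds the number of unique BSTs with k nodes.
    counts = [0] * (n + 1)
    counts[0] = 1  # the empty tree
    for size in range(1, n + 1):
        for root in range(1, size + 1):
            # root - 1 values go to the left subtree, size - root to the right.
            counts[size] += counts[root - 1] * counts[size - root]
    return counts[n]


print(num_trees(3))  # 5, matching the example in the problem statement
```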
Original LeetCode Implement Stack using Queues
Implement the following operations of a stack using queues. push(x) -- Push element x onto stack. pop() -- Removes the element on top of the stack. top() -- Get the top element. empty() -- Return whether
2015-07-27 16:38:42 3052
Original LeetCode Find Minimum in Rotated Sorted Array II
Follow up for "Find Minimum in Rotated Sorted Array": What if duplicates are allowed? Would this affect the run-time complexity? How and why? Suppose a sorted array is rotated at some pivot unknown to yo
2015-07-27 16:25:57 2794
Original LeetCode Find Minimum in Rotated Sorted Array
Suppose a sorted array is rotated at some pivot unknown to you beforehand (i.e., 0 1 2 4 5 6 7 might become 4 5 6 7 0 1 2). Find the minimum element. You may assume no duplicate exists in the array. Approach
2015-07-27 16:21:30 2620
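The post's own analysis is cut off in the excerpt above, so this is only a sketch of one common approach: a binary search that compares the middle element with the rightmost one. The name `find_min` is hypothetical, and the sketch assumes a non-empty array without duplicates, as the problem states.

```python
def find_min(nums):
    """Return the minimum of a rotated sorted array without duplicates."""
    lo, hi = 0, len(nums) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if nums[mid] > nums[hi]:
            # The rotation point, and so the minimum, is right of mid.
            lo = mid + 1
        else:
            # nums[mid..hi] is sorted, so the minimum is at mid or left of it.
            hi = mid
    return nums[lo]


print(find_min([4, 5, 6, 7, 0, 1, 2]))  # 0
```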
Original LeetCode Maximal Square
Given a 2D binary matrix filled with 0's and 1's, find the largest square containing all 1's and return its area. For example, given the following matrix:
1 0 1 0 0
1 0 1 1 1
1 1 1 1 1
1 0 0 1 0
Return
2015-07-27 16:09:17 3158
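The excerpt above stops before any solution, so the following is a hedged sketch of a common dynamic-programming approach rather than the post's own code. It tracks, for each cell, the side length of the largest all-ones square ending at that cell. The name `maximal_square` is hypothetical, and the sketch assumes the matrix holds the integers 0 and 1.

```python
def maximal_square(matrix):
    """Return the area of the largest square made up entirely of 1s."""
    if not matrix or not matrix[0]:
        return 0
    cols = len(matrix[0])
    prev = [0] * (cols + 1)  # side lengths for the previous row, 1-based
    best = 0
    for row in matrix:
        cur = [0] * (cols + 1)
        for j, cell in enumerate(row, start=1):
            if cell == 1:
                # A square ending here extends the smallest of its three
                # neighbours: above, above-left, and left.
                cur[j] = 1 + min(prev[j], prev[j - 1], cur[j - 1])
                best = max(best, cur[j])
        prev = cur
    return best * best


grid = [
    [1, 0, 1, 0, 0],
    [1, 0, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0],
]
print(maximal_square(grid))  # 4
```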
Original LeetCode Implement Queue using Stacks
Implement the following operations of a queue using stacks. push(x) -- Push element x to the back of queue. pop() -- Removes the element from in front of queue. peek() -- Get the front element. empty() --
2015-07-21 14:48:29 3297
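The four operations listed above can be sketched with two stacks, where Python lists stand in for the stacks. This is one common construction and is assumed here rather than taken from the post. The class name `MyQueue` and the helper `_shift` are hypothetical.

```python
class MyQueue:
    """FIFO queue built from two LIFO stacks (Python lists used as stacks)."""

    def __init__(self):
        self._inbox = []   # receives pushed elements
        self._outbox = []  # serves elements in reversed (FIFO) order

    def push(self, x):
        self._inbox.append(x)

    def _shift(self):
        # Refill the outbox only when it is empty, so each element moves
        # at most once and operations are amortized O(1).
        if not self._outbox:
            while self._inbox:
                self._outbox.append(self._inbox.pop())

    def pop(self):
        self._shift()
        return self._outbox.pop()

    def peek(self):
        self._shift()
        return self._outbox[-1]

    def empty(self):
        return not self._inbox and not self._outbox
```

Calling `pop` or `peek` on an empty queue raises `IndexError` in this sketch; the problem statement visible above does not say how that case should be handled.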
Original LeetCode Majority Element II
Given an integer array of size n, find all elements that appear more than ⌊ n/3 ⌋ times. The algorithm should run in linear time and in O(1) space. Hint: How many majority elements could it possibly hav
2015-07-20 15:10:04 3788 1
Original LeetCode Majority Element
Given an array of size n, find the majority element. The majority element is the element that appears more than ⌊ n/2 ⌋ times. You may assume that the array is non-empty and the majority element always
2015-07-20 14:58:41 3089
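The excerpt above does not show which solution the post uses, so this is a sketch of one well-known option, the Boyer-Moore majority vote, which runs in linear time and constant space. It assumes the input is non-empty and that a majority element exists; `majority_element` is a hypothetical name.

```python
def majority_element(nums):
    """Return the element that appears more than len(nums) // 2 times.

    Assumes nums is non-empty and that such an element exists.
    """
    candidate = None
    count = 0
    for value in nums:
        if count == 0:
            # No surviving candidate: adopt the current value.
            candidate = value
        # Matching values vote for the candidate, others cancel a vote.
        count += 1 if value == candidate else -1
    return candidate


print(majority_element([2, 2, 1, 1, 1, 2, 2]))  # 2
```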
Original LeetCode Kth Smallest Element in a BST
Given a binary search tree, write a function kthSmallest to find the kth smallest element in it. Note: You may assume k is always valid, 1 ≤ k ≤ BST's total elements. Follow up: What if the BST is modifi
2015-07-19 15:11:29 3503
Original LeetCode Product of Array Except Self
Given an array of n integers where n > 1, nums, return an array output such that output[i] is equal to the product of all the elements of nums except nums[i]. Solve it without division and in O(n). For
2015-07-19 14:47:06 4004
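The problem above rules out division and asks for O(n) time. A minimal sketch of one common way to meet both constraints uses a prefix-product pass followed by a suffix-product pass. The name `product_except_self` is hypothetical and the approach is an assumption, since the post's own solution is not visible in the excerpt.

```python
def product_except_self(nums):
    """Return output where output[i] is the product of all nums except nums[i]."""
    n = len(nums)
    output = [1] * n
    # First pass: output[i] becomes the product of everything left of i.
    prefix = 1
    for i in range(n):
        output[i] = prefix
        prefix *= nums[i]
    # Second pass: multiply in the product of everything right of i.
    suffix = 1
    for i in range(n - 1, -1, -1):
        output[i] *= suffix
        suffix *= nums[i]
    return output


print(product_except_self([1, 2, 3, 4]))  # [24, 12, 8, 6]
```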
Original LeetCode Reverse Bits
Reverse bits of a given 32 bits unsigned integer. For example, given input 43261596 (represented in binary as 00000010100101000001111010011100), return 964176192 (represented in binary as 00111001011110
2015-06-22 14:51:15 3582 1
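A minimal sketch of the bit reversal described above, assuming the input fits in 32 bits. It shifts the lowest bit of the input into the result 32 times. The name `reverse_bits` is hypothetical, and this is one straightforward approach rather than necessarily the one in the post.

```python
def reverse_bits(n):
    """Reverse the bits of a 32-bit unsigned integer."""
    result = 0
    for _ in range(32):
        # Append the lowest bit of n to the right end of result.
        result = (result << 1) | (n & 1)
        n >>= 1
    return result


print(reverse_bits(43261596))  # 964176192, as in the problem statement
```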
Original LeetCode Contains Duplicate III
Given an array of integers, find out whether there are two distinct indices i and j in the array such that the difference between nums[i] and nums[j] is at most t and the difference between i and j is
2015-06-15 14:02:42 3801
Original LeetCode Missing Ranges [LeetCode Book Problem]
Given a sorted integer array where the range of elements is [lower, upper] inclusive, return its missing ranges. For example, given [0, 1, 3, 50, 75], lower = 0 and upper = 99, return ["2", "4->49", "
2015-06-15 12:59:49 3215
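The expected output in the excerpt above is cut off, so this sketch follows only the visible part of the statement: single missing values are written as `"2"` and longer gaps as `"4->49"`. It assumes distinct sorted values within [lower, upper] and walks the array with a sentinel placed just past `upper`. The name `find_missing_ranges` is hypothetical.

```python
def find_missing_ranges(nums, lower, upper):
    """List the gaps in [lower, upper] not covered by the sorted array nums."""
    ranges = []
    prev = lower - 1  # value just before the range, acts as a left sentinel
    for cur in nums + [upper + 1]:  # upper + 1 acts as a right sentinel
        if cur - prev >= 2:
            start, end = prev + 1, cur - 1
            ranges.append(str(start) if start == end else f"{start}->{end}")
        prev = cur
    return ranges


print(find_missing_ranges([0, 1, 3, 50, 75], 0, 99))
```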
Original LeetCode Number of 1 Bits
Write a function that takes an unsigned integer and returns the number of '1' bits it has (also known as the Hamming weight). For example, the 32-bit integer '11' has binary representation 000000000000
2015-06-15 12:45:05 2283
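A minimal sketch of the Hamming weight computation described above. It uses the `n & (n - 1)` trick, which clears the lowest set bit on each iteration. This is one common technique and is assumed here rather than taken from the post; `hamming_weight` is a hypothetical name.

```python
def hamming_weight(n):
    """Count the 1 bits in a non-negative integer."""
    count = 0
    while n:
        # n & (n - 1) clears the lowest set bit, so the loop runs once
        # per 1 bit rather than once per bit position.
        n &= n - 1
        count += 1
    return count


print(hamming_weight(11))  # 3, since 11 is 1011 in binary
```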
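Original LeetCode Invert Binary Tree
This problem has been popular lately because of a widely shared tweet: "Google HR: 90% of our engineers use the software you wrote, but you can't invert a binary tree on a whiteboard, so get lost." Opinions differ, but the problem itself is indeed very simple. It tests the most basic recursion and tree operations. The post gives both a recursive implementation and an iterative (non-recursive) implementation that uses a stack. The latter scales better and can handle larger trees. This really is an easy problem: traverse the tree nodes once with DFS and swap the left and right children of each node. Perhaps that expert iOS developer did not bother to prepare before interviewing at Google and got caught out; it was arguably less a question of ability than of attitude.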
2015-06-15 10:46:20 3744 1
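The two variants described above, a recursive one and a stack-based iterative one, can be sketched as follows. This is a reconstruction from the description, not the post's original code. The names `TreeNode`, `invert_recursive`, `invert_iterative`, and the `preorder` helper are hypothetical.

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def invert_recursive(root):
    """Swap the children of every node using recursion."""
    if root is None:
        return None
    root.left, root.right = invert_recursive(root.right), invert_recursive(root.left)
    return root


def invert_iterative(root):
    """Swap the children of every node using an explicit stack (DFS)."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        node.left, node.right = node.right, node.left
        stack.append(node.left)
        stack.append(node.right)
    return root


def preorder(node):
    """Helper for inspection: node values in root-left-right order."""
    if node is None:
        return []
    return [node.val] + preorder(node.left) + preorder(node.right)
```

The iterative version keeps its pending nodes on the heap instead of the call stack, so it does not hit Python's recursion limit on very deep trees.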
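Original LeetCode Implement Trie (Prefix Tree)
Implement a trie with insert, search, and startsWith methods. Note: You may assume that all inputs consist of lowercase letters a-z. Approach: This problem mainly tests the implementation of a trie, i.e., a prefix tree. A trie can be used for compressed storage of a dictionary; it saves space but does not save time (compared with a HashSet)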
2015-05-31 12:30:46 6022 1
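A minimal sketch of the three trie operations named above. It stores children in a dictionary per node, which is a design choice assumed here; the post may instead use a fixed 26-slot array for a-z. The names `TrieNode`, `Trie`, `starts_with`, and `_walk` are hypothetical, with `starts_with` standing in for the problem's `startsWith`.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # maps a character to the child TrieNode
        self.is_word = False  # True if a stored word ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            # Reuse the existing branch for shared prefixes.
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def _walk(self, prefix):
        """Return the node reached by following prefix, or None."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def search(self, word):
        node = self._walk(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):
        return self._walk(prefix) is not None
```

Words that share a prefix share nodes, which is where the space saving mentioned in the excerpt comes from.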
Original LeetCode Contains Duplicate
Given an array of integers, find if the array contains any duplicates. Your function should return true if any value appears at least twice in the array, and it should return false if every element is
2015-05-31 12:00:03 5857
Original LeetCode Contains Duplicate II
Given an array of integers and an integer k, find out whether there are two distinct indices i and j in the array such that nums[i] = nums[j] and the difference between i and j is at most k. Approach:
2015-05-31 11:54:39 7830
Original LeetCode Happy Number
Write an algorithm to determine if a number is "happy". A happy number is a number defined by the following process: Starting with any positive integer, replace the number by the sum of the squares of
2015-05-31 11:36:55 2214
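The definition above is cut off in the excerpt. This sketch assumes the usual completion of the process: the number is replaced by the sum of the squares of its digits, and it is happy if the process reaches 1 rather than looping forever. It detects loops with a set of already seen values; `is_happy` is a hypothetical name.

```python
def is_happy(n):
    """Return True if repeatedly summing squared digits reaches 1."""
    seen = set()
    while n != 1 and n not in seen:
        # Remember each value so a repeat signals an endless cycle.
        seen.add(n)
        n = sum(int(digit) ** 2 for digit in str(n))
    return n == 1


print(is_happy(19))  # True: 19 -> 82 -> 68 -> 100 -> 1
```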
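Original LeetCode Count Primes
Description: Count the number of prime numbers less than a non-negative number, n. Approach: This is a math problem: find all primes less than n. A straightforward idea is to define an isPrime test function and check each number less than n one by one, but that takes O(N^2) time when each test tries every smaller divisor. Is there a faster algorithm? Yes, there is a classic prime-finding algorithm, the Sieve of Eratosthe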
2015-05-25 14:57:09 2672
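A minimal sketch of the sieve that the excerpt names before it is cut off. It marks the multiples of each prime as composite and counts what remains. The name `count_primes` is hypothetical, and details such as starting the marking at `i * i` are standard optimizations assumed here rather than taken from the post.

```python
def count_primes(n):
    """Count the primes strictly less than n with the Sieve of Eratosthenes."""
    if n < 3:
        return 0  # there are no primes below 2
    is_prime = [True] * n
    is_prime[0] = is_prime[1] = False
    i = 2
    while i * i < n:
        if is_prime[i]:
            # Smaller multiples of i were already marked by smaller primes,
            # so marking can start at i * i.
            for multiple in range(i * i, n, i):
                is_prime[multiple] = False
        i += 1
    return sum(is_prime)


print(count_primes(10))  # 4: the primes 2, 3, 5, 7
```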
Original LeetCode Binary Tree Right Side View
Given a binary tree, imagine yourself standing on the right side of it, return the values of the nodes you can see ordered from top to bottom. For example: Given the following binary tree, 1
2015-05-25 14:37:25 2245
Original LeetCode Copy List with Random Pointer
A linked list is given such that each node contains an additional random pointer which could point to any node in the list or null. Return a deep copy of the list. Approach: This problem asks for a copy of the linked list, including its contents, next pointers, and random pointers. It is easy to think
2015-04-20 14:08:09 1398
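Sample program calling the LibSVM Java API
2012-12-16
Source code for C++ and WEKA implementations of a machine-learning-based SNS privacy wizard classifier
2012-06-03
Design and implementation of a machine-learning-based recommendation wizard for SNS privacy protection policies
2012-06-03
Source code for frequent pattern mining based on the Apriori, FP-Growth, and Eclat algorithms
2012-04-24
Source code for frequent pattern mining based on the Apriori, FP-Growth, and Eclat algorithms (shared edition)
2012-04-24
newsgroup18828 text clusterer based on the K-means, MBSAS, and DBSCAN algorithms
2012-04-17
newsgroup text classifier based on the Bayes and KNN algorithms (free download edition, no points required)
2012-03-31
newsgroup text classifier based on the Bayes and KNN algorithms
2012-03-27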