RBM: An approach for text summarization using a deep learning algorithm

Padmapriya G, Duraiswamy K. An approach for text summarization using deep learning algorithm [J]. Journal of Computer Science, 2014, 10(1): 1-9.

Abstract

The Restricted Boltzmann Machine (RBM) is widely used.
Experiments were conducted on documents from three different knowledge domains.
The approach is based on the RBM.

Introduction

  • Developed a multi-document summarization system using the deep learning algorithm Restricted Boltzmann Machine (RBM).
  • Solves the ranking problem by computing the intersection between
    the user query and each sentence.
  • Sentences are selected on the basis of a compression rate entered by the user.

Motivation

With the information explosion, it is necessary to find the information we need within a huge volume of text; summarization is an important way to obtain information quickly.

Model

- Restricted Boltzmann Machine
Restricted Boltzmann Machine is a stochastic neural network (that is a network of neurons where each neuron has some random behavior when activated).
It is a stochastic network and a bipartite graph: information flows in both directions, during training and during use of the network, and the weights in the two directions are identical.
[Figure omitted: RBM structure with visible and hidden layers]
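The bipartite, bidirectional structure described above can be sketched with a minimal NumPy RBM. The layer sizes and sigmoid activations are assumptions for illustration; the paper does not fix them:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_visible, n_hidden = 4, 8                      # assumed sizes: 4 sentence features, 8 hidden units
W = rng.normal(0, 0.1, (n_visible, n_hidden))   # one weight matrix, shared by both directions
b_v = np.zeros(n_visible)                       # visible-layer bias
b_h = np.zeros(n_hidden)                        # hidden-layer bias

def hidden_probs(v):
    # upward pass: P(h = 1 | v)
    return sigmoid(v @ W + b_h)

def visible_probs(h):
    # downward pass reuses the same W (transposed): P(v = 1 | h)
    return sigmoid(h @ W.T + b_v)

v0 = rng.random(n_visible)                      # a sentence feature vector in [0, 1]
h0 = hidden_probs(v0)                           # deterministic hidden probabilities
h_sample = (h0 > rng.random(n_hidden)).astype(float)  # stochastic hidden activation
v1 = visible_probs(h_sample)                    # reconstruction of the visible layer
```

The shared `W` in both `hidden_probs` and `visible_probs` is exactly the "same weights in both directions" property of the bipartite graph.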

Term weight

See: A Survey of Document Summarization.

Concept feature

MI(wi, wj) = log2( P(wi, wj) / (P(wi) · P(wj)) )

where:
P(wi, wj) = joint probability that both keywords appear together in a text window
P(wi) = probability that a keyword wi appears in a text window, computed as:

P(wi) = s_wi / |sw|

where:
s_wi = the number of windows containing the keyword wi
|sw| = total number of windows constructed from the text document
The sentence matrix generated by the above steps is:

S = (s1, s2, …, sn)

where si = (f1, f2, …, f4), i ≤ n, is the feature vector of sentence si.
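The window statistics above can be computed directly. The pointwise-mutual-information form of the concept score is assumed here (the exact formula image is missing from the source), built from the P(wi, wj) and P(wi) definitions given:

```python
import math

def text_windows(words, size=5):
    """Split a word list into consecutive, non-overlapping windows of `size` words."""
    return [words[i:i + size] for i in range(0, len(words), size)]

def window_prob(windows, w):
    """P(w): fraction of windows containing keyword w, i.e. s_w / |sw|."""
    return sum(w in win for win in windows) / len(windows)

def joint_prob(windows, wi, wj):
    """P(wi, wj): fraction of windows containing both keywords."""
    return sum(wi in win and wj in win for win in windows) / len(windows)

def mutual_info(windows, wi, wj):
    """Assumed PMI-style concept score: log2(P(wi, wj) / (P(wi) P(wj)))."""
    p_joint = joint_prob(windows, wi, wj)
    if p_joint == 0:
        return 0.0
    return math.log2(p_joint / (window_prob(windows, wi) * window_prob(windows, wj)))

words = "deep learning builds deep models for learning text summaries".split()
wins = text_windows(words, size=4)
score = mutual_info(wins, "deep", "learning")
```

Non-overlapping windows are one possible choice; a sliding window would change the probabilities but not the structure of the computation.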

Deep Learning Algorithm

  • The Restricted Boltzmann Machine contains two hidden layers, and for them two sets of bias values, H0 and H1, are selected.
  • These bias values are selected randomly.
    [Equations omitted: layer-wise RBM update rules]
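A rough sketch of the two-hidden-layer setup follows; the layer widths are assumptions, and the actual update equations are in the images omitted from the source. It only shows the randomly selected bias sets H0 and H1 and a forward pass through both layers:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

sizes = [4, 8, 8]   # visible (4 features) -> hidden H0 -> hidden H1 (assumed widths)

# randomly selected bias sets H0, H1, plus a weight matrix per layer
biases = [rng.normal(0, 0.1, n) for n in sizes[1:]]
weights = [rng.normal(0, 0.1, (a, b)) for a, b in zip(sizes, sizes[1:])]

def forward(v):
    """Propagate a sentence feature vector through both hidden layers."""
    for W, b in zip(weights, biases):
        v = sigmoid(v @ W + b)
    return v

h1 = forward(rng.random(4))   # activation of the second hidden layer H1
```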

Optimal Feature Vector Set Generation

  • Fine tune the obtained feature vector set by adjusting the weight of the units of the RBM
  • To fine tune the feature vector set optimally we use back propagation algorithm
  • Uses the cross-entropy error.
    For example, the term-weight feature of a sentence is reconstructed using the following formula:
    [Equation omitted: term-weight reconstruction]
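The cross-entropy error between a feature vector and its reconstruction, as used by the back-propagation fine-tuning, can be sketched in its standard form (the paper's exact expression is in the omitted image):

```python
import math

def cross_entropy(f, f_hat, eps=1e-12):
    """Cross-entropy between original features f and reconstructions f_hat.

    Both vectors are assumed to hold values in (0, 1); eps guards log(0).
    """
    return -sum(x * math.log(y + eps) + (1 - x) * math.log(1 - y + eps)
                for x, y in zip(f, f_hat))

good = cross_entropy([0.2, 0.7, 0.5, 0.9], [0.2, 0.7, 0.5, 0.9])   # near-perfect reconstruction
bad = cross_entropy([0.2, 0.7, 0.5, 0.9], [0.8, 0.3, 0.5, 0.1])    # poor reconstruction
```

Back-propagation then adjusts the RBM weights to reduce this error, which is what "fine tuning the feature vector set" amounts to.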

Sentence Score

Sc = |S ∩ Q| / Wc

Where:
Sc = sentence score of a sentence
S = sentence
Q = user query
Wc = total word count of the text
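Assuming the score is the count of words shared between the query and the sentence, divided by the word count (the formula image is missing from the source, so the normalization is an assumption), the score can be computed as:

```python
def sentence_score(sentence, query):
    """Sc = |words(S) ∩ words(Q)| / word count of the sentence (assumed form)."""
    s_words = sentence.lower().split()
    q_words = set(query.lower().split())
    overlap = len(set(s_words) & q_words)
    return overlap / len(s_words)

score = sentence_score("Deep learning summarizes documents well",
                       "deep learning summarization")
```

Here two of the five sentence words match the query, giving a score of 0.4.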

Ranking of Sentence

To find the number of top sentences to select from the matrix, we use the following formula based on the compression rate:
[Equation omitted: number of selected sentences as a function of compression rate]
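One plausible reading of the compression-rate rule (the exact formula is in the omitted image, so this is a sketch): keep the top CR% of sentences by score, then emit them in original document order:

```python
def select_top(scored_sentences, compression_rate):
    """Keep the top `compression_rate`% highest-scoring sentences, in document order.

    `scored_sentences` is a list of (sentence, score) pairs.
    """
    k = max(1, round(len(scored_sentences) * compression_rate / 100))
    top = sorted(scored_sentences, key=lambda pair: pair[1], reverse=True)[:k]
    chosen = {s for s, _ in top}
    return [s for s, _ in scored_sentences if s in chosen]

summary = select_top([("s1", 0.4), ("s2", 0.9), ("s3", 0.1), ("s4", 0.7)], 50)
```

With a 50% compression rate over four sentences, the two highest-scoring sentences (s2 and s4) survive, preserving their original order.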
