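Bidirectional Long Short-Term Memory Network Model: A Video Summarization Model Based on an Improved Bidirectional Long Short-Term Memory Network...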

Abstract: Traditional video summarization methods often ignore temporal information, and the video features they extract tend to be overly complex and prone to overfitting. To address these problems, a video summarization model based on an improved Bi-directional Long Short-Term Memory (BiLSTM) network is proposed. First, deep features of the video frames are extracted with a Convolutional Neural Network (CNN). To make the generated summary more diverse, a BiLSTM recasts the deep feature recognition task as a temporal feature labeling task over video frames, giving the model access to more context. Second, because the generated summary should also be representative, max pooling is used as the fusion method: it reduces the feature dimension, emphasizes key information, and suppresses redundant information, so that the model learns representative features. The lower feature dimension also reduces the number of parameters required by the fully connected layer, which helps avoid overfitting. Finally, the model predicts an importance score for each video frame, converts the frame scores into shot scores, and selects key shots to generate the video summary. Experiments on two standard datasets, TvSum and SumMe, show that the improved model increases summarization accuracy. Compared with DPPLSTM (Determinantal Point Process Long Short-Term Memory), an existing Long Short-Term Memory (LSTM) based video summarization model, the F1 score of the proposed model is higher by 1.4 and 0.3 percentage points.
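
The final step described in the abstract (frame scores converted to shot scores, then key shots selected) can be illustrated with a minimal sketch. The abstract does not say how frame scores are aggregated or how shots are chosen, so everything here is an assumption: the function names `shot_scores` and `select_key_shots` are hypothetical, each shot score is taken as the mean of its frame scores, and shots are picked with a 0/1 knapsack under a length budget, a common choice in the video summarization literature rather than a confirmed detail of this model.

```python
from typing import List, Tuple


def shot_scores(frame_scores: List[float],
                shots: List[Tuple[int, int]]) -> List[float]:
    """Average the per-frame importance scores inside each shot.

    Each shot is a half-open frame range (start, end). Mean aggregation
    is an assumption; the abstract does not specify the conversion.
    """
    return [sum(frame_scores[s:e]) / (e - s) for s, e in shots]


def select_key_shots(scores: List[float],
                     lengths: List[int],
                     budget: int) -> List[int]:
    """Pick shots maximizing total score with total length <= budget.

    Standard 0/1 knapsack by dynamic programming (assumed selection
    rule). Returns the indices of the chosen shots in ascending order.
    """
    n = len(scores)
    # best[i][c]: best total score using the first i shots within capacity c
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for c in range(budget + 1):
            best[i][c] = best[i - 1][c]  # option 1: skip shot i-1
            if lengths[i - 1] <= c:      # option 2: take shot i-1 if it fits
                cand = best[i - 1][c - lengths[i - 1]] + scores[i - 1]
                if cand > best[i][c]:
                    best[i][c] = cand

    # Backtrack: a changed table entry means shot i-1 was taken.
    chosen = []
    c = budget
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(i - 1)
            c -= lengths[i - 1]
    return sorted(chosen)


if __name__ == "__main__":
    # Toy data: 10 frames split into 4 shots; the budget of 5 frames is arbitrary.
    frames = [0.1, 0.2, 0.9, 0.8, 0.7, 0.3, 0.2, 0.1, 0.6, 0.6]
    shots = [(0, 2), (2, 5), (5, 8), (8, 10)]
    scores = shot_scores(frames, shots)
    lengths = [e - s for s, e in shots]
    print(select_key_shots(scores, lengths, budget=5))
```

On this toy input the second and fourth shots (indices 1 and 3) are selected: together they fill the 5-frame budget and have the highest combined score. In practice the frame scores would come from the trained model and the shot boundaries from a separate segmentation step, neither of which is shown here.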
