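Bidirectional Long Short-Term Memory Network Model_Video Summary Generation Model Based on an Improved Bidirectional Long Short-Term Memory Network...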

Latest recommended article published on 2024-06-06 15:14:12

樱桃小公举

Article tags: Bidirectional Long Short-Term Memory network model

Article link: https://blog.csdn.net/weixin_29043895/article/details/113021417

Abstract: Traditional video summarization methods often ignore temporal information, and the video features they extract tend to be overly complex and prone to overfitting. To address these problems, a video summary generation model based on an improved Bi-directional Long Short-Term Memory (BiLSTM) network was proposed. Firstly, deep features of the video frames were extracted with a Convolutional Neural Network (CNN). To make the generated video summary more diverse, the BiLSTM was adopted to convert the deep feature recognition task into a temporal feature labeling task over video frames, so that the model can obtain more context information. Secondly, considering that the generated video summary should be representative, max pooling was adopted as the fusion method to reduce the feature dimension, highlighting key information and diluting redundant information so that the model can learn representative features. Reducing the feature dimension also reduced the number of parameters required by the fully connected layer, which mitigated overfitting. Finally, the importance score of each video frame was predicted and converted into shot scores, and key shots were selected to generate the video summary. Experimental results show that the improved model increases the accuracy of video summary generation on the two standard datasets TvSum and SumMe. Compared with DPPLSTM (Determinantal Point Process Long Short-Term Memory), an existing Long Short-Term Memory (LSTM) network video summarization model, the proposed model improved the F1 value, a measure of model performance, by 1.4 and 0.3 percentage points.
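The later stages of the pipeline described in the abstract (fusion, frame-to-shot scoring, key-shot selection, and F1 evaluation) can be illustrated with a small pure-Python sketch. It is not the authors' implementation, and several details are assumptions the abstract does not state: the max-pooling fusion is read as an element-wise maximum over the forward and backward BiLSTM states, a shot score is taken as the mean of its frame scores, and key shots are chosen by a 0/1 knapsack under a length budget, as is common in this line of work. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def max_pool_fuse(forward: Sequence[float], backward: Sequence[float]) -> List[float]:
    """Fuse forward/backward hidden states by element-wise max (assumed reading).

    Concatenation would yield 2*h features; the element-wise max keeps h,
    so the following fully connected layer needs fewer weights.
    """
    if len(forward) != len(backward):
        raise ValueError("forward and backward states must have equal length")
    return [max(f, b) for f, b in zip(forward, backward)]


def shot_scores(frame_scores: Sequence[float],
                shots: Sequence[Tuple[int, int]]) -> List[float]:
    """Average the frame scores in each shot (assumption: mean aggregation).

    Each shot is a half-open frame range [start, end).
    """
    return [sum(frame_scores[s:e]) / (e - s) for s, e in shots]


def select_key_shots(scores: Sequence[float], lengths: Sequence[int],
                     budget: int) -> List[int]:
    """0/1 knapsack (assumed selection rule): maximise the total shot score
    while keeping the total selected length within `budget` frames."""
    n = len(scores)
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for cap in range(budget + 1):
            best[i][cap] = best[i - 1][cap]
            if lengths[i - 1] <= cap:
                cand = best[i - 1][cap - lengths[i - 1]] + scores[i - 1]
                if cand > best[i][cap]:
                    best[i][cap] = cand
    # Walk back through the table to recover which shots were taken.
    chosen, cap = [], budget
    for i in range(n, 0, -1):
        if best[i][cap] != best[i - 1][cap]:
            chosen.append(i - 1)
            cap -= lengths[i - 1]
    return sorted(chosen)


def f1_score(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """F1 between two binary per-frame selections (1 = frame in summary)."""
    overlap = sum(p and r for p, r in zip(predicted, reference))
    if overlap == 0:
        return 0.0
    precision = overlap / sum(predicted)
    recall = overlap / sum(reference)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented purely for illustration.
frame_scores = [0.9, 0.8, 0.1, 0.2, 0.3, 0.7, 0.6, 0.1, 0.1, 0.1]
shots = [(0, 2), (2, 5), (5, 7), (7, 10)]
per_shot = shot_scores(frame_scores, shots)
lengths = [end - start for start, end in shots]
selected = select_key_shots(per_shot, lengths, budget=4)
```

With this toy input the two highest-scoring shots fit the four-frame budget together, so `selected` contains their indices. In the model itself the frame scores would come from the fully connected layer applied to the fused BiLSTM features rather than from a hand-written list.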