Paper Reading: A Reinforcement-Learning-Based Rate Control Method for Dynamic Video Sequences in HEVC

liaojq2020

Categories: Reinforcement Learning, HEVC. Tags: video coding, reinforcement learning, artificial intelligence, deep learning, algorithms

Article link: https://blog.csdn.net/qq_43616471/article/details/112132817

I. Source of the paper

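The paper is titled "Rate Control Method Based on Deep Reinforcement Learning for Dynamic Video Sequences in HEVC". Paper link: original link. Because that page loads slowly and is prone to errors, a shared download link is also provided: shared link.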

II. Main content

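The paper proposes a reinforcement-learning-based rate control algorithm for HEVC. It analyzes and models the decision process of intra prediction at the encoder side, and then solves the resulting problem with reinforcement learning.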

1. Some concepts

① frame-level and CTU-level

In addition, our method includes the frame and CTU-level rate control strategy, whose two tasks can be formulated and solved independently. The frame level rate control strategy is determined first; and then the CTU-level rate control strategy is determined.

The method consists of a frame-level part and a CTU-level part, which are modeled and solved independently. The frame-level rate control strategy is determined first, followed by the CTU-level rate control strategy.

② episode

At the frame level, each GOP is regarded as one episode of a task.
At the CTU level, each frame is treated as one episode of a task.
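The episode structure quoted above can be sketched as two nested loops. This is only an illustration of the ordering (frame-level decision first, then the CTU-level episode for that frame), not the authors' implementation. The helper names `split_into_gops` and `iter_steps`, and the GOP size and CTU count used in the example, are hypothetical.

```python
from typing import Iterator, List, Tuple


def split_into_gops(num_frames: int, gop_size: int) -> List[range]:
    """Group frame indices into GOPs; each GOP is one frame-level episode."""
    return [range(start, min(start + gop_size, num_frames))
            for start in range(0, num_frames, gop_size)]


def iter_steps(num_frames: int, gop_size: int,
               ctus_per_frame: int) -> Iterator[Tuple[str, int, int]]:
    """Yield (level, episode_id, step_id) in the order decisions are made."""
    for gop_id, gop in enumerate(split_into_gops(num_frames, gop_size)):
        for frame in gop:
            # Frame-level decision: one step of the episode for this GOP.
            yield ("frame", gop_id, frame)
            # CTU-level decisions: this frame is its own episode.
            for ctu in range(ctus_per_frame):
                yield ("ctu", frame, ctu)


if __name__ == "__main__":
    # Hypothetical sizes: 3 frames, GOP size 2, 2 CTUs per frame.
    for step in iter_steps(num_frames=3, gop_size=2, ctus_per_frame=2):
        print(step)
```

Keeping the two levels as separate episodes mirrors the statement that the two tasks are formulated and solved independently.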

③ state

[Image: Table I of the paper (state features); not available here.]

Because I frames and P/B frames have many different characteristics in the encoding process, we use different sets of features for I frames and P/B frames.
For an inter-frame, we choose features 3-4 and 6-8 in Table I.
For an intra-frame, we select features 1-7 in Table I to describe the environment.

For an inter-frame, features 3-4 and 6-8 of Table I are used as the state.
For an intra-frame, features 1-7 of Table I are used as the state.

[Image: Table II of the paper (state features); not available here.]

For an inter-frame, we choose features 9-12 in Table II.
For an intra-frame, we select features 1-11 in Table II to describe the environment.

For an inter-frame, features 9-12 of Table II are used as the state.
For an intra-frame, features 1-11 of Table II are used as the state.
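The feature selection above amounts to picking a subset of a table's features depending on the frame type. A minimal sketch follows, assuming the full feature vector of a table is available as a list in table order. The names `FEATURE_IDS` and `build_state` are hypothetical, and the features themselves are not reproduced here because the tables are only available as images.

```python
from typing import Dict, List, Sequence

# 1-based feature numbers, exactly as quoted from the paper.
FEATURE_IDS: Dict[str, Dict[str, List[int]]] = {
    "table_I": {"inter": [3, 4, 6, 7, 8], "intra": list(range(1, 8))},
    "table_II": {"inter": [9, 10, 11, 12], "intra": list(range(1, 12))},
}


def build_state(features: Sequence[float], table: str,
                frame_type: str) -> List[float]:
    """Select the state entries for a frame type from a full feature vector.

    features[0] holds feature 1 of the table, features[1] feature 2, etc.
    """
    ids = FEATURE_IDS[table][frame_type]
    return [features[i - 1] for i in ids]


if __name__ == "__main__":
    dummy = [float(i) for i in range(1, 13)]  # placeholder feature values
    print(build_state(dummy, "table_I", "inter"))
    print(build_state(dummy, "table_II", "intra"))
```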

④ action

In our proposed framework, the RL agent determines the QP for each frame and CTU. The possible actions are the QP values, which range from 0 to 51.

At the frame level, the actions include all possible QPs and bit budgets for a frame.
At the CTU level, the available actions include all possible QPs that can be used to encode a CTU.

The action is to choose a suitable QP value in the range 0-51.

Note that to ensure stable quality, at the frame level, the QP must satisfy $QP_{PicAvg}−2 ≤ QP_{currframe} ≤ QP_{PicAvg}+2$ , where $QP_{PicAvg}$ is the average QP of previous frames.
At the CTU level, the QP must satisfy $QP_{currframe} −2 ≤ QP_{currCTU} ≤ QP_{currframe} + 2$ .

To ensure stable quality, the QP must satisfy the constraint $QP_{PicAvg}−2 ≤ QP_{currframe} ≤ QP_{PicAvg}+2$ at the frame level and $QP_{currframe} −2 ≤ QP_{currCTU} ≤ QP_{currframe} + 2$ at the CTU level.
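The two constraints have the same shape: the chosen QP must lie within 2 of a reference QP (the average QP of previous frames at the frame level, the current frame's QP at the CTU level) and inside the overall 0-51 range. A minimal sketch of how an agent's action set could be restricted is shown below. The function names are hypothetical, and treating a non-integer average by rounding the bounds inward is an assumption, not something stated in the paper.

```python
import math
from typing import List

QP_MIN, QP_MAX = 0, 51  # QP range stated in the text


def allowed_qps(reference_qp: float, delta: int = 2) -> List[int]:
    """Integer QPs within +/- delta of the reference, clipped to [0, 51].

    Assumes reference_qp itself lies in [0, 51], so the list is never empty.
    """
    low = max(QP_MIN, math.ceil(reference_qp - delta))
    high = min(QP_MAX, math.floor(reference_qp + delta))
    return list(range(low, high + 1))


def clamp_qp(proposed_qp: int, reference_qp: float, delta: int = 2) -> int:
    """Project a proposed QP onto the allowed range around the reference."""
    qps = allowed_qps(reference_qp, delta)
    return min(max(proposed_qp, qps[0]), qps[-1])


if __name__ == "__main__":
    frame_qp = clamp_qp(proposed_qp=36, reference_qp=30.5)  # frame level
    ctu_qp = clamp_qp(proposed_qp=27, reference_qp=frame_qp)  # CTU level
    print(frame_qp, ctu_qp)
```

Applying the frame-level constraint first and then using the resulting frame QP as the CTU-level reference follows the order in which the two strategies are determined.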

⑤ reward

frame-level:

$D_{cur\_frame}$ and $D_{prev\_frame}$ are the distortions of the current and previous frames, respectively, for which we use mean-square error (MSE) as the quality evaluation standard;

$V=T_{avg\_frame} \cdot N_{coded\_frame}-\sum^{N_{coded\_frame}}_{i=1}R_i$ is the current buffer status, where $T_{avg\_frame}$ is the target average number of bits per frame, $N_{coded\_frame}$ is the number of encoded frames, $R_i$ is the actual number of bits of the $i$-th frame, and $\epsilon$ is a small value to avoid division by 0.
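The buffer status $V$ defined above is the difference between the bits budgeted so far and the bits actually spent. A small sketch of that computation is given below; the function name `buffer_status` and the example numbers are hypothetical, and how $V$ enters the reward is not shown because the reward formula is not reproduced in this post.

```python
from typing import Sequence


def buffer_status(target_bits_per_frame: float,
                  actual_bits: Sequence[float]) -> float:
    """V = T_avg_frame * N_coded_frame - sum(R_i) over the frames coded so far."""
    n_coded = len(actual_bits)
    return target_bits_per_frame * n_coded - sum(actual_bits)


if __name__ == "__main__":
    # Three coded frames against a budget of 1000 bits per frame.
    print(buffer_status(1000, [900, 1200, 800]))
```

With this sign convention, a positive $V$ means fewer bits were spent than budgeted, and a negative $V$ means the budget was exceeded.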

CTU-level:

$D_{cur\_CTU}$ and $D_{prev\_CTU}$ are the distortions of the current and previous CTUs, respectively.

Here, the distortion is measured as the sum of the absolute transformed difference (SATD) of the CTU for an intra-frame and as the MAD for an inter-frame.
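For reference, the two distortion measures can be sketched in plain Python. This assumes square blocks whose side is a power of two, computes SATD as the sum of absolute values of the Hadamard-transformed residual without any normalization (real encoders typically apply a scaling factor), and reads MAD as the mean absolute difference. All function names are hypothetical, and the block sizes used in the paper are not specified here.

```python
from typing import List

Block = List[List[int]]


def hadamard(n: int) -> Block:
    """Sylvester-construction Hadamard matrix; n must be a power of two."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def matmul(a: Block, b: Block) -> Block:
    """Plain matrix product of two square integer matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def residual(orig: Block, pred: Block) -> Block:
    """Element-wise difference between original and predicted samples."""
    return [[o - p for o, p in zip(ro, rp)] for ro, rp in zip(orig, pred)]


def satd(orig: Block, pred: Block) -> int:
    """Sum of absolute values of H * residual * H (unnormalized)."""
    diff = residual(orig, pred)
    h = hadamard(len(diff))
    transformed = matmul(matmul(h, diff), h)
    return sum(abs(v) for row in transformed for v in row)


def mad(orig: Block, pred: Block) -> float:
    """Mean absolute difference between original and predicted samples."""
    diff = residual(orig, pred)
    count = sum(len(row) for row in diff)
    return sum(abs(v) for row in diff for v in row) / count


if __name__ == "__main__":
    original = [[1, 2], [3, 4]]
    prediction = [[0, 0], [0, 0]]
    print(satd(original, prediction), mad(original, prediction))
```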

2. Algorithm procedure

[Image from the paper: the algorithm procedure; not available here.]

III. Results

[Image from the paper: results of the algorithm; not available here.]
