mpeg标准8x8DCT变换以及定点化实现

言午许成长

已于 2022-01-19 16:09:20 修改

阅读量3.1k

点赞数 1

分类专栏：编解码文章标签：线性代数矩阵

于 2021-01-04 15:53:27 首次发布

本文链接：https://blog.csdn.net/qq_42139383/article/details/112177870

版权

编解码专栏收录该内容

32 篇文章

订阅专栏

Mpeg1/2标准的DCT变换与H264的整数DCT不太一样，会有小数运算。硬件实现时会做定点化处理，因此也会产生误差，误差主要体现在某些数值小数部分在0.5左右。比如小数运算时数值为4.4999，四舍五入最终像素值为4；如果定点化结果为4.50001，实际上与小数运算差异不大，但最终像素值却相差1。

1.DCT简介

DCT和DCT反变换可用如下公式表示：
dct变换公式
其中，Xij是图像块 X 中第 i 行第 j 列图像或残差值，Ymn 是变换结果矩阵 Y 相应频率点上的 DCT 系数。可以用矩阵表示
$Y=AXA^T$
$X=A^TYA$

Mpeg1/2标准的DCT是8x8大小的。
令 $a=\sqrt{1/8}$ ,
$b=\frac{1}{2}\cos{\left(\frac{\pi}{8}\right)}$ ,
$c=\frac{1}{2}\cos{\left(\frac{3\pi}{8}\right)}$ ,
$d=\frac{1}{2}\cos{\left(\frac{\pi}{16}\right)}$ ,
$e=\frac{1}{2}\cos{\left(\frac{3}{16}\pi\right)}$ ,
$f=\frac{1}{2}\cos{\frac{5}{16}\pi}$ ,
$g=\frac{1}{2}\cos{\frac{7}{16}\pi}$ ,
则矩阵A可以如下表示
$\begin{pmatrix} a & a & a & a& a& a& a& a \\ d & e&f &g&-g&-f&-e&-d \\ b&c&-c&-b&-b&-c&c&b \\ e&-g&-d&-f&f&d&g&-e \\ a& -a&-a&a&a&-a&-a&a \\ f&-d&g&e&-e&-g&d&-f \\ c&-b&b&-c&-c&b&-b&c\\ g&-f&e&-d&d&-e&f&-g \end{pmatrix}$
矩阵运算可以使用蝶形变换加速，以8x8反变换为例，分析 $A^TY$ 的第一列，
其中间结果用acc0-acc7表示，如下
$\begin{pmatrix} acc0 \\ acc1 \\ acc 2 \\ acc3 \\ \end{pmatrix} =\begin{pmatrix} a&b& a& c\\ a&c&-a&-b \\ a&-c&-a&b\\ a&-b&a&-c \end{pmatrix}\begin{pmatrix} x0\\x2 \\ x4\\ x6 \end{pmatrix}$

$\begin{pmatrix} acc4 \\ acc5 \\ acc 6 \\ acc7 \\ \end{pmatrix} =\begin{pmatrix} d&e& f& g\\ e&-g&-d&-f \\ f&-d&g&e\\ g&-f&e&-d \end{pmatrix}\begin{pmatrix} x1\\x3 \\ x5\\ x7 \end{pmatrix}$
最终结果为
$y 0 = a c c 0 + a c c 4$
$y 1 = a c c 1 + a c c 5$
$y 2 = a c c 2 + a c c 6$
$y 3 = a c c 3 + a c c 7$
$y 4 = a c c 3 - a c c 7$
$y 5 = a c c 2 - a c c 6$
$y 6 = a c c 1 - a c c 5$
$y 7 = a c c 0 - a c c 4$

DCT反变换的几种python实现

按照公式直接计算

def normal_dct(block):
    A = np.zeros((8, 8))  # 生成0矩阵
    shape = src.shape[1]  # 获取维数
    for i in range(8):
        for j in range(8):
            if (i == 0):
                x = sqrt(1 / shape)
            else:
                x = sqrt(2 / shape)
            A[i][j] = x * cos(pi * (j + 0.5) * i / shape)  
    return np.dot(np.dot(A.T, src), A)

使用scipy的dct函数

from scipy.fftpack import dct, idct
def idct2(block):
    return idct(idct(block.T, norm = 'ortho').T, norm = 'ortho')

2.定点化实现

定点主要是把小数转成整型数据范围，通过整形加减以及移位操作替换小数运算。DCT运算过程中矩阵 $A$ 中的参数都是小数，需要转为整数进行运算。
小数可以用2的负数次幂相加来表示，以 $a=\sqrt{1/8}$ 为例，可以表示为 $2^{-2} + 2^{-4} + 2^{-5} + 2^{-7} + 2^{-9}$ 。实现过程中以16个元素的数组来表示，每个数组元素所在数组下标对应2的幂次，则a的表项为a_table = [0,0,1,0,1,1,0,1,0,1,0,0,0,0,0,0]。

# a = sqrt(1/8) = 0.35355 = 2^-2 + 2^-4 + 2^-5 + 2^-7 + 2^-9
a_table = [0,0,1,0,1,1,0,1,0,1,0,0,0,0,0,0]
# b = (1/2*cos(PI/8)) = 0.46194 = 2^-1 - 2^-5 - 2^-7 + 2^-10
b_table = [0,1,0,0,0,-1,0,-1,0,0,1,0,0,0,0,0]
# c = (1/2*cos(PI*3/8)) = 0.19134 = 2^-3 + 2^-4 + 2^-8 - 2^-14
c_table = [0,0,0,1,1,0,0,0,1,0,0,0,0,0,-1,0]
# d = 1/2*cos(PI/16) = 0.49039 = 2^-1 - 2^-7 - 2^-9 + 2^-13
d_table = [0,1,0,0,0,0,0,-1,0,-1,0,0,0,1,0,0]
# e = 1/2*cos(PI*3/16) = 0.41573 = 2^-2 + 2^-3 + 2^-5 + 2^-7 + 2^-9 - 2^-12
e_table = [0,0,1,1,0,1,0,1,0,1,0,0,-1,0,0,0]
# f = 1/2*cos(PI*5/16) = 0.27779 = 2^-2 + 2^-5 - 2^-8 + 2^-11
f_table = [0,0,1,0,0,1,0,0,-1,0,0,1,0,0,0,0]
# g = 1/2*cos(PI*7/16) = 0.09755 = 2^-4 + 2^-5 + 2^-8 - 2^-13
g_table = [0,0,0,0,1,1,0,0,1,0,0,0,0,-1,0,0]
# 定点化IDCT
def fixpoint_idct(block):
    input = block.reshape(64, 1)
    idct_data1 = [0 for i in range(64)]
    idct_data2 = [0 for i in range(64)]
    tmp_data1 = idct_data1
    tmp_data2 = idct_data2

    for i in range(64):
        tmp_data1[i] = input[i] << 10
    #两次蝶形变换
    for k in range(2):
        for hor in range(8):
            a_x0 = b_x2 = a_x4 = c_x6 = 0
            c_x2 = b_x6 = 0
            d_x1 = e_x3 = f_x5 = g_x7 = 0
            e_x1 = g_x3 = d_x5 = f_x7 = 0
            f_x1 = d_x3 = g_x5 = e_x7 = 0
            g_x1 = f_x3 = e_x5 = d_x7 = 0
            i = hor << 3

            x0 = tmp_data1[i]
            x1 = tmp_data1[i+1]
            x2 = tmp_data1[i+2]
            x3 = tmp_data1[i+3]
            x4 = tmp_data1[i+4]
            x5 = tmp_data1[i+5]
            x6 = tmp_data1[i+6]
            x7 = tmp_data1[i+7]
			#遍历长度为16的查找表
            for j in range(16):
                a_x0 += (x0 >> j) * a_table[j]
                b_x2 += (x2 >> j) * b_table[j]
                a_x4 += (x4 >> j) * a_table[j]
                c_x6 += (x6 >> j) * c_table[j]
                c_x2 += (x2 >> j) * c_table[j]
                b_x6 += (x6 >> j) * b_table[j]

                d_x1 += (x1 >> j) * d_table[j]
                e_x1 += (x1 >> j) * e_table[j]
                f_x1 += (x1 >> j) * f_table[j]
                g_x1 += (x1 >> j) * g_table[j]

                d_x3 += (x3 >> j) * d_table[j]
                e_x3 += (x3 >> j) * e_table[j]
                f_x3 += (x3 >> j) * f_table[j]
                g_x3 += (x3 >> j) * g_table[j]
                d_x5 += (x5 >> j) * d_table[j]
                e_x5 += (x5 >> j) * e_table[j]
                f_x5 += (x5 >> j) * f_table[j]
                g_x5 += (x5 >> j) * g_table[j]
                d_x7 += (x7 >> j) * d_table[j]
                e_x7 += (x7 >> j) * e_table[j]
                f_x7 += (x7 >> j) * f_table[j]
                g_x7 += (x7 >> j) * g_table[j]

            acc0 = a_x0 + b_x2 + a_x4 + c_x6
            acc1 = a_x0 + c_x2 - a_x4 - b_x6
            acc2 = a_x0 - c_x2 - a_x4 + b_x6
            acc3 = a_x0 - b_x2 + a_x4 - c_x6
            acc4 = d_x1 + e_x3 + f_x5 + g_x7
            acc5 = e_x1 - g_x3 - d_x5 - f_x7
            acc6 = f_x1 - d_x3 + g_x5 + e_x7
            acc7 = g_x1 - f_x3 + e_x5 - d_x7

            tmp_data2[hor] = acc0 + acc4
            tmp_data2[(1<<3)+hor] = acc1 + acc5
            tmp_data2[(2<<3)+hor] = acc2 + acc6
            tmp_data2[(3<<3)+hor] = acc3 + acc7
            tmp_data2[(4<<3)+hor] = acc3 - acc7
            tmp_data2[(5<<3)+hor] = acc2 - acc6
            tmp_data2[(6<<3)+hor] = acc1 - acc5
            tmp_data2[(7<<3)+hor] = acc0 - acc4

        if k == 0:
            tmp_data1 = idct_data2
            tmp_data2 = idct_data1

    for i in range(64):
        tmp_data2[i] = (tmp_data2[i] + 512) >> 10
        if tmp_data2[i] > 255:
            tmp_data2[i] = 255
        if tmp_data2[i] < -256:
            tmp_data2[i] = -256

    return np.asarray(tmp_data2).reshape(8, 8)