This post walks through the JPEG encoder demo published by yinjinchao on GitHub. For any questions about the encoding theory itself, see the blog post linked below; here the focus is on the code implementation.
GitHub repository: github源码
Theory introduction: Jpeg编码原理
Suppose we already have all the RGB pixel values of an image; the first step is to convert RGB to YUV. Everything that follows is described for a single 8 * 8 block after the image has been split into blocks.
1. Color space conversion (taking RGB to YUV as an example)
void JpegEncoder::_convertColorSpace(int xPos, int yPos, char* yData, char* cbData, char* crData)
{
    for (int y=0; y<8; y++)
    {
        // the buffer stores each pixel as B, G, R in that order
        unsigned char* p = m_rgbBuffer + (y+yPos)*m_width*3 + xPos*3;
        for (int x=0; x<8; x++)
        {
            unsigned char B = *p++;
            unsigned char G = *p++;
            unsigned char R = *p++;
            // Y is level-shifted by -128 so it fits in a signed char;
            // for Cb/Cr the usual +128 offset cancels against the same level shift
            yData[y*8+x]  = (char)(0.299f * R + 0.587f * G + 0.114f * B - 128);
            cbData[y*8+x] = (char)(-0.1687f * R - 0.3313f * G + 0.5f * B);
            crData[y*8+x] = (char)(0.5f * R - 0.4187f * G - 0.0813f * B);
        }
    }
}
The logic here is quite straightforward: m_rgbBuffer is simply the start address of the array holding the RGB data, and after the color space conversion we obtain three 8 * 8 tables, Y, Cb (the U component) and Cr (the V component).
We can also see from this that the encoder uses the YUV 4:4:4 format: every block keeps full-resolution Cb and Cr planes, with no chroma subsampling.
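To put this per-block function in context, here is a minimal sketch of how the conversion could be driven over the whole image. It is not taken from the repository (the helper name _encodeAllBlocksSketch and the member m_height are assumptions), it assumes the image width and height are multiples of 8, and the real encoder's block loop may be organized differently since it also has to feed the DCT and entropy-coding steps that come later.

// Hypothetical driver loop (a sketch, not the repository's code):
// walk the image in 8x8 blocks and convert each block to Y/Cb/Cr.
void JpegEncoder::_encodeAllBlocksSketch()
{
    char yData[64], cbData[64], crData[64];
    for (int yPos = 0; yPos < m_height; yPos += 8)        // assumes m_height % 8 == 0
    {
        for (int xPos = 0; xPos < m_width; xPos += 8)     // assumes m_width % 8 == 0
        {
            _convertColorSpace(xPos, yPos, yData, cbData, crData);
            // each of the three 8x8 blocks would then go through
            // _foword_FDC (DCT + quantization) and entropy coding
        }
    }
}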
2. DCT
Next we transform the block into the frequency domain with the DCT. The transform itself does not discard anything; its job is to concentrate the block's energy into a few low-frequency coefficients, so that the quantization step that follows can throw away most of the high-frequency content and achieve a high compression ratio.
Let's compare the code against the DCT formula.
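For reference, the standard 8 * 8 forward DCT that the code implements is

F(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16},
\qquad \alpha(0) = \frac{1}{\sqrt{8}}, \quad \alpha(u>0) = \frac{1}{2}

which is exactly the alpha_u / alpha_v that appears in the loop below.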
void JpegEncoder::_foword_FDC(const char* channel_data, short* fdc_data)
{
    const float PI = 3.1415926f;
    for(int v=0; v<8; v++)
    {
        for(int u=0; u<8; u++)
        {
            // alpha(0) = 1/sqrt(8), alpha(u>0) = 1/2, as in the formula above
            float alpha_u = (u==0) ? 1/sqrt(8.0f) : 0.5f;
            float alpha_v = (v==0) ? 1/sqrt(8.0f) : 0.5f;
            float temp = 0.f;
            for(int x=0; x<8; x++)
            {
                for(int y=0; y<8; y++)
                {
                    float data = channel_data[y*8+x];
                    data *= cos((2*x+1)*u*PI/16.0f);
                    data *= cos((2*y+1)*v*PI/16.0f);
                    temp += data;
                }
            }
            // quantization is folded in here: divide by the scaled quantization table
            temp *= alpha_u*alpha_v/m_YTable[ZigZag[v*8+u]];
            // round to the nearest integer and store the coefficient in zig-zag order
            fdc_data[ZigZag[v*8+u]] = (short) ((short)(temp + 16384.5) - 16384);
        }
    }
}
This again follows the formula step by step, computing F(u, v) for every (u, v). Two details are worth pointing out: the division by m_YTable means the quantization step is folded into this function rather than done separately, and (short)(temp + 16384.5) - 16384 is just a branch-free way of rounding temp to the nearest integer (adding 16384.5 makes the value positive before the truncating cast, and the offset is then subtracted back out). So what is this m_YTable?
The encoder starts from two standard quantization tables (DQT), one for luminance and one for chrominance:
const unsigned char Luminance_Quantization_Table[64] =
{
16, 11, 10, 16, 24, 40, 51, 61,
12, 12, 14, 19, 26, 58, 60, 55,
14, 13, 16, 24, 40, 57, 69, 56,
14, 17, 22, 29, 51, 87, 80, 62,
18, 22, 37, 56, 68, 109, 103, 77,
24, 35, 55, 64, 81, 104, 113, 92,
49, 64, 78, 87, 103, 121, 120, 101,
72, 92, 95, 98, 112, 100, 103, 99
};
//-------------------------------------------------------------------------------
const unsigned char Chrominance_Quantization_Table[64] =
{
17, 18, 24, 47, 99, 99, 99, 99,
18, 21, 26, 66, 99, 99, 99, 99,
24, 26, 56, 99, 99, 99, 99, 99,
47, 66, 99, 99, 99, 99, 99, 99,
99, 99, 99, 99, 99, 99, 99, 99,
99, 99, 99, 99, 99, 99, 99, 99,
99, 99, 99, 99, 99, 99, 99, 99,
99, 99, 99, 99, 99, 99, 99, 99
};
const char ZigZag[64] =
{
0, 1, 5, 6,14,15,27,28,
2, 4, 7,13,16,26,29,42,
3, 8,12,17,25,30,41,43,
9,11,18,24,31,40,44,53,
10,19,23,32,39,45,52,54,
20,22,33,38,46,51,55,60,
21,34,37,47,50,56,59,61,
35,36,48,49,57,58,62,63
};
These two tables are derived from the psychovisual thresholds of the human eye (I believe that's what they are called...). To make the strength of quantization tunable by hand, the author of the source code added a quality_scale parameter, which he describes as follows:
the parameter ranges from 1 to 99 (the code clamps it to this interval), and the larger the value, the higher the compression ratio; following the code below confirms this.
void JpegEncoder::_initQualityTables(int quality_scale)
{
    if(quality_scale<=0) quality_scale=1;
    if(quality_scale>=100) quality_scale=99;
    for(int i=0; i<64; i++)
    {
        int temp = ((int)(Luminance_Quantization_Table[i] * quality_scale + 50) / 100);
        if (temp<=0) temp = 1;
        if (temp>0xFF) temp = 0xFF;
        m_YTable[ZigZag[i]] = (unsigned char)temp;

        temp = ((int)(Chrominance_Quantization_Table[i] * quality_scale + 50) / 100);
        if (temp<=0) temp = 1;
        if (temp>0xFF) temp = 0xFF;
        m_CbCrTable[ZigZag[i]] = (unsigned char)temp;
    }
}
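As a quick sanity check of the scaling formula (a worked example, not output from the repository): the first luminance entry is 16. With quality_scale = 1 it becomes (16*1 + 50)/100 = 0, which is clamped to 1, so the whole table degenerates to ones and the quantization is nearly lossless. With quality_scale = 50 it becomes (16*50 + 50)/100 = 8, and with quality_scale = 99 it becomes (16*99 + 50)/100 = 16, roughly the original table. Larger quality_scale therefore means larger quantization steps and a higher compression ratio, exactly as described above.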
Note also that the newly generated quantization tables are stored in zig-zag order, i.e. m_YTable and m_CbCrTable are indexed as m_YTable[ZigZag[i]] rather than plain m_YTable[i]. The reason is that when u and v are large (high frequencies) the quantization table entries are large, so after quantization most high-frequency coefficients become zero; emitting the coefficients in zig-zag order gathers those zeros into one long run at the end of the block, which is exactly what the run-length / Huffman coding that follows relies on.
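To make the effect of the zig-zag order concrete, here is a small standalone sketch (not part of the repository; the block values are made up) that reorders a typical quantized block with the same ZigZag indexing trick used in _foword_FDC and shows the zeros collecting into one long tail:

// Standalone illustration of zig-zag reordering of a quantized 8x8 block.
#include <cstdio>

static const char ZigZag[64] =
{
     0, 1, 5, 6,14,15,27,28,
     2, 4, 7,13,16,26,29,42,
     3, 8,12,17,25,30,41,43,
     9,11,18,24,31,40,44,53,
    10,19,23,32,39,45,52,54,
    20,22,33,38,46,51,55,60,
    21,34,37,47,50,56,59,61,
    35,36,48,49,57,58,62,63
};

int main()
{
    // a made-up quantized block: only a few low-frequency coefficients survive
    short block[64] = { 0 };
    block[0*8 + 0] = -26;   // DC coefficient
    block[0*8 + 1] = -3;
    block[1*8 + 0] =  1;
    block[2*8 + 0] = -2;

    // same indexing trick as fdc_data[ZigZag[v*8+u]] in _foword_FDC
    short zz[64];
    for (int row = 0; row < 8; row++)
        for (int col = 0; col < 8; col++)
            zz[ZigZag[row*8 + col]] = block[row*8 + col];

    // prints: -26 -3 1 -2 followed by 60 zeros
    for (int i = 0; i < 64; i++)
        printf("%d ", zz[i]);
    printf("\n");
    return 0;
}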