JPEG Encoding and Decoding Principles and C++ Debugging

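JPEG (Joint Photographic Experts Group) is a lossy compression standard for digital images. The group was formed in 1986, and the standard was issued in 1992 (later published as ISO/IEC 10918-1). Because the JPEG coding algorithm achieves a high compression ratio while preserving good display quality, JPEG has become the best-known and most widely used digital image format and general-purpose standard.

The most common extensions for JPEG files are .jpg and .jpeg; .jpe, .jfif and .jif are also used.

This article briefly explains the principles of JPEG encoding and decoding and then analyzes the code of the decoding part.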

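I. JPEG Encoding and Decoding Principles

First, the block diagram of a baseline JPEG encoder [1]:


Figure 1-1: Block diagram of a baseline JPEG encoder

The encoding and decoding principles are outlined below.

JPEG Encoding Principles

(1) Level offset

Each 8×8 block first undergoes a level offset (level shift): for pixels with $2^n$ gray levels, subtracting $2^{n-1}$ turns the unsigned integers into signed ones with range $[-2^{n-1}, 2^{n-1}-1]$. This greatly reduces the probability of values with large magnitude and improves coding efficiency.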
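The level offset can be sketched in a few lines of C. This is only an illustration of the subtraction described above; the function names `level_shift` and `level_shift_block` are made up for this example and do not come from any library.

```c
#include <stdint.h>

/* Level-shift one sample of bit depth n (n = 8 for baseline JPEG):
 * maps [0, 2^n - 1] to [-2^(n-1), 2^(n-1) - 1]. */
int level_shift(unsigned sample, unsigned n)
{
    return (int)sample - (1 << (n - 1));
}

/* Level-shift a whole 8x8 block of 8-bit samples. */
void level_shift_block(const uint8_t in[64], int16_t out[64])
{
    for (int i = 0; i < 64; i++)
        out[i] = (int16_t)level_shift(in[i], 8);
}
```

For 8-bit samples, 0 maps to -128 and 255 maps to 127, so a mid-gray block ends up close to zero.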

(2) 8×8 DCT

The image is divided into 8×8 blocks. If the width (or height) is not a multiple of 8, the image is padded with its edge pixels so that the spectral distribution is not altered. Each block then undergoes a DCT (Discrete Cosine Transform), which concentrates the energy and decorrelates the samples, making it easier to remove spatial redundancy and improving coding efficiency.

The DCT kernel matrix is
$$
\mathbf C=\frac{1}{\sqrt{N}}\left[\begin{array}{cccc}
1 & 1 & \cdots & 1 \\
\sqrt{2} \cos \frac{\pi}{2 N} & \sqrt{2} \cos \frac{3 \pi}{2 N} & \cdots & \sqrt{2} \cos \frac{(2 N-1) \pi}{2 N} \\
\sqrt{2} \cos \frac{2 \pi}{2 N} & \sqrt{2} \cos \frac{6 \pi}{2 N} & \cdots & \sqrt{2} \cos \frac{2(2 N-1) \pi}{2 N} \\
\vdots & \vdots & \ddots & \vdots \\
\sqrt{2} \cos \frac{(N-1) \pi}{2 N} & \sqrt{2} \cos \frac{(N-1) 3 \pi}{2 N} & \cdots & \sqrt{2} \cos \frac{(N-1)(2 N-1) \pi}{2 N}
\end{array}\right] \tag{1-1}
$$
JPEG uses the two-dimensional 8×8 DCT ($N=8$), so
$$
\mathbf F(u,v) = \mathbf C\,\mathbf f(x,y)\,\mathbf C^{\mathrm T}=\mathbf C\,\mathbf f(x,y)\,\mathbf C^{-1} \tag{1-2}
$$
It is worth stressing that the DCT itself is a lossless transform (apart from arithmetic rounding) and does not compress the image; its purpose is to prepare for the quantization step that follows.
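As a quick sanity check of the transform, the DC term can be worked out by hand. The derivation below assumes the indexing $F(u,v)=\sum_x\sum_y C(u,x)\,f(x,y)\,C(v,y)$ for the matrix product in (1-2), and uses the fact that every entry in the first row of the kernel matrix is $1/\sqrt{N}$:

$$
F(0,0)=\sum_{x=0}^{N-1}\sum_{y=0}^{N-1}\frac{1}{\sqrt N}\,f(x,y)\,\frac{1}{\sqrt N}
=\frac{1}{N}\sum_{x=0}^{N-1}\sum_{y=0}^{N-1}f(x,y)=N\,\bar f
$$

where $\bar f$ is the mean of the $N^2$ samples in the block. For $N=8$ the DC coefficient is therefore 8 times the block mean; with level-shifted 8-bit samples, $\bar f\in[-128,127]$ gives $F(0,0)\in[-1024,1016]$. For a block of constant value all other coefficients vanish, because the remaining rows of the kernel matrix are orthogonal to its first row.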

(3) Quantization

In the JPEG compression algorithm, few parts are actually adjustable: the two main steps, the DCT and entropy coding, are fully determined, so in practice only quantization can be tuned, which makes it the core of the algorithm. Quantization is also the only step in this pipeline that introduces error, and it is the source of the lossy compression gain; entropy coding then compresses the quantized coefficients losslessly.


Figure 1-2: Visual characteristics of the human eye

The JPEG standard uses mid-tread uniform quantization. The human eye is far more sensitive to low-frequency components than to high-frequency ones, and far more sensitive to luminance than to chrominance. The standard therefore provides two quantization tables (one for luminance, one for chrominance) so that low frequencies are quantized finely and high frequencies coarsely, and luminance finely and chrominance coarsely, which reduces visual redundancy.

The luminance quantization matrix given in the JPEG standard for 50% quality [2]:
$$
\mathbf Q=\left[\begin{array}{cccccccc}
16 & 11 & 10 & 16 & 24 & 40 & 51 & 61 \\
12 & 12 & 14 & 19 & 26 & 58 & 60 & 55 \\
14 & 13 & 16 & 24 & 40 & 57 & 69 & 56 \\
14 & 17 & 22 & 29 & 51 & 87 & 80 & 62 \\
18 & 22 & 37 & 56 & 68 & 109 & 103 & 77 \\
24 & 35 & 55 & 64 & 81 & 104 & 113 & 92 \\
49 & 64 & 78 & 87 & 103 & 121 & 120 & 101 \\
72 & 92 & 95 & 98 & 112 & 100 & 103 & 99
\end{array}\right] \tag{1-3}
$$
The quantized DCT coefficients are computed as
$$
\mathbf B(u,v)=\operatorname{round}\left[\frac{\mathbf F(u,v)}{\mathbf Q(u,v)}\right] \quad u,v=0,1,2, \ldots, 7 \tag{1-4}
$$
For the sample block used in [2], the result is
$$
\mathbf B=\left[\begin{array}{rrrrrrrr}
-26 & -3 & -6 & 2 & 2 & -1 & 0 & 0 \\
0 & -2 & -4 & 1 & 1 & 0 & 0 & 0 \\
-3 & 1 & 5 & -1 & -1 & 0 & 0 & 0 \\
-3 & 1 & 2 & -1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\right]
$$
Note that most coefficients in the high-frequency region of the block (beyond the 4th row/column) have already been quantized to 0.

The quantization matrix is not fixed; it can be adjusted according to the required quality.
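A minimal sketch of the quantization rule, assuming the DCT coefficients have already been rounded to integers. The function names are hypothetical, and the rounding is done in integer arithmetic with halves rounded away from zero, which is one common reading of round[·].

```c
/* Quantize one DCT coefficient f by step q, rounding to the nearest
 * integer (halves are rounded away from zero). */
int quantize(int f, int q)
{
    return (f >= 0) ? (f + q / 2) / q : -((-f + q / 2) / q);
}

/* Dequantize: the decoder can only recover a multiple of the step size. */
int dequantize(int b, int q)
{
    return b * q;
}

/* Quantize a whole 8x8 block; F, Q and B use the same coefficient order. */
void quantize_block(const int F[64], const unsigned char Q[64], int B[64])
{
    for (int i = 0; i < 64; i++)
        B[i] = quantize(F[i], Q[i]);
}
```

As an illustration, a DC coefficient of -415 with step 16 quantizes to -26 and is reconstructed as -416. The difference of 1 is the quantization error, and the larger steps used for high frequencies are what drive most of those coefficients to 0.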

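(4) Differential coding of DC coefficients

The DC coefficients obtained from the DCT of the 8×8 blocks have two characteristics: their values are relatively large, and the DC coefficients of adjacent blocks differ only slightly (i.e., they are redundant). The JPEG standard therefore uses DPCM (Differential Pulse Code Modulation) and encodes the difference $\text{DIFF}$ between the quantized DC coefficients of adjacent blocks:
$$
\text{DIFF}_k = \text{DC}_k - \text{DC}_{k-1} \tag{1-5}
$$
For the underlying principle, see the post "C++ Implementation of Image Compression Using DPCM".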
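The DC differencing and its inverse can be sketched as follows. The function names are hypothetical, and the sketch assumes a single component whose predictor starts at 0 for the first block.

```c
/* Encoder side: replace each quantized DC value by its difference from the
 * previous block's DC (the predictor starts at 0). */
void dc_dpcm_encode(const int *dc, int *diff, int nblocks)
{
    int prev = 0;
    for (int k = 0; k < nblocks; k++) {
        diff[k] = dc[k] - prev;
        prev = dc[k];
    }
}

/* Decoder side: accumulate the differences to recover the DC values. */
void dc_dpcm_decode(const int *diff, int *dc, int nblocks)
{
    int prev = 0;
    for (int k = 0; k < nblocks; k++) {
        prev += diff[k];
        dc[k] = prev;
    }
}
```

For the DC sequence -26, -20, -23 the encoder emits -26, 6, -3. After the first block the differences are much smaller than the values themselves, which is what makes them cheaper to code.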

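(5) Zig-zag scan and RLE of AC coefficients

After the DCT, most of the significant coefficients are concentrated in the upper-left corner, i.e., the low-frequency region. A zig-zag scan therefore reads the coefficients out in order of increasing frequency, which produces many runs of consecutive zeros and suits RLE (Run-Length Encoding). In particular, if all the remaining coefficients are zero, a single EOB (End of Block) is emitted.


Figure 1-3: Zig-zag scan

For the principle of RLE, see the post "C++ Implementation of TGA-to-YUV Conversion".

For example, 0, -2, -1, -1, -1, 0, 0, -1, EOB is represented as (1, -2), (0, -1), (0, -1), (0, -1), (2, -1), EOB, where each pair is (number of preceding zeros, nonzero value).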
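The pairing rule can be sketched in C as below. The names `rle_pair` and `rle_encode_ac` are hypothetical, and the sketch is deliberately simplified: it leaves out the special code that real JPEG streams use for zero runs longer than 15.

```c
/* One run-length pair: `run` zeros followed by the nonzero `value`. */
struct rle_pair { int run; int value; };

/* Run-length encode the AC coefficients ac[0..n-1] (already in zig-zag
 * order). Writes the pairs to out[] and returns how many were written.
 * Trailing zeros produce no pair, which corresponds to emitting EOB. */
int rle_encode_ac(const int *ac, int n, struct rle_pair *out)
{
    int npairs = 0, run = 0;
    for (int i = 0; i < n; i++) {
        if (ac[i] == 0) {
            run++;                     /* extend the current run of zeros */
        } else {
            out[npairs].run = run;
            out[npairs].value = ac[i];
            npairs++;
            run = 0;
        }
    }
    return npairs;
}
```

Feeding it the 63 AC coefficients 0, -2, -1, -1, -1, 0, 0, -1 followed by zeros yields the five pairs listed in the example above.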

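(6) Huffman coding

The DPCM output for the DC coefficients and the RLE output for the AC coefficients are Huffman coded: the category ID is coded with a variable-length prefix code, and the index within the category with a fixed-length code. Taking the DC coefficients as an example:


Figure 1-4: Huffman coding table for DC coefficients

For example, a difference $\text{DIFF} = 3$ falls in category ID = 2 with within-category index = 3 (binary 11), so the codeword is 100 11.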
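The category ID and the within-category index of a DC difference can be computed as in the sketch below. The function names are hypothetical. The variable-length code that is sent for the category itself comes from the Huffman table, so it is not computed here.

```c
/* Category of a DC difference: the number of bits needed to represent
 * |diff|. Category 0 means diff == 0. */
int dc_category(int diff)
{
    int mag = diff < 0 ? -diff : diff;
    int cat = 0;
    while (mag) {
        cat++;
        mag >>= 1;
    }
    return cat;
}

/* The `cat` index bits appended after the category code: diff itself when
 * diff >= 0, otherwise diff + 2^cat - 1 (negative values start with a 0 bit). */
unsigned dc_extra_bits(int diff)
{
    int cat = dc_category(diff);
    return (unsigned)(diff >= 0 ? diff : diff + (1 << cat) - 1);
}
```

For DIFF = 3 this gives category 2 and index bits 11, matching the example above. For DIFF = -3 the category is also 2 and the index bits are 00.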

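There are four Huffman tables in total: luminance DC, luminance AC, chrominance DC, and chrominance AC.

JPEG Decoding Principles

Decoding is exactly the inverse of encoding, so only the block diagram of the decoder is given:


Figure 1-5: Block diagram of a JPEG decoder

II. JPEG File Format Analysis

A JPEG file is organized as a sequence of segments. Each segment begins with a marker, and every marker consists of a 0xFF byte followed by a one-byte marker identifier. Next comes a 2-byte length field (which counts itself and the payload, but not the two marker bytes), followed by the payload. The SOI and EOI markers consist of the 2-byte identifier only.

Note that a run of consecutive 0xFF bytes does not start several markers; the extra 0xFF bytes are fill bytes used for padding.

In addition, within the entropy-coded data, a 0x00 byte that follows 0xFF is a stuffed byte: it is skipped, and the 0xFF is treated as ordinary data.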
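The segment layout can be illustrated with a small marker walker. This is a simplified sketch under the rules stated above, not the code of any particular decoder. The names `read_be16` and `list_markers` are hypothetical, and the walk stops at SOS because entropy-coded data follows it.

```c
#include <stddef.h>

/* Read a 16-bit big-endian value, as used by JPEG length fields. */
unsigned read_be16(const unsigned char *p)
{
    return ((unsigned)p[0] << 8) | p[1];
}

/* Walk the segments of buf[0..size-1], store up to max_ids marker
 * identifiers in ids[], and stop after SOS (0xDA) or EOI (0xD9).
 * Returns the number of markers seen, or -1 if the data is malformed. */
int list_markers(const unsigned char *buf, size_t size,
                 unsigned char *ids, int max_ids)
{
    size_t pos = 0;
    int n = 0;
    while (pos + 1 < size) {
        if (buf[pos++] != 0xFF)
            return -1;                       /* every marker starts with 0xFF */
        while (pos < size && buf[pos] == 0xFF)
            pos++;                           /* skip fill bytes */
        if (pos >= size)
            return -1;
        unsigned char id = buf[pos++];
        if (n < max_ids)
            ids[n] = id;
        n++;
        if (id == 0xD8)                      /* SOI: no length, no payload */
            continue;
        if (id == 0xD9 || id == 0xDA)        /* EOI, or SOS (scan data follows) */
            break;
        if (pos + 2 > size)
            return -1;
        unsigned len = read_be16(buf + pos); /* includes the 2 length bytes */
        if (len < 2 || pos + len > size)
            return -1;
        pos += len;                          /* jump to the next marker */
    }
    return n;
}
```

Adding the length to the position of the length field itself lands exactly on the next marker, which is why the length is defined to include its own two bytes.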

Common markers are listed below:

| Short name | Bytes | Payload | Name | Comments |
|---|---|---|---|---|
| SOI | 0xFF, 0xD8 | none | Start Of Image | |
| SOF0 | 0xFF, 0xC0 | variable size | Start Of Frame (baseline DCT) | Indicates that this is a baseline DCT-based JPEG, and specifies the width, height, number of components, and component subsampling (e.g., 4:2:0). |
| SOF2 | 0xFF, 0xC2 | variable size | Start Of Frame (progressive DCT) | Indicates that this is a progressive DCT-based JPEG, and specifies the width, height, number of components, and component subsampling (e.g., 4:2:0). |
| DHT | 0xFF, 0xC4 | variable size | Define Huffman Table(s) | Specifies one or more Huffman tables. |
| DQT | 0xFF, 0xDB | variable size | Define Quantization Table(s) | Specifies one or more quantization tables. |
| DRI | 0xFF, 0xDD | 4 bytes | Define Restart Interval | Specifies the interval between RSTn markers, in Minimum Coded Units (MCUs). This marker is followed by two bytes indicating the fixed size so it can be treated like any other variable size segment. |
| SOS | 0xFF, 0xDA | variable size | Start Of Scan | Begins a top-to-bottom scan of the image. In baseline DCT JPEG images, there is generally a single scan. Progressive DCT JPEG images usually contain multiple scans. This marker specifies which slice of data it will contain, and is immediately followed by entropy-coded data. |
| RSTn | 0xFF, 0xDn (n = 0, ..., 7) | none | Restart | Inserted every r macroblocks, where r is the restart interval set by a DRI marker. Not used if there was no DRI marker. The low three bits of the marker code cycle in value from 0 to 7. |
| APPn | 0xFF, 0xEn | variable size | Application-specific | For example, an Exif JPEG file uses an APP1 marker to store metadata, laid out in a structure based closely on TIFF. |
| COM | 0xFF, 0xFE | variable size | Comment | Contains a text comment. |
| EOI | 0xFF, 0xD9 | none | End Of Image | |

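Below, the Synalyze It! Pro app is used to analyze Figure 2-1 at the binary level, with brief notes on some of the marker fields.


Figure 2-1: test.jpg (1024×1024)

SOI (Start Of Image) and EOI (End Of Image)

The JPEG standard requires SOI and EOI to be the first and last markers of the file. Each consists only of its identifier, 0xFFD8 and 0xFFD9 respectively.


Figure 2-2: SOI marker

APP0 (Application-specific)


Figure 2-3: APP0 marker
  • Fields 4 & 5: version number, i.e., the JFIF version
  • Field 6: density unit for the X and Y directions (0: no unit; 1: dots per inch; 2: dots per centimeter)
  • Fields 7 & 8: density in the X and Y directions (value range unknown)
  • Fields 9 & 10: number of thumbnail pixels in the X and Y directions

DQT (Define Quantization Table)


Figure 2-4: One of the DQT markers (not shown in full)
  • Pq (4 bits): quantization precision (0: 8-bit; 1: 16-bit)
  • Tq (4 bits): quantization table ID, in the range 0 to 3
  • Qk (1 byte per entry when Pq = 0): 64 entries in total

SOF0 (Start Of Frame)


Figure 2-5: SOF0 marker
  • Field 3: precision, i.e., the number of bits per sample (usually 8; most software does not support 12/16 bits)
  • Fields 4 & 5: image height and width (in pixels)
  • Field 6: number of color components (1: grayscale; 3: YCbCr or YIQ; 4: CMYK). JFIF uses YCbCr, so this field is 3 for color JFIF images
  • Field 7: color component information; each component has the following 4 fields:
    • Field 7.0: color component ID
    • Fields 7.1 & 7.2: horizontal and vertical sampling factors
    • Field 7.3: ID of the quantization table used by this component

DHT (Define Huffman Table)


Figure 2-6: DHT marker (DC table 0)
  • Fields 3 & 4 (4 bits each): table class (high 4 bits) and table ID (low 4 bits) (0x00: DC table 0; 0x01: DC table 1; 0x10: AC table 0; 0x11: AC table 1)
  • Fields 5 to 20: the number of codewords of each length from 1 to 16 bits. In Figure 2-6, this field says that there are no 1-bit Huffman codewords, three 2-bit codewords, one codeword for each length from 3 to 9 bits, and no codewords of 10 bits or longer
  • Field 21 to the end: the coded symbols. From the previous field, this Huffman tree has 3 + 7 = 10 leaf nodes, so this field has 10 entries. They give the values (HUFFVAL) of the 10 leaves in order of increasing code length: 04, 05, 06, 03, 02, 01, 00, 09, 07, 08 (hexadecimal).
    For a DC table, the value is the number of additional bits to read. For an AC table, the high 4 bits of the value give the number of consecutive zeros preceding the current coefficient, and the low 4 bits give the number of bits of that AC coefficient, i.e., how many bits to read next.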
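How the sixteen length counts turn into actual codewords can be sketched as follows. This is the usual canonical assignment (consecutive codes within one length, a left shift when moving to the next length). The function name `assign_huffman_codes` is hypothetical.

```c
/* Assign canonical Huffman codes from the 16 length counts of a DHT segment.
 * counts[1..16] is the number of codes of each length (counts[0] is unused).
 * Fills codes[] and sizes[] in the same order as the symbol list (HUFFVAL)
 * and returns the total number of codes (at most 256). */
int assign_huffman_codes(const unsigned char counts[17],
                         unsigned codes[256], unsigned char sizes[256])
{
    unsigned code = 0;
    int n = 0;
    for (int len = 1; len <= 16; len++) {
        for (int j = 0; j < counts[len] && n < 256; j++) {
            codes[n] = code++;             /* next code of this length */
            sizes[n] = (unsigned char)len;
            n++;
        }
        code <<= 1;   /* moving to the next length appends a 0 bit */
    }
    return n;
}
```

With the counts from Figure 2-6 (0, 3, 1, 1, 1, 1, 1, 1, 1, 0, ...), the symbols 04, 05, 06 receive the 2-bit codes 00, 01, 10, symbol 03 receives 110, symbol 02 receives 1110, and so on up to the 9-bit code 111111110 for symbol 08.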

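SOS (Start Of Scan)


Figure 2-7: SOS marker
  • Field 3: number of color components; should equal field 6 of the SOF marker
  • Fields 4 to 6: color component information, where the DC and AC fields give the numbers of the Huffman tables used for that component's DC and AC coefficients
  • Field 7: Ss, start of spectral selection
  • Field 8: Se, end of spectral selection
  • Fields 9 & 10: Ah & Al, successive approximation bit positions (high and low)

III. JPEG Decoding Flow and Core Code

The decoding flow is described below, based on the Tiny JPEG Decoder program.

1. Layered structure

A notable feature of the JPEG compression algorithm is its layered design. The design intent of the three main structures is as follows:

  • struct huffman_table: stores a Huffman code table.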

    /* tinyjpeg-internal.h */
    
    struct huffman_table
    {
      /* Fast look up table, using HUFFMAN_HASH_NBITS bits we can have directly the symbol,
       * if the symbol is <0, then we need to look into the tree table */
      short int lookup[HUFFMAN_HASH_SIZE];
    
      /* code size: give the number of bits of a symbol is encoded */
      unsigned char code_size[HUFFMAN_HASH_SIZE];
      /* some place to store value that is not encoded in the lookup table 
       * FIXME: Calculate if 256 value is enough to store all values
       */
      uint16_t slowtable[16-HUFFMAN_HASH_NBITS][256];
    };
    
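  • struct component: stores the decoding information of one color component for the current 8×8 block.

    /* tinyjpeg-internal.h */
    
    struct component 
    {
      unsigned int Hfactor; // horizontal sampling factor
      unsigned int Vfactor; // vertical sampling factor
      float* Q_table;   // points to the quantization table used by this 8×8 block
      struct huffman_table *AC_table;   // points to the AC Huffman table used by this block
      struct huffman_table *DC_table;   // points to the DC Huffman table used by this block
      short int previous_DC;    // DC coefficient of the previous block
      short int DCT[64];    // array of DCT coefficients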
        
    #if SANITY_CHECK
      unsigned int cid;
    #endif
    };
    
  • struct jdec_private: the JPEG data-stream structure. It stores the image width and height, the data-stream pointer, the Huffman tables and so on, and contains struct huffman_table and struct component.

    /* tinyjpeg-internal.h */
    
    struct jdec_private
    {
      /* Public variables */
      uint8_t *components[COMPONENTS];  /* three pointers, one for each of the Y, U and V components */
      unsigned int width, height;	/* image width and height */
      unsigned int flags;
    
      /* Private variables */
      const unsigned char *stream_begin, *stream_end;
      unsigned int stream_length;
    
      const unsigned char *stream;	/* pointer to the current position in the data stream */
      unsigned int reservoir, nbits_in_reservoir;
    
      struct component component_infos[COMPONENTS];
      float Q_tables[COMPONENTS][64];		/* quantization tables */
      struct huffman_table HTDC[HUFFMAN_TABLES];	/* DC huffman tables */
      struct huffman_table HTAC[HUFFMAN_TABLES];	/* AC huffman tables */
      int default_huffman_table_initialized;
      int restart_interval;
      int restarts_to_go;				/* MCUs left in this restart interval */
      int last_rst_marker_seen;			/* Rst marker is incremented each time */
    
      /* Temp space used after the IDCT to store each components */
      uint8_t Y[64*4], Cr[64], Cb[64];
    
      jmp_buf jump_state;
      /* Internal Pointer use for colorspace conversion, do not modify it !!! */
      uint8_t *plane[COMPONENTS];
    };
    

2. Overall decoding flow

/* Read a JPEG file, decode it, and save the result */
int convert_one_image(const char *infilename, const char *outfilename, int output_format)
{
  FILE *fp;
  unsigned int length_of_file;  // file size
  unsigned int width, height;   // image width and height
  unsigned char *buf;   // buffer
  struct jdec_private *jdec;
  unsigned char *components[3];

  /* Read the JPEG file into the buffer */
  fp = fopen(infilename, "rb");
  if (fp == NULL)
    exitmessage("Cannot open filename\n");
  length_of_file = filesize(fp);
  buf = (unsigned char *)malloc(length_of_file + 4);
  if (buf == NULL)
    exitmessage("Not enough memory for loading file\n");
  fread(buf, length_of_file, 1, fp);
  fclose(fp);

  /* Decompress it */
  jdec = tinyjpeg_init();   // initialize
  if (jdec == NULL)
    exitmessage("Not enough memory to alloc the structure need for decompressing\n");

  /* Parse the JPEG file header */
  if (tinyjpeg_parse_header(jdec, buf, length_of_file)<0)
    exitmessage(tinyjpeg_get_errorstring(jdec));

  /* Get the image width and height */
  tinyjpeg_get_size(jdec, &width, &height);

  snprintf(error_string, sizeof(error_string),"Decoding JPEG image...\n");
  if (tinyjpeg_decode(jdec, output_format) < 0) // decode the actual image data
    exitmessage(tinyjpeg_get_errorstring(jdec));
    exitmessage(tinyjpeg_get_errorstring(jdec));

  /* 
   * Get address for each plane (not only max 3 planes is supported), and
   * depending of the output mode, only some components will be filled 
   * RGB: 1 plane, YUV420P: 3 planes, GREY: 1 plane
   */
  tinyjpeg_get_components(jdec, components);

  /* Save the output file in the requested output format */
  switch (output_format)
   {
    case TINYJPEG_FMT_RGB24:
    case TINYJPEG_FMT_BGR24:
      write_tga(outfilename, output_format, width, height, components);
      break;
    case TINYJPEG_FMT_YUV420P:
      write_yuv(outfilename, width, height, components);
      break;
    case TINYJPEG_FMT_GREY:
      write_pgm(outfilename, width, height, components);
      break;
   }

  /* Only called this if the buffers were allocated by tinyjpeg_decode() */
  tinyjpeg_free(jdec);
  /* else called just free(jdec); */

  free(buf);
  return 0;
}

The core modules of the decoding process are examined below.

3. Parsing the JPEG file header

int tinyjpeg_parse_header(struct jdec_private *priv, const unsigned char *buf, unsigned int size)
{
  int ret;

  /* Identify the file */
  if ((buf[0] != 0xFF) || (buf[1] != SOI))  // a JPEG file must start with the SOI marker, otherwise it is not a valid JPEG file (note: this only records the message, parsing continues)
    snprintf(error_string, sizeof(error_string),"Not a JPG file ?\n");

  priv->stream_begin = buf+2;   // skip the SOI identifier
  priv->stream_length = size-2;
  priv->stream_end = priv->stream_begin + priv->stream_length;

  ret = parse_JFIF(priv, priv->stream_begin);   // start parsing the JPEG segments

  return ret;
}

4. Parsing the marker identifiers

/* trace code omitted */

static int parse_JFIF(struct jdec_private *priv, const unsigned char *stream)
{
  int chuck_len;
  int marker;
  int sos_marker_found = 0;
  int dht_marker_found = 0;
  const unsigned char *next_chunck;

  /* Parse marker */
  while (sos_marker_found == 0)
   {
     if (*stream++ != 0xff)
       goto bogus_jpeg_format;
     /* Skip any padding ff byte (this is normal) */
     while (*stream == 0xff)
       stream++;

     marker = *stream++;    // the byte after 0xFF is the marker identifier
     chuck_len = be16_to_cpu(stream);   // length field
     next_chunck = stream + chuck_len;
     switch (marker)    // dispatch on the marker type
      {
       case SOF:
	 if (parse_SOF(priv, stream) < 0)
	   return -1;
	 break;
       case DQT:
	 if (parse_DQT(priv, stream) < 0)
	   return -1;
	 break;
       case SOS:
	 if (parse_SOS(priv, stream) < 0)
	   return -1;
	 sos_marker_found = 1;
	 break;
       case DHT:
	 if (parse_DHT(priv, stream) < 0)
	   return -1;
	 dht_marker_found = 1;
	 break;
       case DRI:
	 if (parse_DRI(priv, stream) < 0)
	   return -1;
	 break;
       default:
	 break;
      }

     stream = next_chunck;  // move on to the next marker
   }

  if (!dht_marker_found) {
    build_default_huffman_tables(priv);
  }
  return 0;
    
bogus_jpeg_format:
  return -1;
}

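5. Parsing DQT

static int parse_DQT(struct jdec_private *priv, const unsigned char *stream)
{
  int qi;   // quantization table ID
  float *table; // points to the quantization table
  const unsigned char *dqt_block_end;   // points to the end of the DQT segment
  dqt_block_end = stream + be16_to_cpu(stream);
  stream += 2;	// skip the length field

  while (stream < dqt_block_end)	// check whether more quantization tables remain
   {
     qi = *stream++;    // read the Pq/Tq byte; with 8-bit precision (Pq = 0) it equals the table ID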
     table = priv->Q_tables[qi];
     build_quantization_table(table, stream);
     stream += 64;
   }
  return 0;
}

6. Building the quantization table

static void build_quantization_table(float *qtable, const unsigned char *ref_table)
{
  int i, j;
  static const double aanscalefactor[8] = {
     1.0, 1.387039845, 1.306562965, 1.175875602,
     1.0, 0.785694958, 0.541196100, 0.275899379
  };    // scale factors (for the AAN fast IDCT, as the name suggests)
  const unsigned char *zz = zigzag;

  for (i=0; i<8; i++) {
     for (j=0; j<8; j++) {
       *qtable++ = ref_table[*zz++] * aanscalefactor[i] * aanscalefactor[j];
     }
   }
}

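Here the zigzag array implements the zig-zag scan: entry i gives the zig-zag index of the coefficient at row-major position i: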

static const unsigned char zigzag[64] = 
{
   0,  1,  5,  6, 14, 15, 27, 28,
   2,  4,  7, 13, 16, 26, 29, 42,
   3,  8, 12, 17, 25, 30, 41, 43,
   9, 11, 18, 24, 31, 40, 44, 53,
  10, 19, 23, 32, 39, 45, 52, 54,
  20, 22, 33, 38, 46, 51, 55, 60,
  21, 34, 37, 47, 50, 56, 59, 61,
  35, 36, 48, 49, 57, 58, 62, 63
};
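The table does not have to be typed in by hand; it can be regenerated by walking the anti-diagonals of the 8×8 block. The sketch below is an illustration only, and the function name `make_zigzag` is hypothetical.

```c
/* Fill zz[] so that zz[row*8 + col] is the zig-zag index of position
 * (row, col), by walking the anti-diagonals of the 8x8 block. */
void make_zigzag(unsigned char zz[64])
{
    int r = 0, c = 0;
    for (int k = 0; k < 64; k++) {
        zz[r * 8 + c] = (unsigned char)k;
        if ((r + c) % 2 == 0) {          /* even diagonal: move up-right */
            if (c == 7)      r++;        /* hit the right edge */
            else if (r == 0) c++;        /* hit the top edge */
            else { r--; c++; }
        } else {                         /* odd diagonal: move down-left */
            if (r == 7)      c++;        /* hit the bottom edge */
            else if (c == 0) r++;        /* hit the left edge */
            else { r++; c--; }
        }
    }
}
```

The first row it produces is 0, 1, 5, 6, 14, 15, 27, 28, the same as the first row of the table above.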

7. Parsing DHT

static int parse_DHT(struct jdec_private *priv, const unsigned char *stream)
{
  unsigned int count, i;
  unsigned char huff_bits[17];  // number of codes for each code length 1 to 16
  int length, index;

  length = be16_to_cpu(stream) - 2;
  stream += 2;	// skip the length field

  while (length>0) {    // check whether more tables remain
     index = *stream++;

     /* We need to calculate the number of bytes 'vals' will takes */
     huff_bits[0] = 0;
     count = 0;
     for (i=1; i<17; i++) {
	    huff_bits[i] = *stream++;
	    count += huff_bits[i];
     }

     if (index & 0xf0 )
       build_huffman_table(huff_bits, stream, &priv->HTAC[index&0xf]);  // build an AC table
     else
       build_huffman_table(huff_bits, stream, &priv->HTDC[index&0xf]);  // build a DC table

     length -= 1;
     length -= 16;
     length -= count;
     stream += count;
  }
  return 0;
}

8. Building the Huffman table

static void build_huffman_table(const unsigned char *bits, const unsigned char *vals, struct huffman_table *table)  // bits: number of codewords of each length; vals: HUFFVAL; table: the Huffman table to build
{
  unsigned int i, j, code, code_size, val, nbits;
  unsigned char huffsize[HUFFMAN_BITS_SIZE + 1];    // length of each codeword
  unsigned char* hz;
  unsigned int huffcode[HUFFMAN_BITS_SIZE + 1]; // each codeword
  unsigned int* hc;    // must match the element type of huffcode
  int next_free_entry;

  /* Initialization: expand the per-length counts into a list of code lengths */
  hz = huffsize;
  for (i=1; i<=16; i++)
   {
     for (j=1; j<=bits[i]; j++)
       *hz++ = i;
   }
  *hz = 0;

  memset(table->lookup, 0xff, sizeof(table->lookup));
  for (i=0; i<(16-HUFFMAN_HASH_NBITS); i++)
    table->slowtable[i][0] = 0;

  code = 0;
  hc = huffcode;
  hz = huffsize;
  nbits = *hz;
  while (*hz)
   {
     while (*hz == nbits)
      {
	*hc++ = code++;
	hz++;
      }
     code <<= 1;
     nbits++;
   }

  /*
   * Build the lookup table, and the slowtable if needed.
   */
  next_free_entry = -1;
  for (i=0; huffsize[i] != 0; i++)
   {
     /* Get the symbol (HUFFVAL), its codeword, and the codeword length */
     val = vals[i];
     code = huffcode[i];
     code_size = huffsize[i];
     table->code_size[val] = code_size; // code length of this symbol
     if (code_size <= HUFFMAN_HASH_NBITS)
      {
	/*
	 * Good: val can be put in the lookup table, so fill all value of this
	 * column with value val 
	 */
	int repeat = 1UL<<(HUFFMAN_HASH_NBITS - code_size);
	code <<= HUFFMAN_HASH_NBITS - code_size;
	while ( repeat-- )
	  table->lookup[code++] = val;  // every lookup index that starts with this code maps to the symbol
      }
     else
      {
	/* Perhaps sorting the array will be an optimization */
	uint16_t *slowtable = table->slowtable[code_size-HUFFMAN_HASH_NBITS-1];
	while(slowtable[0])
	  slowtable+=2;
	slowtable[0] = code;
	slowtable[1] = val;
	slowtable[2] = 0;
	/* TODO: NEED TO CHECK FOR AN OVERFLOW OF THE TABLE */
      }
   }
}

9. Parsing SOS

static int parse_SOS(struct jdec_private *priv, const unsigned char *stream)
{
  unsigned int i, cid, table;
  unsigned int nr_components = stream[2];   // number of color components

  stream += 3;
  for (i=0;i<nr_components;i++) {
     /* Get the Huffman table numbers used by this component */
     cid = *stream++;
     table = *stream++;
      
     priv->component_infos[i].AC_table = &priv->HTAC[table&0xf];
     priv->component_infos[i].DC_table = &priv->HTDC[table>>4];
  }
  priv->stream = stream+3;
  return 0;
}

10. Parsing SOF

static int parse_SOF(struct jdec_private *priv, const unsigned char *stream)
{
  int i, width, height, nr_components, cid, sampling_factor;
  int Q_table;
  struct component *c;

  print_SOF(stream);

  height = be16_to_cpu(stream+3);   // image height
  width  = be16_to_cpu(stream+5);   // image width
  nr_components = stream[7];    // number of color components

  stream += 8;
  for (i=0; i<nr_components; i++) {
     /* Parse each component in turn */
     cid = *stream++;   // component ID
     sampling_factor = *stream++;   // sampling factors
     Q_table = *stream++;
     c = &priv->component_infos[i];
     c->Vfactor = sampling_factor&0xf;  // vertical sampling factor
     c->Hfactor = sampling_factor>>4;   // horizontal sampling factor
     c->Q_table = priv->Q_tables[Q_table];  // quantization table in use
  }
  priv->width = width;
  priv->height = height;

  return 0;
}

11. Decoding the actual JPEG data

int tinyjpeg_decode(struct jdec_private *priv, int pixfmt)  // pixfmt is the output format
{
  unsigned int x, y, xstride_by_mcu, ystride_by_mcu;
  unsigned int bytes_per_blocklines[3], bytes_per_mcu[3];
  decode_MCU_fct decode_MCU;
  const decode_MCU_fct *decode_mcu_table;
  const convert_colorspace_fct *colorspace_array_conv;
  convert_colorspace_fct convert_to_pixfmt;

  if (setjmp(priv->jump_state))
    return -1;

  /* To keep gcc happy initialize some array */
  bytes_per_mcu[1] = 0;
  bytes_per_mcu[2] = 0;
  bytes_per_blocklines[1] = 0;
  bytes_per_blocklines[2] = 0;

  decode_mcu_table = decode_mcu_3comp_table;
  switch (pixfmt) {
     /* Configure the MCU handling according to the output format */
     case TINYJPEG_FMT_YUV420P:
       colorspace_array_conv = convert_colorspace_yuv420p;
       if (priv->components[0] == NULL)
	 priv->components[0] = (uint8_t *)malloc(priv->width * priv->height);
       if (priv->components[1] == NULL)
	 priv->components[1] = (uint8_t *)malloc(priv->width * priv->height/4);
       if (priv->components[2] == NULL)
	 priv->components[2] = (uint8_t *)malloc(priv->width * priv->height/4);
       bytes_per_blocklines[0] = priv->width;
       bytes_per_blocklines[1] = priv->width/4;
       bytes_per_blocklines[2] = priv->width/4;
       bytes_per_mcu[0] = 8;
       bytes_per_mcu[1] = 4;
       bytes_per_mcu[2] = 4;
       break;

     case TINYJPEG_FMT_RGB24:
       colorspace_array_conv = convert_colorspace_rgb24;
       if (priv->components[0] == NULL)
	 priv->components[0] = (uint8_t *)malloc(priv->width * priv->height * 3);
       bytes_per_blocklines[0] = priv->width * 3;
       bytes_per_mcu[0] = 3*8;
       break;

     case TINYJPEG_FMT_BGR24:
       colorspace_array_conv = convert_colorspace_bgr24;
       if (priv->components[0] == NULL)
	 priv->components[0] = (uint8_t *)malloc(priv->width * priv->height * 3);
       bytes_per_blocklines[0] = priv->width * 3;
       bytes_per_mcu[0] = 3*8;
       break;

     case TINYJPEG_FMT_GREY:
       decode_mcu_table = decode_mcu_1comp_table;
       colorspace_array_conv = convert_colorspace_grey;
       if (priv->components[0] == NULL)
	 priv->components[0] = (uint8_t *)malloc(priv->width * priv->height);
       bytes_per_blocklines[0] = priv->width;
       bytes_per_mcu[0] = 8;
       break;

     default:
       return -1;
  }

  xstride_by_mcu = ystride_by_mcu = 8;  // initial value: the MCU is 8 px wide and 8 px high (4:4:4)
  if ((priv->component_infos[cY].Hfactor | priv->component_infos[cY].Vfactor) == 1) {
     /* horizontal and vertical sampling factors are both 1 */
     decode_MCU = decode_mcu_table[0];  // the MCU contains 1 Y block
     convert_to_pixfmt = colorspace_array_conv[0];
  } else if (priv->component_infos[cY].Hfactor == 1) {
     /* horizontal sampling factor is 1, vertical sampling factor is 2 */
     decode_MCU = decode_mcu_table[1];  // the MCU contains 2 Y blocks
     convert_to_pixfmt = colorspace_array_conv[1];
     ystride_by_mcu = 16;   // MCU is 16 px high, 8 px wide
  } else if (priv->component_infos[cY].Vfactor == 2) {
     /* horizontal and vertical sampling factors are both 2 */
     decode_MCU = decode_mcu_table[3];  // the MCU contains 4 Y blocks
     convert_to_pixfmt = colorspace_array_conv[3];
     xstride_by_mcu = 16;   // MCU is 16 px wide
     ystride_by_mcu = 16;   // MCU is 16 px high
  } else {
     /* horizontal sampling factor is 2, vertical sampling factor is 1 */
     decode_MCU = decode_mcu_table[2];  // the MCU contains 2 Y blocks
     convert_to_pixfmt = colorspace_array_conv[2];
     xstride_by_mcu = 16;   // MCU is 16 px wide, 8 px high
  }

  resync(priv);

  /* Don't forget to that block can be either 8 or 16 lines */
  bytes_per_blocklines[0] *= ystride_by_mcu;
  bytes_per_blocklines[1] *= ystride_by_mcu;
  bytes_per_blocklines[2] *= ystride_by_mcu;

  bytes_per_mcu[0] *= xstride_by_mcu/8;
  bytes_per_mcu[1] *= xstride_by_mcu/8;
  bytes_per_mcu[2] *= xstride_by_mcu/8;

  /* Decode each MCU (8x8 / 8x16 / 16x16) */
  for (y=0; y < priv->height/ystride_by_mcu; y++)
   {
     //trace("Decoding row %d\n", y);
     priv->plane[0] = priv->components[0] + (y * bytes_per_blocklines[0]);
     priv->plane[1] = priv->components[1] + (y * bytes_per_blocklines[1]);
     priv->plane[2] = priv->components[2] + (y * bytes_per_blocklines[2]);
     for (x=0; x < priv->width; x+=xstride_by_mcu)
      {
	decode_MCU(priv);
	convert_to_pixfmt(priv);
	priv->plane[0] += bytes_per_mcu[0];
	priv->plane[1] += bytes_per_mcu[1];
	priv->plane[2] += bytes_per_mcu[2];
	if (priv->restarts_to_go>0)
	 {
	   priv->restarts_to_go--;
	   if (priv->restarts_to_go == 0)
	    {
	      priv->stream -= (priv->nbits_in_reservoir/8);
	      resync(priv);
	      if (find_next_rst_marker(priv) < 0)
		return -1;
	    }
	 }
      }
   }

  return 0;
}

12. Decoding an MCU

Take decode_MCU_2x2_3planes() as an example:

/*
 * Decode a 2x2
 *  .-------.
 *  | 1 | 2 |
 *  |---+---|
 *  | 3 | 4 |
 *  `-------'
 */
static void decode_MCU_2x2_3planes(struct jdec_private *priv)
{
  // Y
  process_Huffman_data_unit(priv, cY);
  IDCT(&priv->component_infos[cY], priv->Y, 16);
  process_Huffman_data_unit(priv, cY);
  IDCT(&priv->component_infos[cY], priv->Y+8, 16);
  process_Huffman_data_unit(priv, cY);
  IDCT(&priv->component_infos[cY], priv->Y+64*2, 16);
  process_Huffman_data_unit(priv, cY);
  IDCT(&priv->component_infos[cY], priv->Y+64*2+8, 16);

  // Cb
  process_Huffman_data_unit(priv, cCb);
  IDCT(&priv->component_infos[cCb], priv->Cb, 8);

  // Cr
  process_Huffman_data_unit(priv, cCr);
  IDCT(&priv->component_infos[cCr], priv->Cr, 8);
}

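IV. Debugging the Program and Results

The command-line arguments of the program are set as follows:

--benchmark input_filename output_format output_filename

where:

  • the first argument may be omitted
  • the input file name includes the .jpeg/.jpg extension; the output file name has no extension
  • output format: e.g. yuv420p

1. Saving the output as a .yuv file

Add code to write_yuv() that also writes a single .yuv file: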

static void write_yuv(const char* filename, int width, int height, unsigned char** components) {
    FILE* F;
    char temp[1024];

    snprintf(temp, 1024, "%s.Y", filename);
    F = fopen(temp, "wb");
    fwrite(components[0], width, height, F);
    fclose(F);
    snprintf(temp, 1024, "%s.U", filename);
    F = fopen(temp, "wb");
    fwrite(components[1], width * height / 4, 1, F);
    fclose(F);
    snprintf(temp, 1024, "%s.V", filename);
    F = fopen(temp, "wb");
    fwrite(components[2], width * height / 4, 1, F);
    fclose(F);

    snprintf(temp, 1024, "%s.YUV", filename);
    F = fopen(temp, "wb");
    fwrite(components[0], width, height, F);
    fwrite(components[1], width * height / 4, 1, F);
    fwrite(components[2], width * height / 4, 1, F);
    fclose(F);
}

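The result is shown below:


Figure 4-1: The output yuv file

2. Debugging with TRACE

The program already contains TRACE-related preprocessor blocks (omitted from the code analysis above for brevity):

#if TRACE
	/* TRACE */
#endif

In tinyjpeg.h, TRACE is already enabled (set to 1):

#define TRACE 1	// set to 0 to disable TRACE
#define TRACEFILE "trace_jpeg.txt"	// name of the TRACE file

The TRACE file records many intermediate variables and parsing results (such as the Huffman tables) from the run, which helps with debugging and is often far more efficient than setting breakpoints. After running the program, the resulting trace_jpeg.txt looks like this:


Figure 4-2: TRACE file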

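3. Outputting the quantization matrices and Huffman tables

/* Added to tinyjpeg.h */
/* Global variable declaration by S.Z.Zheng */
    FILE* qtabFilePtr;    // file pointer for the quantization table file
/* End of declaration */


/* Added to tinyjpeg.c */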
static void build_quantization_table(float *qtable, const unsigned char *ref_table)
{
  ...
  for (i=0; i<8; i++) {
      for (j=0; j<8; j++) {
          /* Added by S.Z.Zheng */
          fprintf(qtabFilePtr, "%-6d", ref_table[*zz]);
          if (j == 7) {
              fprintf(qtabFilePtr, "\n");
          }
          /* Addition ended */
		 ...
      }
  }
  fprintf(qtabFilePtr, "\n\n"); // Added by S.Z.Zheng
}

static int parse_DQT(struct jdec_private *priv, const unsigned char *stream)
{
    ...
    while (stream < dqt_block_end)    // check whether more quantization tables remain
    {
        ...
        fprintf(qtabFilePtr, "Quantisation table [%d]:\n", qi);   // quantization table ID (added by S.Z.Zheng)
        build_quantization_table(table, stream);
        ...
    }
	...
}


/* Added to loadjpeg.c */
int main(int argc, char *argv[])
{
  ...
  /* Added by S.Z.Zheng */
  const char* qtabFileName = "q_table.txt"; // name of the quantization table file
  fopen_s(&qtabFilePtr, qtabFileName, "wb");    // open the file
  /* Addition ended */
  ...
  fclose(qtabFilePtr);  // Added by S.Z.Zheng
  return 0;
}

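The output quantization tables are shown below (quantization tables are assigned per component rather than per DC/AC; table 0 is typically the luminance table and table 1 the chrominance table):


Figure 4-3: Quantization tables

The Huffman tables are already included in the TRACE file.

4. Outputting DC and AC images

/* Added to tinyjpeg.h */
/* Global variable declaration by S.Z.Zheng */
		...
    FILE* dcImgFilePtr; // file pointer for the DC image
    FILE* acImgFilePtr; // file pointer for the AC image
/* End of declaration */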

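/* Added to tinyjpeg.c */
int tinyjpeg_decode(struct jdec_private* priv, int pixfmt)
{
		...
    /* Added by S.Z.Zheng */
    unsigned char dcImgBuff;    // one 8-bit sample of the DC image (a value, not a pointer)
    unsigned char acImgBuff;    // one 8-bit sample of the AC image (a value, not a pointer)
    unsigned char uvBuff = 128; // neutral chroma value for the U and V planes
    int count = 0;
    /* Addition ended*/

    /* Decode each MCU (8x8 / 8x16 / 16x16) */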
    for (y = 0; y < priv->height / ystride_by_mcu; y++) {
        ...
        for (x = 0; x < priv->width; x += xstride_by_mcu) {
            decode_MCU(priv);

            dcImgBuff = (unsigned char)((priv->component_infos->DCT[0] + 512.0) / 4 + 0.5);  // DCT[0] is the DC coefficient; its range is taken as -512 to 512 and mapped to 0~255
            acImgBuff = (unsigned char)(priv->component_infos->DCT[1] + 128);   // DCT[1] is chosen as the AC observation; +128 makes it easier to view
            fwrite(&dcImgBuff, 1, 1, dcImgFilePtr);
            fwrite(&acImgBuff, 1, 1, acImgFilePtr);
            count++;
						...
                }
            }
        }
    }
		...
    /* Added by S.Z.Zheng */
    for (int i = 0; i < count / 4 * 2; i++) {
        fwrite(&uvBuff, sizeof(unsigned char), 1, dcImgFilePtr);
        fwrite(&uvBuff, sizeof(unsigned char), 1, acImgFilePtr);
    }
    /* Addition ended */
    return 0;
}

/* Added to loadjpeg.c */
int main(int argc, char *argv[]) {
	...
  /* Added by S.Z.Zheng */
  ...
  const char* dcImgFileName = "test_decoded_dc.yuv";    // DC image file name
  const char* acImgFileName = "test_decoded_ac.yuv";    // AC image file name
  ...
  fopen_s(&dcImgFilePtr, dcImgFileName, "wb");    // open the DC image file
  fopen_s(&acImgFilePtr, acImgFileName, "wb");    // open the AC image file
  /* Addition ended */
	...
  /* Added by S.Z.Zheng */
  ...
  fclose(dcImgFilePtr);
  fclose(acImgFilePtr);
  /* Addition Ended */

  return 0;
}

The result is shown below:


Figure 4-4: DC image and AC image

Here the AC coefficient observed is DCT[1], which keeps the image reasonably clear.
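One detail of the mapping is worth a second look: with the assumed range of -512 to 512, the upper end gives (512 + 512) / 4 + 0.5 = 256.5, which does not fit in an unsigned char and wraps around. A hedged variant with explicit clamping might look like this. The function names are hypothetical, and the input range is the one assumed in the code comments above.

```c
/* Map a DC coefficient, assumed to lie in [-512, 512], to an 8-bit gray
 * level, clamping so that out-of-range results cannot wrap around. */
unsigned char dc_to_gray(int dc)
{
    int v = (int)((dc + 512.0) / 4 + 0.5);
    if (v < 0)   v = 0;
    if (v > 255) v = 255;
    return (unsigned char)v;
}

/* Offset an AC coefficient by 128 for viewing, with the same clamping. */
unsigned char ac_to_gray(int ac)
{
    int v = ac + 128;
    if (v < 0)   v = 0;
    if (v > 255) v = 255;
    return (unsigned char)v;
}
```

With clamping, a DC value of 512 maps to 255 rather than wrapping to 0, and large AC values saturate instead of producing isolated dark or bright speckles.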

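5. Computing the pmf of the DC and AC images

The program from the post "Computing the Probability Distribution and Entropy of Each Component of an RGB Image" is used to compute the probability distribution of the levels and plot it:


Figure 4-5: pmf of the DC and AC images

References


  1. Principles of Modern Television (《现代电视原理》)

  2. JPEG - Wikipedia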
