Image Processing Algorithms: Flow Graph of a Fast 1-D DCT Algorithm

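Image processing, and the use of its results, occupies an important place in computer information processing. Progress in image processing depends on advances in processor chips (microcontrollers, DSPs, FPGAs and so on) and on the availability of large, inexpensive memories. Although image processing systems have shrunk from bulky chassis-based designs to compact plug-in cards, the sheer volume of image data means that they still generally fall short, in both speed and capacity, in most applications that require real-time processing.

An image processing system can be implemented in many ways: in software on a general-purpose computer, on a microcontroller, or on a special-purpose DSP chip. Each has drawbacks. Software is too slow for real-time systems; a microcontroller uses a von Neumann bus architecture and multiplies too slowly; a special-purpose DSP chip lacks flexibility, and its development tools are not very mature. The system described here is therefore built around an FPGA. The FPGA's high-speed processing handles most of the image processing work, while the computer only assists with operation and storage. This approach exploits the FPGA's speed while retaining considerable flexibility, and the development tools are fairly mature.

In addition, a defining feature of video data, and one of its biggest differences from audio data, is its sheer volume. Processing such volumes in real time requires not only a fast CPU but also large, expandable storage. In a software implementation on a general-purpose computer the storage is fixed by the computer's own memory, which limits expansion; a microcontroller likewise has limited room for expansion. Neither can meet the capacity requirements of video data.

Image processing applications are increasingly moving towards miniaturised systems, with the whole processing chain integrated into a small "black box" or even onto a single circuit board. This calls for fast, highly integrated processing chips that can take over tasks that used to need many cooperating components. An FPGA can act as the central processing element, combining many functions in one device and using external expansion for more complex ones, which makes a compact system possible [1].

This design uses Quartus II as its design platform. Quartus II is the integrated FPGA/CPLD development environment provided by Altera. It covers design entry, synthesis or compilation, fitting, simulation and device programming, giving users a versatile and efficient FPGA development environment, and it has become one of the mainstream FPGA design tools.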

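2. Basic principles

2.1 Digital image compression

As digital technology develops, the advantages of digital signal processing for images become increasingly apparent, but digitisation greatly widens the required bandwidth. One channel of ordinary television video, once digitised, has a bit rate as high as 167 Mbps; it demands a large amount of memory and occupies a bandwidth of about 80 MHz, which would rob digital techniques of their practical value. Digital image compression is needed to solve this problem, and compression coding is one of the key technologies that make digital image signals practical. Table 2-1 lists the bit rates of various applications [4].

Table 2-1 Bit rates of various applications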

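2.2 Basic principles of the DCT/IDCT

Because neighbouring pixels in an image are highly correlated, image signals can be compressed with digital signal processing techniques. Typical image transforms include the DFT, the DCT, the Hadamard transform, the Haar transform, the Slant transform and the K-L transform; of these, the discrete cosine transform (DCT) is one of the most widely used. The DCT decorrelates a signal better than the discrete Fourier transform (DFT), and for data whose covariance matrix is a Toeplitz matrix it comes closer to the optimal transform. Since Ahmed, Natarajan and Rao introduced the discrete cosine transform in 1974, it has been applied throughout digital signal processing, and in image compression in particular. Current image compression standards such as JPEG, H.261, MPEG-1, MPEG-2 and the more recent H.263 all base their video coding on the combination of DCT, motion-compensated prediction and VLC coding [2].

The DCT is a form of transform coding. After the original image has been transformed, coefficients close to zero are discarded, small coefficients are coarsely quantized, and the coefficients that carry the main information of the image are kept for compression coding. The quantization and channel errors introduced by transform coding are spread over all pixels of the reconstructed image like random noise and do not accumulate. In practical image compression, the block size of the DCT depends on the required image quality, the compression ratio and the implementation complexity. A higher compression ratio calls for a larger block, whereas lower computational complexity calls for a smaller one. In practice an 8×8 block is the usual compromise, for which the DCT and IDCT can be written out as follows: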

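DCT:

$$F(u,v)=\frac{1}{4}C(u)C(v)\sum_{i=0}^{7}\sum_{j=0}^{7}f(i,j)\cos\frac{(2i+1)u\pi}{16}\cos\frac{(2j+1)v\pi}{16}\qquad(0\le u\le 7,\ 0\le v\le 7)\qquad\text{(2-1)}$$

IDCT:

$$f(i,j)=\frac{1}{4}\sum_{u=0}^{7}\sum_{v=0}^{7}C(u)C(v)F(u,v)\cos\frac{(2i+1)u\pi}{16}\cos\frac{(2j+1)v\pi}{16}\qquad(0\le i\le 7,\ 0\le j\le 7)\qquad\text{(2-2)}$$

In the DCT and IDCT expressions, f(i,j) is the pixel value of the image, F(u,v) is the coefficient value after the DCT, and

$$C(k)=\begin{cases}1/\sqrt{2}, & k=0\\ 1, & k=1,\dots,7\end{cases}$$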

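Strictly speaking, the DCT itself does not reduce the bit rate, because 64 samples still produce 64 coefficients, as shown in Figure 2-1. For a concrete 8×8 block, the number of bits actually increases after the DCT. In this example the samples are 8 bits wide, ranging from 0 to 255; the DC component can reach 16 times the original range of 256, that is 0 to 4095, and the AC components range from -2048 to 2047. Compression is achieved only after quantization, in particular when different quantization steps are applied to the low-frequency and high-frequency components according to the characteristics of human vision, so that most high-frequency coefficients become zero. In general the human eye is sensitive to low-frequency components and much less sensitive to high-frequency ones. Low-frequency components are therefore quantized finely and high-frequency components coarsely [4].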
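The point that the transform alone maps 64 samples to 64 coefficients without losing anything can be checked numerically. The following is a minimal Python sketch, assuming the usual 8×8 DCT/IDCT pair with the 1/4·C(u)·C(v) normalisation; the helper names (`C`, `basis`, `dct2`, `idct2`) and the test block are illustrative and not part of the original design.

```python
import math

N = 8  # block size

def C(k):
    """Normalisation factor: 1/sqrt(2) for k = 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def basis(n, k):
    """cos((2n+1)k*pi/16), the kernel shared by the DCT and the IDCT."""
    return math.cos((2 * n + 1) * k * math.pi / 16)

def dct2(f):
    """Direct evaluation of the 8x8 forward DCT."""
    return [[0.25 * C(u) * C(v) * sum(f[i][j] * basis(i, u) * basis(j, v)
                                     for i in range(N) for j in range(N))
             for v in range(N)] for u in range(N)]

def idct2(F):
    """Direct evaluation of the 8x8 inverse DCT."""
    return [[0.25 * sum(C(u) * C(v) * F[u][v] * basis(i, u) * basis(j, v)
                        for u in range(N) for v in range(N))
             for j in range(N)] for i in range(N)]

# An arbitrary 8-bit test block (hypothetical data).
block = [[(13 * i + 7 * j) % 256 for j in range(N)] for i in range(N)]
coeffs = dct2(block)
restored = idct2(coeffs)
max_error = max(abs(restored[i][j] - block[i][j])
                for i in range(N) for j in range(N))
```

Without quantization the round trip reproduces the block up to floating-point error, which is why the bit-rate reduction has to come from the quantization step described above.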

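Figure 2-1 Overall flow of the algorithm

The DCT encoding process is shown in Figure 2-2. The input image is first divided into 8×8 sample blocks, and each block is converted by the forward DCT (FDCT) into a set of DCT coefficients. Because each block contains 8×8 = 64 samples, the transform yields 64 DCT coefficients. The first is the leading coefficient of the block's coefficient sequence (the DC coefficient), and the remaining 63 are the AC coefficients. Each coefficient is quantized with the corresponding entry of the quantization table ("table specification 1" in Figure 2-2).

After quantization the DC coefficient is coded differentially: the quantized DC coefficient of the previous block serves as the prediction for the current block, and only the difference between the two is coded, as shown in Figure 2-3. The AC coefficients are rearranged into a one-dimensional zig-zag sequence (Figure 2-4) and compressed further, usually with lossless entropy coding (Huffman or arithmetic coding). Which coding is used is indicated by the condition table ("table specification 2" in Figure 2-2) [2].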
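The two reordering steps just described, DC differencing and the zig-zag scan, can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the scan follows the conventional JPEG zig-zag pattern, and the initial DC prediction of 0 is an assumption.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):                     # s = row + col of one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        diagonal = [(r, s - r) for r in rows]      # listed from top-right to bottom-left
        # Even diagonals are walked upwards, odd diagonals downwards.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order

def dc_differences(dc_values, initial_prediction=0):
    """Differential coding: each DC value is predicted by the previous block's DC."""
    diffs, previous = [], initial_prediction
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs

scan = zigzag_order()
diffs = dc_differences([50, 52, 49])
```

The scan visits low-frequency positions first, so the zeros produced by quantizing the high-frequency coefficients end up in long runs at the tail of the sequence, which is what makes the subsequent entropy coding effective.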

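Figure 2-2 Block diagram of the DCT encoding system

The remaining figures are shown below:

Figure 2-3 DC differential coding

Figure 2-4 Zig-zag order

Each step of DCT decoding is the inverse of the corresponding encoding step, as shown in Figure 2-5. The entropy decoder recovers the zig-zag sequence of quantized DCT coefficients; after dequantization, the inverse discrete cosine transform (IDCT) reconstructs each 8×8 image block, and the blocks are finally assembled into the complete image. Although DCT coding is lossy, the reconstructed image is visually acceptable as long as enough coefficients are kept (for example, more than 16). This may be why the international still-image compression standard JPEG adopted DCT-based coding [2].

Figure 2-5 Block diagram of the DCT decoding system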

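3. About Quartus II

The Altera Quartus II design software provides a complete, multi-platform design environment that can be tailored to the needs of a specific design, including comprehensive support for system-on-a-programmable-chip (SOPC) design. Quartus II contains solutions for every stage of FPGA and CPLD design.

The Quartus II graphical user interface is shown in Figure 3-1 [3].

Figure 3-1 Quartus II user interface

The Quartus II design flow is shown in Figure 3-2.

Figure 3-2 Quartus II design flow

For each stage of the design flow, Quartus II offers its own graphical user interface, an EDA tool interface and a command-line interface. A single interface can be used throughout, or different interfaces can be used at different stages.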

4. FPGA design of the 2-D DCT algorithm

4.1 Principle of the fast 2-D DCT algorithm

4.1.1 Decomposition of the 2-D DCT

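The 2-D DCT is derived from the 1-D DCT and can be decomposed into two independent 1-D DCTs.

Let

$$G(u,j)=\frac{1}{2}C(u)\sum_{i=0}^{7}f(i,j)\cos\frac{(2i+1)u\pi}{16}\qquad\text{(4-1)}$$

where 0≤u≤7, 0≤j≤7, C(0)=1/√2 and C(k)=1 for k≠0.

Then Eq. (2-1) can be rewritten as

$$F(u,v)=\frac{1}{2}C(v)\sum_{j=0}^{7}G(u,j)\cos\frac{(2j+1)v\pi}{16}\qquad\text{(4-2)}$$

where 0≤u≤7, 0≤v≤7.

These two equations are two independent 1-D DCTs: Eq. (4-1) is the column transform and Eq. (4-2) is the row transform. When the DCT is implemented on an FPGA, the image sample matrix is therefore transformed column by column first, and then row by row.

In the column transform, the element in row x, column y of G is the multiply-accumulate of column y of f with the coefficients (1/2)C(x)cos[(2i+1)xπ/16] (i = 0, …, 7).

Similarly, in the row transform, the element in row x, column y of F is the multiply-accumulate of row x of G with the coefficients (1/2)C(y)cos[(2j+1)yπ/16] (j = 0, …, 7).

The 2-D DCT can therefore be computed by applying a column-wise 1-D DCT to f to obtain G, transposing G to obtain G^T, and finally applying the column-wise 1-D DCT once more to G^T. The structure of the algorithm is shown in Figure 4-1 [5].

Figure 4-1 Structure of the 2D-DCT algorithm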
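The column transform, transpose, column transform structure can be checked against the direct double sum. The following Python sketch assumes the orthonormal 8-point 1-D DCT with a factor of 1/2 per dimension; all function names are illustrative and the test data are arbitrary.

```python
import math

N = 8

def kernel(n, k):
    """(1/2) C(k) cos((2n+1)k*pi/16) with C(0) = 1/sqrt(2), C(k) = 1 otherwise."""
    scale = 1 / math.sqrt(2) if k == 0 else 1.0
    return 0.5 * scale * math.cos((2 * n + 1) * k * math.pi / 16)

def dct1(x):
    """8-point 1-D DCT of a list of samples."""
    return [sum(x[n] * kernel(n, k) for n in range(N)) for k in range(N)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct_columns(m):
    """Apply the 1-D DCT to every column of m."""
    return transpose([dct1(column) for column in transpose(m)])

def dct2_separable(f):
    g = dct_columns(f)               # column transform: G
    h = dct_columns(transpose(g))    # column transform of G^T, which equals F^T
    return transpose(h)              # transpose back to obtain F

def dct2_direct(f):
    """Reference: the 2-D double sum evaluated directly."""
    return [[sum(f[i][j] * kernel(i, u) * kernel(j, v)
                 for i in range(N) for j in range(N))
             for v in range(N)] for u in range(N)]

f = [[(31 * i + 17 * j + i * j) % 256 for j in range(N)] for i in range(N)]
a, b = dct2_separable(f), dct2_direct(f)
max_diff = max(abs(a[u][v] - b[u][v]) for u in range(N) for v in range(N))
```

Note that the second column transform yields the transposed result, so a consumer of the hardware output either transposes once more or reads the coefficients in transposed order.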

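4.1.2 Fast algorithm for the 1-D DCT

Section 4.1.1 shows that the DCT is a multiply-accumulate of the input matrix with a set of DCT coefficients. For convenience these coefficients can be collected into a matrix C, as shown in Figure 4-2, where c(k) denotes the DCT coefficient derived from cos(kπ/16). By the periodicity and symmetry of the cosine function, cos x = cos(2nπ ± x) = -cos[(2n+1)π ± x], the entries of the matrix C in Figure 4-2 can be simplified to those in Figure 4-3. The 1-D DCT Y = CX then expands as shown in Figure 4-4 [5].

Figure 4-2 Coefficient matrix

Figure 4-3 Coefficient matrix after simplification

Figure 4-4 1-D DCT computation

Figure 4-3 shows that the coefficient matrix is highly symmetric: in rows 1, 3, 5 and 7 the left four columns mirror the right four columns with equal values, while in rows 2, 4, 6 and 8 the left four columns mirror the right four columns with opposite signs. The matrix product in Figure 4-4 can therefore be simplified further, as shown in Figure 4-5.

Figure 4-5 1-D DCT computation after simplification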

4.2 Multiplier design

4.2.1 Fast multiplication algorithm

From Figure 4-5:

y0=c0×t0+c0×t2+c0×t4+c0×t6 (4-3)

y2=c2×t0+c6×t2-c6×t4-c2×t6 (4-4)

y4=c4×t0-c4×t2-c4×t4+c4×t6 (4-5)

y6=c6×t0-c2×t2+c2×t4-c6×t6 (4-6)

y1=c1×t1+c3×t3+c5×t5+c7×t7 (4-7)

y3=c3×t1-c7×t3-c1×t5-c5×t7 (4-8)

y5=c5×t1-c1×t3+c7×t5+c3×t7 (4-9)

y7=c7×t1-c5×t3+c3×t5-c1×t7 (4-10)

where t0=x0+x7, t2=x1+x6, t4=x2+x5, t6=x3+x4, t1=x0-x7, t3=x1-x6, t5=x2-x5, t7=x3-x4.
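The even/odd decomposition can be verified against the direct 8-point sum. In the Python sketch below the even outputs use the sums t0, t2, t4, t6 and the odd outputs use the differences t1, t3, t5, t7 (with t5 = x2 - x5), which is also how the lookup-table entities in the appendix are wired. The floating-point coefficients c[k] = cos(kπ/16)/2, with c[0] set equal to c[4], are an assumption chosen to match the orthonormal DCT; the function names are illustrative.

```python
import math

# c[k] = cos(k*pi/16) / 2; c[0] is taken equal to c[4] = cos(pi/4) / 2 (assumed scaling).
c = [math.cos(math.pi / 4) / 2] + [math.cos(k * math.pi / 16) / 2 for k in range(1, 8)]

def dct1_fast(x):
    """8-point DCT using the butterfly sums and differences."""
    t0, t2, t4, t6 = x[0] + x[7], x[1] + x[6], x[2] + x[5], x[3] + x[4]
    t1, t3, t5, t7 = x[0] - x[7], x[1] - x[6], x[2] - x[5], x[3] - x[4]
    y = [0.0] * 8
    y[0] = c[0] * (t0 + t2 + t4 + t6)
    y[2] = c[2] * t0 + c[6] * t2 - c[6] * t4 - c[2] * t6
    y[4] = c[4] * (t0 - t2 - t4 + t6)
    y[6] = c[6] * t0 - c[2] * t2 + c[2] * t4 - c[6] * t6
    y[1] = c[1] * t1 + c[3] * t3 + c[5] * t5 + c[7] * t7
    y[3] = c[3] * t1 - c[7] * t3 - c[1] * t5 - c[5] * t7
    y[5] = c[5] * t1 - c[1] * t3 + c[7] * t5 + c[3] * t7
    y[7] = c[7] * t1 - c[5] * t3 + c[3] * t5 - c[1] * t7
    return y

def dct1_direct(x):
    """Reference: orthonormal 8-point DCT evaluated from its definition."""
    return [0.5 * (1 / math.sqrt(2) if k == 0 else 1.0)
            * sum(x[n] * math.cos((2 * n + 1) * k * math.pi / 16) for n in range(8))
            for k in range(8)]

x = [12, 200, 37, 90, 255, 0, 64, 128]
max_diff = max(abs(p - q) for p, q in zip(dct1_fast(x), dct1_direct(x)))
```

Each output now needs four multiplications instead of eight, at the cost of eight additions or subtractions shared by all outputs.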

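The core of the 1-D DCT is therefore the multiplier.

Because the DCT coefficients in the product terms are known constants, the multiplications are carried out with lookup tables followed by shift-and-add. This distributed arithmetic (DA) approach keeps the hardware simple, uses relatively few resources and, most importantly, needs few clock cycles.

A concrete example:

Let Y=x0×5+x1×3 with x0=1 and x1=2. In binary, x0=0001, x1=0010, 5=0101 and 3=0011. First build a table of all possible sums of 5 and 3, addressed by the two-bit word formed from one bit of x1 followed by the same bit of x0:

"00": 0000 (0×3+0×5)
"01": 0101 (0×3+1×5)
"10": 0011 (1×3+0×5)
"11": 1000 (1×3+1×5)

The word formed from the first (least significant) bits of x1 and x0 is "01", which selects 0101; the second bits give "10", which selects 0011; the third and fourth bits give "00", which selects 0000. Each table output is shifted left by its bit position and the results are added:

   0101
  0011
 0000
0000
-------
0001011

The result 0001011 is 11 in decimal, which agrees with the direct calculation y=1×5+2×3=11.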
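The worked example can be reproduced with a small software model of the table lookup and shift-and-add. This Python sketch handles unsigned operands only; the names `build_lut` and `da_dot` are illustrative, and the model is a sketch of the technique rather than a description of the VHDL implementation.

```python
def build_lut(coeffs):
    """LUT[address] = sum of the coefficients whose address bit is set.

    Bit i of the address corresponds to operand i, so for coeffs = [5, 3]
    the address is (bit of x1, bit of x0), as in the example above.
    """
    return [sum(coeff for i, coeff in enumerate(coeffs) if (address >> i) & 1)
            for address in range(1 << len(coeffs))]

def da_dot(xs, coeffs, width=4):
    """Unsigned distributed-arithmetic dot product sum(x_i * coeff_i)."""
    lut = build_lut(coeffs)
    acc = 0
    for bit in range(width):                       # one table lookup per operand bit
        address = sum(((x >> bit) & 1) << i for i, x in enumerate(xs))
        acc += lut[address] << bit                 # weight the table output by 2**bit
    return acc

lut = build_lut([5, 3])
y = da_dot([1, 2], [5, 3])
```

The number of lookups depends only on the operand width, not on the number of operands, which is why the approach suits constant-coefficient sums such as Eqs. (4-3) to (4-10).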

The multiplier in this design follows the idea described above. Its lookup tables contain the following data:

［y0］

“0000” :00000000 “0001” :00010110 c0

“0010” :00010110 c0 “0011” :00101100 c0+c0

“0100” :00010110 c0 “0101” :00101100 c0+c0

“0110” :00101100 c0+c0 “0111” :01000010 c0+c0+c0

“1000” :00010110 c0 “1001” :00101100 c0+c0

“1010” :00101100 c0+c0 “1011” :01000010 c0+c0+c0

“1100” :00101100 c0+c0 “1101” :01000010 c0+c0+c0

“1110” :01000010 c0+c0+c0 “1111” :01011000 c0+c0+c0+c0

［y2］

“0000” :00000000 “0001” :00011101 c2

“0010” :00001100 c6 “0011” :00101001 c2+c6

“0100” :11110100 -c6 “0101” :00010001 c2-c6

“0110” :00000000 c6-c6 “0111” :00011101 c2+c6-c6=c2

“1000” :11100011 -c2 “1001” :00000000 c2-c2

“1010” :11101111 c6-c2 “1011” :00001100 c2+c6-c2=c6

“1100” :11010111 -c6-c2 “1101” :11110100 c2-c6-c2=-c6

“1110” :11100011 c6-c6-c2=-c2 “1111” :00000000 c2+c6-c6-c2

The tables [y4], [y6], [y1], [y3], [y5] and [y7] are constructed in the same way.
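The table contents can be regenerated programmatically. The entries listed for [y0] and [y2] are consistent with the coefficients having been scaled by 64 and truncated to integers (c0 = c4 = 22, c1 = 31, c2 = 29, c3 = 26, c5 = 17, c6 = 12, c7 = 6); this scaling is inferred from the table values and is not stated in the text. The Python sketch below builds 16-entry tables of 8-bit two's-complement strings, with the address bits ordered t6 t4 t2 t0 (or t7 t5 t3 t1) from most to least significant; the names are illustrative.

```python
# Integer coefficients c0..c7, assumed to be floor(64 * cos(k*pi/16) / 2) with c0 = c4.
Q = [22, 31, 29, 26, 22, 17, 12, 6]

def make_table(weights):
    """Return 16 entries as 8-bit two's-complement bit strings.

    weights[0] belongs to the least significant address bit (t0 or t1),
    weights[3] to the most significant one (t6 or t7).
    """
    table = []
    for address in range(16):
        total = sum(w for bit, w in enumerate(weights) if (address >> bit) & 1)
        table.append(format(total & 0xFF, "08b"))   # wrap negatives into 8 bits
    return table

TABLE_Y0 = make_table([Q[0], Q[0], Q[0], Q[0]])     # y0 = c0 (t0 + t2 + t4 + t6)
TABLE_Y2 = make_table([Q[2], Q[6], -Q[6], -Q[2]])   # y2 = c2 t0 + c6 t2 - c6 t4 - c2 t6
TABLE_Y4 = make_table([Q[4], -Q[4], -Q[4], Q[4]])   # y4 = c4 (t0 - t2 - t4 + t6)
```

Generating the tables from the signed weights avoids hand-transcription slips, for instance a missing minus sign on a single-term entry.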

4.2.2 VHDL code for the multiplication lookup tables

The listings are given in the appendix:

Table_y0.vhd -- y0=c0×t0+c0×t2+c0×t4+c0×t6

Table_y1.vhd -- y1=c1×t1+c3×t3+c5×t5+c7×t7

Table_y2.vhd -- y2=c2×t0+c6×t2-c6×t4-c2×t6

Table_y3.vhd -- y3=c3×t1-c7×t3-c1×t5-c5×t7

Table_y4.vhd -- y4=c4×t0-c4×t2-c4×t4+c4×t6

Table_y5.vhd -- y5=c5×t1-c1×t3+c7×t5+c3×t7

Table_y6.vhd -- y6=c6×t0-c2×t2+c2×t4-c6×t6

Table_y7.vhd -- y7=c7×t1-c5×t3+c3×t5-c1×t7

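4.2.3 Mapping the lookup tables onto hardware

The pins of the top-level file of one of the lookup tables are shown in Figure 4-6.

Figure 4-6 Pin diagram of table_y0

The RTL view of the lookup table is shown in Figure 4-7.

Figure 4-7 RTL view of the lookup table

The RTL view shows that the lookup table is implemented as a multiplexer (MUX). With the table values assigned to the DATA inputs of the multiplexer, a lookup completes quickly.

4.2.4 Simulation of the lookup tables

The simulation waveform is shown in Figure 4-8.

Figure 4-8 Simulation of the multiplication lookup table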

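4.3 Design of the 1-D DCT

4.3.1 Overall flow of the 1-D DCT

As described in Section 4.1.2, the input data are first preprocessed (convert), that is, t0=x0+x7, t2=x1+x6 and so on. One bit of each preprocessed value is then used to address the lookup tables, and the table output is shifted and accumulated. Repeating this for every input bit yields the result of the 8-bit × 12-bit multiplication. The overall flow is shown in Figure 4-9.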
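Because the preprocessed values can be negative, the bit-serial loop has to treat the sign bit specially: for two's-complement operands the table output selected by the most significant bit is subtracted rather than added. The Python sketch below models this for 12-bit signed operands. It keeps full precision by shifting each table output left, whereas the one_d_dct listing in the appendix shifts the accumulator right on every cycle and so truncates low-order bits; the function name and test values are illustrative.

```python
def da_signed(ts, weights, width=12):
    """Distributed-arithmetic evaluation of sum(weights[i] * ts[i]).

    ts are signed integers that fit in `width`-bit two's complement.
    """
    lut = [sum(w for i, w in enumerate(weights) if (address >> i) & 1)
           for address in range(1 << len(weights))]
    acc = 0
    for bit in range(width):
        # Python's >> is arithmetic, so (t >> bit) & 1 is the two's-complement bit.
        address = sum(((t >> bit) & 1) << i for i, t in enumerate(ts))
        term = lut[address] << bit
        if bit == width - 1:
            acc -= term        # the sign bit carries weight -2**(width-1)
        else:
            acc += term
    return acc

weights_y2 = [29, 12, -12, -29]           # integer c2, c6, -c6, -c2 (assumed scaling by 64)
ts = [100, -37, 2047, -2048]              # t0, t2, t4, t6 as 12-bit signed values
result = da_signed(ts, weights_y2)
expected = sum(w * t for w, t in zip(weights_y2, ts))
```

Twelve lookups and accumulations therefore replace four 12-bit multiplications per output, which matches the 12-step counter used in the VHDL process.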

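Figure 4-9 Overall flow of the 1-D DCT

4.3.2 VHDL code for the 1-D DCT

The listings are given in the appendix:

Convert.vhd -- t0=x0+x7, t2=x1+x6, t4=x2+x5, t6=x3+x4, t1=x0-x7, t3=x1-x6, t5=x2-x5, t7=x3-x4

One_d_dct.vhd -- 1-D discrete cosine transform

4.3.3 Mapping the 1-D DCT onto hardware

The pins of the top-level 1-D DCT file are shown in Figure 4-10.

Figure 4-10 Pin diagram of the 1-D DCT

The pins are clk (clock), rst (reset), start (start signal), x0-x7 (12-bit data inputs), done (computation complete) and y0-y7 (12-bit data outputs).

The shift module and the table-lookup-and-add module of the 1-D DCT are shown in Figure 4-11 and Figure 4-12 respectively.

Figure 4-11 Shift module

Figure 4-12 Table-lookup-and-add module

Figure 4-11 shows that the shift module combines multiplexers and registers under the control of select lines to implement the shifting.

4.3.4 Simulation of the 1-D DCT

The simulation waveform is shown in Figure 4-13.

Figure 4-13 Simulation of the 1-D DCT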

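4.4 Design of the 2-D DCT

4.4.1 Overall flow of the 2-D DCT

Because the 1-D DCT module processes its data in parallel, the overall input and output both need serial/parallel conversion.

The overall flow is shown in Figure 4-14.

Figure 4-14 Overall flow of the 2-D DCT

4.4.2 VHDL code for the 2-D DCT

The listings are given in the appendix:

Two_d_dct.vhd -- 2-D DCT module

Dct.vhd -- top-level module

If the data arrive over a serial port, a serial-to-parallel conversion module must be added, and the top-level file becomes:

Main.vhd -- top-level file with the serial-port module added

4.4.3 Mapping the 2-D DCT onto hardware

The pins of the top-level 2-D DCT file are shown in Figure 4-15.

Figure 4-15 Pin diagram of the top-level 2-D DCT file

The pins are clk (clock), rst (reset), start (start signal), datain (serial data input), doutclk (data output clock), done (computation complete) and dataout (serial data output).

4.4.4 Simulation of the 2-D DCT

The simulation waveform is shown in Figure 4-16.

Figure 4-16 Simulation of the 2-D DCT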

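5. Analysis of the design results

5.1 Resource usage

The resource report produced by Quartus II after compilation is shown in Figure 5-1.

Figure 5-1 Resource report

The design targets a low-end Cyclone II device. According to Figure 5-1 it uses the following resources:

Total logic elements: 8100 (56%)

Total registers: 5487

Total pins: 30 (20%)

The design therefore stays within the resource limits of the device.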

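5.2 Timing analysis

Figure 4-21 shows that an 8×8 block of 12-bit data takes 26 clock cycles to process. The row and column transforms are pipelined, so a new block can be fed in every 13 clock cycles, which raises the data throughput. The simulation result is shown in Figure 2-1.

With a system clock of 55.6 MHz, the 26-cycle latency corresponds to about 467.6 ns, and 4.27×10^6 8×8 blocks can be processed per second. The corresponding data rate is 8×8×12×4.27×10^6 = 3.28 Gb/s, which suits a wide range of real-time image transmission applications and is sufficient for HDTV image compression.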
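The throughput figures follow directly from the clock frequency and the cycle counts. The short Python sketch below reproduces the arithmetic; the function and parameter names are illustrative, and the 26-cycle latency and 13-cycle issue interval are taken from the text above.

```python
def dct_throughput(clock_hz, latency_cycles=26, interval_cycles=13,
                   bits_per_block=8 * 8 * 12):
    """Return (latency in seconds, blocks per second, bits per second)."""
    latency_s = latency_cycles / clock_hz          # time for one block to pass through
    blocks_per_s = clock_hz / interval_cycles      # a new block enters every 13 cycles
    bits_per_s = blocks_per_s * bits_per_block     # 64 samples of 12 bits per block
    return latency_s, blocks_per_s, bits_per_s

latency, blocks, rate = dct_throughput(55.6e6)
```

The block rate is set by the 13-cycle issue interval rather than by the 26-cycle latency, which is the benefit of pipelining the two 1-D stages.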

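6. Conclusion

This design studies the FPGA implementation of the discrete cosine transform (DCT) used in image processing. The algorithm was analysed and designed, and the mapping between the algorithm and the hardware was examined. The design passed software simulation and essentially achieves the intended functionality.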

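References

[1] Xu Zhijun, Xu Guanghui. Development and Application of CPLD/FPGA [M]. Beijing: Publishing House of Electronics Industry, 2002: 247-264.

[2] Chen Chuanbo, Jin Guangji. Digital Image Processing [M]. Beijing: China Machine Press, 2004: 142-147.

[3] Pan Song, Huang Jiye. EDA Technology and VHDL [M]. Beijing: Tsinghua University Press, 2005: 116-124.

[4] Anonymous. Digital compression coding technology [EB/OL]. http://www.btc.sh.cn/wsxy/digi/d4z.htm, 2003.

[5] Zhong Wenrong, Chen Jianfa. High-speed chip design of a 2-D DCT algorithm [EB/OL]. http://210.51.180.206/sNewsSystem/ShowNews.aspx?newsid=2793, 2007-1-15.

[6] Zou Mingde. Analysis of direct-method data compression [J]. Journal of East China Institute of Chemical Technology, 1989, 15(2): 239-244.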

Research on an FPGA-Based Image Processing Algorithm

He De-qiu 105052003028 Advisor: Lin Yan-qing

Major in Electronic Information Science and Technology

College of Mathematics and Computer Science

【Abstract】With the arrival of the information age in the 21st century, image information processing is attracting more and more attention in many fields. Image coding and decoding is used in a growing number of applications and is a core technology of digital television, set-top boxes, HDTV decoders, DVD players, video conferencing and similar products. Most image compression and decoding algorithms, such as the DCT (discrete cosine transform) and the IDCT (inverse discrete cosine transform), require very high data processing rates. High-performance FPGAs (field-programmable gate arrays) are therefore widely used in image processing.

【Keywords】Image compression; FPGA; DCT; IDCT

Appendix

Table_y0.vhd \\ y0=c0×t0+c0×t2+c0×t4+c0×t6

library ieee;

use ieee.std_logic_1164.all;

entity table_y0 is

port(t6,t4,t2,t0:in std_logic;

data:out std_logic_vector(7 downto 0));

end table_y0;

architecture one of table_y0 is

signal databuff:std_logic_vector(7 downto 0);

signal t:std_logic_vector(3 downto 0);

begin

t<=t6&t4&t2&t0; -- concurrent assignment: 4-bit table address
process(t) -- re-evaluate the table whenever the address changes
begin

case t is

when "0000"=>databuff<="00000000";

when "0001"=>databuff<="00010110";

when "0010"=>databuff<="00010110";

when "0011"=>databuff<="00101100";

when "0100"=>databuff<="00010110";

when "0101"=>databuff<="00101100";

when "0110"=>databuff<="00101100";

when "0111"=>databuff<="01000010";

when "1000"=>databuff<="00010110";

when "1001"=>databuff<="00101100";

when "1010"=>databuff<="00101100";

when "1011"=>databuff<="01000010";

when "1100"=>databuff<="00101100";

when "1101"=>databuff<="01000010";

when "1110"=>databuff<="01000010";

when "1111"=>databuff<="01011000";

when others=>databuff<="00000000";

end case;

end process;

data<=databuff;

end one;

Table_y1.vhd \\ y1=c1×t1+c3×t3+c5×t5+c7×t7

library ieee;

use ieee.std_logic_1164.all;

entity table_y1 is

port(t7,t5,t3,t1:in std_logic;

data:out std_logic_vector(7 downto 0));

end table_y1;

architecture one of table_y1 is

signal databuff:std_logic_vector(7 downto 0);

signal t:std_logic_vector(3 downto 0);

begin

t<=t7&t5&t3&t1; -- concurrent assignment: 4-bit table address
process(t) -- re-evaluate the table whenever the address changes
begin

case t is

when "0000"=>databuff<="00000000";

when "0001"=>databuff<="00011111";

when "0010"=>databuff<="00011010";

when "0011"=>databuff<="00111001";

when "0100"=>databuff<="00010001";

when "0101"=>databuff<="00110000";

when "0110"=>databuff<="00101011";

when "0111"=>databuff<="01001010";

when "1000"=>databuff<="00000110";

when "1001"=>databuff<="00100101";

when "1010"=>databuff<="00100000";

when "1011"=>databuff<="00111111";

when "1100"=>databuff<="00010111";

when "1101"=>databuff<="00110110";

when "1110"=>databuff<="00110001";

when "1111"=>databuff<="01010000";

when others=>databuff<="00000000";

end case;

end process;

data<=databuff;

end one;

Table_y2.vhd \\ y2=c2×t0+c6×t2-c6×t4-c2×t6

library ieee;

use ieee.std_logic_1164.all;

entity table_y2 is

port(t6,t4,t2,t0:in std_logic;

data:out std_logic_vector(7 downto 0));

end table_y2;

architecture one of table_y2 is

signal databuff:std_logic_vector(7 downto 0);

signal t:std_logic_vector(3 downto 0);

begin

t<=t6&t4&t2&t0; -- concurrent assignment: 4-bit table address
process(t) -- re-evaluate the table whenever the address changes
begin

case t is

when "0000"=>databuff<="00000000";

when "0001"=>databuff<="00011101";

when "0010"=>databuff<="00001100";

when "0011"=>databuff<="00101001";

when "0100"=>databuff<="11110100";

when "0101"=>databuff<="00010001";

when "0110"=>databuff<="00000000";

when "0111"=>databuff<="00011101";

when "1000"=>databuff<="11100011";

when "1001"=>databuff<="00000000";

when "1010"=>databuff<="11101111";

when "1011"=>databuff<="00001100";

when "1100"=>databuff<="11010111";

when "1101"=>databuff<="11110100";

when "1110"=>databuff<="11100011";

when "1111"=>databuff<="00000000";

when others=>databuff<="00000000";

end case;

end process;

data<=databuff;

end one;

Table_y3.vhd -- y3=c3×t1-c7×t3-c1×t5-c5×t7

library ieee;

use ieee.std_logic_1164.all;

entity table_y3 is

port(t7,t5,t3,t1:in std_logic;

data:out std_logic_vector(7 downto 0));

end table_y3;

architecture one of table_y3 is

signal databuff:std_logic_vector(7 downto 0);

signal t:std_logic_vector(3 downto 0);

begin

t<=t7&t5&t3&t1; -- concurrent assignment: 4-bit table address
process(t) -- re-evaluate the table whenever the address changes
begin

case t is

when "0000"=>databuff<="00000000";

when "0001"=>databuff<="00011010";

when "0010"=>databuff<="11111010";

when "0011"=>databuff<="00010100";

when "0100"=>databuff<="11100001";

when "0101"=>databuff<="11111011";

when "0110"=>databuff<="11011011";

when "0111"=>databuff<="11110101";

when "1000"=>databuff<="11101111";

when "1001"=>databuff<="00001001";

when "1010"=>databuff<="11101001";

when "1011"=>databuff<="00000011";

when "1100"=>databuff<="11010000";

when "1101"=>databuff<="11101010";

when "1110"=>databuff<="11001010";

when "1111"=>databuff<="11100100";

when others=>databuff<="00000000";

end case;

end process;

data<=databuff;

end one;

Table_y4.vhd \\ y4=c4×t0-c4×t2-c4×t4+c4×t6

library ieee;

use ieee.std_logic_1164.all;

entity table_y4 is

port(t6,t4,t2,t0:in std_logic;

data:out std_logic_vector(7 downto 0));

end table_y4;

architecture one of table_y4 is

signal databuff:std_logic_vector(7 downto 0);

signal t:std_logic_vector(3 downto 0);

begin

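t<=t6&t4&t2&t0; -- concurrent assignment: 4-bit table address
process(t) -- re-evaluate the table whenever the address changes
begin
case t is
when "0000"=>databuff<="00000000";
when "0001"=>databuff<="00010110"; -- +c4 (t0)
when "0010"=>databuff<="11101010"; -- -c4 (t2), two's complement of 22
when "0011"=>databuff<="00000000";
when "0100"=>databuff<="11101010"; -- -c4 (t4), two's complement of 22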

when "0101"=>databuff<="00000000";

when "0110"=>databuff<="11010100";

when "0111"=>databuff<="11101010";

when "1000"=>databuff<="00010110";

when "1001"=>databuff<="00101100";

when "1010"=>databuff<="00000000";

when "1011"=>databuff<="00010110";

when "1100"=>databuff<="00000000";

when "1101"=>databuff<="00010110";

when "1110"=>databuff<="11101010";

when "1111"=>databuff<="00000000";

when others=>databuff<="00000000";

end case;

end process;

data<=databuff;

end one;

Table_y5.vhd -- y5=c5×t1-c1×t3+c7×t5+c3×t7

library ieee;

use ieee.std_logic_1164.all;

entity table_y5 is

port(t7,t5,t3,t1:in std_logic;

data:out std_logic_vector(7 downto 0));

end table_y5;

architecture one of table_y5 is

signal databuff:std_logic_vector(7 downto 0);

signal t:std_logic_vector(3 downto 0);

begin

t<=t7&t5&t3&t1; -- concurrent assignment: 4-bit table address
process(t) -- re-evaluate the table whenever the address changes
begin

case t is

when "0000"=>databuff<="00000000";

when "0001"=>databuff<="00010001";

when "0010"=>databuff<="11100001";

when "0011"=>databuff<="11110010";

when "0100"=>databuff<="00000110";

when "0101"=>databuff<="00010111";

when "0110"=>databuff<="11100111";

when "0111"=>databuff<="11111000";

when "1000"=>databuff<="00011010";

when "1001"=>databuff<="00101011";

when "1010"=>databuff<="11111011";

when "1011"=>databuff<="00001100";

when "1100"=>databuff<="00100000";

when "1101"=>databuff<="00110001";

when "1110"=>databuff<="00000001";

when "1111"=>databuff<="00010010";

when others=>databuff<="00000000";

end case;

end process;

data<=databuff;

end one;

Table_y6.vhd \\ y6=c6×t0-c2×t2+c2×t4-c6×t6

library ieee;

use ieee.std_logic_1164.all;

entity table_y6 is

port(t6,t4,t2,t0:in std_logic;

data:out std_logic_vector(7 downto 0));

end table_y6;

architecture one of table_y6 is

signal databuff:std_logic_vector(7 downto 0);

signal t:std_logic_vector(3 downto 0);

begin

t<=t6&t4&t2&t0; -- concurrent assignment: 4-bit table address
process(t) -- re-evaluate the table whenever the address changes
begin

case t is

when "0000"=>databuff<="00000000";

when "0001"=>databuff<="00001100";

when "0010"=>databuff<="11100011";

when "0011"=>databuff<="11101111";

when "0100"=>databuff<="00011101";

when "0101"=>databuff<="00101001";

when "0110"=>databuff<="00000000";

when "0111"=>databuff<="00001100";

when "1000"=>databuff<="11110100";

when "1001"=>databuff<="00000000";

when "1010"=>databuff<="11010111";

when "1011"=>databuff<="11100011";

when "1100"=>databuff<="00010001";

when "1101"=>databuff<="00011101";

when "1110"=>databuff<="11110100";

when "1111"=>databuff<="00000000";

when others=>databuff<="00000000";

end case;

end process;

data<=databuff;

end one;

Table_y7.vhd -- y7=c7×t1-c5×t3+c3×t5-c1×t7

library ieee;

use ieee.std_logic_1164.all;

entity table_y7 is

port(t7,t5,t3,t1:in std_logic;

data:out std_logic_vector(7 downto 0));

end table_y7;

architecture one of table_y7 is

signal databuff:std_logic_vector(7 downto 0);

signal t:std_logic_vector(3 downto 0);

begin

t<=t7&t5&t3&t1; -- concurrent assignment: 4-bit table address
process(t) -- re-evaluate the table whenever the address changes
begin

case t is

when "0000"=>databuff<="00000000";

when "0001"=>databuff<="00000110";

when "0010"=>databuff<="11101111";

when "0011"=>databuff<="11110101";

when "0100"=>databuff<="00011010";

when "0101"=>databuff<="00100000";

when "0110"=>databuff<="00001001";

when "0111"=>databuff<="00001111";

when "1000"=>databuff<="11100001";

when "1001"=>databuff<="11100111";

when "1010"=>databuff<="11010000";

when "1011"=>databuff<="11010110";

when "1100"=>databuff<="11111011";

when "1101"=>databuff<="00000001";

when "1110"=>databuff<="11101010";

when "1111"=>databuff<="11110000";

when others=>databuff<="00000000";

end case;

end process;

data<=databuff;

end one;

Convert.vhd -- t0=x0+x7, t2=x1+x6, t4=x2+x5, t6=x3+x4,

-- t1=x0-x7, t3=x1-x6, t5=x2-x5, t7=x3-x4

library ieee;

use ieee.std_logic_1164.all;

use ieee.std_logic_unsigned.all;

entity convert is

port(x7,x6,x5,x4,x3,x2,x1,x0:in std_logic_vector(11 downto 0);

t7,t6,t5,t4,t3,t2,t1,t0:out std_logic_vector(11 downto 0));

end entity convert;

architecture one of convert is

begin

t7<=x3-x4;t6<=x3+x4;t5<=x2-x5;t4<=x2+x5;t3<=x1-x6;t2<=x1+x6;t1<=x0-x7;t0<=x0+x7;

end architecture one;

One_d_dct.vhd -- 1-D discrete cosine transform

library ieee;

use ieee.std_logic_1164.all;

use ieee.std_logic_unsigned.all;

entity one_d_dct is

port(x7,x6,x5,x4,x3,x2,x1,x0:in std_logic_vector(11 downto 0);

y7,y6,y5,y4,y3,y2,y1,y0:out std_logic_vector(11 downto 0);

start,rst,clk:in std_logic;

done:buffer std_logic);

end entity one_d_dct;

architecture one of one_d_dct is

component table_y0 -- lookup table for y0
port(t6,t4,t2,t0:in std_logic;
data:out std_logic_vector(7 downto 0));
end component;

component table_y1 -- lookup table for y1
port(t7,t5,t3,t1:in std_logic;
data:out std_logic_vector(7 downto 0));
end component;

component table_y2 -- lookup table for y2
port(t6,t4,t2,t0:in std_logic;
data:out std_logic_vector(7 downto 0));
end component;

component table_y3 -- lookup table for y3
port(t7,t5,t3,t1:in std_logic;
data:out std_logic_vector(7 downto 0));
end component;

component table_y4 -- lookup table for y4
port(t6,t4,t2,t0:in std_logic;
data:out std_logic_vector(7 downto 0));
end component;

component table_y5 -- lookup table for y5
port(t7,t5,t3,t1:in std_logic;
data:out std_logic_vector(7 downto 0));
end component;

component table_y6 -- lookup table for y6
port(t6,t4,t2,t0:in std_logic;
data:out std_logic_vector(7 downto 0));
end component;

component table_y7 -- lookup table for y7
port(t7,t5,t3,t1:in std_logic;
data:out std_logic_vector(7 downto 0));
end component;

component convert -- butterfly preprocessing (sums and differences)
port(x7,x6,x5,x4,x3,x2,x1,x0:in std_logic_vector(11 downto 0);
t7,t6,t5,t4,t3,t2,t1,t0:out std_logic_vector(11 downto 0));
end component;

function sgn_extend (data_8:std_logic_vector(7 downto 0))
return std_logic_vector is -- sign-extend by one bit and append 7 zeros (16 bits in total)
begin
return data_8(7)&data_8&"0000000";
end function sgn_extend;

function sgn_cut (data_16:std_logic_vector(15 downto 0))
return std_logic_vector is -- keep bits 13 downto 2 (12 bits)
begin
return data_16(13 downto 2);
end function sgn_cut;

signal count:integer range 0 to 15; -- counts 0..11; the wider range leaves room for the final increment

signal compute:std_logic;

signal t7,t6,t5,t4,t3,t2,t1,t0:std_logic_vector(11 downto 0);

signal d7,d6,d5,d4,d3,d2,d1,d0:std_logic_vector(7 downto 0);

signal tt7,tt6,tt5,tt4,tt3,tt2,tt1,tt0:std_logic_vector(11 downto 0);

signal dy7,dy6,dy5,dy4,dy3,dy2,dy1,dy0:std_logic_vector(15 downto 0);

signal outy7,outy6,outy5,outy4,outy3,outy2,outy1,outy0:std_logic_vector(11 downto 0);

begin

y7<=outy7;y6<=outy6;y5<=outy5;y4<=outy4;y3<=outy3;y2<=outy2;y1<=outy1;y0<=outy0;

u:convert port map(x7=>x7,x6=>x6,x5=>x5,x4=>x4,x3=>x3,x2=>x2,x1=>x1,x0=>x0,t7=>t7,t6=>t6,t5=>t5,t4=>t4,t3=>t3,t2=>t2,t1=>t1,t0=>t0);

u7:table_y7 port map(t7=>tt7(0),t5=>tt5(0),t3=>tt3(0),t1=>tt1(0),data=>d7);

u6:table_y6 port map(t6=>tt6(0),t4=>tt4(0),t2=>tt2(0),t0=>tt0(0),data=>d6);

u5:table_y5 port map(t7=>tt7(0),t5=>tt5(0),t3=>tt3(0),t1=>tt1(0),data=>d5);

u4:table_y4 port map(t6=>tt6(0),t4=>tt4(0),t2=>tt2(0),t0=>tt0(0),data=>d4);

u3:table_y3 port map(t7=>tt7(0),t5=>tt5(0),t3=>tt3(0),t1=>tt1(0),data=>d3);

u2:table_y2 port map(t6=>tt6(0),t4=>tt4(0),t2=>tt2(0),t0=>tt0(0),data=>d2);

u1:table_y1 port map(t7=>tt7(0),t5=>tt5(0),t3=>tt3(0),t1=>tt1(0),data=>d1);

u0:table_y0 port map(t6=>tt6(0),t4=>tt4(0),t2=>tt2(0),t0=>tt0(0),data=>d0);

outy7<=sgn_cut(dy7);outy6<=sgn_cut(dy6);outy5<=sgn_cut(dy5);outy4<=sgn_cut(dy4);outy3<=sgn_cut(dy3);outy2<=sgn_cut(dy2);outy1<=sgn_cut(dy1);outy0<=sgn_cut(dy0);

process(clk,rst)
begin
if clk'event and clk='1' then
if rst='1' then -- synchronous reset
count<=0;done<='0';compute<='0';
dy7<=(others=>'0');dy6<=(others=>'0');dy5<=(others=>'0');dy4<=(others=>'0');
dy3<=(others=>'0');dy2<=(others=>'0');dy1<=(others=>'0');dy0<=(others=>'0');
tt7<=(others=>'0');tt6<=(others=>'0');tt5<=(others=>'0');tt4<=(others=>'0');
tt3<=(others=>'0');tt2<=(others=>'0');tt1<=(others=>'0');tt0<=(others=>'0');
else
if done='1' then done<='0';end if;
if compute='1' then -- computation in progress
if count=11 then -- sign bit: subtract the table output
dy7<=dy7(15)&dy7(15 downto 1)-sgn_extend(d7);
dy6<=dy6(15)&dy6(15 downto 1)-sgn_extend(d6);
dy5<=dy5(15)&dy5(15 downto 1)-sgn_extend(d5);
dy4<=dy4(15)&dy4(15 downto 1)-sgn_extend(d4);
dy3<=dy3(15)&dy3(15 downto 1)-sgn_extend(d3);
dy2<=dy2(15)&dy2(15 downto 1)-sgn_extend(d2);
dy1<=dy1(15)&dy1(15 downto 1)-sgn_extend(d1);
dy0<=dy0(15)&dy0(15 downto 1)-sgn_extend(d0);
done<='1';compute<='0';
else -- lower bits: shift the accumulator right and add
dy7<=dy7(15)&dy7(15 downto 1)+sgn_extend(d7);
dy6<=dy6(15)&dy6(15 downto 1)+sgn_extend(d6);
dy5<=dy5(15)&dy5(15 downto 1)+sgn_extend(d5);
dy4<=dy4(15)&dy4(15 downto 1)+sgn_extend(d4);
dy3<=dy3(15)&dy3(15 downto 1)+sgn_extend(d3);
dy2<=dy2(15)&dy2(15 downto 1)+sgn_extend(d2);
dy1<=dy1(15)&dy1(15 downto 1)+sgn_extend(d1);
dy0<=dy0(15)&dy0(15 downto 1)+sgn_extend(d0);
end if;
count<=count+1;
end if;
if start='1' and done='0' then compute<='1';end if;
if start='1' then -- load new operands and clear the accumulators
count<=0;
tt7<=t7;tt6<=t6;tt5<=t5;tt4<=t4;tt3<=t3;tt2<=t2;tt1<=t1;tt0<=t0;
dy7<=(others=>'0');dy6<=(others=>'0');dy5<=(others=>'0');dy4<=(others=>'0');
dy3<=(others=>'0');dy2<=(others=>'0');dy1<=(others=>'0');dy0<=(others=>'0');
else -- shift the operands right so that the next bit addresses the tables
tt7(10 downto 0)<=tt7(11 downto 1);tt6(10 downto 0)<=tt6(11 downto 1);
tt5(10 downto 0)<=tt5(11 downto 1);tt4(10 downto 0)<=tt4(11 downto 1);
tt3(10 downto 0)<=tt3(11 downto 1);tt2(10 downto 0)<=tt2(11 downto 1);
tt1(10 downto 0)<=tt1(11 downto 1);tt0(10 downto 0)<=tt0(11 downto 1);
end if;
end if; -- closes "if rst='1'"
end if; -- closes the clock-edge test
end process;

end architecture one;

Two_d_dct.vhd -- 2-D DCT module

library ieee;

use ieee.std_logic_1164.all;

use ieee.std_logic_unsigned.all;

entity two_d_dct is

port(din:in std_logic_vector(767 downto 0);

dout:out std_logic_vector(767 downto 0);

clk,start,rst:in std_logic;

done:buffer std_logic);

end entity two_d_dct;

architecture one of two_d_dct is

component one_d_dct -- 1-D DCT entity one_d_dct

port(x7,x6,x5,x4,x3,x2,x1,x0:in std_logic_vector(11 downto 0);

y7,y6,y5,y4,y3,y2,y1,y0:out std_logic_vector(11 downto 0);

start,rst,clk:in std_logic;

done:buffer std_logic);

end component;

signal data1D:std_logic_vector(767 downto 0);

signal data2D:std_logic_vector(767 downto 0);

signal donerow0,donerow1,donerow2,donerow3,donerow4,donerow5,donerow6,donerow7:std_logic;

signal donecol0,donecol1,donecol2,donecol3,donecol4,donecol5,donecol6,donecol7:std_logic;

begin -- port connections

c0:one_d_dct port map(x0=>din(11 downto 0),x1=>din(23 downto 12),x2=>din(35 downto 24),x3=>din(47 downto 36),x4=>din(59 downto 48),x5=>din(71 downto 60),x6=>din(83 downto 72),x7=>din(95 downto 84),

y0=>data1D(11 downto 0),y1=>data1D(23 downto 12),y2=>data1D(35 downto 24),y3=>data1D(47 downto 36),y4=>data1D(59 downto 48),y5=>data1D(71 downto 60),y6=>data1D(83 downto 72),y7=>data1D(95 downto 84),

start=>start,clk=>clk,rst=>rst,done=>donecol0);

c1:one_d_dct port map(x0=>din(107 downto 96),x1=>din(119 downto 108),x2=>din(131 downto 120),x3=>din(143 downto 132),x4=>din(155 downto 144),x5=>din(167 downto 156),x6=>din(179 downto 168),x7=>din(191 downto 180),

y0=>data1D(107 downto 96),y1=>data1D(119 downto 108),y2=>data1D(131 downto 120),y3=>data1D(143 downto 132),y4=>data1D(155 downto 144),y5=>data1D(167 downto 156),y6=>data1D(179 downto 168),y7=>data1D(191 downto 180),

start=>start,clk=>clk,rst=>rst,done=>donecol1);

c2:one_d_dct port map(x0=>din(203 downto 192),x1=>din(215 downto 204),x2=>din(227 downto 216),x3=>din(239 downto 228),x4=>din(251 downto 240),x5=>din(263 downto 252),x6=>din(275 downto 264),x7=>din(287 downto 276),

y0=>data1D(203 downto 192),y1=>data1D(215 downto 204),y2=>data1D(227 downto 216),y3=>data1D(239 downto 228),y4=>data1D(251 downto 240),y5=>data1D(263 downto 252),y6=>data1D(275 downto 264),y7=>data1D(287 downto 276),

start=>start,clk=>clk,rst=>rst,done=>donecol2);

c3:one_d_dct port map(x0=>din(299 downto 288),x1=>din(311 downto 300),x2=>din(323 downto 312),x3=>din(335 downto 324),x4=>din(347 downto 336),x5=>din(359 downto 348),x6=>din(371 downto 360),x7=>din(383 downto 372),

y0=>data1D(299 downto 288),y1=>data1D(311 downto 300),y2=>data1D(323 downto 312),y3=>data1D(335 downto 324),y4=>data1D(347 downto 336),y5=>data1D(359 downto 348),y6=>data1D(371 downto 360),y7=>data1D(383 downto 372),

start=>start,clk=>clk,rst=>rst,done=>donecol3);

c4:one_d_dct port map(x0=>din(395 downto 384),x1=>din(407 downto 396),x2=>din(419 downto 408),x3=>din(431 downto 420),x4=>din(443 downto 432),x5=>din(455 downto 444),x6=>din(467 downto 456),x7=>din(479 downto 468),

y0=>data1D(395 downto 384),y1=>data1D(407 downto 396),y2=>data1D(419 downto 408),y3=>data1D(431 downto 420),y4=>data1D(443 downto 432),y5=>data1D(455 downto 444),y6=>data1D(467 downto 456),y7=>data1D(479 downto 468),

start=>start,clk=>clk,rst=>rst,done=>donecol4);

c5:one_d_dct port map(x0=>din(491 downto 480),x1=>din(503 downto 492),x2=>din(515 downto 504),x3=>din(527 downto 516),x4=>din(539 downto 528),x5=>din(551 downto 540),x6=>din(563 downto 552),x7=>din(575 downto 564),

y0=>data1D(491 downto 480),y1=>data1D(503 downto 492),y2=>data1D(515 downto 504),y3=>data1D(527 downto 516),y4=>data1D(539 downto 528),y5=>data1D(551 downto 540),y6=>data1D(563 downto 552),y7=>data1D(575 downto 564),

start=>start,clk=>clk,rst=>rst,done=>donecol5);

c6:one_d_dct port map(x0=>din(587 downto 576),x1=>din(599 downto 588),x2=>din(611 downto 600),x3=>din(623 downto 612),x4=>din(635 downto 624),x5=>din(647 downto 636),x6=>din(659 downto 648),x7=>din(671 downto 660),

y0=>data1D(587 downto 576),y1=>data1D(599 downto 588),y2=>data1D(611 downto 600),y3=>data1D(623 downto 612),y4=>data1D(635 downto 624),y5=>data1D(647 downto 636),y6=>data1D(659 downto 648),y7=>data1D(671 downto 660),

start=>start,clk=>clk,rst=>rst,done=>donecol6);

c7:one_d_dct port map(x0=>din(683 downto 672),x1=>din(695 downto 684),x2=>din(707 downto 696),x3=>din(719 downto 708),x4=>din(731 downto 720),x5=>din(743 downto 732),x6=>din(755 downto 744),x7=>din(767 downto 756),

y0=>data1D(683 downto 672),y1=>data1D(695 downto 684),y2=>data1D(707 downto 696),y3=>data1D(719 downto 708),y4=>data1D(731 downto 720),y5=>data1D(743 downto 732),y6=>data1D(755 downto 744),y7=>data1D(767 downto 756),

start=>start,clk=>clk,rst=>rst,done=>donecol7);

r0:one_d_dct port map(x0=>data1D(11 downto 0),x1=>data1D(107 downto 96),x2=>data1D(203 downto 192),x3=>data1D(299 downto 288),x4=>data1D(395 downto 384),x5=>data1D(491 downto 480),x6=>data1D(587 downto 576),x7=>data1D(683 downto 672),

y0=>data2D(11 downto 0),y1=>data2D(23 downto 12),y2=>data2D(35 downto 24),y3=>data2D(47 downto 36),y4=>data2D(59 downto 48),y5=>data2D(71 downto 60),y6=>data2D(83 downto 72),y7=>data2D(95 downto 84),

start=>donecol0,clk=>clk,rst=>rst,done=>donerow0);

r1:one_d_dct port map(x0=>data1D(23 downto 12),x1=>data1D(119 downto 108),x2=>data1D(215 downto 204),x3=>data1D(311 downto 300),x4=>data1D(407 downto 396),x5=>data1D(503 downto 492),x6=>data1D(599 downto 588),x7=>data1D(695 downto 684),

y0=>data2D(107 downto 96),y1=>data2D(119 downto 108),y2=>data2D(131 downto 120),y3=>data2D(143 downto 132),y4=>data2D(155 downto 144),y5=>data2D(167 downto 156),y6=>data2D(179 downto 168),y7=>data2D(191 downto 180),

start=>donecol0,clk=>clk,rst=>rst,done=>donerow1);

r2:one_d_dct port map(x0=>data1D(35 downto 24),x1=>data1D(131 downto 120),x2=>data1D(227 downto 216),x3=>data1D(323 downto 312),x4=>data1D(419 downto 408),x5=>data1D(515 downto 504),x6=>data1D(611 downto 600),x7=>data1D(707 downto 696),

y0=>data2D(203 downto 192),y1=>data2D(215 downto 204),y2=>data2D(227 downto 216),y3=>data2D(239 downto 228),y4=>data2D(251 downto 240),y5=>data2D(263 downto 252),y6=>data2D(275 downto 264),y7=>data2D(287 downto 276),

start=>donecol0,clk=>clk,rst=>rst,done=>donerow2);

r3:one_d_dct port map(x0=>data1D(47 downto 36),x1=>data1D(143 downto 132),x2=>data1D(239 downto 228),x3=>data1D(335 downto 324),x4=>data1D(431 downto 420),x5=>data1D(527 downto 516),x6=>data1D(623 downto 612),x7=>data1D(719 downto 708),

y0=>data2D(299 downto 288),y1=>data2D(311 downto 300),y2=>data2D(323 downto 312),y3=>data2D(335 downto 324),y4=>data2D(347 downto 336),y5=>data2D(359 downto 348),y6=>data2D(371 downto 360),y7=>data2D(383 downto 372),

start=>donecol0,clk=>clk,rst=>rst,done=>donerow3);

r4:one_d_dct port map(x0=>data1D(59 downto 48),x1=>data1D(155 downto 144),x2=>data1D(251 downto 240),x3=>data1D(347 downto 336),x4=>data1D(443 downto 432),x5=>data1D(539 downto 528),x6=>data1D(635 downto 624),x7=>data1D(731 downto 720),

y0=>data2D(395 downto 384),y1=>data2D(407 downto 396),y2=>data2D(419 downto 408),y3=>data2D(431 downto 420),y4=>data2D(443 downto 432),y5=>data2D(455 downto 444),y6=>data2D(467 downto 456),y7=>data2D(479 downto 468),

start=>donecol0,clk=>clk,rst=>rst,done=>donerow4);

r5:one_d_dct port map(x0=>data1D(71 downto 60),x1=>data1D(167 downto 156),x2=>data1D(263 downto 252),x3=>data1D(359 downto 348),x4=>data1D(455 downto 444),x5=>data1D(551 downto 540),x6=>data1D(647 downto 636),x7=>data1D(743 downto 732),

y0=>data2D(491 downto 480),y1=>data2D(503 downto 492),y2=>data2D(515 downto 504),y3=>data2D(527 downto 516),y4=>data2D(539 downto 528),y5=>data2D(551 downto 540),y6=>data2D(563 downto 552),y7=>data2D(575 downto 564),

start=>donecol0,clk=>clk,rst=>rst,done=>donerow5);

r6:one_d_dct port map(x0=>data1D(83 downto 72),x1=>data1D(179 downto 168),x2=>data1D(275 downto 264),x3=>data1D(371 downto 360),x4=>data1D(467 downto 456),x5=>data1D(563 downto 552),x6=>data1D(659 downto 648),x7=>data1D(755 downto 744),

y0=>data2D(587 downto 576),y1=>data2D(599 downto 588),y2=>data2D(611 downto 600),y3=>data2D(623 downto 612),y4=>data2D(635 downto 624),y5=>data2D(647 downto 636),y6=>data2D(659 downto 648),y7=>data2D(671 downto 660),

start=>donecol0,clk=>clk,rst=>rst,done=>donerow6);

r7:one_d_dct port map(x0=>data1D(95 downto 84),x1=>data1D(191 downto 180),x2=>data1D(287 downto 276),x3=>data1D(383 downto 372),x4=>data1D(479 downto 468),x5=>data1D(575 downto 564),x6=>data1D(671 downto 660),x7=>data1D(767 downto 756),

y0=>data2D(683 downto 672),y1=>data2D(695 downto 684),y2=>data2D(707 downto 696),y3=>data2D(719 downto 708),y4=>data2D(731 downto 720),y5=>data2D(743 downto 732),y6=>data2D(755 downto 744),y7=>data2D(767 downto 756),

start=>donecol0,clk=>clk,rst=>rst,done=>donerow7);

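process(clk,rst)
begin
if clk'event and clk='1' then
if rst='1' then -- synchronous reset
done<='0';dout<=(others=>'0');
else
done<=donerow0;
if donerow0='1' then dout<=data2D; end if; -- latch the result when the row stage finishes
end if;
end if; -- closes the clock-edge test
end process;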

end architecture one;

Dct.vhd -- top-level module

library ieee;

use ieee.std_logic_1164.all;

use ieee.std_logic_unsigned.all;

entity dct is

port(datain:in std_logic_vector(11 downto 0);

dataout:out std_logic_vector(11 downto 0);

clk,start,rst,dinclk,doutclk:in std_logic;

done:buffer std_logic);

end entity dct;

architecture one of dct is

component two_d_dct -- 2-D DCT entity two_d_dct

port(din:in std_logic_vector(767 downto 0);

dout:out std_logic_vector(767 downto 0);

clk,start,rst:in std_logic;

done:buffer std_logic);

end component;

signal inbuff:std_logic_vector(767 downto 0);

signal outbuff:std_logic_vector(767 downto 0);

signal doutbuff:std_logic_vector(767 downto 0);

signal donestate,startstate:std_logic:='0'; -- initial value so that "not" toggles a defined level

begin

u:two_d_dct port map(din=>inbuff,dout=>outbuff,clk=>clk,start=>start,rst=>rst,done=>done);

process(dinclk)
begin
if dinclk'event and dinclk='1' then -- input data clock
if startstate='0' then -- shift one 12-bit sample into the 768-bit input buffer
inbuff(767 downto 756)<=datain;
inbuff(755 downto 0)<=inbuff(767 downto 12);
end if;
end if;
end process;

process(start) -- start state: toggles on every rising edge of start
begin
if start'event and start='1' then
startstate<=not startstate;
end if;
end process;

process(done) -- done state: toggles on every rising edge of done
begin
if done'event and done='1' then
donestate<=not donestate;
end if;
end process;

process(doutclk)
begin
if doutclk'event and doutclk='1' then -- output clock
if donestate='1' then
if done='1' then -- capture the finished block
doutbuff<=outbuff;
else -- shift out one 12-bit coefficient per clock
dataout<=doutbuff(11 downto 0);
doutbuff(755 downto 0)<=doutbuff(767 downto 12);
end if;
end if;
end if;
end process;

end architecture one;

Main.vhd -- top-level file with the serial-port module added

library ieee;

use ieee.std_logic_1164.all;

use ieee.std_logic_unsigned.all;

entity main is

port( datain:in std_logic;

clk,start,rst,doutclk:in std_logic;

dataout:out std_logic;

done:buffer std_logic );

end main;

architecture one of main is

component dct is

port(datain:in std_logic_vector(11 downto 0);

dataout:out std_logic_vector(11 downto 0);

clk,start,rst,dinclk,doutclk:in std_logic;

done:buffer std_logic);

end component;

signal ready,doutclk2:std_logic;

signal data:std_logic_vector(11 downto 0);

signal data_out:std_logic_vector(11 downto 0);

signal dataoutbuff:std_logic_vector(11 downto 0);

signal datainbuff:std_logic_vector(7 downto 0);

signal count:std_logic_vector(2 downto 0);

signal count2:std_logic_vector(3 downto 0);

begin

u: dct port map(datain=>data,dataout=>data_out,clk=>clk,start=>start,rst=>rst,

dinclk=>ready,doutclk=>doutclk2,done=>done);

process(rst,clk)
begin
if rst='1' then -- asynchronous reset, active high as in the dct entity
count<=(others=>'0'); ready<='0';
elsif (clk'event and clk='1') then -- serial-to-parallel conversion
datainbuff(7 downto 1)<=datainbuff(6 downto 0);
datainbuff(0)<=datain;
if count="111" then -- eight bits received: present one zero-extended sample
data<="0000"&datainbuff; ready<='1'; count<="000";
else
count<=count+1; ready<='0'; -- release ready so that it produces a new rising edge
end if;
end if;
end process;

process(rst,doutclk)
begin
if rst='1' then -- asynchronous reset
count2<=(others=>'0'); doutclk2<='0';
elsif (doutclk'event and doutclk='1') then -- parallel-to-serial conversion
dataout<=dataoutbuff(0);
dataoutbuff(10 downto 0)<=dataoutbuff(11 downto 1);
if count2="1011" then -- twelve bits sent: fetch the next coefficient
doutclk2<='1'; dataoutbuff<=data_out; count2<="0000";
else
count2<=count2+1; doutclk2<='0';
end if;
end if;
end process;

end one;