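MATLAB DCT-based image compression encoding and decoding: a roundup of open-source image compression algorithms based on AI or traditional coding methods...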

Latest recommended article published on 2023-11-18 00:08:59

weixin_39819283

Tags: MATLAB DCT-based image compression encoding and decoding, image compression algorithm code

I. AI-based image compression algorithms

1. Mu Li, Wangmeng Zuo, Shuhang Gu, Debin Zhao, David Zhang. Learning Convolutional Networks for Content-weighted Image Compression. CVPR 2018, pages 3214-3223.

Paper link: https://arxiv.org/abs/1703.10553

Open-source code: https://github.com/limuhit/ImageCompression

Tools: based on Caffe

Summary:

Lossy image compression is generally formulated as a joint rate-distortion optimization to learn encoder, quantizer, and decoder. However, the quantizer is non-differentiable, and discrete entropy estimation usually is required for rate control. These make it very challenging to develop a convolutional network (CNN)-based image compression system. In this paper, motivated by that the local information content is spatially variant in an image, we suggest that the bit rate of the different parts of the image should be adapted to local content. And the content aware bit rate is allocated under the guidance of a content-weighted importance map. Thus, the sum of the importance map can serve as a continuous alternative of discrete entropy estimation to control compression rate. And binarizer is adopted to quantize the output of encoder due to the binarization scheme is also directly defined by the importance map. Furthermore, a proxy function is introduced for binary operation in backward propagation to make it differentiable. Therefore, the encoder, decoder, binarizer and importance map can be jointly optimized in an end-to-end manner by using a subset of the ImageNet database. In low bit rate image compression, experiments show that our system significantly outperforms JPEG and JPEG 2000 by structural similarity (SSIM) index, and can produce the much better visual result with sharp edges, rich textures, and fewer artifacts.
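The binarizer, its backward-pass proxy, and the importance-map rate surrogate mentioned in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' Caffe implementation: the 0.5 threshold, the clipped-identity proxy, and the function names `binarize`, `proxy_grad`, and `rate_surrogate` are all assumptions made for the example.

```python
def binarize(e, threshold=0.5):
    """Forward pass: hard 0/1 decision, which has no useful gradient."""
    return [1.0 if v > threshold else 0.0 for v in e]


def proxy_grad(e, upstream):
    """Backward pass through an assumed proxy, min(max(v, 0), 1).

    The proxy has slope 1 on [0, 1] and slope 0 elsewhere, so the upstream
    gradient passes through only where the encoder output lies in [0, 1].
    """
    return [g if 0.0 <= v <= 1.0 else 0.0 for v, g in zip(e, upstream)]


def rate_surrogate(importance_map):
    """Continuous stand-in for the bit rate: the sum of the importance map."""
    return sum(sum(row) for row in importance_map)


encoder_out = [0.2, 0.7, 1.3, -0.1]
codes = binarize(encoder_out)                         # [0.0, 1.0, 1.0, 0.0]
grads = proxy_grad(encoder_out, [1.0, 1.0, 1.0, 1.0])  # [1.0, 1.0, 0.0, 0.0]
rate = rate_surrogate([[0.5, 0.25], [0.25, 0.0]])      # 1.0
```

In a real training loop the forward pass would use the hard codes while the optimizer would see the proxy gradient, which is what lets the encoder, decoder, and importance map be trained jointly.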

2. Mentzer F, Agustsson E, Tschannen M, et al. Conditional Probability Models for Deep Image Compression[J]. Computer Vision and Pattern Recognition (CVPR), 2018: 4394-4402.

Paper link: https://arxiv.org/abs/1801.04260

Open-source code: https://github.com/fab-jul/imgcomp-cvpr

Tools: TensorFlow

Summary:

Deep Neural Networks trained as image auto-encoders have recently emerged as a promising direction for advancing the state-of-the-art in image compression. The key challenge in learning such networks is twofold: To deal with quantization, and to control the trade-off between reconstruction error (distortion) and entropy (rate) of the latent image representation. In this paper, we focus on the latter challenge and propose a new technique to navigate the rate-distortion trade-off for an image compression auto-encoder. The main idea is to directly model the entropy of the latent representation by using a context model: A 3D-CNN which learns a conditional probability model of the latent distribution of the auto-encoder. During training, the auto-encoder makes use of the context model to estimate the entropy of its representation, and the context model is concurrently updated to learn the dependencies between the symbols in the latent representation. Our experiments show that this approach, when measured in MS-SSIM, yields a state-of-the-art image compression system based on a simple convolutional auto-encoder.
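The idea of estimating rate from a conditional probability model can be shown with a much simpler stand-in. The sketch below replaces the paper's 3D-CNN with an adaptive count table conditioned on the previous symbol only (an assumption made purely for illustration), and sums -log2 p(symbol | context) to get an estimated code length in bits. The function name `estimated_bits` is hypothetical.

```python
import math
from collections import defaultdict


def estimated_bits(symbols, num_levels):
    """Estimated code length: sum of -log2 p(s_i | s_{i-1}).

    The context model here is a table of counts per previous symbol,
    initialised to 1 (Laplace smoothing) and updated after each symbol.
    """
    counts = defaultdict(lambda: [1] * num_levels)
    bits = 0.0
    prev = None  # context for the first symbol
    for s in symbols:
        ctx = counts[prev]
        bits += -math.log2(ctx[s] / sum(ctx))
        ctx[s] += 1  # the model adapts as it sees more symbols
        prev = s
    return bits


# A highly predictable sequence of 16 binary symbols costs far less than 16 bits.
predictable_bits = estimated_bits([0] * 16, num_levels=2)
```

The better the context model predicts the next symbol, the lower this estimate becomes, which is why the paper trains the context model and the auto-encoder concurrently.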

3. Toderici G, Vincent D, Johnston N, et al. Full Resolution Image Compression with Recurrent Neural Networks[J]. Computer Vision and Pattern Recognition (CVPR), 2017: 5435-5443.

Paper link: http://arxiv.org/abs/1608.05148

Open-source code: https://github.com/1zb/pytorch-image-comp-rnn

Tools: PyTorch 0.2.0

Summary:

This paper presents a set of full-resolution lossy image compression methods based on neural networks. Each of the architectures we describe can provide variable compression rates during deployment without requiring retraining of the network: each network need only be trained once. All of our architectures consist of a recurrent neural network (RNN)-based encoder and decoder, a binarizer, and a neural network for entropy coding. We compare RNN types (LSTM, associative LSTM) and introduce a new hybrid of GRU and ResNet. We also study "one-shot" versus additive reconstruction architectures and introduce a new scaled-additive framework. We compare to previous work, showing improvements of 4.3%-8.8% AUC (area under the rate-distortion curve), depending on the perceptual metric used. As far as we know, this is the first neural network architecture that is able to outperform JPEG at image compression across most bitrates on the rate-distortion curve on the Kodak dataset images, with and without the aid of entropy coding.
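The additive reconstruction idea, where each iteration encodes the residual left by the previous ones and the decoder sums the contributions, can be illustrated with a toy scalar codec. The sketch below is not the paper's RNN: it assumes inputs in [-1, 1], sends one sign bit per value per iteration, and halves the step size each pass. The name `progressive_encode` is hypothetical.

```python
def progressive_encode(x, iterations):
    """Toy additive reconstruction for values in [-1, 1].

    Each pass emits one bit per value (the sign of the current residual)
    and the decoder adds or subtracts a step that halves every pass.
    """
    recon = [0.0] * len(x)
    bitplanes = []
    step = 0.5
    for _ in range(iterations):
        residual = [xi - ri for xi, ri in zip(x, recon)]
        bits = [1 if r >= 0 else 0 for r in residual]
        recon = [ri + (step if b else -step) for ri, b in zip(recon, bits)]
        bitplanes.append(bits)
        step /= 2
    return bitplanes, recon


x = [0.3, -0.8, 0.55]
planes3, recon3 = progressive_encode(x, 3)  # 3 bits per value
planes6, recon6 = progressive_encode(x, 6)  # 6 bits per value, lower error
err3 = max(abs(a - b) for a, b in zip(x, recon3))
err6 = max(abs(a - b) for a, b in zip(x, recon6))
```

Stopping after fewer iterations gives a lower bit rate and a coarser reconstruction from the same encoder, which mirrors how a single trained network can serve several compression rates.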

4. Prakash A, Moran N, Garber S, et al. Semantic Perceptual Image Compression Using Deep Convolution Networks[J]. Data Compression Conference (DCC), 2017: 250-259.

Paper link: https://arxiv.org/abs/1612.08712

Open-source code: https://github.com/iamaaditya/image-compression-cnn

Notes: TensorFlow, CNN

Summary:

It has long been considered a significant problem to improve the visual quality of lossy image and video compression. Recent advances in computing power together with the availability of large training data sets have increased interest in the application of deep learning CNNs to address image recognition and image processing tasks. Here, we present a powerful CNN tailored to the specific task of semantic image understanding to achieve higher visual quality in lossy compression. A modest increase in complexity is incorporated to the encoder which allows a standard, off-the-shelf JPEG decoder to be used. While JPEG encoding may be optimized for generic images, the process is ultimately unaware of the specific content of the image to be compressed. Our technique makes JPEG content-aware by designing and training a model to identify multiple semantic regions in a given image. Unlike object detection techniques, our model does not require labeling of object positions and is able to identify objects in a single pass. We present a new CNN architecture directed specifically to image compression, which generates a map that highlights semantically-salient regions so that they can be encoded at higher quality as compared to background regions. By adding a complete set of features for every class, and then taking a threshold over the sum of all feature activations, we generate a map that highlights semantically-salient regions so that they can be encoded at a better quality compared to background regions. Experiments are presented on the Kodak PhotoCD dataset and the MIT Saliency Benchmark dataset, in which our algorithm achieves higher visual quality for the same compressed size.

5. Theis L, Shi W, Cunningham A, et al. Lossy Image Compression with Compressive Autoencoders[J]. International Conference on Learning Representations (ICLR), 2017.

Paper title: Lossy Image Compression with Compressive Autoencoders

Paper link: https://arxiv.org/abs/1703.00395

Open-source code: https://github.com/alexandru-dinu/cae

Tools: PyTorch

Summary:

We propose a new approach to the problem of optimizing autoencoders for lossy image compression. New media formats, changing hardware technology, as well as diverse requirements and content types create a need for compression algorithms which are more flexible than existing codecs. Autoencoders have the potential to address this need, but are difficult to optimize directly due to the inherent non-differentiability of the compression loss. We here show that minimal changes to the loss are sufficient to train deep autoencoders competitive with JPEG 2000 and outperforming recently proposed approaches based on RNNs. Our network is furthermore computationally efficient thanks to a sub-pixel architecture, which makes it suitable for high-resolution images. This is in contrast to previous work on autoencoders for compression using coarser approximations, shallower architectures, computationally expensive methods, or focusing on small images.

6. Haimeng Zhao, Peiyuan Liao. CAE-ADMM: Implicit Bitrate Optimization via ADMM-based Pruning in Compressive Autoencoders.

Paper link: https://arxiv.org/abs/1901.07196

Year: 2019

Open-source code: https://github.com/JasonZHM/CAE-ADMM

Tools: uses PyTorch and msssim

Summary:

We introduce ADMM-pruned Compressive AutoEncoder (CAE-ADMM) that uses the Alternating Direction Method of Multipliers (ADMM) to optimize the trade-off between distortion and efficiency of lossy image compression. Specifically, ADMM in our method is to promote sparsity to implicitly optimize the bitrate, different from entropy estimators used in the previous research. The experiments on public datasets show that our method outperforms the original CAE and some traditional codecs in terms of SSIM/MS-SSIM metrics, at reasonable inference speed.

7. Ballé J, Laparra V, Simoncelli E P, et al. End-to-end Optimized Image Compression[J]. International Conference on Learning Representations (ICLR), 2017.

Paper link: https://arxiv.org/abs/1611.01704

Open-source code: https://github.com/tensorflow/compression

Notes: improved TensorFlow version

Summary:

We describe an image compression method, consisting of a nonlinear analysis transformation, a uniform quantizer, and a nonlinear synthesis transformation. The transforms are constructed in three successive stages of convolutional linear filters and nonlinear activation functions. Unlike most convolutional neural networks, the joint nonlinearity is chosen to implement a form of local gain control, inspired by those used to model biological neurons. Using a variant of stochastic gradient descent, we jointly optimize the entire model for rate-distortion performance over a database of training images, introducing a continuous proxy for the discontinuous loss function arising from the quantizer. Under certain conditions, the relaxed loss function may be interpreted as the log likelihood of a generative model, as implemented by a variational autoencoder. Unlike these models, however, the compression model must operate at any given point along the rate-distortion curve, as specified by a trade-off parameter. Across an independent set of test images, we find that the optimized method generally exhibits better rate-distortion performance than the standard JPEG and JPEG 2000 compression methods. More importantly, we observe a dramatic improvement in visual quality for all images at all bit rates, which is supported by objective quality estimates using MS-SSIM.
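The abstract mentions a continuous proxy for the quantizer and a trade-off parameter that selects a point on the rate-distortion curve. A minimal sketch of both follows. It assumes the proxy is additive uniform noise in [-0.5, 0.5] applied during training and that the objective has the form rate + lambda * distortion; the function names `quantize`, `noisy_proxy`, and `rd_loss` are hypothetical, and none of this is taken from the linked repository.

```python
import math
import random


def quantize(y):
    """Uniform quantizer with unit step (round half up), used at test time."""
    return [float(math.floor(v + 0.5)) for v in y]


def noisy_proxy(y, rng):
    """Training-time stand-in: add uniform noise of one quantization bin width."""
    return [v + rng.uniform(-0.5, 0.5) for v in y]


def rd_loss(rate_bits, distortion, lam):
    """Rate-distortion objective: lam sets the trade-off between the two terms."""
    return rate_bits + lam * distortion


rng = random.Random(0)
latents = [0.4, 1.6, -0.7]
hard = quantize(latents)          # [0.0, 2.0, -1.0]
soft = noisy_proxy(latents, rng)  # differs from latents by at most 0.5 per entry
loss = rd_loss(rate_bits=2.0, distortion=0.5, lam=4.0)  # 4.0
```

Because the noisy values vary smoothly with the latents, gradients can flow through them during training, while the hard quantizer is used when actually encoding.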

8. Eirikur Agustsson, Michael Tschannen, Fabian Mentzer, Radu Timofte, Luc Van Gool. Generative Adversarial Networks for Extreme Learned Image Compression.

Paper link: https://arxiv.org/abs/1804.02958

Year: 2018

Open-source code: https://github.com/Justin-Tan/generative-compression

Notes: GAN (Generative Adversarial Networks), TensorFlow 1.8

Summary:

We propose a framework for extreme learned image compression based on Generative Adversarial Networks (GANs), obtaining visually pleasing images at significantly lower bitrates than previous methods. This is made possible through our GAN formulation of learned compression combined with a generator/decoder which operates on the full-resolution image and is trained in combination with a multi-scale discriminator. Additionally, if a semantic label map of the original image is available, our method can fully synthesize unimportant regions in the decoded image such as streets and trees from the label map, therefore only requiring the storage of the preserved region and the semantic label map. A user study confirms that for low bitrates, our approach is preferred to state-of-the-art methods, even when they use more than double the bits.

9. Akbari M, Liang J, Han J, et al. DSSLIC: Deep Semantic Segmentation-based Layered Image Compression[J]. arXiv: Computer Vision and Pattern Recognition, 2018.

Paper link: https://arxiv.org/abs/1806.03348

Open-source code: https://github.com/makbari7/DSSLIC

Notes: Ubuntu 16.04, Python 2.7, CUDA 8.0, PyTorch 0.3.0

Summary:

Deep learning has revolutionized many computer vision fields in the last few years, including learning-based image compression. In this paper, we propose a deep semantic segmentation-based layered image compression (DSSLIC) framework in which the semantic segmentation map of the input image is obtained and encoded as the base layer of the bit-stream. A compact representation of the input image is also generated and encoded as the first enhancement layer. The segmentation map and the compact version of the image are then employed to obtain a coarse reconstruction of the image. The residual between the input and the coarse reconstruction is additionally encoded as another enhancement layer. Experimental results show that the proposed framework outperforms the H.265/HEVC-based BPG and other codecs in both PSNR and MS-SSIM metrics across a wide range of bit rates in RGB domain. Besides, since semantic segmentation map is included in the bit-stream, the proposed scheme can facilitate many other tasks such as image search and object-based adaptive image compression.

10. Implemented the K-means algorithm with Octave and Python for image compression.

Open-source code: https://github.com/Wrinth/Image-Compression-with-K-Means-Clustering

Notes: MATLAB

Summary:

In this project, I implement the K-means algorithm and use it for image compression. I first started on an example 2D dataset (data.mat) to help me gain an intuition of how the K-means algorithm works. After that, I used the K-means algorithm for image compression by reducing the number of colors that occur in an image to only those that are most common in that image.
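The color-reduction step described here can be sketched in plain Python. This is an illustrative version, not code from the linked repository: pixels are assumed to be a flat list of RGB tuples, the initial centroids are taken at evenly spaced positions in that list (a deterministic simplification of the usual random initialization), and the name `kmeans_colors` is hypothetical.

```python
def kmeans_colors(pixels, k, iters=10):
    """Cluster RGB tuples into k palette colors with Lloyd's iterations."""

    def nearest(px, centroids):
        # Index of the centroid with the smallest squared Euclidean distance.
        return min(
            range(len(centroids)),
            key=lambda j: sum((p - c) ** 2 for p, c in zip(px, centroids[j])),
        )

    # Deterministic initialization: evenly spaced pixels (assumes len(pixels) >= k).
    centroids = [tuple(float(c) for c in pixels[i * len(pixels) // k]) for i in range(k)]
    for _ in range(iters):
        labels = [nearest(px, centroids) for px in pixels]
        for j in range(k):
            members = [px for px, lab in zip(pixels, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    labels = [nearest(px, centroids) for px in pixels]  # final assignment
    return centroids, labels


pixels = [(250, 10, 10), (255, 0, 0), (0, 0, 250), (5, 5, 255), (245, 5, 5), (0, 10, 245)]
palette, labels = kmeans_colors(pixels, k=2)
compressed = [palette[i] for i in labels]  # every pixel replaced by its palette color
```

Storing the small palette plus one index per pixel, instead of a full RGB triple per pixel, is where the size reduction comes from.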

II. Traditional coding methods

1. Dropbox Lepton

Blog: https://blogs.dropbox.com/tech/2016/07/lepton-image-compression-saving-22-losslessly-from-images-at-15mbs/

Year: 2016

Open-source code: https://github.com/dropbox/lepton

Notes: used for recompressing images that are already in JPEG format

Summary: Lepton is a tool and file format for losslessly compressing JPEGs by an average of 22%. This can be used to archive large photo collections, or to serve images live and save 22% bandwidth.

2. Free lossless image format

Source code: https://github.com/FLIF-hub/FLIF

Summary:

FLIF is a lossless image format based on MANIAC compression. MANIAC (Meta-Adaptive Near-zero Integer Arithmetic Coding) is a variant of CABAC (context-adaptive binary arithmetic coding), where the contexts are nodes of decision trees which are dynamically learned at encode time.

FLIF outperforms PNG, FFV1, lossless WebP, lossless BPG and lossless JPEG2000 in terms of compression ratio. Moreover, FLIF supports a form of progressive interlacing (essentially a generalization/improvement of PNG's Adam7) which means that any prefix (e.g. partial download) of a compressed file can be used as a reasonable lossy encoding of the entire image.

For more information on FLIF, visit https://flif.info

Notes: FLIF is a lossless image format based on MANIAC (a variant of CABAC coding); it outperforms PNG, BPG, JPEG 2000, and others.

3. Google guetzli

Source code: https://github.com/google/guetzli

Summary: Guetzli is a JPEG encoder that aims for excellent compression density at high visual quality. Guetzli-generated images are typically 20-30% smaller than images of equivalent quality generated by libjpeg. Guetzli generates only sequential (nonprogressive) JPEGs due to the faster decompression speeds they offer.

Notes: secondary compression for the JPEG format
