Using QuickLZ to Reduce Program Size

qinwsq

Published 2024-02-06 14:08:09

Tags: algorithms

Original link: https://www.cnblogs.com/oloroso/p/9712473.html

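I am currently learning QuickLZ and found this author's write-up very good, so I reposted it here (it will be removed on request if it infringes).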
https://www.cnblogs.com/oloroso/p/9712473.html

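Overview#

The requirement was this: a program I wrote embeds a very large file (the epsg.io.json file scraped from epsg.io). Even after filtering and trimming, the file is still 12MB, so embedding it directly makes the compiled program very large.
The program is a dynamic library, and dynamic libraries packed with upx sometimes misbehave, so compressing with upx was ruled out.
QuickLZ looked like a good fit, so I used it to compress the data, embedded the compressed data in the program, and decompress it before use. With this change the program shrank from 12MB to under 1.5MB, a clear improvement.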
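The "decompress before use" step can be done lazily, so its cost is paid once, on first access. The following is a minimal sketch of that idea, not the original author's code: `kPacked`, `decode`, `g_decode_calls` and `embedded_text` are hypothetical names, and `decode` is a placeholder that simply copies the bytes where a real build would call the QuickLZ-based decompression routine.

```cpp
#include <cstddef>
#include <string>

// Stand-in for the embedded compressed array (made-up contents).
static const unsigned char kPacked[] = {'e', 'p', 's', 'g'};

// Counts decoder runs so the one-time behaviour can be observed.
static int g_decode_calls = 0;

// Placeholder decoder: a real build would call the QuickLZ-based
// decompression routine here instead of copying the bytes.
static std::string decode(const unsigned char* data, std::size_t size) {
    ++g_decode_calls;
    return std::string(reinterpret_cast<const char*>(data), size);
}

// Decodes on the first call only. C++11 guarantees that a function-local
// static is initialised exactly once, even with concurrent callers.
const std::string& embedded_text() {
    static const std::string plain = decode(kPacked, sizeof kPacked);
    return plain;
}
```

Callers use `embedded_text()` wherever the original file content is needed; later calls return the cached string without decoding again.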

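Compression and decompression code#

The http://www.quicklz.com/ website documents the use of QuickLZ in some detail, and ready-made bindings exist for various programming languages.
For more, see GitHub - robottwo/quicklz: A clone of QuickLZ (the FAST compression library)

Compression code#

The compression code is simple. Because I only compress text here, the compression ratio is fairly high: the output is about 12% of the original size.

The compression code is as follows: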

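	`#include <cstring>`
	`#include <string>`
	`#include "quicklz.h"`

	`// Compress the string src in 4096-byte chunks; returns the qlz-encoded packets back to back`
	`std::string quicklz_compress(const std::string& src)`
	`{`
	`    qlz_state_compress state;`
	`    memset(&state, 0, sizeof(qlz_state_compress));`
	`    std::string dst;`
	`    // the destination must hold at least the chunk size + 400 bytes`
	`    char buffer[4096 + 1024];`
	`    for (size_t pos = 0; pos < src.size(); pos += 4096) {`
	`        size_t len = src.size() - pos;`
	`        len = len > 4096 ? 4096 : len;`
	`        // len becomes the size of the compressed packet`
	`        len = qlz_compress(src.data() + pos, buffer, len, &state);`
	`        dst.append(buffer, len);`
	`    }`
	`    return dst;`
	`}`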

Below is the compression code from quiz.c, for reference

	`#include <stdio.h>`
	`#include <stdlib.h>`
	`#include <string.h>`
	`#include "quicklz.h"`

	`#define MAX_BUF_SIZE (1024*1024)`
	`#define BUF_BUFFER 400`
	`#define bool int`
	`#define true 1`
	`#define false 0`

	`int stream_compress(FILE *ifile, FILE *ofile)`
	`{`
	`    char *file_data, *compressed;`
	`    size_t d, c, fd_size, compressed_size;`
	`    qlz_state_compress *state_compress = (qlz_state_compress *)malloc(sizeof(qlz_state_compress));`

	`    fd_size = MAX_BUF_SIZE;`
	`    file_data = (char*) malloc(fd_size);`

	`    // allocate MAX_BUF_SIZE + BUF_BUFFER bytes for the destination buffer`
	`    compressed_size = MAX_BUF_SIZE + BUF_BUFFER;`
	`    compressed = (char*) malloc(compressed_size);`

	`    // allocate and initially zero out the states. After this, make sure it is`
	`    // preserved across calls and never modified manually`
	`    memset(state_compress, 0, sizeof(qlz_state_compress));`

	`    // compress the file using MAX_BUF_SIZE packets.`
	`    while((d = fread(file_data, 1, MAX_BUF_SIZE, ifile)) != 0)`
	`    {`
	`        c = qlz_compress(file_data, compressed, d, state_compress);`

	`        // the buffer "compressed" now contains c bytes which we could have sent directly to a`
	`        // decompressing site for decompression`
	`        fwrite(compressed, c, 1, ofile);`
	`    }`

	`    free(state_compress);`
	`    free(compressed);`
	`    free(file_data);`
	`    return 0;`
	`}`
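Both listings make the destination buffer larger than the input chunk. QuickLZ's manual asks for a destination of at least the source size plus 400 bytes, because incompressible data can grow slightly; that is what `BUF_BUFFER` encodes. Here is a small sketch of the sizing rule, where `kQlzMargin` and `qlz_dest_capacity` are hypothetical names and not part of the library:

```cpp
#include <cstddef>

// Margin QuickLZ asks for on top of the input size (the BUF_BUFFER value).
constexpr std::size_t kQlzMargin = 400;

// Hypothetical helper: minimum destination capacity for one qlz_compress call.
constexpr std::size_t qlz_dest_capacity(std::size_t source_size) {
    return source_size + kQlzMargin;
}

// The 4096-byte chunks paired with a 4096 + 1024 byte buffer satisfy the rule.
static_assert(qlz_dest_capacity(4096) <= 4096 + 1024, "chunk buffer too small");
```

Checking the margin at compile time catches the mistake of raising the chunk size without also enlarging the buffer.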

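Decompression code#

Decompression is fast and has almost no effect on program run time; it is much faster than reading the file.
The decompression code is as follows: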

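	`#include <cstring>`
	`#include <string>`
	`#include "quicklz.h"`

	`// Decompress a sequence of back-to-back qlz packets`
	`std::string quicklz_decompress(const std::string& qlzdata)`
	`{`
	`    qlz_state_decompress state;`
	`    memset(&state, 0, sizeof(qlz_state_decompress));`
	`    std::string dst;`
	`    for (size_t pos = 0; pos < qlzdata.size(); ) {`
	`        // size of this compressed packet`
	`        size_t co_size = qlz_size_compressed(qlzdata.data() + pos);`
	`        // size of this packet after decompression`
	`        size_t de_size = qlz_size_decompressed(qlzdata.data() + pos);`
	`        // stop on a truncated or corrupt packet instead of looping forever`
	`        if (co_size == 0 || co_size > qlzdata.size() - pos) break;`
	`        std::string buffer(de_size, 0);`
	`        qlz_decompress(qlzdata.data() + pos, &buffer[0], &state);`
	`        pos += co_size;`
	`        dst.append(buffer);`
	`    }`
	`    return dst;`
	`}`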
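`qlz_size_compressed` and `qlz_size_decompressed` only read the packet header, which is why the loop can hop from one packet to the next. The sketch below reads that header without the library. It assumes the QuickLZ 1.5.x layout as I understand it from quicklz.c: a flags byte, then the compressed size and the decompressed size, each 4 bytes little-endian when bit 1 of the flags is set and 1 byte otherwise. Treat that layout as an assumption and prefer the library functions in real code. `PacketInfo`, `read_le`, `read_packet_header` and `list_packets` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct PacketInfo {
    std::size_t header_size;   // 3 or 9 bytes under the assumed layout
    std::size_t compressed;    // whole packet, header included
    std::size_t decompressed;  // size after decompression
};

// Reads n little-endian bytes starting at p.
static std::uint32_t read_le(const unsigned char* p, std::size_t n) {
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < n; ++i) v |= std::uint32_t(p[i]) << (8 * i);
    return v;
}

// Assumed layout: flags byte, compressed size, decompressed size.
PacketInfo read_packet_header(const unsigned char* p) {
    const std::size_t n = (p[0] & 2) ? 4 : 1;
    return PacketInfo{2 * n + 1, read_le(p + 1, n), read_le(p + 1 + n, n)};
}

// Walks a buffer of back-to-back packets and returns each packet's sizes.
std::vector<PacketInfo> list_packets(const std::string& data) {
    std::vector<PacketInfo> out;
    std::size_t pos = 0;
    while (pos + 3 <= data.size()) {
        const unsigned char* p =
            reinterpret_cast<const unsigned char*>(data.data()) + pos;
        if ((p[0] & 2) && pos + 9 > data.size()) break;  // truncated long header
        const PacketInfo info = read_packet_header(p);
        // Stop on a corrupt or truncated packet.
        if (info.compressed < info.header_size ||
            pos + info.compressed > data.size()) break;
        out.push_back(info);
        pos += info.compressed;
    }
    return out;
}
```

Summing the `decompressed` fields gives the total output size up front, so the destination string could be reserved once instead of growing with each packet.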

Below is the decompression code from quiz.c, for reference

	`int stream_decompress(FILE *ifile, FILE *ofile)`
	`{`
	`    char *file_data, *decompressed;`
	`    size_t d, c, dc, fd_size, d_size;`
	`    qlz_state_decompress *state_decompress = (qlz_state_decompress *)malloc(sizeof(qlz_state_decompress));`

	`    // a compressed packet can be at most MAX_BUF_SIZE + BUF_BUFFER bytes if it`
	`    // was compressed with this program.`
	`    fd_size = MAX_BUF_SIZE + BUF_BUFFER;`
	`    file_data = (char*) malloc(fd_size);`

	`    // allocate decompression buffer`
	`    d_size = fd_size - BUF_BUFFER;`
	`    decompressed = (char*) malloc(d_size);`

	`    // allocate and initially zero out the scratch buffer. After this, make sure it is`
	`    // preserved across calls and never modified manually`
	`    memset(state_decompress, 0, sizeof(qlz_state_decompress));`

	`    // read 9-byte header to find the size of the entire compressed packet, and`
	`    // then read remaining packet`
	`    while((c = fread(file_data, 1, 9, ifile)) != 0)`
	`    {`
	`        // Do we need a bigger decompressed buffer? If the file was compressed`
	`        // with segments larger than the default in this program.`
	`        dc = qlz_size_decompressed(file_data);`
	`        if (dc > d_size) {`
	`            free(decompressed);`
	`            d_size = dc;`
	`            decompressed = (char*)malloc(d_size);`
	`        }`

	`        // Do we need a bigger compressed buffer? realloc keeps the header`
	`        // bytes that were already read into file_data.`
	`        c = qlz_size_compressed(file_data);`
	`        if (c > fd_size) {`
	`            fd_size = c;`
	`            file_data = (char*)realloc(file_data, fd_size);`
	`        }`

	`        fread(file_data + 9, 1, c - 9, ifile);`
	`        d = qlz_decompress(file_data, decompressed, state_decompress);`
	`        fwrite(decompressed, d, 1, ofile);`
	`    }`

	`    free(decompressed);`
	`    free(state_decompress);`
	`    free(file_data);`
	`    return 0;`
	`}`

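Generating C array source code from a binary file#

The code above compresses and decompresses qlz data, but the data still has to be turned into a C-style array, so I wrote a small program to do the conversion: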

	`#include <stdio.h>`
	`#include <stdlib.h>`
	`#include <string.h>`


	`// Convert a binary file into a C array definition: prog infile outfile`
	`int main(int c,char** v)`
	`{`
	`    if(c != 3){`
	`        printf("Usage:%s infile outfile\n",v[0]);`
	`        return 0;`
	`    }`
	`    FILE* fin = fopen(v[1],"rb");`
	`    if(!fin){`
	`        printf("Error:%s Open Failed\n",v[1]);`
	`        return 1;`
	`    }`
	`    FILE* fout = fopen(v[2],"wb");`
	`    if(!fout){`
	`        printf("Error:%s Open Failed\n",v[2]);`
	`        fclose(fin);`
	`        return 1;`
	`    }`
	`    size_t len = 0;`
	`    unsigned char buffer[16];`
	`    char strbuffer[1024] = "const unsigned char carr_xxx[] = {";`

	`    fwrite(strbuffer,1,strlen(strbuffer),fout);`
	`    // emit 16 bytes per output line, each as a decimal value followed by a comma`
	`    while((len = fread(buffer,1 ,sizeof buffer,fin)) != 0){`
	`        strbuffer[0] = '\n';`
	`        strbuffer[1] = '\t';`
	`        for(size_t i = 0, offset = 2; i < len; ++i) {`
	`            offset += sprintf(&strbuffer[offset],"%hhu,",buffer[i]);`
	`        }`
	`        fwrite(strbuffer,1,strlen(strbuffer),fout);`
	`    }`
	`    // if any data was written, step back so the trailing comma gets overwritten`
	`    if(strbuffer[0] != 'c'){`
	`        fseek(fout,-1, SEEK_CUR);`
	`    }`
	`    strcpy(strbuffer,"\n};\n");`
	`    fwrite(strbuffer,1,strlen(strbuffer),fout);`
	`    fclose(fout);`
	`    fclose(fin);`
	`    return 0;`
	`}`
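The generated file defines `carr_xxx`. Compressed data routinely contains zero bytes, so the array has to be wrapped with an explicit length rather than treated as a C string. Below is a minimal sketch, with a tiny made-up array standing in for the real generated one; `embedded_bytes` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <string>

// Stand-in for the generated array; the real one would hold the qlz data.
const unsigned char carr_xxx[] = {
	1,0,2,0,255
};

// Copies the array into a std::string using its full length, so embedded
// zero bytes are preserved instead of terminating the string early.
std::string embedded_bytes() {
    return std::string(reinterpret_cast<const char*>(carr_xxx), sizeof carr_xxx);
}
```

The resulting string can then be handed to `quicklz_decompress` from the decompression listing.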