Compressing and decompressing files with Python's gzip library
import gzip
Compressing an existing text file with gzip
# Opening and closing both files explicitly
f_in = open("hello.txt", "rb")
f_out = gzip.open("hello1.gz", "wb")
f_out.write(f_in.read())
f_in.close()
f_out.close()
# Using with-statements so both files are closed automatically
with open("hello.txt", "rb") as f_in:
    with gzip.open("hello2.gz", "wb") as f_out:
        f_out.write(f_in.read())

# Using the gzip.GzipFile constructor directly
f_gzip = gzip.GzipFile("hello4.gz", "wb")
f_in = open("hello.txt", "rb")
f_gzip.write(f_in.read())
f_in.close()
f_gzip.close()
# Opening the source file in text mode ("r") makes read() return a str, which fails:
f_in = open("hello.txt", "r")
f_out = gzip.open("hello1.gz", "wb")
f_out.write(f_in.read())
f_in.close()
f_out.close()
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-27-18e6a62b051d> in <module>()
1 f_in = open("hello.txt","r")
2 f_out = gzip.open("hello1.gz","wb")
----> 3 f_out.write(f_in.read())
4 f_in.close()
5 f_out.close()
C:\ProgramData\Anaconda3\lib\gzip.py in write(self, data)
258 else:
259 # accept any data that supports the buffer protocol
--> 260 data = memoryview(data)
261 length = data.nbytes
262
TypeError: memoryview: a bytes-like object is required, not 'str'
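As the traceback shows, even a text file has to be opened in rb mode before it is compressed: the write method of a gzip file object opened in binary mode requires a bytes-like object, not a str.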
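If the text is already a str (for example because the file was read in "r" mode), there are two ways around the TypeError: encode the string to bytes before writing, or open the gzip file in text mode ("wt") so that gzip.open does the encoding itself. The sketch below shows both; the helper names compress_text and compress_text_mode are hypothetical, and UTF-8 is an assumed encoding.

```python
import gzip


def compress_text(text: str, gz_path: str, encoding: str = "utf-8") -> None:
    """Encode the str to bytes, then write it to a gzip file opened in binary mode."""
    with gzip.open(gz_path, "wb") as f_out:
        f_out.write(text.encode(encoding))


def compress_text_mode(text: str, gz_path: str, encoding: str = "utf-8") -> None:
    """Write the str to a gzip file opened in text mode ("wt")."""
    # In text mode gzip.open wraps the file so that str is accepted directly.
    with gzip.open(gz_path, "wt", encoding=encoding) as f_out:
        f_out.write(text)
```

Either function can be called as compress_text("hello,boy", "boy.gz"). Both produce a file that decompresses back to the same text.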
Compressing an existing binary file with gzip
with open("下载.jpg", "rb") as f_in:
    with gzip.open("pic.gz", "wb") as f_out:
        f_out.write(f_in.read())
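So compressing a text file and compressing a binary image with gzip work in exactly the same way: open the source file in rb mode, create a gzip file object, and write the bytes returned by the source file's read method into the gzip file object.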
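One caveat with this pattern is that f_in.read() loads the whole source file into memory. For large files, a possible variation is to copy the data in chunks with shutil.copyfileobj. This is a minimal sketch of that idea, and gzip_file is a hypothetical helper name.

```python
import gzip
import shutil


def gzip_file(src_path: str, dst_path: str) -> None:
    """Compress src_path into dst_path without reading the whole file at once."""
    with open(src_path, "rb") as f_in, gzip.open(dst_path, "wb") as f_out:
        # copyfileobj reads from f_in and writes to f_out chunk by chunk.
        shutil.copyfileobj(f_in, f_out)
```

It works the same way for text files and images, for example gzip_file("hello.txt", "hello.gz").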
Compressing a data stream
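Assuming that "data stream" here means bytes held in memory rather than a file on disk, one way to compress them is to pass an io.BytesIO buffer to gzip.GzipFile through its fileobj parameter. The helper name compress_to_stream is hypothetical.

```python
import gzip
import io


def compress_to_stream(data: bytes) -> io.BytesIO:
    """Return an in-memory buffer holding the gzip-compressed form of data."""
    buffer = io.BytesIO()
    # Closing the GzipFile flushes the gzip trailer but leaves the buffer open.
    with gzip.GzipFile(fileobj=buffer, mode="wb") as gz:
        gz.write(data)
    buffer.seek(0)  # rewind so the caller can read from the start
    return buffer
```

The returned buffer can be read like any binary file, or its contents can be taken with getvalue().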
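Decompressing files with gzip
Decompression comes in two forms: decompressing a regular file and decompressing a data stream. Decompressing a regular file is simply the reverse of compressing one.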
# Using gzip.open: read from the gzip file, write the decompressed bytes out
with gzip.open("hello.gz", "rb") as f_in:
    with open("hello3.txt", "wb") as f_out:
        f_out.write(f_in.read())

# Using gzip.GzipFile: reading from the gzip file object decompresses the data
f_gzip = gzip.GzipFile("hello4.gz", "rb")
f_out = open("hello5.txt", "wb")
f_out.write(f_gzip.read())
f_out.close()
f_gzip.close()
Decompressing a data stream
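Again assuming that a "data stream" is any binary file-like object (an io.BytesIO buffer, for instance), a minimal sketch of decompressing it wraps the stream in gzip.GzipFile via fileobj and reads it in chunks. The generator name decompress_stream and the 64 KiB default chunk size are illustrative choices.

```python
import gzip
import io
from typing import BinaryIO, Iterator


def decompress_stream(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield decompressed chunks from a binary stream that holds gzip data."""
    with gzip.GzipFile(fileobj=stream, mode="rb") as gz:
        while True:
            chunk = gz.read(chunk_size)
            if not chunk:  # an empty bytes object signals the end of the data
                break
            yield chunk


# Usage: build a gzip stream in memory, then decompress it piece by piece.
source = io.BytesIO(gzip.compress(b"hello,girl" * 1000))
restored = b"".join(decompress_stream(source))
```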
Compressing data with gzip
# gzip.open treats "w" as binary mode, so writing a str fails:
str_ins = "hello,boy"
with gzip.open("boy.gz", "w") as f:
    f.write(str_ins)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-28-5c9516f9b131> in <module>()
1 str_ins = "hello,boy"
2 with gzip.open("boy.gz","w") as f:
----> 3 f.write(str_ins)
C:\ProgramData\Anaconda3\lib\gzip.py in write(self, data)
258 else:
259 # accept any data that supports the buffer protocol
--> 260 data = memoryview(data)
261 length = data.nbytes
262
TypeError: memoryview: a bytes-like object is required, not 'str'
# Writing a bytes object works:
str_b = b"hello,girl"
with gzip.open("girl.gz", "w") as f:
    f.write(str_b)
So a text string (str) cannot be compressed directly this way; it has to be converted to a byte string (bytes) first.
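When no file is needed at all, the conversion can be combined with the module-level functions gzip.compress and gzip.decompress, which work on bytes in memory. The following is a small sketch of a round trip; UTF-8 is an assumed encoding.

```python
import gzip

text = "hello,boy"

# str -> bytes -> gzip-compressed bytes
compressed = gzip.compress(text.encode("utf-8"))

# gzip-compressed bytes -> bytes -> str
restored = gzip.decompress(compressed).decode("utf-8")

print(restored == text)
```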
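To sum up, there are three common ways to compress and decompress, the third of which uses GzipFile().
GzipFile() is the constructor of the GzipFile class in the gzip library. It provides a gzip file object: data written to this object is compressed, and data read from this object is decompressed.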
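This behaviour can be checked with a short experiment: write some bytes through a GzipFile, look at what actually landed on disk, and read them back. The sketch below uses a temporary directory and a made-up, highly repetitive payload, so the file name and data are illustrative only.

```python
import gzip
import os
import tempfile

payload = b"hello,gzip\n" * 1000  # repetitive data, so it compresses well

with tempfile.TemporaryDirectory() as tmp_dir:
    gz_path = os.path.join(tmp_dir, "demo.gz")

    # Writing to the GzipFile object compresses the data.
    with gzip.GzipFile(gz_path, "wb") as gz:
        gz.write(payload)

    # Reading the file with plain open() shows the compressed bytes on disk.
    with open(gz_path, "rb") as raw:
        on_disk = raw.read()

    # Reading through a GzipFile object decompresses the data again.
    with gzip.GzipFile(gz_path, "rb") as gz:
        restored = gz.read()

print(len(payload), len(on_disk))   # original size vs. compressed size
print(on_disk[:2] == b"\x1f\x8b")   # gzip files start with the magic bytes 1f 8b
print(restored == payload)
```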