Python byte stream handling: decompressing a byte stream in Python?

Here is the situation:

I get gzipped XML documents from Amazon S3:

import boto
from boto.s3.connection import S3Connection
from boto.s3.key import Key

conn = S3Connection('access Id', 'secret access key')
b = conn.get_bucket('mydev.myorg')
k = Key(b)
# key is an attribute on the Key object, so assign to it rather than call it
k.key = 'documents/document.xml.gz'

I write them to a file and read them back like this:

import gzip

# open in binary mode: the downloaded data is gzip-compressed bytes
f = open('/tmp/p', 'wb')
k.get_file(f)
f.close()

r = gzip.open('/tmp/p', 'rb')
file_content = r.read()
r.close()

Question

How can I unzip the streams directly and read the contents?

I do not want to create temp files; they don't look good.

Solution

Yes, you can use the zlib module to decompress byte streams:

import zlib

def stream_gzip_decompress(stream):
    # 32 + MAX_WBITS: let zlib detect and parse the gzip (or zlib) header itself
    dec = zlib.decompressobj(32 + zlib.MAX_WBITS)
    for chunk in stream:
        rv = dec.decompress(chunk)
        if rv:
            yield rv

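Adding 32 to the window-size argument tells zlib to detect the header automatically, so it accepts gzip (or zlib) input and consumes the gzip header instead of treating it as compressed data.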
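To see the header detection in isolation, here is a small sketch that needs no S3 access. It assumes Python 3; the sample payload, the chunk size of 16 bytes, and the helper names `iter_chunks` and `decompress_chunks` are made up for illustration and are not part of the answer above.

```python
import gzip
import zlib


def iter_chunks(data, size):
    """Yield successive slices of `data`, standing in for a network stream."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def decompress_chunks(chunks):
    """Decompress an iterable of gzip or zlib byte chunks into one bytes object."""
    # 32 + MAX_WBITS lets zlib detect a gzip or zlib header on its own.
    dec = zlib.decompressobj(32 + zlib.MAX_WBITS)
    parts = [dec.decompress(chunk) for chunk in chunks]
    parts.append(dec.flush())  # collect anything still buffered
    return b"".join(parts)


original = b"<doc><item>hello</item></doc>" * 100
gz_payload = gzip.compress(original)    # gzip container
zlib_payload = zlib.compress(original)  # zlib container

# The same decompressor settings handle both containers, fed 16 bytes at a time.
from_gzip = decompress_chunks(iter_chunks(gz_payload, 16))
from_zlib = decompress_chunks(iter_chunks(zlib_payload, 16))
```

Both results should equal `original`, even though the header is split across chunk boundaries, because the decompressor object keeps its state between calls.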

The S3 key object is an iterator, so you can do:

for data in stream_gzip_decompress(k):
    # do something with the decompressed data
    pass
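If the documents are small enough to hold in memory, a simpler route may be to wrap the downloaded bytes in a file-like object and let the gzip module read from it. The sketch below assumes Python 3 and that the compressed bytes have already been fetched (with boto 2, `k.get_contents_as_string()` should return them); `read_gzipped_bytes` is a hypothetical helper name, and the sample payload is a stand-in for the S3 object.

```python
import gzip
import io


def read_gzipped_bytes(compressed):
    """Return the decompressed content of an in-memory gzip payload."""
    # BytesIO gives GzipFile a file-like object, so nothing touches the disk.
    with gzip.GzipFile(fileobj=io.BytesIO(compressed), mode="rb") as gz:
        return gz.read()


# Stand-in for the bytes downloaded from S3.
sample = gzip.compress(b"<doc>example</doc>")
content = read_gzipped_bytes(sample)
```

The trade-off is memory: this holds the whole compressed object and the whole result at once, whereas the generator above only ever holds one chunk.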
