Python Standard Library: File Decompression (zip, gz, pkl, tar)

五道口纳什

Article link: https://blog.csdn.net/lanchunhui/article/details/62889760

1. zip ⇒ zipfile

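import zipfile

# Extract every member of the archive into ../data/ (.txt.zip becomes .txt)
with zipfile.ZipFile('../data/jaychou_lyrics.txt.zip', 'r') as zin:
    zin.extractall('../data/')

# Read the extracted text file
with open('../data/jaychou_lyrics.txt') as f:
    text = f.read()
    ...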

# Two ways to open an archive; the with form closes it automatically
f = zipfile.ZipFile(filename)
with zipfile.ZipFile(filename) as f:
    ...

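Basic methods of a ZipFile object:

f.namelist() ⇒ a list of the names of the archive members;
- extracting a zip archive does not necessarily yield just one file;
f.read(f.namelist()[0]) ⇒ the contents of the first member, as bytes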
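
As a minimal sketch of these two methods, the snippet below builds a tiny archive in memory and then lists and reads its members. The member names (a.txt, b.txt) and their contents are made up for illustration.

```python
import io
import zipfile

# Build a small archive in memory so the example needs no files on disk
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zout:
    zout.writestr('a.txt', 'hello')
    zout.writestr('b.txt', 'world')

# Reopen the same bytes for reading
buffer.seek(0)
with zipfile.ZipFile(buffer) as f:
    names = f.namelist()        # member names, in the order they were written
    first = f.read(names[0])    # contents of the first member, as bytes

print(names)
print(first)
```

Since read() returns bytes, text members still need to be decoded before they can be treated as str.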

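Parsing a zip file

# Uses a helper from TensorFlow (tf.compat.as_str)
import zipfile
import tensorflow as tf

def parse_data(filename):
    # Read the first member of the archive and split it into whitespace-separated tokens
    with zipfile.ZipFile(filename) as f:
        data = tf.compat.as_str(f.read(f.namelist()[0])).split()
    return data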
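
If TensorFlow is not installed, roughly the same result can be had with the standard library alone. The sketch below assumes the member is UTF-8 text; the function name parse_data_plain, the encoding parameter, and the sample archive are hypothetical and not part of the original code.

```python
import os
import tempfile
import zipfile

def parse_data_plain(filename, encoding='utf-8'):
    """Return the whitespace-separated tokens of the first archive member."""
    with zipfile.ZipFile(filename) as f:
        raw = f.read(f.namelist()[0])      # bytes
    # Decode explicitly instead of relying on a TensorFlow helper
    return raw.decode(encoding).split()

# Usage with a throwaway archive (file names are made up)
with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, 'words.zip')
    with zipfile.ZipFile(archive, 'w') as zout:
        zout.writestr('words.txt', 'the quick brown fox')
    tokens = parse_data_plain(archive)

print(tokens)
```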

2. pkl ⇒ pickle

Core API
- pickle.dump
- pickle.load

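Dump the variables that hold the data to a local file

import pickle

pickle_file = 'data.pkl'
# X and y are assumed to be defined earlier
with open(pickle_file, 'wb') as f:
    save = {'X': X, 'y': y}             # stored as a dictionary
    # The object to pickle comes first, the file object second
    pickle.dump(save, f, protocol=pickle.HIGHEST_PROTOCOL)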

Using pickle.dump - TypeError: must be str, not bytes

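When dumping variables to a local output file, the file must be opened in binary mode ('wb') for pickle's dump method to work; a file opened in text mode triggers the TypeError above: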
```
with open(filename, 'wb') as fp:
    pickle.dump(s, fp)
```
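
The counterpart, pickle.load, also needs the file opened in binary mode ('rb'). A minimal round-trip sketch follows; the sample dictionary and the temporary file location are made up for illustration.

```python
import os
import pickle
import tempfile

save = {'X': [[0.0, 1.0], [1.0, 0.0]], 'y': [1, 0]}   # made-up sample data

with tempfile.TemporaryDirectory() as tmp:
    pickle_file = os.path.join(tmp, 'data.pkl')

    # Write: binary mode, object first and file object second
    with open(pickle_file, 'wb') as fp:
        pickle.dump(save, fp, protocol=pickle.HIGHEST_PROTOCOL)

    # Read back: binary mode again
    with open(pickle_file, 'rb') as fp:
        restored = pickle.load(fp)

print(restored == save)
```

Unpickling can execute arbitrary code, so pickle.load should only be used on files from a trusted source.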

3. pkl ⇒ pickle, gz ⇒ gzip

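The two are combined when reading and parsing .pkl.gz files, such as the well-known mnist.pkl.gz (the standard handwritten digit dataset).

First open the file with fp = gzip.open(...),
then call pickle.load(fp)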

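import gzip
import pickle

def load_data():
    # gzip.open decompresses transparently; pickle.load reads from the resulting stream.
    # If the file was pickled under Python 2, Python 3 may need pickle.load(fp, encoding='latin1').
    with gzip.open('./mnist.pkl.gz', 'rb') as fp:
        training_data, valid_data, test_data = pickle.load(fp)
    return training_data, valid_data, test_data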
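
To try the gzip plus pickle combination without downloading MNIST, here is a small round-trip sketch. The three-tuple below is a made-up stand-in for the training, validation and test splits, and toy.pkl.gz is a hypothetical file name.

```python
import gzip
import os
import pickle
import tempfile

# Stand-ins for the three splits; a real dataset file would hold arrays
splits = (([0, 1], [0, 1]), ([2], [0]), ([3], [1]))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'toy.pkl.gz')

    # Write: pickle straight into a gzip-compressed stream
    with gzip.open(path, 'wb') as fp:
        pickle.dump(splits, fp)

    # Read: the same two steps in reverse
    with gzip.open(path, 'rb') as fp:
        training_data, valid_data, test_data = pickle.load(fp)

print(training_data, valid_data, test_data)
```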

4. tgz ⇒ tarfile

Core API
- open
- extractall

import tarfile

# filename (the archive) and path (the target directory) are placeholders
with tarfile.open(filename) as tar:
    tar.extractall(path=path)
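
The two calls can be exercised end to end with a throwaway archive. In this sketch the names hello.txt, sample.tgz and out are made up, and the 'w:gz' and 'r:gz' modes select gzip compression explicitly.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a file and pack it into a gzip-compressed tar archive
    source = os.path.join(tmp, 'hello.txt')
    with open(source, 'w') as fh:
        fh.write('hello tar')

    archive = os.path.join(tmp, 'sample.tgz')
    with tarfile.open(archive, 'w:gz') as tar:
        tar.add(source, arcname='hello.txt')

    # Extract into a separate directory
    out_dir = os.path.join(tmp, 'out')
    os.makedirs(out_dir)
    with tarfile.open(archive, 'r:gz') as tar:
        members = tar.getnames()
        tar.extractall(path=out_dir)

    with open(os.path.join(out_dir, 'hello.txt')) as fh:
        extracted_text = fh.read()

print(members)
print(extracted_text)
```

As with zip files, archives from untrusted sources deserve caution, since member paths can point outside the target directory.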
