Reading binary files in Python

Latest recommended article published on 2023-02-23 10:46:22

weixin_39847732

Tags: reading binary files in Python

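Why discuss reading binary files? Because the data we work with is not always audio or images. Sometimes we need to save and load ordinary data. For that I recommend a very good library: pickle. It saves an object exactly as it is (for an array it also records metadata such as shape and dtype) and loads it back unchanged.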

import pickle

import numpy as np

a = np.array([[1,2,3],[4,5,6],[7,8,9],[11,12,13]]).astype('float32')

# Serialize the array, including its shape and dtype, to a.bin
with open('a.bin', 'wb') as config_f:
    pickle.dump(a, config_f)

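This saves an array. Be very careful not to confuse pickle.dump with pickle.dumps: dump writes to an open file object, while dumps returns the serialized data as a bytes object.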
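The difference between pickle.dump and pickle.dumps is easy to miss, so here is a minimal sketch that uses only the standard library. The sample object `record` and the in-memory buffer are illustrative assumptions, not part of the original example.

```python
import io
import pickle

# Hypothetical sample object, used only for illustration.
record = {"name": "a", "values": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]}

# pickle.dumps returns the serialized object as a bytes value.
payload = pickle.dumps(record)

# pickle.dump writes the serialized object to a binary file-like object
# and returns None; io.BytesIO stands in for a file opened with 'wb'.
buffer = io.BytesIO()
result = pickle.dump(record, buffer)

# Each route has its matching loader: loads for bytes, load for files.
restored_from_bytes = pickle.loads(payload)
buffer.seek(0)
restored_from_file = pickle.load(buffer)

print(type(payload).__name__, result)
print(restored_from_bytes == record, restored_from_file == record)
```

Passing a file object to dumps, or expecting dump to return the data, are the usual mistakes this distinction causes.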

import pickle

# Load the array back; shape and dtype are restored automatically
with open('a.bin', 'rb') as f_in:
    C = pickle.load(f_in)

That reads the array back. Surprised? It really is that convenient!

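This is an important dividing line. Everything below handles binary files with tf (TensorFlow), which is somewhat troublesome. Honestly, I really dislike reading binary files with tf; it is a lot of hassle.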

-----------------------------------------------------------------------------------------------

Writing a binary file

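import numpy as np

a = np.array([1,2]).astype(np.float32)

# Write the array's raw bytes; tobytes() is the current name for the deprecated tostring()
with open('a.bin', 'wb') as fp:
    fp.write(a.tobytes())

Opening the resulting binary file shows the bytes that were written. The same check can be repeated after changing a to [1,2,3].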

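This shows that the binary file was written successfully. Note that a must be a numpy array.

open creates the file automatically when it does not exist (in 'wb' mode).

tostring(): returns the array's raw data as a bytes object (a byte string), which is what gets written to the file. In current NumPy it is a deprecated alias of tobytes().

The data is saved row by row, in order (row-major order).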
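To make "raw bytes" concrete, the following sketch packs the same two float32 values with the standard-library struct module instead of NumPy. The little-endian format string `"<2f"` is an assumption that matches the byte order of typical x86 and ARM machines; the variable names are illustrative.

```python
import struct

# Pack 1.0 and 2.0 as two little-endian 32-bit floats, the same layout
# that a float32 array [1, 2] has in memory on a little-endian machine.
raw = struct.pack("<2f", 1.0, 2.0)

print(len(raw))   # 2 values * 4 bytes each
print(raw.hex())  # hexadecimal view of the written bytes

# Unpacking with the same format recovers the values.
values = struct.unpack("<2f", raw)
print(values)
```

The file holds nothing except these value bytes, so the reader has to know the element type and count in advance.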

import tensorflow as tf  # TensorFlow 1.x queue-based input API

files = ['a.bin']
filename_queue = tf.train.string_input_producer(files, num_epochs=1)
reader = tf.WholeFileReader()
_, value = reader.read(filename_queue)
value = tf.decode_raw(value, tf.float32)  # dtype must match the one used when writing

sv = tf.train.Supervisor()
with sv.managed_session() as sess:
    while True:
        try:
            features = sess.run(value)
            print(features)
        except tf.errors.OutOfRangeError:
            # The queue is exhausted after one epoch, so stop reading
            break

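tf.train.Supervisor(): create the session with this. With a plain tf.Session() you have to initialize the local variables and start the queue runners yourself, which is more trouble.

tf.decode_raw(value, tf.float32): the dtype must match the one used when the data was stored.

Reading does not preserve the shape; the values are simply read in order. For example, data stored with shape (3,2) comes back with shape (6,).

tf.WholeFileReader(): reads the entire given file in one go.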
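The loss of shape is a property of the raw file format rather than of TensorFlow. The sketch below shows the same effect with NumPy alone; the temporary directory and the file name `demo.bin` are hypothetical and used only for illustration.

```python
import os
import tempfile

import numpy as np

original = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.float32)  # shape (3, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.bin")  # hypothetical file name

    # Write the raw bytes; no shape or dtype is stored in the file.
    with open(path, "wb") as fp:
        fp.write(original.tobytes())

    # Read everything back. The dtype must be supplied by the reader.
    flat = np.fromfile(path, dtype=np.float32)

print(flat.shape)  # the 2-D shape is gone

# The caller has to restore the shape explicitly.
restored = flat.reshape(3, 2)
print(np.array_equal(restored, original))
```

Reading with a different dtype than the one used for writing would not raise an error; it would silently reinterpret the bytes as different numbers.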

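tf has another way to read: tf.FixedLengthRecordReader(record_bytes).

① When a file's byte length is smaller than the given length, the file is skipped.

② When a file is longer than the given length, each read returns only record_bytes bytes, starting from the beginning of the file. A single read never returns the rest; later reads continue with the following records. Overall, it is not as smart as one might expect.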
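The fixed-length idea can be imitated in plain Python. This is only an illustration of splitting data into records of a fixed size, not TensorFlow's implementation; the helper name `read_fixed_records` is hypothetical. Each step yields one complete record, and a trailing chunk shorter than `record_bytes` is dropped. How many records a given pipeline actually consumes depends on how many reads it performs.

```python
import io
import struct


def read_fixed_records(stream, record_bytes):
    """Yield complete records of record_bytes bytes; drop a short tail."""
    while True:
        chunk = stream.read(record_bytes)
        if len(chunk) < record_bytes:
            return  # end of data, or a partial record that is discarded
        yield chunk


record_bytes = 2 * 3 * 4  # 2 rows * 3 columns * 4 bytes per float32

# 12 float32 values (48 bytes): two complete records.
long_data = struct.pack("<12f", *range(1, 13))
# 3 float32 values (12 bytes): shorter than one record.
short_data = struct.pack("<3f", 70, 80, 90)

long_records = [
    struct.unpack("<6f", record)
    for record in read_fixed_records(io.BytesIO(long_data), record_bytes)
]
short_records = list(read_fixed_records(io.BytesIO(short_data), record_bytes))

print(long_records[0])     # values of the first record
print(len(long_records))   # number of complete records
print(len(short_records))  # the short input yields nothing
```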

import os

import numpy as np

# Create the directories used below if they do not exist yet
os.makedirs('./etc_t/SF1', exist_ok=True)
os.makedirs('./etc_t/TM3', exist_ok=True)

a = np.array([[1,2,3],[4,5,6],[7,8,9],[11,12,13]]).astype('float32')
c = np.array([[70,80,90]]).astype('float32')
with open('./etc_t/SF1/a.bin', 'wb') as fp:
    fp.write(a.tobytes())  # tobytes() replaces the deprecated tostring()
with open('./etc_t/SF1/c.bin', 'wb') as fp:
    fp.write(c.tobytes())

b = np.array([[10,20,30],[40,50,60]]).astype('float32')
with open('./etc_t/TM3/b.bin', 'wb') as fp:
    fp.write(b.tobytes())

I created three files: a, b and c. The fixed record length I read is 2 * 3 * 4 bytes (4 is the number of bytes in one float32 value).
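The arithmetic behind that record length can be checked with a few lines of standard-library Python. The value counts follow from the array shapes (4, 3), (2, 3) and (1, 3) used for a, b and c; the variable names are illustrative.

```python
import struct

FLOAT32_BYTES = struct.calcsize("<f")  # bytes in one float32 value
record_bytes = 2 * 3 * FLOAT32_BYTES   # 2 rows per record, 3 values per row

# Number of float32 values in each file, taken from the array shapes.
value_counts = {"a.bin": 12, "b.bin": 6, "c.bin": 3}

complete_records = {}
for name, count in value_counts.items():
    file_bytes = count * FLOAT32_BYTES
    # Integer division gives the number of whole records in the file.
    complete_records[name] = file_bytes // record_bytes

print(record_bytes)
print(complete_records)
```

A file with zero complete records, such as c.bin here, contributes nothing to a fixed-length read.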

import tensorflow as tf  # TensorFlow 1.x queue-based input API

batch_size = 2
record_bytes = batch_size * 3 * 4  # 2 rows * 3 float32 values * 4 bytes each

files_SF1 = tf.gfile.Glob('./etc_t/SF1/*.bin')
files_TM3 = tf.gfile.Glob('./etc_t/TM3/*.bin')

filename_queue = tf.train.string_input_producer(files_SF1)
reader = tf.FixedLengthRecordReader(record_bytes)
_, value = reader.read(filename_queue)
value = tf.decode_raw(value, tf.float32)

filename_queue2 = tf.train.string_input_producer(files_TM3)
reader = tf.FixedLengthRecordReader(record_bytes)
_, value2 = reader.read(filename_queue2)
value2 = tf.decode_raw(value2, tf.float32)

sv = tf.train.Supervisor()
# Open the session once and read a single record from each queue
with sv.managed_session() as sess:
    v1R, v2R = sess.run([value, value2])
    v1R = v1R.reshape(-1, 3, 1)
    v2R = v2R.reshape(-1, 3, 1)
    print(v1R)
    print(v2R)

Output:

[[[ 1.]
  [ 2.]
  [ 3.]]

 [[ 4.]
  [ 5.]
  [ 6.]]]
[[[ 10.]
  [ 20.]
  [ 30.]]

 [[ 40.]
  [ 50.]
  [ 60.]]]

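As you can see, file c is skipped entirely, and only the first 6 numbers of file a are read here even though it holds 12. The remaining 6 form the next record and would only come back on a further read.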

All things considered, reading binary files with tf still turns out to be the most convenient.

