TensorFlow Series: Writing TFRecord Data

Latest recommended article published on 2024-04-28 22:50:14

qq924178473

Categories: Deep Learning - Practice, Programming Languages. Tags: tensorflow tfrecord python spark dataframe

Article link: https://blog.csdn.net/h_jlwg6688/article/details/116753698

Concepts covered:

Example
Tensor
SequenceExample
Feature
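
To make the difference between Example and SequenceExample concrete, here is a minimal sketch, not taken from the original article. It assumes TensorFlow 2.x; the helper `int64_step` and the choice of `featureD` as context and `featureH` as the per-step sequence are illustrative assumptions. An Example holds a flat map of Features, while a SequenceExample adds `feature_lists`, where each key maps to an ordered list of Features (one per time step).

```python
import tensorflow as tf


def int64_step(values):
    """Hypothetical helper: wrap one time step of ints in a tf.train.Feature."""
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))


# Context features apply to the whole record; feature lists hold one Feature per step.
seq_example = tf.train.SequenceExample(
    context=tf.train.Features(feature={
        "featureD": tf.train.Feature(int64_list=tf.train.Int64List(value=[1, 2])),
    }),
    feature_lists=tf.train.FeatureLists(feature_list={
        "featureH": tf.train.FeatureList(
            feature=[int64_step([1, 2, 3]), int64_step([4, 5, 6])]),
    }),
)

# Parse it back: context and sequence features are declared separately.
context, sequences = tf.io.parse_single_sequence_example(
    seq_example.SerializeToString(),
    context_features={
        "featureD": tf.io.FixedLenFeature(shape=(2,), dtype=tf.int64),
    },
    sequence_features={
        "featureH": tf.io.FixedLenSequenceFeature(shape=(3,), dtype=tf.int64),
    },
)

context_d = context["featureD"].numpy().tolist()
steps = sequences["featureH"].numpy().tolist()
```

Here `steps` has one row per time step, so its first dimension can vary from record to record while each row keeps 3 elements.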

Writing methods covered:

python
spark scala
spark dataframe

Data types written:

int64
float32
string
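
The three data types above correspond to the three list types a `tf.train.Feature` can carry: `Int64List`, `FloatList`, and `BytesList`. The following is a minimal sketch under the assumption of TensorFlow 2.x; the helper names `int64_feature`, `float_feature`, and `bytes_feature` are hypothetical and do not come from the original article.

```python
import tensorflow as tf


def int64_feature(values):
    """Wrap an iterable of ints as an int64_list Feature."""
    return tf.train.Feature(int64_list=tf.train.Int64List(value=list(values)))


def float_feature(values):
    """Wrap an iterable of floats as a float_list Feature (stored as float32)."""
    return tf.train.Feature(float_list=tf.train.FloatList(value=list(values)))


def bytes_feature(values):
    """Wrap strings as a bytes_list Feature; str values are UTF-8 encoded first."""
    encoded = [v.encode("utf-8") if isinstance(v, str) else v for v in values]
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=encoded))


# Build one Example that uses all three data types.
example = tf.train.Example(features=tf.train.Features(feature={
    "featureA": bytes_feature(["hello"]),
    "featureB": float_feature([1.5]),
    "featureD": int64_feature([3, 4]),
}))

# Serialized bytes are what a TFRecord writer stores as one record.
serialized = example.SerializeToString()

# Decode the protobuf again to inspect what was stored.
decoded = tf.train.Example.FromString(serialized)
```

Note that there is no dedicated string type in the protobuf: strings are stored as bytes, which is why they come back as `bytes` objects after parsing.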

Feature types written:

VarLenFeature
SparseFeature
FixedLenFeature

import tensorflow as tf

feature_schema = {
    # featureA: string feature with 1 element
    "featureA": tf.io.FixedLenFeature(shape=(1,), dtype=tf.string, default_value="null"),
    # featureB: numeric feature with 1 element
    "featureB": tf.io.FixedLenFeature(shape=(1,), dtype=tf.float32, default_value=0.0),
    # featureC: string feature with 3 elements
    "featureC": tf.io.FixedLenFeature(shape=(3,), dtype=tf.string, default_value=["null", "null", "null"]),
    # featureD: numeric feature with 2 elements
    "featureD": tf.io.FixedLenFeature(shape=(2,), dtype=tf.int64, default_value=[0, 0]),
    # featureE: variable-length string feature
    "featureE": tf.io.VarLenFeature(dtype=tf.string),
    # featureF: variable-length numeric feature
    "featureF": tf.io.VarLenFeature(dtype=tf.float32),
    # featureEwhight: variable-length numeric feature (presumably the weights for featureE)
    "featureEwhight": tf.io.VarLenFeature(dtype=tf.float32),
    # featureG: string sequence feature, 2 elements per step
    "featureG": tf.io.FixedLenSequenceFeature(shape=(2,), dtype=tf.string, allow_missing=True, default_value=None),
    # featureH: numeric sequence feature, 3 elements per step
    "featureH": tf.io.FixedLenSequenceFeature(shape=(3,), dtype=tf.int64, allow_missing=True, default_value=None),
    # featureI: sparse string feature with dense shape 21 * 4 * 10
    "featureI": tf.io.SparseFeature(index_key=["featureI_Index0", "featureI_Index1", "featureI_Index2"],
                                    value_key="featureI_value", dtype=tf.string, size=[21, 4, 10], already_sorted=False)
}
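
To show how such a schema is used when reading, the sketch below writes one record and parses it back with a three-key subset of the schema. It is an illustration under assumptions, not the article's own code: it assumes TensorFlow 2.x, writes to a temporary directory, and the sample values are made up. `featureB` is deliberately left out of the record so that the `default_value` of its `FixedLenFeature` takes effect.

```python
import os
import tempfile

import tensorflow as tf

# A subset of the schema: two fixed-length features and one variable-length one.
schema = {
    "featureB": tf.io.FixedLenFeature(shape=(1,), dtype=tf.float32, default_value=[0.0]),
    "featureD": tf.io.FixedLenFeature(shape=(2,), dtype=tf.int64, default_value=[0, 0]),
    "featureE": tf.io.VarLenFeature(dtype=tf.string),
}

# featureB is omitted on purpose, so parsing falls back to its default value.
example = tf.train.Example(features=tf.train.Features(feature={
    "featureD": tf.train.Feature(int64_list=tf.train.Int64List(value=[7, 8])),
    "featureE": tf.train.Feature(bytes_list=tf.train.BytesList(value=[b"x", b"y", b"z"])),
}))

# Write the serialized Example as a single TFRecord entry.
path = os.path.join(tempfile.mkdtemp(), "demo.tfrecord")
with tf.io.TFRecordWriter(path) as writer:
    writer.write(example.SerializeToString())

# Read the file back and parse each record against the schema.
dataset = tf.data.TFRecordDataset(path).map(
    lambda record: tf.io.parse_single_example(record, schema))
parsed = next(iter(dataset))

feature_b = parsed["featureB"].numpy().tolist()
feature_d = parsed["featureD"].numpy().tolist()
# VarLenFeature yields a SparseTensor; densify it for inspection.
feature_e = tf.sparse.to_dense(parsed["featureE"], default_value="").numpy().tolist()
```

A `FixedLenFeature` without a `default_value` would instead raise an error for a record that lacks the key, which is one reason the schema above supplies defaults.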

1. Writing TFRecord with Python

    # TensorFlow2.x
    writer = tf.io.TFRecordWriter("./tfrecord")

    example_1 = tf.train.Example(features=tf.train.Features(feature={
        # the data must have exactly 1 element
        "featureA": tf.train.Feature(bytes_list=tf.train.BytesList(v
