TensorFlow: commonly used snippets

NorthFish北海有鱼

Last modified 2022-07-19 16:02:53

Category: tensorflow. Tags: tensorflow, deep learning, python

First published 2021-03-17 20:27:44

Article link: https://blog.csdn.net/MaYingColdPlay/article/details/114948084

L1/L2 regularization

Reference: Using L2 regularization in TensorFlow to reduce overfitting (秦伟H's blog, CSDN)

Reference: "TensorFlow: a summary of ways to add regularization" (叠加态的猫, cnblogs)

net = tf.contrib.layers.fully_connected(
    inputs=net,
    num_outputs=num_outputs,
    activation_fn=activation_fn,
    weights_regularizer=tf.contrib.layers.l2_regularizer(l2_reg),
    partitioner=tf.fixed_size_partitioner(partition_size),
    scope='mlp_%s' % scope)

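Adding regularization to a fully connected layer. Reference: Section 16, using the wrapper library tf.contrib.layers (大奥特曼打小怪兽, cnblogs)

Adding regularization to the loss function. Reference: "TensorFlow: a summary of ways to add regularization" (叠加态的猫, cnblogs)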
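
As a rough illustration of what "adding the regularizer to the loss" amounts to, here is a plain-Python sketch with no TensorFlow dependency. It assumes the penalty has the form scale * sum(w^2) / 2, which is what tf.contrib.layers.l2_regularizer computes through tf.nn.l2_loss. The function names and the numbers are made up for the example.

```python
def l2_penalty(weights, scale):
    """L2 penalty in the tf.nn.l2_loss convention: scale * sum(w ** 2) / 2."""
    return scale * sum(w * w for w in weights) / 2.0


def total_loss(data_loss, weights, scale):
    """Data loss plus the weighted L2 penalty."""
    return data_loss + l2_penalty(weights, scale)


weights = [1.0, -2.0, 3.0]                    # hypothetical layer weights
penalty = l2_penalty(weights, scale=0.5)      # 0.5 * (1 + 4 + 9) / 2
loss = total_loss(1.0, weights, scale=0.5)    # hypothetical data loss of 1.0
print(penalty, loss)
```

A larger scale pushes the weights toward zero more strongly, which is the lever used against overfitting.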

Error case. Reference: tensorflow - regarding the ValueError: If `inputs` don't all have same shape and dtype or the shape - Stack Overflow

with tf.name_scope

Reference: How to use tf.variable_scope and tf.name_scope (UESTC_C2_403's blog, CSDN)

Any op created inside the with block gets the scope name as a prefix of its own name.

tensorboard

Reference: Training neural networks with TensorFlow | Calvin's Marbles

Evaluating a tensor

with tf.Session() as sess:
    # tf.initialize_all_variables() is deprecated; this is its replacement.
    sess.run(tf.global_variables_initializer())
    print(sess.run([labels_scene_market]))

Turning a row into a column within a batch

labels = features["mylabels"][:,0]
labels = tf.expand_dims(labels, -1)
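
To make the shape change concrete, here is a plain-Python sketch of the same two steps on nested lists. The batch values are hypothetical; only the shapes matter.

```python
# A batch of 3 examples, each with 2 label slots (hypothetical values).
batch = [[1.0, 9.0], [0.0, 8.0], [1.0, 7.0]]

# Same role as features["mylabels"][:, 0]: keep the first column, shape [3].
first_col = [row[0] for row in batch]

# Same role as tf.expand_dims(labels, -1): add a trailing axis, shape [3, 1].
as_column = [[v] for v in first_col]

print(first_col)
print(as_column)
```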

TrainSpec and EvalSpec

Reference: tf.estimator.train_and_evaluate explained (黑暗星球's blog, CSDN)

if 'max_steps' in self.params:
    train_spec = tf.estimator.TrainSpec(
        input_fn=lambda: data_load.input_fn(self.batch_size),
        max_steps=self.params['max_steps'])
else:
    train_spec = tf.estimator.TrainSpec(
        input_fn=lambda: data_load.input_fn(self.batch_size))

# Mind the `steps` setting for evaluation. With steps=None, evaluation runs
# until input_fn is exhausted, which can take a long time. steps=1 keeps it
# short, at the cost of evaluating on a single batch only.
valid_spec = tf.estimator.EvalSpec(
    input_fn=lambda: data_load.input_fn(self.batch_size),
    steps=1,
    start_delay_secs=10,
    throttle_secs=60,
)

tf.estimator.train_and_evaluate(estimator, train_spec, valid_spec)
tf.logging.info("Optimization Finished!")

data filter

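# features_origin = tf.parse_example(line, self.feature_spec)
features_origin = tf.parse_single_example(line, self.feature_spec)
keeps = features_origin['dp']
# In graph mode these print calls show each tensor's name, shape and dtype,
# not its values.
print('keeps')
print(keeps)
one_img = tf.expand_dims(keeps, -1)  # add a trailing axis (if keeps is 1-D: [n] -> [n, 1])
print('keeps 1')
print(one_img)
aa = tf.transpose(one_img, [1, 0])   # swap the two axes ([n, 1] -> [1, n])
print(aa)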

tf.unique_with_counts

Reference: TensorFlow function: tf.unique_with_counts (w3cschool)

tf.where is used further below to set the embedding of ids that were originally 0 to all zeros.

# tensor 'x' is [1, 1, 2, 4, 4, 4, 7, 8, 8]
y, idx, count = tf.unique_with_counts(x)
y ==> [1, 2, 4, 7, 8]
idx ==> [0, 0, 1, 2, 2, 2, 3, 4, 4]
count ==> [2, 1, 3, 1, 2]
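
The semantics in the example above can be reproduced with a few lines of plain Python. This is only a sketch of the behaviour (unique values in first-seen order, an index back into them, and a count per unique value), not how TensorFlow implements the op.

```python
def unique_with_counts(xs):
    """Return (y, idx, count) for a 1-D sequence.

    y     : unique values in order of first appearance
    idx   : for every input element, its position in y
    count : how many times each value of y occurs in the input
    """
    y, idx, count = [], [], []
    position = {}  # value -> index in y
    for x in xs:
        if x not in position:
            position[x] = len(y)
            y.append(x)
            count.append(0)
        idx.append(position[x])
        count[position[x]] += 1
    return y, idx, count


y, idx, count = unique_with_counts([1, 1, 2, 4, 4, 4, 7, 8, 8])
print(y, idx, count)
```

Note that the input can be rebuilt as [y[i] for i in idx], which is what makes the "look up unique ids once, then expand" trick in the next snippet work.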


# Look up one embedding per unique id. embedding_lookup_hashtable and wnd_v
# are defined elsewhere and are not shown in this post.
deep_embeds = tf.reshape(
    embedding_lookup_hashtable(
        wnd_v[key], deep_feat_unique_ids, is_training=is_training,
        serving_default_value=tf.zeros([key], tf.float32)),
    shape=[-1, key])
# Force the embedding of id 0 to all zeros.
deep_embeds = tf.where(
    tf.not_equal(deep_feat_unique_ids, tf.zeros_like(deep_feat_unique_ids)),
    deep_embeds,
    tf.zeros_like(deep_embeds))
# deep input, id feature
deep_inputs_single = tf.reshape(
    tf.nn.embedding_lookup(deep_embeds, deep_feat_unique_index),
    shape=[-1, len_cat * key])
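
The flow of the snippet above (look up each unique id once, zero out id 0, then expand back to batch order) can be sketched in plain Python. The table contents, the ids and the dimension are hypothetical, and a dict stands in for the hash-table lookup.

```python
dim = 2
# Hypothetical embedding table: id -> vector. Id 0 deliberately holds junk.
table = {0: [9.0, 9.0], 5: [0.1, 0.2], 7: [0.3, 0.4]}

ids = [5, 0, 7, 5]                           # flattened feature ids of a batch
unique_ids = list(dict.fromkeys(ids))        # unique ids, first-seen order
index = [unique_ids.index(i) for i in ids]   # position of each id in unique_ids

# One lookup per unique id.
embeds = [table[i] for i in unique_ids]
# Same role as tf.where(tf.not_equal(ids, 0), embeds, zeros): id 0 -> zero vector.
embeds = [e if i != 0 else [0.0] * dim for i, e in zip(unique_ids, embeds)]
# Same role as tf.nn.embedding_lookup(embeds, index): expand to batch order.
gathered = [embeds[j] for j in index]

print(gathered)
```

Id 5 occurs twice in the batch but is looked up only once, which is the point of deduplicating before the lookup.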

tf.greater

Reference: tf.greater(a, b) (放下扳手&拿起键盘's blog, CSDN)

dense_input_1 = tf.multiply(deep_dense, w1)
dense_input_2 = tf.where(tf.greater(deep_dense, 1e-6 * tf.ones_like(deep_dense)),
                         tf.zeros_like(deep_dense), w2)
dense_input = dense_input_1 + dense_input_2
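
Element by element, the snippet above does the following. One way to read it (an interpretation, not something the original states) is that w2 acts as a learned default for dense features whose value is missing or zero. The numbers below are hypothetical.

```python
def dense_input(x, w1, w2, eps=1e-6):
    """Scalar version of the tf.multiply / tf.where / tf.greater combination."""
    scaled = x * w1                      # dense_input_1
    fallback = 0.0 if x > eps else w2    # dense_input_2: w2 only where x <= eps
    return scaled + fallback


present = dense_input(2.0, w1=0.5, w2=0.3)   # value present: only x * w1 counts
missing = dense_input(0.0, w1=0.5, w2=0.3)   # value absent: falls back to w2
print(present, missing)
```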

tf.gather

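tf.nn.embedding_lookup is built on top of tf.gather. (tf.gather is a function rather than a class, so strictly speaking nothing is inherited.)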

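Passing config to an Estimator

Reference: TensorFlow Estimator explained

The config argument is a tf.estimator.RunConfig instance. It holds settings for devices, for controlling the training loop, and for distributed training.

The params argument is a plain Python dict that can carry any user-defined values. It is forwarded to model_fn and input_fn when those functions declare a params argument.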

Distributed TensorFlow

Reference: Distributed TensorFlow (part 1) | Lynna's Blog

Reference: Distributed TensorFlow (跟着大数据和AI去旅行's blog, CSDN)

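Areas to look at when tuning distributed performance:

1. Use hash embeddings.

Reference: KDD 2020 | Facebook's compositional embedding method for large-scale recommender systems (Tencent Cloud developer community), on how embeddings are stored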

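Ways to build the embedding:

The first way stores the embedding as an ordinary variable (reference: "Embedding operations in TensorFlow explained", Zhihu). In a parameter-server (ps) setup the whole variable lives on a single ps, so performance is limited.

The second way adds a partition strategy on top of the first (reference: "The partition_strategy argument of tf.nn.embedding_lookup explained", DFann's blog, CSDN). TensorFlow currently supports two strategies, div and mod, which removes the limitation of the first way. partition_strategy only takes effect when len(params) > 1. In the first way len(params) = 1, because a single tensor is passed in, and a tensor with n rows and n columns is still one tensor.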
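
The difference between the two strategies is easiest to see on a small case. The sketch below assigns ids to partitions in plain Python, following the rule described in the tf.nn.embedding_lookup documentation (mod: id % number of partitions; div: contiguous ranges, with the first partitions taking one extra id when the split is uneven). The helper name partition_ids is made up for the example.

```python
def partition_ids(num_ids, num_partitions, strategy):
    """Return, for each partition, the list of ids stored in it."""
    parts = [[] for _ in range(num_partitions)]
    if strategy == "mod":
        # Round-robin: id i goes to partition i % num_partitions.
        for i in range(num_ids):
            parts[i % num_partitions].append(i)
    elif strategy == "div":
        # Contiguous ranges; the first `extra` partitions hold one more id.
        base, extra = divmod(num_ids, num_partitions)
        start = 0
        for p in range(num_partitions):
            size = base + (1 if p < extra else 0)
            parts[p] = list(range(start, start + size))
            start += size
    else:
        raise ValueError("strategy must be 'mod' or 'div'")
    return parts


mod_parts = partition_ids(13, 5, "mod")
div_parts = partition_ids(13, 5, "div")
print(mod_parts)
print(div_parts)
```

With 13 ids and 5 partitions, mod spreads neighbouring ids across different partitions, while div keeps neighbouring ids together.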

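The third way (which I do not fully understand) maintains a hash table on the server side and maps ids to memory through that table. This has three benefits:

1. ids do not need to be ordered; 2. embeddings can be inserted and deleted dynamically; 3. there is no need to initialize every feature embedding up front, since a row is stored only when it is needed, which saves a great deal of space for sparse features.

In other words, the third way stores the embedding table as key-value pairs instead of as one large dense matrix. The drawback of the large matrix is wasted memory.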
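
A toy version of the key-value idea, in plain Python, might look like the sketch below. It is not the API of any TensorFlow build; the class name, the initializer range and the keys are all made up. It only shows the three properties listed above: arbitrary keys, dynamic insert and delete, and rows created on first use.

```python
import random


class HashEmbedding:
    """Key-value embedding store: a row is created the first time a key is seen."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self.rows = {}                      # key -> embedding vector
        self.rng = random.Random(seed)

    def lookup(self, key):
        if key not in self.rows:
            # Lazy initialization: nothing is allocated for unseen keys.
            self.rows[key] = [self.rng.uniform(-0.05, 0.05) for _ in range(self.dim)]
        return self.rows[key]

    def remove(self, key):
        self.rows.pop(key, None)


emb = HashEmbedding(dim=4)
v1 = emb.lookup(10 ** 12)      # a huge, unordered id costs a single row
v2 = emb.lookup("user_42")     # keys do not even have to be integers here
emb.remove("user_42")          # rows can be dropped again
print(len(emb.rows))
```

A dense matrix indexed directly by id would need on the order of 10^12 rows to hold the first key, which is the memory waste the text refers to.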

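Hash collisions in the embedding matrix (reference: EmbeddingVariable, Machine Learning PAI, Alibaba Cloud)

Reference: Probability of hash collisions (悠悠吾心666's blog, CSDN). When the hash encoding is done on the samples, the situation described in that blog post does not arise.

Hash-encode the categorical features in the samples first (reference: MurmurHash consistent hashing, Java version, 潜水生活's blog, CSDN)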
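
For a feel of how likely collisions are, the standard birthday approximation can be evaluated directly. The sketch assumes a hash that spreads keys uniformly over the buckets, which real hash functions only approximate; the key counts and the 32-bit bucket space are arbitrary examples.

```python
import math


def collision_probability(n_keys, n_buckets):
    """Birthday approximation: P(at least one collision) ~ 1 - exp(-n(n-1) / (2m))."""
    return 1.0 - math.exp(-n_keys * (n_keys - 1) / (2.0 * n_buckets))


p_small = collision_probability(1000, 2 ** 32)        # a thousand keys, 32-bit hash
p_large = collision_probability(1_000_000, 2 ** 32)   # a million keys, 32-bit hash
print(p_small, p_large)
```

With a 32-bit hash a collision somewhere is very unlikely for a thousand keys and practically certain for a million, so the number of distinct feature values matters when choosing the hash width.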

2. How data is distributed

Reference: Introduction to the TensorFlow Dataset API (AI算法-图哥's blog, CSDN)

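3. Set partitioning parameters for the DNN layers

Split the parameters according to the number of ps.

Reference: Parameter partitioning in TensorFlow (shuai_wow's blog, CSDN)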

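4. Enable seastar in the config

Reference: Distributed optimization for asynchronous training of large-scale sparse models in TensorFlow (case studies, tf.wiki community)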

serving

Reference: Fast production deployment of machine learning models with TensorFlow Serving + Docker + Tornado (Juejin)

Sequence processing

File I/O interface

https://www.tensorflow.org/api_docs/python/tf/io/gfile

if job_name == "chief" and task_index == 0:
    if not tf.gfile.Exists(model_dir):
        tf.gfile.MakeDirs(model_dir)
        tf.gfile.MakeDirs(model_dir + "/profiler")

Modifying values in a tensor

import tensorflow as tf

# https://icode.best/i/08630344346204


tensor_input = tf.constant(list(range(20)))
tensor_input = tf.reshape(tensor_input, [4, 5])

with tf.Session() as sess:
    print(sess.run([tensor_input]))
# Clip every value greater than 8 down to 8:
# tensor_input = tf.where(tf.greater(tensor_input,8),8,tensor_input)
# Replace every element equal to 3 with 3333. tf.where_v2 broadcasts the scalar.
tensor_input = tf.where_v2(tf.equal(tensor_input, 3), 3333, tensor_input)
with tf.Session() as sess:
    print(sess.run([tensor_input]))
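
The element-wise selection done by the where call can be checked without a session using nested lists. This is a plain-Python sketch of the semantics only; the helper name where is chosen to mirror the TensorFlow call.

```python
# Same values as the 4 x 5 tensor above: 0..19 laid out row by row.
rows = [[r * 5 + c for c in range(5)] for r in range(4)]


def where(cond, x, y):
    """Scalar version of the three-argument where: x if cond holds, else y."""
    return x if cond else y


# Replace every element equal to 3 with 3333.
replaced = [[where(v == 3, 3333, v) for v in row] for row in rows]
# The commented-out variant: clip every value greater than 8 down to 8.
clipped = [[where(v > 8, 8, v) for v in row] for row in rows]

print(replaced[0])
print(clipped)
```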