[TensorFlow] (6) What tf.feature_column.categorical_column_with_vocabulary_file() does and how to use it

凝眸伏笔

Article link: https://blog.csdn.net/pearl8899/article/details/107967905

1. Purpose

When the vocabulary is large, the list of words can be read directly from a file instead of being written out in code.

Inputs:

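`key`	A unique string identifying the input feature. It is used as the column name and as the dictionary key for feature parsing configs, feature `Tensor` objects, and feature columns.
`vocabulary_file`	The vocabulary file name.
`vocabulary_size`	Number of elements in the vocabulary. This must be no greater than the length of `vocabulary_file`; if it is less, later values are ignored. If None, it is set to the length of `vocabulary_file`.
`dtype`	The type of the features. Only string and integer types are supported.
`default_value`	The integer ID to return for out-of-vocabulary feature values; defaults to `-1`. This cannot be specified together with a positive `num_oov_buckets`.
`num_oov_buckets`	Non-negative integer, the number of out-of-vocabulary buckets. All out-of-vocabulary inputs are assigned IDs in the range `[vocabulary_size, vocabulary_size+num_oov_buckets)` based on a hash of the input value. A positive `num_oov_buckets` cannot be specified together with `default_value`.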
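
The way these parameters interact can be sketched in plain Python. This is only an illustration of the ID assignment described above, not TensorFlow's implementation: the helper `lookup_ids` is hypothetical, and CRC32 stands in for TensorFlow's own hash, so the bucket chosen for an out-of-vocabulary value will generally differ from the real one. The sketch also does not enforce the rule that `default_value` and a positive `num_oov_buckets` are mutually exclusive.

```python
import zlib


def lookup_ids(values, vocabulary, vocabulary_size=None,
               default_value=-1, num_oov_buckets=0):
    """Hypothetical helper: map values to integer IDs as described above."""
    if vocabulary_size is None:
        vocabulary_size = len(vocabulary)
    # Entries beyond vocabulary_size are ignored.
    index = {word: i for i, word in enumerate(vocabulary[:vocabulary_size])}
    ids = []
    for value in values:
        if value in index:
            # In-vocabulary value: its ID is its position in the vocabulary.
            ids.append(index[value])
        elif num_oov_buckets > 0:
            # Stand-in hash (CRC32), an assumption for illustration only.
            bucket = zlib.crc32(value.encode("utf-8")) % num_oov_buckets
            ids.append(vocabulary_size + bucket)
        else:
            ids.append(default_value)
    return ids


vocab = ["sport", "drawing", "gardening", "travelling"]
print(lookup_ids(["sport", "gardening", "cooking"], vocab))
# [0, 2, -1]
print(lookup_ids(["travelling"], vocab, vocabulary_size=3))
# [-1]
```

With `num_oov_buckets=2`, an unknown word such as "cooking" would instead receive an ID of 4 or 5, i.e. somewhere in `[vocabulary_size, vocabulary_size+num_oov_buckets)`.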

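Output:

A categorical column that maps each feature value to an integer ID, namely the index of its line in the vocabulary file. Hashing is used only for out-of-vocabulary values, and only when `num_oov_buckets` is positive.

2. Example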

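import tensorflow as tf  # written for TensorFlow 1.x (tf.Session, input_layer)
sess = tf.Session()
# Feature data
features = {
    'department': ['sport', 'sport', 'drawing', 'gardening', 'travelling'],
}
# Feature column. ./pets_fc.txt holds one word per line; judging from the
# output below, its lines are sport, drawing, gardening, travelling, in that order.
department = tf.feature_column.categorical_column_with_vocabulary_file('department', './pets_fc.txt', dtype=tf.string)
# Wrap it in an indicator column to get a one-hot encoding
department = tf.feature_column.indicator_column(department)
# Collect the feature columns
columns = [department]
# Input layer (data, feature columns)
inputs = tf.feature_column.input_layer(features, columns)
# Initialize the vocabulary lookup table and any variables, then run
init = tf.global_variables_initializer()
sess.run(tf.tables_initializer())
sess.run(init)

v = sess.run(inputs)
print(v)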

Output:

[[1. 0. 0. 0.] #sport
 [1. 0. 0. 0.] #sport
 [0. 1. 0. 0.] #drawing
 [0. 0. 1. 0.] #gardening
 [0. 0. 0. 1.]] #travelling
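
The example depends on the contents of `./pets_fc.txt`, which the post does not show. The plain-Python sketch below writes a vocabulary file under the assumption that it holds the four words in the column order implied by the output above, then reproduces the same one-hot rows without TensorFlow. The temporary directory and the names `word_to_id` and `one_hot` are illustrative only.

```python
import os
import tempfile

# Assumed contents of the vocabulary file: one word per line, in the
# order implied by the columns of the output above.
words = ["sport", "drawing", "gardening", "travelling"]

vocab_path = os.path.join(tempfile.mkdtemp(), "pets_fc.txt")
with open(vocab_path, "w", encoding="utf-8") as f:
    f.write("\n".join(words) + "\n")

# Read the file back: the line index of each word becomes its ID.
with open(vocab_path, encoding="utf-8") as f:
    word_to_id = {line.strip(): i for i, line in enumerate(f)}

department = ["sport", "sport", "drawing", "gardening", "travelling"]

# One-hot rows, mirroring what the indicator column produces for
# in-vocabulary values.
one_hot = [
    [1.0 if word_to_id[d] == col else 0.0 for col in range(len(word_to_id))]
    for d in department
]
for row in one_hot:
    print(row)
```

Reordering the lines of the file would reorder the columns of the result, since the ID of a word is simply its line index.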

thinking

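Difference from tf.feature_column.categorical_column_with_vocabulary_list(): when the vocabulary is large, store the words in a file and use this function; when the vocabulary is small and known in advance, pass the words inline with the vocabulary_list function. In both cases an in-vocabulary word is mapped to its position in the vocabulary, not to a hash.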

References:

1. Official documentation: https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/feature_column/categorical_column_with_vocabulary_file

2. Example: https://blog.csdn.net/anshuai_aw1/article/details/105075335
