TensorFlow 2.0 Tutorial: Text Classification with an RNN
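Original post: https://blog.csdn.net/qq_31456593/article/details/89923645
TensorFlow 2.0 tutorial series (continuously updated): https://blog.csdn.net/qq_31456593/article/details/88606284
This tutorial is compiled mainly from my personal study notes reproducing the official TensorFlow 2.0 tutorials, and it also borrows some Keras techniques for building neural networks. The series was written in Chinese for readers who prefer Chinese-language tutorials. Official TensorFlow tutorials: https://www.tensorflow.org
The complete TensorFlow 2.0 tutorial code is available at https://github.com/czy36mengfei/tensorflow2_tutorials_chinese (stars welcome).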
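Other tutorials in the TensorFlow 2.0 series include:
TensorFlow 2.0 Tutorial: Keras quick start
TensorFlow 2.0 Tutorial: the Keras functional API
TensorFlow 2.0 Tutorial: training models with Keras
TensorFlow 2.0 Tutorial: building your own layers with Keras
TensorFlow 2.0 Tutorial: saving and serializing Keras models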
1. Building the input data with tensorflow_datasets
!pip install -q tensorflow_datasets
import tensorflow_datasets as tfds
dataset, info = tfds.load('imdb_reviews/subwords8k', with_info=True,
                          as_supervised=True)
Get the training and test sets:
train_dataset, test_dataset = dataset['train'], dataset['test']
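Get the tokenizer object, which handles text processing and ID conversion (the text is first split into subwords, and each subword is then mapped to an integer ID):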
tokenizer = info.features['text'].encoder
print('vocabulary size: ', tokenizer.vocab_size)
vocabulary size: 8185
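To make the "text to subwords to IDs" idea concrete, here is a minimal pure-Python sketch of a greedy longest-match subword encoder. It is only an illustration: `ToySubwordEncoder` and `TOY_VOCAB` are hypothetical names with a hand-picked vocabulary, not part of tensorflow_datasets, and the real encoder (which learns its vocabulary from the corpus) may work differently internally. Reserving ID 0 for padding is also an assumption of this sketch.

```python
# Toy illustration only: this is NOT the tensorflow_datasets encoder.
# The vocabulary is hypothetical and hand-picked for this demo.
TOY_VOCAB = ["Hello", "Tensor", "flow", "wor", "d", " ", ","]


class ToySubwordEncoder:
    """Greedy longest-match subword encoder over a fixed vocabulary."""

    def __init__(self, vocab):
        # IDs start at 1; ID 0 is reserved for padding (assumption of this sketch).
        self.id_to_piece = {i + 1: piece for i, piece in enumerate(vocab)}
        self.piece_to_id = {piece: i for i, piece in self.id_to_piece.items()}
        self.max_len = max(len(piece) for piece in vocab)

    @property
    def vocab_size(self):
        # Number of subwords plus the reserved padding ID.
        return len(self.id_to_piece) + 1

    def encode(self, text):
        ids = []
        pos = 0
        while pos < len(text):
            # Try the longest candidate piece first, then shorter ones.
            for length in range(min(self.max_len, len(text) - pos), 0, -1):
                piece = text[pos:pos + length]
                if piece in self.piece_to_id:
                    ids.append(self.piece_to_id[piece])
                    pos += length
                    break
            else:
                raise ValueError(f"no vocabulary entry covers {text[pos]!r}")
        return ids

    def decode(self, ids):
        # Concatenating the pieces restores the original string.
        return "".join(self.id_to_piece[i] for i in ids)


encoder = ToySubwordEncoder(TOY_VOCAB)
sample_string = "Hello word , Tensorflow"
ids = encoder.encode(sample_string)
print("ids:", ids)
print("round trip ok:", encoder.decode(ids) == sample_string)
```

Note how "word" and "Tensorflow" are not in the toy vocabulary as whole words, so each is split into smaller pieces ("wor" + "d", "Tensor" + "flow"). This is the property that lets a subword vocabulary of a few thousand entries cover arbitrary text.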
Testing the tokenizer:
sample_string = 'Hello word , Tensorflow'
tokenized_string = tokenizer.encode(sample_string)
print('tokened id: ', tokenized_string)
# Decode back to the original string
src_string = tokenizer.dec