TensorFlow2.0教程-使用RNN实现文本分类

该教程介绍了如何使用TensorFlow2.0和RNN进行文本分类,包括利用tensorflow_datasets构建输入数据,模型构建,训练,以及通过堆叠LSTM层来提升模型性能。
摘要由CSDN通过智能技术生成

TensorFlow2.0教程-使用RNN实现文本分类

原文地址:https://blog.csdn.net/qq_31456593/article/details/89923645

Tensorflow 2.0 教程持续更新 :https://blog.csdn.net/qq_31456593/article/details/88606284

本教程主要由tensorflow2.0官方教程的个人学习复现笔记整理而来,并借鉴了一些keras构造神经网络的方法,中文讲解,方便喜欢阅读中文教程的朋友,tensorflow官方教程:https://www.tensorflow.org

完整tensorflow2.0教程代码请看https://github.com/czy36mengfei/tensorflow2_tutorials_chinese (欢迎star)

Tensorflow2.0部分教程内容:

TensorFlow 2.0 教程- Keras 快速入门
TensorFlow 2.0 教程-keras 函数api
TensorFlow 2.0 教程-使用keras训练模型
TensorFlow 2.0 教程-用keras构建自己的网络层
TensorFlow 2.0 教程-keras模型保存和序列化

1.使用tensorflow_datasets 构造输入数据

!pip install -q tensorflow_datasets
[31mspacy 2.0.18 has requirement numpy>=1.15.0, but you'll have numpy 1.14.3 which is incompatible.[0m
[31mplotnine 0.5.1 has requirement matplotlib>=3.0.0, but you'll have matplotlib 2.2.2 which is incompatible.[0m
[31mplotnine 0.5.1 has requirement pandas>=0.23.4, but you'll have pandas 0.23.0 which is incompatible.[0m
[31mneo4j-driver 1.6.2 has requirement neotime==1.0.0, but you'll have neotime 1.7.2 which is incompatible.[0m
[31mmizani 0.5.3 has requirement pandas>=0.23.4, but you'll have pandas 0.23.0 which is incompatible.[0m
[31mfastai 0.7.0 has requirement torch<0.4, but you'll have torch 0.4.1 which is incompatible.[0m
[33mYou are using pip version 10.0.1, however version 19.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.[0m
import tensorflow_datasets as tfds
dataset, info = tfds.load('imdb_reviews/subwords8k', with_info=True,
                         as_supervised=True) 

获取训练集、测试集

train_dataset, test_dataset = dataset['train'], dataset['test']

获取tokenizer对象,用进行字符处理级id转换(这里先转换成subword,再转换为id)等操作

tokenizer = info.features['text'].encoder
print('vocabulary size: ', tokenizer.vocab_size)
vocabulary size:  8185

token对象测试

sample_string = 'Hello word , Tensorflow'
tokenized_string = tokenizer.encode(sample_string)
print('tokened id: ', tokenized_string)

# 解码会原字符串
src_string = tokenizer.dec
评论 4
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值