Big Picture Machine Learning: Classifying Text with Neural Networks and TensorFlow

by Déborah Mesquita

Developers often say that if you want to get started with machine learning, you should first learn how the algorithms work. But my experience shows otherwise.

I say you should first be able to see the big picture: how the applications work. Once you understand this, it becomes much easier to dive in deep and explore the inner workings of the algorithms.

So how do you develop an intuition and achieve this big-picture understanding of machine learning? A good way to do this is by creating machine learning models.

Assuming you still don’t know how to create all these algorithms from scratch, you’ll want to use a library that has all these algorithms already implemented for you. And that library is TensorFlow.

In this article, we’ll create a machine learning model to classify texts into categories. We’ll cover the following topics:

  1. How TensorFlow works

  2. What is a machine learning model

  3. What is a Neural Network

  4. How the Neural Network learns

  5. How to manipulate data and pass it to the Neural Network inputs

  6. How to run the model and get the prediction results

You will probably learn a lot of new things, so let’s start!

TensorFlow

TensorFlow is an open-source library for machine learning, first created by Google. The name of the library helps us understand how we work with it: tensors are multidimensional arrays that flow through the nodes of a graph.

tf.Graph

Every computation in TensorFlow is represented as a dataflow graph. This graph has two elements:

  • a set of tf.Operation objects, which represent units of computation

  • a set of tf.Tensor objects, which represent units of data

To see how all this works you will create this dataflow graph:

You’ll define x = [1,3,6] and y = [1,1,1]. As the graph works with tf.Tensor to represent units of data, you will create constant tensors:

import tensorflow as tf

x = tf.constant([1,3,6])
y = tf.constant([1,1,1])

Now you’ll define the operation unit:

import tensorflow as tf

x = tf.constant([1,3,6])
y = tf.constant([1,1,1])

op = tf.add(x,y)

You have all the graph elements. Now you need to build the graph:

import tensorflow as tf

my_graph = tf.Graph()

with my_graph.as_default():
    x = tf.constant([1,3,6])
    y = tf.constant([1,1,1])
    op = tf.add(x,y)

This is how the TensorFlow workflow works: you first create a graph, and only then can you make the computations (really ‘running’ the graph nodes with operations). To run the graph you’ll need to create a tf.Session.

tf.Session

A tf.Session object encapsulates the environment in which Operation objects are executed, and Tensor objects are evaluated (from the docs). To do that, we need to define which graph will be used in the Session:

import tensorflow as tf

my_graph = tf.Graph()

with tf.Session(graph=my_graph) as sess:
    x = tf.constant([1,3,6])
    y = tf.constant([1,1,1])
    op = tf.add(x,y)

To execute the operations, you’ll use the method tf.Session.run(). This method executes one ‘step’ of the TensorFlow computation by running the necessary graph fragment to execute every Operation object and evaluate every Tensor passed in the argument fetches. In your case you will run a step of the sum operation:

import tensorflow as tf

my_graph = tf.Graph()

with tf.Session(graph=my_graph) as sess:
    x = tf.constant([1,3,6])
    y = tf.constant([1,1,1])
    op = tf.add(x,y)
    result = sess.run(fetches=op)
    print(result)

>>> [2 4 7]

A Predictive Model

Now that you know how TensorFlow works, you have to learn how to create a predictive model. In short,

Machine learning algorithm + data = predictive model

The process to construct a model is like this:

As you can see, the model consists of a machine learning algorithm ‘trained’ with data. When you have the model you will get results like this:

The goal of the model you will create is to classify texts into categories. We define it like this:

input: text, result: category

We have a training dataset with all the texts labeled (every text has a label indicating which category it belongs to). In machine learning this type of task is called Supervised Learning.

“We know the correct answers. The algorithm iteratively makes predictions on the training data and is corrected by the teacher.” — Jason Brownlee

You’ll classify data into categories, so it’s also a Classification task.

To create the model, we’re going to use Neural Networks.

Neural Networks

A neural network is a computational model (a way to describe a system using mathematical language and mathematical concepts). These systems are self-learning and trained, rather than explicitly programmed.

Neural networks are inspired by our central nervous system. They have connected nodes that are similar to our neurons.

The Perceptron was the first neural network algorithm. This article explains really well the inner working of a perceptron (the “Inside an artificial neuron” animation is fantastic).

To understand how a neural network works we will actually build a neural network architecture with TensorFlow. This architecture was used by Aymeric Damien in this example.

Neural Network architecture

The neural network will have 2 hidden layers (choosing how many hidden layers the network will have is part of the architecture design). The job of each hidden layer is to transform the inputs into something that the output layer can use.

Hidden layer 1

You also need to define how many nodes the 1st hidden layer will have. These nodes are also called features or neurons, and in the image above they are represented by each circle.

In the input layer every node corresponds to a word of the dataset (we will see how this works later).

As explained here, each node (neuron) is multiplied by a weight. Every node has a weight value, and during the training phase the neural network adjusts these values in order to produce a correct output (wait, we will learn more about this in a minute).

In addition to multiplying each input node by a weight, the network also adds a bias (role of bias in neural networks).

In your architecture, after multiplying the inputs by the weights and adding the bias, the data also pass through an activation function. This activation function defines the final output of each node. An analogy: imagine that each node is a lamp; the activation function tells whether the lamp will light up or not.

There are many types of activation functions. You will use the rectified linear unit (ReLu). This function is defined this way:

f(x) = max(0,x) [the output is x or 0 (zero), whichever is larger]

Examples: if x = -1, then f(x) = 0 (zero); if x = 0.7, then f(x) = 0.7.

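To make one neuron’s computation concrete, here is a tiny numpy sketch. The input vector, weights and bias below are made-up values, not taken from the article:

import numpy as np

inputs = np.array([1., 0., 1.])       # a bag-of-words input (hypothetical)
weights = np.array([0.2, -0.5, 0.4])  # one weight per input
bias = 0.1

z = np.dot(inputs, weights) + bias    # weighted sum plus bias
output = max(0., z)                   # ReLu: the "lamp" lights only if z > 0
print(output)

>>> 0.7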

Hidden layer 2

The 2nd hidden layer does exactly what the 1st hidden layer does, but now the input of the 2nd hidden layer is the output of the 1st one.

Output layer

And we finally got to the last layer, the output layer. You will use one-hot encoding to get the results of this layer. In this encoding only one bit has the value 1 and all the others have the value 0. For example, if we want to encode three categories (sports, space and computer graphics):

+-------------------+-----------+
|     category      |   value   |
+-------------------+-----------+
|      sports       |    001    |
|      space        |    010    |
| computer graphics |    100    |
+-------------------+-----------+

So the number of output nodes is the number of classes of the input dataset.

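As a quick sketch, this is what those encodings look like as numpy vectors (matching the table above):

import numpy as np

categories = ["computer graphics", "space", "sports"]

one_hot = np.zeros(len(categories), dtype=float)
one_hot[categories.index("space")] = 1.   # a 1 in the category's position, 0 elsewhere
print(one_hot)

>>> [ 0.  1.  0.]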

The output layer values are also multiplied by the weights and we also add the bias, but now the activation function is different.

You want to label each text with a category, and these categories are mutually exclusive (a text doesn’t belong to two categories at the same time). To account for this, instead of the ReLu activation function you will use the Softmax function. This function transforms the output of each unit to a value between 0 and 1 and also makes sure that the sum of all the units equals 1. This way the output tells us the probability of each category for each text.

| 1.2 |                   | 0.46 |
| 0.9 |  -> [softmax] ->  | 0.34 |
| 0.4 |                   | 0.20 |
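
A minimal numpy sketch of this transformation (TensorFlow applies the same function inside tf.nn.softmax_cross_entropy_with_logits, which we will use later):

import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtracting the max improves numerical stability
    return e / e.sum()

print(softmax(np.array([1.2, 0.9, 0.4])))
# prints approximately [ 0.46  0.34  0.20 ], matching the figure above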

And now you have the data flow graph of your neural network. Translating everything we saw so far into code, the result is:

# Network Parameters
n_hidden_1 = 10        # 1st layer number of features
n_hidden_2 = 5         # 2nd layer number of features
n_input = total_words  # Words in vocab
n_classes = 3          # Categories: graphics, space and baseball

def multilayer_perceptron(input_tensor, weights, biases):
    # Hidden layer with RELU activation
    layer_1_multiplication = tf.matmul(input_tensor, weights['h1'])
    layer_1_addition = tf.add(layer_1_multiplication, biases['b1'])
    layer_1_activation = tf.nn.relu(layer_1_addition)

    # Hidden layer with RELU activation
    layer_2_multiplication = tf.matmul(layer_1_activation, weights['h2'])
    layer_2_addition = tf.add(layer_2_multiplication, biases['b2'])
    layer_2_activation = tf.nn.relu(layer_2_addition)

    # Output layer with linear activation
    out_layer_multiplication = tf.matmul(layer_2_activation, weights['out'])
    out_layer_addition = out_layer_multiplication + biases['out']

    return out_layer_addition

(We’ll talk about the code for the output layer activation function later.)

How the neural network learns

As we saw earlier the weight values are updated while the network is trained. Now we will see how this happens in the TensorFlow environment.

tf.Variable

The weights and biases are stored in variables (tf.Variable). These variables maintain state in the graph across calls to run(). In machine learning we usually initialize the weight and bias values from a normal distribution.

weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}

When we run the network for the first time (that is, the weight values are the ones defined by the normal distribution):

input values: x
weights: w
bias: b
output values: z
expected values: expected

To know whether the network is learning or not, you need to compare the output values (z) with the expected values (expected). How do we compute this difference (loss)? There are many methods to do that. Because we are working with a classification task, the best measure for the loss is the cross-entropy error.

James D. McCaffrey wrote a brilliant explanation about why this is the best method for this kind of task.

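To make the loss concrete, here is a small numpy sketch of the cross-entropy between a one-hot label and a softmax output (the numbers are made up):

import numpy as np

expected = np.array([1., 0., 0.])  # one-hot label: the text belongs to the first category
z = np.array([0.46, 0.34, 0.20])   # softmax output of the network (made-up values)

loss = -np.sum(expected * np.log(z))  # cross-entropy error
print(loss)
# prints approximately 0.78; the closer z[0] gets to 1, the smaller the loss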

With TensorFlow you will compute the cross-entropy error using the tf.nn.softmax_cross_entropy_with_logits() method (here is the softmax activation function) and calculate the mean error (tf.reduce_mean()).

# Construct model
prediction = multilayer_perceptron(input_tensor, weights, biases)

# Define loss
entropy_loss = tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=output_tensor)
loss = tf.reduce_mean(entropy_loss)

You want to find the best values for the weights and biases in order to minimize the output error (the difference between the value we got and the correct value). To do that you will use the gradient descent method. To be more specific, you will use the stochastic gradient descent.

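The core idea of one gradient descent step, sketched in plain Python with a single made-up weight (TensorFlow does this for every value in the weight and bias matrices):

w = 0.8            # a single weight (made-up value)
gradient = 0.25    # d(loss)/d(w), which backpropagation computes for us
learning_rate = 0.001

w = w - learning_rate * gradient   # move the weight against the gradient
print(w)

>>> 0.79975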

There are also many algorithms to compute the gradient descent; you will use the Adaptive Moment Estimation (Adam). To use this algorithm in TensorFlow you need to pass the learning_rate value, which determines the step size used while searching for the best weight values.

The method tf.train.AdamOptimizer(learning_rate).minimize(loss) is syntactic sugar that does two things:

  1. compute_gradients(loss, <list of variables>)

  2. apply_gradients(<list of (gradient, variable) pairs>)

The method updates all the tf.Variables with the new values, so we don’t need to pass the list of variables. And now you have the code to train the network:

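Roughly, this is what minimize(loss) expands to (a sketch, not code from the article):

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
grads_and_vars = optimizer.compute_gradients(loss)    # list of (gradient, variable) pairs
train_op = optimizer.apply_gradients(grads_and_vars)  # applies the updates to the variables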

learning_rate = 0.001

# Construct model
prediction = multilayer_perceptron(input_tensor, weights, biases)

# Define loss
entropy_loss = tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=output_tensor)
loss = tf.reduce_mean(entropy_loss)

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)

Data manipulation

The dataset you will use has many texts in English and we need to manipulate this data to pass it to the neural network. To do that you will do two things:

  1. Create an index for each word

  2. Create a matrix for each text, where the values are 1 if a word is in the text and 0 if not

Let’s see the code to understand this process:

import numpy as np    # numpy is a package for scientific computing
from collections import Counter

vocab = Counter()

text = "Hi from Brazil"

# Get all words
for word in text.split(' '):
    vocab[word] += 1

# Convert words to indexes
def get_word_2_index(vocab):
    word2index = {}
    for i, word in enumerate(vocab):
        word2index[word] = i
    return word2index

# Now we have an index
word2index = get_word_2_index(vocab)

total_words = len(vocab)

# This is how we create a numpy array (our matrix)
matrix = np.zeros((total_words), dtype=float)

# Now we fill the values
for word in text.split():
    matrix[word2index[word]] += 1

print(matrix)

>>> [ 1.  1.  1.]

In the example above the text was ‘Hi from Brazil’ and the matrix was [ 1. 1. 1.]. What if the text was only ‘Hi’?

matrix = np.zeros((total_words), dtype=float)

text = "Hi"

for word in text.split():
    matrix[word2index[word]] += 1

print(matrix)

>>> [ 1.  0.  0.]

You will do the same with the labels (the categories of the texts), but now you will use the one-hot encoding:

y = np.zeros((3), dtype=float)

if category == 0:
    y[0] = 1.        # [ 1.  0.  0.]
elif category == 1:
    y[1] = 1.        # [ 0.  1.  0.]
else:
    y[2] = 1.        # [ 0.  0.  1.]

Running the graph and getting the results

Now comes the best part: getting the results from the model. First let’s take a closer look at the input dataset.

The dataset

You will use the 20 Newsgroups, a dataset with 18,000 posts about 20 topics. To load this dataset you will use the scikit-learn library. We will use only 3 categories: comp.graphics, sci.space and rec.sport.baseball. The dataset comes in two subsets: one for training and one for testing. The recommendation is that you should never look at the test data, because this can interfere with your choices while creating the model. You don’t want to create a model to predict this specific test data; you want to create a model that generalizes well.

This is how you will load the datasets:

from sklearn.datasets import fetch_20newsgroups

categories = ["comp.graphics","sci.space","rec.sport.baseball"]

newsgroups_train = fetch_20newsgroups(subset='train', categories=categories)
newsgroups_test = fetch_20newsgroups(subset='test', categories=categories)
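
A quick sanity check on what was loaded (the exact number of posts depends on your scikit-learn version and download):

print(newsgroups_train.target_names)
>>> ['comp.graphics', 'rec.sport.baseball', 'sci.space']

print(len(newsgroups_train.data))   # number of training posts in these 3 categories
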
Training the model

In the neural network terminology, one epoch = one forward pass (getting the output values) and one backward pass (updating the weights) of all the training examples.

Remember the tf.Session.run() method? Let’s take a closer look at it:

tf.Session.run(fetches, feed_dict=None, options=None, run_metadata=None)

In the dataflow graph at the beginning of this article you used the sum operation, but we can also pass a list of things to run. In this neural network run you will pass two things: the loss calculation and the optimization step.

The feed_dict parameter is where we pass the data for each run step. To pass this data we need to define tf.placeholders (to feed the feed_dict).

As the TensorFlow documentation says:

“A placeholder exists solely to serve as the target of feeds. It is not initialized and contains no data.” — Source

So you will define your placeholders like this:

n_input = total_words  # Words in vocab
n_classes = 3          # Categories: graphics, sci.space and baseball

input_tensor = tf.placeholder(tf.float32, [None, n_input], name="input")
output_tensor = tf.placeholder(tf.float32, [None, n_classes], name="output")

You will separate the training data in batches:

“If you use placeholders for feeding input, you can specify a variable batch dimension by creating the placeholder with tf.placeholder(…, shape=[None, …]). The None element of the shape corresponds to a variable-sized dimension.” — Source

We will feed the dict with a larger batch while testing the model; that’s why you need to define a variable batch dimension.

The get_batch() function gives us a batch of texts of the chosen size; the article’s snippets never define it, so a sketch of it is shown below.

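Here is a hypothetical get_batch(), consistent with how it is called in the training loop below; it slices the newsgroups data and builds the bag-of-words matrices and one-hot labels from the previous section (this helper is assumed, not shown in the article):

def get_batch(df, i, batch_size):
    batches = []
    results = []
    texts = df.data[i*batch_size : (i+1)*batch_size]
    categories = df.target[i*batch_size : (i+1)*batch_size]

    for text in texts:
        layer = np.zeros(total_words, dtype=float)
        for word in text.split(' '):
            if word in word2index:          # hypothetical choice: skip words outside the vocab
                layer[word2index[word]] += 1
        batches.append(layer)

    for category in categories:
        y = np.zeros((3), dtype=float)
        y[category] = 1.
        results.append(y)

    return np.array(batches), np.array(results)

And now we can run the model: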

training_epochs = 10
batch_size = 150   # hypothetical value; the article doesn't define it

init = tf.global_variables_initializer()  # op that initializes all tf.Variables

# Launch the graph
with tf.Session() as sess:
    sess.run(init)  # inits the variables (normal distribution, remember?)

    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(len(newsgroups_train.data)/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_x, batch_y = get_batch(newsgroups_train, i, batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            c, _ = sess.run([loss, optimizer], feed_dict={input_tensor: batch_x, output_tensor: batch_y})
            # Accumulate the average loss, printed once per epoch (this produces the output shown below)
            avg_cost += c / total_batch
        print("Epoch:", '%04d' % (epoch+1), "loss=", "{:.9f}".format(avg_cost))

Now you have the trained model. To test it, you’ll also need to create graph elements. We’ll measure the accuracy of the model, so you need to get the index of the predicted value and the index of the correct value (because we are using the one-hot encoding), check whether they are equal, and calculate the mean over the whole test dataset:

# Test model
index_prediction = tf.argmax(prediction, 1)
index_correct = tf.argmax(output_tensor, 1)
correct_prediction = tf.equal(index_prediction, index_correct)

# Calculate accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
total_test_data = len(newsgroups_test.target)
batch_x_test, batch_y_test = get_batch(newsgroups_test, 0, total_test_data)
print("Accuracy:", accuracy.eval({input_tensor: batch_x_test, output_tensor: batch_y_test}))

>>> Epoch: 0001 loss= 1133.908114347
    Epoch: 0002 loss= 329.093700409
    Epoch: 0003 loss= 111.876660109
    Epoch: 0004 loss= 72.552971845
    Epoch: 0005 loss= 16.673050320
    Epoch: 0006 loss= 16.481995190
    Epoch: 0007 loss= 4.848220565
    Epoch: 0008 loss= 0.759822878
    Epoch: 0009 loss= 0.000000000
    Epoch: 0010 loss= 0.079848485
    Optimization Finished!

Accuracy: 0.75

And that’s it! You created a model using a neural network to classify texts into categories. Congratulations!

You can see the notebook with the final code here.

Tip: modify the values we defined to see how the changes affect the training time and the model accuracy.

Any questions or suggestions? Leave them in the comments. Oh, and thanks for reading!

Did you find this article helpful? I try my best to write a deep dive article each month; you can receive an email when I publish a new one.

It would mean a lot if you share this article with friends. Follow me for more articles about Data Science and Machine Learning.

Translated from: https://www.freecodecamp.org/news/big-picture-machine-learning-classifying-text-with-neural-networks-and-tensorflow-d94036ac2274/
