Multiclass Classification and Information Bottleneck: An Example Using Keras

weixin_26630173

Original article: https://towardsdatascience.com/multiclass-classification-and-information-bottleneck-an-example-using-keras-5591b9a2c000

Multiclass classification is the task of assigning each sample to one of more than two classes. When there are exactly two classes, the task is called binary classification.

This piece designs a neural network, using the Python library Keras, to classify newswires from the Reuters dataset, published by Reuters in 1986, into 46 mutually exclusive classes. Because each newswire belongs to exactly one class, this is a typical example of a single-label, multiclass classification problem.

Information Bottleneck

Neural networks are composed of many layers. Each layer applies a transformation to the data it receives, and together the layers map the network's input to its output. However, it is crucial to note that these layers do not generate any additional information; they work solely on what they receive from the preceding layers.

If, say, a layer drops some relevant information, that information becomes inaccessible to all subsequent layers. It is permanently lost and cannot be retrieved. The layer that drops it caps the accuracy the model can reach, and so acts as an information bottleneck.
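
The effect described above can be illustrated with a small linear toy, which is not the experiment from the original article. It is a minimal sketch assuming NumPy; the helper name `squeeze_through_layer` and the layer widths 4 and 64 are hypothetical choices made for illustration, while 46 matches the number of classes. The sketch pushes 46 distinct one-hot inputs through a random linear layer and then asks how well the best linear decoder can recover them.

```python
import numpy as np


def squeeze_through_layer(input_dim, hidden_units, seed=0):
    """Pass distinct one-hot inputs through a random linear layer.

    Returns the rank of the hidden representation and the error of the
    best linear reconstruction of the inputs from that representation.
    """
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(input_dim, hidden_units))  # illustrative random layer
    inputs = np.eye(input_dim)                            # one distinct input per row
    hidden = inputs @ weights                             # what later layers get to see

    # Least-squares decoder: the best linear attempt to undo the layer.
    decoder, *_ = np.linalg.lstsq(hidden, inputs, rcond=None)
    error = float(np.linalg.norm(inputs - hidden @ decoder))
    rank = int(np.linalg.matrix_rank(hidden))
    return rank, error


# Hypothetical widths: a narrow layer (4 units) versus a wide one (64 units).
narrow_rank, narrow_error = squeeze_through_layer(46, 4)
wide_rank, wide_error = squeeze_through_layer(46, 64)
print("narrow:", narrow_rank, narrow_error)
print("wide:  ", wide_rank, wide_error)
```

In this toy, the narrow layer leaves a large reconstruction error that no later linear step can remove, while the wide layer loses essentially nothing. Real networks are nonlinear, so this only hints at the idea rather than measuring it.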

We shall see this in action later on.

The Reuters Dataset

The Reuters dataset is a set of short newswires sorted into 46 mutually exclusive topics. Reuters published it in 1986, and it is widely used for text classification. Some topics are represented more than others, but each topic has at least ten examples in the training set.

The Reuters dataset ships with Keras and contains 8,982 training examples and 2,246 test examples.

Loading the data

We load the data from the dataset module packaged with Keras, limiting it to the 10,000 most frequently occurring words. To do this, we pass the argument num_words=10000 to the load_data function.

Code by rakshitraj, hosted on GitHub.
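
The loading step can be sketched as follows. This is a minimal sketch rather than the original gist: it assumes the Keras that is bundled with TensorFlow, and the wrapper name `load_reuters` is hypothetical. The first call downloads the dataset, so it needs network access.

```python
def load_reuters(num_words=10000):
    """Load the Reuters newswires, keeping only the most frequent words.

    Returns ((train_data, train_labels), (test_data, test_labels)).
    """
    # Imported lazily so that defining this helper does not require
    # TensorFlow to be installed (assumption: TensorFlow's bundled Keras).
    from tensorflow.keras.datasets import reuters

    # num_words=10000 restricts the vocabulary to the 10,000 most
    # frequently occurring words, as described above.
    return reuters.load_data(num_words=num_words)
```

Calling `(train_data, train_labels), (test_data, test_labels) = load_reuters()` should then give lists of integer-encoded newswires together with their integer topic labels.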

Some Exploratory Data Analysis

We’ll perform some good old-fashioned EDA on our dataset. Doing so will give us a general idea of the breadth and scope of our data.
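
A few summary statistics are usually enough for a first look. The following is a minimal sketch using only the standard library; the helper name `summarize` is hypothetical, and the toy data is made up for illustration rather than taken from Reuters.

```python
from collections import Counter


def summarize(sequences, labels):
    """Return basic statistics for integer-encoded texts and their labels."""
    lengths = [len(sequence) for sequence in sequences]
    class_counts = Counter(labels)
    return {
        "num_samples": len(sequences),
        "num_classes": len(class_counts),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "mean_length": sum(lengths) / len(lengths),
        # (label, count) pairs for the most and least frequent classes
        "most_common_class": class_counts.most_common(1)[0],
        "rarest_class": min(class_counts.items(), key=lambda item: item[1]),
    }


# Toy stand-in data (made up; not real Reuters samples).
toy_sequences = [[1, 8, 5], [1, 4], [1, 9, 9, 2, 7], [1, 3, 6, 2]]
toy_labels = [3, 4, 3, 3]
stats = summarize(toy_sequences, toy_labels)
print(stats)
```

With the real dataset, the same helper could be called on the training sequences and labels to check the number of samples, the number of topics, and how unevenly the topics are represented.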

Decode a story

Let’s go ahead and decode a story. Decoding shows us how the data is organized and encoded.
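
Decoding can be sketched as below. The sketch relies on the Keras convention that indices 0, 1, and 2 are reserved for padding, start of sequence, and unknown words, so stored word indices are offset by 3. The helper name `decode_story` and the toy vocabulary are hypothetical; with the real data, the word-to-index mapping would come from `reuters.get_word_index()`.

```python
def decode_story(sequence, word_index, index_from=3, unknown="?"):
    """Turn a list of word indices back into text.

    word_index maps word -> index. Indices in the dataset are shifted by
    index_from because the first few values are reserved markers.
    """
    reverse_index = {index: word for word, index in word_index.items()}
    return " ".join(reverse_index.get(i - index_from, unknown) for i in sequence)


# Toy vocabulary and story (made up for illustration).
toy_word_index = {"the": 1, "oil": 2, "price": 3}
toy_story = [1, 4, 5, 6, 2]  # 1 = start marker, 2 = unknown word
decoded = decode_story(toy_story, toy_word_index)
print(decoded)  # prints "? the oil price ?"
```

The reserved markers have no entry in the vocabulary, which is why they show up as question marks.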

Preparing the data

We cannot feed lists of integers directly to the neural network; therefore, we vectorize each sequence and convert it into a tensor. We do this by one-hot encoding each sequence.
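
One common way to do this is to turn each sequence into a fixed-length vector with a 1 at the index of every word it contains. The following is a minimal sketch assuming NumPy; the helper name `vectorize_sequences` is hypothetical, and the default dimension of 10,000 matches the vocabulary limit chosen when loading the data.

```python
import numpy as np


def vectorize_sequences(sequences, dimension=10000):
    """Encode each integer sequence as a vector of 0s and 1s."""
    results = np.zeros((len(sequences), dimension), dtype="float32")
    for row, sequence in enumerate(sequences):
        # Set a 1 at every word index that occurs in this sequence.
        results[row, sequence] = 1.0
    return results


# Toy input (made up): two short sequences over a 5-word vocabulary.
toy = vectorize_sequences([[1, 3], [0, 2, 2]], dimension=5)
print(toy)
```

Note that this encoding records only which words appear, not how often or in what order: the repeated index 2 in the second toy sequence still yields a single 1.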
