Multiclass classification and information bottleneck: an example using Keras

Multiclass classification is the classification of samples into more than two classes. Classifying samples into exactly two categories is referred to as binary classification.

This piece designs a neural network that classifies newswires from the Reuters dataset, published by Reuters in 1986, into 46 mutually exclusive classes using the Python library Keras. It is a typical example of a single-label, multiclass classification problem.

Information Bottleneck

Neural networks are composed of many layers. Each layer applies a transformation to the data on its way from the network's input to its output. Crucially, layers do not generate any additional information; they work solely on what they receive from the preceding layers.

If a layer drops some relevant information, that information becomes inaccessible to all subsequent layers; it is permanently lost and cannot be retrieved. The layer that drops it caps the accuracy and performance the model can reach, and thus acts as an information bottleneck.
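
The idea can be seen in a deliberately tiny toy example. This is only a sketch: the two "layers" are hypothetical hand-written functions, not Keras layers, and the numbers are made up. Once a narrow layer maps two different inputs to the same value, nothing downstream can tell them apart.

```python
# Toy illustration: a layer that is too narrow maps different inputs to the
# same output, so no later layer can tell those inputs apart again.

def narrow_layer(x):
    """Hypothetical 2 -> 1 'layer': collapses two features into their sum."""
    return [x[0] + x[1]]


def later_layer(h):
    """Any downstream transformation; it only ever sees the narrow output."""
    return [3 * h[0] + 1]


a = [1.0, 2.0]  # two clearly different inputs ...
b = [2.0, 1.0]

hidden_a = narrow_layer(a)
hidden_b = narrow_layer(b)

# ... that collide after the narrow layer
print(hidden_a, hidden_b)
# so every later layer produces identical results for both
print(later_layer(hidden_a) == later_layer(hidden_b))
```

Whatever function replaces later_layer, it receives the same value for both inputs, so the difference between a and b is gone for good.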

We shall see this in action later on.

The Reuters Dataset

The Reuters dataset is a set of short newswires sorted into 46 mutually exclusive topics. Reuters published it in 1986, and it is widely used for text classification. Some topics are represented more than others, but each topic has at least ten examples in the training set.

The Reuters dataset ships with Keras and contains 8,982 training examples and 2,246 test examples.

Loading the data

Load the data from the module packaged with Keras. We limit the vocabulary to the 10,000 most frequently occurring words by passing the num_words=10000 argument to the load_data function.

[Code listing by rakshitraj, hosted on GitHub]
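
A minimal sketch of the loading step, not the author's own listing. It assumes the keras package is installed; load_reuters and cap_vocabulary are hypothetical helper names introduced here. The import is kept inside the function because the first call downloads the dataset. The second helper only illustrates, on a plain list, what limiting the vocabulary appears to do: indices at or above num_words are replaced by a reserved out-of-vocabulary index (assumed here to be 2, the Keras default).

```python
def load_reuters(num_words=10000):
    """Load the Reuters newswires bundled with Keras.

    Returns ((train_data, train_labels), (test_data, test_labels)).
    The import is lazy because the first call downloads the data.
    """
    from keras.datasets import reuters
    return reuters.load_data(num_words=num_words)


def cap_vocabulary(sequence, num_words, oov_index=2):
    """Illustrative helper: mimic the effect of num_words on one story.

    Word indices at or above num_words are replaced by oov_index
    (assumption: 2 is the default out-of-vocabulary index in Keras).
    """
    return [w if w < num_words else oov_index for w in sequence]


# Made-up indices, only to show the capping behaviour.
print(cap_vocabulary([1, 5, 9999, 10000, 25000], num_words=10000))
```

With Keras available, (train_data, train_labels), (test_data, test_labels) = load_reuters() gives the four arrays used in the rest of the walkthrough.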

Some Exploratory Data Analysis

We’ll perform some good old-fashioned EDA on our dataset. Doing so gives us a general idea of the breadth and scope of our data.
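
As a sketch of what such a first look might compute (an assumption about useful statistics, not the author's listing), the hypothetical helper below counts examples, classes, story lengths and class sizes. It is demonstrated on tiny made-up data so that it runs without downloading anything.

```python
from collections import Counter


def summarize(sequences, labels):
    """Return basic statistics for encoded stories and their topic labels."""
    lengths = [len(seq) for seq in sequences]
    per_class = Counter(labels)  # number of examples per topic
    return {
        "num_examples": len(sequences),
        "num_classes": len(per_class),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "mean_length": sum(lengths) / len(lengths),
        "smallest_class_size": min(per_class.values()),
        "largest_class_size": max(per_class.values()),
    }


# Tiny stand-in data (made up for illustration, not taken from Reuters).
toy_sequences = [[1, 4, 7], [1, 9], [1, 3, 3, 8, 2], [1, 6, 5, 4]]
toy_labels = [3, 4, 3, 3]

stats = summarize(toy_sequences, toy_labels)
print(stats)
```

Applied to the real training data and labels, the same function should report the 8,982 examples and 46 topics quoted above, and a smallest class of at least ten examples.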

Decode a story

Let’s go ahead and decode a story. Decoding helps us see how the data is organized and encoded.
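
A hedged sketch of the decoding step follows. decode_story is a hypothetical helper; it expects a word-to-rank dictionary such as the one returned by reuters.get_word_index() in Keras. It assumes the default Keras encoding, in which indices 0, 1 and 2 are reserved for padding, start of sequence and unknown words, so every stored index is shifted by 3. The vocabulary below is made up so the example runs offline.

```python
def decode_story(sequence, word_index, index_offset=3):
    """Turn a list of integer word indices back into text.

    word_index maps word -> rank. Stored indices are rank + index_offset
    because the lowest indices are reserved (padding, start, unknown);
    anything that cannot be mapped back is shown as "?".
    """
    reverse_index = {rank: word for word, rank in word_index.items()}
    return " ".join(reverse_index.get(i - index_offset, "?") for i in sequence)


# Toy vocabulary, made up for illustration only.
toy_word_index = {"the": 1, "said": 2, "profit": 3}

decoded = decode_story([1, 4, 6, 5, 9], toy_word_index)
print(decoded)
```

With the real data, passing the first training sequence and the dictionary from reuters.get_word_index() would print that newswire, with "?" wherever a word fell outside the retained vocabulary.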

Preparing the data

We cannot feed lists of integers directly into the neural network, so we vectorize each sequence and convert it into a tensor. We do this by one-hot encoding each sequence: a story becomes a vector of 0s and 1s, with a 1 at every index whose word occurs in the story.
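
The encoding can be sketched with NumPy as below. This is a common way to write it rather than necessarily the author's exact code; the default dimension of 10,000 is an assumption chosen to match num_words=10000 from the loading step.

```python
import numpy as np


def vectorize_sequences(sequences, dimension=10000):
    """Encode each sequence as a 0/1 vector of length `dimension`."""
    results = np.zeros((len(sequences), dimension), dtype="float32")
    for i, sequence in enumerate(sequences):
        # Set every index that occurs in the story; repeats have no extra effect.
        results[i, sequence] = 1.0
    return results


# Two made-up "stories" over a 5-word vocabulary.
toy = vectorize_sequences([[0, 2, 2], [4]], dimension=5)
print(toy)
```

Note that the encoding records only which words occur, not how often or in what order, so some information in the original sequence is discarded at this step.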