Paper reading: Deep residual pooling network for texture recognition

This work proposes a deep residual pooling network for texture recognition. By retaining the spatial information of the feature maps and combining it with residual encoding, the method forms an end-to-end learning framework. It overcomes limitations of earlier approaches, learns more efficiently, and achieves strong performance on several texture recognition datasets.

Title

Deep residual pooling network for texture recognition

Year / Authors / Journal

2021 / Mao, Shangbo; Rajan, Deepu; Chia, Liang Tien / Pattern Recognition

Citation

@article{mao2021deep,
  title={Deep residual pooling network for texture recognition},
  author={Mao, Shangbo and Rajan, Deepu and Chia, Liang Tien},
  journal={Pattern Recognition},
  volume={112},
  pages={107817},
  year={2021},
  publisher={Elsevier}
}

Summary

  • The paper studies the balance between orderless features and spatial information needed for effective texture recognition.

  • Experiments show that retaining the spatial information before aggregation is helpful in feature learning for texture recognition.

Interesting Point(s)

  1. It would be interesting to explore if the best feature maps could be automatically identified as suitable candidates for combining.

  2. In our method, the multi-size training can influence only the features learned in the convolutional transfer module, so it does not have a major effect on the final performance. We plan to address this in the future.

  3. The properties of Deep-TEN, which integrates an encoding layer into an end-to-end CNN architecture:

    • spatial information + orderless features.

    • end-to-end training.

    • same dimensions.

Research Objective(s)

Fig. 1. Overall framework of the deep residual pooling network. When the backbone network is ResNet-50 and the input image size is 224 × 224 × 3, the dimension of the feature map extracted from $f_{cnn}$ is 7 × 7 × 2048, as the orange cube in the figure shows. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Current deep learning-based texture recognition methods extract spatially orderless features from deep models pretrained on large-scale image datasets. These methods either produce high-dimensional features or require multiple steps such as dictionary learning, feature encoding and dimension reduction. In this paper, we propose a novel end-to-end learning framework that not only overcomes these limitations, but also demonstrates faster learning.
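To make the dimensions in Fig. 1 concrete, below is a minimal sketch (not the authors' code) of extracting the $f_{cnn}$ feature map from a standard torchvision ResNet-50; only the 224 × 224 × 3 input and 7 × 7 × 2048 output shapes come from the paper, the rest is assumed boilerplate.

```python
# Minimal sketch (not the authors' code): extract the pretrained feature map
# f_cnn from Fig. 1, assuming a standard torchvision ResNet-50 backbone.
import torch
from torchvision import models

backbone = models.resnet50(weights="IMAGENET1K_V1")   # torchvision >= 0.13 weights API
# Drop the global average-pooling and fc layers so the 7x7 spatial grid is kept.
f_cnn = torch.nn.Sequential(*list(backbone.children())[:-2])
f_cnn.eval()

with torch.no_grad():
    x = torch.randn(1, 3, 224, 224)    # stand-in for one 224x224x3 input image
    feat = f_cnn(x)

print(feat.shape)                      # torch.Size([1, 2048, 7, 7])
```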

Contributions

The contribution of our work is threefold:

  • We propose a learnable residual pooling layer comprising a residual encoding module and an aggregation module. We take advantage of the feature learning ability of the convolutional layer and integrate the idea of residual encoding to propose a learnable pooling layer. Moreover, the proposed layer produces residual codes that retain spatial information and aggregates them into a feature with a lower dimension than the state-of-the-art methods (a rough sketch of such a layer appears after this list). Experiments show that retaining the spatial information before aggregation is helpful in feature learning for texture recognition.
  • We propose a novel end-to-end learning framework that integrates the residual pooling layer into a pretrained CNN model for efficient feature transfer for texture recognition. The overview of the proposed residual pooling framework is shown in Fig. 1.
  • We compare the feature dimensions as well as the performance of the proposed pooling layer with other residual encoding schemes, demonstrating state-of-the-art performance on benchmark texture datasets and on a visual inspection dataset from industry. We also test our method on a scene recognition dataset.
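Reading the description above literally, the residual pooling layer could be sketched as below. This is only an illustration of the idea, not the released implementation: the 1 × 1 transfer convolution, the batch-norm placement, and the ReLU-after-averaging order are assumptions.

```python
# Illustrative sketch, not the released implementation: a residual pooling layer
# with (i) a residual encoding module that computes per-location residuals
# against the pretrained feature map used as a compact dictionary, and
# (ii) an aggregation module that discards spatial order only at the end.
import torch
import torch.nn as nn

class ResidualPoolingLayer(nn.Module):
    def __init__(self, channels: int = 2048):
        super().__init__()
        # Learnable convolutional transfer of the pretrained features
        # (a 1x1 convolution is assumed here for illustration).
        self.transfer = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, pretrained_feat: torch.Tensor) -> torch.Tensor:
        # Residual codes keep the 7x7 spatial layout: a hard assignment by
        # spatial location, with the pretrained feature as the codeword.
        residual = self.transfer(pretrained_feat) - pretrained_feat
        # Aggregation module: global average pooling removes the spatial order,
        # giving a 2048-d orderless descriptor (ReLU placement is an assumption).
        return torch.relu(residual.mean(dim=(2, 3)))

# Usage: plug the layer on top of the 7x7x2048 backbone feature map.
layer = ResidualPoolingLayer(2048)
desc = layer(torch.randn(2, 2048, 7, 7))   # shape: (2, 2048)
```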

Background / Problem Statement

  • Following its success, several pretrained CNN models complemented by specific modules to improve accuracy have been proposed [5,13,25] that achieve better performance on benchmark datasets such as the Flickr Material Dataset (FMD) [23] and the Describable Texture Dataset (DTD) [3]. However, since the methods proposed in [4,25] contain multiple steps such as feature extraction, orderless encoding and dimension reduction, the advantages offered by end-to-end learning are not fully utilized. Moreover, the features extracted by all of these methods [4,5,25] have high dimensions, resulting in operations on large matrices.
  • There is a need to balance orderless features and ordered spatial information for effective texture recognition [33]. From feature visualization experiments, we see that pretrained CNN features are able to differentiate textures only to a certain extent. Hence, we propose to use the pretrained CNN features as the compact dictionary. Since the pretrained CNN features mainly focus on the extraction of spatially sensitive information, we implement the hard assignment based on the spatial locations during the calculation of the residuals. Then, in order to get an orderless feature, we propose an aggregation module to remove the spatially sensitive information.
  • The challenge is to make the loss function differentiable with respect to the inputs and layer parameters.
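On that last point, here is a quick check (illustrative only; the 1 × 1 transfer convolution and the 47-class DTD-sized head are assumptions) that residual encoding followed by spatial averaging is built entirely from differentiable operations, so the loss gradient reaches both the pooling-layer parameters and the backbone features.

```python
# Illustrative check, not from the paper: the residual encoding (a learnable
# transform minus the pretrained feature) and the aggregation (spatial mean)
# are differentiable, so gradients flow to the layer parameters and backbone.
import torch
import torch.nn as nn
import torch.nn.functional as F

feat = torch.randn(2, 2048, 7, 7, requires_grad=True)   # stand-in for f_cnn output
transfer = nn.Conv2d(2048, 2048, kernel_size=1, bias=False)
classifier = nn.Linear(2048, 47)                         # e.g. the 47 DTD classes

residual = transfer(feat) - feat                 # residual codes, spatial layout kept
pooled = torch.relu(residual.mean(dim=(2, 3)))   # orderless 2048-d descriptor
loss = F.cross_entropy(classifier(pooled), torch.randint(0, 47, (2,)))
loss.backward()

print(feat.grad is not None, transfer.weight.grad is not None)   # True True
```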

In a word: the aim is to transfer deep-learning models to texture recognition more effectively, since models in this field are usually pretrained on a large dataset such as ImageNet.

Method(s)

Unlike Deep TEN [34], which removes the spatially sensitive information at the very beginning, we retain it until the aggregation module to achieve a balance of orderless features and ordered spatial information.

Our proposed residual encoding module is motivated by Deep TEN [34], but there are two main differences. The first is that the pretrained CNN features themselves serve as the compact dictionary, with a hard assignment based on spatial location when the residuals are computed. The second is that the spatially sensitive information is retained until the aggregation module rather than being removed at the start.
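A tiny numerical illustration (not from the paper) of what "orderless" means here: once the residual codes are averaged over the 7 × 7 grid, shuffling their spatial locations leaves the aggregated descriptor unchanged, so spatial information survives up to, but not beyond, the aggregation module.

```python
# Tiny illustration, not from the paper: after global average pooling the
# descriptor is orderless -- shuffling the spatial locations of the residual
# codes does not change the aggregated feature.
import torch

residual = torch.randn(1, 2048, 7, 7)       # residual codes with spatial layout
pooled = residual.mean(dim=(2, 3))          # aggregation over the 7x7 grid

perm = torch.randperm(49)                   # random reordering of the 49 locations
shuffled = residual.flatten(2)[:, :, perm].reshape(1, 2048, 7, 7)

print(torch.allclose(pooled, shuffled.mean(dim=(2, 3)), atol=1e-5))   # True
```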
