Paper | Cross-stitch Networks for Multi-task Learning

Paper: Cross-stitch Networks for Multi-task Learning

Misra, Ishan, et al. "Cross-stitch networks for multi-task learning." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
Over 160 citations (2019).

1. Problem

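Suppose we have two tasks, A and B, that are related to some degree. The most common approach is for A and B to share a single feature-extraction network on the same input, and then to train separate task-specific branches on top of the shared features to produce each task's output.
As for where to split the network into independent branches, we can run an exhaustive set of experiments that tries every possible architecture, as shown in the figure:
[Figure: try]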
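
The exhaustive search over split points can be sketched as follows. This is a minimal illustration, not the authors' code: the layer names assume an AlexNet-style network, and `enumerate_split_architectures` is a hypothetical helper introduced only for this example. The last layer is always kept task-specific, since each task needs its own output.

```python
# Hypothetical layer list for an AlexNet-style network (illustrative names).
LAYERS = ["conv1", "conv2", "conv3", "conv4", "conv5", "fc6", "fc7", "fc8"]


def enumerate_split_architectures(layers):
    """Return one candidate architecture per split point.

    Layers before the split are shared by both tasks; layers from the
    split onward are duplicated, one copy per task. The final layer is
    never shared, so there are len(layers) candidates.
    """
    candidates = []
    for k in range(len(layers)):
        candidates.append({
            "shared": layers[:k],         # trained once, used by A and B
            "task_specific": layers[k:],  # one copy for A, one for B
        })
    return candidates


candidates = enumerate_split_architectures(LAYERS)
for c in candidates:
    print("shared:", c["shared"], "| per-task:", c["task_specific"])
```

Each candidate has to be trained and evaluated separately, which is why the cost of this search grows with the depth of the network and with the number of task pairs.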

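Clearly, this brute-force search is clumsy, and the best architecture it finds does not generalize to other tasks. Even without theoretical guidance, is there a better experimental approach?
There is! The paper introduced below, Cross-stitch Networks for Multi-task Learning, comes from the Robotics Institute at Carnegie Mellon University and is a highly cited CVPR 2016 paper.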

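First, the authors pick two pairs of related tasks: semantic segmentation and surface normal prediction, and object detection and attribute prediction. Applying the split patterns from the figure above gives the corresponding experimental results:
[Figure: try_result]
The numbers in the figure are relative to the task-specific networks (the rightmost architecture in the first figure). The authors draw two conclusions: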

  1. Multi-task learning has some advantage over single-task learning;
  2. The best split architecture differs from task to task.

Now we come to the main point.

2. Cross-stitch architecture

The figure says it all:
[Figure: cross]

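Don't know where to split and where to share? Turn that choice into parameters and let the network learn it!
Yes, the cross-stitch architecture really is that simple. After the output of each layer, a split/merge unit of this kind is added, and its outputs feed the next layer of each network.
To keep the unit differentiable (so that it can be trained by backpropagation), the split/merge is not a hard switch but a weighted combination controlled by the \(\alpha\) parameters.
In the figure, \(\alpha_S\) denotes the same-task values and \(\alpha_D\) the different-task values. \(\alpha_S=1\) with \(\alpha_D=0\) gives the task-specific architecture, and the larger \(\alpha_D\) is, the more the two tasks share.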
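
A minimal sketch of a single cross-stitch unit, assuming the two networks' activations at a given layer are plain vectors \(x_A\) and \(x_B\) of equal length. Each output is a linear combination of both inputs: \(\tilde{x}_A = \alpha_{AA} x_A + \alpha_{AB} x_B\) and \(\tilde{x}_B = \alpha_{BA} x_A + \alpha_{BB} x_B\), where the diagonal entries play the role of \(\alpha_S\) and the off-diagonal entries the role of \(\alpha_D\). The function name `cross_stitch` and the list-based representation are assumptions made for illustration; a real implementation would operate on tensors inside a deep-learning framework.

```python
def cross_stitch(x_a, x_b, alpha):
    """Apply one cross-stitch unit to two activation vectors.

    alpha is a 2x2 nested list [[a_AA, a_AB], [a_BA, a_BB]].
    Diagonal entries weight a task's own activations (alpha_S);
    off-diagonal entries weight the other task's activations (alpha_D).
    """
    (a_aa, a_ab), (a_ba, a_bb) = alpha
    out_a = [a_aa * u + a_ab * v for u, v in zip(x_a, x_b)]
    out_b = [a_ba * u + a_bb * v for u, v in zip(x_a, x_b)]
    return out_a, out_b


x_a = [1.0, 2.0]  # activations of network A at some layer
x_b = [3.0, 4.0]  # activations of network B at the same layer

# alpha_S = 1, alpha_D = 0: no sharing, each task keeps its own activations.
same_a, same_b = cross_stitch(x_a, x_b, [[1.0, 0.0], [0.0, 1.0]])

# alpha_S = alpha_D = 0.5: both tasks receive the average of the two inputs.
mix_a, mix_b = cross_stitch(x_a, x_b, [[0.5, 0.5], [0.5, 0.5]])
```

Because the outputs are linear in both the inputs and the \(\alpha\) values, gradients flow through the unit to both networks and to \(\alpha\) itself, which is what lets the amount of sharing be learned.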

Introducing the new structure, however, raises the following questions:

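  • How should the \(\alpha\) parameters be initialized?
    To keep the magnitude of the activations unchanged before and after a cross-stitch unit, it is natural to require that the initial values satisfy \(\alpha_S+\alpha_D=1\).
    Note that this constraint applies only at initialization; during training the parameters are free to change.
    Beyond that, the specific values have to be found by experiment, so they differ from task to task:
    [Figure: exp1]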

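  • The initial \(\alpha\) values are one to two orders of magnitude larger than the typical parameters of the network, so in the experiments they were found to update too slowly.
    To speed up convergence, the learning rate of the \(\alpha\) parameters is multiplied by a power of 10.
    Experiments show that a factor of \(10^2\) to \(10^3\) works best: convergence is faster and the final results are also better.
    [Figure: exp2]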

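  • How should networks A and B be initialized: jointly or separately?
    Again, experiments give the answer.
  1. Starting from ImageNet features, train each network separately for 20K iterations to obtain two one-task initialized networks, then train them jointly for another 10K iterations.
  2. Train jointly for 30K iterations directly from ImageNet features.
    The former gives better results:
    [Figure: exp3]
    Initializing each network separately is therefore recommended.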
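  • In the Det+Attr experiments, using a separate cross-stitch unit for each pair of corresponding channels made training unstable.
    For this task pair, therefore, only a single cross-stitch unit is used between consecutive layers.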
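
The first two points in the list above (initializing with \(\alpha_S+\alpha_D=1\), and giving the \(\alpha\) values a larger learning rate than the rest of the network) can be sketched as follows. This is an illustration under stated assumptions, not the authors' training code: `init_alpha` and `sgd_step` are hypothetical helpers, the update is plain SGD without momentum, and the gradient and learning-rate values are made up.

```python
def init_alpha(alpha_s):
    """Build a 2x2 cross-stitch matrix whose rows satisfy alpha_S + alpha_D = 1."""
    alpha_d = 1.0 - alpha_s
    return [[alpha_s, alpha_d], [alpha_d, alpha_s]]


def sgd_step(alpha, grad, base_lr, lr_scale):
    """One plain SGD update of alpha, using base_lr multiplied by lr_scale."""
    lr = base_lr * lr_scale
    return [[a - lr * g for a, g in zip(row_a, row_g)]
            for row_a, row_g in zip(alpha, grad)]


alpha = init_alpha(0.9)                # alpha_S = 0.9, alpha_D = 0.1
grad = [[0.01, -0.01], [-0.01, 0.01]]  # made-up gradient for illustration

# Same gradient, same base learning rate, different scale for alpha.
slow = sgd_step(alpha, grad, base_lr=0.001, lr_scale=1)
fast = sgd_step(alpha, grad, base_lr=0.001, lr_scale=100)

print("initial:", alpha[0][0])
print("scale 1:", slow[0][0])
print("scale 100:", fast[0][0])
```

With the unscaled learning rate the update barely moves a value that starts near 1, which matches the slow adjustment described above; the scale factor makes the step size commensurate with the magnitude of \(\alpha\). Nothing in the update re-imposes the sum-to-one constraint, consistent with it being an initialization rule only.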

3. Experimental design

The experiments highlight the following points:

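  1. Across the experiments on the 4 tasks, cross-stitch achieves the best results among the compared methods, with a few exceptions.

  2. Some of the compared methods have twice as many parameters as this method.

  3. The compared methods include the exhaustive architecture search, the best approach at the time.

  4. In semantic segmentation, the amount of data varies across classes. The experiments show that, roughly, the less data a class has, the larger its accuracy gain (that is, the more cross-stitch helps):
    [Figure: exp4]

  5. For the SS (semantic segmentation) and SN (surface normal prediction) pair, the two tasks' optimal split/share patterns are not quite the same (the horizontal axis in the figure below is the channel index):
    [Figure: exp5]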

Reposted from: https://www.cnblogs.com/RyanXing/p/10730829.html
