Reid - 1 : Omni-Scale Feature Learning for Person Re-Identification


Abstract

As an instance-level recognition problem, person re-identification (ReID) relies on discriminative features, which should not only capture different spatial scales but also encapsulate arbitrary combinations of multiple scales. We call features of both homogeneous and heterogeneous scales omni-scale features.

Omni-Scale Network (OSNet), for omni-scale feature learning: a residual block composed of multiple convolutional feature streams, each detecting features at a certain scale. Importantly, a unified aggregation gate is introduced to dynamically fuse multi-scale features with input-dependent channel-wise weights. To learn spatial-channel correlations efficiently and avoid overfitting, the block uses both pointwise and depthwise convolutions.

Introduction:

The key to ReID is learning discriminative features. We argue that such features need to be omni-scale, defined as the combination of variable homogeneous scales and heterogeneous scales, each of which is composed of a mixture of multiple scales.

The underlying building block consists of multiple convolutional feature streams with different receptive fields. The feature scale each stream focuses on is determined by an exponent, a new dimension factor that increases linearly across the streams so that a variety of scales is captured in each block. Crucially, the resulting multi-scale feature maps are dynamically fused by channel-wise weights generated by a unified aggregation gate (AG). This novel AG design allows the network to learn omni-scale feature representations: depending on the specific input image, the gate can focus attention on a single scale by assigning a dominant weight to a particular stream/scale, or it can pick and mix, thereby producing heterogeneous scales.
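To make the stream-plus-gate structure concrete, below is a minimal PyTorch sketch of a multi-stream residual block whose outputs are fused by a shared aggregation gate. The class names, channel sizes and the simplified Lite-3x3 unit are illustrative assumptions, not the exact OSNet implementation.

import torch
import torch.nn as nn

class LiteConv3x3(nn.Module):
    """Pointwise (1x1) conv followed by a depthwise 3x3 conv (simplified 'Lite 3x3')."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv1x1 = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.dwconv3x3 = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1,
                                   groups=out_ch, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.dwconv3x3(self.conv1x1(x))))

class OmniScaleBlock(nn.Module):
    """Residual block with T streams; stream t stacks t Lite-3x3 units, so its
    receptive field grows with t. One gate (shared by all streams) produces
    input-dependent channel weights that are used to fuse the streams."""
    def __init__(self, channels, num_streams=4, reduction=16):
        super().__init__()
        self.streams = nn.ModuleList([
            nn.Sequential(*[LiteConv3x3(channels, channels) for _ in range(t)])
            for t in range(1, num_streams + 1)
        ])
        # Unified aggregation gate: global average pooling -> tiny MLP -> sigmoid weights.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        fused = 0
        for stream in self.streams:
            xt = stream(x)
            fused = fused + self.gate(xt) * xt  # channel-wise, input-dependent weights
        return self.relu(x + fused)             # residual connection

block = OmniScaleBlock(channels=64)
out = block(torch.randn(2, 64, 64, 32))
print(out.shape)  # torch.Size([2, 64, 64, 32])

Note that the same gate module is applied to every stream, so the fusion weights depend on the input feature maps rather than being fixed per scale.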

 

Global features and local features are equally important.

Some distinctive combinations are also needed, e.g. a white T-shirt plus the logo printed on it,

which is captured by heterogeneous features spanning small (logo-sized) and medium (upper-body) scales.

Omni-Scale Feature Learning

OSNet: designed to learn omni-scale feature representations specifically for the ReID task. factorised convolutional layer ——> the omni-scale residual block ——> unified aggregation gate

Depthwise Separable Convolution:

To reduce parameters (why does depthwise separable convolution reduce parameters? Because the spatial k×k filtering and the cross-channel mixing are no longer done by one joint kernel: the 1×1 pointwise convolution costs c·c' parameters and the k×k depthwise convolution over the c' output channels costs k^2·c' parameters). With this adjustment, the parameter count drops from k^2·c·c' to (k^2+c)·c'.

The basic idea of depthwise separable convolution: replace a convolution layer ReLU(w ∗ x), whose kernel is w ∈ R^(k×k×c×c'), with two separated layers ReLU((v ∘ u) ∗ x), where u ∈ R^(1×1×c×c') is a pointwise (1×1) kernel and v ∈ R^(k×k×1×c') is a depthwise kernel. (OSNet applies the pointwise convolution first and the depthwise convolution second, which the paper calls Lite 3×3.)
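As a quick sanity check on that parameter count, here is a small sketch (my own illustration, not code from the paper) comparing a standard k×k convolution with the pointwise-then-depthwise factorisation for k=3 and c = c' = 256.

import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters())

k, c, c_out = 3, 256, 256

# Standard convolution: one k x k x c x c' kernel.
standard = nn.Conv2d(c, c_out, kernel_size=k, padding=k // 2, bias=False)

# 'Lite 3x3'-style factorisation: 1x1 pointwise conv (c*c' weights)
# followed by a k x k depthwise conv over the c' channels (k^2*c' weights).
factorised = nn.Sequential(
    nn.Conv2d(c, c_out, kernel_size=1, bias=False),
    nn.Conv2d(c_out, c_out, kernel_size=k, padding=k // 2, groups=c_out, bias=False),
)

print(n_params(standard))    # k^2 * c * c' = 9 * 256 * 256 = 589824
print(n_params(factorised))  # (k^2 + c) * c' = (9 + 256) * 256 = 67840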

 

 

 

【Others' reading notes】

  1. https://blog.csdn.net/weixin_42731241/article/details/91415598?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.control&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.control

 

【Running the code】

The codebase (Torchreid) has the following features:

  • multi-GPU training
  • support for both image-reid and video-reid
  • end-to-end training and evaluation
  • incredibly easy preparation of reid datasets
  • multi-dataset training
  • cross-dataset evaluation
  • standard protocol used by most research papers
  • highly extensible (easy to add models, datasets, training methods, etc.)
  • implementations of state-of-the-art person reid models
  • access to pretrained reid models
  • advanced training techniques
  • visualization tools
import torchreid


datamanager = torchreid.data.ImageDataManager(
    root='reid-data',
    sources='market1501',
    targets='market1501',
    height=256,
    width=128,
    batch_size_train=32,
    batch_size_test=100,
    transforms=['random_flip', 'random_crop']
)


print(datamanager)

Run result 1:

D:\miniconda\envs\torchreid\python.exe F:/tianye/deep-person-reid-master/deep-person-reid-master/mytest.py
F:\tianye\deep-person-reid-master\deep-person-reid-master\torchreid\metrics\rank.py:12: UserWarning: Cython evaluation (very fast so highly recommended) is unavailable, now use python evaluation.
  'Cython evaluation (very fast so highly recommended) is '
Building train transforms ...
+ resize to 256x128
+ random flip
+ random crop (enlarge to 288x144 and crop 256x128)
+ to torch tensor of range [0, 1]
+ normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
Building test transforms ...
+ resize to 256x128
+ to torch tensor of range [0, 1]
+ normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
=> Loading train (source) dataset
F:\tianye\deep-person-reid-master\deep-person-reid-master\torchreid\data\datasets\image\market1501.py:38: UserWarning: The current data structure is deprecated. Please put data folders such as "bounding_box_train" under "Market-1501-v15.09.15".
  'The current data structure is deprecated. Please '
=> Loaded Market1501
  ----------------------------------------
  subset   | # ids | # images | # cameras
  ----------------------------------------
  train    |   751 |    12936 |         6
  query    |   750 |     3368 |         6
  gallery  |   751 |    15913 |         6
  ----------------------------------------
=> Loading test (target) dataset
=> Loaded Market1501
  ----------------------------------------
  subset   | # ids | # images | # cameras
  ----------------------------------------
  train    |   751 |    12936 |         6
  query    |   750 |     3368 |         6
  gallery  |   751 |    15913 |         6
  ----------------------------------------


  **************** Summary ****************
  source            : ['market1501']
  # source datasets : 1
  # source ids      : 751
  # source images   : 12936
  # source cameras  : 6
  target            : ['market1501']
  *****************************************


<torchreid.data.datamanager.ImageDataManager object at 0x000001C9358CFBC8>

Process finished with exit code 0
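For reference, a hedged sketch of the steps that typically follow in Torchreid's getting-started example: building a model, an optimizer and an engine on top of the data manager above, then running training and evaluation. The concrete choices below (osnet_x1_0, Adam with lr=0.0003, 60 epochs, the log directory name) are illustrative values adapted from the project README and may need adjusting for other versions.

# Continuing from the datamanager defined above (sketch based on Torchreid's README).
model = torchreid.models.build_model(
    name='osnet_x1_0',                      # the OSNet backbone discussed in the paper
    num_classes=datamanager.num_train_pids,
    loss='softmax',
    pretrained=True
)
model = model.cuda()  # assumes a CUDA-capable GPU is available

optimizer = torchreid.optim.build_optimizer(model, optim='adam', lr=0.0003)
scheduler = torchreid.optim.build_lr_scheduler(
    optimizer, lr_scheduler='single_step', stepsize=20
)

engine = torchreid.engine.ImageSoftmaxEngine(
    datamanager, model, optimizer=optimizer, scheduler=scheduler, label_smooth=True
)
engine.run(
    save_dir='log/osnet_x1_0',
    max_epoch=60,
    eval_freq=10,
    print_freq=10,
    test_only=False
)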

Run 2

Problem 1:

Solution: re-run the script.

 

Problem 2: Pipe error

Solution: search the codebase globally for num_workers and set it to 0.
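An alternative to editing the library is to pass the worker count through the data manager itself. The sketch below assumes ImageDataManager exposes a workers argument (forwarded to the PyTorch DataLoader's num_workers), which is the case in recent Torchreid versions.

# Sketch: single-process data loading to avoid broken-pipe errors on Windows.
datamanager = torchreid.data.ImageDataManager(
    root='reid-data',
    sources='market1501',
    targets='market1501',
    height=256,
    width=128,
    batch_size_train=32,
    batch_size_test=100,
    transforms=['random_flip', 'random_crop'],
    workers=0   # equivalent to num_workers=0 in the underlying DataLoader
)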

 

Problem 3:

Solution:

import torch
torch.backends.cudnn.benchmark = True  # let cuDNN autotune convolution algorithms for fixed input sizes

[Source article](https://blog.csdn.net/weixin_44474718/article/details/89914898)

Deep person re-identification is the task of recognizing a person across different camera views in a surveillance system. It is a challenging problem due to variations in lighting, pose, and occlusion. To address this problem, researchers have proposed various deep learning models that can learn discriminative features for person re-identification. However, achieving state-of-the-art performance often requires carefully designed training strategies and model architectures.

One approach to improving the performance of deep person re-identification is to use a "bag of tricks" consisting of various techniques that have been shown to be effective in other computer vision tasks. These techniques include data augmentation, label smoothing, mixup, warm-up learning rates, and more. By combining these techniques, researchers have been able to achieve significant improvements in re-identification accuracy.

In addition to using a bag of tricks, it is also important to establish a strong baseline for deep person re-identification. A strong baseline provides a foundation for future research and enables fair comparisons between different methods. A typical baseline for re-identification consists of a deep convolutional neural network (CNN) trained on a large-scale dataset such as Market-1501 or DukeMTMC-reID. The baseline should also include appropriate data preprocessing, such as resizing and normalization, and evaluation metrics, such as mean average precision (mAP) and cumulative matching characteristic (CMC) curves.

Overall, combining a bag of tricks with a strong baseline can lead to significant improvements in deep person re-identification performance. This can have important practical applications in surveillance systems, where accurate person recognition is essential for ensuring public safety.