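Original [BCNet] Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers (CVPR. 2021)
1. Motivation: overlapping, occlusion. Segmenting highly overlapping objects is challenging because real object contours and occlusion boundaries are usually not distinguished. Previous work has done little on mask regression, and most objects in the COCO training data carry no occlusion information. Mask R-CNN and its variants directly regress the occluded instance (the occludee), which ignores the occluding instance and the overlap relationship between objects. Segmenting highly-overl..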
2021-04-23 13:00:25
2126
7
Original [Swin Transformer] Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
1. Motivation: Adapting the Transformer from NLP to CV faces two challenges: the large variation in the scale of visual entities, and the high resolution of image pixels compared with words in text, which leads to a large memory cost. Challenges in adapting Transformer from language to vision arise from differences between the two domains, such as large variations in the scale of visual entities .
2021-04-05 17:16:38
417
Original [PVT] Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolution
paper: https://arxiv.org/abs/2102.12122 code: https://github.com/whai362/PVT/ Contents: 1. Motivation 2. Contribution 3. Method 3.1 Overall Architecture 3.2 Feature Pyramid for Transformer 3.3 Spatial-Reduction Attention 3.3 Detailed settings of PVT series 4. Experime.
2021-04-01 11:35:36
532
Original [BoT Net] Bottleneck Transformers for Visual Recognition
1. Motivation: The authors argue that although stacking more layers does improve backbone performance, an explicit mechanism for modeling global dependencies, one that does not need as many layers, can be a more powerful and scalable solution. Although stacking more layers indeed improves the performance of these backbones [72], an explicit mechanism to model global (non-local) de.
2021-03-25 21:48:02
677
Original [YOLOF] You Only Look One-level Feature (CVPR. 2021)
Code: https://github.com/megvii-model/YOLOF Contents: 1. Motivation 2. Contribution 3. Cost Analysis of MiMo Encoders 4. Method 4.1 Limited Scale Range 4.2 Dilated Encoder 4.3 Imbalance Problem on Positive Anchors 4.4 Uniform Matching 5. YOLOF 6. Experiments 6.1 Comparison..
2021-03-21 16:18:31
1364
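Original [VIT] Visual Transformer
1. Motivation: Applications of the Transformer in vision remain limited. In vision, attention is either applied in conjunction with convolutional networks or used to replace certain components of convolutional networks while keeping their overall structure in place. While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limit.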
2021-03-17 19:47:42
371
Original [HOI Transformer] End-to-End Human Object Interaction Detection with HOI Transformer (CVPR. 2021)
1. Motivation: Existing methods for HOI (human-object interaction) detection are either one-stage or two-stage. Current approaches either decouple HOI task into separated stages of object detection and interaction classification or introduce surrogate interaction problem. This paper applies the transformer in an end-to-end manner to human objec.
2021-03-15 21:33:14
1898
1
Original [MEInst] Mask Encoding for Single Shot Instance Segmentation (CVPR. 2020)
1. Motivation: Single-stage segmentation cannot match Mask R-CNN in mask AP. one-stage alternatives cannot compete with Mask R-CNN in mask AP. The authors pose a question: "Is it possible to predict the object mask in the intrinsic low-dimensional space and still achieve competitive accuracy?" and give an affirmative.
2021-03-11 16:40:58
777
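Original [MIAL] Multiple Instance Active Learning for Object Detection (CVPR. 2021)
1. Motivation: Active learning has made great progress in image classification, but object detection still lacks an instance-level active learning method. In this paper, the authors propose Multiple Instance Active Learning (MIAL), which selects the most informative images for detector training by observing instance-level uncertainty. As shown in Figure 1, panel (a) shows the conventional approach, which does not account for the imbalance of negative samples in object detection; the negatives produce noisy instances in the background and interfere with image u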
2021-03-07 11:41:18
1471
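Original [RepVGG] RepVGG: Making VGG-style ConvNets Great Again (CVPR. 2021)
paper: https://arxiv.org/abs/2101.0369 code: https://arxiv.org/abs/2101.0369 1. Motivation: More complicated ConvNets today achieve higher accuracy, but compared with simple ConvNets they have two drawbacks: the complicated design of multi-branch structures, and the computational cost of complicated ConvNets. Meanwhile, other simple ConvNets cannot match the performance of complicated ConvNets. 2. Contribution: The contributions of this paper are as follows: this paper proposes Rep..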
2021-03-04 21:43:41
473
Original [python] Basic notes on reading and writing CSV
Reading with the csv library
import csv
with open('ap.csv', 'r', newline='') as f:
    reader = csv.reader(f)
    title = next(reader)  # header row
    print(title)
    step = []
    ap = []
    for r in reader:
        step.append((int(r[1])+1.0)/1000)
        ap.append(float(r[2])/100) #
2021-02-28 16:55:33
140
Original [ATSS] Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training (CVPR. 2020)
1. Motivation: The main differences between the anchor-based method (RetinaNet) and the anchor-free method (FCOS) come down to the following three points: The number of anchors tiled per location. The definition of positive and negative samples. The regression starting status. FCOS currently achieves better experimental results than RetinaNet, so among these three differences, which one is respon.
2021-02-25 17:25:36
227
Original [TSP-FCOS] Rethinking Transformer-based Set Prediction for Object Detection
Contents: 1. Motivation 2. Contribution 3. What Causes the Slow Convergence of DETR? 3.1 Does Instability of the Bipartite Matching Affect Convergence? 3.2 Are the Attention Modules the Main Cause? 3.3 Does DETR Really Need Cross-attention? 4. The Proposed Method.
2021-02-10 00:14:40
2229
Original [DCN] Deformable Convolutional Networks
Contents: 1. Motivation 2. Contribution 3. Deformable Convolutional Networks 3.1 Deformable Convolution 3.2 Deformable RoI Pooling 3.3 Position-Sensitive (PS) RoI Pooling 3.4 Deformable ConvNets 1. Motivation: Because of their fixed geometric structures, CNNs are limited in modeling geometric transformations. Convolutional neural net.
2021-02-02 22:07:22
465
Original [PSS] Object Detection Made Simpler by Eliminating Heuristic NMS
Contents: 1. Motivation 2. Contribution 3. Our Method 3.1 Overall Training Objective 3.1.1 PSS LOSS 3.1.2 Ranking Loss 3.2 One-to-many Label Assignment 3.3 One-to-one Label Assignment 3.4 Conflict in the Two Classification Loss Terms 3.5 Stop Gradient 4. Experiments 4.1 A.
2021-02-01 17:00:31
532
Original [Relation Network] Relation Networks for Object Detection (CVPR. 2018)
1. Motivation: Intuition: modeling the relations between objects should help object detection. Although it is well believed for years that model
2021-01-23 16:39:36
280
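Original [EmbedMask] EmbedMask: Embedding Coupling for One-stage Instance Segmentation
Contents: 1. Motivation 2. Contribution 3. EmbedMask 3.1 Overview 3.2 Embedding Definition 3.3 Learnable Margin 3.4 Smooth Loss 3.5 Training 3.6 Inference 1. Motivation: Two-stage methods rely on RoIPool, which loses image information (low resolution and the alignment operation of RoIPool/RoIAlign). They also have many parameters and are relatively complex. Existing (2019) one-stage instance segmentation.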
2021-01-22 17:11:21
358
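Original [Xshell key login] Logging in to a server with an XShell key and setting up VSCODE
1. Create a public and private key in Xshell. The public key is stored on the server and the private key on your own computer. The main step is the New User Key Generation Wizard; follow the steps below to obtain the private and public keys. Note that creating the key also asks for a passphrase. It may be left empty, in which case you will not need to enter a password later when connecting from VSCODE, at the cost of somewhat lower security. There are pros and cons; I left it empty. Export the private key. 2. Store the public key on the server.
# Log in to the server with the original password
# Go to .ssh/ under your own account; note that this need not be /root/.ssh, because we do not have such high privileges on the server
cd .ssh/
cat xx.pub >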
2021-01-13 20:32:31
500
Original [FreeAnchor] FreeAnchor: Learning to Match Anchors for Visual Object Detection [NeurIPS 2019]
paper: https://arxiv.org/pdf/1909.02466.pdf code: https://github.com/zhangxiaosong18/FreeAnchor 33rd Contents: 1. Motivation 2. Contribution 1. Motivation 2. Contribution
2021-01-11 23:04:11
396
Original [AutoAssign] AutoAssign: Differentiable Label Assignment for Dense Object Detection
Contents: 1. Motivation 2. Contribution 3. Method 3.1 Prior-level: Center Weighting 3.2 Instance-level: Confidence Weighting 3.2.1 Classification confidence 3.2.2 Joint confidence indicator 3.3.3 Positive weights 3.3.4 Negative weights 3.3.5 Loss function In the paper, for Label; Ass
2020-12-30 10:16:20
395
Original [Sparse R-CNN] Sparse R-CNN: End-to-End Object Detection with Learnable Proposals (CVPR. 2021)
Sparse R-CNN: End-to-End Object Detection with Learnable Proposals paper: https://arxiv.org/pdf/2011.12450.pdf code: https://github.com/PeizeSun/SparseR-CNN Abstract: The authors propose Sparse R-CNN, a sparse method for object detection in images: a fixed set of learnable object proposals, N in total, is used for object recognition to perform classification and localization. By taking the H x W x K hand-designed
2020-12-23 10:30:43
1770
2
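Original [CenterNet] Objects as Points notes
Objects as Points paper: https://arxiv.org/pdf/1904.07850.pdf code: https://github.com/xingyizhou/CenterNet/ Environment setup references: the official installation guide and a very good tutorial blog post. 1. Motivation: Most object detection networks enumerate an exhaustive list of candidate anchors for object localization and classification, which is wasteful and inefficient and requires an extra post-processing step such as NMS. In this paper, the authors take a different approach, called CenterNet, which models an object as a single po.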
2020-12-22 10:04:11
148
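Original [OneNet] OneNet: Towards End-to-End One-Stage Object Detection notes
paper: https://arxiv.org/pdf/2012.05780.pdf code: https://github.com/PeizeSun/OneNet 1. Abstract: The paper argues that the main reason previous one-stage detectors could not drop NMS and become end-to-end lies in how positive samples are selected. Previous label assignment considered only the location cost and not the classification cost, which yields many redundant boxes and makes NMS necessary in post-processing. As the figure in the post shows, RetinaNet selects positives using only the box IoU, while FCOS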
2020-12-18 21:17:48
703
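The claim in the OneNet entry above, that a location-only cost cannot tell redundant boxes apart, can be illustrated with a small sketch. This is not the paper's implementation: the function name `min_cost_assign`, the weights `w_cls` and `w_loc`, and the cost terms themselves (one minus the class probability, plus an L1 box distance) are assumptions chosen for illustration.

```python
from typing import Sequence


def min_cost_assign(
    cls_probs: Sequence[Sequence[float]],
    pred_boxes: Sequence[Sequence[float]],
    gt_label: int,
    gt_box: Sequence[float],
    w_cls: float = 1.0,   # hypothetical weight of the classification cost
    w_loc: float = 1.0,   # hypothetical weight of the location cost
) -> int:
    """Return the index of the one prediction with the lowest total cost."""
    costs = []
    for probs, box in zip(cls_probs, pred_boxes):
        cls_cost = 1.0 - probs[gt_label]                         # low when the class is confident
        loc_cost = sum(abs(p - g) for p, g in zip(box, gt_box))  # L1 distance to the ground truth
        costs.append(w_cls * cls_cost + w_loc * loc_cost)
    # Exactly one positive sample per ground-truth object; ties go to the first index.
    return min(range(len(costs)), key=costs.__getitem__)


# Two predictions with identical boxes but different class scores.
probs = [[0.9, 0.1], [0.2, 0.8]]
boxes = [(0, 0, 10, 10), (0, 0, 10, 10)]
chosen = min_cost_assign(probs, boxes, gt_label=1, gt_box=(0, 0, 10, 10))
chosen_loc_only = min_cost_assign(probs, boxes, 1, (0, 0, 10, 10), w_cls=0.0)
```

With the classification term included, the prediction that is confident about the correct class wins. With `w_cls=0.0` both predictions tie on location alone, which is the redundancy the entry attributes to location-only assignment.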
Original [DETR] End-to-End Object Detection with Transformers (ECCV. 2020 oral) code notes
End-to-End Object Detection with Transformers. Contents: End-to-End Object Detection with Transformers; Network structure; detr/models/detr.py; detr/models/backbone.py. Paper: https://arxiv.org/pdf/2005.12872.pdf Code: https://github.com/facebookresearch/detr Network structure detr/models/detr.py Code
2020-12-08 19:31:39
1534
Original [Condinst] Conditional Convolutions for Instance Segmentation (ECCV. 2020 oral)
Contents: Network structure; mask head; LOSS; 1. AdelaiDet/adet/modeling/condinst/condinst.py 2. AdelaiDet/adet/modeling/condinst/mask_branch.py 3. AdelaiDet/adet/modeling/condinst/dynamic_mask_head.py 4. The top_feat structure of Condinst in AdelaiDet/adet/modeling/fcos/fcos_outputs.py. Network structure mask hea
2020-12-04 23:32:59
1686
3
Original [BlendMask] BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation code notes
Contents: BlendMask network structure: 1. AdelaiDet/adet/modeling/blendmask/blendmask.py 2. AdelaiDet/adet/modeling/blendmask/blender.py 3. AdelaiDet/adet/modeling/blendmask/basis_module.py The overall execution order is backbone fpn and resnet --> fcos --> blendmask.py --> basis_module.py
2020-11-22 22:58:56
1710
3
Original [FCOS] FCOS: Fully Convolutional One-Stage Object Detection (ICCV. 2019) code notes
Contents: 1. AdelaiDet/adet/modeling/fcos/fcos.py 2. AdelaiDet/adet/modeling/fcos/fcos_outputs.py 3. AdelaiDet/adet/layers/iou_loss.py
1. AdelaiDet/adet/modeling/fcos/fcos.py
import math
from typing import List, Dict
import torch
from torch import nn
from torch.n
2020-11-21 20:34:34
819
2
Original [SOLO] SOLO: Segmenting Objects by Locations code walkthrough notes (ECCV. 2020)
Contents: SOLO head; Network structure; Loss function; 1. SOLO/mmdet/models/detectors/single_stage_ins.py 2. SOLO/mmdet/models/anchor_heads/solo_head.py 3. SOLO/mmdet/core/post_processing/matrix_nms.py 4. SOLO/configs/solo/solo_r50_fpn_8gpu_1x.py 5. SOLO/mmdet/models/anchor_heads/_ _init_
2020-11-20 19:34:37
1795
3
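Original [Instance segmentation] IoU and NMS notes
1. IoU. Reference: Zhihu. I could not directly picture the IoU between multiple boxes across several dimensions, so I wrote it out step by step. PyTorch source code:
# IoU computation
# Assume box1 has shape [N, 4] and box2 has shape [M, 4]
def iou(self, box1, box2):
    N = box1.size(0)
    M = box2.size(0)
    lt = torch.max(  # top-left point; note this is the top-left point of the intersection, which is why max is used
        b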
2020-11-10 20:30:01
1353
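The truncated snippet in the IoU entry above computes the IoU between every pair of boxes from an [N, 4] set and an [M, 4] set. A minimal sketch of the same idea in plain Python is given here, assuming boxes are stored as (x1, y1, x2, y2) corner coordinates; the function name `pairwise_iou` is hypothetical, and this is a loop-based stand-in, not the vectorized PyTorch code from the post.

```python
from typing import List, Sequence


def pairwise_iou(
    boxes1: Sequence[Sequence[float]],
    boxes2: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Return an N x M table where entry [i][j] is IoU(boxes1[i], boxes2[j])."""
    table = []
    for x1, y1, x2, y2 in boxes1:
        area1 = (x2 - x1) * (y2 - y1)
        row = []
        for a1, b1, a2, b2 in boxes2:
            area2 = (a2 - a1) * (b2 - b1)
            # Top-left corner of the intersection: the larger of the two top-left corners.
            lt_x, lt_y = max(x1, a1), max(y1, b1)
            # Bottom-right corner of the intersection: the smaller of the two bottom-right corners.
            rb_x, rb_y = min(x2, a2), min(y2, b2)
            # Clamp at zero so disjoint boxes give an empty intersection.
            inter = max(0.0, rb_x - lt_x) * max(0.0, rb_y - lt_y)
            union = area1 + area2 - inter
            row.append(inter / union if union > 0 else 0.0)
        table.append(row)
    return table


ious = pairwise_iou([(0, 0, 2, 2)], [(1, 1, 3, 3), (0, 0, 2, 2), (5, 5, 6, 6)])
```

The three columns cover a partial overlap (intersection 1, union 7), an identical box, and a disjoint box. The clamp to zero is what keeps the disjoint case from producing a spurious positive area.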
Original [pytorch] Notes on basic functions
import torch
s = torch.randn((5,2,3))
a = torch.linspace(-1, 1, s.shape[-1])  # evenly spaced: [-1, 1] split into s.shape[-1] points
b = torch.linspace(-1, 1, s.shape[-2])
print('a:', a)
print('b:', b)
print('-'* 100)
a, b = torch.meshgrid(a, b)  # after meshgrid both have shape [3, 2], but a and b hold different elements
p
2020-11-02 10:24:22
1348
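What `linspace` and `meshgrid` produce in the snippet above can be traced by hand with a plain-Python sketch. The helpers `linspace` and `meshgrid_ij` below are hypothetical stand-ins written for illustration, not the torch functions; `meshgrid_ij` assumes the "ij" indexing that `torch.meshgrid` uses by default.

```python
from typing import List, Sequence, Tuple


def linspace(start: float, end: float, steps: int) -> List[float]:
    """Return `steps` evenly spaced values from start to end, both inclusive."""
    if steps == 1:
        return [float(start)]
    step = (end - start) / (steps - 1)
    return [start + i * step for i in range(steps)]


def meshgrid_ij(
    a: Sequence[float], b: Sequence[float]
) -> Tuple[List[List[float]], List[List[float]]]:
    """Return two len(a) x len(b) grids: one repeats a down the rows, one repeats b across the columns."""
    grid_a = [[x for _ in b] for x in a]
    grid_b = [[y for y in b] for _ in a]
    return grid_a, grid_b


a = linspace(-1, 1, 3)   # plays the role of s.shape[-1] == 3
b = linspace(-1, 1, 2)   # plays the role of s.shape[-2] == 2
grid_a, grid_b = meshgrid_ij(a, b)
```

Both grids have 3 rows and 2 columns. `grid_a` varies along the rows and `grid_b` along the columns, which is why the two outputs of a mesh grid share a shape but not their contents.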
Original [python] Array slicing notes
Array slicing notes
a = np.random.randint(1, 100, [2, 3, 4])
a
array([[[90, 71, 53, 24], [ 6, 74, 99, 21], [39, 30, 94, 32]], [[83, 5, 24, 96], [48, 47, 91, 24], [40, 40, 58, 89]]])
a.max(0)
array([[90, 71, 53, 96], [
2020-10-27 16:51:47
85
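The reduction `a.max(0)` in the entry above collapses the first axis of the 2 x 3 x 4 array. A sketch in plain Python, using the same numbers as the post, shows which elements are compared; `max_axis0` and `max_axis1` are hypothetical helper names written for this illustration, not NumPy functions.

```python
from typing import List

Array3D = List[List[List[int]]]

# The 2 x 3 x 4 array printed in the post.
a: Array3D = [
    [[90, 71, 53, 24], [6, 74, 99, 21], [39, 30, 94, 32]],
    [[83, 5, 24, 96], [48, 47, 91, 24], [40, 40, 58, 89]],
]


def max_axis0(arr: Array3D) -> List[List[int]]:
    """Compare the blocks element by element; the result has shape 3 x 4."""
    return [[max(col) for col in zip(*rows)] for rows in zip(*arr)]


def max_axis1(arr: Array3D) -> List[List[int]]:
    """Within each block, take the maximum of every column; the result has shape 2 x 4."""
    return [[max(col) for col in zip(*block)] for block in arr]


m0 = max_axis0(a)
m1 = max_axis1(a)
```

The first row of `m0` matches the `[90, 71, 53, 96]` visible in the truncated output of the post. Reducing over axis 1 instead keeps the two blocks separate and removes the row dimension.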
Original [pytorch beginner] Notes on the neural network tutorial
Code source: https://github.com/yunjey/pytorch-tutorial, a very good tutorial. linear_regression.py
#linear_regression.py
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
# Hyper-parameters
input_size = 1
output_size = 1
num_epochs = 60
learn
2020-10-14 09:18:17
549
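Original [Detectron2] Training your own dataset with Detectron2/AdelaiDet
Use configuration parameters, integrated directly into detectron2/detectron2/configs/. Training your own dataset with Detectron2/AdelaiDet. AdelaiDet can be regarded as an extension package of Detectron2 and works much like Detectron2. buildin.py buildin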
2020-10-10 13:55:57
8744
78
Original [Paper Reading] FCOS: Fully Convolutional One-Stage Object Detection
FCOS: Fully Convolutional One-Stage Object Detection 1. Introduction: We propose a fully convolutional one-stage object detector (FCOS) to solve object detection in a per-pixel prediction fashion, analogue to semantic segmentation. The authors propose FCOS (a fully convolutional, one-stage
2020-10-04 21:56:52
336
Original [AdelaiDet] Setup, installation, and testing
AdelaiDet 1. Preface: AdelaiDet is an open source toolbox for multiple instance-level recognition tasks on top of Detectron2. All instance-level recognition works from our group are open-sourced here. 2. Install: First install detectron2, following install.md. Note that it is not yet compatible with the latest version.
2020-10-04 10:09:08
3708
13
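Original [python basics] Basic text operations
Storing jpg files separately according to whether their names end in an even or odd number. Problem: when converting the RAW-VOC format to COCO format, the original jpg files need to be split according to the contents of train.txt and val.txt. 1. Plain text operations
# Plain text operations
import os
file=open('/hdd2/wh/pascalraw/PASCALRAW/trainval/train.txt')
#file=open('val.txt')
file_list = []
labelMat = []
for line in file.readli..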
2020-10-02 10:32:31
317
1
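The task described in the entry above, splitting jpg files according to the ids listed in train.txt and val.txt, can be sketched as follows. This is an assumption-laden illustration rather than the post's code: it assumes each line of the list file holds one image id without extension, and the names `read_ids` and `split_by_list` are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def read_ids(lines: Iterable[str]) -> List[str]:
    """Strip whitespace from each line and drop empty lines."""
    return [line.strip() for line in lines if line.strip()]


def split_by_list(
    image_names: Iterable[str], train_ids: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Partition file names into (train, val) by whether their stem is listed as a train id."""
    wanted = set(train_ids)
    train, val = [], []
    for name in image_names:
        (train if Path(name).stem in wanted else val).append(name)
    return train, val


# In practice the lines would come from open('train.txt'); a literal list keeps the sketch self-contained.
ids = read_ids(["0001\n", "0003\n", "\n"])
train_files, val_files = split_by_list(["0001.jpg", "0002.jpg", "0003.jpg"], ids)
```

Keeping the partition step free of file I/O makes it easy to check before any files are copied; the resulting lists can then be passed to `shutil.copy` or a similar call.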
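Original [linux] Notes on system commands
Today, when running the Mask R-CNN source code on a GPU, I got the following error: ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory. After looking it up, I realized that my CUDA version did not match the newly downloaded tensorflow-gpu version: I had downloaded 1.4.0, which corresponds to CUDA 8. See: the link listing which tensorflow-gpu versions match which CUDA versions. conda list. Download the package that matches CUDA 9 tensorflo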
2020-09-26 11:27:19
123
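Original [Gitee] Importing a GitHub project into Gitee and downloading it
Importing a GitHub project into Gitee and downloading it. Preface: Because of the pandemic I slacked off for half the holiday. I wanted to pick up some ML, so I watched related videos on Bilibili, but downloading the shared homework source code from GitHub had become even slower than at the start of the year and never finished. I asked someone more experienced and was told to use Gitee; it turned out to be very convenient and the files finally downloaded, so I am writing it down. I. Register a Gitee account: go to the Gitee website and register at https://gitee.com/ II. Import the repository from GitHub: click the "+" icon and choose to import a repository from GitHub/GitLab. Next, the import page offers three options; we choose the second, "Import Git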
2020-07-24 01:27:04
2168
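Original [linux server] Building and installing FFmpeg and athena-jot without root privileges
Preface: My own ubuntu system could not run the C3D code, so I used the university's server. However, root privileges are not available on that server, so sudo cannot be used. That makes installing some software rather tricky: without the one-click installation I had hoped for, FFmpeg and athena-jot have to be built and installed from source. I am writing this post to record the setbacks of these two days of compiling and installing, for future reference and as practice in building from source without relying so much on sudo apt-get ins...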
2020-03-14 20:50:54
680
1
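Original [ubuntu] GCC and CUDNN version incompatibility
Preface: It has been four months since the last update... I have really been too lazy... The holiday is almost over, so I will try to get something done before going back... My graduation project builds on code that the original author wrote on Linux, so after tinkering on Win10 for a few days I had to pick up my old trade from a year ago, ubuntu >.> First, of course, I checked my CUDA and cuDNN. After verifying CUDA successfully with the method from this blog post, I found that cuDNN could not be verified. Error: error: #error -- unsuppo...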
2020-02-27 01:27:42
746