This column collects papers in computer vision. Date: August 2, 2021. Source: Paper Digest.
1, TITLE: DarkLighter: Light Up The Darkness for UAV Tracking
AUTHORS: Junjie Ye ; Changhong Fu ; Guangze Zheng ; Ziang Cao ; Bowen Li
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: To facilitate aerial tracking in the dark in a general fashion, this work proposes a low-light image enhancer, namely DarkLighter, which is dedicated to iteratively alleviating the impact of poor illumination and noise.
2, TITLE: Real-time Streaming Perception System for Autonomous Driving
AUTHORS: Yongxiang Gu ; Qianlei Wang ; Xiaolin Qin
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To remedy this, we present the real-time streaming perception system in this paper, which is also the 2nd Place solution of Streaming Perception Challenge (Workshop on Autonomous Driving at CVPR 2021) for the detection-only track.
3, TITLE: From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection
AUTHORS: Jiajun Deng ; Wengang Zhou ; Yanyong Zhang ; Houqiang Li
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To this end, in this work, we regard point clouds as hollow-3D data and propose a new architecture, namely Hallucinated Hollow-3D R-CNN (H²3D R-CNN), to address the problem of 3D object detection.
4, TITLE: Exploring Low-light Object Detection Techniques
AUTHORS: Winston Chen ; Tejas Shah
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: 2) In the second phase, we explore different object detection models that can be applied to the enhanced image.
5, TITLE: Self-Supervised Regional and Temporal Auxiliary Tasks for Facial Action Unit Recognition
AUTHORS: Jingwei Yan ; Jingjing Wang ; Qiang Li ; Chunmao Wang ; Shiliang Pu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Motivated by this, we take the AU properties into consideration and propose two auxiliary AU related tasks to bridge the gap between limited annotations and the model performance in a self-supervised manner via the unlabeled data.
6, TITLE: Iterative, Deep, and Unsupervised Synthetic Aperture Sonar Image Segmentation
AUTHORS: Yung-Chen Sun ; Isaac D. Gerg ; Vishal Monga
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we present a new iterative unsupervised algorithm for learning deep features for SAS image segmentation.
7, TITLE: Product1M: Towards Weakly Supervised Instance-Level Product Retrieval Via Cross-modal Pretraining
AUTHORS: Xunlin Zhan et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we investigate a more realistic setting that aims to perform weakly-supervised multi-modal instance-level product retrieval among fine-grained product categories.
8, TITLE: Relightable Neural Video Portrait
AUTHORS: Youjia Wang et al.
CATEGORY: cs.CV [cs.CV, cs.GR]
HIGHLIGHT: In this paper, we present a relightable neural video portrait, a simultaneous relighting and reenactment scheme that transfers the head pose and facial expressions from a source actor to a portrait video of a target actor with arbitrary new backgrounds and lighting conditions.
9, TITLE: Recognizing Emotions Evoked By Movies Using Multitask Learning
AUTHORS: Hassan Hayat ; Carles Ventura ; Agata Lapedriza
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we model the emotions evoked by videos in a different manner: instead of modeling the aggregated value we jointly model the emotions experienced by each viewer and the aggregated value using a multi-task learning approach.
10, TITLE: The Minimum Edit Arborescence Problem and Its Use in Compressing Graph Collections [Extended Version]
AUTHORS: Lucas Gnecco ; Nicolas Boria ; Sébastien Bougleux ; Florian Yger ; David B. Blumenthal
CATEGORY: cs.CV [cs.CV, cs.DS]
HIGHLIGHT: We introduce a unified and generic structure called edit arborescence that relies on edit paths between data in a collection, as well as the Min Edit Arborescence Problem, which asks for an edit arborescence that minimizes the sum of costs of its inner edit paths.
11, TITLE: OpenForensics: Large-Scale Challenging Dataset For Multi-Face Forgery Detection And Segmentation In-The-Wild
AUTHORS: Trung-Nghia Le ; Huy H. Nguyen ; Junichi Yamagishi ; Isao Echizen
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper presents a comprehensive study on two new countermeasure tasks: multi-face forgery detection and segmentation in-the-wild. To promote these new tasks, we have created the first large-scale dataset posing a high level of challenges that is designed with face-wise rich annotations explicitly for face forgery detection and segmentation, namely OpenForensics.
12, TITLE: Fourier Series Expansion Based Filter Parametrization for Equivariant Convolutions
AUTHORS: Qi Xie ; Qian Zhao ; Zongben Xu ; Deyu Meng
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Against this issue, in this paper we modify the classical Fourier series expansion for 2D filters, and propose a new set of atomic basis functions for filter parametrization.
13, TITLE: Towards Robust Vision By Multi-task Learning on Monkey Visual Cortex
AUTHORS: Shahd Safarani et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: Here, we successfully leveraged these inductive biases with a multi-task learning approach: we jointly trained a deep network to perform image classification and to predict neural activity in macaque primary visual cortex (V1).
14, TITLE: T-SVDNet: Exploring High-Order Prototypical Correlations for Multi-Source Domain Adaptation
AUTHORS: Ruihuang Li ; Xu Jia ; Jianzhong He ; Shuaijun Chen ; Qinghua Hu
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: We propose a novel approach named T-SVDNet to address the task of Multi-source Domain Adaptation (MDA), which is featured by incorporating Tensor Singular Value Decomposition (T-SVD) into a neural network's training pipeline.
15, TITLE: Manipulating Identical Filter Redundancy for Efficient Pruning on Deep and Complicated CNN
AUTHORS: Xiaohan Ding ; Tianxiang Hao ; Jungong Han ; Yuchen Guo ; Guiguang Ding
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we propose to manipulate the redundancy during training to facilitate network pruning.
16, TITLE: Pix2Point: Learning Outdoor 3D Using Sparse Point Clouds and Optimal Transport
AUTHORS: Rémy Leroy ; Pauline Trouvé-Peloux ; Frédéric Champagnat ; Bertrand Le Saux ; Marcela Carvalho
CATEGORY: cs.CV [cs.CV, I.4.5; I.2.10]
HIGHLIGHT: In this paper, we address the problem of learning outdoor 3D point cloud from monocular data using a sparse ground-truth dataset.
17, TITLE: Real-Time Anchor-Free Single-Stage 3D Detection with IoU-Awareness
AUTHORS: Runzhou Ge et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this report, we introduce our winning solution to the Real-time 3D Detection and also the "Most Efficient Model" in the Waymo Open Dataset Challenges at CVPR 2021.
18, TITLE: Temporal Feature Warping for Video Shadow Detection
AUTHORS: Shilin Hu ; Hieu Le ; Dimitris Samaras
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a simple but powerful method to better aggregate information temporally.
19, TITLE: ADeLA: Automatic Dense Labeling with Attention for Viewpoint Adaptation in Semantic Segmentation
AUTHORS: Yanchao Yang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We describe an unsupervised domain adaptation method for image content shift caused by viewpoint changes for a semantic segmentation task.
20, TITLE: Automatic Vocabulary and Graph Verification for Accurate Loop Closure Detection
AUTHORS: Haosong Yue et al.
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: To overcome these disadvantages, we propose a natural convergence criterion based on the comparison between the radii of nodes and the drifts of feature descriptors, which is then utilized to build the optimal vocabulary automatically.
21, TITLE: SNE-RoadSeg+: Rethinking Depth-Normal Translation and Deep Supervision for Freespace Detection
AUTHORS: Hengli Wang ; Rui Fan ; Peide Cai ; Ming Liu
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: To address this problem, we introduce SNE-RoadSeg+, an upgraded version of SNE-RoadSeg.
22, TITLE: Can Non-specialists Provide High Quality Gold Standard Labels in Challenging Modalities?
AUTHORS: Samuel Budd et al.
CATEGORY: cs.CV [cs.CV, cs.HC, cs.LG]
HIGHLIGHT: In this work we challenge this assumption and examine the implications of using a minimally trained novice labelling workforce to acquire annotations for a complex medical image dataset.
23, TITLE: Seeing Poverty from Space, How Much Can It Be Tuned?
AUTHORS: Tomas Sako ; Arturo Jr M. Martinez
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper we demonstrate that individuals with no organizational affiliation and equipped only with common hardware, publicly available datasets and cloud-based computing services can participate in the improvement of machine-learning-based approaches to predicting local poverty levels in a given agro-ecological environment.
24, TITLE: DPT: Deformable Patch-based Transformer for Visual Recognition
AUTHORS: Zhiyang Chen et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address this problem, we propose a new Deformable Patch (DePatch) module which learns to adaptively split the images into patches with different positions and scales in a data-driven way rather than using predefined fixed patches.
25, TITLE: Out-of-Core Surface Reconstruction Via Global TGV Minimization
AUTHORS: Nikolai Poliarnyi
CATEGORY: cs.CV [cs.CV, cs.DC, cs.GR, I.4.8; I.3.5; C.2.4]
HIGHLIGHT: We present an out-of-core variational approach for surface reconstruction from a set of aligned depth maps.
26, TITLE: Shadow Art Revisited: A Differentiable Rendering Based Approach
AUTHORS: Kaustubh Sadekar ; Ashish Tiwari ; Shanmuganathan Raman
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we revisit shadow art using differentiable rendering based optimization frameworks to obtain the 3D sculpture from a set of shadow (binary) images and their corresponding projection information.
27, TITLE: Sparse-to-dense Feature Matching: Intra and Inter Domain Cross-modal Learning in Domain Adaptation for 3D Semantic Segmentation
AUTHORS: Duo Peng ; Yinjie Lei ; Wen Li ; Pingping Zhang ; Yulan Guo
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In light of this, we propose to further leverage 2D data for 3D domain adaptation by intra and inter domain cross modal learning.
28, TITLE: Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation
AUTHORS: Bowen Zhang ; Yifan Liu ; Zhi Tian ; Chunhua Shen
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: As each location on the encoder's output corresponds to a local patch of the semantic labels, in this work, we represent these local patches of labels with compact neural networks.
29, TITLE: Enhancing Social Relation Inference with Concise Interaction Graph and Discriminative Scene Representation
AUTHORS: Xiaotian Yu et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, by mimicking human understanding on images, we propose an approach of PRactical Inference in Social rElation (PRISE), which concisely learns interactive features of persons and discriminative features of holistic scenes. To further boost the performance in social relation inference, we collect and distribute a new large-scale dataset, which consists of about 240 thousand unlabeled images.
30, TITLE: Instant Visual Odometry Initialization for Mobile AR
AUTHORS: Alejo Concha ; Michael Burri ; Jesús Briales ; Christian Forster ; Luc Oth
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present a 6-DoF monocular visual odometry that initializes instantly and without motion parallax. We release a dataset for the relative pose problem using real data to facilitate the comparison with future solutions for the relative pose problem.
31, TITLE: Medical Instrument Segmentation in 3D US By Hybrid Constrained Semi-Supervised Learning
AUTHORS: Hongxu Yang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this article, we propose a semi-supervised learning (SSL) framework for instrument segmentation in 3D US, which requires much less annotation effort than the existing methods.
32, TITLE: Synth-by-Reg (SbR): Contrastive Learning for Synthesis-based Registration of Paired Images
AUTHORS: Adrià Casamitjana ; Matteo Mancini ; Juan Eugenio Iglesias
CATEGORY: cs.CV [cs.CV, cs.LG, eess.IV]
HIGHLIGHT: Here we propose a synthesis-by-registration method to convert this problem into an easier intra-modality task.
33, TITLE: High-Resolution Depth Maps Based on TOF-Stereo Fusion
AUTHORS: Vineet Gandhi ; Jan Cech ; Radu Horaud
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a novel TOF-stereo fusion method based on an efficient seed-growing algorithm which uses the TOF data projected onto the stereo image pair as an initial set of correspondences.
34, TITLE: PiBase: An IoT-based Security System Using Raspberry Pi and Google Firebase
AUTHORS: Venkat Margapuri ; Niketa Penumajji ; Mitchell Neilsen
CATEGORY: cs.CR [cs.CR, cs.CV, cs.LO]
HIGHLIGHT: Machine learning algorithms, namely Haar-feature based cascade classifiers and Local Binary Patterns Histograms (LBPH), are used for face detection and face recognition, respectively.
35, TITLE: ManiSkill: Learning-from-Demonstrations Benchmark for Generalizable Manipulation Skills
AUTHORS: Tongzhou Mu et al.
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, cs.RO]
HIGHLIGHT: In this work, we focus on object-level generalization and propose SAPIEN Manipulation Skill Benchmark (abbreviated as ManiSkill), a large-scale learning-from-demonstrations benchmark for articulated object manipulation with visual input (point cloud and image).
36, TITLE: Perceiver IO: A General Architecture for Structured Inputs & Outputs
AUTHORS: Andrew Jaegle et al.
CATEGORY: cs.LG [cs.LG, cs.CL, cs.CV, cs.SD, eess.AS]
HIGHLIGHT: The recently-proposed Perceiver model obtains good results on several domains (images, audio, multimodal, point clouds) while scaling linearly in compute and memory with the input size.
37, TITLE: On The Efficacy of Small Self-Supervised Contrastive Models Without Distillation Signals
AUTHORS: Haizhou Shi et al.
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this paper, we study the issue of training self-supervised small models without distillation signals.
38, TITLE: When Deep Learners Change Their Mind: Learning Dynamics for Active Learning
AUTHORS: Javad Zolfaghari Bengar ; Bogdan Raducanu ; Joost van de Weijer
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this paper, we propose a new informativeness-based active learning method.
39, TITLE: Topological Similarity Index and Loss Function for Blood Vessel Segmentation
AUTHORS: R. J. Araújo ; J. S. Cardoso ; H. P. Oliveira
CATEGORY: eess.IV [eess.IV, cs.CV, I.2.10; I.4.6]
HIGHLIGHT: In this paper, we propose a similarity index which captures the topological consistency of the predicted segmentations having as reference the ground truth.
40, TITLE: Single Image Deep Defocus Estimation and Its Applications
AUTHORS: Fernando J. Galetto ; Guang Deng
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: Based on this principle and the widely used assumption that Gaussian blur is a good model for defocus blur, we formulate the problem of estimating the spatially varying defocus blurriness as a Gaussian blur classification problem. We have created a dataset of more than 500000 image patches of size 32x32 which are used to train and test several well-known network models.
41, TITLE: Automatic Multi-Stain Registration of Whole Slide Images in Histopathology
AUTHORS: Abubakr Shafique et al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this paper, we propose a two-step automatic feature-based cross-staining WSI alignment to assist localization of even tiny metastatic foci in the assessment of lymph node.