Computer Vision Papers 2021-08-03

SophiaCV

Article link: https://blog.csdn.net/Sophia_11/article/details/119356870

This column collects computer vision papers. Date: August 3, 2021. Source: Paper Digest.

1, TITLE: Unlimited Neighborhood Interaction for Heterogeneous Trajectory Prediction
AUTHORS: FANG ZHENG et al.
CATEGORY: cs.AI [cs.AI, cs.CV]
HIGHLIGHT: To address these problems, we propose a simple yet effective Unlimited Neighborhood Interaction Network (UNIN), which predicts trajectories of heterogeneous agents in multiple categories.

2, TITLE: MTVR: Multilingual Moment Retrieval in Videos
AUTHORS: Jie Lei ; Tamara L. Berg ; Mohit Bansal
CATEGORY: cs.CL [cs.CL, cs.AI, cs.CV]
HIGHLIGHT: We introduce mTVR, a large-scale multilingual video moment retrieval dataset, containing 218K English and Chinese queries from 21.8K TV show video clips.

3, TITLE: Hyper360 -- A Next Generation Toolset for Immersive Media
AUTHORS: HANNES FASSOLD et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we describe the work done so far in the Hyper360 project on tools for mixed 360° video and 3D content.

4, TITLE: My Eyes Are Up Here: Promoting Focus on Uncovered Regions in Masked Face Recognition
AUTHORS: PEDRO C. NETO et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we address the challenge of masked face recognition (MFR) and focus on evaluating the verification performance in FRS when verifying masked vs unmasked faces compared to verifying only unmasked faces.

5, TITLE: Multiplex Graph Networks for Multimodal Brain Network Analysis
AUTHORS: ZHAOMING KONG et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose MGNet, a simple and effective multiplex graph convolutional network (GCN) model for multimodal brain network analysis.

6, TITLE: Wood-leaf Classification of Tree Point Cloud Based on Intensity and Geometrical Information
AUTHORS: JINGQIAN SUN et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: By using both the intensity and spatial information, a three-step classification and verification method was proposed to achieve automated wood-leaf classification.

7, TITLE: CrossFormer: A Versatile Vision Transformer Based on Cross-scale Attention
AUTHORS: WENXIAO WANG et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: To make up for this defect, we propose Cross-scale Embedding Layer (CEL) and Long Short Distance Attention (LSDA).

8, TITLE: Efficient Deep Feature Calibration for Cross-Modal Joint Embedding Learning
AUTHORS: Zhongwei Xie ; Ling Liu ; Lin Li ; Luo Zhong
CATEGORY: cs.CV [cs.CV, cs.MM]
HIGHLIGHT: This paper introduces a two-phase deep feature calibration framework for efficient learning of semantics enhanced text-image cross-modal joint embedding, which clearly separates the deep feature calibration in data preprocessing from training the joint embedding model.

9, TITLE: A Dynamic 3D Spontaneous Micro-expression Database: Establishment and Evaluation
AUTHORS: Fengping Wang ; Jie Li ; Chun Qi ; Yun Zhang ; Danmin Miao
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Therefore, we propose a new micro-expression database containing 2D video sequences and 3D point cloud sequences.

10, TITLE: Group Fisher Pruning for Practical Network Compression
AUTHORS: LIYANG LIU et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we present a general channel pruning approach that can be applied to various complicated structures.

11, TITLE: Learning Instance-level Spatial-Temporal Patterns for Person Re-identification
AUTHORS: MIN REN et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a novel Instance-level and Spatial-Temporal Disentangled Re-ID method (InSTD), to improve Re-ID accuracy.

12, TITLE: Towards Explainable Artificial Intelligence (XAI) for Early Anticipation of Traffic Accidents
AUTHORS: Muhammad Monjurul Karim ; Yu Li ; Ruwen Qin
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: An accident anticipation model aims to predict accidents promptly and accurately before they occur.

13, TITLE: Self Context and Shape Prior for Sensorless Freehand 3D Ultrasound Reconstruction
AUTHORS: MINGYUAN LUO et al.
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: In this paper, we propose a novel approach to sensorless freehand 3D US reconstruction considering the complex skill sequences.

14, TITLE: Greedy Network Enlarging
AUTHORS: CHUANJIAN LIU et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: Recent studies on deep convolutional neural networks present a simple paradigm of architecture design, i.e., models with more MACs typically achieve better accuracy, such as EfficientNet and RegNet.

15, TITLE: BezierSeg: Parametric Shape Representation for Fast Object Segmentation in Medical Images
AUTHORS: HAICHOU CHEN et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: To overcome these undesirable artifacts, we propose the BezierSeg model which outputs bezier curves encompassing the region of interest.

16, TITLE: PSE-Match: A Viewpoint-free Place Recognition Method with Parallel Semantic Embedding
AUTHORS: Peng Yin ; Lingyun Xu ; Anton Egorov ; Bing Li
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: To tackle these challenges, we present PSE-Match, a viewpoint-free place recognition method based on parallel semantic analysis of isolated semantic attributes from 3D point-cloud models.

17, TITLE: Delving Into Deep Image Prior for Adversarial Defense: A Novel Reconstruction-based Defense Framework
AUTHORS: LI DING et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To defend against adversarial attacks in a training-free and attack-agnostic manner, this work proposes a novel and effective reconstruction-based defense framework by delving into deep image prior (DIP).

18, TITLE: Learning Maritime Obstacle Detection from Weak Annotations By Scaffolding
AUTHORS: Lojze Žust ; Matej Kristan
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: We propose a new scaffolding learning regime (SLR) that allows training obstacle detection segmentation networks only from such weak annotations, thus significantly reducing the cost of ground-truth labeling.

19, TITLE: Towards Robust Object Detection: Bayesian RetinaNet for Homoscedastic Aleatoric Uncertainty Modeling
AUTHORS: Natalia Khanzhina ; Alexey Lapenok ; Andrey Filchenkov
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: To model such noise, we propose homoscedastic aleatoric uncertainty estimation and present a series of novel loss functions to address the problem of image object detection at scale.

20, TITLE: FLASH: Fast Neural Architecture Search with Hardware Optimization
AUTHORS: Guihong Li ; Sumit K. Mandal ; Umit Y. Ogras ; Radu Marculescu
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: This paper proposes FLASH, a very fast NAS methodology that co-optimizes the DNN accuracy and performance on a real hardware platform.

21, TITLE: Angle Based Feature Learning in GNN for 3D Object Detection Using Point Cloud
AUTHORS: Md Afzal Ansari ; Md Meraz ; Pavan Chakraborty ; Mohammed Javed
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we present new feature encoding methods for the detection of 3D objects in point clouds.

22, TITLE: GraphFPN: Graph Feature Pyramid Network for Object Detection
AUTHORS: Gangming Zhao ; Weifeng Ge ; Yizhou Yu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose graph feature pyramid networks that are capable of adapting their topological structures to varying intrinsic image structures and supporting simultaneous feature interactions across all scales.

23, TITLE: Learn to Match: Automatic Matching Network Design for Visual Tracking
AUTHORS: Zhipeng Zhang ; Yihao Liu ; Xiao Wang ; Bing Li ; Weiming Hu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Thus, in this work, we introduce six novel matching operators from the perspective of feature fusion instead of explicit similarity learning, namely Concatenation, Pointwise-Addition, Pairwise-Relation, FiLM, Simple-Transformer and Transductive-Guidance, to explore more feasibility on matching operator selection.

24, TITLE: Training Face Verification Models from Generated Face Identity Data
AUTHORS: Dennis Conway ; Loic Simon ; Alexis Lechervy ; Frederic Jurie
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper we consider an approach to increase the privacy protection of data sets, as applied to face recognition. By independently varying these vectors during image generation, we create a synthetic data set of fictitious face identities.

25, TITLE: Semi-Supervising Learning, Transfer Learning, and Knowledge Distillation with SimCLR
AUTHORS: Khoi Nguyen ; Yen Nguyen ; Bao Le
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we aim to conduct our analyses on three different aspects of SimCLR, the current state-of-the-art semi-supervised learning framework for computer vision.

26, TITLE: Congested Crowd Instance Localization with Dilated Convolutional Swin Transformer
AUTHORS: Junyu Gao ; Maoguo Gong ; Xuelong Li
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we focus on how to achieve precise instance localization in high-density crowd scenes, and to alleviate the problem that the feature extraction ability of the traditional model is reduced due to the target occlusion, the image blur, etc.

27, TITLE: Deep Feature Tracker: A Novel Application for Deep Convolutional Neural Networks
AUTHORS: Mostafa Parchami ; Saif Iftekar Sayed
CATEGORY: cs.CV [cs.CV, cs.LG, cs.RO]
HIGHLIGHT: In this paper, we proposed a novel and unified deep learning-based approach that can learn how to track features reliably as well as learn how to detect such reliable features for tracking purposes.

28, TITLE: Object-to-Scene: Learning to Transfer Object Knowledge to Indoor Scene Recognition
AUTHORS: Bo Miao ; Liguang Zhou ; Ajmal Mian ; Tin Lun Lam ; Yangsheng Xu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we analyze the weaknesses of current methods and propose an Object-to-Scene (OTS) method, which extracts object features and learns object relations to recognize indoor scenes.

29, TITLE: Deep Graph Matching Meets Mixed-integer Linear Programming: Relax at Your Own Risk ?
AUTHORS: Zhoubo Xu ; Puqing Chen ; Romain Raveaux ; Xin Yang ; Huadong Liu
CATEGORY: cs.CV [cs.CV, cs.LG, math.OC]
HIGHLIGHT: Therefore, we propose an approach integrating a MILP formulation of the graph matching problem.

30, TITLE: Comparing Object Recognition in Humans and Deep Convolutional Neural Networks -- An Eye Tracking Study
AUTHORS: Leonard E. van Dyck ; Roland Kwitt ; Sebastian J. Denzler ; Walter R. Gruber
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this proof-of-concept study, we demonstrate a comparison of human observers (N = 45) and three feedforward DCNNs through eye tracking and saliency maps.

31, TITLE: Controlling Weather Field Synthesis Using Variational Autoencoders
AUTHORS: Dario Augusto Borges Oliveira ; Jorge Guevara Diaz ; Bianca Zadrozny ; Campbell Watson
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Controlling Weather Field Synthesis Using Variational Autoencoders

32, TITLE: Object-aware Contrastive Learning for Debiased Scene Representation
AUTHORS: Sangwoo Mo ; Hyunwoo Kang ; Kihyuk Sohn ; Chun-Liang Li ; Jinwoo Shin
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: For (a), we propose the contrastive class activation map (ContraCAM), which finds the most discriminative regions (e.g., objects) in the image compared to the other images using the contrastively trained models.

33, TITLE: BORM: Bayesian Object Relation Model for Indoor Scene Recognition
AUTHORS: LIGUANG ZHOU et al.
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: In this paper, we propose to utilize meaningful object representations for indoor scene representation.

34, TITLE: Multi-Head Self-Attention Via Vision Transformer for Zero-Shot Learning
AUTHORS: Faisal Alamri ; Anjan Dutta
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this work, we propose an attention-based model in the problem settings of ZSL to learn attributes useful for unseen class recognition.

35, TITLE: Multimodal Feature Fusion for Video Advertisements Tagging Via Stacking Ensemble
AUTHORS: Qingsong Zhou ; Hai Liang ; Zhimin Lin ; Kele Xu
CATEGORY: cs.CV [cs.CV, cs.MM]
HIGHLIGHT: In this paper, we present our approach for Multimodal Video Ads Tagging in the 2021 Tencent Advertising Algorithm Competition.

36, TITLE: HiFT: Hierarchical Feature Transformer for Aerial Tracking
AUTHORS: Ziang Cao ; Changhong Fu ; Junjie Ye ; Bowen Li ; Yiming Li
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: Thus, in this work, we propose an efficient and effective hierarchical feature transformer (HiFT) for aerial tracking.

37, TITLE: Self-supervised Audiovisual Representation Learning for Remote Sensing Data
AUTHORS: KONRAD HEIDLER et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: For this purpose, we introduce the SoundingEarth dataset, which consists of co-located aerial imagery and audio samples all around the world.

38, TITLE: Word2Pix: Word to Pixel Cross Attention Transformer in Visual Grounding
AUTHORS: Heng Zhao ; Joey Tianyi Zhou ; Yew-Soon Ong
CATEGORY: cs.CV [cs.CV, cs.AI, cs.CL]
HIGHLIGHT: In this paper we propose Word2Pix: a one-stage visual grounding network based on encoder-decoder transformer architecture that enables learning for textual to visual feature correspondence via word to pixel attention.

39, TITLE: LDDMM-Face: Large Deformation Diffeomorphic Metric Learning for Flexible and Consistent Face Alignment
AUTHORS: Huilin Yang ; Junyan Lyu ; Pujin Cheng ; Xiaoying Tang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We innovatively propose a flexible and consistent face alignment framework, LDDMM-Face, the key contribution of which is a deformation layer that naturally embeds facial geometry in a diffeomorphic way.

40, TITLE: PoseFusion2: Simultaneous Background Reconstruction and Human Shape Recovery in Real-time
AUTHORS: Huayan Zhang ; Tianwei Zhang ; Tin Lun Lam ; Sethu Vijayakumar
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: In this work, we present a fast, learning-based human object detector to isolate the dynamic human objects and realise a real-time dense background reconstruction framework.

41, TITLE: Multi-scale Matching Networks for Semantic Correspondence
AUTHORS: DONGYANG ZHAO et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a multi-scale matching network that is sensitive to tiny semantic differences between neighboring pixels.

42, TITLE: A New Semi-supervised Learning Benchmark for Classifying View and Diagnosing Aortic Stenosis from Echocardiograms
AUTHORS: Zhe Huang ; Gary Long ; Benjamin Wessler ; Michael C. Hughes
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: Motivated by the urgent need to improve timely diagnosis of life-threatening heart conditions, especially aortic stenosis, we develop a benchmark dataset to assess semi-supervised approaches to two tasks relevant to cardiac ultrasound (echocardiogram) interpretation: view classification and disease severity classification.

43, TITLE: Chest ImaGenome Dataset for Clinical Reasoning
AUTHORS: JOY T. WU et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.CL, cs.LG]
HIGHLIGHT: Inspired by the Visual Genome effort in the computer vision community, we constructed the first Chest ImaGenome dataset with a scene graph data structure to describe 242,072 images.

44, TITLE: Unsupervised Cross-Modal Distillation for Thermal Infrared Tracking
AUTHORS: JINGXIAN SUN et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To solve this problem, we propose to distill representations of the TIR modality from the RGB modality with Cross-Modal Distillation (CMD) on a large amount of unlabeled paired RGB-TIR data.

45, TITLE: Distributed Attention for Grounded Image Captioning
AUTHORS: NENGLUN CHEN et al.
CATEGORY: cs.CV [cs.CV, cs.MM]
HIGHLIGHT: To this end, we propose a simple yet effective method to alleviate the issue, termed as partial grounding problem in our paper.

46, TITLE: Forward-Looking Sonar Patch Matching: Modern CNNs, Ensembling, and Uncertainty
AUTHORS: Arka Mallick ; Paul Plöger ; Matias Valdenegro-Toro
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: In this paper we improve on our previous results for this problem (Valdenegro-Toro et al., 2017): instead of modeling features manually, a Convolutional Neural Network (CNN) learns a similarity function and predicts whether two input sonar images are similar.

47, TITLE: GTNet: Guided Transformer Network for Detecting Human-Object Interactions
AUTHORS: A S M Iftekhar ; Satish Kumar ; R. Austin McEver ; Suya You ; B. S. Manjunath
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: GTNet: Guided Transformer Network for Detecting Human-Object Interactions

48, TITLE: SDEdit: Image Synthesis and Editing with Stochastic Differential Equations
AUTHORS: CHENLIN MENG et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: We introduce a new image editing and synthesis framework, Stochastic Differential Editing (SDEdit), based on a recent generative model using stochastic differential equations (SDEs).

49, TITLE: S$^2$-MLPv2: Improved Spatial-Shift MLP Architecture for Vision
AUTHORS: Tan Yu ; Xu Li ; Yunfeng Cai ; Mingming Sun ; Ping Li
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we improve the S$^2$-MLP vision backbone.

50, TITLE: On The State of Data In Computer Vision: Human Annotations Remain Indispensable for Developing Deep Learning Models
AUTHORS: ZEYAD EMAM et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we survey computer vision research domains that study the effects of such large datasets on model performance across different vision tasks.

51, TITLE: Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution
AUTHORS: Liangbin Xie ; Xintao Wang ; Chao Dong ; Zhongang Qi ; Ying Shan
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: To find the answer, we propose a new diagnostic tool -- Filter Attribution method based on Integral Gradient (FAIG).

52, TITLE: Margin-Aware Intra-Class Novelty Identification for Medical Images
AUTHORS: Xiaoyuan Guo ; Judy Wawira Gichoya ; Saptarshi Purkayastha ; Imon Banerjee
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To tackle the challenges, we propose a hybrid model, Transformation-based Embedding learning for Novelty Detection (TEND), which, without any out-of-distribution training data, performs novelty identification by combining autoencoder-based and classifier-based methods.

53, TITLE: Pose-Guided Feature Learning with Knowledge Distillation for Occluded Person Re-Identification
AUTHORS: KECHENG ZHENG et al.
CATEGORY: cs.CV [cs.CV, cs.MM]
HIGHLIGHT: At the cost of incorporating a pose estimator, many works introduce pose information to alleviate the misalignment in both training and testing.

54, TITLE: Manifold-Inspired Single Image Interpolation
AUTHORS: Lantao Yu ; Kuida Liu ; Michael T. Orchard
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To overcome the challenge in the first part, we propose a carefully designed adaptive technique to remove aliasing in severely aliased regions, which cannot be removed by traditional techniques.

55, TITLE: T$_k$ML-AP: Adversarial Attacks to Top-$k$ Multi-Label Learning
AUTHORS: Shu Hu ; Lipeng Ke ; Xin Wang ; Siwei Lyu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we develop methods to create adversarial perturbations that can be used to attack top-$k$ multi-label learning-based image annotation systems (TkML-AP).

56, TITLE: Scene Inference for Object Illumination Editing
AUTHORS: ZHONGYUN BAO et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To this end, we apply a physically-based rendering method to create a large-scale, high-quality dataset, named IH dataset, which provides rich illumination information for the seamless illumination integration task.

57, TITLE: An Applied Deep Learning Approach for Estimating Soybean Relative Maturity from UAV Imagery to Aid Plant Breeding Decisions
AUTHORS: Saba Moeinizade ; Hieu Pham ; Ye Han ; Austin Dobbels ; Guiping Hu
CATEGORY: cs.CV [cs.CV, cs.LG, eess.IV]
HIGHLIGHT: In this paper, we develop a robust and automatic approach for estimating the relative maturity of soybeans using a time series of UAV images.

58, TITLE: Reliable Semantic Segmentation with Superpixel-Mix
AUTHORS: GIANNI FRANCHI et al.
CATEGORY: cs.CV [cs.CV, cs.AI, stat.ML]
HIGHLIGHT: To improve reliability, we introduce Superpixel-mix, a new superpixel-based data augmentation method with teacher-student consistency training.

59, TITLE: HR-Crime: Human-Related Anomaly Detection in Surveillance Videos
AUTHORS: Kayleigh Boekhoudt ; Alina Matei ; Maya Aghaei ; Estefanía Talavera
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we introduce HR-Crime, a subset of the UCF-Crime dataset suitable for human-related anomaly detection tasks.

60, TITLE: Multilevel Knowledge Transfer for Cross-Domain Object Detection
AUTHORS: Botos Csaba ; Xiaojuan Qi ; Arslan Chaudhry ; Puneet Dokania ; Philip Torr
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this work, we address the domain shift problem for the object detection task.

61, TITLE: SyDog: A Synthetic Dog Dataset for Improved 2D Pose Estimation
AUTHORS: Moira Shooter ; Charles Malleson ; Adrian Hilton
CATEGORY: cs.CV [cs.CV, cs.AI, cs.GR]
HIGHLIGHT: To address this problem we introduce SyDog: a synthetic dataset of dogs containing ground truth pose and bounding box coordinates which was generated using the game engine, Unity. We release the SyDog dataset as a training and evaluation benchmark for research in animal motion.

62, TITLE: Towards Adversarially Robust and Domain Generalizable Stereo Matching By Rethinking DNN Feature Backbones
AUTHORS: Kelvin Cheng ; Christopher Healey ; Tianfu Wu
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: This paper proposes to rethink the learnable DNN-based feature backbone towards adversarially-robust and domain generalizable stereo matching, either by completely removing it or by applying it only to the left reference image.

63, TITLE: Learning Few-shot Open-set Classifiers Using Exemplar Reconstruction
AUTHORS: Sayak Nag ; Dripta S. Raychaudhuri ; Sujoy Paul ; Amit K. Roy-Chowdhury
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Instead, we propose a novel exemplar reconstruction-based meta-learning strategy for jointly detecting open class samples and categorizing samples from seen classes via metric-based classification.

64, TITLE: ELLIPSDF: Joint Object Pose and Shape Optimization with A Bi-level Ellipsoid and Signed Distance Function Description
AUTHORS: Mo Shan ; Qiaojun Feng ; You-Yi Jau ; Nikolay Atanasov
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper proposes an expressive yet compact model for joint object pose and shape optimization, and an associated optimization algorithm to infer an object-level map from multi-view RGB-D camera observations.

65, TITLE: BundleTrack: 6D Pose Tracking for Novel Objects Without Instance or Category-Level 3D Models
AUTHORS: Bowen Wen ; Kostas Bekris
CATEGORY: cs.CV [cs.CV, cs.AI, cs.GR, cs.RO]
HIGHLIGHT: This work proposes BundleTrack, a general framework for 6D pose tracking of novel objects, which does not depend upon 3D models, either at the instance or category-level.

66, TITLE: Pro-UIGAN: Progressive Face Hallucination from Occluded Thumbnails
AUTHORS: Yang Zhang ; Xin Yu ; Xiaobo Lu ; Ping Liu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we study the task of hallucinating an authentic high-resolution (HR) face from an occluded thumbnail.

67, TITLE: LASOR: Learning Accurate 3D Human Pose and Shape Via Synthetic Occlusion-Aware Data and Neural Mesh Rendering
AUTHORS: Kaibing Yang ; Renshu Gu ; Masahiro Toyoura ; Gang Xu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a framework that synthesizes occlusion-aware silhouette and 2D keypoints data and directly regresses to the SMPL pose and shape parameters.

68, TITLE: Multiple Classifiers Based Maximum Classifier Discrepancy for Unsupervised Domain Adaptation
AUTHORS: Yiju Yang ; Taejoon Kim ; Guanghui Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose to extend the structure to multiple classifiers to further boost its performance.

69, TITLE: Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to Popular Convolutional Neural Network and Potential Visual Transformer
AUTHORS: Jinghua Zhang ; Chen Li ; Marcin Grzegorzek
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this review, the background and motivation are introduced first.

70, TITLE: RINDNet: Edge Detection for Discontinuity in Reflectance, Illumination, Normal and Depth
AUTHORS: Mengyang Pu ; Yaping Huang ; Qingji Guan ; Haibin Ling
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a novel neural network solution, RINDNet, to jointly detect all four types of edges. For training and evaluation, we construct the first public benchmark, BSDS-RIND, with all four types of edges carefully annotated.

71, TITLE: Neural Free-Viewpoint Performance Rendering Under Complex Human-object Interactions
AUTHORS: GUOXING SUN et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.GR]
HIGHLIGHT: In this paper, we propose a neural human performance capture and rendering system to generate both high-quality geometry and photo-realistic texture of both human and objects under challenging interaction scenarios in arbitrary novel views, from only sparse RGB streams.

72, TITLE: Investigating Attention Mechanism in 3D Point Cloud Object Detection
AUTHORS: Shi Qiu ; Yunfan Wu ; Saeed Anwar ; Chongyi Li
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This work investigates the role of the attention mechanism in 3D point cloud object detection and provides insights into the potential of different attention modules.

73, TITLE: Knowing When to Quit: Selective Cascaded Regression with Patch Attention for Real-Time Face Alignment
AUTHORS: Gil Shapira ; Noga Levy ; Ishay Goldin ; Roy J. Jevnisek
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we aim to optimize for both accuracy and speed and explore the trade-off between them.

74, TITLE: Recurrent Mask Refinement for Few-Shot Medical Image Segmentation
AUTHORS: Hao Tang ; Xingwei Liu ; Shanlin Sun ; Xiangyi Yan ; Xiaohui Xie
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we propose a new framework for few-shot medical image segmentation based on prototypical networks.

75, TITLE: Visual Boundary Knowledge Translation for Foreground Segmentation
AUTHORS: ZUNLEI FENG et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we make an attempt towards building models that explicitly account for visual boundary knowledge, in the hope of reducing the training effort on segmenting unseen categories.

76, TITLE: Edge-competing Pathological Liver Vessel Segmentation with Limited Labels
AUTHORS: ZUNLEI FENG et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper collects the first pathological liver image dataset containing 522 whole slide images with labels of vessels, MVI, and hepatocellular carcinoma grades. Based on the collected dataset, we propose an Edge-competing Vessel Segmentation Network (EVS-Net), which contains a segmentation network and two edge segmentation discriminators.

77, TITLE: Constrained Graphic Layout Generation Via Latent Optimization
AUTHORS: Kotaro Kikuchi ; Edgar Simo-Serra ; Mayu Otani ; Kota Yamaguchi
CATEGORY: cs.CV [cs.CV, cs.MM]
HIGHLIGHT: In this work, we generate graphic layouts that can flexibly incorporate such design semantics, either specified implicitly or explicitly by a user.

78, TITLE: WAS-VTON: Warping Architecture Search for Virtual Try-on Network
AUTHORS: ZHENYU XIE et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we address this problem by finding clothing category-specific warping networks for the virtual try-on task via Neural Architecture Search (NAS).

79, TITLE: Shallow Feature Matters for Weakly Supervised Object Localization
AUTHORS: JUN WEI et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a simple but effective Shallow feature-aware Pseudo supervised Object Localization (SPOL) model for accurate WSOL, which makes the most of low-level features embedded in shallow layers.

80, TITLE: Developing A Compressed Object Detection Model Based on YOLOv4 for Deployment on Embedded GPU Platform of Autonomous System
AUTHORS: ISSAC SIM et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Therefore, this paper proposes a new object detection model, referred to as YOffleNet, which is compressed at a high ratio while minimizing the accuracy loss for real-time and safe driving applications on an autonomous system.

81, TITLE: Shallow Attention Network for Polyp Segmentation
AUTHORS: JUN WEI et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address the above issues, we propose the Shallow Attention Network (SANet) for polyp segmentation.

82, TITLE: Threat of Adversarial Attacks on Deep Learning in Computer Vision: Survey II
AUTHORS: Naveed Akhtar ; Ajmal Mian ; Navid Kardan ; Mubarak Shah
CATEGORY: cs.CV [cs.CV, cs.CR, cs.CY, cs.LG]
HIGHLIGHT: Hence, as a legacy sequel of [2], this literature review focuses on the advances in this area since 2018.

83, TITLE: Discovering "Semantics" in Super-Resolution Networks
AUTHORS: YIHAO LIU et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we give affirmative answers to this question.

84, TITLE: CSC-Unet: A Novel Convolutional Sparse Coding Strategy Based Neural Network for Semantic Segmentation
AUTHORS: HAITONG TANG et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: In this paper, we propose a novel strategy that reformulates the popularly-used convolution operation as a multi-layer convolutional sparse coding block to ease the aforementioned deficiency.

85, TITLE: SSPU-Net: Self-Supervised Point Cloud Upsampling Via Differentiable Rendering
AUTHORS: Yifan Zhao ; Le Hui ; Jin Xie
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a self-supervised point cloud upsampling network (SSPU-Net) to generate dense point clouds without using ground truth.

86, TITLE: Learning TFIDF Enhanced Joint Embedding for Recipe-Image Cross-Modal Retrieval Service
AUTHORS: Zhongwei Xie ; Ling Liu ; Yanzhao Wu ; Lin Li ; Luo Zhong
CATEGORY: cs.CV [cs.CV, cs.IR]
HIGHLIGHT: We present a Multi-modal Semantics enhanced Joint Embedding approach (MSJE) for learning a common feature space between the two modalities (text and image), with the ultimate goal of providing high-performance cross-modal retrieval services.

87, TITLE: Active Perception for Ambiguous Objects Classification
AUTHORS: Evgenii Safronov ; Nicola Piga ; Michele Colledanchise ; Lorenzo Natale
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: In this work, we propose a framework that, given a single view of an object, provides the coordinates of a next viewpoint to discriminate the object against similar ones, if any, and eliminates ambiguities.

88, TITLE: Explainable Deep Few-shot Anomaly Detection with Deviation Networks
AUTHORS: Guansong Pang ; Choubo Ding ; Chunhua Shen ; Anton van den Hengel
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: To address this problem, we introduce a novel weakly-supervised anomaly detection framework to train detection models without assuming the examples illustrating all possible classes of anomaly.

89, TITLE: An Effective and Robust Detector for Logo Detection
AUTHORS: XIAOJUN JIA et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To overcome this problem, a novel logo detector based on the mechanism of looking and thinking twice is proposed in this paper for robust logo detection.

90, TITLE: Self-supervised Learning with Local Attention-Aware Feature
AUTHORS: Trung X. Pham ; Rusty John Lloyd Mina ; Dias Issa ; Chang D. Yoo
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: In this work, we propose a novel methodology for self-supervised learning for generating global and local attention-aware visual features.

91, TITLE: CERL: A Unified Optimization Framework for Light Enhancement with Realistic Noise
AUTHORS: Zeyuan Chen ; Yifan Jiang ; Dong Liu ; Zhangyang Wang
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: We present Coordinated Enhancement for Real-world Low-light Noisy Images (CERL), which seamlessly integrates light enhancement and noise suppression into a unified, physics-grounded optimization framework.

92, TITLE: Flip Learning: Erase to Segment
AUTHORS: YUHAO HUANG et. al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG, cs.MA]
HIGHLIGHT: Unlike existing weakly-supervised approaches, in this study we propose a novel and general weakly-supervised segmentation (WSS) framework called Flip Learning, which needs only box annotations.

93, TITLE: I2V-GAN: Unpaired Infrared-to-Visible Video Translation
AUTHORS: SHUANG LI et. al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address this challenging problem, we propose an infrared-to-visible (I2V) video translation method, I2V-GAN, which generates fine-grained and spatio-temporally consistent visible-light videos from unpaired infrared videos. We also provide a new dataset for I2V video translation, named IRVI.

94, TITLE: StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators
AUTHORS: Rinon Gal ; Or Patashnik ; Haggai Maron ; Gal Chechik ; Daniel Cohen-Or
CATEGORY: cs.CV [cs.CV, cs.CL, cs.GR, cs.LG]
HIGHLIGHT: Leveraging the semantic power of large-scale Contrastive Language-Image Pre-training (CLIP) models, we present a text-driven method that allows shifting a generative model to new domains, without having to collect even a single image from those domains.

95, TITLE: BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning
AUTHORS: Jinyuan Jia ; Yupei Liu ; Neil Zhenqiang Gong
CATEGORY: cs.CR [cs.CR, cs.CV, cs.LG]
HIGHLIGHT: In this work, we propose BadEncoder, the first backdoor attack to self-supervised learning.

96, TITLE: Bringing AI Pipelines Onto Cloud-HPC: Setting A Baseline for Accuracy of COVID-19 AI Diagnosis
AUTHORS: Iacopo Colonnelli ; Barbara Cantalupo ; Concetto Spampinato ; Matteo Pennisi ; Marco Aldinucci
CATEGORY: cs.DC [cs.DC, cs.CV, eess.IV, D.1.3; D.3.2; C.1.3]
HIGHLIGHT: In this work, we advocate the StreamFlow Workflow Management System as a crucial ingredient to define a parametric pipeline, called "CLAIRE COVID-19 Universal Pipeline," which is able to explore the optimization space of methods to classify COVID-19 lung lesions from CT scans, compare them for accuracy, and therefore set a performance baseline.

97, TITLE: Self-Supervised Disentangled Representation Learning for Third-Person Imitation Learning
AUTHORS: Jinghuan Shang ; Michael S. Ryoo
CATEGORY: cs.RO [cs.RO, cs.CV, cs.LG]
HIGHLIGHT: In this paper, we present a third-person imitation learning (TPIL) approach for robot tasks with egomotion.

98, TITLE: Musical Speech: A Transformer-based Composition Tool
AUTHORS: Jason d'Eon ; Sri Harsha Dumpala ; Chandramouli Shama Sastry ; Dani Oore ; Sageev Oore
CATEGORY: cs.SD [cs.SD, cs.CV, cs.LG, eess.AS]
HIGHLIGHT: In this paper, we propose a new compositional tool that will generate a musical outline of speech recorded/provided by the user for use as a musical building block in their compositions.

99, TITLE: Thermal Image Super-Resolution Using Second-Order Channel Attention with Varying Receptive Fields
AUTHORS: Nolan B. Gutierrez ; William J. Beksi
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this paper, we introduce a system to efficiently reconstruct thermal images.

100, TITLE: Bespoke Fractal Sampling Patterns for Discrete Fourier Space Via The Kaleidoscope Transform
AUTHORS: Jacob M. White ; Stuart Crozier ; Shekhar S. Chandra
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: By introducing a novel image transform known as the kaleidoscope transform, which formalises and extends the concept of downsampling and concatenating an image with itself, this paper: (1) demonstrates a fundamental relationship between multiplication in modular arithmetic and downsampling; (2) provides a rigorous mathematical explanation for the fractal nature of the sampling pattern in the DFT; and (3) leverages this understanding to develop a collection of novel fractal sampling patterns for the 2D DFT with customisable properties.

101, TITLE: DCT2net: An Interpretable Shallow CNN for Image Denoising
AUTHORS: Sébastien Herbreteau ; Charles Kervrann
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this paper, we demonstrate that a DCT denoiser can be seen as a shallow CNN, so its original linear transform can be tuned through gradient descent in a supervised manner, considerably improving its performance.

102, TITLE: Cohort Bias Adaptation in Aggregated Datasets for Lesion Segmentation
AUTHORS: BRENNAN NICHYPORUK et. al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this paper, we propose a generalized affine conditioning framework to learn and account for cohort biases across multi-source datasets, which we call Source-Conditioned Instance Normalization (SCIN).

103, TITLE: Style Curriculum Learning for Robust Medical Image Segmentation
AUTHORS: ZHENDONG LIU et. al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: We propose a novel framework to ensure robust segmentation in the presence of distribution shifts.

104, TITLE: Multi-phase Liver Tumor Segmentation with Spatial Aggregation and Uncertain Region Inpainting
AUTHORS: YUE ZHANG et. al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this work, we propose a novel liver tumor segmentation (LiTS) method to adequately aggregate multi-phase information and refine uncertain region segmentation.

105, TITLE: Projective Skip-Connections for Segmentation Along A Subset of Dimensions in Retinal OCT
AUTHORS: Dmitrii Lachinov ; Philipp Seeboeck ; Julia Mai ; Ursula Schmidt-Erfurth ; Hrvoje Bogunovic
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this work, we propose a novel convolutional neural network architecture that can effectively learn to produce a lower-dimensional segmentation mask than the input image.