cvpr_2020_papers
CVPR 2020 Papers. Updating …
- Normalized and Geometry-Aware Self-Attention Network for Image Captioning
[Paper]
- High-Resolution Daytime Translation Without Domain Labels
[Paper]
- SAM: The Sensitivity of Attribution Methods to Hyperparameters
[Paper]
- Efficient and Robust Shape Correspondence via Sparsity-Enforced Quadratic Assignment
[Paper]
- CPR-GCN: Conditional Partial-Residual Graph Convolutional Network in Automated Anatomical Labeling of Coronary Arteries
[Paper]
- Collaborative Distillation for Ultra-Resolution Universal Style Transfer
[Paper]
- A Content Transformation Block For Image Style Transfer
[Paper]
- Lighthouse: Predicting Lighting Volumes for Spatially-Coherent Illumination
[Paper]
- Event Probability Mask (EPM) and Event Denoising Convolutional Neural Network (EDnCNN) for Neuromorphic Cameras
[Paper]
- High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification
[Paper]
- SwapText: Image Based Texts Transfer in Scenes
[Paper]
- Rotate-and-Render: Unsupervised Photorealistic Face Rotation from Single-View Images
[Paper]
- Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation
[Paper]
- CycleISP: Real Image Restoration via Improved Data Synthesis
[Paper]
- Weakly-Supervised Salient Object Detection via Scribble Annotations
[Paper]
- EventSR: From Asynchronous Events to Image Reconstruction, Restoration, and Super-Resolution via End-to-End Adversarial Learning
[Paper]
- Weakly-Supervised 3D Human Pose Learning via Multi-view Images in the Wild
[Paper]
- Revisiting the Sibling Head in Object Detector
[Paper]
- Resolution Adaptive Networks for Efficient Inference
[Paper]
- On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location
[Paper]
- Closed-loop Matters: Dual Regression Networks for Single Image Super-Resolution
[Paper]
- Vec2Face: Unveil Human Faces from their Blackbox Features in Face Recognition
[Paper]
- Camera Trace Erasing
[Paper]
- Siamese Box Adaptive Network for Visual Tracking
[Paper]
- MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird’s Eye View Maps
[Paper]
- Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition
[Paper]
- Collaborative Motion Prediction via Neural Motion Message Passing
[Paper]
- Counterfactual Samples Synthesizing for Robust Visual Question Answering
[Paper]
- OccuSeg: Occupancy-aware 3D Instance Segmentation
[Paper]
- GeoDA: a geometric framework for black-box adversarial attacks
[Paper]
- Harmonizing Transferability and Discriminability for Adapting Object Detectors
[Paper]
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems
[Paper]
- Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation
[Paper]
- Semantic Pyramid for Image Generation
[Paper]
- Partial Weight Adaptation for Robust DNN Inference
[Paper]
- Cascade EF-GAN: Progressive Facial Expression Editing with Local Focuses
[Paper]
- Towards Photo-Realistic Virtual Try-On by Adaptively Generating
[Paper]
- End-to-End Learning Local Multi-view Descriptors for 3D Point Clouds
[Paper]
- Towards High-Fidelity 3D Face Reconstruction from In-the-Wild Images Using Graph Convolutional Networks
[Paper]
- Learning to Segment 3D Point Clouds in 2D Image Space
[Paper]
- VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions
[Paper]
- Softmax Splatting for Video Frame Interpolation
[Paper]
- ENSEI: Efficient Secure Inference via Frequency-Domain Homomorphic Convolution for Privacy-Preserving Visual Recognition
[Paper]
- Equalization Loss for Long-Tailed Object Recognition
[Paper]
- Cars Can’t Fly up in the Sky: Improving Urban-Scene Segmentation via Height-driven Attention Networks
[Paper]
- Visual Grounding in Video for Unsupervised Word Translation
[Paper]
- Cloth in the Wind: A Case Study of Physical Measurement through Simulation
[Paper]
- Learning Video Object Segmentation from Unlabeled Videos
[Paper]
- PANDA: A Gigapixel-level Human-centric Video Dataset
[Paper]
- Hierarchical Human Parsing with Typed Part-Relation Reasoning
[Paper]
- Incremental Few-Shot Object Detection
[Paper]
- TailorNet: Predicting Clothing in 3D as a Function of Human Pose, Shape and Garment Style
[Paper]
- FOAL: Fast Online Adaptive Learning for Cardiac Motion Estimation
[Paper]
- Compositional Convolutional Neural Networks: A Deep Architecture with Innate Robustness to Partial Occlusion
[Paper]
- Cascaded Human-Object Interaction Recognition
[Paper]
- Cross-View Tracking for Multi-Human 3D Pose Estimation at over 100 FPS
[Paper]
- PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models
[Paper]
- Better Captioning with Sequence-Level Exploration
[Paper]
- Scalable Uncertainty for Computer Vision with Functional Variational Inference
[Paper]
- Probability Weighted Compact Feature for Domain Adaptive Retrieval
[Paper]
- D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features
[Paper]
- Bundle Adjustment on a Graph Processor
[Paper]
- Show, Edit and Tell: A Framework for Editing Image Captions
[Paper]
- Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation
[Paper]
- Combating noisy labels by agreement: A joint training method with co-regularization
[Paper]
- Embedding Expansion: Augmentation in Embedding Space for Deep Metric Learning
[Paper]
- Detecting Attended Visual Targets in Video
[Paper]
- Adversarial Vertex Mixup: Toward Better Adversarially Robust Generalization
[Paper]
- Learning for Video Compression with Hierarchical Quality and Recurrent Enhancement
[Paper]
- RMP-SNNs: Residual Membrane Potential Neuron for Enabling Deeper High-Accuracy and Low-Latency Spiking Neural Networks
[Paper]
- Holistically-Attracted Wireframe Parsing
[Paper]
- Unsupervised Learning of Intrinsic Structural Representation Points
[Paper]
- Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction
[Paper]
- Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications
[Paper]
- Learn2Perturb: an End-to-end Feature Perturbation Learning to Improve Adversarial Robustness
[Paper]
- D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry
[Paper]
- Learning Fast and Robust Target Models for Video Object Segmentation
[Paper]
- Learning Texture Invariant Representation for Domain Adaptation of Semantic Segmentation
[Paper]
- Extremely Dense Point Correspondences using a Learned Feature Descriptor
[Paper]
- MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships
[Paper]
- PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling
[Paper]
- Learning When and Where to Zoom with Deep Reinforcement Learning
[Paper]
- Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision
[Paper]
- Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning
[Paper]
- Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs
[Paper]
- Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation
[Paper]
- HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection
[Paper]
- Transferring Dense Pose to Proximal Animal Classes
[Paper]
- Predicting Sharp and Accurate Occlusion Boundaries in Monocular Depth Estimation Using Displacement Fields
[Paper]
- KeypointNet: A Large-scale 3D Keypoint Dataset Aggregated from Numerous Human Annotations
[Paper]
- A Spatiotemporal Volumetric Interpolation Network for 4D Dynamic Medical Image
[Paper]
- A U-Net Based Discriminator for Generative Adversarial Networks
[Paper]
- 4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras
[Paper]
- MINA: Convex Mixed-Integer Programming for Non-Rigid Shape Alignment
[Paper]
- Learning in the Frequency Domain
[Paper]
- Blurry Video Frame Interpolation
[Paper]
- Meta-Transfer Learning for Zero-Shot Super-Resolution
[Paper]
- Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image
[Paper]
- Visual Commonsense R-CNN
[Paper]
- Evolving Losses for Unsupervised Video Representation Learning
[[Paper](https://arxiv.org/pdf/2002.12177)]
- Unbiased Scene Graph Generation from Biased Training
[[Paper](https://arxiv.org/pdf/2002.11949)]
- Auto-Encoding Twin-Bottleneck Hashing
[[Paper](https://arxiv.org/pdf/2002.11930)]
- Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction
[[Paper](https://arxiv.org/pdf/2002.11927)]
- Towards Universal Representation Learning for Deep Face Recognition
[[Paper](https://arxiv.org/pdf/2002.11841)]
- Learning to Shade Hand-drawn Sketches
[[Paper](https://arxiv.org/pdf/2002.11812)]
- Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution
[[Paper](https://arxiv.org/pdf/2002.11616)]
- Object Relational Graph with Teacher-Recommended Learning for Video Captioning
[[Paper](https://arxiv.org/pdf/2002.11566)]
- Rethinking the Route Towards Weakly Supervised Object Localization
[[Paper](https://arxiv.org/pdf/2002.11359)]
- Generalized ODIN: Detecting Out-of-distribution Image without Learning from Out-of-distribution Data
[[Paper](https://arxiv.org/pdf/2002.11297)]
- Generalized Product Quantization Network for Semi-supervised Image Retrieval
[[Paper](https://arxiv.org/pdf/2002.11281)]
- Transfer Learning from Synthetic to Real-Noise Denoising with Adaptive Instance Normalization
[[Paper](https://arxiv.org/pdf/2002.11244)]
- PointAugment: an Auto-Augmentation Framework for Point Cloud Classification
[[Paper](https://arxiv.org/pdf/2002.10876)]
- MPM: Joint Representation of Motion and Position Map for Cell Tracking
[[Paper](https://arxiv.org/pdf/2002.10749)]
- FPConv: Learning Local Flattening for Point Convolution
[[Paper](https://arxiv.org/pdf/2002.10701)]
- Hierarchical Conditional Relation Networks for Video Question Answering
[[Paper](https://arxiv.org/pdf/2002.10698)]
- Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training
[[Paper](https://arxiv.org/pdf/2002.10638)]
- Revisiting Saliency Metrics: Farthest-Neighbor Area Under Curve
[[Paper](https://arxiv.org/pdf/2002.10540)]
- Sketchformer: Transformer-based Representation for Sketched Structure
[[Paper](https://arxiv.org/pdf/2002.10381)]
- Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image Retrieval
[[Paper](https://arxiv.org/pdf/2002.10310)]
- Mnemonics Training: Multi-Class Incremental Learning without Forgetting
[[Paper](https://arxiv.org/pdf/2002.10211)]
- ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network
[[Paper](https://arxiv.org/pdf/2002.10200)]
- Online high rank matrix completion
[[Paper](https://arxiv.org/pdf/2002.08934)]
- Unsupervised Multi-Class Domain Adaptation: Theory, Algorithms, and Practice
[[Paper](https://arxiv.org/pdf/2002.08681)]
- MAST: A Memory-Augmented Self-supervised Tracker
[[Paper](https://arxiv.org/pdf/2002.07793)]
- On Positive-Unlabeled Classification in GAN
[[Paper](https://arxiv.org/pdf/2002.01136)]
- Multi-Modal Domain Adaptation for Fine-Grained Action Recognition
[[Paper](https://arxiv.org/pdf/2001.09691)]
- Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentences
[[Paper](https://arxiv.org/pdf/2001.06891)]
- Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network
[[Paper](https://arxiv.org/pdf/2001.06268)]
- Registration made easy – standalone orthopedic navigation with HoloLens
[[Paper](https://arxiv.org/pdf/2001.06209)]
- DSGN: Deep Stereo Geometry Network for 3D Object Detection
[[Paper](https://arxiv.org/pdf/2001.03398)]
- Convolutional Networks with Dense Connectivity
[[Paper](https://arxiv.org/pdf/2001.02394)]
- Deep Snake for Real-Time Instance Segmentation
[[Paper](https://arxiv.org/pdf/2001.01629)]
- Learned Image Compression with Discretized Gaussian Mixture Likelihoods and Attention Modules
[[Paper](https://arxiv.org/pdf/2001.01568)]
- Agriculture-Vision: A Large Aerial Image Database for Agricultural Pattern Analysis
[[Paper](https://arxiv.org/pdf/2001.01306)]
- PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection
[[Paper](https://arxiv.org/pdf/1912.12898)]
- AdaBits: Neural Network Quantization with Adaptive Bit-Widths
[[Paper](https://arxiv.org/pdf/1912.09666)]
- Fashion Outfit Complementary Item Retrieval
[[Paper](https://arxiv.org/pdf/1912.08967)]
- Cross-Batch Memory for Embedding Learning
[[Paper](https://arxiv.org/pdf/1912.06798)]
- Neural Cages for Detail-Preserving 3D Deformations
[[Paper](https://arxiv.org/pdf/1912.06395)]
- L3DOC: Lifelong 3D Object Classification
[[Paper](https://arxiv.org/pdf/1912.06135)]
- VIBE: Video Inference for Human Body Pose and Shape Estimation
[[Paper](https://arxiv.org/pdf/1912.05656)]
- Learning to Discriminate Information for Online Action Detection
[[Paper](https://arxiv.org/pdf/1912.04461)]
- Optimizing Rank-based Metrics with Blackbox Differentiation
[[Paper](https://arxiv.org/pdf/1912.03500)]
- Grid-GCN for Fast and Scalable Point Cloud Learning
[[Paper](https://arxiv.org/pdf/1912.02984)]
- Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection
[[Paper](https://arxiv.org/pdf/1912.02424)]
- BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition
[[Paper](https://arxiv.org/pdf/1912.02413)]
- 15 Keypoints Is All You Need
[[Paper](https://arxiv.org/pdf/1912.02323)]
- Hi-CMD: Hierarchical Cross-Modality Disentanglement for Visible-Infrared Person Re-Identification
[[Paper](https://arxiv.org/pdf/1912.01230)]
- Just Go with the Flow: Self-Supervised Scene Flow Estimation
[[Paper](https://arxiv.org/pdf/1912.00497)]
- Blockwisely Supervised Neural Architecture Search with Knowledge Distillation
[[Paper](https://arxiv.org/pdf/1911.13053)]
- GhostNet: More Features from Cheap Operations
[[Paper](https://arxiv.org/pdf/1911.11907)]
- End-to-End Model-Free Reinforcement Learning for Urban Driving using Implicit Affordances
[[Paper](https://arxiv.org/pdf/1911.10868)]
- Two Causal Principles for Improving Visual Dialog
[[Paper](https://arxiv.org/pdf/1911.10496)]
- Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation
[[Paper](https://arxiv.org/pdf/1911.10194)]
- Orderless Recurrent Models for Multi-label Classification
[[Paper](https://arxiv.org/pdf/1911.09996)]
- Search to Distill: Pearls are Everywhere but not the Eyes
[[Paper](https://arxiv.org/pdf/1911.09074)]
- EfficientDet: Scalable and Efficient Object Detection
[[Paper](https://arxiv.org/pdf/1911.09070)]
- Improving the Robustness of Capsule Networks to Image Affine Transformations
[[Paper](https://arxiv.org/pdf/1911.07968)]
- Towards Visually Explaining Variational Autoencoders
[[Paper](https://arxiv.org/pdf/1911.07389)]
- CenterMask : Real-Time Anchor-Free Instance Segmentation
[[Paper](https://arxiv.org/pdf/1911.06667)]
- Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks
[[Paper](https://arxiv.org/pdf/1911.04933)]
- Uninformed Students: Student-Teacher Anomaly Detection with Discriminative Latent Embeddings
[[Paper](https://arxiv.org/pdf/1911.02357)]
- Attention Convolutional Binary Neural Tree for Fine-Grained Visual Categorization
[[Paper](https://arxiv.org/pdf/1909.11378)]
- Detecting Adversarial Samples Using Influence Functions and Nearest Neighbors
[[Paper](https://arxiv.org/pdf/1909.06872)]
- End-to-End Learnable Geometric Vision by Backpropagating PnP Optimization
[[Paper](https://arxiv.org/pdf/1909.06043)]
- HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation
[[Paper](https://arxiv.org/pdf/1908.10357)]
- Neural Blind Deconvolution Using Deep Priors
[[Paper](https://arxiv.org/pdf/1908.02197)]
- AdaCoF: Adaptive Collaboration of Flows for Video Frame Interpolation
[[Paper](https://arxiv.org/pdf/1907.10244)]
- Dynamic Face Video Segmentation via Reinforcement Learning
[[Paper](https://arxiv.org/pdf/1907.01296)]
- NAS-FCOS: Fast Neural Architecture Search for Object Detection
[[Paper](https://arxiv.org/pdf/1906.04423)]
- Defending Against Universal Attacks Through Selective Feature Regeneration
[[Paper](https://arxiv.org/pdf/1906.03444)]
- Unsupervised Learning from Video with Deep Neural Embeddings
[[Paper](https://arxiv.org/pdf/1905.11954)]
- Towards Efficient Model Compression via Learned Global Ranking
[[Paper](https://arxiv.org/pdf/1904.12368)]
- RevealNet: Seeing Behind Objects in RGB-D Scans
[[Paper](https://arxiv.org/pdf/1904.12012)]
- Deep Parametric Shape Predictions using Distance Fields
[[Paper](https://arxiv.org/pdf/1904.08921)]
- Training Quantized Neural Networks with a Full-precision Auxiliary Module
[[Paper](https://arxiv.org/pdf/1903.11236)]
- The GAN that Warped: Semantic Attribute Editing with Unpaired Data
[[Paper](https://arxiv.org/pdf/1811.12784)]
- Semi-Supervised Semantic Image Segmentation with Self-correcting Networks
[[Paper](https://arxiv.org/pdf/1811.07073)]
- Π-nets: Deep Polynomial Neural Networks
[Paper]