Paper ID | Paper Title | Category
267 | Quaternion Equivariant Capsule Networks for 3D Point Clouds | Oral
283 | DeepFit: 3D Surface Fitting by Neural Network Weighted Least Squares | Oral
343 | MoSaNAS: Multi-Objective Surrogate-Assisted Neural Architecture Search | Oral
384 | Describing Textures using Natural Language | Oral
410 | Empowering Relational Network by Self-Attention Augmented Conditional Random Fields for Group Activity Recognition | Oral
445 | AiR: Attention with Reasoning Capability | Oral
500 | Self6D: Self-Supervised Monocular 6D Object Pose Estimation | Oral
529 | Invertible Image Rescaling | Oral
612 | Synthesize then Compare: Detecting Failures and Anomalies for Semantic Segmentation | Oral
677 | House-GAN: Relational Generative Adversarial Networks for Graph-constrained House Layout Generation | Oral
736 | Crowdsampling the Plenoptic Function | Oral
738 | End-to-End Estimation of Multi-Person 3D Poses from Multiple Cameras | Oral
832 | End-to-End Object Detection with Transformers | Oral
840 | DeepSFM: Structure From Motion Via Deep Bundle Adjustment | Oral
1044 | Ladybird: Deep Implicit Field Based 3D Reconstruction with Sampling and Symmetry | Oral
1059 | Segment as Points for Efficient Online Multi-Object Tracking and Segmentation | Oral
1105 | Conditional Convolutions for Instance Segmentation | Oral
1196 | MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution | Oral
1203 | Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset | Oral
1273 | Privacy Preserving Structure-from-Motion | Oral
1326 | Rewriting a Deep Generative Model | Oral
1417 | Compare and Reweight: Distinctive Image Captioning Using Similar Images Sets | Oral
1448 | Long-term Human Motion Prediction with Scene Context | Oral
1473 | NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis | Oral
1501 | ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes | Oral
1737 | MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images | Oral
1793 | Learning and aggregating deep local descriptors for instance-level recognition | Oral
1969 | A Consistently Fast and Globally Optimal Solution to the Perspective-n-Point Problem | Oral
2096 | Learn to Recover Visible Color for Video Surveillance in a Day | Oral
2149 | Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction from Single-view Images | Oral
2193 | Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation | Oral
2211 | BorderDet: Border Feature for Dense Object Detection | Oral
2258 | Regularization with Latent Space Virtual Adversarial Training | Oral
2263 | Du$^2$Net: Learning Depth Estimation from Dual-Cameras and Dual-Pixels | Oral
2307 | Model-Agnostic Boundary-Adversarial Sampling for Test-Time Generalization in Few-Shot learning | Oral
2463 | Targeted Attack for Deep Hashing based Retrieval | Oral
2471 | Gradient Centralization: A New Optimization Technique for Deep Neural Networks | Oral
2503 | Content-Aware Unsupervised Deep Homography Estimation | Oral
2556 | Multi-View Optimization of Local Feature Geometry | Oral
2597 | Efficient Model Fitting by Combining Lifted Optimization with Phong Surface Models | Oral
2641 | Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video | Oral
2683 | Learning Stereo from Single Images | Oral
2748 | Prototype Rectification for Few-Shot Learning | Oral
2784 | Learning Feature Descriptors using Camera Pose Supervision | Oral
2785 | Semantic Flow for Fast and Accurate Scene Parsing | Oral
2788 | Appearance Consensus Driven Self-Supervised Human Mesh Recovery | Oral
2825 | Diffraction Line Imaging | Oral
2834 | Aligning and Projecting Images to Class-conditional Generative Networks | Oral
2852 | Suppress and Balance: A Simple Gated Network for Salient Object Detection | Oral
2904 | Visual Memorability for Robotic Interestingness Prediction via Unsupervised Online Learning | Oral
2949 | Post-Training Piecewise Linear Quantization for Deep Neural Networks | Oral
2974 | Joint Disentangling and Adaptation for Cross-Domain Person Re-Identification | Oral
2978 | In-Home Daily-Life Captioning Using Radio Signals | Oral
3018 | Self-Challenging Improves Cross-Domain Generalization | Oral
3029 | A Competence-aware Curriculum for Visual Concepts Learning via Question Answering | Oral
3047 | Multi-task Learning Increases Adversarial Robustness | Oral
3054 | S2DNAS: Transforming Static CNN Model for Dynamic Inference via Neural Architecture Search | Oral
3112 | Improving Deep Video Compression by Resolution-adaptive Flow Coding | Oral
3158 | Motion Capture from Internet Videos | Oral
3183 | Appearance-Preserving 3D Convolution for Video-based Person Re-identification | Oral
3241 | Solving the Blind Perspective-n-Point Problem End-To-End With Robust Differentiable Geometric Optimization | Oral
3265 | Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation | Oral
3312 | Deep Spatial-angular Regularization for Compressive Light Field Reconstruction over Coded Apertures | Oral
3331 | Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling | Oral
3356 | Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction | Oral
3376 | Orientation-aware Vehicle Re-identification with Semantics-guided Part Attention Network | Oral
3387 | Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation | Oral
3439 | Coherent full scene 3D reconstruction from a single RGB image | Oral
3482 | Layer-wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs | Oral
3526 | RAFT: Recurrent All-Pairs Field Transforms for Optical Flow | Oral
3528 | Domain-invariant Stereo Matching Networks | Oral
3538 | DeepHandMesh: Weakly-supervised Deep Encoder-Decoder Framework for High-fidelity Hand Mesh Modeling from a Single RGB Image | Oral
3544 | Content Adaptive and Error Propagation Aware Deep Video Compression | Oral
3553 | Towards Streaming Image Understanding | Oral
3570 | Towards Automated Testing and Robustification by Semantic Adversarial Data Generation | Oral
3582 | Adversarial Generative Grammars for Human Activity Prediction | Oral
3587 | Greedy Sampler and Dumb Learner: A Surprisingly Effective Approach for Continual Learning | Oral
3622 | Learning Lane Graph Representations for Motion Forecasting | Oral
3651 | What Matters in Unsupervised Optical Flow | Oral
3678 | Synthesis and Completion of Facades from Satellite Imagery | Oral
3772 | Mapillary Planet-Scale Depth Dataset | Oral
3838 | V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction | Oral
3891 | Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters | Oral
3948 | EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning | Oral
3975 | Intrinsic Point Cloud Interpolation via Dual Latent Space Navigation | Oral
3976 | Cross-Domain Cascaded Deep Translation | Oral
4043 | "Look Ma, no landmarks!" - Unsupervised, model-based dense face alignment | Oral
4158 | Online Invariance Selection for Local Feature Descriptors | Oral
4179 | Rethinking image inpainting via a mutual encoder-decoder with feature equalization | Oral
4358 | TextCaps: a Dataset for Image Captioning with Reading Comprehension | Oral
4423 | It is not the Journey but the Destination: Endpoint Conditioned Trajectory Prediction | Oral
4440 | Learning What to Learn for Video Object Segmentation | Oral
4732 | SIZER: A Dataset and Model for Parsing 3D Clothing and Learning Size Sensitive 3D Clothing | Oral
4866 | LIMP: Learning Latent Shape Representations with Metric Preservation Priors | Oral
5277 | Unsupervised Sketch-to-Photo Synthesis | Oral
5360 | A simple way to make neural networks robust against diverse image corruptions | Oral
5457 | SoftpoolNet: Shape Descriptor for Point Cloud Completion and Classification | Oral
5800 | Hierarchical Face Aging through Disentangled Latent Characteristics | Oral
5859 | Hybrid Models for Open Set Recognition | Oral
5932 | TopoGAN: A Topology-Aware Generative Adversarial Network | Oral
6101 | Learning to Localize Actions from Moments | Oral
6147 | ForkGAN: Seeing into the Rainy Night | Oral
6209 | TCGM: An Information-Theoretic Framework for Semi-Supervised Multi-Modality Learning | Oral
6502 | ExchNet: A Unified Hashing Network for Large-Scale Fine-Grained Image Retrieval | Oral
22 | A Simple and Versatile Framework for Image-to-Image Translation | Spotlight
43 | ProxyBNN: Learning Binarized Neural Networks via Proxy Matrices | Spotlight
87 | Fair Attribute Classification through Latent Space De-biasing | Spotlight
148 | HMOR: Hierarchical Multi-person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation | Spotlight
193 | Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve | Spotlight
223 | A Unified Framework of Surrogate Loss by Refactorization and Interpolation | Spotlight
362 | Deep Reflectance Volumes: Relightable Reconstructions from Multi-View Photometric Images | Spotlight
366 | Memory-augmented Dense Predictive Coding for Video Representation Learning | Spotlight
378 | PointMixup: Augmentation for Point Clouds | Spotlight
415 | Identity-Guided Human Semantic Parsing Learning for Person Re-Identification | Spotlight
462 | Learning Gradient Fields for Shape Generation | Spotlight
467 | Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder | Spotlight
492 | Corner Proposal Network for Anchor-free, Two-stage Object Detection | Spotlight
495 | PhraseClick: Toward Achieving Flexible Interactive Segmentation by Phrase and Click | Spotlight
513 | Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing | Spotlight
526 | Learning Delicate Local Representations for Multi-Person Pose Estimation | Spotlight
544 | Learning to plan with uncertain topological maps | Spotlight
574 | Neural Design Network: Graphic Layout Generation with Constraints | Spotlight
591 | Learning Open Set Network with Discriminative Reciprocal Points | Spotlight
597 | Convolutional Occupancy Networks | Spotlight
672 | Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry | Spotlight
849 | A General Toolbox for Understanding Errors in Object Detection | Spotlight
893 | PointContrast: Unsupervised Pretraining for 3D Point Cloud Understanding | Spotlight
922 | DSA: More Efficient Budgeted Pruning via Differentiable Sparsity Allocation | Spotlight
990 | Circumventing Outliers of AutoAugment with Knowledge Distillation | Spotlight
997 | S2DNet: Learning accurate correspondences for sparse-to-dense feature matching | Spotlight
1054 | RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving | Spotlight
1062 | Video Object Segmentation with Graph Memory Network | Spotlight
1101 | Rethinking Bottleneck Structure for Efficient Mobile Network Design | Spotlight
1104 | Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks | Spotlight
1121 | Towards Part-aware Monocular 3D Human Pose Estimation: An Architecture Search Approach | Spotlight
1207 | A Tool for Measuring and Mitigating Bias in Visual Datasets | Spotlight
1327 | Contrastive Learning for Weakly Supervised Phrase Grounding | Spotlight
1362 | Collaborative Learning of Gesture Recognition and 3D Hand Pose Estimation with Multi-Order Feature Analysis | Spotlight
1425 | Studying the Transferability of Adversarial Attacks on Object Detectors | Spotlight
1449 | TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images | Spotlight
1479 | Semi-Siamese Training for Shallow Face Learning | Spotlight
1488 | GAN Slimming: All-in-One Unified GAN Compression | Spotlight
1526 | Human Interaction Learning on 3D Skeleton Point Clouds for Video Violence Recognition | Spotlight
1530 | Binarized Neural Network for Single Image Super Resolution | Spotlight
1564 | Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation | Spotlight
1605 | Adaptive Computationally Efficient Network for Monocular 3D Hand Pose Estimation | Spotlight
1624 | Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking | Spotlight
1631 | Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets | Spotlight
1676 | Hamiltonian Dynamics for Real-World Shape Interpolation | Spotlight
1694 | Learning to Scale Multilingual Representations for Vision-Language Tasks | Spotlight
1710 | Multi-modal Transformer for Video Retrieval | Spotlight
1761 | Matching Feature Matters: End-to-End Learning for Neural Texture Transfer | Spotlight
1802 | RobustFusion: Human Volumetric Capture with Data-driven Visual Cues using a RGBD Camera | Spotlight
1886 | Surface Normal Estimation of Tilted Images via Spatial Rectifier | Spotlight
1915 | Multimodal Shape Completion via Conditional Generative Adversarial Networks | Spotlight
1977 | Generative Sparse Detection Network for 3D Single-shot Object Detection | Spotlight
1987 | Grounded Situation Recognition | Spotlight
2019 | Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos | Spotlight
2157 | Unpaired Learning of Deep Blind Image Denoising | Spotlight
2191 | Self-supervising Fine-grained Region Similarities for Large-scale Image Localization | Spotlight
2215 | Rotationally-Temporally Consistent Novel-View Synthesis of Human Performance Video | Spotlight
2272 | Side-Aware Boundary Localization for More Precise Object Detection | Spotlight
2314 | SF-Net: Single-Frame Supervision for Temporal Action Localization | Spotlight
2317 | Negative Margin Matters: Understanding Margin in Few-shot Classification | Spotlight
2323 | Particularity beyond Commonality: Unpaired Identity Transfer with Multiple References | Spotlight
2342 | Tracking objects as points | Spotlight
2390 | CPGAN: Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis | Spotlight
2402 | Transporting Labels via Hierarchical Optimal Transport for Semi-Supervised Learning | Spotlight
2449 | MTI-Net: Multi-Scale Task Interaction Networks for Multi-Task Learning | Spotlight
2473 | Learning to Factorize a City | Spotlight
2495 | Region Graph Embedding Network for Zero-Shot Learning | Spotlight
2534 | GRAB: A Dataset of Whole-Body Human Grasping of Objects | Spotlight
2616 | DEMEA: Deep Mesh Autoencoders for Non-Rigidly Deforming Objects | Spotlight
2623 | RANSAC-Flow: generic two-stage image alignment | Spotlight
2632 | Semantic Object Prediction with Binaural Sounds | Spotlight
2636 | Neural Object Learning for 6D Pose Estimation Using a Few Cluttered Images | Spotlight
2666 | Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking | Spotlight
2707 | Pixel-Pair Occlusion Relationship Map (P2ORM): Formulation, Inference & Application | Spotlight
2710 | MovieNet: A Holistic Dataset for Movie Understanding | Spotlight
2723 | Short-Term and Long-Term Context Aggregation Network for Video Inpainting | Spotlight
2754 | Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DOF Relocalization | Spotlight
2755 | Face Super-Resolution Guided by 3D Facial Priors | Spotlight
2763 | Label Propagation with Augmented Anchors: A Simple Semi-Supervised Learning baseline for Unsupervised Domain Adaptation | Spotlight
2767 | Are Labels Necessary for Neural Architecture Search? | Spotlight
2776 | BLSM: A Bone-Level Skinned Model of the Human Mesh | Spotlight
2826 | Associative Alignment for Few-shot Image Classification | Spotlight
2873 | Cyclic Functional Mapping: Self-supervised correspondence between non-isometric deformable shapes | Spotlight
2905 | View-Invariant Probabilistic Embedding for Human Pose | Spotlight
2918 | Contact and Human Dynamics from Monocular Video | Spotlight
2950 | PointPWC-Net: Cost Volume on Point Clouds for (Self-)Supervised Scene Flow Estimation | Spotlight
2965 | Point2Surf: Learning Implicit Surfaces from Point Cloud Patches | Spotlight
2983 | Few-Shot Scene-Adaptive Anomaly Detection | Spotlight
2986 | Personalized Face Modeling for Improved Face Reconstruction and Motion Retargeting | Spotlight
2988 | Entropy Minimisation Framework for Event-based Vision Model Estimation | Spotlight
2992 | Reconstructing NBA Players | Spotlight
3087 | PIoU Loss: Towards Accurate Oriented Object Detection in Complex Environments | Spotlight
3089 | TENet: Triple Excitation Network for Video Salient Object Detection | Spotlight
3099 | Deep Feedback Inverse Problem Solver | Spotlight
3119 | Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification | Spotlight
3120 | Hallucinating Visual Instances in Total Absentia | Spotlight
3125 | Unsupervised 3D Shape Completion in the Wild | Spotlight
3335 | DTVNet: Dynamic Time-lapse Video Generation via Single Still Image | Spotlight
3365 | CLIFFNet for Monocular Depth Estimation with Hierarchical Embedding Loss | Spotlight
3385 | Collaborative Video Object Segmentation by Foreground-Background Integration | Spotlight
3456 | Adaptive Margin Diversity Regularizer for handling Data Imbalance in Zero-Shot SBIR | Spotlight
3477 | XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Pose and Gaze Variation | Spotlight
3499 | Calibration-free Structure-from-Motion with Calibrated Radial Trifocal Tensors | Spotlight
3594 | Occupancy anticipation for efficient navigation | Spotlight
3601 | Unified Image and Video Saliency Modeling | Spotlight
3604 | TAO: A Large-scale Benchmark for Tracking Any Object | Spotlight
3657 | A Generalization of Otsu's Method and Minimum Error Thresholding | Spotlight
3663 | A Cordial Sync: Moving Furniture by Moving Beyond Marginal Policies | Spotlight
3665 | Big Transfer (BiT): General Visual Representation Learning | Spotlight
3684 | Visual Commonsense Graphs: Reasoning about the Dynamic Context of a Still Image | Spotlight
3831 | Few-shot Action Recognition via Permutation-invariant Attention | Spotlight
3913 | Character Grounding and Re-Identification in Story of Videos and Text Descriptions | Spotlight
3977 | AABO: Adaptive Anchor Box Optimization for Object Detection via Bayesian Sub-sampling | Spotlight
3984 | Learning Visual Context by Comparison | Spotlight
3994 | Large scale holistic video understanding | Spotlight
3995 | Indirect Local Attacks for Context-aware Semantic Segmentation Networks | Spotlight
4294 | Inferring Visual Overlap of Images through Interpretable Non-Metric Embeddings | Spotlight
4296 | Connecting Vision and Language with Localized Narratives | Spotlight
4383 | Adversarial T-shirt! Evading Person Detectors in A Physical World | Spotlight
4404 | Bounding-box Channels for Visual Relationship Detection | Spotlight
4407 | Minimal Rolling Shutter Absolute Pose with Unknown Focal Length and Radial Distortion | Spotlight
4442 | SRFlow: Learning the Super-Resolution Space with Normalizing Flow | Spotlight
4452 | DeepGMR: Learning Latent Gaussian Mixture Models for Registration | Spotlight
4458 | Active 3D Perception using Light Curtains | Spotlight
4521 | Invertible Neural BRDF for Object Inverse Rendering | Spotlight
4545 | Semi-supervised Semantic Segmentation via Strong-weak Dual-branch Network | Spotlight
4571 | Practical Deep Raw Image Denoising on Mobile Devices | Spotlight
4577 | Audio-Visual Embodied Navigation | Spotlight
4602 | Two-Stream Consensus Networks for Weakly-Supervised Temporal Action Localization | Spotlight
4677 | Erasing Appearance Preservation in Image Smoothing | Spotlight
4727 | Counterfactual Vision-and-Language Navigation via Adversarial Path Sampler | Spotlight
4749 | Guided Deep Decoder: Unsupervised Image Pair Fusion | Spotlight
4809 | Filter Style Transfer between Photos | Spotlight
4860 | JGR-P2O: Joint Graph Reasoning based Pixel-to-Offset Prediction Network for 3D Hand Pose Estimation from a Single Depth Image | Spotlight
4867 | Dynamic Group Convolution for Accelerating Convolutional Neural Networks | Spotlight
4880 | RD-GAN: Few/Zero-Shot Chinese Character Style Transfer via Radical Decomposition and Rendering | Spotlight
5021 | Object-Contextual Representations for Semantic Segmentation | Spotlight
5116 | Spatio-Temporal Efficient Recurrent Neural Network for Video Deblurring | Spotlight
5393 | The Semantic Mutex Watershed for Efficient Bottom-Up Semantic Instance Segmentation | Spotlight
5471 | Photon-Efficient 3D Imaging with A Non-Local Neural Network | Spotlight
5554 | Generative Latent Textured Proxies for Category-Level Object Modeling | Spotlight
5672 | Improving Vision-and-Language Navigation with Image-Text Pairs from the Web | Spotlight
5685 | Directional Temporal Modeling for Action Recognition | Spotlight
5714 | Shonan Rotation Averaging: Global Optimality by Surfing $SO(p)^n$ | Spotlight
5723 | Semantic Curiosity for Visual Navigation | Spotlight
5821 | Multi-Temporal Recurrent Neural Networks For Progressive Non-Uniform Single Image Deblurring With Incremental Temporal Training | Spotlight
5975 | ProgressFace: Scale-Aware Progressive Learning for Face Detection | Spotlight
6025 | Learning Multi-layer Latent Variable Model with Short Run Inference Dynamics | Spotlight
6053 | CoTeRe-Net: Discovering Collaborative Ternary Relations in Videos | Spotlight
6100 | Modeling the Effects of Windshield Refraction for Camera Calibration | Spotlight
6124 | Skin Segmentation from NIR Images using Unsupervised Domain Adaptation through Generative Latent Search | Spotlight
6254 | PROFIT: A Novel Training Method for sub-4-bit MobileNet Models | Spotlight
6277 | Visual Relation Grounding in Videos | Spotlight
6296 | Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows | Spotlight
6314 | Controlling semantics and style in conditional image synthesis | Spotlight
6360 | Jointly learning visual motion and confidence from local patches in event cameras | Spotlight
6406 | SODA: Story Oriented Dense Video Captioning Evaluation Framework | Spotlight
6490 | Sketch-Guided Object Localization in Natural Images | Spotlight
6496 | Metric learning: cross-entropy vs. pairwise losses | Spotlight
6959 | Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models | Spotlight
7231 | The Hessian Penalty: A Weak Prior for Unsupervised Disentanglement | Spotlight
5 | STAR: Sparse Trained Articulated Human Body Regressor | Poster
13 | Optical Flow Distillation: Towards Efficient and Stable Video Style Transfer | Poster
15 | Collaboration by Competition: Self-coordinated Knowledge Amalgamation for Multi-talent Student Learning | Poster
25 | Do Not Disturb Me: Person Re-identification Under the Interference of Other Pedestrians | Poster
31 | Learning 3D Part Assembly from A Single Image | Poster
32 | PT2PC: Learning to Generate 3D Point Cloud Shapes from Part Tree Conditions | Poster
50 | Highly Efficient Salient Object Detection with 100K Parameters | Poster
69 | HardGAN: A Haze-Aware Representation Distillation GAN for Single Image Dehazing | Poster
88 | Lifespan Age Transformation Synthesis | Poster
90 | Domain2Vec: Domain Embedding for Unsupervised Domain Adaptation | Poster
106 | Synthesizing Content Consistent Vehicle Datasets with Attribute Descent | Poster
116 | Multiview Pedestrian Detection with Feature Perspective Transformation | Poster
121 | Learning Object Relation Graph and Tentative Policy for Visual Navigation | Poster
123 | Adversarial Self-Supervised Learning for Semi-Supervised 3D Action Recognition | Poster
132 | Across Scales & Across Dimensions: Temporal Super-Resolution using Deep Internal Learning | Poster
138 | Inducing Optimal Attributes Representations for Conditional GANs | Poster
152 | AR-Net: Adaptive Frame Resolution for Efficient Action Recognition | Poster
156 | Image-to-Voxel Model Translation for 3D Scene Reconstruction and Segmentation | Poster
157 | Consistency Guided Scene Flow Estimation | Poster
160 | Autoregressive Unsupervised Image Segmentation | Poster
169 | Controllable Image Synthesis via SegVAE | Poster
173 | Off-Policy Reinforcement Learning for Efficient and Effective GAN Architecture Search | Poster
177 | Efficient Non-Line-of-Sight Imaging by Circular and Confocal Scanning | Poster
181 | Texture Hallucination for Large-Factor Painting Super-Resolution | Poster
183 | Learning Progressive Joint Propagation for Human Motion Prediction | Poster
184 | Rolling Shutter Image Stitching and Rectification via Differential Homography | Poster
186 | ParSeNet: A Parametric Surface Fitting Network for 3D Point Clouds | Poster
188 | The Group Loss for Deep Metric Learning | Poster
203 | Learning Object Depth from Camera Motion and Video Object Segmentation | Poster
206 | OnlineAugment: Online Data Augmentation with Less Domain Knowledge | Poster
209 | Learning Inter-Plane Relations for Piecewise Planar Reconstruction | Poster
230 | Intra-class Compactness Distillation for Semantic Segmentation | Poster
233 | Temporal Distinct Representation Learning for 2D-CNN-based Action Recognition | Poster
241 | Representative Graph Neural Network | Poster
264 | Deformation-Aware 3D Shape Embedding and Retrieval | Poster
277 | Atlas: End-to-End 3D Scene Reconstruction from Posed Images | Poster
278 | Multiple Class Novelty Detection Under the Data Distribution Shift | Poster
281 | Colorization of Depth Map via Disentanglement | Poster
287 | Beyond Controlled Environments: 3D Camera Re-Localization in Changing Indoor Scenes | Poster
292 | GeoGraph: Learning graph-based multi-view object detection with geometric cues end-to-end | Poster
300 | Localizing the Common Action Among a Few Videos | Poster
306 | TAFSSL: Task-Adaptive Feature Sub-Space Learning for few-shot classification | Poster
312 | Traffic Accident Analysis by Cause and Effect Events Localization | Poster
318 | Face Anti-Spoofing with Human Material Perception | Poster
328 | How Can I See My Future? FvTraj: Using First-person View for Pedestrian Trajectory Prediction | Poster
338 | Multiple Expert Brainstorming for Domain Adaptive Person Re-identification | Poster
344 | NASA: Neural Articulated Shape Approximation | Poster
350 | Towards Unique and Informative Captioning of Images | Poster
352 | When Does Self-supervision Improve Few-shot Learning? | Poster
355 | Two-branch Recurrent Network for Isolating Deepfakes in Videos | Poster
360 | Incremental Few-Shot Meta-Learning via Indirect Feature Alignment | Poster
363 | BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models | Poster
386 | Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation | Poster
392 | Global Distance-distributions Separation for Unsupervised Person Re-identification | Poster
397 | I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human Pose and Mesh Estimation from a Single RGB Image | Poster
398 | Pose2Mesh: Graph Convolutional Network for 3D human Pose and Mesh Recovery from 2D Human Pose | Poster
402 | ALRe: Outlier Detection for Guided Refinement | Poster
414 | Weakly-Supervised Crowd Counting Learns from Sorting rather than Locations | Poster
429 | Unsupervised Domain Attention Adaptation Network for Caricature Attribute Recognition | Poster
438 | Many-shot from Low-shot: Learning to Annotate using Mixed Supervision for Object Detection | Poster
441 | Curriculum DeepSDF | Poster
444 | Meshing Point Clouds with Predicted Intrinsic-Extrinsic Ratio Guidance | Poster
457 | Improved Adversarial Training via Learned Optimizer | Poster
471 | Component Divide-and-Conquer for Real-World Image Super-Resolution | Poster
479 | Enabling Deep Residual Networks for Weakly Supervised Object Detection | Poster
494 | Deep near-light photometric stereo for spatially varying reflectances | Poster
498 | Learning Visual Representations with Caption Annotations | Poster
509 | Solving Long-tailed Recognition with Deep Realistic Taxonomic Classifier | Poster
512 | Regression of Instance Boundary by Aggregated CNN and GCN | Poster
520 | Social Adaptive Module for Weakly-supervised Group Activity Recognition | Poster
521 | RGB-D Salient Object Detection with Cross-Modality Modulation and Selection | Poster
524 | RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval | Poster
536 | Cheaper Pre-training Lunch: An Efficient Paradigm for Object Detection | Poster
566 | Faster Person Re-Identification | Poster
570 | Quantization Guided JPEG Artifact Correction | Poster
571 | 3PointTM: Faster Measurement of High-Dimensional Transmission Matrices | Poster
575 | Joint Bilateral Learning for Real-time Universal Photorealistic Style Transfer | Poster
581 | Beyond 3DMM Space: Towards Fine-grained 3D Face Reconstruction | Poster
587 | World-Consistent Video-to-Video Synthesis | Poster
596 | Commonality-Parsing Network across Shape and Appearance for Partially Supervised Instance Segmentation | Poster
598 | GMNet: Graph Matching Network for Large Scale Part Semantic Segmentation in the Wild | Poster
600 | Event-based Asynchronous Sparse Convolutional Networks | Poster
604 | AtlantaNet: Inferring the 3D Indoor Layout from a Single 360 Image Beyond the Manhattan World Assumption | Poster
607 | Spatiotemporal Attention Cell Search for Video Classification | Poster
609 | REMIND Your Neural Network to Prevent Catastrophic Forgetting | Poster
611 | Image Classification in the dark using Quanta Image Sensors | Poster
615 | $n$-Reference Transfer Learning for Saliency Prediction | Poster
618 | Progressively Guided Alternate Refinement Network for RGB-D Salient Object Detection | Poster
622 | Bottom-Up Temporal Action Localization with Mutual Regularization | Poster
623 | On Learning to Modulate the Gradient for Fast Adaptation of Neural Networks | Poster
634 | Domain-Specific Mappings for Generative Adversarial Style Transfer | Poster
636 | DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning | Poster
637 | DHP: Differentiable Meta Pruning via HyperNetworks | Poster
639 | Deep Transferring Quantization | Poster
645 | Deep Credible Metric Learning for Unsupervised Domain Adaptation Person Re-identification | Poster
648 | Temporal Coherence or Temporal Motion: Which is More Critical for Video-based Person Re-identification? | Poster
666 | Arbitrary-Oriented Object Detection with Circular Smooth Label | Poster
671 | Learning Event-Driven Video Deblurring and Interpolation | Poster
678 | Vectorizing world buildings: planar graph reconstruction by primitive detection and relationship inference | Poster
692 | Learning to Combine: Knowledge Aggregation for Multi-Source Domain Adaptation | Poster
696 | CSCL: Critical Semantic-Consistent Learning for Unsupervised Domain Adaptation | Poster
700 | Prototype Mixture Models for Few-shot Semantic Segmentation | Poster
701 | Webly Supervised Image Classification with Self-Contained Confidence | Poster
704 | Search what you want: Barrier Panelty NAS for mixed precision quantization | Poster
709 | Monocular 3D Object Detection via Feature Domain Adaptation | Poster
718 | Talking-head Generation with Rhythmic Head Motion | Poster
719 | AUTO3D: Novel view synthesis through unsupervised-learned variational viewpoints and global 3D representations | Poster
720 | VPN: Learning Video-Pose Embedding for Activities of Daily Living | Poster
721 | Soft Anchor-Point Object Detection | Poster
735 | Deformable Grid | Poster
751 | Soft Expert Reward Learning for Vision-and-Language Navigation | Poster
754 | Part-aware Prototype Network for Few-shot Semantic Segmentation | Poster
759 | Learning from Extrinsic and Intrinsic Supervisions for Domain Generalization | Poster
761 | Joint Learning of Social Groups, Individuals Action and Sub-group Activities in Videos | Poster
768 | Whole-Body Human Pose Estimation in the Wild | Poster
770 | Relative Pose Estimation of Calibrated Cameras with Known $\mathrm{SE}(3)$ Invariants | Poster
777 | A Novel Compressed Sensing Approach on Convolutions and Runge-Kutta Methods | Poster
779 | Deep Hough Transform for Semantic Line Detection | Poster
781 | Cross-domain Structured Landmark Detection via Progressive Topology-Adapting Deep Graph Learning | Poster
787 | 3D Human Shape and Pose from a Single Low-Resolution Image | Poster
790 | Learning to Balance Specificity and Invariance for In and Out of Domain Generalization | Poster
792 | Contrastive Learning for Conditional Image Generation | Poster
794 | DLow: Diversifying Latent Flows for Diverse Human Motion Prediction | Poster
798 | GRNet: Gridding Residual Network for Dense Point Cloud Completion | Poster
800 | Learning Discriminative and Compact Representations for Gait Recognition | Poster
806 | Blind Face Restoration via Deep Multi-scale Component Dictionaries | Poster
866 | Robust Neural Networks inspired by Strong Stability Preserving Runge-Kutta methods | Poster
867 | Inequality-Constrained and Robust 3D Face Model Fitting | Poster
869 | Gabor Layers Enhance Network Robustness | Poster
871 | Conditional Image Repainting via Semantic Bridge and Piecewise Value Function | Poster
872 | Learnable Cost Volume using the Cayley Representation | Poster
884 | Learning to Adapt: Towards Resource-Efficient On-Device Adaptation Beyond Gradient Descent | Poster
890 | Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling | Poster
894 | BroadFace: Looking at Tens of Thousands of People at Once for Face Recognition | Poster
895 | Interpretable Visual Reasoning via Probabilistic Formulation under Natural Supervision | Poster
896 | Domain Adaptive Semantic Segmentation Using Weak Labels | Poster
898 | Knowledge Distillation Meets Self-Supervision | Poster
909 | Efficient Neighbourhood Consensus Networks via Submanifold Sparse Convolutions | Poster
910 | Reconstructing the Noise Manifold for Image Denoising | Poster
916 | Occlusion-Aware Depth Estimation with Adaptive Normal Constraints | Poster
927 | VisualEchoes: Spatial Image Representation Learning through Echolocation | Poster
929 | Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval | Poster
942 | Leveraging Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation | Poster
946 | Spatially Aware Multimodal Transformers for TextVQA | Poster
948 | Every Pixel Matters: Center-aware Feature Alignment for Domain Adaptive Object Detector | Poster
960 | URIE: Universal Image Enhancement for Visual Recognition in the Wild | Poster
961 | Pyramid Multi-view Stereo Net with Self-adaptive View Aggregation | Poster
977 | SPL-MLL: Selecting Predictable Landmarks for Multi-Label Learning | Poster
978 | Unpaired Image-to-Image Translation using Adversarial Consistency Loss | Poster
981 | Discriminability Distillation in Group Representation Learning | Poster
983 | Monocular Expressive Body Regression through Body-Driven Attention | Poster
984 | Dual Adversarial Network: Toward Real Noise Removal and Noise Generation | Poster
986 | Linguistic Structure Guided Context Modeling for Referring Image Segmentation | Poster
988 | Meta-Learning across Meta-Tasks for Few-Shot Learning | Poster
994 | Federated Visual Classification with Real-World Data Distribution | Poster
996 | Robust Re-Identification by Multiple Views Knowledge Distillation | Poster
1003 | Defocus Deblurring Using Dual-Pixel Data | Poster
1008 | RhyRNN: Rhythmic RNN for Recognizing Events in Long and Complex Videos | Poster
1012 | Take an Emotion Walk: Perceiving Emotions from Gaits Using Hierarchical Attention Pooling and Affective Mapping | Poster
1022 | Weighting Counts: Sequential Crowd Counting by Reinforcement Learning | Poster
1024 | Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks | Poster
1035 | Learning to Learn with Variational Information Bottleneck for Domain Generalization | Poster
1045 | Deep Positional and Relational Feature Learning for Rotation-Invariant Point Cloud Analysis | Poster
1046 | Thanks for Nothing: Predicting Zero-Valued Activations with Lightweight Convolutional Neural Networks | Poster
1051 | Layered Neighborhood Expansion for Incremental Multiple Graph Matching | Poster
1057 | Learning To Classify Images Without Labels | Poster
1060 | Graph convolutional networks for learning with few clean and many noisy labels | Poster
1078 | Object-and-Action Aware Model for Visual Language Navigation | Poster
1079 | A Comprehensive Study of Weight Sharing in Graph Networks for 3D Human Pose Estimation | Poster
1086 | MuCAN: Multi-Correspondence Aggregation Network for Video Super-Resolution | Poster
1094 | Efficient Semantic Video Segmentation with Per-frame Inference | Poster
1097Increasing the Robustness of Semantic Segmentation Models with Painting-by-NumbersPoster
1103Deep Spiking Neural Network: Energy Efficiency Through Time based CodingPoster
1137InfoFocus: 3D Object Detection for Autonomous Driving with Dynamic Information ModelingPoster
1139Utilizing Patch-level Category Activation Patterns for Multiple Class Novelty DetectionPoster
1143People as Scene ProbesPoster
1147Mapping in a Cycle: Sinkhorn Regularized Unsupervised Learning for Point Cloud ShapesPoster
1148Label-Efficient Learning on Point Clouds using Approximate Convex DecompositionsPoster
1152TexMesh: Reconstructing Human Texture and Geometry from Monocular VideoPoster
1153Consistency-based Semi-supervised Active Learning: Towards Minimizing Labeling CostPoster
1162Point-Set Anchors for Object Detection, Instance Segmentation and Pose EstimationPoster
1163Modeling 3D shapes by Reinforcement LearningPoster
1164LST-Net: Learning a Convolutional Neural Network with a Learnable Sparse TransformPoster
1165Learning What Makes a Difference from Counterfactual Examples and Gradient SupervisionPoster
1171CN: Channel Normalization in Point CloudPoster
1182Rethinking the Defocus Blur Detection Problem and A Real-Time Deep DBD ModelPoster
1184AutoMix: Mixup Networks for Sample Interpolation via Cooperative Barycenter LearningPoster
1186Scene Text Image Super-Resolution in the WildPoster
1220Coupling Explicit and Implicit Surface Representations for Generative 3D ModelingPoster
1227Learning Disentangled Representations with Latent Variation PredictabilityPoster
1232Deep Space-Time Video Upsampling NetworksPoster
1242Large-Scale Few-Shot Learning via Multi-Modal Knowledge DiscoveryPoster
1248Fast Video Object Segmentation using Global Context ModulePoster
1263Uncertainty-aware Weakly Supervised Action Detection from Long VideosPoster
1267Selecting Relevant Features from a Universal Representation for Few-shot LearningPoster
1276MessyTable: Instance Association in Multiple Camera ViewsPoster
1277A Unified Framework for Shot Type Classification Based on Subject Centric LensPoster
1279BSL-1K: Scaling up co-articulated sign recognition using mouthing cuesPoster
1280Parametric Hand Texture Model for 3D Hand Reconstruction and PersonalizationPoster
1290CycAs: Self-supervised Cycle Association for Learning Re-identifiable Person DescriptionsPoster
1291Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary InstructionsPoster
1292Towards Real-time MOT: A Joint Solution for Detection and Appearance EmbeddingPoster
1294A Balanced and Uncertainty-aware Approach for Partial Domain AdaptationPoster
1295Unsupervised Deep Metric Learning with Transformed Attention Consistency and Contrastive Clustering LossPoster
1299STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in VideosPoster
1302Hierarchical Style-based Networks for Motion SynthesisPoster
1303Who left the dogs out? 3D Animal Reconstruction with Expectation Maximization in the LoopPoster
1308Learning to Count in the Crowd from Limited Labeled DataPoster
1314SPOD: Selective Point Cloud Densification for Better Localization in Point Cloud Object DetectionPoster
1319Explainable Face RecognitionPoster
1321From Shadow Segmentation to Shadow RemovalPoster
1322Diverse and Admissible Trajectory Prediction through Multimodal Context UnderstandingPoster
1332CONFIG: Controllable Neural Face Image GenerationPoster
1337Scene Scale Estimation from Single Image in the WildPoster
1340Procedure Planning in Instructional VideosPoster
1342Funnel Activation for Visual RecognitionPoster
1354GIQA: Generated Image Quality AssessmentPoster
1355Adversarial Continual LearningPoster
1358Adapting Object Detectors with Conditional Domain NormalizationPoster
1360HARD-Net: Hardness-AwaRe Discrimination Network for 3D Early Activity PredictionPoster
1363Pseudo RGB-D for Self-Improving Monocular SLAM and Depth PredictionPoster
1369Interpretable and Generalizable Person Re-identification with Query-adaptive Convolution and Temporal LiftingPoster
1372Unsupervised Bayesian Deep Learning for Image Reconstruction in Compressive SensingPoster
1380Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose RefinementPoster
1381Semi-supervised Learning with a Teacher-student Network for Generalized Attribute PredictionPoster
1391Unsupervised Domain Adaptation with Noise Resistible Mutual-Training for Person Re-identificationPoster
1395DPDist: Comparing Point Clouds Using Deep Point Cloud DistancePoster
1399Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic SegmentationPoster
1408FaceMix: Privacy-Preserving Facial Attribute Classification on the CloudPoster
1415Neural Re-Rendering of Humans from a Single ImagePoster
1420Reversing the cycle: self-supervised deep stereo through enhanced monocular distillationPoster
1421PIPAL: a Large-Scale Image Quality Assessment Dataset for Perceptual Image RestorationPoster
1422Why do These Match? Explaining the Behavior of Image Similarity ModelsPoster
1426CooGAN: A Memory-Efficient Framework for High-Resolution Facial Attribute EditingPoster
1430Progressive Transformers for End-to-End Sign Language ProductionPoster
1436Mask TextSpotter V3: Segmentation Proposal Network for Robust Scene Text SpottingPoster
1440Making Affine Correspondences Work in Camera Geometry ComputationPoster
1445Sub-center ArcFace: Boosting Face Recognition by Large-scale Noisy Web FacesPoster
1450Foley Music: Learning to Generate Music from VideosPoster
1453Contrastive Multiview CodingPoster
1456Regional Homogeneity: Towards Learning Transferable Universal Adversarial Perturbations Against DefensesPoster
1469Generative Low-bitwidth Data Free QuantizationPoster
1470Local Correlation Consistency for Knowledge DistillationPoster
1474Perceiving 3D Human-Object Spatial Arrangements from a Single Image in the WildPoster
1483Sep-Stereo: Visual-Guided Stereophonic Audio Generation by Associating Source SeparationPoster
1485CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich AnnotationsPoster
1486Thinking in Frequency: Face Forgery Detection by Mining Frequency-aware CluesPoster
1489Weakly-Supervised Cell Tracking via Backward-and-Forward PropagationPoster
1491SeqHAND: RGB-Sequence-Based 3D Hand Pose and Shape EstimationPoster
1493Rethinking the Distribution Gap of Person Re-identification with Camera-based Batch NormalizationPoster
1509AMLN: Adversarial-based Mutual Learning Network for Online Knowledge DistillationPoster
1514Online Multi-modal Person Search in VideosPoster
1520Single Image Super-Resolution via a Holistic Attention NetworkPoster
1535Can You Read Me Now? Content Aware Rectification using Angle SupervisionPoster
1538Momentum Batch Normalization for Deep Learning with Small Batch SizePoster
1541AdvPC: Transferable Adversarial Perturbations on 3D Point CloudsPoster
1543Edge-aware Graph Representation Learning and Reasoning for Face ParsingPoster
1547BBS-Net: RGB-D Salient Object Detection with a Bifurcated Backbone Strategy NetworkPoster
1557G-LBM: Generative Low-dimensional Background Model Estimation from Video SequencesPoster
1561H3DNet: 3D Object Detection Using Hybrid Geometric PrimitivesPoster
1567Expressive Telepresence via Modular Codec AvatarPoster
1571Cascade Graph Neural Networks for RGB-D Salient Object DetectionPoster
1585FairALM: Augmented Lagrangian Method for Training Fair Models with Little RegretPoster
1586Generating Videos of Zero-Shot Compositions of Actions and ObjectsPoster
1593ViTAA: Visual-Textual Attributes Alignment in Person Search by Natural LanguagePoster
1600Renovating Parsing R-CNN for Accurate Multiple Human ParsingPoster
1612Multi-Task Curriculum Framework for Open-Set Semi-Supervised LearningPoster
1615Gradient-Induced Co-Saliency DetectionPoster
1616Nighttime Defogging Using High-Low Frequency Decomposition and Grayscale-Color NetworksPoster
1633SegFix: Model-Agnostic Boundary Refinement for SegmentationPoster
1636Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory PredictionPoster
1637Fast Bi-layer Neural Synthesis of One-Shot Realistic Head AvatarsPoster
1644Neural Geometric Parser for Single Image Camera CalibrationPoster
1647Learning Flow-based Feature Warping for Face Frontalization with Illumination Inconsistent SupervisionPoster
1652Learning Architectures for Binary NetworksPoster
1653Semantic View SynthesisPoster
1659An Analysis of Sketched IRLS for Accelerated Sparse Residual RegressionPoster
1677Relative pose from deep learned depth and affine correspondencesPoster
1698Video Super-Resolution with Recurrent Structure-Detail NetworkPoster
1702Shape Adaptor: A Learnable Resizing ModulePoster
1712Shuffle and Attend: Video Domain AdaptationPoster
1714DRG: Dual Relation Graph for Human-Object Interaction DetectionPoster
1715Flow-edge Guided Video CompletionPoster
1721Deep End-to-End Trainable Active Contours for Building Footprint DelineationPoster
1728Towards End-to-end Video-based Eye-TrackingPoster
1732Generating Handwriting via Decoupled Style DescriptorsPoster
1742LEED: Label-Free Expression Editing via DisentanglementPoster
1763Fashion Captioning: Towards Generating Accurate Descriptions with Semantic RewardsPoster
1765Reducing Language Biases in Visual Question Answering with Visually-Grounded Question EncoderPoster
1766Unsupervised Cross-Modal Alignment For Multi-Person 3D Pose EstimationPoster
1769Class-Incremental Domain AdaptationPoster
1789Anti-Bandit Neural Architecture Search for Model DefensePoster
1792Wavelet-Based Dual-Branch Neural Network for Image DemoireingPoster
1809Low light video Enhancement using Synthetic Data Produced with an Intermediate Domain MappingPoster
1810Non-Local Spatial Propagation Network for Depth CompletionPoster
1816DanbooRegion: Illustration and Cartoon Region Dataset Annotated by Real-life ArtistsPoster
1819Event Enhanced High-Quality Image RecoveryPoster
1821PackDet: Packed Long-Head Object DetectorPoster
1825A Generic Graph-based Neural Architecture Encoding Scheme for Predictor-based NASPoster
1829Learning Semantic Neural Tree for Human ParsingPoster
1834Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph GenerationPoster
1848Burst Denoising via Temporally Shifted Wavelet TransformsPoster
1849JSSR: Joint Synthesis Segmentation and Registration System for 3D Multi-Model Image AnalysisPoster
1850SimAug: Learning Robust Representations from 3D Simulation for Pedestrian Trajectory Prediction in Unseen CamerasPoster
1851ScribbleBox: Interactive Annotation Framework for Video Object SegmentationPoster
1862Rethinking Pseudo-LiDAR RepresentationPoster
1868Deep Multi Depth Panoramas for View SynthesisPoster
1880MINI-Net: Multiple Instance Ranking Network for Video Highlight DetectionPoster
1889ContactPose: A Dataset of Grasps with Object Contact and Hand PosePoster
1895API-Net: Robust Generative Classifier via a Single DiscriminatorPoster
1905Bias-based Universal Adversarial Patch Attack for Automatic Check-outPoster
1912Imbalanced Continual Learning with Partitioning Reservoir SamplingPoster
1932Guided Collaborative Training for Pixel-wise Semi-Supervised LearningPoster
1938Stacking Networks Dynamically for Image Restoration Based on the Plug-and-Play FrameworkPoster
1942Efficient Transfer Learning via Joint Adaptation of Network Architecture and WeightPoster
1951Spatial Attention Pyramid Network for Unsupervised Domain AdaptationPoster
1955GSIR: Generalizable 3D Shape Interpretation and ReconstructionPoster
1956Weakly Supervised 3D Object Detection from Lidar Point CloudPoster
1960Two-phase Pseudo Label Densification for Self-training based Domain AdaptationPoster
1972Adaptive Offline Quintuplet Loss for Image-text MatchingPoster
1973Learning Object Placement by Inpainting for Compositional Data AugmentationPoster
1978Deep Vectorization of Technical DrawingsPoster
1979Shape Fitting with Deformable CAD ModelsPoster
1991An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile DevicesPoster
2006AutoTrajectory: Label-free Trajectory Extraction and Prediction from Videos using Dynamic PointsPoster
2013Multi-Agent Embodied Question Answering in Interactive Environments via 3D ReconstructionPoster
2014Conditional Sequential Modulation for Efficient Image RetouchingPoster
2016Segmenting Transparent Objects in the WildPoster
2035Length Controllable Image CaptioningPoster
2042Few-Shot Semantic Segmentation with Democratic Attention NetworksPoster
2044Defocus Blur Detection via Depth DistillationPoster
2054Motion Guided 3D Pose Estimation from VideoPoster
2055Reflection Separation via Multi-bounce Polarization State TracingPoster
2057SIP: Spatial Information Preservation for Fast Instance SegmentationPoster
2059SemanticAdv: Generating Adversarial Examples via Attribute-conditioned Image EditingPoster
2062Learning with Noisy Class Labels for Instance SegmentationPoster
2085Deep Image Clustering with Category-Style RepresentationPoster
2090Self-supervised Learning of Motion Representation via Scattering Local Motion CuesPoster
2094Improving Monocular Depth Estimation by Leveraging Structural Awareness and Complementary DatasetsPoster
2095BMBC: Bilateral Motion Estimation with Bilateral Cost Volume for Video InterpolationPoster
2100Hard negative examples are hard, but usefulPoster
2106ReActNet: Towards Precise Binary Neural Network with Generalized Activation FunctionsPoster
2107Video Object Detection via Object-level Temporal AggregationPoster
2113Object Detection with a Unified Label Space from Multiple DatasetsPoster
2114Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3DPoster
2115Comprehensive Image Captioning via Scene Graph DecompositionPoster
2116Symbiotic Adversarial Learning for Attribute-Based Person SearchPoster
2117Amplifying Key Cues for Human-Object-Interaction DetectionPoster
2118Rethinking few-shot image classification: a good embedding is all you need?Poster
2121Adversarial Background-Aware Loss for Weakly-supervised Temporal Activity LocalizationPoster
2129Action Localization through Continual Predictive LearningPoster
2130Generative View-Correlation Adaptation for Semi-Supervised Multi-View LearningPoster
2135ReAD: Reciprocal Attention Discriminator for Image-to-Video Re-IdentificationPoster
2136Detailed Human Shape and Pose Estimation from a Single Polarization ImagePoster
2142The Devil is in the Details: Self-Supervised Attention for Vehicle Re-IdentificationPoster
2152Improving One-stage Visual Grounding by Recursive Sub-query ConstructionPoster
2160Multi-level Wavelet-based Generative Adversarial Network for Perceptual Quality Enhancement of Compressed VideoPoster
2168Example-Guided Image Synthesis across Arbitrary Scenes using Masked Spatial-Channel Attention and Self-SupervisionPoster
2178Content-Consistent Matching for Domain Adaptive Semantic SegmentationPoster
2183AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text SpottingPoster
2186History Repeats Itself: Human Motion Prediction via Motion AttentionPoster
2189Unsupervised Video Object Segmentation with Joint Hotspot TrackingPoster
2201SRNet: Improving Generalization in 3D Human Pose Estimation with a Split-and-Recombine ApproachPoster
2202CAFE-GAN: Arbitrary Face Attribute Editing with Complementary Attention FeaturePoster
2209MimicDet: Bridging the Gap Between One-Stage and Two-Stage Object DetectionPoster
2212Topic-aware Multi-Label ClassificationPoster
2216Finding It at Another Side: A Viewpoint-Adapted Matching Encoder for Change CaptioningPoster
2235Attract, Perturb, and Explore: Learning a Feature Alignment Network for Semi-supervised Domain AdaptationPoster
2238Curriculum Manager for Source Selection in Multi-Source Domain AdaptationPoster
2244Powering One-shot Topological NAS with Stabilized Share-parameter ProxyPoster
2246Classes Matter: A Fine-grained Adversarial Approach to Cross-domain Semantic SegmentationPoster
2252Boundary-preserving Mask R-CNNPoster
2253Self-supervised Single-view 3D Reconstruction via Semantic ConsistencyPoster
2255MetaDistiller: Network Self-boosting via Meta-learned Top-down DistillationPoster
2256Learning Monocular Visual Odometry via Self-Supervised Long-Term ModelingPoster
2257The Devil is in Classification: A Simple Framework for Long-tail Instance SegmentationPoster
2266What is Learned in Deep Uncalibrated Photometric Stereo?Poster
2270Prior-based Domain Adaptive Object Detection for Hazy and Rainy ConditionsPoster
2274Adversarial Ranking Attack and DefensePoster
2279ReDro: Efficiently Learning Large-sized SPD Visual RepresentationPoster
2287Graph-Based Social Relation ReasoningPoster
2290EPNet: Enhancing Point Features with Image Semantics for 3D Object DetectionPoster
2293Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry ConsistencyPoster
2295Asynchronous Interaction Aggregation for Action DetectionPoster
2305Shape and Viewpoint without KeypointsPoster
2306Learning Attentive and Hierarchical Representations for 3D Shape RecognitionPoster
2308TF-NAS: Rethinking Three Search Freedoms of Latency-Constrained Differentiable Neural Architecture SearchPoster
2313Associative3D: Volumetric Reconstruction from Sparse ViewsPoster
2318PlugNet: Degradation Aware Scene Text Recognition Supervised by a Pluggable Super-Resolution UnitPoster
2319Memory Selection Network for Video PropagationPoster
2325Disentangled Non-local Neural NetworksPoster
2327URVOS: Unified Referring Video Object Segmentation Network with a Large-Scale BenchmarkPoster
2329Generalizing Person Re-Identification by Camera-Aware Invariance Learning and Cross-Domain MixupPoster
2330Semi-supervised Crowd Counting via Self-training on Surrogate TasksPoster
2335Dynamic R-CNN: Towards High Quality Object Detection via Dynamic TrainingPoster
2336Boosting Decision-based Black-box Adversarial Attacks with Random Sign FlipPoster
2338Knowledge Transfer via Dense Cross-layer Mutual-distillationPoster
2339Matching Guided DistillationPoster
2341Clustering-driven Deep Autoencoder for Video Anomaly DetectionPoster
2343Learning to Compose Hypercolumns for Visual CorrespondencePoster
2348Stochastic Bundle Adjustment for Efficient and Scalable Structure from MotionPoster
2353Object-based Illumination Estimation with Rendering-aware Neural NetworksPoster
2354Progressive Point Cloud Deconvolution Generation NetworkPoster
2356SSCGAN: Facial Attribute Editing via Style Skip ConnectionsPoster
2374Negative Pseudo Labeling using Class Proportion for Semantic Segmentation in PathologyPoster
2376Learn to Propagate Reliably on Noisy Affinity GraphsPoster
2382Fair DARTS: Eliminating Unfair Advantages in Differentiable Architecture SearchPoster
2383TANet: Towards Fully Automatic Tooth ArrangementPoster
2391UnionDet: Union-Level Detector Towards Real-Time Human-Object Interaction DetectionPoster
2393GSNet: Joint Vehicle Pose and Shape Reconstruction with Geometrical and Scene-aware SupervisionPoster
2394Resolution Switchable Networks for Runtime Efficient Image ClassificationPoster
2395SMAP: Single-Shot Multi-Person Absolute 3D Pose EstimationPoster
2396Learning to Detect Open Classes for Universal Domain AdaptationPoster
2400Visual Compositional Learning for Human Object Interaction DetectionPoster
2422Deep Plastic Surgery: Robust and Controllable Image Editing with Human-Drawn SketchesPoster
2423Rethinking Class Activation Mapping for Weakly Supervised Object LocalizationPoster
2424OS2D: One-Stage One-Shot Object Detection by Matching Anchor FeaturesPoster
2426Interpretable Neural Networks DecouplingPoster
2433Omni-sourced Webly-supervised Video RecognitionPoster
2437CurveLane-NAS: Unifying Lane-Sensitive Architecture Search and Adaptive Point BlendingPoster
2442Contextual-Relation Consistent Domain Adaptation for Semantic SegmentationPoster
2455Estimating People Flows to Better Count Them in Crowded ScenesPoster
2456RAN: Resolution Adaption Network for Low-resolution Face RecognitionPoster
2460Learning Feature Embeddings for Discriminant Model based TrackingPoster
2461WeightNet: Revisiting the Design Space of Weight NetworksPoster
2472Partially-Shared Variational Auto-encoders for Unsupervised Domain Adaptation with Target ShiftPoster
2475Learning Where to Focus for Efficient Video Object DetectionPoster
2481Learning Object Permanence from VideoPoster
2492Adaptive Text Recognition through Visual MatchingPoster
2497Actions as Moving PointsPoster
2499Learning to Exploit Multiple Vision Modalities by Using Grafted NetworksPoster
2501Geometric Correspondence Fields: Learned Differentiable Rendering for 3D Pose Refinement in the WildPoster
25053D Fluid Flow Reconstruction Using Compact Light Field PIVPoster
2510Contextual Diversity for Active LearningPoster


