CVPR 2024 Paper List

E-CIR: Event-Enhanced Continuous Intensity Recovery

keywords: Event-Enhanced Deblurring, Video Representation

paper | code

Image Editing/Inpainting

High-Fidelity GAN Inversion for Image Attribute Editing

paper | code

Style Transformer for Image Inversion and Editing

paper | code

MISF: Multi-level Interactive Siamese Filtering for High-Fidelity Image Inpainting

paper | code

HairCLIP: Design Your Hair by Text and Reference Image

keywords: Language-Image Pre-Training (CLIP), Generative Adversarial Networks

paper

Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding

keywords: Image Inpainting, Transformer, Image Generation

paper | code

Image Translation

Globetrotter: Connecting Languages by Connecting Images

paper

QS-Attn: Query-Selected Attention for Contrastive Learning in I2I Translation

paper | code

FlexIT: Towards Flexible Semantic Image Translation

paper

Exploring Patch-wise Semantic Relation for Contrastive Learning in Image-to-Image Translation Tasks

keywords: image translation, knowledge transfer, contrastive learning

paper

Style Transfer

Exact Feature Distribution Matching for Arbitrary Style Transfer and Domain Generalization

paper | code

Style-ERD: Responsive and Coherent Online Motion Style Transfer

paper

CLIPstyler: Image Style Transfer with a Single Text Condition

keywords: Style Transfer, Text-guided synthesis, Language-Image Pre-Training (CLIP)

paper

Face

Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?

paper

Portrait Eyeglasses and Shadow Removal by Leveraging 3D Synthetic Data

paper | code

HP-Capsule: Unsupervised Face Part Discovery by Hierarchical Parsing Capsule Network

paper

FaceFormer: Speech-Driven 3D Facial Animation with Transformers

paper | code

Sparse Local Patch Transformer for Robust Face Alignment and Landmarks Inherent Relation Learning

paper | code

Face Recognition/Detection

Privacy-preserving Online AutoML for Domain-Specific Face Detection

paper

An Efficient Training Approach for Very Large Scale Face Recognition

paper | code

Face Generation/Face Synthesis/Face Reconstruction/Face Editing

FENeRF: Face Editing in Neural Radiance Fields

paper

GCFSR: a Generative and Controllable Face Super Resolution Method Without Facial and GAN Priors

paper

Sparse to Dense Dynamic 3D Facial Expression Generation

keywords: Facial expression generation, 4D face generation, 3D face modeling

paper

Face Forgery/Face Anti-Spoofing

Domain Generalization via Shuffled Style Assembly for Face Anti-Spoofing

paper | code

Voice-Face Homogeneity Tells Deepfake

paper | code

Protecting Celebrities with Identity Consistency Transformer

paper

Object Tracking

Transforming Model Prediction for Tracking

paper | code

MixFormer: End-to-End Tracking with Iterative Mixed Attention

paper | code

Unsupervised Domain Adaptation for Nighttime Aerial Tracking

paper | code

Iterative Corresponding Geometry: Fusing Region and Depth for Highly Efficient 3D Tracking of Textureless Objects

paper | code

TCTrack: Temporal Contexts for Aerial Tracking

paper | code

Beyond 3D Siamese Tracking: A Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds

keywords: Single Object Tracking, 3D Multi-object Tracking / Detection, Spatial-temporal Learning on Point Clouds

paper

Correlation-Aware Deep Tracking

paper

Image & Video Retrieval/Video Understanding

Bridging Video-text Retrieval with Multiple Choice Questions

paper | code

BEVT: BERT Pretraining of Video Transformers

keywords: Video understanding, Vision transformers, Self-supervised representation learning, BERT pretraining

paper | code

Action/Activity Recognition, Detection, Segmentation, and Localization

E2(GO)MOTION: Motion Augmented Event Stream for Egocentric Action Recognition

paper

Look for the Change: Learning Object States and State-Modifying Actions from Untrimmed Web Videos

paper | code

DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition

paper

Self-supervised Video Transformer

paper | code

Spatio-temporal Relation Modeling for Few-shot Action Recognition

paper | code

RCL: Recurrent Continuous Localization for Temporal Action Detection

paper

OpenTAL: Towards Open Set Temporal Action Localization

paper | code

End-to-End Semi-Supervised Learning for Video Action Detection

paper

Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos

paper

Weakly Supervised Temporal Action Localization via Representative Snippet Knowledge Propagation

paper | code

Colar: Effective and Efficient Online Action Detection by Consulting Exemplars

keywords: Online action detection

paper

Person Re-Identification/Detection

Cascade Transformers for End-to-End Person Search

paper | code

Image/Video Captioning

Open-Domain, Content-based, Multi-modal Fact-checking of Out-of-Context Images via Online Resources

paper | code

Hierarchical Modular Network for Video Captioning

paper | code

X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning

paper

Medical Imaging

ACPL: Anti-curriculum Pseudo-labelling for Semi-supervised Medical

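### CVPR 2024 Diffusion Model Papers: An Overview

CVPR (the Conference on Computer Vision and Pattern Recognition), one of the top academic conferences, covered many frontier research areas in its 2024 program, including advances in diffusion models[^1]. As a powerful generative modeling framework, diffusion models have attracted wide attention in recent years.

#### Diffusion Model Overview

A diffusion model learns a data distribution by gradually adding noise to the data, then reverses that process to generate new samples. The name comes from the diffusion processes of non-equilibrium thermodynamics that inspired the method. Diffusion models perform well on image generation and are increasingly applied to other data modalities.

#### Highlights of the Papers

Among the related work accepted at CVPR 2024, some studies improve existing architectures to raise efficiency and cut computational cost; others explore the potential of diffusion models in different application settings, such as medical image reconstruction and video prediction.

```python
import torch.nn as nn


class DiffusionModel(nn.Module):
    """Placeholder network: a single linear layer followed by ReLU."""

    def __init__(self, input_size, hidden_layers):
        # Note: `hidden_layers` is the width of the hidden layer, not a layer count.
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(input_size, hidden_layers),
            nn.ReLU(),
            # more layers...
        )

    def forward(self, x):
        # Map (batch, input_size) -> (batch, hidden_layers).
        return self.layers(x)
```

#### Experimental Results and Discussion

The reported experiments indicate that the optimized algorithms substantially reduce training time and resource consumption while maintaining high-quality output. Solutions tailored to the needs of specific industries have also been proposed, which further demonstrates the adaptability and broad prospects of this technology.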
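The forward noising step mentioned in the diffusion model overview can be made concrete with a small sketch. The text does not say which formulation it has in mind, so this assumes the widely used DDPM-style closed form, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, with a linear noise schedule. The function names, the schedule endpoints (1e-4 to 0.02), and the 1000-step horizon are illustrative choices, not taken from the original post. The sketch uses only the standard library and covers the noising direction only, not the learned reverse process.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, rising linearly from beta_start to beta_end."""
    if num_steps < 2:
        return [beta_start] * num_steps
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t: running product of (1 - beta_s) for all s up to t."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t given x_0 in closed form; also return the Gaussian noise used."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal_scale * x + noise_scale * e for x, e in zip(x0, noise)]
    return xt, noise


# Usage: noise a toy 3-dimensional "sample" at the first and the last step.
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
x0 = [1.0, -2.0, 0.5]
for t in (0, 999):
    xt, _ = q_sample(x0, t, alpha_bars, rng)
    print(t, [round(v, 3) for v in xt])
```

At small `t` the sample stays close to `x0`, while at the final step `alpha_bar_t` is nearly zero and the sample is almost pure noise. A generative model in this family is trained to undo that corruption, typically by predicting the returned `noise` from `xt` and `t`.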