CVPR 2024 | idea这不就有了!扩散diffusion模型100+篇论文、40+研究方向(清单版)...

30个方向130篇!CVPR 2023最全AIGC论文

30个方向!ICCV 2023 最全AIGC论文

25个方向!CVPR 2022 GAN论文汇总

 35个方向!ICCV 2021 最全GAN论文汇总

超110篇!CVPR 2021 最全GAN论文梳理

超100篇!CVPR 2020 最全GAN论文梳理 

最新视觉顶会 CVPR 2024 会议,涌现出大量基于生成式AIGC的CV论文,尤其扩散模型diffusion为代表!除直接生成,还广泛应用在各类 low-level、high-level 视觉任务!本文集齐和梳理CVPR 2024共40+方向、AIGC+扩散模型论文!均已分类打包好!

关注【机器学习与AI生成创作】公众号,后台回复 CVPR2024 (长按红字、选中复制)即可获取分类、按文件夹汇总好的论文集!!!

本文为清单版,详细版文章很长(CVPR 2024 | 绝了!!最新 diffusion 扩散模型梳理!100+篇论文、40+研究方向!),梳理不易,越到后面越有趣!麻烦列位,转发、分享、在看三连,多多鼓励!!!

扩散模型应用方向目录

  • 1、扩散模型改进

  • 2、可控文生图

  • 3、风格迁移

  • 4、人像生成

  • 5、图像超分

  • 6、图像恢复

  • 7、目标跟踪

  • 8、目标检测

  • 9、关键点检测

  • 10、deepfake检测

  • 11、异常检测

  • 12、图像分割

  • 13、图像压缩

  • 14、视频理解

  • 15、视频生成

  • 16、倾听人生成

  • 17、数字人生成

  • 18、新视图生成

  • 19、3D相关

  • 20、图像修复

  • 21、草图相关

  • 22、版权隐私

  • 23、数据增广

  • 24、医学图像

  • 25、交通驾驶

  • 26、语音相关

  • 27、姿势估计

  • 28、图相关

  • 29、动作检测/生成

  • 30、机器人规划/智能决策

  • 31、视觉叙事/故事生成

  • 32、因果生成

  • 33、隐私保护-对抗估计

  • 34、扩散模型改进-补充

  • 35、交互式可控生成

  • 36、图像恢复-补充

  • 37、域适应-迁移学习

  • 38、手交互

  • 39、伪装检测

  • 40、多任务学习

  • 41、轨迹预测

  • 42、场景生成

  • 43、流估计-3D相关

一、扩散模型改进

1、Accelerating Diffusion Sampling with Optimized Time Steps

2、DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

  • https://github.com/mit-han-lab/distrifuser

3、Balancing Act: Distribution-Guided Debiasing in Diffusion Models

4、Few-shot Learner Parameterization by Diffusion Time-steps

5、Structure-Guided Adversarial Training of Diffusion Models

6、Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models

  • https://github.com/PangzeCheung/SingDiffusion

7、Boosting Diffusion Models with Moving Average Sampling in Frequency Domain

8、Towards Memorization-Free Diffusion Models

9、SD-DiT: Unleashing the Power of Self-supervised Discrimination in Diffusion Transformer

二、可控文生图

10、ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models

  • https://lukashoel.github.io/ViewDiff/

11、NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging

  • https://github.com/univ-esuty/noisecollage

12、Discriminative Probing and Tuning for Text-to-Image Generation

  • https://github.com/LgQu/DPT-T2I

13、Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with LLMs

14、Face2Diffusion for Fast and Editable Face Personalization

  • https://github.com/mapooon/Face2Diffusion

15、LeftRefill: Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model

  • https://github.com/ewrfcas/LeftRefill

16、InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models

  • https://jiuntian.github.io/interactdiffusion/

17、MACE: Mass Concept Erasure in Diffusion Models

  • https://github.com/Shilin-LU/MACE

18、MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis

  • https://migcproject.github.io/

19、One-dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications

  • https://lyumengyao.github.io/projects/spm

20、FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models

三、风格迁移

21、DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations

  • https://tianhao-qi.github.io/DEADiff/

22、Deformable One-shot Face Stylization via DINO Semantic Guidance

  • https://github.com/zichongc/DoesFS

23、One-Shot Structure-Aware Stylized Image Synthesis

四、人像生成

24、Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis

  • https://github.com/YanzuoLu/CFLD

25、High-fidelity Person-centric Subject-to-Image Synthesis

  • https://github.com/CodeGoat24/Face-diffuser

26、Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation

  • https://hcplayercvpr2024.github.io/

27、A Unified and Interpretable Emotion Representation and Expression Generation

  • https://emotion-diffusion.github.io/

28、CosmicMan: A Text-to-Image Foundation Model for Humans

  • https://cosmicman-cvpr2024.github.io/

29、DiffHuman: Probabilistic Photorealistic 3D Reconstruction of Humans

30、Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On

五、图像超分

31、Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder

32、Diffusion-based Blind Text Image Super-Resolution

33、Text-guided Explorable Image Super-resolution

34、Building Bridges across Spatial and Temporal Resolutions: Reference-Based Super-Resolution via Change Priors and Conditional Diffusion Model

  • https://github.com/dongrunmin/RefDiff

六、图像恢复

35、Boosting Image Restoration via Priors from Pre-trained Models

36、Image Restoration by Denoising Diffusion Models with Iteratively Preconditioned Guidance

37、Diff-Plugin: Revitalizing Details for Diffusion-based Low-level Tasks

  • https://yuhaoliu7456.github.io/Diff-Plugin/

38、Selective Hourglass Mapping for Universal Image Restoration Based on Diffusion Model

  • https://github.com/iSEE-Laboratory/DiffUIR

39、Shadow Generation for Composite Image Using Diffusion Model

  • https://github.com/bcmi/Object-Shadow-Generation-Dataset-DESOBAv2

七、目标跟踪

40、Delving into the Trajectory Long-tail Distribution for Muti-object Tracking

  • https://github.com/chen-si-jia/Trajectory-Long-tail-Distribution-for-MOT

八、目标检测

41、SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection

  • https://github.com/zhanggang001/HEDNet

42、DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception

43、SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection

九、关键点检测

44、Pose-Guided Self-Training with Two-Stage Clustering for Unsupervised Landmark Discovery

十、deepfake检测

####45、Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection

十一、异常检测

46、RealNet: A Feature Selection Network with Realistic Synthetic Anomaly for Anomaly Detection

  • https://github.com/cnulab/RealNet

十二、抠图/分割

47、In-Context Matting
  • https://github.com/tiny-smart/in-context-matting/tree/master

十三、图像压缩

48、Laplacian-guided Entropy Model in Neural Codec with Blur-dissipated Synthesis

十四、视频理解

49、Abductive Ego-View Accident Video Understanding for Safe Driving Perception

  • http://www.lotvsmmau.net/

十五、视频生成

50、FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation

51、Grid Diffusion Models for Text-to-Video Generation

52、TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models

  • https://trip-i2v.github.io/TRIP/

53、Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model

  • https://github.com/thuhcsi/S2G-MDDiffusion

54、Video Interpolation With Diffusion Models

  • https://vidim-interpolation.github.io/

十六、倾听人生成

55、CustomListener: Text-guided Responsive Interaction for User-friendly Listening Head Generation

  • https://customlistener.github.io/

十七、数字人生成

56、Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework

  • https://github.com/ICTMCG/Make-Your-Anchor

十八、新视图生成

57、EscherNet: A Generative Model for Scalable View Synthesis

十九、3D相关

58、Bayesian Diffusion Models for 3D Shape Reconstruction

59、DreamControl: Control-Based Text-to-3D Generation with 3D Self-Prior

  • https://github.com/tyhuang0428/DreamControl

60、DanceCamera3D: 3D Camera Movement Synthesis with Music and Dance

  • https://github.com/Carmenw1203/DanceCamera3D-Official

61、DiffuScene: Denoising Diffusion Models for Generative Indoor Scene Synthesis

  • https://tangjiapeng.github.io/projects/DiffuScene/

62、IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images

  • https://yushuang-wu.github.io/IPoD/

63、Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene Affordance

  • https://afford-motion.github.io/

64、MicroDiffusion: Implicit Representation-Guided Diffusion for 3D Reconstruction from Limited 2D Microscopy Projections

  • https://github.com/UCSC-VLAA/MicroDiffusion

65、Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior

  • https://stellarcheng.github.io/Sculpt3D/

66、Score-Guided Diffusion for 3D Human Recovery

  • https://statho.github.io/ScoreHMR/

67、Towards Realistic Scene Generation with LiDAR Diffusion Models

  • https://github.com/hancyran/LiDAR-Diffusion

68、VP3D: Unleashing 2D Visual Prompt for Text-to-3D Generation

  • https://vp3d-cvpr24.github.io/

二十、图像修复

69、Structure Matters: Tackling the Semantic Discrepancy in Diffusion Models for Image Inpainting

  • https://github.com/htyjers/StrDiffusion

二十一、草图相关

70、It’s All About Your Sketch: Democratising Sketch Control in Diffusion Models

71、Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers

二十二、版权隐私

72、CGI-DM: Digital Copyright Authentication for Diffusion Models via Contrasting Gradient Inversion

  • https://github.com/Nicholas0228/Revelio

73、CPR: Retrieval Augmented Generation for Copyright Protection

二十三、数据增广

74、SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation

75、ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object

  • https://github.com/chenshuang-zhang/imagenet_d

二十四、医学图像

76、MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant

二十五、交通驾驶

77、Controllable Safety-Critical Closed-loop Traffic Simulation via Guided Diffusion

  • https://safe-sim.github.io/

78、Generalized Predictive Model for Autonomous Driving

二十六、语音相关

79、FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models

  • https://shivangi-aneja.github.io/projects/facetalk/

80、ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis

  • https://vcai.mpi-inf.mpg.de/projects/ConvoFusion/

81、Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners

  • https://yzxing87.github.io/Seeing-and-Hearing/

二十七、姿势估计

82、Object Pose Estimation via the Aggregation of Diffusion Features

  • https://github.com/Tianfu18/diff-feats-pose

二十八、图相关

83、DiffAssemble: A Unified Graph-Diffusion Model for 2D and 3D Reassembly

  • https://github.com/IIT-PAVIS/DiffAssemble

二十九、动作检测或生成

84、Action Detection via an Image Diffusion Process

85、Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives

  • https://li-ronghui.github.io/lodge

86、OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers

  • https://tr3e.github.io/omg-page/

三十、机器人规划/智能决策

87、SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution

  • https://skilldiffuser.github.io/

三十一、视觉叙事-故事生成

88、Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models

  • https://haoningwu3639.github.io/StoryGen_Webpage/

三十二、因果归因

89、 ProMark: Proactive Diffusion Watermarking for Causal Attribution

三十三、隐私保护-对抗估计

90、Robust Imperceptible Perturbation against Diffusion Models

  • https://github.com/liuyixin-louis/MetaCloak

三十四、扩散模型改进-补充

91、Condition-Aware Neural Network for Controlled Image Generation

  • https://github.com/mit-han-lab/efficientvit

三十五、交互式可控生成

92、Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation

  • https://github.com/haofengl/DragNoise

三十六、图像恢复-补充

93、Generating Content for HDR Deghosting from Frequency View

三十七、域适应/迁移学习

94、Unknown Prompt, the only Lacuna: Unveiling CLIP’s Potential for Open Domain Generalization

  • https://github.com/mainaksingha01/ODG-CLIP

三十八、手交互

95、Text2HOI: Text-guided 3D Motion Generation for Hand-Object Interaction

  • https://github.com/JunukCha/Text2HOI

96、InterHandGen: Two-Hand Interaction Generation via Cascaded Reverse Diffusion

  • https://jyunlee.github.io/projects/interhandgen/

三十九、伪装检测

97、LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion

  • https://github.com/PanchengZhao/LAKE-RED

四十、多任务学习

98、DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data

  • https://prismformore.github.io/diffusionmtl/

四十一、轨迹预测

99、SingularTrajectory: Universal Trajectory Predictor Using Diffusion Model

  • https://github.com/inhwanbae/SingularTrajectory

四十二、场景生成

100、SemCity: Semantic Scene Generation with Triplane Diffusion

  • https://github.com/zoomin-lee/SemCity

四十三、3D相关/流估计

101、DifFlow3D: Toward Robust Uncertainty-Aware Scene Flow Estimation with Iterative Diffusion-Based Refinement

关注公众号【机器学习与AI生成创作】,更多精彩等你来读

如何跟进 AIGC+CV 视觉前沿技术?

CVPR 2024 | diffusion扩散模型梳理!100+论文、40+方向!

ICCV 2023 | diffusion扩散模型方向!百篇论文

CVPR 2023 | 30个方向130篇!最全 AIGC 论文一口读完

深入浅出stable diffusion:AI作画技术背后的潜在扩散模型论文解读

深入浅出ControlNet,一种可控生成的AIGC绘画生成算法! 

经典GAN不得不读:StyleGAN

f380515c15e3c8d96150cf2084127640.png 戳我,查看GAN的系列专辑~!

最新最全100篇汇总!生成扩散模型Diffusion Models

ECCV2022 | 生成对抗网络GAN部分论文汇总

CVPR 2022 | 25+方向、最新50篇GAN论文

 ICCV 2021 | 35个主题GAN论文汇总

超110篇!CVPR 2021最全GAN论文梳理

超100篇!CVPR 2020最全GAN论文梳理

拆解组新的GAN:解耦表征MixNMatch

StarGAN第2版:多域多样性图像生成

附下载 | 《可解释的机器学习》中文版

附下载 |《TensorFlow 2.0 深度学习算法实战》

附下载 |《计算机视觉中的数学方法》分享

《基于深度学习的表面缺陷检测方法综述》

《零样本图像分类综述: 十年进展》

《基于深度神经网络的少样本学习综述》

《礼记·学记》有云:独学而无友,则孤陋而寡闻

点击跟进 AIGC+CV视觉 前沿技术,真香!,加入 AI生成创作与计算机视觉 知识星球!

  • 12
    点赞
  • 28
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值