A roundup of papers and projects across the research directions related to the Segment Anything Model (SAM)

Introduction

Mainstream task areas related to "anything": 2D detection (AnyObject), 3D detection (Any3D), AI generation (AnyGeneration), AI model optimization (AnyModel), AI tasks (AnyTask), etc.

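  • AnyObject - segmentation, detection, classification, medical imaging, OCR, pose, etc.
  • AnyGeneration - text-to-image generation, editing, inpainting, style transfer, etc.
  • Any3D - 3D generation, segmentation, etc.
  • AnyModel - any pruning, any quantization, model reuse.
  • AnyTask - LLM controller + ModelZoo, generalized decoding, multi-task learning.
  • AnyX - other topics: captioning, etc.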

Anything projects

AnyObject

Title & Authors | Intro | Useful Links

Segment Anything
Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick
> Meta Research
> Preprint’23

[Segment Anything (Project)]
[Github]
[Page]
[Demo]

OVSeg: Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
Feng Liang, Bichen Wu, Xiaoliang Dai, Kunpeng Li, Yinan Zhao, Hang Zhang, Peizhao Zhang, Peter Vajda, Diana Marculescu
> Meta Research
> Preprint’23

[OVSeg (Project)]
[Github]
[Page]

Learning to Segment Every Thing
Ronghang Hu, Piotr Dollar, Kaiming He, Trevor Darrell, Ross Girshick
> UC Berkeley, FAIR
> CVPR’18

[seg_every_thing (Project)]
[Github]
[Page]

Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
Shilong Liu and Zhaoyang Zeng and Tianhe Ren and Feng Li and Hao Zhang and Jie Yang and Chunyuan Li and Jianwei Yang and Hang Su and Jun Zhu and Lei Zhang
> IDEA-Research
> Preprint’23

[Grounded-SAM, GroundingDINO (Project)]
[Github]
[Demo]

SegGPT: Segmenting Everything In Context
Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang
> BAAI-Vision
> Preprint’23

[SegGPT (Project)]
[Github]

V3Det: Vast Vocabulary Visual Detection Dataset
Jiaqi Wang, Pan Zhang, Tao Chu, Yuhang Cao, Yujie Zhou, Tong Wu, Bin Wang, Conghui He, Dahua Lin
> Shanghai AI Laboratory, CUHK
> Preprint’23

segment-anything-video (Project)
Kadir Nar

[Github]

Towards Segmenting Anything That Moves
Achal Dave, Pavel Tokmakov, Deva Ramanan
> ICCV’19 Workshop

[segment-any-moving (Project)]
[Github]

Semantic Segment Anything
Jiaqi Chen, Zeyu Yang, Li Zhang

[Semantic-Segment-Anything (Project)]
[Github]

Grounded Segment Anything: From Objects to Parts (Project)
Peize Sun and Shoufa Chen
[Github]

GroundedSAM-zero-shot-anomaly-detection (Project)
Yunkang Cao
[Github]

Segment Anything Labelling Tool (SALT) (Project)
Anurag Ghosh
[Github]

Prompt-Segment-Anything (Project)
Rockey
[Github]

SAM-RBox (Project)
Qingyun Li
[Github]

VISAM (Project)
Feng Yan, Weixin Luo, Yujie Zhong, Yiyang Gan, Lin Ma
[Github]

Segment Anything EO tools: Earth observation tools for Meta AI Segment Anything (Project)
Aliaksandr Hancharenka, Alexander Chichigin
[Github]

napari-segment-anything: Segment Anything Model (SAM) native Qt UI (Project)
Jordão Bragantini, Kyle I S Harrington, Ajinkya Kulkarni
[Github]

SAM-Medical-Imaging: Segment Anything Model (SAM) native Qt UI (Project)
Jordão Bragantini, Kyle I S Harrington, Ajinkya Kulkarni
[Github]

OCR-SAM: Combining MMOCR with Segment Anything & Stable Diffusion. (Project)
Zhenhua Yang, Qing Jiang
[Github]

segment-anything-u-specify: using sam+clip to segment any objs u specify with text prompts. (Project)
MaybeShewill-CV
[Github]

Segment Everything Everywhere All at Once
Xueyan Zou, Jianwei Yang, Hao Zhang, Feng Li, Linjie Li, Jianfeng Gao, Yong Jae Lee

[SEEM (Project)]
[Github]

SegDrawer: Simple static web-based mask drawer (Project)
Harry
[Github]

Magic Copy: a Chrome extension (Project)
Harry
[Github]

Track Anything: Segment Anything Meets Videos
Jinyu Yang, Mingqi Gao, Zhe Li, Shang Gao, Fangjing Wang, Feng Zheng

[Track-Anything (Project)]
[Github]
[Demo]

Count Anything (Project)
Liqi Yan
[Github]

Segment-and-Track-Anything (Project)
Zongxin Yang
[Github]

Pose for Everything: Towards Category-Agnostic Pose Estimation
Lumin Xu*, Sheng Jin*, Wang Zeng, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang
> CUHK, SenseTime
> ECCV’22 Oral

[Pose-for-Everything (Project)]
[Github]

Relate Anything Model (Project)
Zujin Guo*, Bo Li*, Jingkang Yang*, Zijian Zhou*, Ziwei Liu
> MMLab@NTU
> VisCom Lab, KCL/TongJi
[Github]

SegmentAnyRGBD (Project)
Jun Cen, Yizheng Wu, Xingyi Li, Jingkang Yang, Yixuan Pei, Lingdong Kong
> Visual Intelligence Lab@HKUST, HUST, MMLab@NTU, Smiles Lab@XJTU, NUS
[Github]



AnyGeneration

Title & Authors | Intro | Useful Links

High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach and Andreas Blattmann and Dominik Lorenz and Patrick Esser and Björn Ommer
> LMU München, Runway ML
> CVPR’22

[Stable-Diffusion (Project)]
[Github]
[Page]
[Demo]

Adding Conditional Control to Text-to-Image Diffusion Models
Lvmin Zhang, Maneesh Agrawala
> Stanford University
> Preprint’23

[ControlNet (Project)]
[Github]
[Demo]

GigaGAN: Large-scale GAN for Text-to-Image Synthesis
Minguk Kang, Jun-Yan Zhu, Richard Zhang, Jaesik Park, Eli Shechtman, Sylvain Paris, Taesung Park
> POSTECH, Carnegie Mellon University, Adobe Research
> CVPR’23

[Page]

Inpaint-Anything: Segment Anything Meets Image Inpainting (Project)
Tao Yu
[Github]

IEA: Image Editing Anything (Project)
Zhengcong Fei
[Github]

EditAnything (Project)
Shanghua Gao, Pan Zhou
[Github]

Segment Anything for Stable Diffusion Webui (Project)
Chengsong Zhang
[Github]

Segment Anything with Clip (Project)
Jinwoo Park
[Github]

ShowAnything: Edit and Generate Anything In Image and Video (Project)
Showlab, NUS
[Github]

Transfer-Any-Style: An interactive demo based on Segment-Anything for style transfer (Project)
LV-Lab, NUS
[Github]



Any3D

Title & Authors | Intro | Useful Links

Anything-3D: Segment-Anything + 3D, Let’s lift the anything to 3D (Project)
LV-Lab, NUS
[Github]

SAM 3D Selector: Utilizing segment-anything to help the region selection of 3D point cloud or mesh. (Project)
Nexuslrf
[Github]

3D-Box via Segment Anything. (Project)
dvlab-research
[Github]

Segment Anything 3D (Project)
Yunhan Yang, Xiaoyang Wu
[Github]



AnyModel

Title & Authors | Intro | Useful Links
DepGraph: Towards Any Structural Pruning
Gongfan Fang, Xinyin Ma, Mingli Song, Michael Bi Mi, Xinchao Wang
> Learning and Vision Lab @ NUS
> CVPR’23

[Torch-Pruning (Project)]
[Github]
[Demo]

MQBench: Towards Reproducible and Deployable Model Quantization Benchmark
Yuhang Li and Mingzhu Shen and Jian Ma and Yan Ren and Mingxin Zhao and Qi Zhang and Ruihao Gong and Fengwei Yu and Junjie Yan
> SenseTime Research
> NeurIPS’21

[MQBench (Project)]
[Github]
[Page]

OTOv2: Automatic, Generic, User-Friendly
Tianyi Chen, Luming Liang, Tianyu Ding, Ilya Zharkov
> Microsoft
> ICLR’23

[Only Train Once (Project)]
[Github]

Deep Model Reassembly
Xingyi Yang, Daquan Zhou, Songhua Liu, Jingwen Ye, Xinchao Wang
> LV Lab, NUS
> NeurIPS’22

[Deep Model Reassembly (Project)]
[Github]
[Page]



AnyTask

Title & Authors | Intro | Useful Links

HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace
Yongliang Shen, Kaitao Song, Xu Tan, Dongsheng Li, Weiming Lu, Yueting Zhuang
> Zhejiang University, MSRA
> Preprint’23

[Jarvis (Project)]
[Github]
[Demo]

TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs
Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, Yu Liu, Yang Ou, Shuai Lu, Lei Ji, Shaoguang Mao, Yun Wang, Linjun Shou, Ming Gong, Nan Duan
> Microsoft
> Preprint’23

[Github]

Generalized Decoding for Pixel, Image and Language
Xueyan Zou, Zi-Yi Dou, Jianwei Yang, Zhe Gan, Linjie Li, Chunyuan Li, Xiyang Dai, Harkirat Behl, Jianfeng Wang, Lu Yuan, Nanyun Peng, Lijuan Wang, Yong Jae Lee, Jianfeng Gao
> Microsoft
> CVPR’23

[X-Decoder (Project)]
[Github]
[Page]
[Demo]

Pre-Trained Image Processing Transformer
Chen, Hanting and Wang, Yunhe and Guo, Tianyu and Xu, Chang and Deng, Yiping and Liu, Zhenhua and Ma, Siwei and Xu, Chunjing and Xu, Chao and Gao, Wen
> Huawei-Noah
> CVPR’21

[Pretrained-IPT (Project)]
[Github]

OpenAGI: When LLM Meets Domain Experts
Yingqiang Ge, Wenyue Hua, Jianchao Ji, Juntao Tan, Shuyuan Xu, Yongfeng Zhang
> Rutgers University
> Preprint’23

[OpenAGI (Project)]
[Github]



AnyX

Title & Authors | Intro | Useful Links

Caption Anything: Interactive Image Description with Diverse Multimodal Controls
Teng Wang, Jinrui Zhang, Junjie Fei, Hao Zheng, Yunlong Tang, Zhe Li, Mingqi Gao, Shanshan Zhao
> SUSTech VIP Lab
> Preprint’23

[Caption Anything (Project)]
[Github]
[Demo]

Image2Paragraph:Transform Image into Unique Paragraph (Project)
Jinpeng Wang
[Github]



Paper summary

AnyObject

Paper | First Author | Venue | Topic
Segment Anything | Alexander Kirillov | Preprint’23 | Segmentation
Learning to Segment Every Thing | Ronghang Hu | CVPR’18 |
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection | Shilong Liu | Preprint’23 | Grounding+Detection
SegGPT: Segmenting Everything In Context | Xinlong Wang | Preprint’23 | Segmentation
V3Det: Vast Vocabulary Visual Detection Dataset | Jiaqi Wang | Preprint’23 | Dataset
Pose for Everything: Towards Category-Agnostic Pose Estimation | Lumin Xu | ECCV’22 Oral | Pose

AnyGeneration

Paper | First Author | Venue | Topic
High-Resolution Image Synthesis with Latent Diffusion Models | Robin Rombach | CVPR’22 | Text-to-Image Generation
Adding Conditional Control to Text-to-Image Diffusion Models | Lvmin Zhang | Preprint’23 | Controllable Generation
GigaGAN: Large-scale GAN for Text-to-Image Synthesis | Minguk Kang | CVPR’23 | Large-scale GAN
Inpaint Anything: Segment Anything Meets Image Inpainting | Tao Yu | Preprint’23 | Inpainting

AnyModel

Paper | First Author | Venue | Topic
DepGraph: Towards Any Structural Pruning | Gongfan Fang | CVPR’23 | Network Pruning
MQBench: Towards Reproducible and Deployable Model Quantization Benchmark | Yuhang Li | NeurIPS’21 | Network Quantization
OTOv2: Automatic, Generic, User-Friendly | Tianyi Chen | ICLR’23 | Network Pruning
Deep Model Reassembly | Xingyi Yang | NeurIPS’22 | Model Reuse

AnyTask

Paper | First Author | Venue | Topic
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace | Yongliang Shen | Preprint’23 | Modelzoo + LLM
TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs | Yaobo Liang | Preprint’23 | Modelzoo + LLM
Generalized Decoding for Pixel, Image and Language | Xueyan Zou | CVPR’23 | Multi Tasking
Pre-Trained Image Processing Transformer | Chen, Hanting | CVPR’21 | Low-level Vision