Embodied AI Fundamentals Roadmap

Reference (Bilibili video): https://www.bilibili.com/video/BV1d5ukedEsi/

GitHub: https://github.com/yunlongdong/Awesome-Embodied-AI

Scene Understanding

Image

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| SAM | Segmentation | https://arxiv.org/abs/2304.02643 | GitHub - facebookresearch/segment-anything |
| YOLO-World | Open-vocabulary detection | https://arxiv.org/abs/2401.17270 | GitHub - AILab-CVC/YOLO-World |

Tutorial: a beginner-friendly guide to the Segment Anything (SAM) paper and demo (CSDN blog)
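
A minimal usage sketch for SAM's promptable segmentation, assuming the `segment-anything` package is installed and a ViT-H checkpoint has been downloaded; the checkpoint path, image file, and point prompt below are placeholders.

```python
# Minimal sketch: prompt-based segmentation with SAM.
# Assumes `pip install segment-anything` plus a downloaded ViT-H checkpoint;
# the checkpoint path and image file are placeholders.
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("scene.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# One foreground point prompt (x, y); label 1 means foreground.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),
    point_labels=np.array([1]),
    multimask_output=True,
)
best_mask = masks[np.argmax(scores)]  # HxW boolean mask with the highest predicted IoU
```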

Point Cloud

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| SAM3D | Segmentation | https://arxiv.org/abs/2306.03908 | GitHub - Pointcept/SegmentAnything3D |
| PointMixer | Understanding | https://arxiv.org/abs/2111.11187 | GitHub - LifeBeyondExpectations/PointMixer |

Multi-Modal Grounding

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| GPT4V | MLM (Image + Language -> Language) | https://arxiv.org/abs/2303.08774 | |
| Claude3-Opus | MLM (Image + Language -> Language) | Introducing the next generation of Claude (Anthropic announcement) | |
| GLaMM | Pixel grounding | https://arxiv.org/abs/2311.03356 | GitHub - mbzuai-oryx/groundingLMM |
| All-Seeing | Pixel grounding | https://arxiv.org/abs/2402.19474 | GitHub - OpenGVLab/all-seeing |
| LEO | 3D | https://arxiv.org/abs/2311.12871 | GitHub - embodied-generalist/embodied-generalist |

ICML'24 open source: LEO, the first embodied generalist agent in the 3D world (Bilibili)
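
For the MLM (Image + Language -> Language) entries above, a hosted model is typically queried through a chat API. A minimal sketch using the OpenAI Python client follows; the model name and image URL are assumptions, and any vision-capable chat model can be substituted.

```python
# Minimal sketch: Image + Language -> Language grounding via a hosted MLM.
# The model name and image URL are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: any vision-capable chat model works here
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "List the graspable objects on the table and where they are."},
            {"type": "image_url", "image_url": {"url": "https://example.com/tabletop.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```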

Data Collection

From Video

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| Vid2Robot | | https://vid2robot.github.io/vid2robot.pdf | |
| RT-Trajectory | | https://arxiv.org/abs/2311.01977 | |
| MimicPlay | | https://mimic-play.github.io/assets/MimicPlay.pdf | GitHub - j96w/MimicPlay |

Hardware

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| UMI | Two-Fingers | https://arxiv.org/abs/2402.10329 | GitHub - real-stanford/universal_manipulation_interface |
| DexCap | Five-Fingers | https://dex-cap.github.io/assets/DexCap_paper.pdf | GitHub - j96w/DexCap |
| HIRO Hand | Hand-over-hand | https://sites.google.com/view/hiro-hand | |

Generative Simulation

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| MimicGen | | https://arxiv.org/abs/2310.17596 | GitHub - NVlabs/mimicgen_environments |
| RoboGen | | https://arxiv.org/abs/2311.01455 | GitHub - Genesis-Embodied-AI/RoboGen |
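
The core trick behind MimicGen-style generation is re-targeting recorded end-effector poses from a source demonstration into the frame of an object's new pose. A toy sketch of just that transformation step, with made-up poses, is below; it is not the released MimicGen pipeline.

```python
# Toy sketch of MimicGen-style trajectory re-targeting: keep the end-effector
# poses fixed relative to the object, and express them in the new scene where
# the object sits at a different pose. Poses are 4x4 homogeneous transforms in
# the world frame; the data here is made up for illustration.
import numpy as np

def retarget(ee_traj_src, T_obj_src, T_obj_new):
    """Map each source end-effector pose into the new object pose's frame."""
    T_rel = T_obj_new @ np.linalg.inv(T_obj_src)
    return [T_rel @ T_ee for T_ee in ee_traj_src]

# Example: in the new scene the object is translated by +0.2 m along x.
T_obj_src = np.eye(4)
T_obj_new = np.eye(4)
T_obj_new[0, 3] = 0.2
ee_traj_src = [np.eye(4) for _ in range(5)]   # placeholder recorded trajectory
ee_traj_new = retarget(ee_traj_src, T_obj_src, T_obj_new)
```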

Action Output

Action planning

Generative Imitation Learning

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| Diffusion Policy | | https://arxiv.org/abs/2303.04137 | GitHub - real-stanford/diffusion_policy |
| ACT | | https://arxiv.org/abs/2304.13705 | GitHub - tonyzhaozh/act |
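
Diffusion Policy treats the visuomotor policy as a conditional denoiser over a short action chunk. A minimal sketch of the sampling loop, assuming the `diffusers` DDPM scheduler, is below; the noise-prediction network and observation features are placeholders rather than the released implementation.

```python
# Toy sketch of diffusion-based action generation: start from Gaussian noise
# over an action chunk and iteratively denoise it, conditioned on the current
# observation. The noise-prediction network below is a zero-returning stand-in
# for the trained model.
import torch
from diffusers import DDPMScheduler

horizon, action_dim = 16, 7
scheduler = DDPMScheduler(num_train_timesteps=100)
scheduler.set_timesteps(10)  # fewer steps at inference time

def noise_pred_net(noisy_actions, timestep, obs_emb):
    """Placeholder for the trained conditional noise-prediction network."""
    return torch.zeros_like(noisy_actions)

obs_emb = torch.zeros(1, 256)                   # placeholder observation features
actions = torch.randn(1, horizon, action_dim)   # start from pure Gaussian noise

for t in scheduler.timesteps:
    noise_pred = noise_pred_net(actions, t, obs_emb)
    actions = scheduler.step(noise_pred, t, actions).prev_sample
# `actions` now holds a denoised action chunk; execute a few steps, then
# re-plan in a receding-horizon fashion.
```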

Affordance Map

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| CLIPort | Pick & place | https://arxiv.org/pdf/2109.12098.pdf | GitHub - cliport/cliport |
| Robo-Affordances | Contact & post-contact trajectories | https://arxiv.org/abs/2304.08488 | GitHub - shikharbahl/vrb |
| Robo-ABC | | https://arxiv.org/abs/2401.07487 | GitHub - TEA-Lab/Robo-ABC |
| Where2Explore | Few-shot learning from semantic similarity | https://proceedings.neurips.cc/paper_files/paper/2023/file/0e7e2af2e5ba822c9ad35a37b31b5dd4-Paper-Conference.pdf | |
| Move as You Say, Interact as You Can | Affordance-to-motion via a diffusion model | https://arxiv.org/pdf/2403.18036.pdf | |
| AffordanceLLM | Grounding affordance with an LLM | https://arxiv.org/pdf/2401.06341.pdf | |
| Environment-aware Affordance | | https://proceedings.neurips.cc/paper_files/paper/2023/file/bf78fc727cf882df66e6dbc826161e86-Paper-Conference.pdf | |
| OpenAD | Open-vocabulary affordance detection from point clouds | https://www.csc.liv.ac.uk/~anguyen/assets/pdfs/2023_OpenAD.pdf | GitHub - Fsoft-AIC/Open-Vocabulary-Affordance-Detection-in-3D-Point-Clouds |
| RLAfford | End-to-end affordance learning with RL | https://gengyiran.github.io/pdf/RLAfford.pdf | |
| General Flow | Collect affordance from video | https://general-flow.github.io/general_flow.pdf | GitHub - michaelyuancb/general_flow |
| PreAffordance | Pre-grasping planning | https://arxiv.org/pdf/2404.03634.pdf | |
| SceneFun3D | Fine-grained functionality & affordance in 3D scenes | https://aycatakmaz.github.io/data/SceneFun3D-preprint.pdf | GitHub - SceneFun3D/scenefun3d |
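
Most of the methods above share one interface: predict a per-pixel (or per-point) affordance score, then act at the best-scoring location. A minimal sketch of that pattern follows; `predict_affordance` is a stand-in for whichever model from the table is used.

```python
# Minimal sketch of the affordance-map pattern: score every pixel for
# "graspability" and act at the highest-scoring location. The predictor here
# returns random scores purely for illustration.
import numpy as np

def predict_affordance(rgb):
    """Placeholder: return an HxW affordance heatmap in [0, 1]."""
    h, w, _ = rgb.shape
    return np.random.rand(h, w)

rgb = np.zeros((480, 640, 3), dtype=np.uint8)   # placeholder camera image
heatmap = predict_affordance(rgb)
v, u = np.unravel_index(np.argmax(heatmap), heatmap.shape)
print(f"pick pixel: (u={u}, v={v}), score={heatmap[v, u]:.2f}")
# In practice (u, v) is back-projected with depth and camera intrinsics to a 3D grasp point.
```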

Question&Answer from LLM

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| COPA | | https://arxiv.org/abs/2403.08248 | |
| ManipLLM | | https://arxiv.org/abs/2312.16217 | |
| ManipVQA | | https://arxiv.org/pdf/2403.11289.pdf | GitHub - SiyuanHuang95/ManipVQA |

Language Corrections

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| OLAF | | https://arxiv.org/pdf/2310.17555 | |
| YAY Robot | | https://arxiv.org/abs/2403.12910 | GitHub - yay-robot/yay_robot |

Planning from LLM

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| SayCan | API Level | https://arxiv.org/abs/2204.01691 | GitHub - google-research/google-research (saycan) |
| VILA | Prompt Level | https://arxiv.org/abs/2311.17842 | |
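
SayCan selects the next skill by combining the LLM's task-grounding score (is this skill useful for the instruction?) with a value function's affordance score (can this skill succeed in the current state?). A toy sketch with made-up scores is below.

```python
# Toy sketch of SayCan-style skill selection: multiply the LLM usefulness
# score by the value function's success estimate and pick the best skill.
# Both scoring functions are placeholders with made-up numbers.
def llm_usefulness(instruction: str, skill: str) -> float:
    """Placeholder for the LLM's score that `skill` helps with `instruction`."""
    return {"pick up the sponge": 0.7, "go to the sink": 0.2, "open the drawer": 0.1}[skill]

def value_function(skill: str) -> float:
    """Placeholder for the learned success probability of `skill` in the current state."""
    return {"pick up the sponge": 0.9, "go to the sink": 0.8, "open the drawer": 0.3}[skill]

instruction = "wipe the table"
skills = ["pick up the sponge", "go to the sink", "open the drawer"]
best = max(skills, key=lambda s: llm_usefulness(instruction, s) * value_function(s))
print("next skill:", best)  # highest combined score wins
```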
