Figure HELIX in Depth: 5 All-New AI Capabilities That Will Revolutionize Humanoid Robotics - YouTube https://www.youtube.com/watch?v=aBP99-EhlFk
Transcript:
(00:00) These robots can now learn as fast as you can speak to them, and that's because Figure just revealed its breakthrough Helix vision-language-action (VLA) AI model, with a series of five first-ever robot technologies for general humanoid intelligence. But how smart is it, and what can it do? Put simply, Helix can generalize across a variety of environments, objects, and tasks in the home, which is particularly difficult because, unlike factories, where tasks are rigid and predictable, households instead demand a level of adaptability to
(00:30) changing objects and scenarios. In fact, without Helix, teaching a robot any skill, like folding a shirt, generally requires hours of coding or hundreds of demos or simulations. But Helix completely flips this script by finally using vision-language models (VLMs), which excel at understanding scenes and language, and translating them into instant robotic actions for the real world. This results in the distillation of the VLM's common-sense smarts into flexible real-time control, with Helix allowing robots to learn as fast as humans can speak to
(01:04) them. But it's all made possible with the following five first-ever tech breakthroughs. Number one: full upper-body control. Helix sets a new bar as the first VLA to orchestrate a humanoid's entire upper body at 200 Hz, managing a 35-degree-of-freedom action space. From wrist twists to finger flexes, torso shifts to head tilts, it handles it all with precision. Moving the head or torso alters reach and sight lines, a feedback loop that's tripped up past systems, yet Helix thrives, with video demos showing a Figure robot tracking
(01:36) its hands with its head, adjusting its torso, and grasping objects delicately, all in sync. This whole-body coordination, once a pipe dream for high-dimensional tasks, finally equips Helix to tackle intricate jobs like arranging a table or sorting laundry with a humanlike touch. Number two: multi-robot collaboration. Helix breaks more ground as the first-ever vision-language-action model to synchronize two robots for shared long-horizon tasks without prior training. In fact, this was demonstrated with two Figure robots teaming up to store
(02:08) groceries, grabbing items like crinkly bags or odd-shaped veggies that they've never seen before. One passes a cookie bag to the other, which stashes it in a drawer. Guided by verbal prompts, both of these robots run identical Helix weights, with no custom roles needed. Impressively, this level of zero-shot collaboration is a VLA first, hinting at a future where robot teams can adapt to work together dynamically. And this brings us to the next breakthrough from Helix. Number three: the pick-up-anything emergence. One of Helix's standout features is its
(02:40) emergent ability allowing Figure robots to pick up nearly any small household object with a casual verbal command to "pick that up," and tests show it can handle thousands of novel items, including glassware, toys, tools, and even messy clothes, all in cluttered environments, with no demos required. For example, when told to "pick up the desert item," Helix spots a toy cactus, picks the best hand, and grabs it securely. This language-to-action magic fuses internet-scale comprehension with precise control, making robots as adaptable as your
(03:14) imagination. This finally makes it possible for humanoids to thrive in unpredictable settings with no pre-programming at all, because by simply being told to pick something up, the robot can just learn how on the spot, all by itself, bringing us to the next first-ever breakthrough from Helix. Number four: one unified neural network. Unlike earlier vision-language-action models that needed task-specific tweaks or multiple action heads, Helix excels with a single neural network totaling 7 billion parameters for planning and
(03:45) another 80 million parameters for control. This lone model powers picking and placing, operating drawers and fridges, and multi-robot handovers, all while generalizing to new objects with no fine-tuning and no extra stages, just raw versatility. This simplicity opens the door to scalability in learning, allowing Helix to rival specialized systems while only having a fraction of their complexity. In fact, it's almost like the robot has a unified brain for its expanding repository of talents, streamlining its path to widespread use.
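To make the "one set of weights" point concrete, here is a toy Python sketch under loud assumptions: `SharedPolicy`, `Robot`, and `act` are hypothetical names invented for illustration and are not Figure's API. Only the two parameter counts come from the transcript. The sketch shows two robots holding a reference to the same policy object, with the prompt as the only thing that differs, and works out how small the control portion is relative to the whole.

```python
from dataclasses import dataclass

PLANNER_PARAMS = 7_000_000_000  # "7 billion parameters for planning" (from the transcript)
CONTROL_PARAMS = 80_000_000     # "80 million parameters for control" (from the transcript)


@dataclass(frozen=True)
class SharedPolicy:
    """Hypothetical stand-in for a single set of weights; not Figure's API."""
    planner_params: int = PLANNER_PARAMS
    control_params: int = CONTROL_PARAMS

    def total_params(self) -> int:
        return self.planner_params + self.control_params

    def act(self, prompt: str) -> str:
        # Toy behavior: the prompt alone selects the skill, with no per-task head.
        return f"executing: {prompt}"


@dataclass
class Robot:
    name: str
    policy: SharedPolicy  # a reference, so both robots share the same weights


policy = SharedPolicy()
robot_a = Robot("robot_a", policy)
robot_b = Robot("robot_b", policy)

# Fraction of all parameters that belongs to the control part.
control_share = CONTROL_PARAMS / policy.total_params()

print(robot_a.policy.act("hand the cookie bag to the other robot"))
print(robot_b.policy.act("put the cookie bag in the drawer"))
print(f"control share of parameters: {control_share:.2%}")
```

Under these figures the control part is a little over 1% of the total, which fits the transcript's framing of a large, slow planner paired with a small, fast controller.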
(04:19) And that brings us to tech breakthrough number five: commercial readiness. Helix is the first VLA fully operational on Figure robots' embedded low-power GPUs, ready for commercial action today. Its dual-system setup uses System 2 at 7 to 9 Hz for big-picture thinking and System 1 at 200 Hz for instant action, splitting the work across dual GPUs to ensure seamless performance. System 2 is a VLM trained on vast internet data, and it decodes scenes and commands, while System 1 executes with real-time responsiveness. And being
(04:56) trained to match onboard latency, Helix runs as fast as single-task policies, but with no external compute needed. This practicality makes it a plug-and-play solution for real-world deployment, from homes to service industries, out of the box. But the entire secret to Helix really revolves around its core System 1, System 2 design, which finally solves a classic robotics dilemma: VLMs generalize but lag, while visuomotor policies speed but stagnate. To solve this, Helix's AI System 2 thinks slow, at a speed of just 7 to 9 Hz, processing
(05:32) images and prompts into a semantic vector. This takes place while Helix's AI System 1 thinks fast at 200 Hz, turning that vector into precise movements and adjusting mid-action as needed. Plus, because it's built with open-source components and trained on 500 hours of teleoperated data, Helix is extremely lean and uses just 5% of typical VLA datasets. As a result, its decoupled systems evolve independently, blending speed, scalability, and simplicity into a powerhouse model. And unlike earlier robot systems, Helix generates long-horizon, collaborative,
(06:09) dexterous manipulation on the fly, with no task-specific demos or extensive coding needed. On top of this, it boasts strong object generalization, picking up thousands of novel household items varying in shape, size, color, and texture simply by request. Soon, Helix could evolve to orchestrate full-household tasks like cooking meals, assembling furniture, or managing chores with a single command. Not only that, but multi-robot teams maintaining entire spaces could even work together, adapting to new tools or layouts instantly, or even assisting
(06:46) humans with intuitive, language-guided care. As Helix matures, it will likely be the backbone that domestic robots build on, blending seamlessly into our routines with humanlike flexibility. This leap forward is pivotal for Figure's mission to scale humanoid behaviors for everyday homes. As for the future, Helix is a launchpad for the next Figure humanoid robots, which are expected to ramp up production later this year. For the next leaps in dexterity and intelligence, scaling to cooking entire meals or furniture
(07:18) assembly could push the boundaries even further towards real-life home robots. When combining all five breakthroughs of full-body control, multi-robot synchronization, universal grasping, unified weights, and commercial viability, they all begin to paint a clearer picture of the future. Excitingly, this future looks like it will involve a generally intelligent R2-D2 or C-3PO that is your best friend and can hang out, do chores, or work together with other robots to complete tasks you tell it to by voice command. These robots can
(07:52) effectively learn on the fly, which leads to countless new possibilities, with Helix being the first vision-language-action model to directly control an entire humanoid upper body from natural language. But how hard would it be for a hacker to execute tasks remotely? Anyways, like and subscribe, and tell us in the comments whether you'd trust this robot in your home and how much you'd pay for it. Thanks for watching.
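The System 1 / System 2 split described between 04:19 and 05:32 can be pictured as a two-rate loop: a slow planner refreshes a semantic vector a few times per second, and a fast controller reuses the most recent vector on every control tick. The Python sketch below is only an illustration under stated assumptions. The function names, the `Latent` type, and the toy math are hypothetical, and the 8 Hz planner rate is an assumed midpoint of the "7 to 9 Hz" quoted in the video. Only the 200 Hz control rate and the 35-dimensional action space come from the transcript.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

S1_HZ = 200  # fast control rate quoted in the transcript
S2_HZ = 8    # slow planner rate; assumed midpoint of the quoted 7 to 9 Hz
DOF = 35     # upper-body action dimensions quoted in the transcript


@dataclass
class Latent:
    """Hypothetical stand-in for the semantic vector System 2 emits."""
    tick: int            # control tick at which this latent was produced
    values: List[float]


def system2(tick: int, prompt: str) -> Latent:
    # Toy "slow thinking": a real VLM would encode camera images and the prompt.
    return Latent(tick=tick, values=[float(len(prompt))])


def system1(latent: Latent) -> List[float]:
    # Toy "fast acting": map the latest latent to one target per joint.
    scale = latent.values[0] * 0.001
    return [scale] * DOF


def run(seconds: float, prompt: str) -> Tuple[int, List[List[float]]]:
    ticks_per_plan = S1_HZ // S2_HZ  # 25 control ticks per planner update
    total_ticks = int(seconds * S1_HZ)
    latent: Optional[Latent] = None
    plans = 0
    actions: List[List[float]] = []
    for tick in range(total_ticks):
        if tick % ticks_per_plan == 0:
            latent = system2(tick, prompt)  # slow loop refreshes the latent
            plans += 1
        actions.append(system1(latent))     # fast loop always uses the newest latent
    return plans, actions


plans, actions = run(1.0, "pick that up")
print(plans, len(actions), len(actions[0]))
```

In this simulation, one second of operation yields 8 planner updates and 200 control actions of 35 values each. A real system would run the two loops asynchronously on separate GPUs, as the video describes, rather than in one sequential loop.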