ALOHA robotic arm: incredible generalist robots show us a future without housework

Original article: Incredible generalist robots do your laundry and dishes

Pi-zero enables the robots to open the clothes dryer, fill the laundry hamper, close the dryer and then fold the clothes in ways specific to each item

Physical Intelligence

Emerging startup Physical Intelligence has no interest in building robots. Instead, the team has something better in mind: powering the hardware with the continuously learning generalist 'brains' of AI software, so existing machines will be able to autonomously carry out a growing number of tasks that require precise movements and dexterity – including housework.

Over the past year we've seen robot dogs dancing, even some equipped to shoot flames, as well as increasingly advanced humanoids and machines built for specialist roles on assembly lines. But we're still waiting for our Rosey the Robot from The Jetsons.

But we may be there soon. San Francisco's Physical Intelligence (Pi) has revealed its generalist AI model for robotics, which can empower existing machines to perform various tasks – in this case, getting the washing out of the dryer and folding clothes, delicately packing eggs into their container, grinding coffee beans and 'bussing' tables. It's not a stretch to imagine that this system could see these mobile metal helpers rolling through the house, vacuuming, packing and unpacking the dishwasher, making the bed, looking in the refrigerator and pantry to catalog their contents and coming up with a plan for dinner – and, hey, why not, also cooking that dinner.

It's with this vision that Pi reveals its "general-purpose robot foundational model" known as π₀ (pi-zero).

"We believe this is a first step toward our long-term goal of developing artificial physical intelligence, so that users can simply ask robots to perform any task they want, just like they can ask large language models (LLMs) and chatbot assistants," the company explains. "Like LLMs, our model is trained on broad and diverse data and can follow various text instructions. Unlike LLMs, it spans images, text, and actions and acquires physical intelligence by training on embodied experience from robots, learning to directly output low-level motor commands via a novel architecture. It can control a variety of different robots, and can either be prompted to carry out the desired task, or fine-tuned to specialize it to challenging application scenarios."

In the company's research, pi-zero demonstrates how a variety of jobs requiring different levels of dexterity and movement can be performed by hardware controlled by the AI. In total, the foundational model carried out 20 tasks, all requiring different skills and manipulations.

"Our goal in selecting these tasks is not to solve any particular application, but to start to provide our model with a general understanding of physical interactions – an initial foundation for physical intelligence," the team notes.

π₀ is a VLA generalist:

- it performs dexterous tasks (laundry folding, table bussing and many others)

- transformer+flow matching combines benefits of VLM pre-training and continuous action chunks at 50Hz

- it's pre-trained on a large π dataset spanning many form factors
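
To make "continuous action chunks at 50Hz" concrete, here is a minimal Python sketch of chunked control: the policy is queried once per chunk, and the robot then plays that chunk back one action per control tick. Only the 50 Hz rate comes from the text above. The chunk length of 50 actions, the function names and the stand-in policy are assumptions made for illustration, not details of π₀ itself.

```python
from typing import Callable, List, Tuple

CONTROL_HZ = 50   # control rate quoted in the text
CHUNK_LEN = 50    # assumed chunk length (hypothetical, not from the text)

Action = List[float]
Policy = Callable[[int, int], List[Action]]


def chunk_duration_s(chunk_len: int, control_hz: int) -> float:
    """Seconds of motion covered by one chunk of actions."""
    return chunk_len / control_hz


def dummy_policy(start_step: int, chunk_len: int) -> List[Action]:
    """Stand-in for a learned model: one 1-D action per future tick.

    Each action is simply that tick's timestamp in seconds, so the
    output is easy to check by hand.
    """
    return [[(start_step + i) / CONTROL_HZ] for i in range(chunk_len)]


def run_chunked_control(
    policy: Policy, num_steps: int, chunk_len: int = CHUNK_LEN
) -> Tuple[List[Action], int]:
    """Run num_steps control ticks, querying the policy once per chunk."""
    executed: List[Action] = []
    queries = 0
    while len(executed) < num_steps:
        chunk = policy(len(executed), chunk_len)
        queries += 1
        remaining = num_steps - len(executed)
        # A real controller would send one action per 1/CONTROL_HZ seconds.
        executed.extend(chunk[:remaining])
    return executed, queries


if __name__ == "__main__":
    actions, queries = run_chunked_control(dummy_policy, num_steps=120)
    print(len(actions), "actions from", queries, "policy queries")
    print("one chunk lasts", chunk_duration_s(CHUNK_LEN, CONTROL_HZ), "s")
```

With these assumed numbers, one chunk covers one second of motion, so 120 control ticks need only three policy queries.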

Now, I'm the last person at New Atlas to get excited about robotics, largely because most of what we've seen has been specialist machines – and, to be honest, I've had my fill of humanoids moving boxes from point A to B. In biology, specialists – for example bees, butterflies and the koala – are very good at exploiting one niche, and do it exceptionally well. That is, until external forces such as habitat loss or disease reveal their limitations.

However, generalists – like a raccoon or a grizzly bear – may not be as good at occupying one niche as others, but they're far more adaptable to a wider range of habitats and food sources, which ultimately makes them better suited to dynamic changes in the environment.

Similarly, generalist robots will be able to do more than expertly build a brick wall; and, capable of learning, they will be able to adapt to different challenges in the physical world and have a suite of ever-evolving skills.

Pi-zero combines internet-scale vision-language model (VLM) pre-training with flow matching, which it uses to produce continuous robot actions. Its pre-training included 10,000 hours of "dexterous manipulation data" from seven different robot configurations and 68 tasks. This was in addition to existing robot manipulation datasets from OXE, DROID and Bridge.
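
Flow matching itself can be sketched in a few lines. The Python example below shows the two ingredients in their simplest textbook form: the training target (a straight-line path from a noise sample to a demonstrated action, whose velocity a network learns to predict) and the sampler (Euler integration of that velocity from noise to an action). It follows one common flow matching convention, assumed here for illustration; it is not Physical Intelligence's code. The function names, the 10 integration steps and the single-target "point mass" velocity field standing in for a trained network are all assumptions.

```python
import random
from typing import Callable, List

Vector = List[float]
VelocityFn = Callable[[Vector, float], Vector]


def interpolate(noise: Vector, action: Vector, t: float) -> Vector:
    """Point on the straight path from noise (t=0) to the action (t=1)."""
    return [(1.0 - t) * n + t * a for n, a in zip(noise, action)]


def target_velocity(noise: Vector, action: Vector) -> Vector:
    """Regression target during training: the constant path velocity."""
    return [a - n for n, a in zip(noise, action)]


def sample_action(velocity_fn: VelocityFn, noise: Vector, num_steps: int = 10) -> Vector:
    """Turn noise into an action by Euler-integrating the velocity field."""
    x = list(noise)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_point_mass_field(action: Vector) -> VelocityFn:
    """Exact velocity field when the training data is a single action.

    Hypothetical stand-in for a trained network, so the sketch runs
    without any learning code.
    """
    def velocity_fn(x: Vector, t: float) -> Vector:
        return [(a - xi) / (1.0 - t) for a, xi in zip(action, x)]
    return velocity_fn


if __name__ == "__main__":
    rng = random.Random(0)
    start = [rng.gauss(0.0, 1.0) for _ in range(3)]
    goal = [0.2, -0.5, 1.0]  # made-up 3-D action for the demo
    print(sample_action(make_point_mass_field(goal), start))
```

In a full model, velocity_fn would be a network conditioned on what the robot observes (images and the text instruction, going by the description in this article), and x would hold a whole chunk of future actions rather than three numbers.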

See this long, uncut video of the robot doing laundry folding with a single model

We compare π₀ and π₀-small (non-VLM version) to a number of prior models:

- Octo and OpenVLA for 0-shot VLA

- ACT and Diffusion Policy for single task

It outperforms zero-shot on seen tasks, fine-tuning to new tasks, and at following language

"Dexterous robot manipulation requires pi-zero to output motor commands at a high frequency, up to 50 times per second," the team notes. "To provide this level of dexterity, we developed a novel method to augment pre-trained VLMs with continuous action outputs via flow matching, a variant of diffusion models. Starting from diverse robot data and a VLM pre-trained on Internet-scale data, we train our vision-language-action flow matching model, which we can then post-train on high-quality robot data to solve a range of downstream tasks."

"To our knowledge, this represents the largest pre-training mixture ever used for a robot manipulation model," the researchers noted in their study.

While the company is still in its early days of research and development, Pi co-founder and CEO Karol Hausman – a scientist who previously worked on robotics at Google – believes its foundational model will overcome existing hurdles in the field of generalisation, including the amount of time and cost involved in training the hardware on physical world data in order to learn new tasks. The Pi team also includes co-founder Sergey Levine, a robot learning researcher at the University of California, Berkeley, and Brian Ichter, a former research scientist at Google.

In 2023, satirist and architect Karl Sharro went viral with his tweet: "Humans doing the hard jobs on minimum wage while the robots write poetry and paint is not the future I wanted." The same year, Hollywood ground to a halt as members of the Writers Guild of America went on strike, seeing the bleak path ahead for creatives in the face of this new age of technology.

And while AI may still be coming – and has already come – for many of our jobs (you don't have to remind us journalists of that), Pi's vision feels more in line with those of the mid-20th century futurists, who saw a world in which the machines made our lives easier. Call me naive, perhaps, but if a robot comes for my housework, it can take it.

You can see more videos of the drills the team put the pi-zero robots through on the Pi blog post, but here's one that demonstrates its impressive – and delicate – work.

Sorting processed eggs

The research paper on pi-zero's development and training can be found here.

Source: Physical Intelligence

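Product name	JD.com listing
智能佳 Mobile ALOHA 2 robotic arm complete kit (Stanford ALOHA; deep learning and household-service ROS open-source experiment platform; high-end mobile manipulator)	https://item.jd.com/10097978503518.html
智能佳 Mobile ALOHA robotic arm (full replica of the Stanford arm; mobile manipulator with teleoperated arms; ROS open-source learning and experiment platform)	https://item.jd.com/10100493559285.html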

Interested in this product? Please contact us!

智能佳机器人

400 099 1872

www.bjrobot.com

JD.com store: 智能佳机器人专营店 (jd.com)

Taobao store: 智能佳机器人 (taobao.com)

Taobao enterprise store: 智能佳机器人官方店铺 (taobao.com)

posted @ 2024-11-12 17:38 by 智能佳机器人
