Datasets
Datasets collect multi-source, heterogeneous data: vision, touch, force, motion trajectories, and the robot's proprioceptive state.
Collection methods fall into two main categories: real-world data and simulated data.
Keywords
Robot type: robots include single-arm, dual-arm, and quadruped robots.
Embodiments: mainstream robot arms (UR5, Franka, KUKA, and Flexiv),
4 grippers (Dahuan AG95, WSG50, Robotiq-85, and Franka),
3 force sensors (OptoForce, ATI Axia80-M20, and Franka)
Environment:
Scene:
Task: task length; long-horizon tasks (average trajectory length 655 frames); task_name
Skills (Manipulations): Picking, Moving, Pushing, Placing, Wiping, Assembling, Turning on, etc.
From basic operations such as picking, placing, pushing, and pulling, up to fine-grained, long-horizon, bimanual-coordination interactions such as stirring, folding, and ironing.
Manipulated objects: object types; materials span 9 main categories, e.g. wood, carpet, and stone.
Episode:
Trajectory:
Agent:
Reward: a code module that quantifies the effect of an action; each sub-term (e.g. linear velocity, angular velocity, acceleration, collisions) carries a corresponding penalty or reward (see the sketch after this list).
Output: a 7-dimensional discretized end-effector action (6D for EE: x, y, z, roll, pitch, yaw, and 1D for the gripper), plus a trajectory-status flag (terminate_episode).
Linear velocity, angular velocity, orientation
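As referenced in the reward entry above, a reward of this shape can be pictured as a weighted sum of a goal-approach bonus and per-term penalties. A minimal Python sketch; all weights and term names here are made up for illustration, not taken from any particular dataset:

```python
import numpy as np

def compute_reward(lin_vel, ang_vel, acc, collided, goal_dist):
    """Toy reward: bonus for approaching the goal, penalties on
    velocity, acceleration, and collisions. Weights are illustrative."""
    reward = 1.0 / (1.0 + goal_dist)             # goal-approach bonus
    reward -= 0.01 * float(np.sum(lin_vel**2))   # linear-velocity penalty
    reward -= 0.01 * float(np.sum(ang_vel**2))   # angular-velocity penalty
    reward -= 0.001 * float(np.sum(acc**2))      # smoothness (acceleration) penalty
    if collided:
        reward -= 1.0                            # collision penalty
    return reward
```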
## Robots
Robot, joint, gripper
Head
Waist: the robot's waist joint, with two degrees of freedom (pitch and lift).
End effector
Wheeled omnidirectional mobile base; robot arm
Controllers:
Sensors: torque sensors (also called moment sensors)
Actuators
End effectors: gripper, dexterous hand
### Mechanics and Physics
Torque (torques)
dof_vel: velocity of a degree of freedom; acc: acceleration
Linear velocity command (x/y axes); angular velocity command (yaw)
Degrees of freedom (DOFs)
The Barrett hand has 4 degrees of freedom in total: 3 finger flexion/extension DOFs and 1 spread-rotation DOF.
Hand control has two modes: position mode (publish absolute joint positions directly; the hand moves to reach them automatically)
and velocity mode (set the angular velocity of a given joint).
The command fields are name (the joint to control), position (target joint position), velocity (target joint velocity),
and effort (the maximum effort to apply; at a given velocity this determines the maximum grip force at which the hand stops).
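These four fields match the layout of ROS's `sensor_msgs/JointState` message, so a position-mode command can be sketched as below (the node, topic, and joint names are hypothetical; the real ones depend on the hand's driver):

```python
import rospy
from sensor_msgs.msg import JointState

rospy.init_node("hand_commander")
# Hypothetical command topic; check the hand driver's documentation.
pub = rospy.Publisher("/hand/joint_command", JointState, queue_size=1)

msg = JointState()
msg.header.stamp = rospy.Time.now()
msg.name = ["finger_1", "finger_2", "finger_3", "spread"]  # hypothetical joints
msg.position = [1.2, 1.2, 1.2, 0.0]  # absolute joint targets (rad)
msg.velocity = [0.5, 0.5, 0.5, 0.0]  # joint speeds (rad/s)
msg.effort   = [0.8, 0.8, 0.8, 0.0]  # effort cap, i.e. the max grip force
pub.publish(msg)
```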
Six-axis force/torque (F/T) sensors are also often called multi-axis force/torque sensors, multi-axis load cells, F/T sensors, or six-axis load cells.
state: the arm's joint angles, the position and orientation of the small red cube, and the target position (target x, y, z)
action: an action is a robot command.
In this example, the action is the set of motor torques of the seven joints, τ1–τ7.
Kinematic data, including joint angles, torques, and the gripper's position, orientation, and open/close state.
The robot end effector, "robot end" for short, is a key component of a robot.
DexHand: a dual tendon-driven biomimetic dexterous hand
teleoperate: to operate a robot remotely (teleoperation)
### Algorithms
Reinforcement learning; imitation learning
The basic reinforcement-learning loop (in embodied AI):
State: sensor inputs from the agent, e.g. camera images, LiDAR, IMU.
Action: operations the agent can perform, e.g. moving, grasping, rotating.
Reward: feedback the agent receives after reaching (or getting closer to) a goal.
Transition function:
after an action is applied in the current state, the arm enters the next state.
We use the transition function $\mathcal{T}$ to describe how the state changes over time given actions; transitions may be deterministic or stochastic.
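A minimal sketch of this state-action-reward-transition loop, in Gymnasium style (the environment id is just a stand-in; a real embodied task would expose images or joint states instead):

```python
import gymnasium as gym

env = gym.make("CartPole-v1")  # placeholder env; any standard step API works
state, info = env.reset(seed=0)

for t in range(1000):
    action = env.action_space.sample()  # replace with a learned policy
    # env.step applies the transition T(s, a) -> s' and returns the reward
    state, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        state, info = env.reset()
```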
Imitation learning
Rules designed from mechanical first principles belong to "classical robotics".
### Data
A hierarchy of scene series (Series) → tasks (Tasks) → episodes (Episodes).
Datasets
Google's Open X-Embodiment open-source dataset, October 2023:
consolidates 60 existing datasets, covering 311 scenes and over 1 million real robot trajectories, spanning 527 skills and 160,266 tasks.
The LeRobot dataset format
├ hf_dataset: a Hugging Face dataset (backed by Arrow/parquet). Typical features example:
│ ├ observation.images.cam_high (VideoFrame):VideoFrame = {'path': path to a mp4 video, 'timestamp' (float32): timestamp in the video}
│ │
│ ├ observation.state (list of float32): position of an arm joints (for instance)
│ ├ action (list of float32): goal position of an arm joints (for instance)
│ ├ episode_index (int64): index of the episode for this sample
│ ├ frame_index (int64): index of the frame for this sample in the episode ; starts at 0 for each episode
│ ├ timestamp (float32): timestamp in the episode
│ ├ next.done (bool): indicates the end of an episode ; True for the last frame in each episode
│ └ index (int64): general index in the whole dataset
├ episode_data_index: contains 2 tensors with the start and end indices of each episode
│ ├ from (1D int64 tensor): first frame index for each episode — shape (num episodes,), starts at 0
│ └ to (1D int64 tensor): last frame index for each episode — shape (num episodes,)
├ stats: a dictionary of statistics (max, mean, min, std) for each feature in the dataset, for instance
│ ├ observation.images.cam_high: {'max': tensor with same number of dimensions (e.g. `(c, 1, 1)` for images, `(c,)` for states), etc.}
├ info: a dictionary of metadata on the dataset
│ ├ codebase_version (str): this is to keep track of the codebase version the dataset was created with
│ ├ fps (float): frame per second the dataset is recorded/synchronized to
│ ├ video (bool): indicates if frames are encoded in mp4 video files to save space or stored as png files
│ └ encoding (dict): if video, this documents the main options that were used with ffmpeg to encode the videos
├ videos_dir (Path): where the mp4 videos or png images are stored/accessed
└ camera_keys (list of string): the keys to access camera features in the item returned by the dataset (e.g. `["observation.images.cam_high", ...]`)
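Given this layout, a single episode can be sliced out with the `from`/`to` tensors in `episode_data_index`. A sketch assuming the `lerobot` package and a public repo id (`lerobot/pusht` is just an example; whether `to` is inclusive or exclusive depends on the codebase version):

```python
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

ds = LeRobotDataset("lerobot/pusht")  # example repo id

# Frame range of episode 0, per the episode_data_index described above.
start = ds.episode_data_index["from"][0].item()
end = ds.episode_data_index["to"][0].item()

for i in range(start, end):
    frame = ds[i]                     # dict of tensors for one frame
    obs = frame["observation.state"]  # proprioceptive state
    act = frame["action"]             # e.g. goal joint positions
```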
The Unitree dataset
meta/info.json:
"observation.state": {
"dtype": "float32",
"shape": [ 14 ],
"names": [ [
"kLeftWaist", 腰部
"kLeftShoulder", 肩
"kLeftElbow", 肘部
"kLeftForearmRoll", 前臂
"kLeftWristAngle", 手腕
"kLeftWristRotate",
"kLeftGripper",
"kRightWaist", "kRightShoulder", "kRightElbow",
"kRightForearmRoll",
"kRightWristAngle", "kRightWristRotate", "kRightGripper"
] ]},
"observation.state": {
"dtype": "float32",
"shape": [ 16 ],
"names": [
[
"kLeftShoulderPitch", "kLeftShoulderRoll", "kLeftShoulderYaw",
"kLeftElbow",
"kLeftWristRoll", "kLeftWristPitch", "kLeftWristYaw",
"kRightShoulderPitch", "kRightShoulderRoll", "kRightShoulderYaw",
"kRightElbow",
"kRightWristRoll", "kRightWristPitch","kRightWristYaw",
"kLeftGripper", "kRightGripper"
]
]
}
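To map each index of the state vector back to a joint name, `meta/info.json` can be read directly. A sketch, assuming the `observation.state` block above sits under a top-level `features` key as in LeRobot-v2-style info files:

```python
import json

with open("meta/info.json") as f:
    info = json.load(f)

# Assumed location of the block shown above.
names = info["features"]["observation.state"]["names"][0]
for i, name in enumerate(names):
    print(i, name)  # 0 kLeftWaist, 1 kLeftShoulder, ...
```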
The AgiBot World dataset (AgiBot / Zhiyuan Robotics), released October 2024
data
├── task_info
│ ├── task_327.json
├── observations
│ ├── 327 # This represents the task id.
│ │ ├── 648642 # This represents the episode id.
│ │ │ ├── depth # This is a folder containing depth information saved in PNG format.
│ │ │ ├── videos # This is a folder containing videos from all camera perspectives.
├── parameters
│ ├── 327
│ │ ├── 648642
│ │ │ ├── camera
├── proprio_stats
│ ├── 327 # This represents the task id.
│ │ ├── 648642 # This represents the episode id.
│ │ │ ├── proprio_stats.h5 # This file contains all the robot's proprioceptive information.
│ │ ├── 648649
│ │ │ └── proprio_stats.h5
| Group | Shape | Meaning |
|---|---|---|
| /timestamp | [N] | timestamp in nanoseconds |
| /state/effector/position (gripper) | [N, 2] | left [:, 0], right [:, 1], gripper opening in mm |
| /state/effector/position (dexhand) | [N, 12] | left [:, :6], right [:, 6:], joint angles in rad |
| /state/end/orientation | [N, 2, 4] | left [:, 0, :], right [:, 1, :], flange quaternion (xyzw) |
| /state/end/position | [N, 2, 3] | left [:, 0, :], right [:, 1, :], flange xyz in meters |
| /state/head/position | [N, 2] | yaw [:, 0], pitch [:, 1], in rad |
| /state/joint/current_value | [N, 14] | left arm [:, :7], right arm [:, 7:] |
| /state/joint/position | [N, 14] | left arm [:, :7], right arm [:, 7:], in rad |
| /state/robot/orientation | [N, 4] | quaternion (xyzw), yaw only |
| /state/robot/position | [N, 3] | xyz position in meters; z is always 0 |
| /state/waist/position | [N, 2] | pitch [:, 0] in rad, lift [:, 1] in meters |
| /action/*/index | [M] | action indexes marking when the control source is actually sending signals |
| /action/effector/position (gripper) | [N, 2] | left [:, 0], right [:, 1]; 0 = fully open, 1 = fully closed |
| /action/effector/position (dexhand) | [N, 12] | same as /state/effector/position |
| /action/effector/index | [M_1] | index when the end-effector control source is sending control signals |
| /action/end/orientation | [N, 2, 4] | same as /state/end/orientation |
| /action/end/position | [N, 2, 3] | same as /state/end/position |
| /action/end/index | [M_2] | same as other indexes |
| /action/head/position | [N, 2] | same as /state/head/position |
| /action/head/index | [M_3] | same as other indexes |
| /action/joint/position | [N, 14] | same as /state/joint/position |
| /action/joint/index | [M_4] | same as other indexes |
| /action/robot/velocity | [N, 2] | velocity along x axis [:, 0], yaw rate [:, 1] |
| /action/robot/index | [M_5] | same as other indexes |
| /action/waist/position | [N, 2] | same as /state/waist/position |
| /action/waist/index | [M_6] | same as other indexes |
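These groups can be read straightforwardly with `h5py`; a minimal sketch (file name as in the tree above):

```python
import h5py

with h5py.File("proprio_stats.h5", "r") as f:
    ts = f["/timestamp"][:]                    # [N], nanoseconds
    joint_pos = f["/state/joint/position"][:]  # [N, 14], rad
    left_arm = joint_pos[:, :7]                # left-arm joints
    right_arm = joint_pos[:, 7:]               # right-arm joints
    ee_pos = f["/state/end/position"][:]       # [N, 2, 3], flange xyz (m)
    left_flange_xyz = ee_pos[:, 0, :]          # left flange trajectory
```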
Dataset characteristics
1) Images, point clouds, audio, text, and touch, aiming to provide richer and more diverse data to strengthen the robot's perception and interaction abilities.
2) Timestamp-based alignment: multi-sensor data are recorded synchronously based on timestamps, accommodating different sampling frequencies so the multimodal streams stay aligned. Concretely, camera data are recorded at 30 Hz, LiDAR at 10 Hz, proprioceptive data at 200 Hz, and tactile data at 100 Hz.
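One simple way to realize such alignment is nearest-timestamp matching, resampling every stream onto a reference clock. A sketch using the rates quoted above (the data arrays are placeholders):

```python
import numpy as np

def align_to(ref_ts, stream_ts, stream_data):
    """For each reference timestamp, take the stream sample with the
    nearest timestamp (stream_ts must be sorted)."""
    idx = np.clip(np.searchsorted(stream_ts, ref_ts), 1, len(stream_ts) - 1)
    left_closer = (ref_ts - stream_ts[idx - 1]) < (stream_ts[idx] - ref_ts)
    return stream_data[idx - left_closer.astype(int)]

cam_ts = np.arange(0, 10, 1 / 30)              # 30 Hz camera clock
prop_ts = np.arange(0, 10, 1 / 200)            # 200 Hz proprioception
prop = np.random.randn(len(prop_ts), 14)       # placeholder state stream
prop_at_cam = align_to(cam_ts, prop_ts, prop)  # (len(cam_ts), 14)
```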
From physical signals to data
Leader-follower arm zero-position calibration
Reading the calibration file:
homing_offset:
the zero-point offset of each motor, in steps. It is the correction needed to bring the raw motor reading to the calibration target position (usually 0° or 90°).
drive_mode:
the drive direction of each motor; 0 means the original direction (no inversion), 1 means inverted.
start_pos: the motor reading (in steps) when the arm is manually moved to the "zero position", i.e. zero_pos.
end_pos: the motor reading (in steps) when the arm is manually moved to the "rotated position" (90°), i.e. rotated_pos.
calib_mode: the calibration mode of each motor; DEGREE for rotary joints (degrees, range [-180, 180]), LINEAR for linear joints (range [0, 100]).
motor_names: the list of the arm's motor names, in one-to-one correspondence with the other parameters; it specifies which motor each value applies to. For a 6-DOF arm:
shoulder_pan: horizontal shoulder rotation
shoulder_lift: shoulder pitch
elbow_flex: elbow bend
wrist_flex: wrist pitch
wrist_roll: wrist roll
gripper: gripper
When calibration finishes, the terminal prints the joint parameters of the leader and follower arms.
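Putting these fields together, a raw motor reading in steps maps to degrees by linear interpolation between zero_pos and rotated_pos, with drive_mode flipping the sign. A sketch of the idea, not the exact library implementation:

```python
def steps_to_degrees(raw, zero_pos, rotated_pos, drive_mode):
    """Linear map: zero_pos -> 0 deg, rotated_pos -> 90 deg.
    drive_mode == 1 inverts the motor's direction."""
    sign = -1.0 if drive_mode == 1 else 1.0
    return sign * (raw - zero_pos) * 90.0 / (rotated_pos - zero_pos)

# Example: zero read 2048 steps, the 90-degree pose read 3072 steps.
print(steps_to_degrees(2560, zero_pos=2048, rotated_pos=3072, drive_mode=0))  # 45.0
```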
Revolute-joint robots
The D-H parameters comprise four kinematic parameters: α (link twist), a (link length), d (link offset), and θ (joint angle).
The first two describe the link itself,
and the last two describe how adjacent links are connected.
This convention of describing a mechanism's kinematics through link parameters is called the Denavit-Hartenberg method, and the parameters are called D-H parameters.
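In the classic D-H convention, these four parameters define the homogeneous transform from frame $i{-}1$ to frame $i$:

$$
T_i = \mathrm{Rot}_z(\theta_i)\,\mathrm{Trans}_z(d_i)\,\mathrm{Trans}_x(a_i)\,\mathrm{Rot}_x(\alpha_i)
=
\begin{bmatrix}
\cos\theta_i & -\sin\theta_i\cos\alpha_i & \sin\theta_i\sin\alpha_i & a_i\cos\theta_i \\
\sin\theta_i & \cos\theta_i\cos\alpha_i & -\cos\theta_i\sin\alpha_i & a_i\sin\theta_i \\
0 & \sin\alpha_i & \cos\alpha_i & d_i \\
0 & 0 & 0 & 1
\end{bmatrix}
$$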
Kinematics:
Forward kinematics: given the values of all joint variables, determine the position and orientation of the end effector.
Inverse kinematics: given the desired position and orientation of the tool frame relative to the fixed frame, solve for the joint angles.
Forward kinematics (FK) is the process of computing the position and orientation of the end effector (usually the robot's gripper or tool) from the joint angles or other input parameters.
It maps the input variables (joint angles) to the output variables (end-effector position and orientation).
Computing FK typically involves a few steps: determine the type and kinematic parameters of each joint, build the transform for each link, and chain-multiply the transforms from the base to the end effector.
A solution to the inverse kinematics problem does not always exist, and even when it does, it is not necessarily unique.
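Forward kinematics then amounts to chain-multiplying one D-H transform per joint. A minimal numpy sketch; the D-H table values are illustrative (a 2-link planar arm):

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Homogeneous transform for one joint, classic D-H convention."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(dh_table):
    """Chain-multiply per-joint transforms from base to end effector."""
    T = np.eye(4)
    for theta, d, a, alpha in dh_table:
        T = T @ dh_transform(theta, d, a, alpha)
    return T  # rotation in T[:3, :3], position in T[:3, 3]

# Illustrative 2-link planar arm: both joints at 30 deg, links 0.3 m.
dh = [(np.deg2rad(30), 0.0, 0.3, 0.0),
      (np.deg2rad(30), 0.0, 0.3, 0.0)]
print(forward_kinematics(dh)[:3, 3])  # end-effector xyz
```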
Robots
Robot proprioception
Joint state:
Position: joint angle (e.g. an arm's elbow bent to 90°);
Velocity: the joint's motion rate (used for smooth control);
Torque: actuation-force feedback (to avoid overload or collisions).
Link state:
End-effector pose: e.g. the gripper's 3D coordinates and orientation, which determine grasping accuracy.
Haptic/tactile perception
Electric motors: drive wheels, joints, etc. (e.g. the leg motion of a quadruped robot)
End effectors: e.g. grippers and suction cups, for grasping objects
Transformations between joint coordinate frames are described with Denavit-Hartenberg (DH) parameters.