Embodied Intelligence: Datasets and Basic Concepts

Datasets

Multi-source heterogeneous data are collected, covering vision, touch, force, motion trajectories, and the robot's own proprioceptive state.
   Collection methods fall into two broad categories: real-world data and simulated data.

Keywords

Robot types -- single-arm, dual-arm, and quadruped robots

   Embodiment: mainstream robot arms (UR5, Franka, Kuka, Flexiv),
               4 grippers (Dahuan AG95, WSG50, Robotiq-85, Franka),
               3 force sensors (OptoForce, ATI Axia80-M20, Franka)
  Environment         :
  Scene               :
  Task                : task length; long-horizon tasks; average trajectory length of 655 frames; task_name
  Manipulations       : Picking, Moving, Pushing, Placing, Wiping, Assembling, Turning on, etc. --
                        from basic pick, place, push, and pull operations to fine-grained, long-horizon,
                        bimanual complex interactions such as stirring, folding, and ironing
  Manipulated objects : object types; materials span 9 main categories such as wood, carpet, and stone
  Episode             :
  Trajectory          :
  Agent               :
  Reward function     : a code module that quantifies the effect of an action; each sub-term
                        (e.g., velocity, angular velocity, acceleration, collision) carries its own penalty or reward
  Output              : a 7-dimensional discrete end-effector action (6D for EE: x, y, z, roll, pitch, yaw, and 1D for
                        the gripper) plus a trajectory-state flag (terminate_episode); a sketch follows below
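
  As a concrete illustration of this output format, a hedged sketch follows; the values and
  variable names are made up, only the ordering (6D EE pose, 1D gripper, terminate flag) comes
  from the text:

```python
import numpy as np

# Illustrative 7-D end-effector action; the ordering follows the text above,
# the concrete values and names are invented for the example.
ee_action = np.array([0.01, 0.00, -0.02,   # x, y, z
                      0.00, 0.00,  0.10,   # roll, pitch, yaw
                      1.0])                # 1D gripper command
terminate_episode = 0                      # trajectory-state flag
```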
  
   Linear velocity, angular velocity, orientation
## Robot
    robot, joint, gripper
    head
    waist: the robot's waist joint, with two degrees of freedom (pitch and lift)
    end effector
    wheeled omnidirectional mobile base; robot arm

    Controller:
    Sensors: force/torque sensor, also called a torque sensor
    Actuators
    End effector: gripper, dexterous hand
  
### Mechanics and Physics
    torque (torques)
    dof_vel: velocity of each degree of freedom; acceleration (acc)
    linear velocity command (x/y axes); angular velocity command (yaw)
    Degrees of freedom (DOFs, Degrees of Freedom)
        The Barrett hand has 4 degrees of freedom: 3 finger-flexion DOFs and 1 spread/rotation DOF.
        Hand control offers a position mode (publish an absolute joint position; the hand moves there automatically)
                       and a velocity mode (set the angular velocity of a given joint).
            Command fields: name (the joint to control), position (target joint position), velocity (target joint velocity),
            effort (maximum effort; at a given velocity it determines the maximum grip force at which the hand stops).
            These fields mirror ROS's sensor_msgs/JointState message; a publishing sketch follows below.
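
A hedged publishing sketch, assuming a ROS 1 setup; the topic and joint names are invented for
illustration and will differ per driver:

```python
# Publish a position-mode hand command as a JointState message.
# Topic and joint names below are assumptions, not a real driver interface.
import rospy
from sensor_msgs.msg import JointState

rospy.init_node("hand_commander")
pub = rospy.Publisher("/hand/command", JointState, queue_size=1)
rospy.sleep(1.0)               # give the publisher time to connect

msg = JointState()
msg.name = ["finger_1_joint"]  # joint to control (assumed name)
msg.position = [1.2]           # target joint position
msg.velocity = [0.1]           # joint velocity
msg.effort = [0.5]             # maximum effort (limits grip force)
pub.publish(msg)
```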
            A six-axis force/torque (F/T) sensor is also often called a multi-axis force/torque sensor,
            a multi-axis load cell, an F/T sensor, or a six-axis load cell.
    state: the arm's joint angles; the position and orientation of the small red cube; the goal position (target x, y, z)
    action: an action is a robot command.
         In this example it is the motor torques of the seven joints (τ1-τ7).
    Kinematic data include joint angles, torques, and the gripper's position, orientation, and open/close state.

    The robot manipulator (robot end effector), "robot end" for short, is a key robot component, also called the end effector.
    DexHand, a dual tendon-driven bionic dexterous hand

    teleoperate: remote operation
### Algorithms
    Reinforcement learning and imitation learning
        The basic reinforcement-learning loop (in embodied intelligence):
            State: sensor input from the agent, e.g., camera images, LiDAR, IMU.
            Action: an operation the agent can execute, e.g., moving, grasping, rotating.
            Reward: feedback the agent receives after completing (or getting closer to) a goal.
            Transition function:
                 once we apply an action in the current state, the arm enters the next state.
                 The transition function $\mathcal{T}$ describes how the state evolves over time under actions;
                 transitions can be deterministic or stochastic. A minimal sketch of the loop follows below.
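
A minimal sketch of this state/action/reward/transition loop, using a toy 1-DOF arm environment
written just for the example (not a real simulator API):

```python
import numpy as np

class ToyArmEnv:
    """Toy 1-DOF arm: the state is a joint angle, the action a torque."""
    def __init__(self, target=np.pi / 2):
        self.target = target
        self.theta = 0.0

    def reset(self):
        self.theta = 0.0
        return self.theta                         # state

    def step(self, torque):
        self.theta += 0.1 * torque                # deterministic transition T(s, a)
        reward = -abs(self.target - self.theta)   # closer to the goal = higher reward
        done = abs(self.target - self.theta) < 1e-2
        return self.theta, reward, done

env = ToyArmEnv()
state = env.reset()
for _ in range(100):
    action = np.clip(env.target - state, -1.0, 1.0)  # stand-in policy
    state, reward, done = env.step(action)
    if done:
        break
```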
        Imitation learning
			 
    Designing rules from mechanics first principles belongs to "Classic Robotics".
	
### Data
    A hierarchy of scene series (Series) - tasks (Tasks) - episodes (Episodes)

Datasets

Google's Open X-Embodiment open-source dataset, October 2023
     It consolidates 60 existing datasets covering 311 scenes and over 1 million real robot trajectories, with 527 skills and 160,266 tasks.

The LeRobot dataset

  ├ hf_dataset: a Hugging Face dataset (backed by Arrow/parquet). Typical features example:
  │  ├ observation.images.cam_high (VideoFrame):VideoFrame = {'path': path to a mp4 video, 'timestamp' (float32): timestamp in the video}
  │  │   
  │  ├ observation.state (list of float32): position of an arm joints (for instance)
  │  ├ action (list of float32): goal position of an arm joints (for instance)
  │  ├ episode_index (int64): index of the episode for this sample
  │  ├ frame_index (int64): index of the frame for this sample in the episode ; starts at 0 for each episode
  │  ├ timestamp (float32): timestamp in the episode
  │  ├ next.done (bool): indicates the end of an episode ; True for the last frame in each episode
  │  └ index (int64): general index in the whole dataset
  ├ episode_data_index: contains 2 tensors with the start and end indices of each episode
  │  ├ from (1D int64 tensor): first frame index for each episode — shape (num episodes,) starts with 0
  │  └ to (1D int64 tensor): last frame index for each episode — shape (num episodes,)
  ├ stats: a dictionary of statistics (max, mean, min, std) for each feature in the dataset, for instance
  │  ├ observation.images.cam_high: {'max': tensor with same number of dimensions (e.g. `(c, 1, 1)` for images, `(c,)` for states), etc.}
  ├ info: a dictionary of metadata on the dataset
  │  ├ codebase_version (str): this is to keep track of the codebase version the dataset was created with
  │  ├ fps (float): frame per second the dataset is recorded/synchronized to
  │  ├ video (bool): indicates if frames are encoded in mp4 video files to save space or stored as png files
  │  └ encoding (dict): if video, this documents the main options that were used with ffmpeg to encode the videos
  ├ videos_dir (Path): where the mp4 videos or png images are stored/accessed
  └ camera_keys (list of string): the keys to access camera features in the item returned by the dataset (e.g. `["observation.images.cam_high", ...]`)
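
A hedged sketch of slicing one episode out of this structure, using only the fields documented
above; it follows the description of `to` as the last frame index, so adjust the bound if your
version treats `to` as exclusive:

```python
# Slice one episode from a LeRobot-style dataset using the fields above.
def episode_frames(hf_dataset, episode_data_index, ep):
    start = int(episode_data_index["from"][ep])
    end = int(episode_data_index["to"][ep])
    for i in range(start, end + 1):  # 'to' is the last frame index (see above)
        yield hf_dataset[i]

# Each yielded item is a dict with keys such as "observation.state",
# "action", "frame_index", "timestamp", and "next.done".
```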

Unitree dataset

    meta/info.json:
    "observation.state": {
        "dtype": "float32",
        "shape": [ 14 ],
        "names": [ [
                "kLeftWaist",        waist
                "kLeftShoulder",     shoulder
                "kLeftElbow",        elbow
                "kLeftForearmRoll",  forearm
                "kLeftWristAngle",   wrist
                "kLeftWristRotate",
                "kLeftGripper",

                "kRightWaist",  "kRightShoulder", "kRightElbow",
                "kRightForearmRoll",
                "kRightWristAngle",  "kRightWristRotate",  "kRightGripper"
            ] ]},

     "observation.state": {
            "dtype": "float32",
            "shape": [  16 ],
            "names": [
                [
                    "kLeftShoulderPitch", "kLeftShoulderRoll", "kLeftShoulderYaw",
                    "kLeftElbow",
                    "kLeftWristRoll", "kLeftWristPitch", "kLeftWristYaw",
                    "kRightShoulderPitch", "kRightShoulderRoll", "kRightShoulderYaw",
                    "kRightElbow",
                    "kRightWristRoll", "kRightWristPitch","kRightWristYaw",
                    "kLeftGripper", "kRightGripper"
                ]
            ]
        } 
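
A minimal reading sketch for this layout; the nesting under a `features` key is an assumption
(LeRobot-style `info.json` files store feature specs there), so the code falls back to the top
level:

```python
import json

with open("meta/info.json") as f:
    info = json.load(f)

# Assumption: "observation.state" may sit under a "features" key;
# fall back to the top level otherwise.
spec = info.get("features", info)["observation.state"]
print(spec["dtype"], spec["shape"])   # e.g. float32 [14]
for name in spec["names"][0]:         # names is a nested list (see above)
    print(name)
```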

AgiBot World dataset (Zhiyuan Robotics), released October 2024

data
   ├── task_info
   │   ├── task_327.json
   ├── observations
   │   ├── 327 # This represents the task id.
   │   │   ├── 648642 # This represents the episode id.
   │   │   │   ├── depth # This is a folder containing depth information saved in PNG format.
   │   │   │   ├── videos # This is a folder containing videos from all camera perspectives.
   ├── parameters
   │   ├── 327
   │   │   ├── 648642
   │   │   │   ├── camera
   ├── proprio_stats
   │   ├── 327[task_id]
   │   │   ├── 648642[episode_id]
   │   │   │   ├── proprio_stats.h5 # This file contains all the robot's proprioceptive information.
   │   │   ├── 648649
   │   │   │   └── proprio_stats.h5
   
| Group | Shape | Meaning |
| --- | --- | --- |
| /timestamp | [N] | timestamp in nanoseconds |
| /state/effector/position (gripper) | [N, 2] | left [:, 0], right [:, 1], gripper opening in mm |
| /state/effector/position (dexhand) | [N, 12] | left [:, :6], right [:, 6:], joint angles in rad |
| /state/end/orientation | [N, 2, 4] | left [:, 0, :], right [:, 1, :], flange quaternion (xyzw) |
| /state/end/position | [N, 2, 3] | left [:, 0, :], right [:, 1, :], flange xyz in meters |
| /state/head/position | [N, 2] | yaw [:, 0], pitch [:, 1], in rad |
| /state/joint/current_value | [N, 14] | left arm [:, :7], right arm [:, 7:] |
| /state/joint/position | [N, 14] | left arm [:, :7], right arm [:, 7:], in rad |
| /state/robot/orientation | [N, 4] | quaternion (xyzw), yaw only |
| /state/robot/position | [N, 3] | xyz position in meters; z is always 0 |
| /state/waist/position | [N, 2] | pitch [:, 0] in rad, lift [:, 1] in meters |
| /action/*/index | [M] | action indexes mark when the control source is actually sending signals |
| /action/effector/position (gripper) | [N, 2] | left [:, 0], right [:, 1]; 0 = fully open, 1 = fully closed |
| /action/effector/position (dexhand) | [N, 12] | same as /state/effector/position |
| /action/effector/index | [M_1] | index where the end-effector control source is sending control signals |
| /action/end/orientation | [N, 2, 4] | same as /state/end/orientation |
| /action/end/position | [N, 2, 3] | same as /state/end/position |
| /action/end/index | [M_2] | same as the other indexes |
| /action/head/position | [N, 2] | same as /state/head/position |
| /action/head/index | [M_3] | same as the other indexes |
| /action/joint/position | [N, 14] | same as /state/joint/position |
| /action/joint/index | [M_4] | same as the other indexes |
| /action/robot/velocity | [N, 2] | velocity along x axis [:, 0], yaw rate [:, 1] |
| /action/robot/index | [M_5] | same as the other indexes |
| /action/waist/position | [N, 2] | same as /state/waist/position |
| /action/waist/index | [M_6] | same as the other indexes |
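
A hedged sketch of reading the proprioceptive file with h5py, using the group paths and shapes
from the table (the task and episode ids are the ones shown in the directory tree):

```python
import h5py

# Task 327, episode 648642, as in the directory tree above.
path = "data/proprio_stats/327/648642/proprio_stats.h5"
with h5py.File(path, "r") as f:
    ts = f["timestamp"][:]                    # [N], nanoseconds
    joint_pos = f["state/joint/position"][:]  # [N, 14], rad
    left_arm = joint_pos[:, :7]               # left-arm joints
    right_arm = joint_pos[:, 7:]              # right-arm joints
    ee_xyz = f["state/end/position"][:]       # [N, 2, 3], flange xyz in meters
print(ts.shape, left_arm.shape, ee_xyz.shape)
```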

Dataset characteristics

1) Modalities: images, point clouds, audio, text, and touch, providing richer and more diverse data to strengthen the robot's perception and interaction capabilities.
2) Timestamp-based alignment: multi-sensor streams are recorded synchronously against timestamps, accommodating different sampling rates so the multimodal data stay aligned. Concretely, camera data are recorded at 30 Hz, LiDAR at 10 Hz, proprioception at 200 Hz, and tactile data at 100 Hz; a nearest-neighbor alignment sketch follows below.
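
A sketch of such nearest-neighbor alignment, assuming plain numpy and the sampling rates quoted
above; `align_nearest` is an illustrative helper, not part of any dataset toolkit:

```python
import numpy as np

def align_nearest(ref_ts, src_ts):
    """For each timestamp in ref_ts, return the index of the nearest src_ts."""
    idx = np.searchsorted(src_ts, ref_ts)
    idx = np.clip(idx, 1, len(src_ts) - 1)
    left, right = src_ts[idx - 1], src_ts[idx]
    idx -= (ref_ts - left) < (right - ref_ts)  # step back where the left neighbor is closer
    return idx

cam_ts = np.arange(0.0, 1.0, 1 / 30)     # 30 Hz camera timestamps (seconds)
prop_ts = np.arange(0.0, 1.0, 1 / 200)   # 200 Hz proprioception timestamps
prop_idx = align_nearest(cam_ts, prop_ts)  # one proprio sample per camera frame
```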

From Physical Signals to Data

Leader-follower arm zero-position calibration
Reading the calibration file:
      homing offset:
            the zero offset of each motor (Homing Offset), in steps: the correction needed to map the raw motor reading onto the calibration target position (usually 0° or 90°).
      drive mode:
           the drive direction of each motor (DriveMode); 0 means the original direction (no inversion), 1 means the direction is inverted.
      start pos : the motor reading (in steps) when the arm is manually moved to the "zero position" (Zero Position), i.e., zero_pos.
      end pos   : the motor reading (in steps) when the arm is manually moved to the "rotated position" (Rotated Position, 90°), i.e., rotated_pos.
      calib mode: the calibration mode of each motor; DEGREE for revolute joints (degrees, range [-180, 180]), LINEAR for linear joints (range [0, 100]).
      motor names: the list of motor names of the arm, in one-to-one correspondence with the other parameters; it specifies which motor each parameter applies to, for a 6-DOF arm:

   shoulder pan: horizontal shoulder rotation
   shoulder lift: shoulder pitch
   elbow flex: elbow bending
   wrist flex: wrist pitch
   wrist roll: wrist roll
   gripper: gripper
 
When calibration finishes, the terminal prints the joint parameters of the leader and follower arms; a decoding sketch follows below.
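
A hedged sketch of how a raw reading could be decoded with these parameters; the formula and the
4096 steps-per-revolution constant are illustrative assumptions, not the exact calibration code:

```python
def steps_to_degrees(raw, homing_offset, drive_mode, steps_per_rev=4096):
    """Convert a raw motor reading (steps) to a joint angle in degrees.

    steps_per_rev=4096 is a typical servo resolution, assumed here.
    """
    steps = raw + homing_offset   # apply the zero-point correction
    if drive_mode:                # 1 means the rotation direction is inverted
        steps = -steps
    return steps * 360.0 / steps_per_rev
```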
   Revolute-joint robots
   The D-H parameters are four kinematic parameters: α (link twist), a (link length), d (link offset), and θ (joint angle).
      The first two describe the link itself,
      the last two describe how adjacent links are connected.
      This convention of describing a mechanism's kinematics through link parameters is the Denavit-Hartenberg method, and the parameters are the D-H parameters.
Kinematics:
    Forward kinematics: given the value of every joint variable, determine the position and orientation of the end effector.
    Inverse kinematics: given the desired position and orientation of the tool frame relative to the fixed frame, solve for the joint angles.
Forward kinematics (Forward Kinematics, FK) is the process of computing the position and orientation of the end effector (usually the robot's gripper or tool) from the joint angles or other inputs.
It maps the input variables (joint angles) to the output variables (the end-effector pose).
   Deriving the FK equations typically involves several steps: determine the type and motion parameters of each joint, ...
 An inverse kinematics solution does not always exist, and even when one exists it need not be unique. A D-H forward-kinematics sketch follows below.
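
A sketch of the standard D-H link transform and the FK chain built from it; numpy only, and the
2-link example parameters are made up:

```python
import numpy as np

def dh_transform(alpha, a, d, theta):
    """Homogeneous transform of one link in the standard D-H convention."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.,       sa,       ca,      d],
        [0.,       0.,       0.,     1.],
    ])

def forward_kinematics(dh_table, joint_angles):
    """Chain the link transforms: base-to-end-effector pose."""
    T = np.eye(4)
    for (alpha, a, d), theta in zip(dh_table, joint_angles):
        T = T @ dh_transform(alpha, a, d, theta)
    return T

# Example: a planar 2-link arm (alpha = d = 0, link lengths 0.3 m and 0.2 m).
dh_table = [(0.0, 0.3, 0.0), (0.0, 0.2, 0.0)]
pose = forward_kinematics(dh_table, [np.pi / 4, np.pi / 6])
print(pose[:3, 3])  # end-effector xyz
```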


Robot

 Robot proprioception (Robot Proprioception)
     Joint state:
         Position: joint angles (e.g., an arm elbow bent to 90°);
         Velocity: joint speed (used for smooth control);
         Torque: actuation-force feedback (to avoid overload or collisions).
     Link state:
         End-effector pose: e.g., the gripper's 3-D coordinates and orientation, which determine grasping accuracy
 Haptic/tactile perception (Haptic/Tactile Perception)

 Electric motors: drive the wheels, joints, etc. (e.g., the leg motion of a quadruped robot)
 End effectors: e.g., grippers and suction cups, used to grasp objects
 Denavit-Hartenberg (DH) parameters are used to describe the transformations between joint frames;