Embodied Intelligence: Datasets and Basic Concepts

Datasets

Multi-source heterogeneous data are collected, covering vision, touch, force, motion trajectories, and the robot's own proprioceptive state.
   Collection methods fall into two broad categories: real-world data and simulated data.

Keywords

Robot types -- single-arm, dual-arm, and quadruped robots

   Embodiment: mainstream robot arms (UR5, Franka, Kuka, Flexiv),
               4 grippers (Dahuan AG95, WSG50, Robotiq-85, Franka),
               3 force sensors (OptoForce, ATI Axia80-M20, Franka)
  Environment         :
  Scene               :
  Task                : task length; long-horizon tasks; average trajectory length of 655 frames; task_name
  Manipulations       : Picking, Moving, Pushing, Placing, Wiping, Assembling, Turning on, etc. --
                        from basic pick, place, push, and pull operations to fine-grained, long-horizon,
                        bimanual complex interactions such as stirring, folding, and ironing
  Manipulated objects : object types; materials span 9 main categories such as wood, carpet, and stone
  Episode             :
  Trajectory          :
  Agent               :
  Reward function     : a code module that quantifies the effect of an action; each sub-term
                        (e.g., velocity, angular velocity, acceleration, collision) carries its own penalty or reward
  Output              : a 7-dimensional discrete end-effector action (6D for EE: x, y, z, roll, pitch, yaw, and 1D for
                        the gripper) plus a trajectory-state flag (terminate_episode); a sketch follows below
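
  As a concrete illustration of this output format, a hedged sketch follows; the values and
  variable names are made up, only the ordering (6D EE pose, 1D gripper, terminate flag) comes
  from the text:

```python
import numpy as np

# Illustrative 7-D end-effector action; the ordering follows the text above,
# the concrete values and names are invented for the example.
ee_action = np.array([0.01, 0.00, -0.02,   # x, y, z
                      0.00, 0.00,  0.10,   # roll, pitch, yaw
                      1.0])                # 1D gripper command
terminate_episode = 0                      # trajectory-state flag
```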
  
   Linear velocity, angular velocity, orientation
## Robot
    robot, joint, gripper
    head
    waist: the robot's waist joint, with two degrees of freedom (pitch and lift)
    end effector
    wheeled omnidirectional mobile base; robot arm

    Controller:
    Sensors: force/torque sensor, also called a torque sensor
    Actuators
    End effector: gripper, dexterous hand
  
### Mechanics and Physics
    torque (torques)
    dof_vel: velocity of each degree of freedom; acceleration (acc)
    linear velocity command (x/y axes); angular velocity command (yaw)
    Degrees of freedom (DOFs, Degrees of Freedom)
        The Barrett hand has 4 degrees of freedom: 3 finger-flexion DOFs and 1 spread/rotation DOF.
        Hand control offers a position mode (publish an absolute joint position; the hand moves there automatically)
                       and a velocity mode (set the angular velocity of a given joint).
            Command fields: name (the joint to control), position (target joint position), velocity (target joint velocity),
            effort (maximum effort; at a given velocity it determines the maximum grip force at which the hand stops).
            These fields mirror ROS's sensor_msgs/JointState message; a publishing sketch follows below.
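
A hedged publishing sketch, assuming a ROS 1 setup; the topic and joint names are invented for
illustration and will differ per driver:

```python
# Publish a position-mode hand command as a JointState message.
# Topic and joint names below are assumptions, not a real driver interface.
import rospy
from sensor_msgs.msg import JointState

rospy.init_node("hand_commander")
pub = rospy.Publisher("/hand/command", JointState, queue_size=1)
rospy.sleep(1.0)               # give the publisher time to connect

msg = JointState()
msg.name = ["finger_1_joint"]  # joint to control (assumed name)
msg.position = [1.2]           # target joint position
msg.velocity = [0.1]           # joint velocity
msg.effort = [0.5]             # maximum effort (limits grip force)
pub.publish(msg)
```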
            A six-axis force/torque (F/T) sensor is also often called a multi-axis force/torque sensor,
            a multi-axis load cell, an F/T sensor, or a six-axis load cell.
    state: the arm's joint angles; the position and orientation of the small red cube; the goal position (target x, y, z)
    action: an action is a robot command.
         In this example it is the motor torques of the seven joints (τ1-τ7).
    Kinematic data include joint angles, torques, and the gripper's position, orientation, and open/close state.

    The robot manipulator (robot end effector), "robot end" for short, is a key robot component, also called the end effector.
    DexHand, a dual tendon-driven bionic dexterous hand

    teleoperate: remote operation
### Algorithms
    Reinforcement learning and imitation learning
        The basic reinforcement-learning loop (in embodied intelligence):
            State: sensor input from the agent, e.g., camera images, LiDAR, IMU.
            Action: an operation the agent can execute, e.g., moving, grasping, rotating.
            Reward: feedback the agent receives after completing (or getting closer to) a goal.
            Transition function:
                 once we apply an action in the current state, the arm enters the next state.
                 The transition function $\mathcal{T}$ describes how the state evolves over time under actions;
                 transitions can be deterministic or stochastic. A minimal sketch of the loop follows below.
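
A minimal sketch of this state/action/reward/transition loop, using a toy 1-DOF arm environment
written just for the example (not a real simulator API):

```python
import numpy as np

class ToyArmEnv:
    """Toy 1-DOF arm: the state is a joint angle, the action a torque."""
    def __init__(self, target=np.pi / 2):
        self.target = target
        self.theta = 0.0

    def reset(self):
        self.theta = 0.0
        return self.theta                         # state

    def step(self, torque):
        self.theta += 0.1 * torque                # deterministic transition T(s, a)
        reward = -abs(self.target - self.theta)   # closer to the goal = higher reward
        done = abs(self.target - self.theta) < 1e-2
        return self.theta, reward, done

env = ToyArmEnv()
state = env.reset()
for _ in range(100):
    action = np.clip(env.target - state, -1.0, 1.0)  # stand-in policy
    state, reward, done = env.step(action)
    if done:
        break
```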
        Imitation learning
			 
    Designing rules from mechanics first principles belongs to "Classic Robotics".
	
### Data
    A hierarchy of scene series (Series) - tasks (Tasks) - episodes (Episodes)

Datasets

Google's Open X-Embodiment open-source dataset, October 2023
     It consolidates 60 existing datasets covering 311 scenes and over 1 million real robot trajectories, with 527 skills and 160,266 tasks.

The LeRobot dataset

  ├ hf_dataset: a Hugging Face dataset (backed by Arrow/parquet). Typical features example:
  │  ├ observation.images.cam_high (VideoFrame):VideoFrame = {'path': path to a mp4 video, 'timestamp' (float32): timestamp in the video}
  │  │   
  │  ├ observation.state (list of float32): position of an arm joints (for instance)
  │  ├ action (list of float32): goal position of an arm joints (for instance)
  │  ├ episode_index (int64): index of the episode for this sample
  │  ├ frame_index (int64): index of the frame for this sample in the episode ; starts at 0 for each episode
  │  ├ timestamp (float32): timestamp in the episode
  │  ├ next.done (bool): indicates the end of an episode ; True for the last frame in each episode
  │  └ index (int64): general index in the whole dataset
  ├ episode_data_index: contains 2 tensors with the start and end indices of each episode
  │  ├ from (1D int64 tensor): first frame index for each episode — shape (num episodes,) starts with 0
  │  └ to (1D int64 tensor): last frame index for each episode — shape (num episodes,)
  ├ stats: a dictionary of statistics (max, mean, min, std) for each feature in the dataset, for instance
  │  ├ observation.images.cam_high: {'max': tensor with same number of dimensions (e.g. `(c, 1, 1)` for images, `(c,)` for states), etc.}
  ├ info: a dictionary of metadata on the dataset
  │  ├ codebase_version (str): this is to keep track of the codebase version the dataset was created with
  │  ├ fps (float): frame per second the dataset is recorded/synchronized to
  │  ├ video (bool): indicates if frames are encoded in mp4 video files to save space or stored as png files
  │  └ encoding (dict): if video, this documents the main options that were used with ffmpeg to encode the videos
  ├ videos_dir (Path): where the mp4 videos or png images are stored/accessed
  └ camera_keys (list of string): the keys to access camera features in the item returned by the dataset (e.g. `["observation.images.cam_high", ...]`)
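
A hedged sketch of slicing one episode out of this structure, using only the fields documented
above; it follows the description of `to` as the last frame index, so adjust the bound if your
version treats `to` as exclusive:

```python
# Slice one episode from a LeRobot-style dataset using the fields above.
def episode_frames(hf_dataset, episode_data_index, ep):
    start = int(episode_data_index["from"][ep])
    end = int(episode_data_index["to"][ep])
    for i in range(start, end + 1):  # 'to' is the last frame index (see above)
        yield hf_dataset[i]

# Each yielded item is a dict with keys such as "observation.state",
# "action", "frame_index", "timestamp", and "next.done".
```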

Unitree dataset

    meta/info.json:
    "observation.state": {
        "dtype": "float32",
        "shape": [ 14 ],
        "names": [ [
                "kLeftWaist",        waist
                "kLeftShoulder",     shoulder
                "kLeftElbow",        elbow
                "kLeftForearmRoll",  forearm
                "kLeftWristAngle",   wrist
                "kLeftWristRotate",
                "kLeftGripper",

                "kRightWaist",  "kRightShoulder", "kRightElbow",
                "kRightForearmRoll",
                "kRightWristAngle",  "kRightWristRotate",  "kRightGripper"
            ] ]},

     "observation.state": {
            "dtype": "float32",
            "shape": [  16 ],
            "names": [
                [
                    "kLeftShoulderPitch", "kLeftShoulderRoll", "kLeftShoulderYaw",
                    "kLeftElbow",
                    "kLeftWristRoll", "kLeftWristPitch", "kLeftWristYaw",
                    "kRightShoulderPitch", "kRightShoulderRoll", "kRightShoulderYaw",
                    "kRightElbow",
                    "kRightWristRoll", "kRightWristPitch","kRightWristYaw",
                    "kLeftGripper", "kRightGripper"
                ]
            ]
        } 
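
A minimal reading sketch for this layout; the nesting under a `features` key is an assumption
(LeRobot-style `info.json` files store feature specs there), so the code falls back to the top
level:

```python
import json

with open("meta/info.json") as f:
    info = json.load(f)

# Assumption: "observation.state" may sit under a "features" key;
# fall back to the top level otherwise.
spec = info.get("features", info)["observation.state"]
print(spec["dtype"], spec["shape"])   # e.g. float32 [14]
for name in spec["names"][0]:         # names is a nested list (see above)
    print(name)
```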

AgiBot World dataset (Zhiyuan Robotics), released October 2024

data
   ├── task_info
   │   ├── task_327.json
   ├── observations
   │   ├── 327 # This represents the task id.
   │   │   ├── 648642 # This represents the episode id.
   │   │   │   ├── depth # This is a folder containing depth information saved in PNG format.
   │   │   │   ├── videos # This is a folder containing videos from all camera perspectives.
   ├── parameters
   │   ├── 327
   │   │   ├── 648642
   │   │   │   ├── camera
   ├── proprio_stats
   │   ├── 327[task_id]
   │   │   ├── 648642[episode_id]
   │   │   │   ├── proprio_stats.h5 # This file contains all the robot's proprioceptive information.
   │   │   ├── 648649
   │   │   │   └── proprio_stats.h5
   
| Group | Shape | Meaning |
| --- | --- | --- |
| /timestamp | [N] | timestamp in nanoseconds |
| /state/effector/position (gripper) | [N, 2] | left [:, 0], right [:, 1], gripper opening in mm |
| /state/effector/position (dexhand) | [N, 12] | left [:, :6], right [:, 6:], joint angles in rad |
| /state/end/orientation | [N, 2, 4] | left [:, 0, :], right [:, 1, :], flange quaternion (xyzw) |
| /state/end/position | [N, 2, 3] | left [:, 0, :], right [:, 1, :], flange xyz in meters |
| /state/head/position | [N, 2] | yaw [:, 0], pitch [:, 1], in rad |
| /state/joint/current_value | [N, 14] | left arm [:, :7], right arm [:, 7:] |
| /state/joint/position | [N, 14] | left arm [:, :7], right arm [:, 7:], in rad |
| /state/robot/orientation | [N, 4] | quaternion (xyzw), yaw only |
| /state/robot/position | [N, 3] | xyz position in meters; z is always 0 |
| /state/waist/position | [N, 2] | pitch [:, 0] in rad, lift [:, 1] in meters |
| /action/*/index | [M] | action indexes mark when the control source is actually sending signals |
| /action/effector/position (gripper) | [N, 2] | left [:, 0], right [:, 1]; 0 = fully open, 1 = fully closed |
| /action/effector/position (dexhand) | [N, 12] | same as /state/effector/position |
| /action/effector/index | [M_1] | index where the end-effector control source is sending control signals |
| /action/end/orientation | [N, 2, 4] | same as /state/end/orientation |
| /action/end/position | [N, 2, 3] | same as /state/end/position |
| /action/end/index | [M_2] | same as the other indexes |
| /action/head/position | [N, 2] | same as /state/head/position |
| /action/head/index | [M_3] | same as the other indexes |
| /action/joint/position | [N, 14] | same as /state/joint/position |
| /action/joint/index | [M_4] | same as the other indexes |
| /action/robot/velocity | [N, 2] | velocity along x axis [:, 0], yaw rate [:, 1] |
| /action/robot/index | [M_5] | same as the other indexes |
| /action/waist/position | [N, 2] | same as /state/waist/position |
| /action/waist/index | [M_6] | same as the other indexes |
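
A hedged sketch of reading the proprioceptive file with h5py, using the group paths and shapes
from the table (the task and episode ids are the ones shown in the directory tree):

```python
import h5py

# Task 327, episode 648642, as in the directory tree above.
path = "data/proprio_stats/327/648642/proprio_stats.h5"
with h5py.File(path, "r") as f:
    ts = f["timestamp"][:]                    # [N], nanoseconds
    joint_pos = f["state/joint/position"][:]  # [N, 14], rad
    left_arm = joint_pos[:, :7]               # left-arm joints
    right_arm = joint_pos[:, 7:]              # right-arm joints
    ee_xyz = f["state/end/position"][:]       # [N, 2, 3], flange xyz in meters
print(ts.shape, left_arm.shape, ee_xyz.shape)
```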

Dataset characteristics

1) Modalities: images, point clouds, audio, text, and touch, providing richer and more diverse data to strengthen the robot's perception and interaction capabilities.
2) Timestamp-based alignment: multi-sensor streams are recorded synchronously against timestamps, accommodating different sampling rates so the multimodal data stay aligned. Concretely, camera data are recorded at 30 Hz, LiDAR at 10 Hz, proprioception at 200 Hz, and tactile data at 100 Hz; a nearest-neighbor alignment sketch follows below.
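
A sketch of such nearest-neighbor alignment, assuming plain numpy and the sampling rates quoted
above; `align_nearest` is an illustrative helper, not part of any dataset toolkit:

```python
import numpy as np

def align_nearest(ref_ts, src_ts):
    """For each timestamp in ref_ts, return the index of the nearest src_ts."""
    idx = np.searchsorted(src_ts, ref_ts)
    idx = np.clip(idx, 1, len(src_ts) - 1)
    left, right = src_ts[idx - 1], src_ts[idx]
    idx -= (ref_ts - left) < (right - ref_ts)  # step back where the left neighbor is closer
    return idx

cam_ts = np.arange(0.0, 1.0, 1 / 30)     # 30 Hz camera timestamps (seconds)
prop_ts = np.arange(0.0, 1.0, 1 / 200)   # 200 Hz proprioception timestamps
prop_idx = align_nearest(cam_ts, prop_ts)  # one proprio sample per camera frame
```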

From Physical Signals to Data

Leader-follower arm zero-position calibration
Reading the calibration file:
      homing offset:
            the zero offset of each motor (Homing Offset), in steps: the correction needed to map the raw motor reading onto the calibration target position (usually 0° or 90°).
      drive mode:
           the drive direction of each motor (DriveMode); 0 means the original direction (no inversion), 1 means the direction is inverted.
      start pos : the motor reading (in steps) when the arm is manually moved to the "zero position" (Zero Position), i.e., zero_pos.
      end pos   : the motor reading (in steps) when the arm is manually moved to the "rotated position" (Rotated Position, 90°), i.e., rotated_pos.
      calib mode: the calibration mode of each motor; DEGREE for revolute joints (degrees, range [-180, 180]), LINEAR for linear joints (range [0, 100]).
      motor names: the list of motor names of the arm, in one-to-one correspondence with the other parameters; it specifies which motor each parameter applies to, for a 6-DOF arm:

   shoulder pan: horizontal shoulder rotation
   shoulder lift: shoulder pitch
   elbow flex: elbow bending
   wrist flex: wrist pitch
   wrist roll: wrist roll
   gripper: gripper
 
When calibration finishes, the terminal prints the joint parameters of the leader and follower arms; a decoding sketch follows below.
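
A hedged sketch of how a raw reading could be decoded with these parameters; the formula and the
4096 steps-per-revolution constant are illustrative assumptions, not the exact calibration code:

```python
def steps_to_degrees(raw, homing_offset, drive_mode, steps_per_rev=4096):
    """Convert a raw motor reading (steps) to a joint angle in degrees.

    steps_per_rev=4096 is a typical servo resolution, assumed here.
    """
    steps = raw + homing_offset   # apply the zero-point correction
    if drive_mode:                # 1 means the rotation direction is inverted
        steps = -steps
    return steps * 360.0 / steps_per_rev
```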
   Revolute-joint robots
   The D-H parameters are four kinematic parameters: α (link twist), a (link length), d (link offset), and θ (joint angle).
      The first two describe the link itself,
      the last two describe how adjacent links are connected.
      This convention of describing a mechanism's kinematics through link parameters is the Denavit-Hartenberg method, and the parameters are the D-H parameters.
Kinematics:
    Forward kinematics: given the value of every joint variable, determine the position and orientation of the end effector.
    Inverse kinematics: given the desired position and orientation of the tool frame relative to the fixed frame, solve for the joint angles.
Forward kinematics (Forward Kinematics, FK) is the process of computing the position and orientation of the end effector (usually the robot's gripper or tool) from the joint angles or other inputs.
It maps the input variables (joint angles) to the output variables (the end-effector pose).
   Deriving the FK equations typically involves several steps: determine the type and motion parameters of each joint, ...
 An inverse kinematics solution does not always exist, and even when one exists it need not be unique. A D-H forward-kinematics sketch follows below.
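
A sketch of the standard D-H link transform and the FK chain built from it; numpy only, and the
2-link example parameters are made up:

```python
import numpy as np

def dh_transform(alpha, a, d, theta):
    """Homogeneous transform of one link in the standard D-H convention."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.,       sa,       ca,      d],
        [0.,       0.,       0.,     1.],
    ])

def forward_kinematics(dh_table, joint_angles):
    """Chain the link transforms: base-to-end-effector pose."""
    T = np.eye(4)
    for (alpha, a, d), theta in zip(dh_table, joint_angles):
        T = T @ dh_transform(alpha, a, d, theta)
    return T

# Example: a planar 2-link arm (alpha = d = 0, link lengths 0.3 m and 0.2 m).
dh_table = [(0.0, 0.3, 0.0), (0.0, 0.2, 0.0)]
pose = forward_kinematics(dh_table, [np.pi / 4, np.pi / 6])
print(pose[:3, 3])  # end-effector xyz
```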


Robot

 Robot proprioception (Robot Proprioception)
     Joint state:
         Position: joint angles (e.g., an arm elbow bent to 90°);
         Velocity: joint speed (used for smooth control);
         Torque: actuation-force feedback (to avoid overload or collisions).
     Link state:
         End-effector pose: e.g., the gripper's 3-D coordinates and orientation, which determine grasping accuracy
 Haptic/tactile perception (Haptic/Tactile Perception)

 Electric motors: drive the wheels, joints, etc. (e.g., the leg motion of a quadruped robot)
 End effectors: e.g., grippers and suction cups, used to grasp objects
 Denavit-Hartenberg (DH) parameters are used to describe the transformations between joint frames;