Some basic usage of OpenAI Gym
Registered environments
import gym
from gym import envs
# List the environments currently registered with Gym
env_specs = gym.envs.registry.all()
env_ids = [spec.id for spec in env_specs]
print(env_ids)
# Output
['Copy-v0', 'RepeatCopy-v0', 'ReversedAddition-v0', 'ReversedAddition3-v0', 'DuplicatedInput-v0', 'Reverse-v0', 'CartPole-v0', 'CartPole-v1', 'MountainCar-v0', 'MountainCarContinuous-v0', 'Pendulum-v0', 'Acrobot-v1', 'LunarLander-v2', 'LunarLanderContinuous-v2', 'BipedalWalker-v3', 'BipedalWalkerHardcore-v3', 'CarRacing-v0', 'Blackjack-v0', 'KellyCoinflip-v0', 'KellyCoinflipGeneralized-v0', 'FrozenLake-v0', 'FrozenLake8x8-v0', 'CliffWalking-v0', 'NChain-v0', 'Roulette-v0', 'Taxi-v3', 'GuessingGame-v0', 'HotterColder-v0', 'Reacher-v2', 'Pusher-v2', 'Thrower-v2', 'Striker-v2', 'InvertedPendulum-v2', 'InvertedDoublePendulum-v2', 'HalfCheetah-v2', 'HalfCheetah-v3', 'Hopper-v2', 'Hopper-v3', 'Swimmer-v2', 'Swimmer-v3',
………
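A common follow-up to the listing above is checking which ids of a given environment family are registered. The snippet below is only a sketch: the helper name list_env_ids is hypothetical, and the fallback branch assumes that newer Gym releases expose gym.envs.registry as a plain dict keyed by environment id instead of an object with an all() method.

```python
import gym


def list_env_ids():
    """Return the ids of all registered environments (hypothetical helper)."""
    registry = gym.envs.registry
    if hasattr(registry, "all"):
        # Older Gym releases: the registry object provides all(), as used above
        return [spec.id for spec in registry.all()]
    # Assumption: newer releases store the specs in a dict keyed by environment id
    return list(registry.keys())


env_ids = list_env_ids()
# Keep only the ids that belong to one environment family
cartpole_ids = [env_id for env_id in env_ids if env_id.startswith("CartPole")]
print(cartpole_ids)
```

Filtering on the id prefix is a simple way to see which versions of an environment (for example v0 and v1) are available before calling gym.make.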
Observation space and action space
import gym
from gym import envs
# Observation space and action space
env = gym.make('MountainCar-v0')
obs_space = env.observation_space
obs_space_dim = obs_space.shape[0]
obs_space_high = obs_space.high
obs_space_low = obs_space.low
print(obs_space, obs_space_dim, obs_space_high, obs_space_low)
act_space = env.action_space
act_dim = act_space.n
print(act_space, act_dim)
# Output
Box(-1.2000000476837158, 0.6000000238418579, (2,), float32) 2 [0.6 0.07] [-1.2 -0.07]
Discrete(3) 3
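- In gym, continuous spaces are Box and discrete spaces are Discrete;
- For a continuous space (Box), which covers most observation spaces and the action spaces of some environments, env.observation_space.shape[0] is typically used to get its dimensionality; for a discrete space (Discrete), env.action_space.n gives the number of discrete values (for an action space, the number of available actions);
- For a continuous space, env.observation_space.high returns the upper bound of each dimension; env.observation_space.low likewise returns the lower bounds.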
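The points above can be tried without creating an environment. The sketch below builds two stand-alone spaces by hand; the bounds are copied from the MountainCar-v0 output shown earlier, and the variable names are illustrative only.

```python
import numpy as np
from gym import spaces

# Stand-alone spaces mirroring the MountainCar-v0 output above
obs_space = spaces.Box(
    low=np.array([-1.2, -0.07], dtype=np.float32),
    high=np.array([0.6, 0.07], dtype=np.float32),
    dtype=np.float32,
)
act_space = spaces.Discrete(3)

obs_dim = obs_space.shape[0]  # number of dimensions of the Box
n_actions = act_space.n       # number of values in the Discrete space

# sample() draws a random element; contains() checks membership
obs = obs_space.sample()
action = act_space.sample()
print(obs_dim, n_actions)
print(obs_space.contains(obs), act_space.contains(action))
```

Reading obs_dim and n_actions this way is typically how the input and output sizes of a policy or value network are chosen.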
Random seed
env.seed(0)
This sets the random seed, mainly so that results can be reproduced exactly; it can usually be skipped.
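To illustrate what reproducibility means here, the sketch below seeds a stand-alone action space and checks that the same seed yields the same sequence of samples. It uses the seed() method of a gym space rather than env.seed(0), because a space keeps its own random number generator; the helper name sample_actions is hypothetical. Note also that recent Gym releases (0.26 and later) drop env.seed in favor of env.reset(seed=0).

```python
from gym import spaces


def sample_actions(seed, count=5):
    """Seed a fresh Discrete(3) space and draw `count` actions (hypothetical helper)."""
    act_space = spaces.Discrete(3)
    act_space.seed(seed)  # seeds the space's own random number generator
    return [int(act_space.sample()) for _ in range(count)]


first_run = sample_actions(seed=0)
second_run = sample_actions(seed=0)
# With identical seeds the two sequences are expected to match
print(first_run == second_run)
```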
