Post archive for January 28, 2020 - yingfengwu

January 28, 2020

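Summary: Reinforcement learning is generally divided into model-free reinforcement learning (Model-Free RL) and model-based reinforcement learning (Model-Based RL). Model-free RL is further divided into Policy Optimization and Q-learning. Algorithms that use Policy Optimization: Policy Gradient, A2C/A3C, ...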

posted @ 2020-01-28 14:54 yingfengwu Views (2323) Comments (0) Recommendations (0)

