Summary:
1. Markov decision processes formally describe an environment for reinforcement learning where the environment is fully observable. The current state co...
posted @ 2018-11-20 17:01
TaeYoon
Summary:
1. How reinforcement learning differs (from traditional supervised/unsupervised learning): no supervisor, only a reward signal (like a child learning by trial and error); feedback is delayed, not instantaneous (the wrong...
posted @ 2018-11-20 16:59
TaeYoon
