Post category - Reinforcement Learning
Summary: 1. Markov decision processes formally describe an environment for reinforcement learning where the environment is fully observable. The current state co
Summary: 1. How reinforcement learning differs (from traditional supervised/unsupervised learning): there is no supervisor, only a reward signal (like a child learning by trial and error); feedback is delayed, not instantaneous (an incorrect
