Reinforcement Learning Resources

OpenAI: reinforcement learning on a single page https://lilianweng.github.io/posts/2018-04-08-policy-gradient/
- The PPO policy used in ChatGPT https://huggingface.co/blog/rlhf
- https://wandb.ai/ayush-thakur/RLHF/reports/Understanding-Reinforcement-Learning-from-Human-Feedback-RLHF-Part-1--VmlldzoyODk5MTIx

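Classic introductory e-book: Reinforcement Learning: An Introduction
Hung-yi Lee's deep reinforcement learning course (in Mandarin): https://www.bilibili.com/video/av24724071/?p=1
Shanghai Jiao Tong University lecture notes: http://wnzhang.net/tutorials/marl2018/index.html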

Other

Overview of other resources: https://zhuanlan.zhihu.com/p/34918639
Professor Hung-yi Lee's courses: http://speech.ee.ntu.edu.tw/~tlkagk/courses.html

posted @ 2019-07-26 17:40 bregman
