Contextual Bandits Algorithms

Preface

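Bandits -> Contextual Bandits -> RL: three directions, each a step up from the one before. Compared with plain bandits, contextual bandits have the advantage of features (the context); compared with RL, the reward feedback is one-step, arriving right after each action. An expert has conveniently put together a comparison of several of these algorithms, so I am taking the chance to study it.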
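To make the "features plus one-step reward" point concrete, here is a minimal sketch of an epsilon-greedy contextual bandit with one linear model per arm. It is not taken from the linked repository; the environment (`true_theta`, one-hot contexts, Gaussian noise) and every name and parameter are assumptions made up for illustration.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def run_contextual_bandit(n_rounds=2000, epsilon=0.1, lr=0.05, seed=0):
    """Epsilon-greedy contextual bandit with a linear reward model per arm.

    Returns the learned per-arm weights and the average reward per round.
    """
    rng = random.Random(seed)
    n_arms, dim = 2, 2
    # Hypothetical environment: arm 0 pays off for context [1, 0],
    # arm 1 pays off for context [0, 1].
    true_theta = [[1.0, 0.0], [0.0, 1.0]]
    weights = [[0.0] * dim for _ in range(n_arms)]
    total_reward = 0.0

    for _ in range(n_rounds):
        # The context (feature vector) is what a plain bandit does not see.
        x = [1.0, 0.0] if rng.random() < 0.5 else [0.0, 1.0]

        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore
        else:
            preds = [dot(w, x) for w in weights]
            arm = max(range(n_arms), key=lambda a: preds[a])  # exploit

        # One-step feedback: the reward depends only on this context and
        # this action, with no state transition as there would be in RL.
        reward = dot(true_theta[arm], x) + rng.gauss(0.0, 0.1)
        total_reward += reward

        # SGD step on squared error, for the chosen arm only.
        err = reward - dot(weights[arm], x)
        weights[arm] = [w + lr * err * xi for w, xi in zip(weights[arm], x)]

    return weights, total_reward / n_rounds


if __name__ == "__main__":
    learned, avg = run_contextual_bandit()
    print("learned weights:", learned)
    print("average reward:", avg)
```

A context-free bandit would keep a single value estimate per arm and could not learn that the best arm changes with `x`; a full RL setting would additionally let the chosen action influence the next context.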

Resources:

GitHub repository: https://github.com/sauxpa/neural_exploration

posted @ 2021-06-03 16:28 Data'Insight

