Post archive for June 3, 2021 - Data'Insight

June 3, 2021

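Summary: Preface. Bandits -> Contextual Bandits -> RL: three directions, each a step up from the last. Compared with bandits, contextual bandits have the advantage of using features; compared with RL, the reward feedback arrives after a single step. An expert happens to have compiled a comparison of these algorithms, so I am taking the chance to study it. Reference links: GitHub: https:// Read more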

posted @ 2021-06-03 16:28 Data'Insight Views (297) Comments (0) Recommendations (0)

ContextualBandits Series

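Summary: Ramblings. Bandits, Contextual Bandits, RL: three directions, each an upgrade of the previous one. CB is RL with a one-step reward and, unlike bandits, it can use feature information. The latest research results are probably neural bandits. I happened to come across a very good comparison of bandit algorithms on GitHub, and since I am working in this area as well, I plan to ... Read more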

posted @ 2021-06-03 11:22 Data'Insight Views (169) Comments (0) Recommendations (0)
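
To make the distinction in the summaries above concrete, here is a minimal sketch of why context (features) helps when the reward arrives after a single step. It is not taken from the posts or from the GitHub comparison they mention. The toy environment, the function name `average_reward`, and all parameter values are assumptions chosen for illustration: there are two discrete contexts and two arms, the better arm depends on the context, and both learners use plain epsilon-greedy.

```python
import random


def average_reward(use_context, steps=5000, eps=0.1, seed=0):
    """Run epsilon-greedy on a toy 2-context, 2-arm problem.

    use_context=False: a plain bandit that ignores the context.
    use_context=True:  a contextual bandit with one value table per context.
    Returns the average reward per step.
    """
    rng = random.Random(seed)
    # Hypothetical environment: P(reward = 1 | context, arm).
    reward_prob = {0: [0.9, 0.1], 1: [0.1, 0.9]}
    counts = {}  # table key -> number of pulls per arm
    values = {}  # table key -> running mean reward per arm
    total = 0.0
    for _ in range(steps):
        ctx = rng.randrange(2)
        # The plain bandit shares a single table across all contexts.
        key = ctx if use_context else None
        n = counts.setdefault(key, [0, 0])
        q = values.setdefault(key, [0.0, 0.0])
        if rng.random() < eps:
            arm = rng.randrange(2)      # explore
        else:
            arm = q.index(max(q))       # exploit (ties go to arm 0)
        # One-step feedback: the reward depends only on this context and arm.
        reward = 1.0 if rng.random() < reward_prob[ctx][arm] else 0.0
        n[arm] += 1
        q[arm] += (reward - q[arm]) / n[arm]  # incremental mean update
        total += reward
    return total / steps


if __name__ == "__main__":
    print("plain bandit     :", average_reward(use_context=False))
    print("contextual bandit:", average_reward(use_context=True))
```

In this toy setup each arm pays off half the time when averaged over contexts, so the plain bandit cannot do better than chance, while the contextual learner can pick the right arm for each context. Full RL would additionally let the chosen action influence the next state, which this sketch deliberately leaves out.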
