Summary: A curated collection of high-quality papers on in-context reinforcement learning: Awesome In-Context Reinforcement Learning. In-context Reinforcement Learning with Algorithm Distillation. Michael Laskin, Luyu Wang, J Read more
posted @ 2025-03-05 19:38 霜尘FrostDust Views (149) Comments (0) Recommendations (0)
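Summary: This post records the key points of this talk (a personal take). Several difficulties in the LLM era. Inference-time computation scaling: OpenAI o1 uses RL to explicitly integrate the reasoning steps taken during inference (inference-time computation) (from pr Read more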
posted @ 2025-03-05 11:23 霜尘FrostDust Views (39) Comments (1) Recommendations (0)