Summary:
A collection of high-quality papers on in-context reinforcement learning: Awesome In-Context Reinforcement Learning. In-context Reinforcement Learning with Algorithm Distillation. Michael Laskin, Luyu Wang, J…
posted @ 2025-03-05 19:38
霜尘FrostDust
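Summary:
This post records the key points of this talk (personal notes). Several difficulties in the LLM era. Inference-time computation scaling: OpenAI o1 uses RL to explicitly integrate the reasoning steps taken during inference (inference-time computation) (from pr…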
posted @ 2025-03-05 11:23
霜尘FrostDust
