Post archive for March 5, 2025 - 霜尘FrostDust

March 5, 2025

offline RL | In-Context Reinforcement Learning Papers Collection

Summary: A collection of high-quality papers on in-context reinforcement learning: Awesome In-Context Reinforcement Learning In-context Reinforcement Learning with Algorithm Distillation Michael Laskin, Luyu Wang, J (read more)

posted @ 2025-03-05 19:38 霜尘FrostDust

RLChina2024 | Jun Wang (汪军): LLM and AI Agents: A Roadmap and Vision towards AGI

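Summary: This post records the key points of the talk (a personal take). Several difficulties in the LLM era: inference-time computation scaling. OpenAI o1 uses RL to explicitly integrate the reasoning steps taken during inference (inference-time computation) (from pr (read more)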

posted @ 2025-03-05 11:23 霜尘FrostDust
