Paper Quick Reads | October 2025
Mastering the game of Go with deep neural networks and tree search
- AlphaGo 2016
- Networks trained on human data → self-play reinforcement learning → MCTS (PUCT)
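
To make the PUCT step concrete, here is a minimal sketch of the selection rule, which picks the action maximizing Q(s, a) + c_puct · P(s, a) · sqrt(Σ_b N(s, b)) / (1 + N(s, a)). The function name `puct_select`, the flat-list representation of a node's statistics, and the default `c_puct=1.0` are assumptions for illustration, not details from the paper.

```python
import math


def puct_select(priors, visit_counts, q_values, c_puct=1.0):
    """Return the index of the action with the highest PUCT score.

    priors[a]       -- prior probability P(s, a) from the policy network
    visit_counts[a] -- visit count N(s, a) of the edge
    q_values[a]     -- mean action value Q(s, a) of the edge
    c_puct          -- exploration constant (illustrative default)
    """
    total_visits = sum(visit_counts)
    best_action, best_score = 0, -math.inf
    for a, (p, n, q) in enumerate(zip(priors, visit_counts, q_values)):
        # Exploration bonus: large for high-prior, rarely visited actions.
        u = c_puct * p * math.sqrt(total_visits) / (1 + n)
        score = q + u
        if score > best_score:
            best_action, best_score = a, score
    return best_action
```

With priors `[0.5, 0.3, 0.2]`, visit counts `[10, 0, 0]`, and Q-values `[0.1, 0.0, 0.0]`, the bonus moves the search away from the already well-visited first action and toward the unvisited second one.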
Mastering the game of Go without human knowledge
- AlphaGo Zero 2017
- Discards human data entirely
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics
- 2025
- Intermediate-layer features from the first half of the VLM are of higher quality
- The action expert is a flow matching transformer
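
As a rough illustration of what flow matching means for action generation, the sketch below uses a linear path from noise to action, x_t = (1 - t) · noise + t · action, whose velocity target is action - noise, and samples by Euler integration of a velocity field. The linear-path convention, the function names, and the step count are assumptions; in a real model the transformer action expert, conditioned on VLM features, would play the role of `velocity_fn`.

```python
def interpolate(noise, action, t):
    """Point on the straight path from noise (t=0) to action (t=1)."""
    return [(1 - t) * n + t * a for n, a in zip(noise, action)]


def target_velocity(noise, action):
    """Regression target for the velocity field along the linear path."""
    return [a - n for n, a in zip(noise, action)]


def euler_sample(velocity_fn, noise, steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(noise)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_oracle(action):
    """Stand-in for a trained network: exact field for one target action."""
    def velocity_fn(x, t):
        return [(a - xi) / (1 - t) for a, xi in zip(action, x)]
    return velocity_fn
```

Calling `euler_sample(make_oracle([0.5, -1.0]), [2.0, 3.0])` carries the starting noise to approximately `[0.5, -1.0]`. Training would instead regress a network's output at `interpolate(noise, action, t)` onto `target_velocity(noise, action)`.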
\(π_0\): A Vision-Language-Action Flow Model for General Robot Control
- RSS 2025