Post archive for May 9, 2023 - 阿Qi早起了吗

May 9, 2023

Summary: probability density function; expectation; state s; action a; agent; policy Π(a|s); reward r; state transition p(s'|s,a); return (cumulative future reward); discounted return (γ ... Read more
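The summary is cut off right after γ, so the post's own definition of the discounted return is not visible here. As a minimal sketch, assuming the standard textbook form G = r_0 + γ·r_1 + γ²·r_2 + ..., the return and the discounted return could be computed as below. The function name `discounted_return` and the reward values are hypothetical, chosen only for illustration.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**k * rewards[k], accumulated from the last reward backward."""
    g = 0.0
    for r in reversed(rewards):
        # Each step back in time discounts everything after it once more.
        g = r + gamma * g
    return g


# Hypothetical rewards r_t, r_{t+1}, r_{t+2} (not taken from the post).
rewards = [1.0, 0.0, 2.0]

# gamma = 1 gives the plain return (cumulative future reward).
undiscounted = discounted_return(rewards, gamma=1.0)

# gamma < 1 down-weights rewards that arrive later.
discounted = discounted_return(rewards, gamma=0.5)
```

With γ = 1 the function reduces to the undiscounted cumulative reward. A smaller γ makes the agent weigh near-term rewards more heavily.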

posted @ 2023-05-09 17:26 阿Qi早起了吗 Views (87) Comments (0) Recommendations (0)
