Summary:
Contents: Introduction; TD learning of state values; TD learning of action values: Sarsa; TD learning of action values: Expected Sarsa; TD learning of action values: n-step Sarsa; TD le
posted @ 2024-10-29 21:10
cxy8
Views (174)
Comments (0)
Recommendations (0)
Summary:
Contents: Robbins-Monro algorithm; Stochastic gradient descent; BGD, MBGD, and SGD; Summary. Robbins-Monro algorithm. An iterative algorithm for computing the mean. Stochastic approximation (SA)
posted @ 2024-10-29 14:02
cxy8
Views (238)
Comments (0)
Recommendations (0)
Summary:
Contents: MC Basic; MC Exploring Starts; MC Epsilon-Greedy. MC Basic: moving from model-based reinforcement learning to model-free Reinforceme
posted @ 2024-10-29 09:44
cxy8
Views (166)
Comments (0)
Recommendations (1)
