Q-learning and RL implementation - 鱼市口

Aim: Train a model to properly play vintage video games...

Deep Q-learning Algorithm

A very brief summary of the notation:

{A (set of actions), pi (policy), Q (quality of an action at a state), R(s, a, s') (reward: taking action a in state s leads to state s' and yields a specific reward r)}
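To make the notation concrete, here is a minimal sketch of how A, pi, Q and R fit together in one learning step. It uses the simpler tabular form of Q-learning rather than a deep network, and the action names, the learning rate `alpha` and the discount factor `gamma` are assumptions for illustration only; none of them come from the notes above.

```python
from collections import defaultdict

# A: a hypothetical discrete action set (illustrative names only)
ACTIONS = ["left", "right", "noop"]

# Q[(s, a)]: estimated quality of taking action a in state s, default 0.0
Q = defaultdict(float)


def greedy_policy(state):
    """pi: pick the action with the highest Q-value in the given state."""
    return max(ACTIONS, key=lambda a: Q[(state, a)])


def q_update(s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Apply one Q-learning step for the transition (s, a, s') with reward r.

    alpha (learning rate) and gamma (discount) are assumed values.
    """
    # Best value reachable from the next state under the current estimates
    best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
    td_target = r + gamma * best_next
    # Move the old estimate a fraction alpha toward the target
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])
    return Q[(s, a)]
```

In deep Q-learning the table `Q` would be replaced by a neural network that maps a state to one value per action, but the target `r + gamma * max Q(s', a')` keeps the same shape.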

So, if we want to train a model to play a video game like a master, the following modules need to be implemented at a minimum:

a class that captures enough frames (typically consecutive ones) for analyzing the game environment; it might need preprocessing to lower the memory overhead
a class for the NN-based model used in training, covering weight init/update/storage/write/fork/reset; the actions taken in a single play are also recorded for optimization
a class that takes the possible actions and abstracts them to a simple level, so that it can do anything a player would do without running into generative issues at the beginning (it can become more general once the model has matured)
a game-to-model pre-processing module
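As an illustration of the first and last modules in the list, the sketch below keeps the last few consecutive frames and shrinks each one before storing it. The class name `FrameStack`, the helper `preprocess`, the stack size `k` and the downsampling `factor` are all hypothetical choices, and frames are assumed to be plain nested lists of `(r, g, b)` tuples so that the example needs only the standard library.

```python
from collections import deque


def preprocess(frame, factor=2):
    """Convert an RGB frame to grayscale and keep every `factor`-th pixel.

    `frame` is assumed to be a list of rows, each row a list of (r, g, b)
    integer tuples. Both steps reduce the memory needed per stored frame.
    """
    gray = [[(r + g + b) // 3 for (r, g, b) in row] for row in frame]
    return [row[::factor] for row in gray[::factor]]


class FrameStack:
    """Holds the k most recent preprocessed frames as the model's state."""

    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)  # oldest frame is dropped automatically

    def reset(self, first_frame):
        """Start a new episode by filling the stack with the first frame."""
        processed = preprocess(first_frame)
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(processed)
        return self.state()

    def push(self, frame):
        """Add the newest frame and return the updated stacked state."""
        self.frames.append(preprocess(frame))
        return self.state()

    def state(self):
        """Return the stacked frames, oldest first."""
        return list(self.frames)
```

Stacking several consecutive frames lets the model see motion (for example, which way a ball is moving), which a single frame cannot show. A real implementation would likely use arrays from a numerical library instead of nested lists.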

This is just the minimum...

posted on 2023-09-01 19:14 鱼市口
