Game Trees: The Minimax Algorithm
2018-08-06 14:16 Lyp_02
Reference: https://www.ics.uci.edu/~eppstein/180a/970417.html
Minimax and negamax search:
Breadth First and Depth First Search; Negamax Code
As described above, the computation of game tree values is breadth first (we compute the values in the tree bottom-up, a single level in the tree at a time). Instead, we can perform a depth-first (post-order) recursive traversal of the tree that evaluates a node by recursively evaluating its children and keeping track of the values seen so far. This is much more space-efficient because it doesn't need to store the whole game tree, only a single path (which would generally be quite short, e.g. eight moves with the example stopping rule above). As we'll see next time when I discuss alpha-beta search, depth-first traversal also has the advantage that you can use the information you've found so far to help decide not to visit certain irrelevant parts of the tree, saving a lot of time.
It's convenient to modify the game tree values slightly, so that we only need maximization operations rather than having to alternate between minima and maxima. At odd levels of the tree (nodes in which the second player is to move), negate the values defined above. Then at each node, these modified values can be found by computing the maximum of the negations of the node's children's values. Maybe this will make more sense if I write down some source code for game tree search:
// search game tree to given depth, return evaluation of root node
double negamax(int depth)
{
    if (depth <= 0 || game is over) return eval(pos);
    else {
        double e = -infty;
        for (each move m available in pos) {
            make move m;
            e = max(e, -negamax(depth - 1));
            unmake move m;
        }
        return e;
    }
}
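To make the pseudocode concrete, here is a minimal runnable sketch in C. The game is an assumption made for illustration and does not come from the notes: a pile of stones, each player removes 1, 2 or 3, and whoever takes the last stone wins. The names pile, game_over, eval and search_pile are hypothetical helpers; only negamax follows the routine above.

```c
#include <math.h>   /* INFINITY */

/* Hypothetical toy game (not from the notes): a pile of stones, each player
   removes 1, 2 or 3 stones, and whoever takes the last stone wins. */
int pile;                              /* the global position ("pos") */

int game_over(void) { return pile == 0; }

/* Evaluation from the point of view of the player to move. An empty pile
   means the opponent just took the last stone, so the player to move has
   lost (-1). Any unfinished position is scored 0 ("unknown"). */
double eval(void) { return game_over() ? -1.0 : 0.0; }

/* search game tree to given depth, return evaluation of root node */
double negamax(int depth)
{
    if (depth <= 0 || game_over()) return eval();
    double e = -INFINITY;
    for (int m = 1; m <= 3 && m <= pile; m++) {
        pile -= m;                     /* make move m */
        double v = -negamax(depth - 1);
        if (v > e) e = v;              /* e = max(e, v) */
        pile += m;                     /* unmake move m */
    }
    return e;
}

/* Convenience wrapper: set up a position, then search it. */
double search_pile(int stones, int depth)
{
    pile = stones;
    return negamax(depth);
}
```

With a deep enough search, search_pile(5, 10) evaluates to 1.0, because the player to move can leave four stones, and search_pile(4, 10) evaluates to -1.0, because every move from four stones leaves a winning pile for the opponent.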
Note that this only finds the evaluation, but doesn't determine which move to actually make. We only need to find an actual move at the root of the tree (although many programs return an entire principal variation). To do this, we slightly modify the search performed at the root:
// search game tree to given depth, return the best move at the root node
move rootsearch(int depth)
{
    double e = -infty;
    move mm;
    for (each move m available in pos) {
        make move m;
        double em = -negamax(depth - 1);
        if (e < em) {
            e = em;
            mm = m;
        }
        unmake move m;
    }
    return mm;
}
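The root search can be sketched in runnable C under the same kind of assumption: a toy game that is not part of the notes, in which players alternately take 1, 2 or 3 stones from a pile and the player who takes the last stone wins. A move is represented simply as the number of stones taken, and rootsearch returns 0 when there is no legal move. Both conventions are choices made for this example only.

```c
#include <math.h>   /* INFINITY */

int pile;           /* hypothetical toy position: stones left in the pile */

/* Negamax evaluation from the point of view of the player to move. */
double negamax(int depth)
{
    if (pile == 0) return -1.0;        /* opponent took the last stone */
    if (depth <= 0) return 0.0;        /* search horizon: score as unknown */
    double e = -INFINITY;
    for (int m = 1; m <= 3 && m <= pile; m++) {
        pile -= m;                     /* make move m */
        double v = -negamax(depth - 1);
        pile += m;                     /* unmake move m */
        if (v > e) e = v;
    }
    return e;
}

/* Search to the given depth and return the best move at the root,
   i.e. the number of stones to take (0 if the pile is already empty). */
int rootsearch(int stones, int depth)
{
    pile = stones;
    double e = -INFINITY;
    int mm = 0;
    for (int m = 1; m <= 3 && m <= pile; m++) {
        pile -= m;                     /* make move m */
        double em = -negamax(depth - 1);
        pile += m;                     /* unmake move m */
        if (e < em) {                  /* strictly better: remember it */
            e = em;
            mm = m;
        }
    }
    return mm;
}
```

Because the comparison is strict, ties are resolved in favor of the first move tried. For example, rootsearch(6, 10) returns 2, which leaves the opponent a losing pile of four stones.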
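Alpha-beta pruning: alpha <= N <= beta
The arguments -beta and -alpha appear because the returned values alternate in sign from one level of the tree to the next.
Suppose the current window is alpha <= N <= beta. One level down, values are seen with the opposite sign, but once the value returned by the child has been negated back it must still satisfy this inequality; if it does not, the branch is pruned. From the child's point of view the condition is therefore -beta <= -N <= -alpha.
At the start of the search, the window [alpha = -infinity, beta = +infinity] is passed down along the depth-first path to the deepest level, where the evaluation function is computed and the window is updated. The value is then propagated back to the parent node, and the updated window is passed from that parent on to the next sibling. Along the way, if a newly obtained value reaches or exceeds beta the search returns immediately; if it is greater than alpha, alpha is updated.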
Alpha-Beta Pseudocode
In general, when a returned value is better than the value of a sibling an even number of levels up in the tree, we can return immediately. If we pass the minimum value of any of these siblings in as a parameter beta to the search, we can do this pruning very efficiently. We also use another parameter alpha to keep track of the siblings at odd levels of the tree. Pruning using these two values is very simple; code to do so is listed below. Like last time, we use the negamax formulation, in which evaluations at alternate levels of the trees are negated.
double alphabeta(int depth, double alpha, double beta)
{
    if (depth <= 0 || game is over) return evaluation();
    generate and sort list of moves available in the position
    for (each move m) {
        make move m;
        double val = -alphabeta(depth - 1, -beta, -alpha);
        unmake move m;
        if (val >= beta) return val;
        if (val > alpha) alpha = val; // unlike the game-tree code in "Game Programming Gems 1", this is the negamax formulation
    }
    return alpha;
}
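One way to see the effect of pruning is to count visited nodes with and without it. The sketch below does that in runnable C on a toy game that is an assumption of this example, not part of the notes: players alternately take 1, 2 or 3 stones, and taking the last stone wins. The counter nodes and the wrappers run_negamax and run_alphabeta are hypothetical helpers. Move sorting is omitted, so moves are simply tried in the order 1, 2, 3.

```c
#include <math.h>   /* INFINITY */

int pile;           /* hypothetical toy position: stones left in the pile */
long nodes;         /* number of positions visited by the current search */

/* Plain negamax, used as the reference result. */
double negamax(int depth)
{
    nodes++;
    if (pile == 0) return -1.0;        /* player to move has lost */
    if (depth <= 0) return 0.0;        /* search horizon */
    double e = -INFINITY;
    for (int m = 1; m <= 3 && m <= pile; m++) {
        pile -= m;
        double v = -negamax(depth - 1);
        pile += m;
        if (v > e) e = v;
    }
    return e;
}

/* Negamax with an alpha-beta window, following the pseudocode above. */
double alphabeta(int depth, double alpha, double beta)
{
    nodes++;
    if (pile == 0) return -1.0;
    if (depth <= 0) return 0.0;
    for (int m = 1; m <= 3 && m <= pile; m++) {
        pile -= m;                             /* make move m */
        double val = -alphabeta(depth - 1, -beta, -alpha);
        pile += m;                             /* unmake move m */
        if (val >= beta) return val;           /* cutoff: skip remaining moves */
        if (val > alpha) alpha = val;          /* raise the lower bound */
    }
    return alpha;
}

/* Wrappers that reset the position and counter, then report both results. */
double run_negamax(int stones, int depth, long *visited)
{
    pile = stones; nodes = 0;
    double v = negamax(depth);
    *visited = nodes;
    return v;
}

double run_alphabeta(int stones, int depth, long *visited)
{
    pile = stones; nodes = 0;
    double v = alphabeta(depth, -INFINITY, INFINITY);
    *visited = nodes;
    return v;
}
```

Started with the full window at the root, both searches return the same value, and alphabeta never visits more nodes than negamax. How much it saves depends on the order in which moves are tried, which is why the pseudocode sorts the move list first.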