pacman-homework-reference-2025.03.19-night - from Teacher Huang

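This project implements search algorithms in the Pac-Man game so that the Pacman agent can find paths and collect food. The assignment is to implement three search algorithms in search.py, namely depth-first search (DFS), uniform-cost search (UCS), and A*, and to implement a non-trivial, consistent heuristic for CornersProblem in searchAgents.py.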

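  1. Depth-first search (DFS)
    • In search.py, depth-first search is built on the util.Stack data structure. Push the start state onto the stack together with an empty action list. Then repeatedly pop a state and check whether it is a goal state. If it is, return the action sequence that reached it. If it is not and the state has not been expanded yet, mark it as visited and push each unvisited successor together with its extended action sequence. Continue until a goal is found or the stack is empty. Marking a state when it is popped rather than when it is pushed keeps the expansion order depth-first.
    def depthFirstSearch(problem):
        stack = util.Stack()
        visited = set()
        stack.push((problem.getStartState(), []))
        while not stack.isEmpty():
            state, actions = stack.pop()
            if problem.isGoalState(state):
                return actions
            if state in visited:
                continue  # already expanded via another path
            visited.add(state)  # mark on pop, not on push
            for successor, action, _ in problem.getSuccessors(state):
                if successor not in visited:
                    stack.push((successor, actions + [action]))
        return []  # no path found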
    
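  2. Uniform-cost search (UCS): in the uniformCostSearch function of search.py, use util.PriorityQueue. Push the start state with priority 0 and keep a set of visited states. Each iteration pops the cheapest state; if it is a goal state, return its path. Otherwise mark it as visited, compute the new path cost for each successor, and push the successor with that cost and its path. A state that was already expanded is skipped when it is popped again. Continue until a goal is found.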
    def uniformCostSearch(problem):
        priority_queue = util.PriorityQueue()
        # Each entry carries (state, actions, path cost); the priority is the path cost.
        priority_queue.push((problem.getStartState(), [], 0), 0)
        visited = set()
        while not priority_queue.isEmpty():
            state, actions, cost = priority_queue.pop()
            if problem.isGoalState(state):
                return actions
            if state in visited:
                continue  # a path at least as cheap already expanded this state
            visited.add(state)
            for successor, action, step_cost in problem.getSuccessors(state):
                new_cost = cost + step_cost  # accumulate instead of re-walking the path
                priority_queue.push((successor, actions + [action], new_cost), new_cost)
        return []  # no path found
    
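  3. A* search: in the aStarSearch function of search.py, A* also uses util.PriorityQueue, combined with a heuristic function. Push the start state with its total cost (initially just the heuristic value) and keep a set of visited states. Each iteration pops the state with the smallest total cost; if it is a goal state, return its path. Otherwise mark it as visited, compute the new total cost for each successor (path cost plus heuristic value), and push the successor with that total cost and its path. A state that was already expanded is skipped when it is popped again. Continue until a goal is found.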
    def aStarSearch(problem, heuristic=nullHeuristic):
        priority_queue = util.PriorityQueue()
        start = problem.getStartState()
        # Each entry carries (state, actions, path cost g); the priority is g + h.
        priority_queue.push((start, [], 0), heuristic(start, problem))
        visited = set()
        while not priority_queue.isEmpty():
            state, actions, cost = priority_queue.pop()
            if problem.isGoalState(state):
                return actions
            if state in visited:
                continue  # already expanded with a total cost at least as low
            visited.add(state)
            for successor, action, step_cost in problem.getSuccessors(state):
                new_cost = cost + step_cost
                priority = new_cost + heuristic(successor, problem)
                priority_queue.push((successor, actions + [action], new_cost), priority)
        return []  # no path found
    
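  4. Heuristic design: the heuristic for CornersProblem in searchAgents.py has to account for the distance from the current position to the target corners and for which corners have already been visited. Use the largest Manhattan distance from the current position to any unvisited corner as the heuristic value. Summing the distances to all unvisited corners is not safe, because the sum can exceed the true remaining cost and is therefore not admissible. The function must be non-trivial, non-negative, and consistent, and it must return 0 at a goal state.
    def cornersHeuristic(state, problem):
        corners = problem.corners
        node = state[0]
        visited_corners = state[1]  # assumes the state stores the corners visited so far
        unvisited = [corner for corner in corners if corner not in visited_corners]
        if not unvisited:
            return 0  # goal state
        # The farthest unvisited corner is a lower bound on the remaining cost.
        return max(util.manhattanDistance(node, corner) for corner in unvisited)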
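Whether a corners heuristic overestimates can be checked by brute force on a small wall-free grid, where Manhattan distance is the exact travel cost. The sketch below is only an illustration: the helper names and the corner coordinates are made up for the example and are not part of the project code. It compares the sum and the maximum of the Manhattan distances against the true cost of the cheapest tour through all corners.

```python
from itertools import permutations


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def true_cost(position, corners):
    """Cheapest tour through all corners on a wall-free grid (brute force)."""
    return min(
        sum(manhattan(a, b) for a, b in zip((position,) + order, order))
        for order in permutations(corners)
    )


# Hypothetical layout: a 6x6 playable area with Pacman standing on one corner.
corners = ((1, 1), (1, 6), (6, 1), (6, 6))
position = (1, 1)

sum_h = sum(manhattan(position, c) for c in corners)
max_h = max(manhattan(position, c) for c in corners)
optimal = true_cost(position, corners)
print(sum_h, max_h, optimal)  # 20 10 15
```

Here the sum (20) exceeds the true cost (15), so a sum-based heuristic is not admissible, while the maximum (10) stays below it.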
    

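After completing the implementations above, run the test commands given in the assignment with pacman.py and autograder.py to verify that the algorithms are correct and efficient.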

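The following is a detailed walkthrough of the three search algorithms and the heuristic design in the UC Berkeley Pac-Man project, with Python code examples: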


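Question 1: Depth-First Search (DFS)

Principle: implemented with a stack, always expanding the deepest node first. The path it finds may not be optimal.
Pac-Man application: simple path exploration, although the resulting path may take a long detour.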

def depthFirstSearch(problem):
    stack = [(problem.getStartState(), [], 0)]
    visited = set()
    while stack:
        state, path, cost = stack.pop()  # LIFO
        if problem.isGoalState(state):
            return path
        if state not in visited:
            visited.add(state)
            # Expand all successors (order not reversed)
            successors = problem.getSuccessors(state)
            for nextState, action, stepCost in successors:
                stack.append((nextState, path + [action], cost + stepCost))
    return []
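To try this outside the Pac-Man codebase, a minimal stand-in problem is enough. The sketch below assumes a hypothetical `GraphProblem` class (not part of the project) that exposes the three methods the search functions call, and it repeats the DFS logic under a different name so that the block runs on its own.

```python
class GraphProblem:
    """Hypothetical stand-in exposing the interface the search functions use."""

    def __init__(self, start, goal, edges):
        self.start = start
        self.goal = goal
        self.edges = edges  # {state: [(next_state, action, step_cost), ...]}

    def getStartState(self):
        return self.start

    def isGoalState(self, state):
        return state == self.goal

    def getSuccessors(self, state):
        return self.edges.get(state, [])


def depth_first_search(problem):
    stack = [(problem.getStartState(), [])]
    visited = set()
    while stack:
        state, path = stack.pop()  # LIFO: most recently pushed state first
        if problem.isGoalState(state):
            return path
        if state not in visited:
            visited.add(state)
            for next_state, action, _ in problem.getSuccessors(state):
                stack.append((next_state, path + [action]))
    return []


edges = {
    "A": [("B", "A->B", 1), ("C", "A->C", 1)],
    "B": [("D", "B->D", 1)],
    "C": [("A", "C->A", 1)],  # cycle back to the start
    "D": [("E", "D->E", 1)],
}
dfs_path = depth_first_search(GraphProblem("A", "E", edges))
print(dfs_path)  # ['A->B', 'B->D', 'D->E']
```

The search tries `C` first because it was pushed last, runs into the cycle back to `A`, skips `A` because it is already in `visited`, and then continues through `B`.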

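Question 2: Uniform-Cost Search (UCS)

Principle: a priority queue ordered by path cost, which guarantees a least-cost solution when step costs are non-negative.
Pac-Man application: computing least-cost paths, for example with a cost function that makes ghost-heavy areas expensive.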

import heapq
import itertools

def uniformCostSearch(problem):
    counter = itertools.count()  # tie-breaker so the heap never compares states
    start = problem.getStartState()
    heap = [(0, next(counter), start, [])]  # (path cost, tie-breaker, state, path)
    visited = {}  # cheapest cost at which each state was expanded
    while heap:
        cost, _, state, path = heapq.heappop(heap)
        if problem.isGoalState(state):
            return path
        if state in visited and visited[state] <= cost:
            continue  # a path at least as cheap was already expanded
        visited[state] = cost
        for nextState, action, stepCost in problem.getSuccessors(state):
            newCost = cost + stepCost
            if newCost < visited.get(nextState, float('inf')):
                heapq.heappush(heap, (newCost, next(counter), nextState, path + [action]))
    return []
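A small weighted graph shows how ordering by path cost differs from counting steps. The sketch below is a simplified, standalone variant: it takes a plain dictionary of edges instead of a project `problem` object, and the function name and graph are made up for the example.

```python
import heapq
import itertools


def uniform_cost_search(start, goal, edges):
    """UCS over a dict graph: edges[state] = [(next_state, step_cost), ...]."""
    counter = itertools.count()  # tie-breaker so equal costs never compare states
    heap = [(0, next(counter), start, [start])]
    expanded = set()
    while heap:
        cost, _, state, path = heapq.heappop(heap)
        if state == goal:
            return cost, path
        if state in expanded:
            continue
        expanded.add(state)
        for next_state, step_cost in edges.get(state, []):
            if next_state not in expanded:
                heapq.heappush(
                    heap,
                    (cost + step_cost, next(counter), next_state, path + [next_state]),
                )
    return None


edges = {
    "S": [("G", 10), ("A", 1)],  # the direct edge is short but expensive
    "A": [("B", 2)],
    "B": [("G", 3)],
}
ucs_result = uniform_cost_search("S", "G", edges)
print(ucs_result)  # (6, ['S', 'A', 'B', 'G'])
```

The one-step route `S -> G` costs 10, so UCS returns the three-step route with total cost 6 instead.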

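Question 3: A* Search

Principle: UCS plus a heuristic function; the node with the smallest total cost g(n) + h(n) is expanded first.
Pac-Man application: quickly finding an optimal path to food or to safety.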

import itertools

def aStarSearch(problem, heuristic=nullHeuristic):
    counter = itertools.count()  # tie-breaker so the heap never compares states
    start = problem.getStartState()
    # (priority g + h, tie-breaker, state, path, path cost g)
    heap = [(heuristic(start, problem), next(counter), start, [], 0)]
    visited = {}  # cheapest actual cost g at which each state was expanded
    while heap:
        priority, _, state, path, cost = heapq.heappop(heap)
        if problem.isGoalState(state):
            return path
        if state in visited and visited[state] <= cost:
            continue
        visited[state] = cost
        for nextState, action, stepCost in problem.getSuccessors(state):
            newCost = cost + stepCost
            newPriority = newCost + heuristic(nextState, problem)
            if newCost < visited.get(nextState, float('inf')):
                heapq.heappush(heap, (newPriority, next(counter), nextState, path + [action], newCost))
    return []

# Manhattan distance heuristic
def manhattanHeuristic(state, problem):
    xy1 = state
    xy2 = problem.goal
    return abs(xy1[0] - xy2[0]) + abs(xy1[1] - xy2[1])
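The effect of the heuristic is easiest to see by counting expanded states. The sketch below is a standalone illustration on an open grid without walls, not project code: `grid_astar` and the grid size are assumptions made for the example. Passing a heuristic that always returns 0 turns the same function into UCS.

```python
import heapq
import itertools


def grid_astar(start, goal, width, height, heuristic):
    """A* on an open width x height grid with unit step costs.

    Returns (path_cost, number_of_expanded_states).
    """
    counter = itertools.count()
    heap = [(heuristic(start, goal), next(counter), 0, start)]
    expanded = set()
    while heap:
        _, _, cost, state = heapq.heappop(heap)
        if state == goal:
            return cost, len(expanded)
        if state in expanded:
            continue
        expanded.add(state)
        x, y = state
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in expanded:
                new_cost = cost + 1
                priority = new_cost + heuristic((nx, ny), goal)
                heapq.heappush(heap, (priority, next(counter), new_cost, (nx, ny)))
    return None


def null_heuristic(state, goal):
    return 0


def manhattan(state, goal):
    return abs(state[0] - goal[0]) + abs(state[1] - goal[1])


ucs_cost, ucs_expanded = grid_astar((0, 4), (8, 4), 9, 9, null_heuristic)
astar_cost, astar_expanded = grid_astar((0, 4), (8, 4), 9, 9, manhattan)
print(ucs_cost, ucs_expanded)
print(astar_cost, astar_expanded)
```

Both runs find a path of cost 8, but with the Manhattan heuristic only the 8 states on the straight line between start and goal are expanded, while the zero heuristic spreads out in every direction first.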

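Question 4: Heuristic Design

Design principles

  1. Admissibility: the heuristic estimate never exceeds the true remaining cost to a goal (guarantees that A* tree search is optimal)
  2. Consistency: h(n) ≤ c(n,n') + h(n') for every successor n' of n (guarantees that A* graph search is optimal without re-expanding nodes)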
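On a small state space the consistency condition can be checked edge by edge. The sketch below is an illustration under stated assumptions: a 5x5 open grid with unit step costs, and helper names (`is_consistent`, `successors`) invented for the example. It confirms that Manhattan distance passes the check and that doubling it does not.

```python
def is_consistent(states, successors, heuristic):
    """Check h(n) <= c(n, n') + h(n') on every edge of a small state space."""
    for state in states:
        for next_state, step_cost in successors(state):
            if heuristic(state) > step_cost + heuristic(next_state):
                return False
    return True


WIDTH, HEIGHT = 5, 5
GOAL = (4, 4)
states = [(x, y) for x in range(WIDTH) for y in range(HEIGHT)]


def successors(state):
    x, y = state
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < WIDTH and 0 <= ny < HEIGHT:
            yield (nx, ny), 1  # unit step cost


def manhattan(state):
    return abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1])


def doubled_manhattan(state):
    # Overestimates the remaining cost, so it is neither admissible nor consistent.
    return 2 * manhattan(state)


manhattan_ok = is_consistent(states, successors, manhattan)
doubled_ok = is_consistent(states, successors, doubled_manhattan)
print(manhattan_ok, doubled_ok)  # True False
```

A check like this covers only the states it enumerates, so it can expose a violation but cannot prove that a heuristic is consistent on every maze.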

Pac-Man heuristic examples

# Distance to the closest food (admissible)
def closestFoodHeuristic(state, problem):
    position, foodGrid = state
    if foodGrid.count() == 0:
        return 0
    minDist = float('inf')
    for food in foodGrid.asList():
        dist = abs(position[0]-food[0]) + abs(position[1]-food[1])
        minDist = min(minDist, dist)
    return minDist

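# Relaxed problem: MST heuristic with walls ignored (admissible)
def mstHeuristic(state, problem):
    # A full version would compute a minimum spanning tree over all food.
    # This simplified stand-in returns the largest Manhattan distance to any food.
    position, foodGrid = state
    return max((abs(position[0]-food[0]) + abs(position[1]-food[1])
                for food in foodGrid.asList()), default=0)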

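# Ghost-aware heuristic (not admissible: the ghost penalty can overestimate,
# so A* loses its optimality guarantee)
def adaptiveHeuristic(state, problem):
    ghostFactor = 10  # penalty weight for being close to a ghost
    position = state[0]
    # Assumes the problem object exposes ghost positions; getGhostPositions()
    # on a search problem is hypothetical here.
    ghosts = problem.getGhostPositions()
    foodHeur = closestFoodHeuristic(state, problem)
    if not ghosts:
        return foodHeur
    ghostDist = min(util.manhattanDistance(position, ghost) for ghost in ghosts)
    return foodHeur + (1.0 / (ghostDist + 1)) * ghostFactor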
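A fuller minimum-spanning-tree heuristic can be sketched with Prim's algorithm over the food positions. The version below works on plain coordinate tuples rather than the project's state objects, and its function names are made up for the example. With walls ignored, any route that eats all the food has to reach some first dot and then link all the dots together, so the distance to the nearest dot plus the MST weight is a lower bound on the true cost.

```python
def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def mst_weight(points):
    """Total weight of a minimum spanning tree over points (Prim's algorithm)."""
    points = list(points)
    if len(points) < 2:
        return 0
    # best[p] = cheapest edge connecting p to the tree built so far
    best = {p: manhattan(points[0], p) for p in points[1:]}
    total = 0
    while best:
        nearest = min(best, key=best.get)
        total += best.pop(nearest)
        for p in best:
            best[p] = min(best[p], manhattan(nearest, p))
    return total


def mst_food_heuristic(position, food_positions):
    """Distance to the nearest food plus the MST weight over all food."""
    food_positions = list(food_positions)
    if not food_positions:
        return 0
    to_nearest = min(manhattan(position, f) for f in food_positions)
    return to_nearest + mst_weight(food_positions)


h_value = mst_food_heuristic((0, 0), [(0, 2), (3, 2), (0, 5)])
print(h_value)  # 8
```

In this example the cheapest wall-free route that eats all three dots costs 11, so the estimate of 8 stays below it. Inside the project the same idea would start from `position, foodGrid = state` and `foodGrid.asList()`, as in the closest-food example.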

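Summary

  • DFS explores quickly but does not guarantee an optimal path
  • UCS guarantees a least-cost path but expands more nodes than an informed search
  • A* uses a heuristic to speed up the search
  • Heuristic design has to balance accuracy against computation cost while staying admissible.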
posted @ 2025-03-19 23:22  kkman2000