Speedrun: ReAct Agent + MCP + Human Review + Postgres Memory + ... (work in progress)

Main references:

[1] 南哥 https://www.bilibili.com/video/BV1YWJWzXEro

[2] ChatGPT

I. Simple ReAct + MCP

Versions

Python 3.11

pip install langgraph==0.5.0  # 0.4.5 raises an error here
pip install langchain==0.3.25
pip install langchain-openai==0.3.17
pip install langchain-mcp-adapters==0.1.0

import asyncio
import os
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.prebuilt import create_react_agent
from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain_core.messages import SystemMessage, HumanMessage
from langchain.chat_models import init_chat_model
from typing import Dict, List, Any

Model definition

The basic parameters:

llm = init_chat_model(
    model="openai:qwen-plus",
    temperature=0,
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key=os.environ.get("aliQwen-api")
)

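Temperature setting:

The temperature parameter usually ranges from 0 to 2. Commonly recommended values:

0 to 0.3: more deterministic, repeatable output; suits precise answers, technical documentation, and code. The closer temperature is to 0, the more rigid the output, with almost no randomness.

0.5 to 1.0: a balance of randomness and diversity; suits dialogue, creative writing, and general natural-language generation.

Above 1.0: more creative and diverse output, but it may become incoherent or even nonsensical.

Typical settings

ChatGPT is said to default to roughly 0.7 (fluent answers with some variety).

For code, 0.0 to 0.2 is typical, to keep the output stable and reproducible.

For creative writing and poetry, 1.0 or higher.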
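To make the effect of temperature concrete, here is a minimal sketch, assuming the usual formulation in which the model's logits are divided by the temperature before softmax. The function name `softmax_with_temperature` and the example logits are made up for illustration. Real APIs treat temperature 0 as a special case (greedy decoding), which this sketch does not handle.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return probabilities for `logits` after scaling them by 1/temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At low temperature almost all probability mass lands on the top-scoring token; at higher temperature the distribution flattens, which is where the extra variety (and the risk of incoherence) comes from.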

MCP client instantiation:

client = MultiServerMCPClient({
    # Amap (高德地图) MCP Server
    "amap-amap-sse": {
        "url": "https://mcp.amap.com/sse?key=<your-amap-api-key>",
        "transport": "sse",
    }
})

ReAct-style agent instantiation:

# Fetch all tools exposed by the MCP server
tools = await client.get_tools()
# In-memory checkpointer for short-term memory
checkpointer = InMemorySaver()
# System message that tells the model how to use the tools
system_message = SystemMessage(content=(
    "You are an AI assistant. Use the Amap tools to look up information."
))
# Create the ReAct-style agent
agent = create_react_agent(
    model=llm,
    tools=tools,
    prompt=system_message,
    # prompt="You are a helpful AI assistant.",
    checkpointer=checkpointer
)
# thread_id that identifies the short-term memory thread
config = {"configurable": {"thread_id": "1"}}

Querying

Non-streaming

agent_response = await agent.ainvoke({"messages": [HumanMessage(content="What place is at the coordinates 118.79815,32.01112?")]}, config)
agent_response_content = agent_response["messages"][-1].content
print(f"agent_response: {agent_response_content}")

Streaming

# message_chunk: one streamed token chunk; metadata: info about the node that produced it
async for message_chunk, metadata in agent.astream(
        input={"messages": [HumanMessage(content="What's the weather like in Shanghai?")]},
        config=config,
        stream_mode="messages"
):
    # Skip tool output
    if metadata["langgraph_node"] == "tools":
        continue
    # Print the model's reply as it streams
    if message_chunk.content:
        print(message_chunk.content, end="", flush=True)
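The snippets above use `await` and `async for` at top level, which only works in a notebook or an async REPL; in a script they need to live inside a coroutine started with `asyncio.run`. The sketch below shows that shape together with the same skip-tool-output filter. It uses a stand-in async generator `fake_stream` (hypothetical, with made-up chunks) in place of `agent.astream`, so it runs without a model or network access.

```python
import asyncio
from dataclasses import dataclass

@dataclass
class Chunk:
    """Stand-in for a streamed message chunk; only `content` is modeled."""
    content: str

async def fake_stream():
    """Hypothetical replacement for agent.astream(..., stream_mode="messages")."""
    yield Chunk(""), {"langgraph_node": "agent"}  # empty chunk, e.g. a tool-call request
    yield Chunk('{"weather": "sunny"}'), {"langgraph_node": "tools"}  # raw tool output
    yield Chunk("Shanghai is "), {"langgraph_node": "agent"}
    yield Chunk("sunny today."), {"langgraph_node": "agent"}

async def collect_answer(stream):
    """Apply the same filter as above: drop tool output and empty chunks."""
    parts = []
    async for message_chunk, metadata in stream:
        if metadata["langgraph_node"] == "tools":
            continue
        if message_chunk.content:
            parts.append(message_chunk.content)
    return "".join(parts)

async def main():
    answer = await collect_answer(fake_stream())
    print(answer)
    return answer

if __name__ == "__main__":
    asyncio.run(main())
```

With the real agent, the body of `main` would hold the client, agent, and query code from this section, and `collect_answer` would receive `agent.astream(...)` instead of the stand-in.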

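II. Human review of tool calls

Custom tool example

from langchain_core.tools import tool

# @tool("book_hotel", description="A tool for booking hotels")
@tool("book_hotel", description="A hotel-booking tool that requires human review/approval")
def book_hotel(hotel_name: str):
    return f"Successfully booked a stay at {hotel_name}."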

MCP tool example

client = MultiServerMCPClient({
    # Amap (高德地图) MCP Server
    "amap-amap-sse": {
        "url": "https://mcp.amap.com/sse?key=<your-amap-api-key>",
        "transport": "sse",
    }
})

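Wrapping tools with human review

Note: the response types checked via response["type"] are user-defined. You can add more and give each its own behavior; the type is then specified when the human reviewer replies.

from typing import Callable

from langchain_core.runnables import RunnableConfig
from langchain_core.tools import BaseTool, tool as create_tool
from langgraph.prebuilt.interrupt import HumanInterrupt, HumanInterruptConfig
from langgraph.types import interrupt

# Add human-in-the-loop review to a tool
# Args: tool (a callable or a BaseTool), interrupt_config (optional interrupt settings)
# Returns: a BaseTool with human review built in
async def add_human_in_the_loop(
        tool: Callable | BaseTool,
        *,
        interrupt_config: HumanInterruptConfig | None = None,
) -> BaseTool:
    """Wrap a tool to support human-in-the-loop review."""

    # Check whether the tool is already a BaseTool instance
    if not isinstance(tool, BaseTool):
        # If not, convert the callable into a BaseTool
        tool = create_tool(tool)

    # Check whether an interrupt_config was provided
    if interrupt_config is None:
        # If not, use a default that allows accept, edit, and respond
        interrupt_config = {
            "allow_accept": True,
            "allow_edit": True,
            "allow_respond": True,
        }

    # Define a new tool that reuses the original tool's name, description, and args schema
    @create_tool(
        tool.name,
        description=tool.description,
        args_schema=tool.args_schema
    )
    # Inner function that runs the tool call with the interrupt logic
    async def call_tool_with_interrupt(config: RunnableConfig, **tool_input):
        # Build the human interrupt request: tool name, input arguments, and config
        request: HumanInterrupt = {
            "action_request": {
                "action": tool.name,
                "args": tool_input
            },
            "config": interrupt_config,
            "description": "Please review the tool call"
        }
        # Call interrupt and take the first human review response
        response = interrupt([request])[0]
        # "accept": the reviewer approved the call
        if response["type"] == "accept":
            # Call the original tool with the original arguments and config
            tool_response = await tool.ainvoke(tool_input, config)
        # "edit": the reviewer changed the arguments
        elif response["type"] == "edit":
            # Replace the tool input with the arguments from the response
            tool_input = response["args"]["args"]
            # Call the original tool with the updated arguments
            tool_response = await tool.ainvoke(tool_input, config)
        # "response": the reviewer replied with feedback instead
        elif response["type"] == "response":
            # Use the reviewer's feedback directly as the tool's response
            user_feedback = response["args"]
            tool_response = user_feedback
        # Any other response type is unsupported
        else:
            raise ValueError(f"Unsupported interrupt response type: {response['type']}")

        # Return the tool's response
        return tool_response

    # Return the wrapped tool
    return call_tool_with_interrupt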

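Usage steps

Get the MCP tools (if any)

all_tools = await client.get_tools()

Wrap the tools to add human review

tools = [await add_human_in_the_loop(t) for t in all_tools]

Custom tools (same approach)

tools.append(await add_human_in_the_loop(book_hotel))

Creating the agent

Same as before.

Sending a request

Streaming or non-streaming. When a tool is called, the agent run pauses and waits for human review.

Simulating the human's reply

The three replies below correspond one-to-one to the branches in call_tool_with_interrupt(). They can be extended or modified.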

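Accept and continue

from langgraph.types import Command

agent_response = await agent.ainvoke(
    Command(resume=[{"type": "accept"}]), config
)

Edit the arguments

agent_response = await agent.ainvoke(
    Command(resume=[{"type": "edit", "args": {"args": {'location': '120.619585,31.299379'}}}]),
    config
)

Reply to the model

Note: the first two only affect the arguments of the tool call, whereas here the reply goes back to the model, which decides what happens next: it can stop, continue, or do something else.

agent_response = await agent.ainvoke(
    Command(resume=[{"type": "response", "args": "I no longer want to run this query"}]),
    config
)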
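Since the `type` values are user-defined, one way to keep them extensible is a small dispatch table instead of a growing `if/elif` chain. The sketch below is a LangGraph-free illustration of that idea: `resolve_review` and the extra `"reject"` type are hypothetical additions, and `run_tool` stands in for `tool.ainvoke` (kept synchronous here for brevity).

```python
from typing import Any, Callable

def resolve_review(response: dict, tool_input: dict,
                   run_tool: Callable[[dict], Any]) -> Any:
    """Decide what the wrapped tool should return for one human review response."""
    handlers = {
        # run the tool with the arguments the model proposed
        "accept": lambda: run_tool(tool_input),
        # run the tool with the arguments supplied by the reviewer
        "edit": lambda: run_tool(response["args"]["args"]),
        # skip the tool and hand the reviewer's text back to the model
        "response": lambda: response["args"],
        # hypothetical extra type: skip the tool with a fixed refusal message
        "reject": lambda: "The reviewer rejected this tool call.",
    }
    handler = handlers.get(response["type"])
    if handler is None:
        raise ValueError(f"Unsupported interrupt response type: {response['type']}")
    return handler()

def fake_book_hotel(args: dict) -> str:
    """Stand-in for the real tool."""
    return f"Booked a stay at {args['hotel_name']}."

proposed = {"hotel_name": "Hotel A"}
print(resolve_review({"type": "accept"}, proposed, fake_book_hotel))
print(resolve_review({"type": "edit", "args": {"args": {"hotel_name": "Hotel B"}}},
                     proposed, fake_book_hotel))
print(resolve_review({"type": "reject"}, proposed, fake_book_hotel))
```

If the wrapper delegated to a function like this, resuming with `Command(resume=[{"type": "reject"}])` would reach the new branch without touching the other three.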

III. Postgres-backed memory

No manual setup is needed; just use the Postgres checkpointer library.

from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver

Versions

Python package:

pip install langgraph-checkpoint-postgres==2.0.21

Install Postgres with Docker:

docker-compose up -d

docker-compose.yml:

services:
  postgres:
    image: postgres:15        # pin a specific version
    container_name: postgres_db
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: postgres
      TZ: Asia/Shanghai       # set the time zone
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data
    restart: unless-stopped
    healthcheck:             # health check
      test: ["CMD", "pg_isready", "-U", "postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
    command: ["postgres", "-c", "max_connections=200"]  # custom server setting

volumes:
  pgdata:

Short-term memory

Connecting to the database

db_uri = "postgresql://postgres:postgres@localhost:5432/postgres?sslmode=disable"
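Hardcoding credentials is fine for a local demo, but a password containing characters such as `@` or `/` would break the URI. A minimal sketch, assuming the settings come from environment variables; the helper `build_db_uri` is hypothetical, and the variable names (`PGUSER`, `PGPASSWORD`, and so on) are borrowed from libpq's conventions, with defaults matching the docker-compose file above.

```python
import os
from urllib.parse import quote

def build_db_uri(env=os.environ) -> str:
    """Assemble a Postgres URI, percent-encoding the user name and password."""
    user = quote(env.get("PGUSER", "postgres"), safe="")
    password = quote(env.get("PGPASSWORD", "postgres"), safe="")
    host = env.get("PGHOST", "localhost")
    port = env.get("PGPORT", "5432")
    database = env.get("PGDATABASE", "postgres")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}?sslmode=disable"

# Example with an explicit mapping so no real credentials are printed
print(build_db_uri({"PGPASSWORD": "p@ss/word"}))
```

Calling `build_db_uri()` with no arguments reads the current process environment, so the result can be passed to `from_conn_string` in place of the literal string.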

Initializing an agent with memory

# No need to implement persistence yourself; the saver handles it
async with AsyncPostgresSaver.from_conn_string(db_uri) as checkpointer:
    # Initialize the checkpointer
    await checkpointer.setup()

    # Create the ReAct-style agent
    agent = create_react_agent(
        model=llm,
        tools=tools,
        prompt=system_message,
        # Optional node that runs before the agent node
        # pre_model_hook=pre_model_hook,
        checkpointer=checkpointer,
    )

    # Define the thread_id
    config = {"configurable": {"thread_id": "1"}}

    # To test memory, first send "I am 南哥", then ask "What is my name?" in the same thread
    # user_input = "I am 南哥"
    user_input = "What is my name?"

    # Non-streaming query
    agent_response = await agent.ainvoke({"messages": [HumanMessage(content=user_input)]}, config)
    agent_response_content = agent_response["messages"][-1].content
    print(f"final response: {agent_response_content}")

Limiting how much history is read

from langchain_core.messages.utils import count_tokens_approximately, trim_messages

# Called every time before the node that invokes the LLM
# Trims the chat history to fit a token or message-count limit
def pre_model_hook(state):
    trimmed_messages = trim_messages(
        messages=state["messages"],
        # Limit to 4 messages
        max_tokens=4,
        strategy="last",
        # Using len as the counter makes max_tokens a message count
        token_counter=len,
        start_on="human",
        include_system=True,
        allow_partial=False,
    )
    # By token count instead:
    # trimmed_messages = trim_messages(
    #     messages=state["messages"],
    #     strategy="last",
    #     token_counter=count_tokens_approximately,
    #     max_tokens=20,
    #     start_on="human",
    #     end_on=("human", "tool"),
    # )
    # The update can be returned under either the `llm_input_messages` or the `messages` key
    return {"llm_input_messages": trimmed_messages}
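To see what the message-count configuration does to a conversation, here is a plain-Python approximation of the `strategy="last"`, `token_counter=len`, `start_on="human"`, `include_system=True` combination. It is a simplified stand-in, not LangChain's implementation: messages are `(role, text)` tuples and `trim_last` is a hypothetical helper.

```python
def trim_last(messages, max_messages, include_system=True):
    """Keep the most recent messages within the budget, starting on a human turn."""
    system = []
    rest = list(messages)
    # optionally keep a leading system message; it counts against the budget
    if include_system and rest and rest[0][0] == "system":
        system, rest = [rest[0]], rest[1:]
    budget = max_messages - len(system)
    kept = rest[-budget:] if budget > 0 else []
    # drop leading messages until the window starts with a human message
    while kept and kept[0][0] != "human":
        kept = kept[1:]
    return system + kept

history = [
    ("system", "You are helpful."),
    ("human", "I am 南哥"),
    ("ai", "Hi!"),
    ("human", "Weather in Shanghai?"),
    ("ai", "(calls the weather tool)"),
    ("tool", "sunny"),
    ("ai", "It is sunny."),
    ("human", "What is my name?"),
]

for role, text in trim_last(history, 4):
    print(role, text)
```

With a budget of 4, the window no longer contains the turn where the user gave their name, so the model could not answer the final question from the trimmed input. That is the trade-off to keep in mind when choosing the limit.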

Short-term memory is stored in the checkpoints table.

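Long-term memory

This is also handled automatically by the database library. I still need to look into the equivalent calls for other databases.

Previously, the agent with short-term memory was initialized like this:

async with AsyncPostgresSaver.from_conn_string(db_uri) as checkpointer:
    # Initialize
    await checkpointer.setup()

    # Create the ReAct-style agent
    ...

With long-term memory it looks like this:

from langgraph.store.postgres.aio import AsyncPostgresStore

async with (
    AsyncPostgresSaver.from_conn_string(db_uri) as checkpointer,
    AsyncPostgresStore.from_conn_string(db_uri) as store,
):
    # Initialize
    await store.setup()
    await checkpointer.setup()

    # Create the ReAct-style agent
    agent = create_react_agent(
        ...,
        store=store
    )
    ...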

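Example of storing long-term memories:

import uuid

# Short-term memory is keyed by thread id, long-term memory by user id
config = {"configurable": {"thread_id": "1", "user_id": "1"}}
# Custom storage logic: process the user input and decide whether it should be saved as long-term memory
namespace = ("memories", config["configurable"]["user_id"])
memory1 = "My name is 南哥"
await store.aput(namespace, str(uuid.uuid4()), {"data": memory1})
memory2 = "My accommodation preferences: a window and Wi-Fi"
await store.aput(namespace, str(uuid.uuid4()), {"data": memory2})
print("Long-term memories stored!")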

Retrieving long-term memories:

# Long-term memory retrieval, e.g. fetching the preferences associated with the current user
user_id = config["configurable"]["user_id"]
namespace = ("memories", user_id)
memories = await store.asearch(namespace, query="")
info = " ".join([d.value["data"] for d in memories]) if memories else "No long-term memory found"
print(f"Retrieved info: {info}")
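The namespace tuple is what scopes memories to a user. As a mental model only, the sketch below mimics the put/search pattern with a dictionary. `TinyStore` and `recall` are hypothetical stand-ins that omit everything the real store does (persistence, filtering, semantic search), and `TinyStore.search` matches the namespace exactly, whereas the real `asearch` treats its first argument as a namespace prefix.

```python
import uuid
from collections import defaultdict

class TinyStore:
    """In-memory stand-in: maps a namespace tuple to {key: value} items."""

    def __init__(self):
        self._data = defaultdict(dict)

    def put(self, namespace, key, value):
        self._data[namespace][key] = value

    def search(self, namespace):
        # return every value stored under exactly this namespace
        return list(self._data.get(namespace, {}).values())

def recall(store, user_id):
    """Join all memories for one user, mirroring the retrieval snippet above."""
    memories = store.search(("memories", user_id))
    return " ".join(m["data"] for m in memories) if memories else "No long-term memory found"

store = TinyStore()
store.put(("memories", "1"), str(uuid.uuid4()), {"data": "My name is 南哥"})
store.put(("memories", "1"), str(uuid.uuid4()), {"data": "Hotel preference: window, Wi-Fi"})
store.put(("memories", "2"), str(uuid.uuid4()), {"data": "My name is someone else"})

print(recall(store, "1"))  # only user 1's memories
print(recall(store, "3"))  # no memories stored for user 3
```

Because the user id is part of the namespace, one user's memories never show up in another user's lookup, regardless of which thread the conversation runs in.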

Long-term memory is stored in the store table.

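How to save long-term memory through frontend/backend interaction is an open question. Should the model decide what goes into long-term memory, should the user decide, or both?

ChatGPT's answer:

Is long-term memory used in current practice?

In commercial products

ChatGPT itself and most similar products currently rely mainly on the short-term context window and have no true "long-term memory" that spans multiple sessions.

However, Microsoft Copilot and some custom enterprise AI assistants integrate "user profiles" or "knowledge bases", which is somewhat similar to long-term memory.

Some AI note-taking apps (such as Mem.ai and Obsidian AI plugins) use vector databases for cross-session knowledge management.

In research and open-source projects

There are long-term memory implementations built on vector databases (such as Pinecone and Weaviate), combined with large models for retrieval-augmented generation (RAG).

Agent frameworks (LangChain, LlamaIndex) provide memory modules that support both "conversation memory" and a "long-term knowledge base".

Most of these approaches follow an "external memory store + dynamic calls from the model" pattern.

This feels less capable than RAG. That said, 南哥's project does not seem to include RAG yet, though an earlier project of his apparently did. I'll see whether I can integrate it later.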

IV. FastAPI integration

posted @ 2025-07-13 19:43 Fordson
