Integrating LangGraph with Langfuse

Trace usage



"""
Chat API routes
File: src/server/routes/chat.py
Task: T-502 Chat API routes

Provides the chat endpoint with SSE streaming responses.
"""

import asyncio
import json
from typing import AsyncIterator
from uuid import uuid4
from fastapi import APIRouter, HTTPException
from fastapi.responses import StreamingResponse
from langchain_core.messages import HumanMessage, AIMessage
from langfuse import get_client, observe
from langfuse.langchain import CallbackHandler

from src.graph.builder import get_graph
from src.server.chat_request import ChatRequest
from src.utils.logger import get_logger

logger = get_logger(__name__)

# Langfuse callback handler, passed to LangGraph through config["callbacks"]
lf_callback = CallbackHandler()
# Module-level Langfuse client
langfuse = get_client()


router = APIRouter()

# ============================================================================
# SSE event formatting
# ============================================================================

def _make_event(event_type: str, data: dict) -> str:
    """Format an SSE event.

    Args:
        event_type: Event type
        data: Event payload

    Returns:
        The event as an SSE-formatted string
    """
    return f"event: {event_type}\ndata: {json.dumps(data, ensure_ascii=False)}\n\n"


def _summarize_tool_result(tool_result: object) -> dict:
    """Summarize a tool call result for display in the frontend.

    Args:
        tool_result: Raw result returned by the tool

    Returns:
        A summary dict
    """
    if tool_result is None:
        return {"type": "empty", "summary": "无结果"}
    
    if isinstance(tool_result, dict):
        # For a dict, extract the key information
        if "error" in tool_result:
            return {"type": "error", "summary": f"错误: {tool_result.get('error')}"}
        
        # Summarize according to the kind of tool result
        if "product" in tool_result or "model" in tool_result:
            # Product information
            product = tool_result.get("product") or tool_result.get("model")
            price = tool_result.get("price")
            return {"type": "product", "summary": f"{product}: {price}元" if price else product}
        elif "comparison" in tool_result:
            # Product comparison
            comp = tool_result.get("comparison", [])
            return {"type": "comparison", "summary": f"对比了 {len(comp)} 款产品"}
        else:
            # Any other dict
            keys = list(tool_result.keys())[:3]
            return {"type": "dict", "summary": f"包含字段: {', '.join(keys)}"}
    
    elif isinstance(tool_result, list):
        return {"type": "list", "summary": f"返回 {len(tool_result)} 条记录"}
    else:
        return {"type": "other", "summary": str(tool_result)[:100]}


# ============================================================================
# Chat API
# ============================================================================

@router.post("/chat")
async def customer_service_chat(request: ChatRequest):
    """Customer service chat endpoint (SSE streaming response).

    Args:
        request: Chat request

    Returns:
        StreamingResponse (SSE)
    """
    thread_id = request.thread_id or str(uuid4())

    # Content of the last message (messages arrive as dicts)
    last_message = request.messages[-1] if request.messages else {}
    last_content = last_message.get("content", "") if isinstance(last_message, dict) else ""

    logger.info("Chat request received, thread_id=%s, message=%s",
                thread_id, last_content[:50])

    @observe(name="customer_service_chat", as_type="trace")
    async def event_stream() -> AsyncIterator[str]:
        """SSE event stream generator."""
        try:
            # Get the graph
            graph = get_graph()

            # Convert frontend messages to LangChain messages
            langchain_messages = request.to_langchain_messages()

            # Build the initial state
            initial_state = {
                "messages": langchain_messages,
                "user_input": langchain_messages[-1].content if langchain_messages else ""
            }

            # Graph run config; the Langfuse callback is attached here
            config = {
                "configurable": {
                    "thread_id": thread_id
                },
                "callbacks": [lf_callback]
            }

            # Send the start event
            yield _make_event("start", {
                "thread_id": thread_id,
                "status": "processing"
            })

            # Run the graph and stream the results back
            async for event in graph.astream(initial_state, config):
                logger.debug("Graph event: %s", list(event.keys()))
                
                # Handle the output of each node
                for node_name, node_output in event.items():
                    # Skip system nodes
                    if node_name in ["__start__", "__end__"]:
                        continue
                    
                    # Send a node event (with details)
                    node_data = {
                        "node": node_name,
                        "thread_id": thread_id
                    }
                    
                    # Add details depending on the node type
                    if node_name == "coordinator" and isinstance(node_output, dict):
                        # Intent recognition result
                        node_data["details"] = {
                            "intents": node_output.get("intents", []),
                            "route_to": node_output.get("route_to", "unknown")
                        }
                    elif node_name == "react_agent" and isinstance(node_output, dict):
                        # Tool call information
                        tool_result = node_output.get("tool_result")
                        node_data["details"] = {
                            "tool_used": node_output.get("tool_used"),
                            "tool_result_summary": _summarize_tool_result(tool_result),
                            "tool_result": tool_result  # full result, for the detailed view
                        }
                    elif node_name == "rag_agent" and isinstance(node_output, dict):
                        # RAG retrieval information
                        node_data["details"] = {
                            "retrieved_docs": node_output.get("retrieved_docs", 0),
                            "context_length": node_output.get("context_length", 0)
                        }
                    
                    yield _make_event("node", node_data)
                    
                    # If the node produced a response, send a message event
                    if isinstance(node_output, dict) and "response" in node_output:
                        response = node_output["response"]
                        
                        yield _make_event("message_chunk", {
                            "id": str(uuid4()),
                            "thread_id": thread_id,
                            "role": "assistant",
                            "content": response,
                            "agent": node_name,
                            "finish_reason": "stop"
                        })
            
            # Send the done event
            yield _make_event("done", {
                "thread_id": thread_id,
                "status": "completed"
            })
            
            logger.info("Chat request completed, thread_id=%s", thread_id)
            
        except Exception as e:
            logger.error("Error in chat stream: %s", str(e), exc_info=True)
            
            # Send the error event
            yield _make_event("error", {
                "thread_id": thread_id,
                "error": str(e),
                "message": "处理请求时发生错误，请稍后再试"
            })
    
    return StreamingResponse(
        event_stream(),
        media_type="text/event-stream",
        headers={
            "Cache-Control": "no-cache",
            "Connection": "keep-alive",
            "X-Accel-Buffering": "no"
        }
    )


__all__ = ["router"]
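
To make the wire format concrete, here is a small sketch of the frame that `_make_event` produces and how a client might read it back. `make_event` repeats the formatting rule from the module above; `parse_sse_event` is a hypothetical helper that is not part of the original code and only handles single-line `data:` payloads.

```python
import json


def make_event(event_type: str, data: dict) -> str:
    """Same formatting rule as _make_event in the module above."""
    return f"event: {event_type}\ndata: {json.dumps(data, ensure_ascii=False)}\n\n"


def parse_sse_event(raw: str) -> tuple[str, dict]:
    """Hypothetical client-side helper: split one SSE frame into (event, data)."""
    event_type, payload = "", {}
    for line in raw.strip().split("\n"):
        if line.startswith("event: "):
            event_type = line[len("event: "):]
        elif line.startswith("data: "):
            payload = json.loads(line[len("data: "):])
    return event_type, payload


frame = make_event("start", {"thread_id": "t-1", "status": "processing"})
event_type, payload = parse_sse_event(frame)
print(repr(frame))
print(event_type, payload)
```

Each frame ends with a blank line, which is what tells an SSE client that the event is complete.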

Trace usage

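This question goes to the core of Langfuse's observability model.
The conclusion comes first, followed by an explanation based on the SSE + LangGraph setup above.

1. Conclusion first (worth memorizing)

Within a single execution context there can be only one root Trace.
The first as_type="trace" that is created becomes the root;
any later as_type="trace" does not create another root. It is demoted to a child Span (or merged into the current Trace).

Therefore:

The @observe(as_type="trace") on /chat → root Trace (the business session)
An @observe(as_type="trace") that you add on an internal function → does not become a root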

2. Walking through your code

Your current structure looks roughly like this:

@observe(name="customer_service_chat", as_type="trace")
async def event_stream():
    ...
    result = coordinator_node(...)
    ...

Then you also put this on one of your own functions:

@observe(name="intent_routing", as_type="trace")
def route_intent(...):
    ...

What actually shows up in Langfuse is:

Trace (customer_service_chat)   ← root
 ├─ Span (intent_routing)       ← demoted automatically
 ├─ Span (coordinator)
 │   └─ ...
 └─ ...

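❗ You will not get two sibling root Traces.

This follows from how the Langfuse Python SDK works: a nested @observe attaches to the observation that is already active in the current context instead of opening a new Trace.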

3. How does Langfuse decide which one is the root?

There is only one core rule:

Whoever creates the Trace first is the root.

In more detail:

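@observe(as_type="trace")
- If the current context has no Trace → a root Trace is created
- If a Trace already exists → no new Trace is created
Subsequent @observe calls:
- attach to the current Trace
- become a Span / Observation

Note: the as_type values documented for @observe are observation types such as "span" and "generation", and the Trace itself is created implicitly by the outermost observation. Verify that your installed SDK version accepts as_type="trace" before relying on it; the nesting rule above holds either way.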
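
The rule can be illustrated with a toy model built on the standard library's `contextvars`. This is only a sketch of the "first one in the context becomes the root" idea, not Langfuse's implementation; `toy_observe`, `Node`, and `roots` are names invented for the illustration.

```python
import contextvars
from dataclasses import dataclass, field
from functools import wraps


@dataclass
class Node:
    name: str
    kind: str  # "trace" for a root, "span" for a child
    children: list = field(default_factory=list)


# Holds the observation that is active in the current execution context.
_current = contextvars.ContextVar("current_node", default=None)
roots = []  # every root that was created, kept for inspection


def toy_observe(name):
    """Toy decorator: root if nothing is active, otherwise a child of the active node."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            parent = _current.get()
            if parent is None:
                node = Node(name, "trace")
                roots.append(node)
            else:
                node = Node(name, "span")
                parent.children.append(node)
            token = _current.set(node)
            try:
                return func(*args, **kwargs)
            finally:
                # Restore the previous active node when the call returns.
                _current.reset(token)
        return wrapper
    return decorator


@toy_observe("intent_routing")
def route_intent(text):
    return "product" if "price" in text else "other"


@toy_observe("customer_service_chat")
def handle_chat(text):
    return route_intent(text)


result = handle_chat("what is the price?")
print(roots)
```

Even though both functions use the same decorator, only the outermost call ends up in `roots`; the inner one is recorded as its child.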

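4. So how should I decorate my own functions?

✅ The right mental model

Scenario                            Use
One HTTP / SSE / task invocation    `@observe(as_type="trace")`
A business step inside a Trace      `@observe()` (a span by default)
Non-critical logic                  no decorator

❌ Not recommended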

@observe(as_type="trace")
def intent_classifier():
    ...

Unless:

this function may be called on its own, or
it serves as the entry point of a CLI or offline job

✅ Recommended (for this project)

@observe(name="intent_routing")
def intent_classifier(...):
    ...

so that it naturally becomes:

Trace: customer_service_chat
 └─ Span: intent_routing
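
A minimal sketch of that layout follows, assuming Langfuse SDK v3 where `observe` is importable from the top-level package. The sketch omits `as_type` and relies on nesting alone. The no-op fallback and the keyword-based routing rule are assumptions added so the example runs without Langfuse installed; they are not part of the original project.

```python
try:
    from langfuse import observe
except ImportError:
    def observe(func=None, **_kwargs):
        """No-op stand-in so the sketch runs without Langfuse installed."""
        if func is not None:
            return func          # used as a bare @observe
        return lambda f: f       # used as @observe(name=...)


@observe(name="intent_routing")
def intent_classifier(text: str) -> str:
    # Hypothetical rule, for illustration only.
    return "price_query" if "price" in text.lower() else "general"


@observe(name="customer_service_chat")
def handle_chat(text: str) -> dict:
    # The outermost observed call is the entry point; the call below nests under it.
    intent = intent_classifier(text)
    route = "react_agent" if intent == "price_query" else "rag_agent"
    return {"intent": intent, "route_to": route}


result = handle_chat("What is the price of model X?")
print(result)
```

The inner function needs no special configuration: calling it from inside the observed entry point is what places it under that entry point.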

5. What if you really do want a function to be a root?

There are only two legitimate ways:

Option 1: it is the execution entry point

@observe(as_type="trace")
def batch_job():
    ...

and it is the first observed function to be called.

Option 2: explicitly open a new context (advanced; not recommended for routine use)

from langfuse import get_client

langfuse = get_client()
with langfuse.start_as_current_trace(name="independent_task"):
    ...

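This is meant to cut off the current Trace deliberately. The context manager documented for the v3 Python SDK is start_as_current_span, so confirm that start_as_current_trace exists in your installed version before using it.

⚠️ In Web / SSE scenarios this very easily scrambles your Traces.
It is not recommended.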
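
Why a new context produces a new root can be seen in a toy model that uses only `contextvars`. This is a sketch of the idea, not Langfuse API usage; `start_observation` and `created_roots` are invented for the illustration, and the active value is never reset in order to keep the sketch short.

```python
import contextvars

_current = contextvars.ContextVar("current_trace", default=None)
created_roots = []


def start_observation(name):
    """Toy rule: become a root only if nothing is active in this context."""
    parent = _current.get()
    if parent is None:
        created_roots.append(name)
        kind = "trace"
    else:
        kind = "span"
    _current.set(name)
    return kind


def independent_task():
    return start_observation("independent_task")


def handle_request():
    kinds = [start_observation("customer_service_chat")]
    # Same context: a parent is active, so this becomes a child.
    kinds.append(independent_task())
    # Fresh, empty Context: no parent is visible, so this becomes a new root.
    kinds.append(contextvars.Context().run(independent_task))
    return kinds


kinds = contextvars.copy_context().run(handle_request)
print(kinds, created_roots)
```

The second root is no longer connected to the request that triggered it, which is the reason this approach tends to fragment traces in a Web / SSE handler.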

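6. Best-practice summary for this project

Based on your current code, here is a rule you can follow long term:

Each business entry point gets exactly one as_type="trace";
everywhere else, use @observe() (a span).

Applied to your code:

Location                   Usage
`/chat` SSE                `@observe(as_type="trace")` ✅
coordinator_node           `@observe()`
orchestrator_node          `@observe()`
Custom rule functions      `@observe()`
Tools / utilities          case by case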

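7. One-sentence summary

In Langfuse, the root Trace is determined by the first trace that is created;
inside a context that already has a Trace, writing as_type="trace" again does not produce a new root, only a child Span.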


posted @ 2025-12-19 09:52 向着朝阳
