How to Keep LLM Output Well-Formed

Why output format matters

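Start with an example that breaks. Suppose the user says "My phone number is +55 12 3923-5555", and we need to extract the identifier from that sentence to look it up in the database: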

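# Unreliable approach
response = llm.invoke("Extract the customer identifier from this sentence: My phone number is +55 12 3923-5555")
# response.content might be "The identifier is +55 12 3923-5555"
# or "+55 12 3923-5555"
# or "The phone number provided by the user is: +55..."
# There is no stable way to pull out just the number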

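The problem: by default an LLM produces natural language, while a program needs a deterministic data structure. The core idea for resolving this tension is to put the model in a "format cage".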
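
To make the failure concrete, here is a small sketch that runs without any model. The three replies are made-up stand-ins for what a model might return, and naive_extract is a hypothetical parser, not something from the project. It handles two of the replies and silently gets the third wrong, while a JSON reply with a fixed key needs no guessing.

```python
import json

# Three plausible free-text replies to the same extraction prompt (illustrative only).
free_text_replies = [
    "The identifier is +55 12 3923-5555",
    "+55 12 3923-5555",
    "The phone number provided by the user is: +55 12 3923-5555",
]


def naive_extract(reply: str) -> str:
    """Hypothetical parser: keep whatever follows the first ' is '."""
    marker = " is "
    return reply.split(marker, 1)[1] if marker in reply else reply


naive_results = [naive_extract(r) for r in free_text_replies]

# A structured reply has one fixed shape, so extraction is a key lookup.
structured_reply = '{"identifier": "+55 12 3923-5555"}'
identifier = json.loads(structured_reply)["identifier"]

for reply, result in zip(free_text_replies, naive_results):
    print(f"{reply!r} -> {result!r}")
print(f"structured -> {identifier!r}")
```

The third reply uses "is:" rather than " is ", so the parser returns the whole sentence. Every extra wording variant needs another special case, which is the maintenance burden the techniques below avoid.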

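Technique 1: Argument schemas for tool calls (bind_tools)

When the LLM calls a tool, the arguments must be well-formed. LangChain's @tool decorator automatically generates a JSON Schema from the function signature: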

from langchain_core.tools import tool

@tool
def get_albums_by_artist(artist: str):
    """Look up albums by artist name."""
    return db.run(...)

# Bind the tools to the LLM
music_tools = [get_albums_by_artist, ...]
llm_with_music_tools = llm.bind_tools(music_tools)

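@tool turns the function's parameter names, type annotations, and docstring into a schema the model can understand. When the model generates a tool call, it knows that calling get_albums_by_artist requires a string argument named artist, which makes malformed arguments far less likely.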
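
To see roughly what "turning a signature into a schema" means, the sketch below builds a JSON-Schema-like dict using only the standard library. It is an approximation for illustration, not LangChain's actual implementation; sketch_tool_schema and the type mapping are assumptions, and the function body is a placeholder because the original query is elided.

```python
import inspect

# Minimal mapping from Python annotations to JSON Schema type names (assumed subset).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_tool_schema(func) -> dict:
    """Build a JSON-Schema-like description from a signature and docstring."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def get_albums_by_artist(artist: str):
    """Look up albums by artist name."""
    return []  # placeholder body; the original queries a database


schema = sketch_tool_schema(get_albums_by_artist)
print(schema)
```

The model receives a description of this kind alongside the prompt, which is why clear parameter names and docstrings directly affect how well it fills in the arguments.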

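Technique 2: Structured output (with_structured_output), the core technique

This is the most important technique for keeping output well-formed. It takes two steps:

Step 1: Define what the data looks like with Pydantic
Step 2: Bind the schema to the LLM with with_structured_output

Example 1: Parsing user input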

from pydantic import BaseModel, Field

# Step 1: define the data structure
class UserInput(BaseModel):
    """Schema for parsing the account information provided by the user."""
    identifier: str = Field(
        description="Identifier, which can be a customer ID, email, or phone number."
    )

# Step 2: bind it to the LLM so the output must conform to UserInput
structured_llm = llm.with_structured_output(schema=UserInput)

# After the call, result is a validated UserInput object
result = structured_llm.invoke("My phone number is +55 12 3923-5555")
print(result.identifier)  # e.g. "+55 12 3923-5555"; the field is always a str

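However free-form the user's input is, the result is an object with an identifier field. You access it directly as .identifier instead of parsing a string by hand.

Example 2: Saving structured data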

from typing import List

class UserProfile(BaseModel):
    customer_id: str = Field(description="The customer ID of the customer")
    music_preferences: List[str] = Field(description="The music preferences of the customer")

# Have the model analyze the conversation and output a UserProfile
updated_memory = llm.with_structured_output(UserProfile).invoke([system_message])

# The result is well-formed data that can go straight into the store
store.put(namespace, "user_memory", {"memory": updated_memory})

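After analyzing the whole conversation, the model must output an object with customer_id (a string) and music_preferences (a list of strings). The format is fixed when the record is written to storage, so reads are stable as well.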
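
If the record has to leave the process (a database row, a file, a cache), the usual pattern is to serialize the object on write and rebuild it on read. The sketch below shows that round trip with a standard-library dataclass standing in for the Pydantic model; with Pydantic v2 the equivalent of asdict would be model_dump(). The sample values are invented.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class UserProfile:
    # Dataclass stand-in for the Pydantic model in the article.
    customer_id: str
    music_preferences: List[str]


profile = UserProfile(customer_id="1", music_preferences=["rock", "jazz"])

# Write path: convert to plain types, then to a JSON string.
stored = json.dumps(asdict(profile))

# Read path: parse the string and rebuild the typed object.
loaded = UserProfile(**json.loads(stored))

print(stored)
print(loaded == profile)
```

Because the shape is fixed at write time, the read path needs no defensive parsing.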

Example 3: Strict mode

For scenarios with especially strict format requirements (such as evaluation grading), you can enable strict mode:

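class Grade(BaseModel):
    is_correct: bool = Field(description="Whether the answer is correct")
    reasoning: str = Field(description="Rationale for the grade")

# method="json_schema" + strict=True asks the provider API itself to enforce the schema
# (support depends on the provider and model; not every integration accepts strict)
grader_llm = llm.with_structured_output(Grade, method="json_schema", strict=True)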
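
For intuition about what "strict" adds, here is a hedged sketch of the kind of JSON Schema a strict request carries for Grade. As far as I know, OpenAI's strict mode expects every property to appear in required and additionalProperties to be false; treat that as an assumption to check against your provider's documentation. The conforms helper is a hand-rolled check for this one flat schema, not a general validator.

```python
# Assumed shape of the schema for Grade under strict mode.
grade_schema = {
    "type": "object",
    "properties": {
        "is_correct": {"type": "boolean", "description": "Whether the answer is correct"},
        "reasoning": {"type": "string", "description": "Rationale for the grade"},
    },
    "required": ["is_correct", "reasoning"],
    "additionalProperties": False,
}


def conforms(payload: dict, schema: dict) -> bool:
    """Check a payload against this flat schema only."""
    python_types = {"boolean": bool, "string": str}
    # All required keys present and, with additionalProperties False, no others.
    if set(payload) != set(schema["required"]):
        return False
    return all(
        isinstance(payload[name], python_types[spec["type"]])
        for name, spec in schema["properties"].items()
    )


good = {"is_correct": True, "reasoning": "matches the reference answer"}
wrong_type = {"is_correct": "yes", "reasoning": "matches the reference answer"}
extra_key = {"is_correct": True, "reasoning": "ok", "score": 5}

print(conforms(good, grade_schema))
print(conforms(wrong_type, grade_schema))
print(conforms(extra_key, grade_schema))
```

In strict mode the provider applies this kind of constraint while generating, so payloads like wrong_type and extra_key should not be produced in the first place.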

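How with_structured_output works under the hood

There are two implementations. The default depends on the model integration, and you can choose one explicitly with the method parameter:

Mode 1: Function calling (tool-calling mode)
  Disguise the Pydantic model as a "virtual tool"
  Force the model to "call" that tool
  Extract the structured data from the tool call's arguments

Mode 2: JSON Schema mode (method="json_schema")
  Tell the API directly: "the response must conform to this JSON Schema"
  With strict=True, the API itself enforces conformance to the schema

Either way, when the schema is a Pydantic class, what you get back is an object that Pydantic has already validated, so the field types are guaranteed.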
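
The two modes differ only in where the structured payload sits in the model's reply. The sketch below imitates both with hand-written messages and a single parser. The message dicts are simplified, hypothetical shapes rather than a real provider payload, and the dataclass with a manual type check stands in for Pydantic validation.

```python
import json
from dataclasses import dataclass


@dataclass
class UserInput:
    identifier: str

    def __post_init__(self):
        # Stand-in for the type validation Pydantic would perform.
        if not isinstance(self.identifier, str):
            raise TypeError("identifier must be a string")


# Mode 1 (function calling): the model "calls" a virtual tool named after the schema.
tool_call_message = {
    "content": "",
    "tool_calls": [{"name": "UserInput", "args": {"identifier": "+55 12 3923-5555"}}],
}

# Mode 2 (JSON Schema): the text content is itself the JSON document.
json_mode_message = {"content": '{"identifier": "+55 12 3923-5555"}', "tool_calls": []}


def parse_structured(message: dict) -> UserInput:
    """Pull the payload out of either mode, then validate it."""
    if message["tool_calls"]:
        args = message["tool_calls"][0]["args"]
    else:
        args = json.loads(message["content"])
    return UserInput(**args)


from_tool_call = parse_structured(tool_call_message)
from_json_mode = parse_structured(json_mode_message)
print(from_tool_call, from_json_mode)
```

Both paths end in the same validation step, which is why the calling code does not need to care which mode was used.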

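Technique 3: Prompt + schema as a double safeguard

A schema alone is not enough. A good practice is to restate the format requirements in the prompt, giving you two layers of protection. Here is the project's prompt for saving memory: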

create_memory_prompt = """You are an expert analyst...

The customer's memory profile should have the following fields:
- customer_id: the customer ID of the customer
- music_preferences: the music preferences of the customer

Ensure your response is an object that has the following fields:
- customer_id: the customer ID of the customer
- music_preferences: the music preferences of the customer
"""

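The two layers each do their own job:

Layer	Mechanism	Constraint type	Role
Prompt	Describe the expected format in words	Soft	Guides the model toward the intent
Schema	with_structured_output	Hard	Enforces the structure at the technical level

The soft constraint gets the model to "aim right"; the hard constraint makes sure the result "has to be right".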
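
One risk with writing the field list by hand in both places is that the prompt and the schema drift apart. A possible way to avoid that, sketched here as a suggestion rather than something the project does, is to generate the prompt's field list from the schema. A standard-library dataclass with metadata stands in for the Pydantic model, and describe_fields is a hypothetical helper.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class UserProfile:
    # Descriptions live in one place and feed both the schema and the prompt.
    customer_id: str = field(metadata={"description": "the customer ID of the customer"})
    music_preferences: List[str] = field(
        metadata={"description": "the music preferences of the customer"}
    )


def describe_fields(schema) -> str:
    """Render one '- name: description' line per schema field."""
    return "\n".join(
        f"- {f.name}: {f.metadata['description']}" for f in fields(schema)
    )


prompt_suffix = (
    "Ensure your response is an object that has the following fields:\n"
    + describe_fields(UserProfile)
)
print(prompt_suffix)
```

Adding a field to the schema then updates the soft constraint automatically.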

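When to use it, and when not to

One important principle: constrain the format only for data that code will process downstream; do not force structure onto natural-language answers meant for the user.

In this project:

Needs structure (use with_structured_output):
  - Extracted user identifier  → used for a database lookup
  - User music preferences     → saved to the store
  - Evaluation scores          → used in statistics

Does not need structure (keep as free text):
  - The sub-agent's final answer to the user ("I recommend these rock songs...")
    This is supposed to be natural language; forcing structure onto it would only feel awkward

The test is simple: is this output read by a machine or by a person? Constrain the format for machines; keep it natural for people.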

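Summary

The full toolbox for keeping LLM output well-formed:

Scenario	Technique	What it guarantees
Tool calls	`@tool` + `bind_tools`	Well-formed call arguments
Extracting/parsing data	`with_structured_output(Schema)`	Returned object has known fields
High-stakes scenarios	`with_structured_output(Schema, method="json_schema", strict=True)`	Strict conformance at the API level
All scenarios	Describe the format in the prompt	Guides the model + second safeguard

In one sentence: define the data structure with Pydantic, bind it to the model with with_structured_output, and back it up with a plain-language description in the prompt. Data that code has to process then arrives in a reliable, parseable form instead of crashing the program whenever the model "improvises".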

posted @ 2026-05-29 10:42 江鸟Dev
