This post demonstrates using a large language model to extract structured information from text. Instead of relying on a plain prompt, we use the model's `few-shot prompting` capability: a handful of examples guides the model's inference.
We will run a simple comparison between `llama3.1` and `deepseek`.
> Since `langchain` may support different models to different degrees, and the models themselves have different characteristics, this comparison cannot tell us which model is better overall.
 

Preparation

Before we start writing code, let's set up the development environment.

1. Computer
All code in this post runs in an environment without GPU memory. My machine:
    - CPU: Intel i5-8400 2.80GHz
    - RAM: 16GB

2. Visual Studio Code and venv
This is a popular development setup: the code in this series can be developed and debugged in `Visual Studio Code`, with a virtual environment created via Python's `venv`. For details, see:

3. Ollama
The `Ollama` platform makes it easy to deploy local large models; on top of it, `langchain` can drive `llama3.1`, `qwen2.5`, and various other local models. For details, see:
 

Simple Inference

Let's give the model a few simple examples and see whether it can infer the answer from them.

```python
from langchain_ollama import ChatOllama

def inference(model_name):
    messages = [
        {"role": "user", "content": "2 🦜 2"},
        {"role": "assistant", "content": "4"},
        {"role": "user", "content": "2 🦜 3"},
        {"role": "assistant", "content": "5"},
        {"role": "user", "content": "3 🦜 4"},
    ]

    response = ChatOllama(model=model_name, temperature=0.5, verbose=True).invoke(messages)
    return response.content
```
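The alternating user/assistant example messages above can also be assembled from plain (input, answer) pairs. A minimal sketch (pure Python, no model call; `build_few_shot_messages` is our own illustrative helper, not a LangChain API):

```python
def build_few_shot_messages(examples, query):
    """Turn (input, output) example pairs into an alternating
    user/assistant message list, ending with the real query."""
    messages = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": query})
    return messages

# Reproduces the hand-written message list above; the result could be
# passed straight to ChatOllama(...).invoke(msgs).
msgs = build_few_shot_messages([("2 🦜 2", "4"), ("2 🦜 3", "5")], "3 🦜 4")
```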

  

Let's test with `llama3.1` and `deepseek`:

| llama3.1 | deepseek-r1-tool-calling |
| --- | --- |
| 7 | ... \boxed{7} |

I ran this three times. `llama3.1` gave the concise, correct answer every time; `deepseek` produced much more output, took slightly longer to reason, and only succeeded once. Whether `langchain` is a factor here, I can't say.

Next, let's give the model a slightly more complex task.

Defining the Data Format

First we define the data format to extract, using `Pydantic`.

```python
from typing import Optional
from pydantic import BaseModel, Field

class Person(BaseModel):
    """Information about a person."""

    # ^ Doc-string for the entity Person.
    # This doc-string is sent to the LLM as the description of the schema Person,
    # and it can help to improve extraction results.

    # Note that:
    # 1. Each field is an `optional` -- this allows the model to decline to extract it!
    # 2. Each field has a `description` -- this description is used by the LLM.
    # Having a good description can help improve extraction results.
    name: Optional[str] = Field(default=None, description="The name of the person")
    hair_color: Optional[str] = Field(
        default=None, description="The color of the person's hair if known"
    )
    height_in_meters: Optional[float] = Field(
        default=None, description="Height measured in meters"
    )
```
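Since every field on `Person` defaults to `None`, the model may legitimately return an empty extraction. A quick standalone check of the schema (assuming Pydantic v2; the class is repeated here so the snippet runs on its own):

```python
from typing import Optional
from pydantic import BaseModel, Field

class Person(BaseModel):
    """Information about a person."""
    name: Optional[str] = Field(default=None, description="The name of the person")
    hair_color: Optional[str] = Field(default=None, description="The color of the person's hair if known")
    height_in_meters: Optional[float] = Field(default=None, description="Height measured in meters")

# An empty extraction is still a valid Person:
empty = Person()

# Dict output from a structured-output call coerces into the schema;
# note the string "1.83" is coerced to a float.
alan = Person.model_validate({"name": "Alan Smith", "height_in_meters": "1.83"})
```

This is why the `Person()` example in the next section validates without any arguments.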

  

Preparing the Messages

We use `LangChain`'s utility function `tool_example_to_messages` to convert the examples into a message sequence for the extraction task that follows.

```python
from langchain_core.utils.function_calling import tool_example_to_messages

examples = [
    (
        "The ocean is vast and blue. It's more than 20,000 feet deep.",
        Person(),
    ),
    (
        "Fiona traveled far from France to Spain.",
        Person(name="Fiona", height_in_meters=None, hair_color=None),
    ),
    (
        "Alan Smith is 1.83 meters tall and has blond hair.",
        Person(name="Alan Smith", height_in_meters=1.83, hair_color="blond"),
    ),
]

messages = []

for txt, tool_call in examples:
    if tool_call.name is None:
        # This final message is optional for some providers
        ai_response = "Detected no people."
    else:
        ai_response = "Detected people."
    messages.extend(tool_example_to_messages(txt, [tool_call], ai_response=ai_response))
```

After conversion, each example yields three kinds of messages:

- User message
- AI message with tool call
- Tool message with result

Let's print the converted messages:

```python
for message in messages:
    message.pretty_print()
```

 

```
================================ Human Message =================================

The ocean is vast and blue. It's more than 20,000 feet deep.
================================== Ai Message ==================================
Tool Calls:
  Person (16513a89-ca85-46b4-8fac-db02a47b03fe)
 Call ID: 16513a89-ca85-46b4-8fac-db02a47b03fe
  Args:
    name: None
    hair_color: None
    height_in_meters: None
================================= Tool Message =================================

You have correctly called this tool.
================================== Ai Message ==================================

Detected no people.
================================ Human Message =================================

Fiona traveled far from France to Spain.
================================== Ai Message ==================================
Tool Calls:
  Person (84e608e4-f444-49e1-b983-c28c38a1c870)
 Call ID: 84e608e4-f444-49e1-b983-c28c38a1c870
  Args:
    name: Fiona
    hair_color: None
    height_in_meters: None
================================= Tool Message =================================

You have correctly called this tool.
================================== Ai Message ==================================

Detected people.
================================ Human Message =================================

Alan Smith is 1.83 meters tall and has blond hair.
================================== Ai Message ==================================
Tool Calls:
  Person (e464cd1f-fca8-48fb-85a0-5fc722328ec0)
 Call ID: e464cd1f-fca8-48fb-85a0-5fc722328ec0
  Args:
    name: Alan Smith
    hair_color: blond
    height_in_meters: 1.83
================================= Tool Message =================================

You have correctly called this tool.
================================== Ai Message ==================================

Detected people.
```

 

Extracting a Person

 
We define two functions: `extract` does not use the `messages` built above, while `extract_with_messages` does.

```python
def extract(model_name, text):
    structured_llm = ChatOllama(model=model_name, temperature=0, verbose=True).with_structured_output(schema=Person)
    user_message = {"role": "user", "content": text}
    response = structured_llm.invoke([user_message])
    return response

def extract_with_messages(model_name, text):
    structured_llm = ChatOllama(model=model_name, temperature=0, verbose=True).with_structured_output(schema=Person)
    user_message = {"role": "user", "content": text}
    response = structured_llm.invoke(messages + [user_message])
    return response
```
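The comparison tables below come from running both functions over both models. A minimal driver sketch (our own illustration, not from the original post; the extractor is injected so the loop can be exercised with a stub instead of a live Ollama server):

```python
def compare(extractors, models, texts):
    """Run each named extractor on each model for each text,
    collecting (text, extractor_name, results_per_model) rows."""
    rows = []
    for text in texts:
        for name, fn in extractors.items():
            rows.append((text, name, [fn(m, text) for m in models]))
    return rows

# With a stub in place of the real model call:
stub = lambda model, text: f"{model}:{text[:3]}"
rows = compare({"extract": stub}, ["llama3.1"], ["Roy is 1.73 meters tall."])
```

In a real run one would pass `{"extract": extract, "extract_with_messages": extract_with_messages}` and both model names.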

  

Now let's test with two strings:

- Roy is 1.73 meters tall and has black hair.

|  | llama3.1 | deepseek-r1-tool-calling |
| --- | --- | --- |
| extract | name='Roy' hair_color='black' height_in_meters=1.73 | name='' hair_color='black' height_in_meters=1.73 |
| extract_with_messages | name='Roy' hair_color='black' height_in_meters=1.73 | name='' hair_color='black' height_in_meters=1.73 |

- John Doe is 1.72 meters tall and has brown hair.

|  | llama3.1 | deepseek-r1-tool-calling |
| --- | --- | --- |
| extract | name='John Doe' hair_color='brown' height_in_meters=1.72 | name='' hair_color='brown' height_in_meters=1.72 |
| extract_with_messages | name='John Doe' hair_color='brown' height_in_meters=1.72 | name='' hair_color='brown' height_in_meters=1.72 |

From these simple extraction tasks, we find that `deepseek` is less accurate than `llama3.1` (it drops the name), and that `few-shot prompting` has little effect on the results.

Summary

From this exercise we can see that `llama3.1` and `deepseek` behave differently when reasoning from examples:
- `deepseek` appears to go through a more elaborate reasoning process
- `llama3.1` is more accurate
- `few-shot prompting` has little effect on such simple tasks
> This comparison may be distorted by how well `langchain` supports each model; it cannot serve as a standard for judging model quality and is for reference only.

Code

All code and related resources for this post have been shared; see:
- [github]
- [gitee]