Integrating LLMs into applications in practice: one-shot output vs. streaming output (chain of thought)
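When building applications on large models, some scenarios call for a single, complete result produced from the prompt, while others call for the model's whole reasoning process followed by the final result.
The articles I found online all disagreed on how to do this, so I tried it myself. The approach below is what worked for me, and I am recording it here.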
One-shot output:
    from openai import OpenAI

    def generate_huoshan(prompt):
        client = OpenAI(
            # Ark API key, masked here; in practice read it from an environment variable
            api_key="**",
            base_url="https://ark.cn-beijing.volces.com/api/v3",
            # Deep-reasoning models can take a long time, so set a generous timeout (30 minutes is recommended)
            timeout=1800,
        )
        response = client.chat.completions.create(
            model="deepseek-r1-250120",
            messages=[
                {"role": "system", "content": "You are a professional market research assistant who needs to accurately obtain retail price information for specified electronic products in a specific market"},
                {"role": "user", "content": prompt},
            ],
            max_tokens=1024,
            temperature=0.6,
            stream=False,  # wait for the complete answer
        )
        answer = response.choices[0].message.content
        return answer.strip()
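The one-shot function above returns only the final answer. If the reasoning text is also wanted without streaming, a minimal sketch follows. It assumes that the non-streaming message carries the same `reasoning_content` extension field that the streaming delta does, which I have not verified for this endpoint. `split_reasoning_and_answer` and the `fake` response object are hypothetical names used only for illustration.

```python
from types import SimpleNamespace

def split_reasoning_and_answer(response):
    """Return (reasoning, answer) from a non-streaming chat completion."""
    message = response.choices[0].message
    # Assumption: reasoning_content is an extension field and may be absent
    reasoning = getattr(message, "reasoning_content", None) or ""
    answer = message.content or ""
    return reasoning.strip(), answer.strip()

# Stand-in object shaped like a real response, so the sketch runs offline
fake = SimpleNamespace(
    choices=[
        SimpleNamespace(
            message=SimpleNamespace(
                content=" The price is 999. ",
                reasoning_content="Compare the listed prices first. ",
            )
        )
    ]
)
reasoning, answer = split_reasoning_and_answer(fake)
print(reasoning)
print(answer)
```

Using `getattr` with a default keeps the helper working when the field is missing, in which case the reasoning part is simply an empty string.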
Streaming output:
    from openai import OpenAI

    def generate_huoshan(prompt):
        client = OpenAI(
            api_key="*",
            base_url="https://ark.cn-beijing.volces.com/api/v3",
            # Deep-reasoning models can take a long time, so set a generous timeout (30 minutes is recommended)
            timeout=1800,
        )
        response = client.chat.completions.create(
            model="deepseek-r1-250120",
            messages=[
                {"role": "system", "content": "You are a professional market research assistant who needs to accurately obtain retail price information for specified electronic products in a specific market"},
                {"role": "user", "content": prompt},
            ],
            max_tokens=1024,
            temperature=0.6,
            stream=True,  # return chunks as they are generated
        )
        for chunk in response:
            # Some chunks may carry no choices; skip them
            if not chunk.choices:
                continue
            delta = chunk.choices[0].delta
            # Chain-of-thought content takes priority
            if hasattr(delta, 'reasoning_content') and delta.reasoning_content:
                yield delta.reasoning_content
                # print(f"[reasoning] {delta.reasoning_content}", end="\n", flush=True)
            # Then the final answer content
            elif delta.content:
                yield delta.content
                # print(f"[final answer] {delta.content}", end="", flush=True)
            else:
                continue
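The streaming function above yields reasoning text and answer text through the same channel, so the caller cannot tell them apart. One possible variation, sketched here under my own assumptions, tags each piece with its kind. `tag_stream` and `fake_chunk` are hypothetical helpers, and the fake chunks only imitate the shape of the real stream so the example runs without a network call.

```python
from types import SimpleNamespace

def tag_stream(chunks):
    """Yield ("reasoning", text) or ("answer", text) for each non-empty delta."""
    for chunk in chunks:
        if not chunk.choices:
            continue  # skip chunks that carry no choices
        delta = chunk.choices[0].delta
        reasoning = getattr(delta, "reasoning_content", None)
        if reasoning:
            yield ("reasoning", reasoning)
        elif delta.content:
            yield ("answer", delta.content)

def fake_chunk(content=None, reasoning=None):
    """Build an object shaped like a streaming chunk (illustration only)."""
    delta = SimpleNamespace(content=content, reasoning_content=reasoning)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

demo = [
    fake_chunk(reasoning="Let me check the market first."),
    SimpleNamespace(choices=[]),  # a chunk with no choices
    fake_chunk(content="The retail price is 999."),
]
tagged = list(tag_stream(demo))
for kind, text in tagged:
    print(kind, text)
```

With tagged pieces, the outer layer could put the kind into its JSON payload and let the front end render the reasoning and the answer in different areas.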
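The outer layer returns the stream like this (generate_text is the caller's own entry point and is not shown here):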
    # Fragment from inside the web view function; it needs `import json`
    # and a Response class (assumed here to be `from flask import Response`)
    def generate_stream():
        try:
            for chunk in generate_text(model_name, prompt):
                # yield chunk
                # Yield one JSON string per line
                yield json.dumps({"msg": "Success", "code": 200, "data": chunk}) + '\n'
        except Exception as e:
            yield json.dumps({"code": 500, "message": str(e)}) + '\n'

    headers = {
        'Content-Type': 'text/event-stream',
        'Cache-Control': 'no-cache',
        'X-Accel-Buffering': 'no',  # ask nginx not to buffer the response
    }
    return Response(generate_stream(), mimetype='text/event-stream', headers=headers)
The front end then parses each line of the response accordingly.
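As a rough illustration of that parsing step, here is a minimal consumer sketch written in Python rather than front-end code. It assumes the payload format produced by the outer layer above: one JSON object per line, with `code` 200 and the text in `data`, or `code` 500 and the error in `message`. `parse_stream_lines` is a hypothetical helper name.

```python
import json

def parse_stream_lines(lines):
    """Yield the text pieces from newline-delimited JSON payloads."""
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        line = raw.strip()
        if not line:
            continue  # ignore blank lines between payloads
        payload = json.loads(line)
        if payload.get("code") != 200:
            # Error payloads use the "message" key in the server code above
            raise RuntimeError(payload.get("message", "unknown error"))
        yield payload["data"]

# Simulated response lines, so the sketch runs without a server
demo_lines = [
    b'{"msg": "Success", "code": 200, "data": "Hel"}\n',
    b"\n",
    '{"msg": "Success", "code": 200, "data": "lo"}\n',
]
text = "".join(parse_stream_lines(demo_lines))
print(text)
```

Against a live endpoint, the same function could be fed from `requests.post(url, json=..., stream=True).iter_lines()`, where `url` and the request body depend on how the route is defined.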

