[Undergraduate Practical Training] Project Summary
Environment Configuration and Maintenance
This covers the model inference and fine-tuning environment, the front-end/back-end runtime environment, and the maintenance work needed when either environment breaks.
Runtime Environment
- OS: Ubuntu 18.04.6 LTS
- CPU: Intel(R) Xeon(R) Platinum 8362 CPU @ 2.80GHz
- GPU: NVIDIA GeForce RTX 3090, CUDA version 12.4
- Conda: 4.10.3
- Node.js: 18.12.0
Model Inference and Fine-tuning Environment
Anaconda is used for environment configuration and migration. The reference projects usually ship a requirements.txt, so the common way to set up is:
pip install -r requirements.txt
However, some packages may conflict and have to be installed by hand, for example by forcing a specific package to be reinstalled:
pip3 install --ignore-installed PyYAML
Since the environment also had to be migrated between machines during the project, Anaconda (via conda pack) solves this well:
# local: pack the current environment into your_env_name.tar.gz
conda pack -n your_env_name
# server: register the target envs directory, unpack the archive there, and activate
conda config --add envs_dirs your_conda_env_path
tar -zxvf your_env_name.tar.gz -C your_conda_env_path
conda init
source ~/.bashrc
conda activate your_env_name
Packing the environment, uploading it, and extracting it on the server reduces the overhead of fetching packages during model setup and makes environment migration more reliable.
Front-end and Back-end Runtime Environment
Node.js, npm, and pnpm are used to manage the front-end and back-end environments, and n is used to switch the node version.
$ n
ο node/18.12.0
Use up/down arrow keys to select a version, return key to install, d to delete, q to quit
$ pnpm bootstrap
> chatglm-web@0.1.0 bootstrap /home/lyc/workspace/DSBTPG-web
> pnpm install && pnpm run common:prepare
Lockfile is up to date, resolution step is skipped
Already up to date
Done in 1.2s
> chatglm-web@0.1.0 common:prepare /home/lyc/workspace/DSBTPG-web
> husky install
fatal: not a git repository (or any of the parent directories): .git
husky - git command not found, skipping install
$ pnpm dev
> chatglm-web@0.1.0 dev /home/lyc/workspace/DSBTPG-web
> vite
Port 3000 is in use, trying another one...
VITE v4.1.4 ready in 331 ms
➜ Local: http://localhost:3001/
➜ Network: http://172.17.0.3:3001/
➜ press h to show help
System Environment Maintenance
nvidia-smi and fuser are used to look after the GPU environment, and netstat is used to manage network ports.
watch -n 2 -d nvidia-smi
fuser -v /dev/nvidia*
netstat -anlpt | grep your_port
# example
$ watch -n 2 -d nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.67 Driver Version: 550.67 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3090 On | 00000000:12:00.0 Off | N/A |
| 38% 28C P8 20W / 350W | 12120MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
$ fuser -v /dev/nvidia*
USER PID ACCESS COMMAND
/dev/nvidia5: root kernel mount /dev/nvidia5
root 47329 F...m pt_main_thread
/dev/nvidiactl: root kernel mount /dev/nvidiactl
root 47329 F...m pt_main_thread
/dev/nvidia-uvm: root kernel mount /dev/nvidia-uvm
root 47329 F...m pt_main_thread
/dev/nvidia-uvm-tools:
root kernel mount /dev/nvidia-uvm-tools
$ kill -9 47329
$ netstat -anlpt | grep 3003
tcp 0 0 0.0.0.0:3003 0.0.0.0:* LISTEN 5354/node
$ kill -9 5354
LLM (ChatGLM) Deployment and Fine-tuning
This part mainly covers the model-running demo and P-Tuning v2 fine-tuning.
- Model-running demo (100+)
- Data generation (300+)
- Model training (100+)
- Action simulation interface example (100+)
- Miscellaneous
Running the Model
The demo that ships with the original model cannot be run directly, so I adapted it to test the fine-tuning results straight from the command line.
# cli_demo.py
import os, sys
import platform
import signal
import torch
import transformers
from transformers import (
    AutoConfig,
    AutoModel,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    HfArgumentParser,
    Seq2SeqTrainingArguments,
    set_seed,
)
from arguments import ModelArguments, DataTrainingArguments
import readline
# LOCAL_PATH = "/home/lyc/workspace/ChatGLM-6B"
# tokenizer = AutoTokenizer.from_pretrained(LOCAL_PATH+"/chatglm-6b", trust_remote_code=True)
# model = AutoModel.from_pretrained(LOCAL_PATH+"/chatglm-6b", trust_remote_code=True).half().cuda()
# model = model.eval()
model = None
tokenizer = None
os_name = platform.system()
clear_command = 'cls' if os_name == 'Windows' else 'clear'
stop_stream = False
def build_prompt(history):
    prompt = "欢迎使用 ChatGLM-6B 模型,输入内容即可进行对话,clear 清空对话历史,stop 终止程序"
    for query, response in history:
        prompt += f"\n\n用户:{query}"
        prompt += f"\n\nChatGLM-6B:{response}"
    return prompt
def signal_handler(signal, frame):
    global stop_stream
    stop_stream = True
def main():
    global model, tokenizer
    parser = HfArgumentParser((ModelArguments))
    if len(sys.argv) == 2 and sys.argv[1].endswith(".json"):
        # If we pass only one argument to the script and it's the path to a json file,
        # let's parse it to get our arguments.
        model_args = parser.parse_json_file(json_file=os.path.abspath(sys.argv[1]))[0]
    else:
        model_args = parser.parse_args_into_dataclasses()[0]
    tokenizer = AutoTokenizer.from_pretrained(
        model_args.model_name_or_path, trust_remote_code=True)
    config = AutoConfig.from_pretrained(
        model_args.model_name_or_path, trust_remote_code=True)
    config.pre_seq_len = model_args.pre_seq_len
    config.prefix_projection = model_args.prefix_projection
    if model_args.ptuning_checkpoint is not None:
        # Load the base model, then overwrite the prefix encoder with the P-Tuning v2 checkpoint.
        print(f"Loading prefix_encoder weight from {model_args.ptuning_checkpoint}")
        model = AutoModel.from_pretrained(model_args.model_name_or_path, config=config, trust_remote_code=True)
        prefix_state_dict = torch.load(os.path.join(model_args.ptuning_checkpoint, "pytorch_model.bin"))
        new_prefix_state_dict = {}
        for k, v in prefix_state_dict.items():
            if k.startswith("transformer.prefix_encoder."):
                new_prefix_state_dict[k[len("transformer.prefix_encoder."):]] = v
        model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)
    else:
        model = AutoModel.from_pretrained(model_args.model_name_or_path, config=config, trust_remote_code=True)
    if model_args.quantization_bit is not None:
        print(f"Quantized to {model_args.quantization_bit} bit")
        model = model.quantize(model_args.quantization_bit)
    if model_args.pre_seq_len is not None:
        # P-tuning v2: run the backbone in fp16, keep the prefix encoder in fp32
        model = model.half().cuda()
        model.transformer.prefix_encoder.float().cuda()
    model = model.eval()
    history = []
    global stop_stream
    print("欢迎使用 ChatGLM-6B 模型,输入内容即可进行对话,clear 清空对话历史,stop 终止程序")
    while True:
        query = input("\n用户:")
        if query.strip() == "stop":
            break
        if query.strip() == "clear":
            history = []
            os.system(clear_command)
            print("欢迎使用 ChatGLM-6B 模型,输入内容即可进行对话,clear 清空对话历史,stop 终止程序")
            continue
        count = 0
        for response, history in model.stream_chat(tokenizer, query, history=history):
            if stop_stream:
                stop_stream = False
                break
            else:
                count += 1
                if count % 8 == 0:
                    os.system(clear_command)
                    print(build_prompt(history), flush=True)
                    signal.signal(signal.SIGINT, signal_handler)
        os.system(clear_command)
        print(build_prompt(history), flush=True)


if __name__ == "__main__":
    main()
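The demo imports ModelArguments from the arguments module of the ChatGLM-6B P-Tuning code, which is not reproduced in this summary. As a reference for readers without that repository, the following is a minimal sketch limited to the fields the demo actually reads; the real ptuning/arguments.py defines more options, so treat this as an illustrative simplification rather than the exact definition.
# arguments.py (simplified sketch -- only the fields cli_demo.py reads are shown;
# the real ChatGLM-6B ptuning/arguments.py defines additional options)
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ModelArguments:
    model_name_or_path: str = field(
        metadata={"help": "Path to the pretrained ChatGLM-6B weights or a hub model id"}
    )
    ptuning_checkpoint: Optional[str] = field(
        default=None, metadata={"help": "Directory containing the P-Tuning v2 prefix checkpoint"}
    )
    pre_seq_len: Optional[int] = field(
        default=None, metadata={"help": "Prefix length used during P-Tuning v2 training"}
    )
    prefix_projection: bool = field(
        default=False, metadata={"help": "Whether a projection MLP is applied to the prefix"}
    )
    quantization_bit: Optional[int] = field(
        default=None, metadata={"help": "Quantize the model to 4 or 8 bits if set"}
    )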
Action Simulation Interface
import datetime

# `model`, `tokenizer`, the action handlers `__act0/__act1/__act2` and `__gen_pdf`
# are defined elsewhere in the back-end module this function is excerpted from.
def simulate(uid, prompt, options, params, history_formatted, chat_tmp_path):
    # init
    data_lines = []; act_img_path = None
    prefix = ["【系统】","【核心】"]; suffix = "\n\n"; key = "KEY"
    act_map = {"零":__act0, "壹":__act1, "贰":__act2}
    template = ["步骤#理解并基于输入拆分函数参数\*语音#%s",
                "步骤#基于函数调用结果进行总结\*结果#%s"]
    # step1: LLM response -> understand & split
    data_lines.append("%s正在与LLM语言核心交互%s" % (prefix[0], suffix))
    response0, _ = model.chat(tokenizer, template[0] % prompt, history_formatted,
                              max_length=params['max_length'], top_p=params['top_p'], temperature=params['temperature'])
    # WARN:TEST
    response0 = response0 + "KEY壹;济南"
    data_lines.append("%s%s%s" % (prefix[1], response0, suffix))
    # step2: text split & action select( N=3 )
    kidx_st = response0.find(key)
    uf_flag = kidx_st == -1
    if not uf_flag:
        kidx_ed = response0.find(";", kidx_st)
        uf_flag = kidx_ed == -1
    if uf_flag:
        data_lines.append("%sLLM语言核心解析函数失败%s" % (prefix[0], suffix))
    else:
        act_info, act_response, act_text, act_img_path = act_map[response0[kidx_st+3:kidx_ed]](response0[kidx_ed:])
        data_lines.append("%s%s%s" % (prefix[0], act_info, suffix))
        data_lines.append("%s%s%s" % (prefix[0], "默认查询日期:" + str(datetime.date.today()), suffix))
        data_lines.append("%s%s%s" % (prefix[0], act_response, suffix))
        # step3: LLM response -> analysis
        response1, _ = model.chat(tokenizer, template[1] % act_response, history_formatted,
                                  max_length=params['max_length'], top_p=params['top_p'], temperature=params['temperature'])
        data_lines.append("%s%s%s" % (prefix[1], response1, suffix))
        # step4: pdf generate
        if __gen_pdf(uid, act_text):
            data_lines.append("%s任务报告输出完成%s" % (prefix[1], suffix))
        else:
            data_lines.append("%s任务报告生成失败%s" % (prefix[1], suffix))
    # step5: write to the tmp_file
    with open(chat_tmp_path, mode="w", encoding='utf-8') as f:
        f.writelines(data_lines)
    # step6: send back sim_img_path
    return act_img_path
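simulate() selects a handler from act_map and unpacks a 4-tuple of (act_info, act_response, act_text, act_img_path) from it. The real handlers (__act0/__act1/__act2) and __gen_pdf are not included in this summary, so the sketch below is purely hypothetical: it only illustrates the return contract a handler has to satisfy, with made-up numbers standing in for the work-order and detection results.
# Hypothetical action handler, sketched only to show the 4-tuple contract that
# simulate() unpacks; the real __act1 (work-site head-count query) is not shown
# here and its data sources and texts differ.
def __act1(arg_text):
    # arg_text is what follows the "KEY壹" marker, e.g. ";济南"
    city = arg_text[1:].strip()  # drop the leading separator, keep the city name
    act_info = "已选择行为: 人数检测 (%s)" % city
    # placeholder numbers; the real handler would query the work-order system and
    # the on-site object-detection service for this city
    expected, detected = 10, 9
    act_response = "工单查询#应到%d人\*目标检测#实到%d人\*任务类型#人数检测" % (expected, detected)
    act_text = "人数检测结果: 应到%d人, 实到%d人" % (expected, detected)
    act_img_path = None  # path of the annotated detection image, if one is produced
    return act_info, act_response, act_text, act_img_path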
Data Generation
# v1
import json
import random
train_f = True
# train_f = False
# 20 * 2 * 12 * 2 = 960
data_num = 20 if train_f else 10
tenplate_num = 12
task_num = 3
max_pnum = 50
file_path = "./train.json" if train_f else "./dev.json"
data_list = []
const_content_prompt_0 = "步骤#问候\*语音#%s"
const_content_prompt_1 = "步骤#理解并基于输入拆分函数参数\*语音#%s"
const_content_prompt_2 = "步骤#基于函数调用结果进行总结\*工单查询#应到%s人\*目标检测#实到%s人\*任务类型#%s"
place_num = 12
place_db = ["哈尔滨","长春","沈阳","石家庄","兰州","西宁","西安","郑州","济南","太原","乌鲁木齐","呼和浩特"]
content_template_0 = [
"你好", "您好", "早上好", "中午好", "晚上好", "好久不见", "你是谁",
"你叫什么名字", "自我介绍下", "简单自我介绍下", "你能做什么", "你的用途",
]
summary_template_0 = "您好,我是电网大模型的语言核心,用于拆解执行行为库并向您汇报结果"
content_template_1 = [
"请尝试查看%s的工地的人数情况",
"输出工地%s人数情况",
"尝试查%s的人数",
"查查在%s工地人数嘛",
"开始检查%s的安全状况",
"试试看输出%s携带安全帽的情况",
"进行在%s的安全帽检测工作",
"检测在%s的安全帽佩戴情况啊",
"检测在%s的电笔配备情况",
"检查%s有没有设备异常",
"请输出%s工地的设备检查结果",
"告诉我%s配备电笔检查结果",
]
summary_template_1 = [
"因为出现地名为%s, 所以函数一参数为%s; 因为出现人数要求, 所以函数二参数为人数检测",
"因为出现地名为%s, 所以函数一参数为%s; 因为出现安全要求, 所以函数二参数为安全检查",
"因为出现地名为%s, 所以函数一参数为%s; 因为出现设备要求, 所以函数二参数为设备检查",
]
summary_template_2 = [
"[人数检测] 应到%d人, 实到%d人, 无人缺勤, 正常出工",
"[人数检测] 应到%d人, 实到%d人, 有人缺勤",
"[人数检测] 应到%d人, 实到%d人, 人数异常, 可能存在非法进入",
"[人数检测] 应到%d人, 实到%d人, 无工单出工, 可能存在非法进入",
"[安全检查] 实到%d人, 均佩戴安全帽",
"[安全检查] 实到%d人, %d人佩戴安全帽, 存在安全隐患",
"[安全检查] 实到%d人, %d人佩戴安全帽, 存在安全隐患",
"[安全检查] 现场无人施工",
"[设备检查] 实到%d人, 均正常携带设备",
"[设备检查] 实到%d人, %d人未携带设备",
"[设备检查] 实到%d人, %d人未携带设备",
"[设备检查] 现场无人施工",
]
content_template_2 = ["人数检测","安全检查","设备检查",]
mask_num = 7
mask_template = ["啊","哦","吗","嘛","哈","是","阿",]
for i in range(data_num):
    random.seed()
    # greeting samples: cycle through the greeting templates with i
    data = {}
    data["content"] = const_content_prompt_0 % content_template_0[i%12]
    data["summary"] = summary_template_0
    data_list.append(data)
    if train_f:
        # augment the training set with a randomly "masked" (noised) copy
        data = dict(data)  # fresh dict so the clean sample appended above is preserved
        tmp_s = list(content_template_0[i%12])
        tmp_s[random.randint(0,len(tmp_s)-1)] = mask_template[random.randint(0,mask_num-1)]
        data["content"] = const_content_prompt_0 % "".join(tmp_s)
        data_list.append(data)
    for j in range(tenplate_num):
        data = {}
        cur_place = place_db[random.randint(0,place_num-1)]
        data["content"] = const_content_prompt_1 % (content_template_1[j] % (cur_place))
        # p1:place, p2:task; the templates are grouped 4 per task type, hence j//4
        data["summary"] = summary_template_1[j//4] % (cur_place, cur_place)
        data_list.append(data)
        if train_f:
            data = dict(data)  # fresh dict so the clean sample appended above is preserved
            tmp_s = list(content_template_1[j] % (cur_place))
            tmp_s[random.randint(0,len(tmp_s)-1)] = mask_template[random.randint(0,mask_num-1)]
            data["content"] = const_content_prompt_1 % "".join(tmp_s)
            data_list.append(data)
    # summarization samples for the three task types
    cur_n1 = random.randint(2,max_pnum-1)
    cur_n2 = random.randint(1,max_pnum)
    cur_n3 = random.randint(1,cur_n1-1)
    cur_n4 = random.randint(cur_n1+1,max_pnum)
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n1, content_template_2[0])
    data["summary"] = (summary_template_2[0] % (cur_n1,cur_n1))
    data_list.append(data)
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n3, content_template_2[0])
    data["summary"] = (summary_template_2[1] % (cur_n1,cur_n3))
    data_list.append(data)
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n4, content_template_2[0])
    data["summary"] = (summary_template_2[2] % (cur_n1,cur_n4))
    data_list.append(data)
    data = {}
    data["content"] = const_content_prompt_2 % (0, cur_n2,content_template_2[0])
    data["summary"] = (summary_template_2[3] % (0,cur_n2))
    data_list.append(data)
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n1, content_template_2[1])
    data["summary"] = (summary_template_2[4] % (cur_n1))
    data_list.append(data)
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n3, content_template_2[1])
    data["summary"] = (summary_template_2[5] % (cur_n1,cur_n3))
    data_list.append(data)
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n3, content_template_2[1])
    data["summary"] = (summary_template_2[6] % (cur_n1,cur_n3))
    data_list.append(data)
    data = {}
    data["content"] = const_content_prompt_2 % (0, 0, content_template_2[1])
    data["summary"] = (summary_template_2[7])
    data_list.append(data)
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n1, content_template_2[2])
    data["summary"] = (summary_template_2[8] % (cur_n1))
    data_list.append(data)
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n3, content_template_2[2])
    data["summary"] = (summary_template_2[9] % (cur_n1,cur_n1 - cur_n3))
    data_list.append(data)
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n3, content_template_2[2])
    data["summary"] = (summary_template_2[10] % (cur_n1,cur_n1 - cur_n3))
    data_list.append(data)
    data = {}
    data["content"] = const_content_prompt_2 % (0, 0, content_template_2[2])
    data["summary"] = (summary_template_2[11])
    data_list.append(data)
random.shuffle(data_list)
with open(file_path, 'w', encoding='utf-8') as f:
json.dump(data_list, f, indent = 4, sort_keys = True, ensure_ascii=False)
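Both generators write a flat list of records, each pairing a content prompt with a summary target (the fields consumed later during fine-tuning). As a quick sanity check after generation (a small sketch, not part of the original scripts), the file can be reloaded and inspected:
# check_v1.py -- quick sanity check of the generated file (not part of the original
# pipeline); point it at train.json or dev.json as needed
import json

with open("./train.json", encoding="utf-8") as f:
    records = json.load(f)

assert all(set(r) == {"content", "summary"} for r in records), "unexpected record fields"
print("samples:", len(records))
print("content:", records[0]["content"])
print("summary:", records[0]["summary"])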
# v2
import json
import random
import transformers
from transformers import (
    AutoConfig,
    AutoModel,
    AutoTokenizer,
)
# train_f = True
train_f = False
data_num = 36 if train_f else 12
tenplate_num = 12
file_path = "./trainVX.json" if train_f else "./devVX.json"
data_list = []
const_content_prompt = '要求#请根据输入准确提取出地点信息和模式信息,其中模式在[人数,安全帽,设备,问候,未知]列表中有且只有一个,以{"city":"提取出的地点信息","module":"提取出的模式信息"}形式返回输入#INPUT:%s, OUTPUT:'
place_num = 12
place_db = ["哈尔滨","长春","沈阳","石家庄","兰州","西宁","西安","郑州","济南","太原","乌鲁木齐","呼和浩特"]
content_template_0 = [
"你好", "您好", "早上好", "中午好", "晚上好", "好久不见", "你是谁",
"你叫什么名字", "自我介绍下", "简单自我介绍下", "你能做什么", "你的用途",
]
summary_template_0 = '{"city":"", "module":"问候"}'
content_template_1 = [
"请尝试查看%s的工地的人数情况",
"输出工地%s人数情况",
"尝试查%s的人数",
"查查在%s工地人数嘛",
"开始检查%s的安全状况",
"试试看输出%s携带安全帽的情况",
"进行在%s的安全帽检测工作",
"检测在%s的安全帽佩戴情况啊",
"检测在%s的安全绳配备情况",
"检查%s有没有设备异常",
"请输出%s工地的设备检查结果",
"告诉我%s配备安全绳检查结果",
]
summary_template_1 = ['{"city":"%s", "module":"人数"}',
'{"city":"%s", "module":"安全帽"}',
'{"city":"%s", "module":"设备"}']
mask_num = 7
mask_template = ["啊","哦","吗","嘛","哈","是","阿",]
for i in range(data_num):
    random.seed()
    data = {}
    data["content"] = const_content_prompt % content_template_0[i%12]
    data["summary"] = summary_template_0
    data_list.append(data)
    for j in range(tenplate_num):
        data = {}
        cur_place = place_db[random.randint(0,place_num-1)]
        data["content"] = const_content_prompt % (content_template_1[j] % (cur_place))
        # p1:place, p2:task
        data["summary"] = summary_template_1[j//4] % cur_place
        data_list.append(data)
# load model
model_path = "/home/lyc/workspace/ChatGLM-6B/chatglm-6b"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().cuda()
model.eval()
query = "请任意说一句之前没有说过的话"
history = []
# gen unknown content
for i in range(data_num):
    response, history = model.chat(tokenizer, query, history=history)
    data = {}
    data["content"] = const_content_prompt % response
    data["summary"] = '{"city":"", "module":"未知"}'
    data_list.append(data)
random.shuffle(data_list)
with open(file_path, 'w', encoding='utf-8') as f:
json.dump(data_list, f, indent = 4, sort_keys = True, ensure_ascii=False)
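Since the v2 targets are themselves small JSON objects, it is also worth verifying that every generated summary parses and that the module field stays inside the list named in the prompt. A minimal sketch of such a check (again not part of the original scripts; it assumes the file paths used above):
# check_v2.py -- sanity check for the v2 data (not part of the original scripts)
import json

ALLOWED_MODULES = {"人数", "安全帽", "设备", "问候", "未知"}

with open("./devVX.json", encoding="utf-8") as f:
    records = json.load(f)

for r in records:
    target = json.loads(r["summary"])            # every target must be valid JSON
    assert set(target) == {"city", "module"}     # exactly the two expected fields
    assert target["module"] in ALLOWED_MODULES   # module comes from the fixed list
print("all", len(records), "records are well-formed")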