[Undergraduate Project Practical Training] Project Summary

Environment Configuration and Maintenance

This covers the model inference / fine-tuning environment, the front-end and back-end runtime environments, and maintenance when something in those environments breaks.

Runtime Environment

  • OS: Ubuntu 18.04.6 LTS

  • CPU: Intel(R) Xeon(R) Platinum 8362 CPU @ 2.80GHz

  • GPU: NVIDIA GeForce RTX 3090, CUDA version 12.4

  • Conda: 4.10.3

  • Node.js: 18.12.0

Model Inference and Fine-tuning Environment

Anaconda is used for environment setup and migration. The reference projects usually ship a requirements.txt, so the common way to install dependencies is:

pip install -r requirements.txt

However, some packages may conflict and need to be installed manually, for example by forcing a specific package to be reinstalled:

pip3 install --ignore-installed PyYAML

Because the environment also had to be migrated to another machine during the project, Anaconda (together with conda-pack) handles this well:

# local: pack the environment (requires conda-pack: pip install conda-pack)
conda pack -n your_env_name

# server: register the env directory, unpack the archive, then activate
conda config --add envs_dirs your_conda_env_path
tar -zxvf your_env_name.tar.gz -C your_conda_env_path
conda init
source ~/.bashrc
conda activate your_env_name

Packing, uploading, and unpacking the environment this way avoids repeatedly downloading packages during model setup and makes the migration itself more stable.

Front-end / Back-end Runtime Environment

Node.js, npm, and pnpm are used to manage the front-end and back-end environments, and n is used to switch Node versions.

$ n

ο node/18.12.0

Use up/down arrow keys to select a version, return key to install, d to delete, q to quit

$ pnpm bootstrap

> chatglm-web@0.1.0 bootstrap /home/lyc/workspace/DSBTPG-web
> pnpm install && pnpm run common:prepare

Lockfile is up to date, resolution step is skipped
Already up to date
Done in 1.2s

> chatglm-web@0.1.0 common:prepare /home/lyc/workspace/DSBTPG-web
> husky install

fatal: not a git repository (or any of the parent directories): .git
husky - git command not found, skipping install

$ pnpm dev

> chatglm-web@0.1.0 dev /home/lyc/workspace/DSBTPG-web
> vite

Port 3000 is in use, trying another one...

  VITE v4.1.4  ready in 331 ms

  ➜  Local:   http://localhost:3001/
  ➜  Network: http://172.17.0.3:3001/
  ➜  press h to show help

System Environment Maintenance

nvidia-smi and fuser are used to keep the GPU environment healthy, and netstat to inspect and free network ports.

watch -n 2 -d nvidia-smi
fuser -v /dev/nvidia*

netstat -anlpt | grep your_port
# example
$ watch -n 2 -d nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.67                 Driver Version: 550.67         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        On  |   00000000:12:00.0 Off |                  N/A |
| 38%   28C    P8             20W /  350W |   12120MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+

$ fuser -v /dev/nvidia*
                     USER        PID ACCESS COMMAND
/dev/nvidia5:        root     kernel mount /dev/nvidia5
                     root      47329 F...m pt_main_thread
/dev/nvidiactl:      root     kernel mount /dev/nvidiactl
                     root      47329 F...m pt_main_thread
/dev/nvidia-uvm:     root     kernel mount /dev/nvidia-uvm
                     root      47329 F...m pt_main_thread
/dev/nvidia-uvm-tools:
                     root     kernel mount /dev/nvidia-uvm-tools
                     
$ kill -9 47329

$ netstat -anlpt | grep 3003
tcp        0      0 0.0.0.0:3003            0.0.0.0:*               LISTEN      5354/node

$ kill -9 5354

LLM (ChatGLM) Deployment and Fine-tuning

This part mainly covers the model inference DEMO and P-Tuning v2 fine-tuning. The work breaks down as follows:

  • Model inference DEMO (100+)
  • Data generation (300+)
  • Model training (100+), see the training-command sketch right after this list
  • Action simulation interface example (100+)
  • Miscellaneous
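The training code itself is not reproduced in this post; it follows the P-Tuning v2 example shipped with ChatGLM-6B. Below is a minimal launch sketch, assuming the standard ptuning/main.py entry point and the train.json / dev.json files produced by the data-generation scripts later in this post; all paths and hyperparameter values are illustrative rather than the exact ones used in the project.

# P-Tuning v2 fine-tuning, modeled on ChatGLM-6B's ptuning/train.sh (placeholders only)
PRE_SEQ_LEN=128
LR=2e-2

CUDA_VISIBLE_DEVICES=0 python3 main.py \
    --do_train \
    --train_file train.json \
    --validation_file dev.json \
    --prompt_column content \
    --response_column summary \
    --overwrite_cache \
    --model_name_or_path /path/to/chatglm-6b \
    --output_dir output/your-checkpoint-dir \
    --overwrite_output_dir \
    --max_source_length 128 \
    --max_target_length 128 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --max_steps 3000 \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate $LR \
    --pre_seq_len $PRE_SEQ_LEN \
    --quantization_bit 4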

Running the Model

The DEMO shipped with the original model would not run as-is, so I adapted it so that the fine-tuned model can be tested directly from the command line.

# cli_demo.py
import os, sys
import platform
import signal

import torch
import transformers
from transformers import (
    AutoConfig,
    AutoModel,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    HfArgumentParser,
    Seq2SeqTrainingArguments,
    set_seed,
)

from arguments import ModelArguments, DataTrainingArguments

import readline  # enables line editing and history for input()

# LOCAL_PATH = "/home/lyc/workspace/ChatGLM-6B"

# tokenizer = AutoTokenizer.from_pretrained(LOCAL_PATH+"/chatglm-6b", trust_remote_code=True)
# model = AutoModel.from_pretrained(LOCAL_PATH+"/chatglm-6b", trust_remote_code=True).half().cuda()
# model = model.eval()

model = None
tokenizer = None

os_name = platform.system()
clear_command = 'cls' if os_name == 'Windows' else 'clear'
stop_stream = False


def build_prompt(history):
    prompt = "欢迎使用 ChatGLM-6B 模型,输入内容即可进行对话,clear 清空对话历史,stop 终止程序"
    for query, response in history:
        prompt += f"\n\n用户:{query}"
        prompt += f"\n\nChatGLM-6B:{response}"
    return prompt


def signal_handler(signal, frame):
    global stop_stream
    stop_stream = True


def main():
    global model, tokenizer

    parser = HfArgumentParser(ModelArguments)
    if len(sys.argv) == 2 and sys.argv[1].endswith(".json"):
        # If we pass only one argument to the script and it's the path to a json file,
        # let's parse it to get our arguments.
        model_args = parser.parse_json_file(json_file=os.path.abspath(sys.argv[1]))[0]
    else:
        model_args = parser.parse_args_into_dataclasses()[0]

    tokenizer = AutoTokenizer.from_pretrained(
        model_args.model_name_or_path, trust_remote_code=True)
    config = AutoConfig.from_pretrained(
        model_args.model_name_or_path, trust_remote_code=True)

    config.pre_seq_len = model_args.pre_seq_len
    config.prefix_projection = model_args.prefix_projection

    if model_args.ptuning_checkpoint is not None:
        print(f"Loading prefix_encoder weight from {model_args.ptuning_checkpoint}")
        model = AutoModel.from_pretrained(model_args.model_name_or_path, config=config, trust_remote_code=True)
        prefix_state_dict = torch.load(os.path.join(model_args.ptuning_checkpoint, "pytorch_model.bin"))
        new_prefix_state_dict = {}
        for k, v in prefix_state_dict.items():
            if k.startswith("transformer.prefix_encoder."):
                new_prefix_state_dict[k[len("transformer.prefix_encoder."):]] = v
        model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)
    else:
        model = AutoModel.from_pretrained(model_args.model_name_or_path, config=config, trust_remote_code=True)

    if model_args.quantization_bit is not None:
        print(f"Quantized to {model_args.quantization_bit} bit")
        model = model.quantize(model_args.quantization_bit)

    if model_args.pre_seq_len is not None:
        # P-tuning v2
        model = model.half().cuda()
        model.transformer.prefix_encoder.float().cuda()
    
    model = model.eval()

    history = []
    global stop_stream
    print("欢迎使用 ChatGLM-6B 模型,输入内容即可进行对话,clear 清空对话历史,stop 终止程序")
    while True:
        query = input("\n用户:")
        if query.strip() == "stop":
            break
        if query.strip() == "clear":
            history = []
            os.system(clear_command)
            print("欢迎使用 ChatGLM-6B 模型,输入内容即可进行对话,clear 清空对话历史,stop 终止程序")
            continue
        count = 0
        for response, history in model.stream_chat(tokenizer, query, history=history):
            if stop_stream:
                stop_stream = False
                break
            else:
                count += 1
                if count % 8 == 0:
                    os.system(clear_command)
                    print(build_prompt(history), flush=True)
                    signal.signal(signal.SIGINT, signal_handler)
        os.system(clear_command)
        print(build_prompt(history), flush=True)


if __name__ == "__main__":
    main()
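A typical way to launch this demo against a P-Tuning v2 checkpoint is shown below; the model path and checkpoint directory are placeholders rather than the project's actual paths, and the flags come from the standard ModelArguments of the ChatGLM-6B ptuning example.

python3 cli_demo.py \
    --model_name_or_path /path/to/chatglm-6b \
    --ptuning_checkpoint output/your-checkpoint-dir/checkpoint-3000 \
    --pre_seq_len 128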

Action Simulation Interface
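The back end exposes a simulate entry point, excerpted below. It relies on module-level state (model, tokenizer, datetime) and helper functions (__act0/__act1/__act2, __gen_pdf) defined elsewhere in the same file. It walks the LLM through four stages: understanding and splitting the request, dispatching to one of the registered actions, summarizing the action result, and generating a PDF report, while streaming intermediate status lines to a temporary chat file and returning the path of any image the action produced.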

def simulate(uid, prompt, options, params, history_formatted, chat_tmp_path):
    # init
    data_lines = []; act_img_path = None
    prefix = ["【系统】","【核心】"]; suffix = "\n\n"; key = "KEY"
    act_map = {"零":__act0, "壹":__act1, "贰":__act2}
    template = ["步骤#理解并基于输入拆分函数参数\*语音#%s",
                "步骤#基于函数调用结果进行总结\*结果#%s"]

    # step1: LLM response -> understand & split
    data_lines.append("%s正在与LLM语言核心交互%s" %  (prefix[0], suffix))
    response0, _ = model.chat(tokenizer, template[0] % prompt, history_formatted, 
        max_length=params['max_length'], top_p=params['top_p'], temperature=params['temperature'])
    
    # WARN: test-only override forcing the parsed action key; remove outside testing
    response0 = response0 + "KEY壹;济南"
    
    data_lines.append("%s%s%s" %  (prefix[1], response0, suffix))

    # step2: text split & action select( N=3 )
    kidx_st = response0.find(key)
    uf_flag = kidx_st==-1
    if not uf_flag:
        kidx_ed = response0.find(";", kidx_st)
        uf_flag = kidx_ed==-1
    if uf_flag:
        data_lines.append("%sLLM语言核心解析函数失败%s" %  (prefix[0], suffix))
    else:
        act_info, act_response, act_text, act_img_path = act_map[response0[kidx_st+3:kidx_ed]](response0[kidx_ed:])
        data_lines.append("%s%s%s" %  (prefix[0], act_info, suffix))
        data_lines.append("%s%s%s" %  (prefix[0], "默认查询日期:" + str(datetime.date.today()), suffix))
        data_lines.append("%s%s%s" %  (prefix[0], act_response, suffix))

    # step3: LLM response -> analysis
        response1, _ = model.chat(tokenizer, template[1] % act_response, history_formatted, 
            max_length=params['max_length'], top_p=params['top_p'], temperature=params['temperature'])
        data_lines.append("%s%s%s" %  (prefix[1], response1, suffix))

    # step4: pdf generate
        if __gen_pdf(uid, act_text):
            data_lines.append("%s任务报告输出完成%s" %  (prefix[1], suffix))
        else:
            data_lines.append("%s任务报告生成失败%s" %  (prefix[1], suffix))

    # step5: write to the tmp_file
    with open(chat_tmp_path, mode="w", encoding='utf-8') as f:
        f.writelines(data_lines)

    # step6: send back sim_img_path
    return act_img_path
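The contract between the two chat calls is intentionally simple: the first response is expected to contain the literal marker KEY followed by a one-character action index and its argument, separated by a semicolon (as in the hard-coded test string above). That key is looked up in act_map to choose the action; if the marker or the separator is missing, the function reports a parse failure instead of invoking any action.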

Data Generation
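Two generations of the data-generation script follow. v1 fills fixed templates into the 步骤#...\*语音#... instruction format and emits matched content/summary pairs, plus randomly character-masked variants for the training split. v2 switches to a JSON-extraction prompt that asks the model to return {"city": ..., "module": ...}, and additionally calls the base ChatGLM-6B model to synthesize inputs for the 未知 (unknown) class.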

# v1
import json
import random

train_f = True
# train_f = False
# 20 * 2 * 12 * 2 = 960
data_num = 20 if train_f else 10
tenplate_num = 12
task_num = 3
max_pnum = 50
file_path = "./train.json" if train_f else "./dev.json"
data_list = []

const_content_prompt_0 = "步骤#问候\*语音#%s"

const_content_prompt_1 = "步骤#理解并基于输入拆分函数参数\*语音#%s"

const_content_prompt_2 = "步骤#基于函数调用结果进行总结\*工单查询#应到%s人\*目标检测#实到%s人\*任务类型#%s"

place_num = 12
place_db = ["哈尔滨","长春","沈阳","石家庄","兰州","西宁","西安","郑州","济南","太原","乌鲁木齐","呼和浩特"]

content_template_0 = [
    "你好", "您好", "早上好", "中午好", "晚上好", "好久不见", "你是谁",
    "你叫什么名字", "自我介绍下", "简单自我介绍下", "你能做什么", "你的用途",
]

summary_template_0 = "您好,我是电网大模型的语言核心,用于拆解执行行为库并向您汇报结果"

content_template_1 = [
    "请尝试查看%s的工地的人数情况",
    "输出工地%s人数情况",
    "尝试查%s的人数",
    "查查在%s工地人数嘛",

    "开始检查%s的安全状况",
    "试试看输出%s携带安全帽的情况",
    "进行在%s的安全帽检测工作",
    "检测在%s的安全帽佩戴情况啊",

    "检测在%s的电笔配备情况",
    "检查%s有没有设备异常",
    "请输出%s工地的设备检查结果",
    "告诉我%s配备电笔检查结果",
]

summary_template_1 = [
    "因为出现地名为%s, 所以函数一参数为%s; 因为出现人数要求, 所以函数二参数为人数检测",
    "因为出现地名为%s, 所以函数一参数为%s; 因为出现安全要求, 所以函数二参数为安全检查",
    "因为出现地名为%s, 所以函数一参数为%s; 因为出现设备要求, 所以函数二参数为设备检查",
]

summary_template_2 = [
    "[人数检测] 应到%d人, 实到%d人, 无人缺勤, 正常出工",
    "[人数检测] 应到%d人, 实到%d人, 有人缺勤",
    "[人数检测] 应到%d人, 实到%d人, 人数异常, 可能存在非法进入",
    "[人数检测] 应到%d人, 实到%d人, 无工单出工, 可能存在非法进入",
    "[安全检查] 实到%d人, 均佩戴安全帽",
    "[安全检查] 实到%d人, %d人佩戴安全帽, 存在安全隐患",
    "[安全检查] 实到%d人, %d人佩戴安全帽, 存在安全隐患",
    "[安全检查] 现场无人施工",
    "[设备检查] 实到%d人, 均正常携带设备",
    "[设备检查] 实到%d人, %d人未携带设备",
    "[设备检查] 实到%d人, %d人未携带设备",
    "[设备检查] 现场无人施工",
]

content_template_2 = ["人数检测","安全检查","设备检查",]

mask_num = 7
mask_template = ["啊","哦","吗","嘛","哈","是","阿",]

for i in range(data_num):
    random.seed()
    data = {}
    # index by the loop variable so each iteration uses a different greeting
    data["content"] = const_content_prompt_0 % content_template_0[i%12]
    data["summary"] = summary_template_0
    data_list.append(data)
    if train_f:
        # masked variant built as a fresh dict, so the clean sample above is kept as well
        tmp_s = list(content_template_0[i%12])
        tmp_s[random.randint(0,len(tmp_s)-1)] = mask_template[random.randint(0,mask_num-1)]
        data = {"content": const_content_prompt_0 % "".join(tmp_s),
                "summary": summary_template_0}
        data_list.append(data)
    for j in range(tenplate_num):
        data = {}
        cur_place = place_db[random.randint(0,place_num-1)]
        data["content"] = const_content_prompt_1 % (content_template_1[j] % (cur_place))
        # p1: place, p2: place; the task type is determined by the template group (4 per task)
        data["summary"] = summary_template_1[j//4] % (cur_place, cur_place)
        data_list.append(data)
        if train_f:
            # masked variant built as a fresh dict, keeping the clean sample appended above
            tmp_s = list(content_template_1[j] % (cur_place))
            tmp_s[random.randint(0,len(tmp_s)-1)] = mask_template[random.randint(0,mask_num-1)]
            data = {"content": const_content_prompt_1 % "".join(tmp_s),
                    "summary": summary_template_1[j//4] % (cur_place, cur_place)}
            data_list.append(data)
    cur_n1 = random.randint(2,max_pnum-1)
    cur_n2 = random.randint(1,max_pnum)
    cur_n3 = random.randint(1,cur_n1-1)
    cur_n4 = random.randint(cur_n1+1,max_pnum)
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n1, content_template_2[0])
    data["summary"] = (summary_template_2[0] % (cur_n1,cur_n1))
    data_list.append(data)
    
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n3, content_template_2[0])
    data["summary"] = (summary_template_2[1] % (cur_n1,cur_n3))
    data_list.append(data)
    
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n4, content_template_2[0])
    data["summary"] = (summary_template_2[2] % (cur_n1,cur_n4))
    data_list.append(data)
    
    data = {}
    data["content"] = const_content_prompt_2 % (0, cur_n2,content_template_2[0])
    data["summary"] = (summary_template_2[3] % (0,cur_n2))
    data_list.append(data)

    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n1, content_template_2[1])
    data["summary"] = (summary_template_2[4] % (cur_n1))
    data_list.append(data)
    
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n3, content_template_2[1])
    data["summary"] = (summary_template_2[5] % (cur_n1,cur_n3))
    data_list.append(data)
    
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n3, content_template_2[1])
    data["summary"] = (summary_template_2[6] % (cur_n1,cur_n3))
    data_list.append(data)
    
    data = {}
    data["content"] = const_content_prompt_2 % (0, 0, content_template_2[1])
    data["summary"] = (summary_template_2[7])
    data_list.append(data)
    
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n1, content_template_2[2])
    data["summary"] = (summary_template_2[8] % (cur_n1))
    data_list.append(data)
    
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n3, content_template_2[2])
    data["summary"] = (summary_template_2[9] % (cur_n1,cur_n1 - cur_n3))
    data_list.append(data)
    
    data = {}
    data["content"] = const_content_prompt_2 % (cur_n1, cur_n3, content_template_2[2])
    data["summary"] = (summary_template_2[10] % (cur_n1,cur_n1 - cur_n3))
    data_list.append(data)
    
    data = {}
    data["content"] = const_content_prompt_2 % (0, 0, content_template_2[2])
    data["summary"] = (summary_template_2[11])
    data_list.append(data)
    

random.shuffle(data_list)

with open(file_path, 'w', encoding='utf-8') as f:
    json.dump(data_list, f, indent = 4, sort_keys = True, ensure_ascii=False)
# v2
import json
import random
import transformers
from transformers import (
    AutoConfig,
    AutoModel,
    AutoTokenizer
)

# train_f = True
train_f = False
data_num = 36 if train_f else 12
tenplate_num = 12
file_path = "./trainVX.json" if train_f else "./devVX.json"
data_list = []

const_content_prompt = '要求#请根据输入准确提取出地点信息和模式信息,其中模式在[人数,安全帽,设备,问候,未知]列表中有且只有一个,以{"city":"提取出的地点信息","module":"提取出的模式信息"}形式返回输入#INPUT:%s, OUTPUT:'

place_num = 12
place_db = ["哈尔滨","长春","沈阳","石家庄","兰州","西宁","西安","郑州","济南","太原","乌鲁木齐","呼和浩特"]

content_template_0 = [
    "你好", "您好", "早上好", "中午好", "晚上好", "好久不见", "你是谁",
    "你叫什么名字", "自我介绍下", "简单自我介绍下", "你能做什么", "你的用途",
]

summary_template_0 = '{"city":"", "module":"问候"}'

content_template_1 = [
    "请尝试查看%s的工地的人数情况",
    "输出工地%s人数情况",
    "尝试查%s的人数",
    "查查在%s工地人数嘛",

    "开始检查%s的安全状况",
    "试试看输出%s携带安全帽的情况",
    "进行在%s的安全帽检测工作",
    "检测在%s的安全帽佩戴情况啊",

    "检测在%s的安全绳配备情况",
    "检查%s有没有设备异常",
    "请输出%s工地的设备检查结果",
    "告诉我%s配备安全绳检查结果",
]

summary_template_1 = ['{"city":"%s", "module":"人数"}',
                      '{"city":"%s", "module":"安全帽"}',
                      '{"city":"%s", "module":"设备"}']

mask_num = 7
mask_template = ["啊","哦","吗","嘛","哈","是","阿",]

for i in range(data_num):
    random.seed()
    data = {}
    data["content"] = const_content_prompt % content_template_0[i%12]
    data["summary"] = summary_template_0
    data_list.append(data)
    for j in range(tenplate_num):
        data = {}
        cur_place = place_db[random.randint(0,place_num-1)]
        data["content"] = const_content_prompt % (content_template_1[j] % (cur_place))
        # p1:place, p2:task
        data["summary"] = summary_template_1[j//4] % cur_place
        data_list.append(data)

# load model
model_path = "/home/lyc/workspace/ChatGLM-6B/chatglm-6b"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().cuda()
model.eval()

query = "请任意说一句之前没有说过的话"
history = []

# gen unknown content
for i in range(data_num):
    response, history = model.chat(tokenizer, query, history=history)
    data = {}
    data["content"] = const_content_prompt % response
    data["summary"] = '{"city":"", "module":"未知"}'
    data_list.append(data)

random.shuffle(data_list)

with open(file_path, 'w', encoding='utf-8') as f:
    json.dump(data_list, f, indent = 4, sort_keys = True, ensure_ascii=False)