2025-02-02 - General Artificial Intelligence Technology - General-Purpose Large Model ChatGlm4-6b - 流雨声

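Abstract

2025-02-02, Sunday, Hangzhou, overcast

Note: I'm fed up with the network issues. I don't want to pay for an overseas server, but it's really annoying.

Course Content

1. Environment Setup

Runtime environment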

# Runtime environment for the model; Python 3.10 or later is required by default
conda create --name win_chatgllm4-6b python=3.11 -y
# Activate the environment
conda activate win_chatgllm4-6b
# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
# Leave the environment
conda deactivate
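
To confirm that the interpreter inside the activated environment meets the 3.10 minimum mentioned above, a small standard-library check can help. This is only a sketch; the helper name `meets_minimum` is made up for illustration.

```python
import sys

# Minimum version taken from the requirement stated above (Python 3.10 or later).
MIN_VERSION = (3, 10)


def meets_minimum(version=None, minimum=MIN_VERSION):
    """Return True if `version` (default: the running interpreter) is >= `minimum`."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(map(str, sys.version_info[:3]))
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {current}: {status}")
```

Run it with the environment activated, so that `python` resolves to the conda interpreter rather than the system one.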

GPU driver

# Check the NVIDIA driver and the CUDA version it supports
nvidia-smi
# Install PyTorch: https://pytorch.org/get-started/previous-versions/
conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=12.1 -c pytorch -c nvidia
# Verify that PyTorch can see the GPU (True means it works)
python
import torch
print(torch.cuda.is_available())
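
The one-line check above only answers yes or no. When it returns False, it can be useful to see whether PyTorch is installed at all, which CUDA version it was built against, and which devices it sees. A minimal sketch, assuming nothing beyond PyTorch's public `torch.cuda` API; the function name `describe_cuda` is hypothetical.

```python
import importlib.util


def describe_cuda():
    """Summarize what PyTorch can see and return it as a plain dict."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment.
        return {"torch_installed": False, "cuda_available": False, "devices": []}

    import torch

    available = torch.cuda.is_available()
    devices = []
    if available:
        for index in range(torch.cuda.device_count()):
            props = torch.cuda.get_device_properties(index)
            devices.append(
                {
                    "name": props.name,
                    "total_memory_gib": round(props.total_memory / 1024**3, 1),
                }
            )
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": available,
        "devices": devices,
    }


if __name__ == "__main__":
    for key, value in describe_cuda().items():
        print(f"{key}: {value}")
```

If `cuda_build` is None, a CPU-only PyTorch build was installed; if it is set but `cuda_available` is False, the driver reported by `nvidia-smi` is the more likely culprit.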

2. Deployment and Installation

Install dependencies

# Framework for debugging and calling the model
git clone https://github.com/THUDM/ChatGLM4
cd ChatGLM4

# Install pip dependencies
pip install -r requirements.txt
pip install transformers==4.26.1

# Download the model files
cd llm
git lfs install
git clone https://www.modelscope.cn/ZhipuAI/glm-4-9b-chat-hf.git
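
If `git lfs` is not set up correctly, the clone can finish quickly yet leave small text pointer files where the real weights should be. The sketch below looks for such leftovers. It assumes the weights are stored as `*.safetensors` or `*.bin` files, which is a guess about the repository layout, and `find_lfs_pointers` is a hypothetical helper name.

```python
from pathlib import Path

# Git LFS pointer files are small text files that begin with this line.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"
# Assumed weight file extensions; adjust if the repository uses others.
WEIGHT_SUFFIXES = {".safetensors", ".bin"}


def find_lfs_pointers(model_dir):
    """Return the weight files under model_dir that are still LFS pointers."""
    pointers = []
    for path in sorted(Path(model_dir).rglob("*")):
        if path.is_file() and path.suffix in WEIGHT_SUFFIXES:
            with path.open("rb") as handle:
                if handle.read(len(LFS_HEADER)) == LFS_HEADER:
                    pointers.append(path)
    return pointers


if __name__ == "__main__":
    model_dir = Path("glm-4-9b-chat-hf")  # directory created by the clone above
    if not model_dir.is_dir():
        print(f"{model_dir} not found; run this from the directory that contains it")
    else:
        leftover = find_lfs_pointers(model_dir)
        for path in leftover:
            print("still an LFS pointer:", path)
        if not leftover:
            print("no LFS pointer files found among the weights")
```

If any pointer files are reported, running `git lfs pull` inside the model directory normally fetches the real files.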

3. Code Debugging

Scenario 1: Debugging in code

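from transformers import AutoTokenizer, AutoModel

# Replace "your-model-path" with the directory where the model was saved
tokenizer = AutoTokenizer.from_pretrained("your-model-path/chatglm3-6b/", trust_remote_code=True)
model = AutoModel.from_pretrained("your-model-path/chatglm3-6b/", trust_remote_code=True, device='cuda')
model = model.eval()  # switch to inference mode
# First turn: start with an empty history
response, history = model.chat(tokenizer, "Hello", history=[])
print(response)
# Second turn: pass the history back in so the model keeps the context
response, history = model.chat(tokenizer, "Can you write a bubble sort algorithm in python3 for me?", history=history)
print(response)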
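
The `model.chat` helper above comes from the ChatGLM3-style remote code. The `glm-4-9b-chat-hf` weights downloaded in section 2 are packaged for the standard transformers classes instead, so a rough equivalent might look like the sketch below. The assumptions are: a recent transformers release (much newer than the 4.26.1 pin above), the `accelerate` package for `device_map="auto"`, a placeholder model path, and the hypothetical helpers `append_turn` and `chat_once`.

```python
import os

MODEL_PATH = "your-model-path/glm-4-9b-chat-hf"  # placeholder, adjust to your setup


def append_turn(messages, role, content):
    """Return a new message list with one more {"role", "content"} entry."""
    return messages + [{"role": role, "content": content}]


def chat_once(model, tokenizer, messages, max_new_tokens=256):
    """Generate one assistant reply and return (reply, updated messages)."""
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0, inputs["input_ids"].shape[1]:]
    reply = tokenizer.decode(new_tokens, skip_special_tokens=True)
    return reply, append_turn(messages, "assistant", reply)


def main():
    # Imported here so the helpers above can be used without loading torch.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_PATH, torch_dtype=torch.bfloat16, device_map="auto"
    ).eval()

    messages = append_turn([], "user", "Hello")
    reply, messages = chat_once(model, tokenizer, messages)
    print(reply)

    # The message list plays the role of `history` in the snippet above.
    messages = append_turn(messages, "user", "Can you write a bubble sort in python3?")
    reply, messages = chat_once(model, tokenizer, messages)
    print(reply)


if __name__ == "__main__" and os.path.isdir(MODEL_PATH):
    main()
```

In this style the conversation state is an ordinary list of role/content dictionaries, so it can be logged, trimmed, or saved between runs without touching the model object.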

Scenario 2: Debugging with the CLI

cd basic_demo
python cli_demo.py

Scenario 3: Debugging with Gradio

cd basic_demo
python web_demo_gradio.py

Scenario 4: Debugging with Streamlit

cd basic_demo
streamlit run web_demo_streamlit.py

4. Performance Optimization

# Performance parameters: edit the model-loading line in the demo (use one of the variants below)
vi web_demo_streamlit.py

# 12 GB of VRAM or more: FP16
model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True).half().cuda()

# 8 GB of VRAM: INT8 quantization
model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True).half().quantize(8).cuda()

# 6 GB of VRAM or more: INT4 quantization
model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True).half().quantize(4).cuda()

# CPU with 32 GB of RAM
model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True).quantize(4).float()

# CPU with 16 GB of RAM
model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True).bfloat16()

# Restart the demo after saving the change
streamlit run web_demo_streamlit.py
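
The VRAM tiers above follow from simple arithmetic: the weights alone need roughly the parameter count times the bytes per parameter, and activations plus the KV cache come on top of that. The sketch below shows the estimate and a picker that mirrors the 12 / 8 / 6 GB tiers listed above. The parameter counts (6B and 9B) are read off the model names and are approximate, and both function names are hypothetical.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params, precision):
    """Approximate memory for the weights only, in GiB (no activations or KV cache)."""
    return num_params * BYTES_PER_PARAM[precision] / 1024**3


def pick_precision(vram_gib):
    """Map available VRAM to the tiers listed above; below 6 GB fall back to CPU."""
    if vram_gib >= 12:
        return "fp16"
    if vram_gib >= 8:
        return "int8"
    if vram_gib >= 6:
        return "int4"
    return "cpu"


if __name__ == "__main__":
    for label, params in (("6B", 6e9), ("9B", 9e9)):
        for precision in ("fp16", "int8", "int4"):
            size = weight_memory_gib(params, precision)
            print(f"{label} {precision}: about {size:.1f} GiB of weights")
    print("8 GB card ->", pick_precision(8))
```

Because the estimate covers weights only, a model whose weights barely fit may still run out of memory on long prompts; a 9B model also needs noticeably more than the 6B figures that the tiers appear to be based on.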

Summary

posted @ 2025-02-02 16:44 by 流雨声
