llama-recipes fine-tuning 3

multiple GPUs on a single node

torchrun --nnodes 1 --nproc_per_node 2  examples/finetuning.py --enable_fsdp --use_peft --peft_method lora --dataset medcqa_dataset --model_name meta-llama/Llama-2-7b-hf --fsdp_config.pure_bf16 --output_dir ./FINE/llama2-7b-medcqa-modify-prompt-modi-Q-T-O-multiple-gpus

Please note that a single GPU could not hold even the Llama 2 7B model, and 4x A6000 GPUs could not hold Llama 2 13B either.
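For intuition, here is a rough back-of-envelope sketch of the weight memory alone (an assumption-laden estimate, not a measurement: it ignores gradients, optimizer state, activations, and CUDA overhead, which add substantially on top). It also shows why FSDP sharding across GPUs helps.

# Rough estimate of weight memory in bf16 (weights only; real usage is higher
# because of activations, LoRA gradients, optimizer state and CUDA overhead).
GB = 1024 ** 3
BYTES_BF16 = 2

def weight_gb(n_params: float) -> float:
    return n_params * BYTES_BF16 / GB

for name, n_params in [("Llama-2-7B", 7e9), ("Llama-2-13B", 13e9)]:
    total = weight_gb(n_params)
    # with --enable_fsdp the parameters are sharded across ranks,
    # so per-GPU weight memory is roughly total / world_size
    for world_size in (1, 2, 4):
        print(f"{name}: ~{total:.0f} GB total, ~{total / world_size:.0f} GB per GPU over {world_size} GPU(s)")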

export the model

from transformers import AutoModelForCausalLM
from peft import PeftModel

# load the base model and the LoRA adapter produced by fine-tuning
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
peft_model_id = "/home/ludaze/Docker/Llama/llama-recipes/FINE/llama2-7b-medcqa-modify-prompt-modi-Q-T-O-multiple-gpus"
model = PeftModel.from_pretrained(base_model, peft_model_id)

# merge the LoRA weights into the base model and save a standalone checkpoint
merged_model = model.merge_and_unload()
merged_model.save_pretrained("/home/ludaze/Docker/Llama/llama-recipes/FINE_EXPORT/llama2-7b-medcqa-modify-prompt-modi-Q-T-O-multiple-gpus-export")

Then we need to copy the tokenizer files into the target export directory, as shown in the alternative sketch below.
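Instead of copying the files by hand, the tokenizer can also be saved next to the merged weights. A minimal sketch, assuming the fine-tuning run used the base model's tokenizer unchanged (i.e. no new tokens were added):

# Minimal sketch: save the base tokenizer into the export directory
# (assumption: fine-tuning did not modify the vocabulary)
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer.save_pretrained("/home/ludaze/Docker/Llama/llama-recipes/FINE_EXPORT/llama2-7b-medcqa-modify-prompt-modi-Q-T-O-multiple-gpus-export")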
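Once the export directory contains both the merged weights and the tokenizer files, it can be loaded like any regular Hugging Face checkpoint. A minimal inference sketch (the prompt is only a placeholder illustration):

# Minimal sketch of loading the exported model for inference
# (assumption: the export directory now holds merged weights + tokenizer)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

export_dir = "/home/ludaze/Docker/Llama/llama-recipes/FINE_EXPORT/llama2-7b-medcqa-modify-prompt-modi-Q-T-O-multiple-gpus-export"
tokenizer = AutoTokenizer.from_pretrained(export_dir)
model = AutoModelForCausalLM.from_pretrained(export_dir, torch_dtype=torch.bfloat16, device_map="auto")

prompt = "Question: ...\nOptions: ...\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))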
