llama-factory fine-tuning-1

data preparation

This post describes how to prepare a custom dataset for fine-tuning with LLaMA-Factory.

dataset classification

alpaca

The stanford_alpaca dataset is a well-known example: it was used to fine-tune the original LLaMA 7B model into the Alpaca model. Its structure is as follows.

[
  {
    "instruction": "user instruction (required)",
    "input": "user input (optional)",
    "output": "model response (required)",
    "history": [
      ["user instruction in the first round (optional)", "model response in the first round (optional)"],
      ["user instruction in the second round (optional)", "model response in the second round (optional)"]
    ]
  }
]
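As a minimal sketch of the required/optional rules in the structure above, the following checks a single alpaca-style record. The function name `validate_alpaca_record` and the sample records are hypothetical and are not part of LLaMA-Factory.

```python
from typing import Any, Dict, List

# Fields marked "required" in the structure above.
REQUIRED_KEYS = ("instruction", "output")


def validate_alpaca_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one alpaca-style record (empty if none)."""
    problems = []
    for key in REQUIRED_KEYS:
        value = record.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty required field: {key}")
    # "input" is optional, but should be a string when present.
    if "input" in record and not isinstance(record["input"], str):
        problems.append("input must be a string")
    # "history" is optional: a list of [user, model] pairs from earlier rounds.
    for i, turn in enumerate(record.get("history") or []):
        if not (isinstance(turn, (list, tuple)) and len(turn) == 2):
            problems.append(f"history[{i}] must be a [user, model] pair")
    return problems


# Hypothetical sample records, for illustration only.
good = {"instruction": "Name a symptom of flu.", "input": "", "output": "Fever."}
bad = {"instruction": "Name a symptom of flu.", "history": [["only one item"]]}

print(validate_alpaca_record(good))
print(validate_alpaca_record(bad))
```

Running a check like this over the whole file before training can catch malformed records early.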

The diagram below shows how the Alpaca model was obtained:

 

 

sharegpt

ShareGPT is a dialogue dataset contributed and shared by users. It contains conversation samples across different domains, topics, styles, and emotions, covering chit-chat, Q&A, stories, poetry, song lyrics, and more. The dataset is known for its quality, diversity, personalization, and emotional richness, which gives chatbots richer and more authentic linguistic and semantic material to learn from.

The Vicuna model was fine-tuned from LLaMA (Llama 2 in the case of Vicuna v1.5) on a ShareGPT-style dataset. Its data structure is as follows.

[
  {
    "conversations": [
      {
        "from": "human",
        "value": "user instruction"
      },
      {
        "from": "gpt",
        "value": "model response"
      }
    ]
  }
]
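To make the relationship between the two formats concrete, here is a small sketch that converts one alpaca-style record into the sharegpt structure shown above. The function name `alpaca_to_sharegpt` is hypothetical, and joining `instruction` and `input` with a newline is an assumption, not something either format prescribes.

```python
from typing import Any, Dict


def alpaca_to_sharegpt(record: Dict[str, Any]) -> Dict[str, Any]:
    """Convert one alpaca-style record into a sharegpt-style conversation."""
    conversations = []
    # Earlier rounds (if any) come first, in order.
    for user_turn, model_turn in record.get("history") or []:
        conversations.append({"from": "human", "value": user_turn})
        conversations.append({"from": "gpt", "value": model_turn})
    # Assumption: instruction and optional input are joined with a newline.
    prompt = record["instruction"]
    if record.get("input"):
        prompt = f"{prompt}\n{record['input']}"
    conversations.append({"from": "human", "value": prompt})
    conversations.append({"from": "gpt", "value": record["output"]})
    return {"conversations": conversations}


# Hypothetical sample record, for illustration only.
example = {
    "instruction": "Translate to French.",
    "input": "Good morning",
    "output": "Bonjour",
    "history": [["Hello", "Hi, how can I help?"]],
}
print(alpaca_to_sharegpt(example))
```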

 

medical alpaca-style fine-tuning

data conversion

We could load the Hugging Face dataset directly, but the data has to be filtered first. So we download it and save each split as an alpaca-style JSON file, using the code below.

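from datasets import load_dataset
import os
import json

# Download the "finetune" configuration of the medical dataset from the Hugging Face Hub.
dataset = load_dataset("shibing624/medical", "finetune")

# Directory where the JSON files are written.
save_path = "../medical"
os.makedirs(save_path, exist_ok=True)

def save_as_json(data, filename):
    """Write every record of a dataset split to one JSON file."""
    file_path = os.path.join(save_path, filename)
    with open(file_path, 'w', encoding='utf-8') as f:
        # Materialize the split as a list of dicts; ensure_ascii=False keeps non-ASCII text readable.
        data_to_save = [item for item in data]
        json.dump(data_to_save, f, ensure_ascii=False, indent=4)

save_as_json(dataset['train'], 'train.json')
save_as_json(dataset['validation'], 'validation.json')
save_as_json(dataset['test'], 'test.json')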

Select the English portion of the data, save it as alpaca_medical_en.json, and move it into Llama-Factory/data/.
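The post does not show how the English portion was selected, so the following is only one possible sketch. It assumes a simple heuristic (a record counts as English when at least 90% of its characters are ASCII) and assumes the records carry `instruction` and `output` fields. It also adds an entry to `dataset_info.json` in the data directory, which is where LLaMA-Factory looks up dataset names; the exact keys supported there depend on the LLaMA-Factory version, so check the data README of your checkout. The helper names `is_mostly_ascii`, `select_english`, and `register_dataset` are hypothetical.

```python
import json
import os
from typing import Any, Dict, List


def is_mostly_ascii(text: str, threshold: float = 0.9) -> bool:
    """Heuristic (assumption): treat text as English if most characters are ASCII."""
    if not text:
        return False
    ascii_chars = sum(1 for ch in text if ord(ch) < 128)
    return ascii_chars / len(text) >= threshold


def select_english(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep records whose instruction and output look English under the heuristic."""
    return [
        r for r in records
        if is_mostly_ascii(r.get("instruction", "") + r.get("output", ""))
    ]


def register_dataset(data_dir: str, name: str, file_name: str) -> None:
    """Add (or update) a dataset entry in <data_dir>/dataset_info.json."""
    info_path = os.path.join(data_dir, "dataset_info.json")
    info = {}
    if os.path.exists(info_path):
        with open(info_path, "r", encoding="utf-8") as f:
            info = json.load(f)
    info[name] = {"file_name": file_name}
    with open(info_path, "w", encoding="utf-8") as f:
        json.dump(info, f, ensure_ascii=False, indent=2)


# Hypothetical sample records, for illustration only.
sample = [
    {"instruction": "What causes anemia?", "input": "", "output": "Often iron deficiency."},
    {"instruction": "贫血的原因是什么?", "input": "", "output": "常见原因是缺铁。"},
]
print(len(select_english(sample)))

# Example usage (paths follow the post; uncomment to write files):
# register_dataset("Llama-Factory/data", "alpaca_medical_en", "alpaca_medical_en.json")
```

The name registered here is the value later passed to `--dataset`.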

command

CUDA_VISIBLE_DEVICES=1 python src/train_bash.py \
    --stage sft \
    --model_name_or_path ../llama/models_hf/7B \
    --do_train \
    --dataset alpaca_medical_en \
    --template default \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --output_dir ./FINE/llama2-7b-medical_single \
    --overwrite_cache \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --plot_loss \
    --fp16
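To relate the batch settings in this command to the step counts used for checkpoints, here is a rough sketch of the arithmetic. With `--per_device_train_batch_size 1` and `--gradient_accumulation_steps 4` on one GPU, each optimizer step sees 4 examples. The formula approximates how the Hugging Face Trainer counts steps and may differ slightly between versions; the dataset size of 1000 is a made-up number, not the size of the medical dataset.

```python
import math


def estimate_optimizer_steps(
    num_examples: int,
    per_device_batch_size: int,
    gradient_accumulation_steps: int,
    num_epochs: float,
    num_devices: int = 1,
) -> int:
    """Approximate the total number of optimizer steps for a training run."""
    # Number of mini-batches the dataloader yields per epoch.
    batches_per_epoch = math.ceil(num_examples / (per_device_batch_size * num_devices))
    # One optimizer step happens every `gradient_accumulation_steps` mini-batches.
    updates_per_epoch = max(batches_per_epoch // gradient_accumulation_steps, 1)
    return math.ceil(num_epochs * updates_per_epoch)


# Settings from the command above: batch size 1, accumulation 4, one GPU.
effective_batch_size = 1 * 4 * 1
print(effective_batch_size)

# Hypothetical dataset of 1000 examples, trained for 3 epochs.
print(estimate_optimizer_steps(1000, 1, 4, 3.0))
```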

training loss curve

evaluating the fine-tuned model on MMLU

4000 steps

10000 steps

20000 steps

50000 steps

70000 steps

reward modeling

command

CUDA_VISIBLE_DEVICES=3 python src/train_bash.py \
    --stage rm \
    --model_name_or_path ../llama/models_hf/7B \
    --do_train \
    --dataset comparison_gpt4_en \
    --template default \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --resume_lora_training False \
    --checkpoint_dir ./FINE/llama2-7b-medical_single/checkpoint-70000 \
    --output_dir  ./Reward/medical \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate 1e-6 \
    --num_train_epochs 1.0 \
    --plot_loss \
    --fp16
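The post does not describe the objective behind `--stage rm`. Reward models trained on comparison data (a preferred and a rejected response per prompt) commonly use a pairwise ranking loss, and the sketch below illustrates that general idea in plain Python. It is not taken from LLaMA-Factory's code, and the function name `pairwise_ranking_loss` is hypothetical.

```python
import math


def pairwise_ranking_loss(chosen_reward: float, rejected_reward: float) -> float:
    """Compute -log(sigmoid(chosen - rejected)) in a numerically stable way."""
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow in exp.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss is small when the chosen response scores higher, large otherwise.
print(pairwise_ranking_loss(2.0, 0.0))
print(pairwise_ranking_loss(0.0, 2.0))
```

Minimizing this loss pushes the reward of the preferred response above that of the rejected one.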

 

loss curve

evaluation

2000 steps

4000 steps

PPO training

command (do not pass --fp16 here; otherwise training fails with ValueError: Attempting to unscale FP16 gradients)

 

CUDA_VISIBLE_DEVICES=1 python src/train_bash.py \
    --stage ppo \
    --model_name_or_path ../llama/models_hf/7B \
    --do_train \
    --dataset alpaca_medical_en \
    --template default \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --resume_lora_training False \
    --checkpoint_dir ./FINE/llama2-7b-medical_single/checkpoint-70000 \
    --reward_model ./Reward/medical/checkpoint-4000 \
    --output_dir ./PPO/medical/medical_gpt4 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --top_k 0 \
    --top_p 0.9 \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate 1e-5 \
    --num_train_epochs 1.0 \
    --plot_loss
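The `--top_k 0` and `--top_p 0.9` flags control how responses are sampled during PPO: in Hugging Face generation, a top-k of 0 disables top-k filtering, leaving nucleus (top-p) sampling with p = 0.9. The sketch below shows the idea of top-p filtering on a toy distribution. It is a simplified illustration, not the library's implementation (boundary handling may differ), and the function name `top_p_filter` and the toy probabilities are hypothetical.

```python
from typing import Dict


def top_p_filter(probs: Dict[str, float], top_p: float = 0.9) -> Dict[str, float]:
    """Keep the most likely tokens until their total mass reaches top_p, then renormalize."""
    kept = {}
    cumulative = 0.0
    # Walk tokens from most to least likely.
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}


# Toy next-token distribution, for illustration only.
toy_probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
filtered = top_p_filter(toy_probs, top_p=0.9)
print(filtered)
```

Sampling then draws from the renormalized distribution, so the least likely tokens are never generated.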

posted @ 2023-11-29 12:48  Daze_Lu