model merge

1 introduction

repository:
https://github.com/arcee-ai/mergekit
mergekit merges two models into a single model; the two models must have the same architecture and tokenizer.
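
A minimal install sketch is shown below (assuming Python and git are already available; the editable install is the usual pattern for this repo, but check the repository README for the current instructions):

# clone the mergekit repository and install it into the current environment
git clone https://github.com/arcee-ai/mergekit.git
cd mergekit
pip install -e .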

2 merge llama model

mergekit-yaml examples/linear.yml ./merged/linear_example
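
The command reads the merge recipe from examples/linear.yml and writes the result to ./merged/linear_example. For reference, a minimal sketch of a linear-merge config is given below; the model names and the 0.5/0.5 weights are placeholders (not the contents of the actual example file), so substitute two checkpoints that share the same architecture and tokenizer:

models:
  - model: meta-llama/Llama-2-7b-hf        # placeholder: first model to merge
    parameters:
      weight: 0.5                          # linear mixing weight for this model
  - model: meta-llama/Llama-2-7b-chat-hf   # placeholder: second model, same architecture
    parameters:
      weight: 0.5
merge_method: linear
dtype: float16

With merge_method: linear, mergekit computes a weighted average of the checkpoints' parameters and saves the merged model to the output path given on the command line.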

3 merge mistral model

3.1 tourist800/mistral_2x7b

This model is not an MoE model; the two models are merged into a single model (the same size as each source) using SLERP. In the config below, t is the interpolation factor (t=0 keeps base_model, t=1 keeps the other model), and the per-filter value lists define a gradient across the layers.

link: https://huggingface.co/tourist800/mistral_2X7b

config.yaml:
slices:
  - sources:
      - model: mistralai/Mistral-7B-Instruct-v0.2
        layer_range: [0, 32]
      - model: mistralai/Mistral-7B-v0.1
        layer_range: [0, 32]
merge_method: slerp
base_model: mistralai/Mistral-7B-Instruct-v0.2
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16

run command:
~/Docker/Llama/mergkit_moe$ mergekit-yaml config.yaml merged --copy-tokenizer --allow-crimes --out-shard-size 1B --lazy-unpickle

3.2 sam-ezai/Breezeblossom-v4-mistral-2x7B

link: https://huggingface.co/sam-ezai/Breezeblossom-v4-mistral-2x7B

config.yaml:
base_model: MediaTek-Research/Breeze-7B-Instruct-v0_1
gate_mode: hidden
dtype: float16
experts:
  - source_model: MediaTek-Research/Breeze-7B-Instruct-v0_1
    positive_prompts: [ "<s>You are a helpful AI assistant built by MediaTek Research. The user you are helping speaks Traditional Chinese and comes from Taiwan.   [INST] 你好,請問你可以完成什麼任務? [/INST] "]
  - source_model: Azure99/blossom-v4-mistral-7b
    positive_prompts: ["A chat between a human and an artificial intelligence bot. The bot gives helpful, detailed, and polite answers to the human's questions. \n|Human|: hello\n|Bot|: "]

run command:
~/Docker/Llama/mergkit_moe$ mergekit-moe config.yaml merged/mistral_moe --copy-tokenizer --allow-crimes --out-shard-size 1B --lazy-unpickle

3.3 mlabonne/Beyonder-4x7B-v2

link: https://huggingface.co/mlabonne/Beyonder-4x7B-v2

config.yaml:
base_model: mlabonne/Marcoro14-7B-slerp
experts:
  - source_model: openchat/openchat-3.5-1210
    positive_prompts:
    - "chat"
    - "assistant"
    - "tell me"
    - "explain"
  - source_model: beowolx/CodeNinja-1.0-OpenChat-7B
    positive_prompts:
    - "code"
    - "python"
    - "javascript"
    - "programming"
    - "algorithm"
  - source_model: maywell/PiVoT-0.1-Starling-LM-RP
    positive_prompts:
    - "storywriting"
    - "write"
    - "scene"
    - "story"
    - "character"
  - source_model: WizardLM/WizardMath-7B-V1.1
    positive_prompts:
    - "reason"
    - "math"
    - "mathematics"
    - "solve"
    - "count"

run command:
mergekit-moe config.yaml Beyonder-4x7B-v2 --copy-tokenizer --allow-crimes --out-shard-size 1B --lazy-unpickle

help command:

(mergekit) ludaze@cehsc-app-001:~/Docker/Llama/mergekit$ mergekit-yaml --help
Usage: mergekit-yaml [OPTIONS] CONFIG_FILE OUT_PATH

Options:
  -v, --verbose                   Verbose logging
  --allow-crimes / --no-allow-crimes
                                  Allow mixing architectures  [default: no-
                                  allow-crimes]
  --transformers-cache TEXT       Override storage path for downloaded models
  --lora-merge-cache TEXT         Path to store merged LORA models
  --cuda / --no-cuda              Perform matrix arithmetic on GPU  [default:
                                  no-cuda]
  --low-cpu-memory / --no-low-cpu-memory
                                  Store results and intermediate values on
                                  GPU. Useful if VRAM > RAM  [default: no-low-
                                  cpu-memory]
  --out-shard-size SIZE           Number of parameters per output shard
                                  [default: 5B]
  --copy-tokenizer / --no-copy-tokenizer
                                  Copy a tokenizer to the output  [default:
                                  copy-tokenizer]
  --clone-tensors / --no-clone-tensors
                                  Clone tensors before saving, to allow
                                  multiple occurrences of the same layer
                                  [default: no-clone-tensors]
  --trust-remote-code / --no-trust-remote-code
                                  Trust remote code from huggingface repos
                                  (danger)  [default: no-trust-remote-code]
  --random-seed INTEGER           Seed for reproducible use of randomized
                                  merge methods
  --lazy-unpickle / --no-lazy-unpickle
                                  Experimental lazy unpickler for lower memory
                                  usage  [default: no-lazy-unpickle]
  --write-model-card / --no-write-model-card
                                  Output README.md containing details of the
                                  merge  [default: write-model-card]
  --safe-serialization / --no-safe-serialization
                                  Save output in safetensors. Do this, don't
                                  poison the world with more pickled models.
                                  [default: safe-serialization]
  --help                          Show this message and exit.

4 other examples

Go to Hugging Face and click the tag on a merged model's page; you will find the corresponding config examples.


posted @ 2024-03-02 07:03  Daze_Lu