Abstract:
Background: I used Hugging Face Transformers to build a new MoE model; when I tried to load it with AutoModelForCausalLM, there was no suitable model structure… Read more
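One common cause (an assumption on my part, not stated in the post) is that a custom architecture is unknown to the Auto* classes until it is registered. A minimal sketch of registering a hypothetical config/model pair with AutoModelForCausalLM; "my_moe" and all hyperparameters are placeholder names:

```python
import torch.nn as nn
from transformers import (AutoConfig, AutoModelForCausalLM,
                          PretrainedConfig, PreTrainedModel)

# Hypothetical config for the new architecture ("my_moe" is a made-up type).
class MyMoeConfig(PretrainedConfig):
    model_type = "my_moe"

    def __init__(self, vocab_size=32, hidden_size=16, **kwargs):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        super().__init__(**kwargs)

# Minimal causal-LM shell: embed tokens, project back to the vocabulary.
class MyMoeForCausalLM(PreTrainedModel):
    config_class = MyMoeConfig

    def __init__(self, config):
        super().__init__(config)
        self.embed = nn.Embedding(config.vocab_size, config.hidden_size)
        self.lm_head = nn.Linear(config.hidden_size, config.vocab_size)

    def forward(self, input_ids):
        return self.lm_head(self.embed(input_ids))

# Register the custom type so the Auto* classes can resolve it.
AutoConfig.register("my_moe", MyMoeConfig)
AutoModelForCausalLM.register(MyMoeConfig, MyMoeForCausalLM)

model = AutoModelForCausalLM.from_config(MyMoeConfig())
```

With the registration in place, AutoModelForCausalLM.from_config (and from_pretrained, given saved weights) can dispatch to the custom class instead of failing to find a matching structure.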
Abstract:
Knowledge: Identity(). model.fc2 = nn.Identity() replaces fc2 with an identity module, which simply returns its input unchanged; you might do this to disable that layer… Read more
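The replacement trick can be sketched as follows; SmallNet is a made-up module used only for illustration:

```python
import torch
import torch.nn as nn

# A toy three-layer network, invented for this example.
class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 8)
        self.fc3 = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc3(self.fc2(self.fc1(x)))

model = SmallNet()
# Disable fc2: nn.Identity() returns its input unchanged,
# so the network now effectively computes fc3(fc1(x)).
model.fc2 = nn.Identity()
out = model(torch.randn(1, 4))
```

Because fc2 had matching input and output sizes (8 -> 8), the identity swap keeps the shapes consistent; replacing a layer whose input and output sizes differ this way would break the forward pass.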
Abstract:
Command & progress: CUDA_VISIBLE_DEVICES="0,1,2,3" python -m axolotl.cli.preprocess examples/mistral/lora-mps.yml accelerate… Read more
Abstract:
1 Introduction. Repository: https://github.com/arcee-ai/mergekit. It merges two models into one, which requires the two models to have the same structure, token… Read more
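A slerp merge in mergekit is driven by a YAML config roughly like the following sketch (model names and the layer count are placeholders; consult the repository's README for the exact schema):

```yaml
slices:
  - sources:
      - model: org/model-a        # placeholder model names
        layer_range: [0, 32]
      - model: org/model-b
        layer_range: [0, 32]
merge_method: slerp
base_model: org/model-a
parameters:
  t: 0.5                          # interpolation factor between the two models
dtype: float16
```

The merge is then run with the mergekit-yaml entry point, passing the config file and an output directory.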
Abstract:
Multiple GPUs on a single node: torchrun --nnodes 1 --nproc_per_node 2 examples/finetuning.py --enable_fsdp --use_peft --peft_meth… Read more
Abstract:
Commands: in these commands, I changed the prompt, the input format, and the output format. # original prompt + question_input + true_opt… Read more
Abstract:
Using gsm8k-rft-llama7b-u13b. Evaluation env: lm_evaluation. llama2 7B using GSM8K-eval, llama2 7B, llama2 13B… Read more
Abstract:
MT-bench. 1 Introduction. We create MT-bench, a benchmark consisting of 80 high-quality multi-turn questions. MT-bench is designed to test multi-turn co… Read more
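A multi-turn benchmark question of this kind can be pictured as a record with a category and an ordered list of turns; the field names below are my assumption for illustration, not MT-bench's exact schema:

```python
# Sketch of one multi-turn question record (assumed field names).
question = {
    "question_id": 1,
    "category": "writing",
    "turns": [
        "Compose an engaging travel blog post about a recent trip.",
        "Rewrite your previous response, starting every sentence with the letter A.",
    ],
}

# The model is queried turn by turn, carrying the conversation
# history forward so the second turn can reference the first answer.
num_turns = len(question["turns"])
```

The second turn deliberately depends on the first response, which is what makes the benchmark a test of multi-turn ability rather than of isolated answers.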
Abstract:
Using code-eval. Commands: git clone https://github.com/abacaj/code-eval.git; cd code-eval; conda create -n human_eval python=3.10; conda activate human_eval… Read more
Abstract:
Note: the datasets are classified into two types: generative (the answer is natural language; its length and content are not in a fixed format) and sel… Read more
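The distinction matters for scoring. A minimal sketch, assuming the second (truncated) type is selective, i.e. the answer comes from a fixed option set: selective answers can be compared directly, while generative answers need the final value extracted from free text first. The function names are mine, not from any library:

```python
import re

# Selective (e.g. multiple-choice): the answer is one of a fixed
# option set, so scoring is a direct, normalized string comparison.
def score_selective(prediction: str, gold: str) -> bool:
    return prediction.strip().upper() == gold.strip().upper()

# Generative (free-form, e.g. math word problems): the answer is
# embedded in natural language, so scoring first extracts the last
# number that appears in the text.
def extract_final_number(text: str):
    nums = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return nums[-1] if nums else None

def score_generative(prediction: str, gold: str) -> bool:
    return extract_final_number(prediction) == extract_final_number(gold)
```

Because generative scoring depends on an extraction heuristic, it is more brittle than selective scoring: a correct answer phrased without an explicit final number would be marked wrong.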