Abstract:
Background: I used Hugging Face Transformers to build a new MoE model; when I tried to load it with AutoModelForCausalLM, there was no suitable model structure… Read more
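One common cause (an assumption on my part, not stated in the post) is that a custom architecture is unknown to the Auto* classes until it is registered. A minimal sketch of registering a hypothetical config/model pair with AutoModelForCausalLM; "my_moe" and all hyperparameters are placeholder names:

```python
import torch.nn as nn
from transformers import (AutoConfig, AutoModelForCausalLM,
                          PretrainedConfig, PreTrainedModel)

# Hypothetical config for the new architecture ("my_moe" is a made-up type).
class MyMoeConfig(PretrainedConfig):
    model_type = "my_moe"

    def __init__(self, vocab_size=32, hidden_size=16, **kwargs):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        super().__init__(**kwargs)

# Minimal causal-LM shell: embed tokens, project back to the vocabulary.
class MyMoeForCausalLM(PreTrainedModel):
    config_class = MyMoeConfig

    def __init__(self, config):
        super().__init__(config)
        self.embed = nn.Embedding(config.vocab_size, config.hidden_size)
        self.lm_head = nn.Linear(config.hidden_size, config.vocab_size)

    def forward(self, input_ids):
        return self.lm_head(self.embed(input_ids))

# Register the custom type so the Auto* classes can resolve it.
AutoConfig.register("my_moe", MyMoeConfig)
AutoModelForCausalLM.register(MyMoeConfig, MyMoeForCausalLM)

model = AutoModelForCausalLM.from_config(MyMoeConfig())
```

With the registration in place, AutoModelForCausalLM.from_config (and from_pretrained, given saved weights) can dispatch to the custom class instead of failing to find a matching structure.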
Abstract:
Knowledge: Identity(). model.fc2 = nn.Identity() replaces fc2 with an identity module, which simply returns its input unchanged; you might do this to disable that layer… Read more
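The replacement trick can be sketched as follows; SmallNet is a made-up module used only for illustration:

```python
import torch
import torch.nn as nn

# A toy three-layer network, invented for this example.
class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 8)
        self.fc3 = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc3(self.fc2(self.fc1(x)))

model = SmallNet()
# Disable fc2: nn.Identity() returns its input unchanged,
# so the network now effectively computes fc3(fc1(x)).
model.fc2 = nn.Identity()
out = model(torch.randn(1, 4))
```

Because fc2 had matching input and output sizes (8 -> 8), the identity swap keeps the shapes consistent; replacing a layer whose input and output sizes differ this way would break the forward pass.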
Abstract:
Command & progress: CUDA_VISIBLE_DEVICES="0,1,2,3" python -m axolotl.cli.preprocess examples/mistral/lora-mps.yml accelerate… Read more
Abstract:
1 Introduction. Repository: https://github.com/arcee-ai/mergekit. It merges two models into one, which requires the two models to have the same structure, token… Read more
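A slerp merge in mergekit is driven by a YAML config roughly like the following sketch (model names and the layer count are placeholders; consult the repository's README for the exact schema):

```yaml
slices:
  - sources:
      - model: org/model-a        # placeholder model names
        layer_range: [0, 32]
      - model: org/model-b
        layer_range: [0, 32]
merge_method: slerp
base_model: org/model-a
parameters:
  t: 0.5                          # interpolation factor between the two models
dtype: float16
```

The merge is then run with the mergekit-yaml entry point, passing the config file and an output directory.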
Abstract:
Multiple GPUs on a single node: torchrun --nnodes 1 --nproc_per_node 2 examples/finetuning.py --enable_fsdp --use_peft --peft_meth… Read more
Abstract:
Commands: in these commands, I changed the prompt, the input format, and the output format. # original prompt + question_input + true_opt… Read more
Abstract:
Using gsm8k-rft-llama7b-u13b. Evaluation env: lm_evaluation. llama2 7B using GSM8K-eval, llama2 7B, llama2 13B… Read more
Abstract:
MT-bench. 1 Introduction. We create MT-bench, a benchmark consisting of 80 high-quality multi-turn questions. MT-bench is designed to test multi-turn co… Read more
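A multi-turn benchmark question of this kind can be pictured as a record with a category and an ordered list of turns; the field names below are my assumption for illustration, not MT-bench's exact schema:

```python
# Sketch of one multi-turn question record (assumed field names).
question = {
    "question_id": 1,
    "category": "writing",
    "turns": [
        "Compose an engaging travel blog post about a recent trip.",
        "Rewrite your previous response, starting every sentence with the letter A.",
    ],
}

# The model is queried turn by turn, carrying the conversation
# history forward so the second turn can reference the first answer.
num_turns = len(question["turns"])
```

The second turn deliberately depends on the first response, which is what makes the benchmark a test of multi-turn ability rather than of isolated answers.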
Abstract:
Using code-eval. Commands: git clone https://github.com/abacaj/code-eval.git; cd code-eval; conda create -n human_eval python=3.10; conda activate human_eval… Read more
Abstract:
Note: the datasets are classified into two types: generative (the answer is natural language; its length and content are not in a fixed format) and sel… Read more
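The distinction matters for scoring. A minimal sketch, assuming the second (truncated) type is selective, i.e. the answer comes from a fixed option set: selective answers can be compared directly, while generative answers need the final value extracted from free text first. The function names are mine, not from any library:

```python
import re

# Selective (e.g. multiple-choice): the answer is one of a fixed
# option set, so scoring is a direct, normalized string comparison.
def score_selective(prediction: str, gold: str) -> bool:
    return prediction.strip().upper() == gold.strip().upper()

# Generative (free-form, e.g. math word problems): the answer is
# embedded in natural language, so scoring first extracts the last
# number that appears in the text.
def extract_final_number(text: str):
    nums = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return nums[-1] if nums else None

def score_generative(prediction: str, gold: str) -> bool:
    return extract_final_number(prediction) == extract_final_number(gold)
```

Because generative scoring depends on an extraction heuristic, it is more brittle than selective scoring: a correct answer phrased without an explicit final number would be marked wrong.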