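Inference issues after fine-tuning llama3-70b with llamafactory

Problem description

After LoRA fine-tuning llama3-70b with llamafactory on NPUs, inference produces garbled text at the end of the output and generation does not stop on its own, as shown below: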

derrick rose of the chicago bulls has the most career assists among players who have never been named to an all-star game with 3,339 assists.  IICIII.џџџ. 3,339 assists(stypyuseRal;\r\r\n

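Diagnosis

The garbled text always appears at the end of an otherwise complete answer, and generation does not stop at the eos_token; it only ends when the length limit is reached. This points to a problem with the eos_token.
Comparing the tokenizer configuration files of the original model and the fine-tuned model shows that special_tokens_map.json differs. This mismatch may be the cause: either the model did not sufficiently learn during fine-tuning to use the end marker to terminate generation, or the configurations conflicted when the weights were merged.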
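To make the suspected failure mode concrete, here is a toy sketch of a decoding loop that stops only when it sees a configured stop id. It is not the real generation code. The function name `decode_until_stop`, the token stream, and the ids 128001 and 128009 (assumed here to be the ids of `<|end_of_text|>` and `<|eot_id|>`) are illustrative assumptions.

```python
from typing import Iterable, List, Set

END_OF_TEXT_ID = 128001  # assumed id of <|end_of_text|>
EOT_ID = 128009          # assumed id of <|eot_id|>


def decode_until_stop(
    token_stream: Iterable[int], stop_ids: Set[int], max_new_tokens: int
) -> List[int]:
    """Collect tokens until a stop id appears or the length limit is hit."""
    output: List[int] = []
    for token_id in token_stream:
        if token_id in stop_ids:
            break
        output.append(token_id)
        if len(output) >= max_new_tokens:
            break
    return output


# Hypothetical stream: a short answer, the end marker the model emits, then junk.
stream = [11, 12, 13, EOT_ID, 901, 902, 903, 904, 905]

# Stop id matches what the model emits: generation ends right after the answer.
matched = decode_until_stop(stream, {EOT_ID}, max_new_tokens=8)

# Stop id does not match: the marker is kept as ordinary output, junk follows,
# and only the length limit ends the loop.
mismatched = decode_until_stop(stream, {END_OF_TEXT_ID}, max_new_tokens=8)
```

In the mismatched case the loop runs to `max_new_tokens`, which matches the symptom above: a complete answer followed by junk until the length limit.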

Original model configuration file

{
"bos_token": "<|begin_of_text|>",
"eos_token": "<|end_of_text|>"
}

Configuration file after fine-tuning

{
"bos_token": {
  "content": "<|begin_of_text|>",
  "lstrip": false,
  "normalized": false,
  "rstrip": false,
  "single_word": false
},
"eos_token": {
  "content": "<|eot_id|>",
  "lstrip": false,
  "normalized": false,
  "rstrip": false,
  "single_word": false
},
"pad_token": "<|eot_id|>"
}
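The manual comparison of the two files can also be scripted. The sketch below is a minimal version that assumes each model directory contains a special_tokens_map.json; the helper names are hypothetical. Because an entry can be either a plain string or a dict with a "content" field, as in the two files above, entries are normalized before comparing.

```python
import json
from pathlib import Path
from typing import Any, Dict, Tuple


def token_content(entry: Any) -> Any:
    """Reduce an entry to its token string(s), whatever shape it has."""
    if isinstance(entry, dict):
        return entry.get("content")
    if isinstance(entry, list):
        return [token_content(item) for item in entry]
    return entry


def normalize_special_tokens(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map each special-token name to its plain token string(s)."""
    return {name: token_content(entry) for name, entry in raw.items()}


def load_special_tokens(model_dir: str) -> Dict[str, Any]:
    """Read and normalize special_tokens_map.json from a model directory."""
    path = Path(model_dir) / "special_tokens_map.json"
    with path.open(encoding="utf-8") as f:
        return normalize_special_tokens(json.load(f))


def diff_special_tokens(
    base: Dict[str, Any], tuned: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {name: (base_value, tuned_value)} for every differing entry."""
    diffs = {}
    for name in sorted(set(base) | set(tuned)):
        if base.get(name) != tuned.get(name):
            diffs[name] = (base.get(name), tuned.get(name))
    return diffs


# In-memory copies of the two files shown above (dict fields abbreviated).
base_tokens = normalize_special_tokens(
    {"bos_token": "<|begin_of_text|>", "eos_token": "<|end_of_text|>"}
)
tuned_tokens = normalize_special_tokens(
    {
        "bos_token": {"content": "<|begin_of_text|>", "normalized": False},
        "eos_token": {"content": "<|eot_id|>", "normalized": False},
        "pad_token": "<|eot_id|>",
    }
)
differences = diff_special_tokens(base_tokens, tuned_tokens)
```

With real checkpoints, `load_special_tokens` would be called once per model directory and the two results passed to `diff_special_tokens`.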

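Why do the two files differ? llama3 is trained with the llama3 template from the template file. Looking at that template in LLaMA-Factory/src/llmtuner/data/template.py shows that its stop_words does not match the eos_token in the original model's configuration.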

_register_template(
    name="llama3",
    format_user=StringFormatter(
        slots=[
            (
                "<|start_header_id|>user<|end_header_id|>\n\n{{content}}<|eot_id|>"
                "<|start_header_id|>assistant<|end_header_id|>\n\n"
            )
        ]
    ),
    format_system=StringFormatter(
        slots=[{"bos_token"}, "<|start_header_id|>system<|end_header_id|>\n\n{{content}}<|eot_id|>"]
    ),
    format_observation=StringFormatter(
        slots=[
            (
                "<|start_header_id|>tool<|end_header_id|>\n\n{{content}}<|eot_id|>"
                "<|start_header_id|>assistant<|end_header_id|>\n\n"
            )
        ]
    ),
    default_system="You are a helpful assistant.",
    stop_words=["<|eot_id|>"],  # does not match the original eos_token
    replace_eos=True,
)
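One plausible reading of how this template leads to the fine-tuned special_tokens_map.json shown earlier: with replace_eos=True, the first stop word appears to be installed as the tokenizer's eos_token, and a missing pad_token then falls back to the eos_token. The sketch below models that behavior on a plain dict. It is a simplified assumption, not LLaMA-Factory's actual code, and `apply_template_tokens` is a hypothetical name.

```python
from typing import Dict, List, Optional


def apply_template_tokens(
    special_tokens: Dict[str, Optional[str]],
    stop_words: List[str],
    replace_eos: bool,
) -> Dict[str, Optional[str]]:
    """Model the assumed effect of a template on the tokenizer's special tokens."""
    updated = dict(special_tokens)
    if replace_eos:
        if not stop_words:
            raise ValueError("replace_eos requires at least one stop word")
        # Assumption: the first stop word becomes the new eos_token.
        updated["eos_token"] = stop_words[0]
    if updated.get("pad_token") is None:
        # Assumption: a missing pad_token falls back to the eos_token.
        updated["pad_token"] = updated.get("eos_token")
    return updated


original = {"bos_token": "<|begin_of_text|>", "eos_token": "<|end_of_text|>"}
after_template = apply_template_tokens(original, ["<|eot_id|>"], replace_eos=True)
```

Under these assumptions, `after_template` has both eos_token and pad_token set to `<|eot_id|>`, which is consistent with the fine-tuned configuration file.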

Solution

Switch to the eos_token from the original model's configuration. Modify the llama3 template in LLaMA-Factory/src/llmtuner/data/template.py as follows:

_register_template(
    name="llama3",
    format_user=StringFormatter(
        slots=[
            (
                "<|start_header_id|>user<|end_header_id|>\n\n{{content}}<|end_of_text|>"  # changed from <|eot_id|>
                "<|start_header_id|>assistant<|end_header_id|>\n\n"
            )
        ]
    ),
    format_system=StringFormatter(
        slots=[{"bos_token"}, "<|start_header_id|>system<|end_header_id|>\n\n{{content}}<|end_of_text|>"]  # changed from <|eot_id|>
    ),
    format_observation=StringFormatter(
        slots=[
            (
                "<|start_header_id|>tool<|end_header_id|>\n\n{{content}}<|end_of_text|>"  # changed from <|eot_id|>
                "<|start_header_id|>assistant<|end_header_id|>\n\n"
            )
        ]
    ),
    default_system="You are a helpful assistant.",
    stop_words=["<|end_of_text|>"],  # changed from <|eot_id|>
    replace_eos=True,
)

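Then re-run fine-tuning and inference with the modified template.

Results

After fine-tuning with the modified template, inference produces no garbled text and generation stops on its own.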

kourtney kardashian, kim kardashian, khloe kardashian, rob kardashian, kendall jenner, and kylie jenner.

To explore

Still to be tested: whether the problem reappears with the quantized model.

posted @ 2024-05-28 17:41 sss1001
