
LLaMA Factory: Linux Environment Installation

Practical Notes on Installing LLaMA Factory on Linux

1. Cloud provider: Volcano Engine or Alibaba Cloud are recommended; Tencent Cloud is not. Tencent Cloud's preinstalled CUDA driver is an older version that cannot be upgraded (confirmed with their official support) — if a preinstalled CUDA image is offered, use it as-is and don't fight it, since the driver does not support newer CUDA versions and upgrading imposes image requirements. The ModelScope community is fine for beginner practice but not recommended for actual deployment.

2. Python version: 3.9 or above is required; 3.10 has the best compatibility, and anything above 3.10 fails outright with compatibility errors. Mainstream server providers ship Python in their prebuilt images, so pick an image with Python 3.9+ (ideally 3.10) if you can; otherwise compile and install Python yourself. If the server ships Python 2.x, do NOT uninstall the system-level Python — install Python 3.10 alongside it. Python is a core part of the base Linux environment, and removing it tends to break other things; Python 2 and Python 3 can be installed side by side, and you simply designate the default version you want.

3. Virtual environment: when running LLaMA-Factory, a virtual environment is strongly recommended. Never install straight into the system Python — conflicts with system pip packages are hard to untangle. With a venv, a failed install can simply be deleted and redone without affecting the system environment.

4. The first attempt was installed on ModelScope. The ModelScope community images come with most things preinstalled (or "compatible" substitutes), which interferes heavily with later steps. Deployment actually succeeded, but the web UI threw errors; sometimes the preinstalled Python is too new and downgrading is painful, instances occasionally fail to come back after shutdown, and public IPs and re-initializing the image are not supported.

5. The second attempt was on a low-spec Tencent Cloud GPU instance: the CUDA version was too old and could not be upgraded.

6. The third attempt, installed directly on Volcano Engine, worked: the OS was the Rocky Linux 9.1 image specified for this production setup, with Python 3.9.0 and no other preinstalled software. The environment was stable.
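The Python-version window from tip 2 can be verified up front with a short, non-destructive check (the 3.9–3.10 bounds are this guide's recommendation, not a hard upstream constraint):

```shell
# Print the interpreter version and whether it falls inside the 3.9-3.10
# range this guide recommends (the bounds are this guide's advice)
python3 - <<'EOF'
import sys
ok = (3, 9) <= sys.version_info[:2] <= (3, 10)
print("python", "%d.%d" % sys.version_info[:2],
      "ok" if ok else "out of recommended range")
EOF
```

Run this before creating the venv; if it reports "out of recommended range", install a side-by-side Python 3.10 as described in tip 2.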

 

Installation steps:

1. Platform: Volcano Engine

2. Instance configuration

GPU type: NVIDIA Tesla T4 (an entry-level data-center GPU)
System disk: extreme-speed SSD PL0, 200 GiB — size this generously; the CUDA installer needs a lot of free space

3. Go to the home directory and create a working directory:

cd /home
mkdir wwwroot
cd wwwroot

4. Confirm that your Linux install supports CUDA. Run uname -m && cat /etc/*release; you should see output like the sample below. If the architecture is x86_64, CUDA is supported.
root@dsw-1218047-74bb68979c-5blmp:/mnt/workspace# uname -m && cat /etc/*release
x86_64
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.5 LTS"
PRETTY_NAME="Ubuntu 22.04.5 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.5 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

5. Check whether gcc is installed. Run gcc --version; you should see a version banner.

gcc --version

If it is not installed, install gcc.
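On Rocky Linux, a minimal check-and-hint can look like this (the dnf package name is the usual one; your image may differ):

```shell
# Report the gcc version, or print the Rocky install command if it is absent
if command -v gcc >/dev/null 2>&1; then
    gcc --version | head -n 1
else
    echo "gcc missing - install it with: sudo dnf install -y gcc"
fi
```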

6. First check whether your image came with CUDA preinstalled. If the provider let you select a CUDA-preinstalled image at initialization, use that directly — do not install CUDA yourself. Only install manually if nothing is preinstalled. There is a trap here!

We generally assume an NVIDIA GPU here. Most tutorials tell you to check for CUDA with nvcc -V, but in reality that command only works once CUDA's bin directory has been added to the Linux environment variables — if the check comes up empty and you simply install again, you are done for: you will overwrite the image's preinstalled version.
Step one: run nvcc -V to check the driver and CUDA versions. If it prints version information, CUDA is already installed and nothing more is needed. If it prints nothing, run nvidia-smi to check whether the image shipped with a preinstalled driver; if so, just add the environment variables.

nvcc -V
nvidia-smi
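Note the difference between the two commands: nvidia-smi talks to the driver and reports the highest CUDA version the driver supports, while nvcc -V reports the toolkit actually installed on disk — the two can legitimately disagree. A defensive check (a sketch; the flags are standard nvidia-smi/nvcc options) might be:

```shell
# Driver side: succeeds only if the NVIDIA kernel driver is loaded
nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null \
    || echo "no NVIDIA driver visible (nvidia-smi failed)"
# Toolkit side: succeeds only if the CUDA bin directory is on PATH
command -v nvcc >/dev/null 2>&1 && nvcc -V \
    || echo "CUDA toolkit not on PATH (nvcc missing)"
```

If the first line succeeds and the second fails, you are in the "driver preinstalled, just add environment variables" case described above.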

 

If nvcc -V shows nothing but nvidia-smi does, import the environment variables:

Set the environment variables (current terminal session only):
export PATH=/usr/local/cuda-12.9/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.9/lib64:$LD_LIBRARY_PATH

 

To make the variables take effect globally, use the configuration below.

Rocky, CentOS, and Ubuntu differ here, so configure according to your actual system; Rocky is used as the example:

  1. Confirm which shell you are using: Rocky Linux defaults to bash. Verify with:

    echo $SHELL
    
  2. Edit the shell config file: for bash users, edit ~/.bashrc with a text editor such as nano, vim, or gedit. For example, with vim:

    vim  ~/.bashrc
    
  3. Add the environment variables at the end of the file:

    export PATH=/usr/local/cuda-12.9/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda-12.9/lib64:$LD_LIBRARY_PATH
    
  4. Save and exit

  5. Apply the changes by running in the terminal:

    source ~/.bashrc
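The five steps above can be collapsed into one snippet (assumes CUDA lives in /usr/local/cuda-12.9 as in this guide; adjust the path to your actual version):

```shell
# Append the CUDA paths to ~/.bashrc once, then reload the file
cat >> ~/.bashrc <<'EOF'
export PATH=/usr/local/cuda-12.9/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.9/lib64:$LD_LIBRARY_PATH
EOF
source ~/.bashrc
```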

 

 

 

7. Download the required CUDA version — 12.2 is recommended here. https://developer.nvidia.com/cuda-gpus lists NVIDIA GPUs and their compute capabilities; make sure the version you pick is consistent with the check output above.

    Note: choose according to your own system. If CUDA came preinstalled with the image, don't fiddle with it — the provider has already matched the components. Just check whether nvcc -V works; if CUDA is installed but the environment variables are missing, add them as shown above instead of uninstalling. Some OS releases nominally support a newer CUDA than the underlying driver allows, and blindly uninstalling and reinstalling forces a driver upgrade, which in turn is tied to the hardware — an endless pit.

Open the address below and pick the CUDA installer matching your OS, architecture, distribution, version, and installer type:
https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Rocky&target_version=9&target_type=runfile_local

 

If you previously installed CUDA (for example 12.1), uninstall it first with sudo /usr/local/cuda-12.1/bin/cuda-uninstaller. If that command is unavailable, remove it directly (the apt lines apply to Ubuntu/Debian images; skip them on Rocky):

sudo rm -r /usr/local/cuda-12.1/
sudo apt clean && sudo apt autoclean

After uninstalling, download the installer and follow its prompts:

wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run

8. Install CUDA

Note — another trap: the CUDA install is tightly coupled to the Python version. Stick with Python 3.9 or 3.10 and work through any errors as they appear; the official recommendation is >=3.9, <3.10.

sudo sh cuda_12.2.0_535.54.03_linux.run

 

 

Wait a while — the installer is slow...

Installation complete:

9. When done, run nvcc -V and check that the version number appears; if it does, the installation succeeded.

Trap: if nvcc -V shows no version information, the official walkthrough skipped the environment-variable step for Linux. Add the variables as described above — this is mandatory, or LLaMA-Factory will fail later.
nvcc -V

10. Installing LLaMA-Factory

Notes:

1. LLaMA-Factory is at heart a Python project — use a virtual environment. Installing outside one causes pip package conflicts; inside a venv a broken install can simply be deleted and redone, whereas a corrupted system-level package set is beyond saving.
2. If git clone stalls or fails, just download the zip from GitHub and upload and extract it, or go through a proxy (not recommended). Download speed varies by provider — Tencent is fast, ModelScope is acceptable, Volcano Engine struggles. Repository:
https://github.com/hiyouga/LLaMA-Factory.git

11. Download the LLaMA-Factory package and upload it to /home/wwwroot/

1. Extract the package, or pull directly with git: git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
unzip LLaMA-Factory.zip
2. Rename the directory:
mv LLaMA-Factory-main LLaMA-Factory
3. Enter the LLaMA-Factory directory:
cd LLaMA-Factory
4. Create a virtual environment:
python3 -m venv virtualenv
5. Activate the virtual environment:
source virtualenv/bin/activate
6. To exit the virtual environment later (do NOT run this now — listed for reference only):
deactivate
7. Start the install:
pip install -e ".[torch,metrics]"

This install will fail with the error below — version compatibility again. Specifically: the bundled pip is too old (upgrade it to the latest version) and setuptools is missing or outdated.

Fix:
  Upgrade pip:
  /home/wwwroot/LLaMA-Factory/virtualenv/bin/python3 -m pip install --upgrade pip

  Make sure setuptools is current:
  /home/wwwroot/LLaMA-Factory/virtualenv/bin/python3 -m pip install --upgrade setuptools

  Re-run the install command:
  pip install -e ".[torch,metrics]"

 

Full error output:

 |████████████████████████████████| 117 kB 90.1 MB/s 
Collecting antlr4-python3-runtime==4.9.*
  Downloading https://mirrors.ivolces.com/pypi/packages/3e/38/7859ff46355f76f8d19459005ca000b6e7012f2f1ca597746cbcd1fbfe5e/antlr4-python3-runtime-4.9.3.tar.gz (117 kB)
     |████████████████████████████████| 117 kB 98.5 MB/s 
Using legacy 'setup.py install' for fire, since package 'wheel' is not installed.
Using legacy 'setup.py install' for jieba, since package 'wheel' is not installed.
Using legacy 'setup.py install' for antlr4-python3-runtime, since package 'wheel' is not installed.
Installing collected packages: typing-extensions, urllib3, propcache, nvidia-nvjitlink-cu12, multidict, idna, frozenlist, charset-normalizer, certifi, yarl, tqdm, sniffio, six, requests, pyyaml, packaging, nvidia-cusparse-cu12, nvidia-cublas-cu12, mpmath, mdurl, markupsafe, hf-xet, h11, fsspec, filelock, exceptiongroup, attrs, async-timeout, aiosignal, aiohappyeyeballs, zipp, tzdata, triton, sympy, pytz, python-dateutil, pygments, pydantic-core, pycparser, nvidia-nvtx-cu12, nvidia-nccl-cu12, nvidia-cusparselt-cu12, nvidia-cusolver-cu12, nvidia-curand-cu12, nvidia-cufile-cu12, nvidia-cufft-cu12, nvidia-cudnn-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, numpy, networkx, markdown-it-py, jinja2, huggingface-hub, httpcore, dill, anyio, annotated-types, aiohttp, xxhash, websockets, torch, tokenizers, threadpoolctl, starlette, shtab, shellingham, scipy, safetensors, rich, regex, pyparsing, pydantic, pyarrow, psutil, platformdirs, pillow, pandas, multiprocess, llvmlite, kiwisolver, joblib, importlib-resources, httpx, fonttools, eval-type-backport, docstring-parser, cycler, contourpy, click, cffi, uvicorn, tyro, typer, transformers, tomlkit, termcolor, soxr, soundfile, semantic-version, scikit-learn, ruff, python-multipart, pydub, pooch, orjson, numba, msgpack, matplotlib, lazy-loader, gradio-client, ffmpy, fastapi, decorator, datasets, audioread, antlr4-python3-runtime, aiofiles, accelerate, trl, tiktoken, sse-starlette, sentencepiece, protobuf, peft, omegaconf, modelscope, librosa, hf-transfer, gradio, fire, einops, av, torchvision, rouge-chinese, nltk, llamafactory, jieba
    Running setup.py install for antlr4-python3-runtime ... done
    Running setup.py install for fire ... done
  Running setup.py develop for llamafactory
    ERROR: Command errored out with exit status 1:
     command: /home/wwwroot/LLaMA-Factory/virtualenv/bin/python3 -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/home/wwwroot/LLaMA-Factory/setup.py'"'"'; __file__='"'"'/home/wwwroot/LLaMA-Factory/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps
         cwd: /home/wwwroot/LLaMA-Factory/
    Complete output (97 lines):
    /tmp/pip-build-env-p1b5ln85/overlay/lib/python3.9/site-packages/setuptools/dist.py:599: SetuptoolsDeprecationWarning: Invalid dash-separated key 'index-url' in 'easy_install' (setup.cfg), please use the underscore name 'index_url' instead.
    !!
    
            ********************************************************************************
            Usage of dash-separated 'index-url' will not be supported in future
            versions. Please use the underscore name 'index_url' instead.
    
            By 2026-Mar-03, you need to update your project and remove deprecated calls
            or your builds will no longer be supported.
    
            See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
            ********************************************************************************
    
    !!
      opt = self._enforce_underscore(opt, section)
    /tmp/pip-build-env-p1b5ln85/overlay/lib/python3.9/site-packages/setuptools/dist.py:599: SetuptoolsDeprecationWarning: Invalid dash-separated key 'index-url' in 'easy_install' (setup.cfg), please use the underscore name 'index_url' instead.
    !!
    
            ********************************************************************************
            Usage of dash-separated 'index-url' will not be supported in future
            versions. Please use the underscore name 'index_url' instead.
            (Affected: llamafactory).
    
            By 2026-Mar-03, you need to update your project and remove deprecated calls
            or your builds will no longer be supported.
    
            See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
            ********************************************************************************
    
    !!
      opt = self._enforce_underscore(opt, section)
    /tmp/pip-build-env-p1b5ln85/overlay/lib/python3.9/site-packages/setuptools/config/_apply_pyprojecttoml.py:61: SetuptoolsDeprecationWarning: License classifiers are deprecated.
    !!
    
            ********************************************************************************
            Please consider removing the following classifiers in favor of a SPDX license expression:
    
            License :: OSI Approved :: Apache Software License
    
            See https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license for details.
            ********************************************************************************
    
    !!
      dist._finalize_license_expression()
    /tmp/pip-build-env-p1b5ln85/overlay/lib/python3.9/site-packages/setuptools/dist.py:759: SetuptoolsDeprecationWarning: License classifiers are deprecated.
    !!
    
            ********************************************************************************
            Please consider removing the following classifiers in favor of a SPDX license expression:
    
            License :: OSI Approved :: Apache Software License
    
            See https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license for details.
            ********************************************************************************
    
    !!
      self._finalize_license_expression()
    running develop
    /tmp/pip-build-env-p1b5ln85/overlay/lib/python3.9/site-packages/setuptools/_distutils/cmd.py:90: DevelopDeprecationWarning: develop command is deprecated.
    !!
    
            ********************************************************************************
            Please avoid running ``setup.py`` and ``develop``.
            Instead, use standards-based tools like pip or uv.
    
            By 2025-Oct-31, you need to update your project and remove deprecated calls
            or your builds will no longer be supported.
    
            See https://github.com/pypa/setuptools/issues/917 for details.
            ********************************************************************************
    
    !!
      self.initialize_options()
    /home/wwwroot/LLaMA-Factory/virtualenv/bin/python3: No module named pip
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/wwwroot/LLaMA-Factory/setup.py", line 113, in <module>
        main()
      File "/home/wwwroot/LLaMA-Factory/setup.py", line 78, in main
        setup(
      File "/tmp/pip-build-env-p1b5ln85/overlay/lib/python3.9/site-packages/setuptools/__init__.py", line 115, in setup
        return distutils.core.setup(**attrs)
      File "/tmp/pip-build-env-p1b5ln85/overlay/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 186, in setup
        return run_commands(dist)
      File "/tmp/pip-build-env-p1b5ln85/overlay/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 202, in run_commands
        dist.run_commands()
      File "/tmp/pip-build-env-p1b5ln85/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 1002, in run_commands
        self.run_command(cmd)
      File "/tmp/pip-build-env-p1b5ln85/overlay/lib/python3.9/site-packages/setuptools/dist.py", line 1102, in run_command
        super().run_command(command)
      File "/tmp/pip-build-env-p1b5ln85/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 1021, in run_command
        cmd_obj.run()
      File "/tmp/pip-build-env-p1b5ln85/overlay/lib/python3.9/site-packages/setuptools/command/develop.py", line 39, in run
        subprocess.check_call(cmd)
      File "/usr/lib64/python3.9/subprocess.py", line 373, in check_call
        raise CalledProcessError(retcode, cmd)
    subprocess.CalledProcessError: Command '['/home/wwwroot/LLaMA-Factory/virtualenv/bin/python3', '-m', 'pip', 'install', '-e', '.', '--use-pep517', '--no-deps']' returned non-zero exit status 1.
    ----------------------------------------
ERROR: Command errored out with exit status 1: /home/wwwroot/LLaMA-Factory/virtualenv/bin/python3 -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/home/wwwroot/LLaMA-Factory/setup.py'"'"'; __file__='"'"'/home/wwwroot/LLaMA-Factory/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' develop --no-deps Check the logs for full command output.
WARNING: You are using pip version 21.2.3; however, version 25.1.1 is available.
You should consider upgrading via the '/home/wwwroot/LLaMA-Factory/virtualenv/bin/python3 -m pip install --upgrade pip' command.

 

A successful installation ends like this:

 

12. If you hit dependency conflicts, the docs suggest trying:
Warning — this is a trap: do not run this command lightly. These conflicts almost always mean LLaMA-Factory was installed outside a virtual environment and is fighting system-level pip packages; switch to a venv instead.
pip install --no-deps -e .

13. Verify that LLaMA-Factory installed correctly:

llamafactory-cli version

14. Launch the web UI

llamafactory-cli webui

Address: http://127.0.0.1:7860
Because this runs on a cloud server, the web UI cannot be opened locally; open inbound port 7860 in the security group and access it over the public IP.
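By default the UI binds to 127.0.0.1, which is unreachable from outside the server. The Gradio server underneath the web UI honors two standard environment variables (these are standard Gradio settings; confirm against your LLaMA-Factory version):

```shell
# Bind the web UI to all interfaces so the public IP works
# (the default 127.0.0.1 is only reachable from the server itself)
export GRADIO_SERVER_NAME=0.0.0.0
export GRADIO_SERVER_PORT=7860
```

Then start llamafactory-cli webui as in step 14; port 7860 still needs to be open in the security group.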

15. Begin fine-tuning the model

posted on 2025-07-18 14:50 by 飞离地平线