MindIE Inference Framework

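  MindIE is Huawei's own large-model inference framework. Documentation: https://www.hiascend.com/document/detail/zh/mindie/ (select the version at the top left). The current version is 2.3.0; 2.2.RC1 is reportedly the most stable.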

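  There are three ways to deploy it: prebuilt image, bare metal, and container. The prebuilt image takes the least effort but is not perfect: after starting the image you still have to enter the container, edit the configuration, and start the service by hand. The bare-metal and container options both require installing CANN, MindIE, and the other components manually.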

  MindIE image download page: https://www.hiascend.com/developer/ascendhub/detail/af85b724a7e5469ebd7ea13c3439d48f

  Container start command:

docker run -it -d --net=host --shm-size=16g --privileged --restart always --name qwen72b \
--device=/dev/davinci_manager --device=/dev/hisi_hdc --device=/dev/devmm_svm \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver:ro -v /usr/local/sbin:/usr/local/sbin:ro \
-v /app/model:/model:ro \
swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:2.2.RC1-800I-A2-py311-openeuler24.03-lts bash

  Enter the container: docker exec -it qwen72b bash

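  Environment setup: if the service fails to start with the error "bin/mindieservice_daemon: error while loading shared libraries: libtorch.so: cannot open shared object file: No such file or directory", add the torch library directory to the library search path: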

export LD_LIBRARY_PATH=/usr/local/lib64/python3.11/site-packages/torch/lib/:$LD_LIBRARY_PATH
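If the export above ends up in a script that may be sourced more than once, the same directory gets prepended repeatedly. A minimal sketch of an idempotent variant follows; prepend_ld_path is a hypothetical helper name, not something shipped with MindIE, and it also tolerates an unset LD_LIBRARY_PATH.

```shell
#!/usr/bin/env bash
# prepend_ld_path is a hypothetical helper (not part of MindIE).
# It prepends a directory to LD_LIBRARY_PATH unless it is already listed,
# and works even when LD_LIBRARY_PATH is unset (safe under `set -u`).
prepend_ld_path() {
    local dir="${1%/}"                  # drop one trailing slash before comparing
    local current="${LD_LIBRARY_PATH:-}"
    case ":${current}:" in
        *":${dir}:"*|*":${dir}/:"*) ;;  # already present, nothing to do
        *) export LD_LIBRARY_PATH="${dir}${current:+:${current}}" ;;
    esac
}

# Usage with the torch directory from this post:
# prepend_ld_path /usr/local/lib64/python3.11/site-packages/torch/lib
```

To see which libraries are still unresolved, running ldd ./bin/mindieservice_daemon and looking for "not found" lines is a quick check.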

  Edit the configuration file: /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json

{
    "Version" : "1.0.0",
    "LogConfig" :
    {
        "logLevel" : "Info",
        "logFileSize" : 20,
        "logFileNum" : 20,
        "logPath" : "logs/mindie-server.log"
    },

    "ServerConfig" :
    {
        "ipAddress" : "192.168.68.12",
        "managementIpAddress" : "127.0.0.2",
        "port" : 9000,
        "managementPort" : 1027,
        "metricsPort" : 1028,
        "allowAllZeroIpListening" : false,
        "maxLinkNum" : 1000,
        "httpsEnabled" : false,
        "fullTextEnabled" : false,
        "tlsCaPath" : "security/ca/",
        "tlsCaFile" : ["ca.pem"],
        "tlsCert" : "security/certs/server.pem",
        "tlsPk" : "security/keys/server.key.pem",
        "tlsPkPwd" : "security/pass/key_pwd.txt",
        "tlsCrlPath" : "security/certs/",
        "tlsCrlFiles" : ["server_crl.pem"],
        "managementTlsCaFile" : ["management_ca.pem"],
        "managementTlsCert" : "security/certs/management/server.pem",
        "managementTlsPk" : "security/keys/management/server.key.pem",
        "managementTlsPkPwd" : "security/pass/management/key_pwd.txt",
        "managementTlsCrlPath" : "security/management/certs/",
        "managementTlsCrlFiles" : ["server_crl.pem"],
        "kmcKsfMaster" : "tools/pmt/master/ksfa",
        "kmcKsfStandby" : "tools/pmt/standby/ksfb",
        "inferMode" : "standard",
        "interCommTLSEnabled" : true,
        "interCommPort" : 1121,
        "interCommTlsCaPath" : "security/grpc/ca/",
        "interCommTlsCaFiles" : ["ca.pem"],
        "interCommTlsCert" : "security/grpc/certs/server.pem",
        "interCommPk" : "security/grpc/keys/server.key.pem",
        "interCommPkPwd" : "security/grpc/pass/key_pwd.txt",
        "interCommTlsCrlPath" : "security/grpc/certs/",
        "interCommTlsCrlFiles" : ["server_crl.pem"],
        "openAiSupport" : "vllm"
    },

    "BackendConfig" : {
        "backendName" : "mindieservice_llm_engine",
        "modelInstanceNumber" : 1,
        "npuDeviceIds" : [[0,1,2,3,4,5,6,7]],
        "tokenizerProcessNumber" : 8,
        "multiNodesInferEnabled" : false,
        "multiNodesInferPort" : 1120,
        "interNodeTLSEnabled" : true,
        "interNodeTlsCaPath" : "security/grpc/ca/",
        "interNodeTlsCaFiles" : ["ca.pem"],
        "interNodeTlsCert" : "security/grpc/certs/server.pem",
        "interNodeTlsPk" : "security/grpc/keys/server.key.pem",
        "interNodeTlsPkPwd" : "security/grpc/pass/mindie_server_key_pwd.txt",
        "interNodeTlsCrlPath" : "security/grpc/certs/",
        "interNodeTlsCrlFiles" : ["server_crl.pem"],
        "interNodeKmcKsfMaster" : "tools/pmt/master/ksfa",
        "interNodeKmcKsfStandby" : "tools/pmt/standby/ksfb",
        "ModelDeployConfig" :
        {
            "maxSeqLen" : 32768,
            "maxInputTokenLen" : 16384,
            "truncation" : false,
            "ModelConfig" : [
                {
                    "modelInstanceType" : "Standard",
                    "modelName" : "dsqwen32b",
                    "modelWeightPath" : "/model/deepseek/DeepSeek-R1-Distill-Qwen-32B",
                    "worldSize" : 8,
                    "cpuMemSize" : 5,
                    "npuMemSize" : -1,
                    "backendType" : "atb",
                    "trustRemoteCode" : false
                }
            ]
        },

        "ScheduleConfig" :
        {
            "templateType" : "Standard",
            "templateName" : "Standard_LLM",
            "cacheBlockSize" : 128,

            "maxPrefillBatchSize" : 50,
            "maxPrefillTokens" : 16384,
            "prefillTimeMsPerReq" : 150,
            "prefillPolicyType" : 0,

            "decodeTimeMsPerReq" : 50,
            "decodePolicyType" : 0,

            "maxBatchSize" : 200,
            "maxIterTimes" : 16384,
            "maxPreemptCount" : 0,
            "supportSelectBatch" : false,
            "maxQueueDelayMicroseconds" : 5000
        }
    }
}
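A quick consistency check before starting the service can catch one easy mistake in this file. The sketch below assumes, as the sample configuration suggests, that worldSize should equal the number of device IDs listed in npuDeviceIds. check_world_size is a hypothetical helper, not a MindIE tool, and its sed patterns only handle a single device group written on one line, as in the file above.

```shell
#!/usr/bin/env bash
# check_world_size is a hypothetical helper (not part of MindIE).
# Assumption: worldSize should match the number of IDs in npuDeviceIds.
# Only handles one device group on a single line, e.g. [[0,1,2,3]].
check_world_size() {
    local cfg="$1" ids world count
    # Extract the comma-separated IDs between [[ and ]].
    ids=$(sed -n 's/.*"npuDeviceIds"[[:space:]]*:[[:space:]]*\[\[\([0-9, ]*\)]].*/\1/p' "$cfg" | head -n 1)
    # Extract the integer value of worldSize.
    world=$(sed -n 's/.*"worldSize"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$cfg" | head -n 1)
    if [ -z "$ids" ] || [ -z "$world" ]; then
        echo "could not find npuDeviceIds or worldSize in $cfg" >&2
        return 2
    fi
    # One ID per line, then count the lines that contain a digit.
    count=$(printf '%s\n' "$ids" | tr ',' '\n' | grep -c '[0-9]')
    if [ "$count" -ne "$world" ]; then
        echo "worldSize=$world but npuDeviceIds lists $count devices" >&2
        return 1
    fi
    echo "ok: worldSize=$world matches $count devices"
}

# Usage:
# check_world_size /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json
```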

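  Start the service (run from the mindie-service install directory, /usr/local/Ascend/mindie/latest/mindie-service):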

nohup ./bin/mindieservice_daemon > output.log 2>&1 &

  Test (the model name, address, and port must match modelName, ipAddress, and port in config.json):

curl -H "Accept: application/json" -H "Content-type: application/json" -X POST -d '{"model": "dsqwen32b","messages": [{"role": "user", "content": "What model are you?"}],"stream": false}' http://192.168.68.12:9000/v1/chat/completions
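Loading the weights can take a while, so a request sent right after starting the daemon may simply fail to connect. A minimal sketch of a polling helper follows; wait_until is a hypothetical name, and using the chat request itself as the readiness probe is an assumption made here, not a documented MindIE health check. The address and port in the usage comment are the ipAddress and port values from the ServerConfig shown earlier.

```shell
#!/usr/bin/env bash
# wait_until is a hypothetical helper (not part of MindIE).
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
# Re-runs the command until it succeeds; returns 1 once the timeout has passed.
wait_until() {
    local timeout="$1" interval="$2" deadline
    shift 2
    deadline=$(( $(date +%s) + timeout ))
    until "$@" >/dev/null 2>&1; do
        if [ "$(date +%s)" -ge "$deadline" ]; then
            return 1
        fi
        sleep "$interval"
    done
    return 0
}

# Example: poll for up to 10 minutes, every 5 seconds.
# curl -f makes HTTP error responses count as failures.
# wait_until 600 5 curl -sf -H "Content-type: application/json" -X POST \
#     -d '{"model": "dsqwen32b","messages": [{"role": "user", "content": "ping"}],"stream": false}' \
#     http://192.168.68.12:9000/v1/chat/completions
```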

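  Productionizing: having to enter the container and run commands by hand on every start is clearly not workable. The configuration file should be injected into the container from outside, and the start script should run automatically.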

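  • Inject the configuration file and the start script when starting the container (start.sh must be executable on the host):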
    docker run -it -d --net=host --shm-size=16g --privileged --restart always --name test \
    --device=/dev/davinci_manager --device=/dev/hisi_hdc --device=/dev/devmm_svm -v /usr/local/Ascend/driver:/usr/local/Ascend/driver:ro -v /usr/local/sbin:/usr/local/sbin:ro \
    -v /app2/models:/models -v /app2/scripts/start.sh:/start.sh -v /app2/scripts/config.json:/usr/local/Ascend/mindie/latest/mindie-service/conf/config.json \
    swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:2.2.RC1-800I-A2-py311-openeuler24.03-lts /bin/bash -c "/start.sh"
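  • The configuration file (/app2/scripts/config.json) and the start script (/app2/scripts/start.sh) live on the host. After editing either one, restart the container to launch with different parameters or a different model.
  • Start script: with this script as the container command, start.sh runs automatically every time the container restarts.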
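    #!/bin/bash
    set -euo pipefail
    # The ${VAR:+...} form keeps `set -u` from aborting when LD_LIBRARY_PATH is unset.
    export LD_LIBRARY_PATH="/usr/local/lib64/python3.11/site-packages/torch/lib/${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    cd "$MIES_INSTALL_PATH" || exit 1

    echo "[$(date)] Stopping any old mindieservice_daemon process..."
    pkill -9 -f mindie || true
    sleep 5

    echo "[$(date)] Starting mindieservice_daemon..."
    # Run in the background so that the monitoring loop below is actually reached.
    nohup ./bin/mindieservice_daemon > output.log 2>&1 &

    # Monitor the service process and exit when it exits
    # (key point: the script, and therefore the container, lives as long as the service).
    echo "[$(date)] Monitoring mindieservice_daemon..."
    while pgrep -f mindieservice_daemon > /dev/null; do
        sleep 1
    done
    echo "[$(date)] mindieservice_daemon exited, leaving the script"
    exit 1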
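One thing the script above does not handle is shutdown. When start.sh runs as PID 1 of the container and installs no signal handler, the SIGTERM sent by docker stop is not delivered to it, so Docker falls back to SIGKILL after its grace period. A minimal sketch of a wrapper that forwards the signal to the service follows; run_supervised is a hypothetical helper name, and it assumes the daemon shuts down on SIGTERM, which this post has not verified.

```shell
#!/usr/bin/env bash
# run_supervised is a hypothetical helper (not part of MindIE).
# It runs a command as a child, forwards SIGTERM/SIGINT to it,
# and returns the child's exit status.
run_supervised() {
    "$@" &
    child=$!
    trap 'kill -TERM "$child" 2>/dev/null' TERM INT

    status=0
    wait "$child" || status=$?
    # A trapped signal interrupts `wait` with a status above 128 while the
    # child may still be shutting down; if it is still there, wait once more.
    if [ "$status" -gt 128 ] && kill -0 "$child" 2>/dev/null; then
        status=0
        wait "$child" || status=$?
    fi

    trap - TERM INT
    return "$status"
}

# Possible use in start.sh, in place of the nohup line and the pgrep loop:
# run_supervised ./bin/mindieservice_daemon > output.log 2>&1
```

Compared with polling pgrep once per second, wait returns as soon as the child exits and also hands back its exit status, which can then be passed on as the script's own exit code.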

 

 

posted @ 2026-03-06 14:44  badwood
Badwood's Blog