Building xgboost with multi-GPU support



Ubuntu 18.04.2
Linux 4.15.0-46-generic
gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0

cuda 10.0
https://docs.nvidia.com/cuda/archive/10.0/cuda-installation-guide-linux/index.html#verify-you-have-supported-version-of-linux
(Installation steps omitted; follow the guide above.)
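
Before building anything against CUDA, it can help to confirm that the toolkit and driver tools are visible. The following is only a sketch: the function name check_cuda_tools is made up for illustration, and it assumes nvcc and nvidia-smi are the tools to look for on PATH.

```shell
# Hypothetical helper: report whether the CUDA command-line tools are on PATH.
check_cuda_tools() {
    missing=0
    for tool in nvcc nvidia-smi; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Non-zero exit status if at least one tool is missing.
    return "$missing"
}

check_cuda_tools || echo "CUDA toolkit or driver tools are not on PATH"
```

If both tools are found, running nvcc --version by hand shows which CUDA release is active (10.0 is what this post expects).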

nccl2
git clone https://github.com/NVIDIA/nccl.git
cd nccl
make -j src.build
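
The xgboost CMake step further down passes the NCCL build directory as NCCL_ROOT, so it may be worth checking that the build actually produced a header and a library first. This is a rough sketch: check_nccl_build and NCCL_BUILD_DIR are hypothetical names, and the include/ and lib/ layout under build/ is an assumption about the usual NCCL build output.

```shell
# Hypothetical helper: check an NCCL build directory for nccl.h and a libnccl* file.
# Usage: check_nccl_build /path/to/nccl/build
check_nccl_build() {
    dir="$1"
    if [ ! -f "$dir/include/nccl.h" ]; then
        echo "missing $dir/include/nccl.h"
        return 1
    fi
    if ! ls "$dir"/lib/libnccl* >/dev/null 2>&1; then
        echo "no libnccl* found in $dir/lib"
        return 1
    fi
    echo "NCCL build looks complete: $dir"
}

# NCCL_BUILD_DIR is an assumed variable; default to ./build when run inside the nccl checkout.
check_nccl_build "${NCCL_BUILD_DIR:-./build}" || true
```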

xgboost
(Building from the source of a stable release, e.g. 0.82, is recommended.)

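mkdir xgboost-src
cd xgboost-src

Either clone together with the submodules in one step:
git clone --recursive https://github.com/dmlc/xgboost.git
cd xgboost

or clone first and fetch the submodules afterwards:
git clone https://github.com/dmlc/xgboost.git
cd xgboost
git submodule init
git submodule update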

Check out version 0.82 (note: the package installed at the end nevertheless reports version 0.81)
git checkout 3f83dcd

mkdir build
cd build
cmake .. -DUSE_CUDA=ON -DUSE_NCCL=ON -DNCCL_ROOT=/xxx/install/nccl-src/nccl/build
make -j4

Wait until output similar to the following appears:
...
Scanning dependencies of target gpuxgboost
[ 95%] Linking CXX static library libgpuxgboost.a
[ 95%] Built target gpuxgboost
Scanning dependencies of target runxgboost
[ 97%] Building CXX object CMakeFiles/runxgboost.dir/src/cli_main.cc.o
[ 98%] Linking CXX executable ../xgboost
[ 98%] Built target runxgboost
Scanning dependencies of target xgboost
[100%] Linking CXX shared library ../lib/libxgboost.so
[100%] Built target xgboost

Then install the Python package:
cd ../python-package
python setup.py install
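
Since the installed version can differ from the one checked out (0.81 instead of 0.82, as noted above), a quick check of what Python actually imports may be useful. A minimal sketch: the function name check_xgboost_version and the PYTHON variable are assumptions added here, not part of the original steps.

```shell
# Hypothetical helper: print the version of the xgboost package that Python imports.
# PYTHON is an assumed override for the interpreter; it defaults to "python".
check_xgboost_version() {
    py="${PYTHON:-python}"
    "$py" -c 'import xgboost; print(xgboost.__version__)' 2>/dev/null \
        || echo "xgboost is not importable with $py"
}

check_xgboost_version
```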

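Note: if you switch gcc/g++ versions with update-alternatives, various reference errors may appear. In that case, switch to one of the installed gcc/g++ versions (e.g. 7.3) and reboot the machine.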
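
One way to see which compilers are currently selected is to compare the versions that gcc and g++ report. This is only a sketch under the assumption that a gcc/g++ version mismatch is a plausible cause of such errors, which the note above does not confirm. The function name check_compiler_pair is made up for illustration.

```shell
# Hypothetical helper: print the active gcc and g++ versions and fail if they differ.
check_compiler_pair() {
    if ! command -v gcc >/dev/null 2>&1 || ! command -v g++ >/dev/null 2>&1; then
        echo "gcc or g++ not found on PATH"
        return 1
    fi
    gcc_ver=$(gcc -dumpversion)
    gxx_ver=$(g++ -dumpversion)
    echo "gcc $gcc_ver, g++ $gxx_ver"
    # Succeeds only when both compilers report the same version string.
    [ "$gcc_ver" = "$gxx_ver" ]
}

check_compiler_pair || echo "compiler versions differ or a compiler is missing"
```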

------------------------------------------------------
tensorflow (for CUDA 10.0)
tensorflow-gpu 1.13.1
pip install tensorflow-gpu==1.13.1

------------------------------------------------------
torch (for CUDA 10.0)
pip install https://download.pytorch.org/whl/cu100/torch-1.0.1.post2-cp36-cp36m-linux_x86_64.whl
pip install torchvision
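
After installing, a short probe can show whether each framework sees a GPU. This is a sketch, not part of the original steps: the probe function and the PYTHON variable are hypothetical, and it relies on torch.cuda.is_available() and, for TensorFlow 1.x, tf.test.is_gpu_available().

```shell
# Hypothetical helper: run a one-line Python probe and print its result,
# or "not available" if the interpreter or the import fails.
probe() {
    label="$1"
    code="$2"
    result=$("${PYTHON:-python}" -c "$code" 2>/dev/null) || result="not available"
    echo "$label: $result"
}

probe "torch CUDA" 'import torch; print(torch.cuda.is_available())'
probe "tensorflow GPU" 'import tensorflow as tf; print(tf.test.is_gpu_available())'
```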
posted @ 2019-04-01 16:04  衣奎德