A Fruitless Attempt to Debug the AI Poet with MindSpore 1.0 on Ubuntu with a GPU
Hardware configuration:
Processor: Intel Core i5-7200U @ 2.50GHz, dual-core
Motherboard: Lenovo LNVNB161216 (7th Generation Intel Processor Family I/O - 9D58 laptop chipset)
Graphics card: Nvidia GeForce 940MX (2 GB / Lenovo)
Memory: 8 GB (SK Hynix)
Main drive: Samsung MZNTN512HDJH-000L2 (512 GB / SSD)
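The laptop runs Windows 10 and has very little free disk space left.
I happened to have an SSD-based USB drive on hand, so I set up Windows To Go on it and then installed Ubuntu on it as well.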
Software requirements:
Ubuntu 18.04 x86_64
Python 3.7.5
CUDA 10.1
CuDNN 7.6
gmp 6.1.2
Ubuntu 18.04 ships with Python 3.6, so Python 3.7.5 has to be installed separately.
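With two interpreters on the same machine it is easy to end up running the wrong one. The sketch below compares the running interpreter against the version listed in the requirements. The helper name meets_requirement is made up for illustration, and the exact match on 3.7.5 simply mirrors the requirement list above; it is not a check that MindSpore itself is known to perform.

```python
import sys

REQUIRED = (3, 7, 5)  # version taken from the software requirements above


def meets_requirement(version, required=REQUIRED):
    """Return True when the first three version fields equal `required`."""
    return tuple(version[:3]) == required


if __name__ == "__main__":
    current = sys.version_info
    print("running Python %d.%d.%d" % current[:3])
    print("matches 3.7.5:", meets_requirement(current))
```

Running it with both python and python3 shows which command points at the newly built interpreter.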
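1. Install Python 3.7.5
Download the source: http://cdn.npm.taobao.org/dist/python/3.7.5/Python-3.7.5.tgz
Edit the SSL settings in the Python source package:
Open Modules/Setup in the extracted source directory, search for SSL=, change the path after SSL= to the directory where OpenSSL was installed beforehand, and uncomment the three lines below it.
The required build tools such as gcc and make also need to be installed in advance.
tar -zxvf Python-3.7.5.tgz
cd Python-3.7.5
./configure --prefix=/usr/local/python375
make && sudo make install
sudo ln -s /usr/local/python375/bin/python3.7 /usr/bin/python
sudo ln -s /usr/local/python375/bin/pip3.7 /usr/bin/pip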
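If the SSL= path in Modules/Setup is wrong, the build usually still finishes but the resulting interpreter has no ssl module, and pip then fails on HTTPS downloads. A minimal sketch of a check to run with the freshly built interpreter; the helper name ssl_status is hypothetical.

```python
import importlib


def ssl_status():
    """Return the OpenSSL version string, or None if the ssl module is missing."""
    try:
        ssl = importlib.import_module("ssl")
    except ImportError:
        return None
    return ssl.OPENSSL_VERSION


status = ssl_status()
print("ssl module:", status if status else "NOT available (pip over HTTPS will fail)")
```

If this reports that the module is not available, the SSL= path is the first thing to recheck before rebuilding.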
2. Install CUDA 10.1
sudo dpkg -i cuda-repo-ubuntu1804-10-1-local-10.1.105-418.39_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-<version>/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda
Once it is installed, configure the environment variables:
export CUDA_HOME=/usr/local/cuda-10.1
export PATH=/usr/local/cuda-10.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/extras/CUPTI/lib64:$LD_LIBRARY_PATH
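A typo in these exports is easy to miss, so it can help to confirm that the CUDA directories really ended up in LD_LIBRARY_PATH. The sketch below is only an illustration: the helper missing_entries and the EXPECTED list are made up here, with the two directories taken from the export lines above.

```python
import os


def missing_entries(path_value, expected):
    """Return the entries of `expected` absent from a colon-separated path string."""
    present = {p.rstrip("/") for p in path_value.split(":") if p}
    return [e for e in expected if e.rstrip("/") not in present]


# Directories from the export lines above.
EXPECTED = [
    "/usr/local/cuda-10.1/lib64",
    "/usr/local/cuda-10.1/extras/CUPTI/lib64",
]

if __name__ == "__main__":
    missing = missing_entries(os.environ.get("LD_LIBRARY_PATH", ""), EXPECTED)
    print("missing from LD_LIBRARY_PATH:", missing or "nothing")
```

Run it in a new shell after adding the exports to the shell profile, since variables exported in one terminal do not carry over to another.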
3. Install CuDNN 7.6
Download it from https://developer.nvidia.com/rdp/cudnn-archive, then copy the files from the extracted cuda directory into the CUDA installation:
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
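To confirm which CuDNN version was actually copied over, the version macros in the header can be read back. A minimal sketch, assuming the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as the 7.x headers do; the function name parse_cudnn_version is made up for illustration.

```python
import re


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h, or None if not found."""
    fields = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        m = re.search(r"^\s*#define\s+%s\s+(\d+)" % name, header_text, re.MULTILINE)
        if m is None:
            return None
        fields.append(int(m.group(1)))
    return tuple(fields)


if __name__ == "__main__":
    path = "/usr/local/cuda/include/cudnn.h"  # destination used in the cp command above
    try:
        with open(path, errors="replace") as f:
            print("CuDNN version:", parse_cudnn_version(f.read()))
    except OSError as exc:
        print("could not read", path, "-", exc)
```

A result whose first two fields are 7 and 6 matches the requirement listed earlier.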
4. Install gmp
Install the dependency:
sudo apt-get install m4
After that it is the usual configure; make; make install.
Once everything is installed, verify that the installation succeeded:
import numpy as np
from mindspore import Tensor
from mindspore.ops import functional as F
import mindspore.context as context

context.set_context(device_target="GPU")
x = Tensor(np.ones([1,3,3,4]).astype(np.float32))
y = Tensor(np.ones([1,3,3,4]).astype(np.float32))
print(F.tensor_add(x, y))
OK, on to training.
Note that the computer has to be rebooted after installing CUDA; otherwise training cannot find the device.
AI Poet: https://bbs.huaweicloud.com/forum/thread-80976-1-1.html
https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/nlp_bert_poetry.html
Code download: https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/DemoCode/bert_poetry_c.rar
Download the dataset (43,030 poems): https://github.com/AaronJny/DeepLearningExamples/tree/master/keras-bert-poetry-generator
Download the pre-trained ckpt of the BERT-Base model: it is available from the MindSpore website.
In the configuration file, change 'pre_training_ckpt': './bert_converted.ckpt' so that it points to bert_base.ckpt.
Start training:
python poetry.py
What, an error? Time to go read the code:
parser.add_argument('--device_target', type=str, default='Ascend', help='Device target')
The default training device is Ascend, so pass GPU as an argument:
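The effect of that default can be reproduced in isolation. The sketch below reuses only the add_argument line quoted from poetry.py; the build_parser wrapper and the description string are made up for illustration, and nothing else from the script is assumed.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="device selection sketch")
    # Same argument as in poetry.py: falls back to Ascend unless overridden.
    parser.add_argument('--device_target', type=str, default='Ascend', help='Device target')
    return parser


parser = build_parser()
print(parser.parse_args([]).device_target)                       # Ascend
print(parser.parse_args(['--device_target=GPU']).device_target)  # GPU
```

With no flag the script targets Ascend hardware, which is why the plain python poetry.py run fails on a machine that has none.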
python poetry.py --device_target=GPU
What!!!
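Is a low-end graphics card not worthy? On the CPU one operator is unsupported, and on the GPU it complains that my hardware is too weak... (3,000 words omitted here)
Back to the day job. I'll try again once the 3070 is out...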