Deep Learning on Flink: Environment Setup
Using a Tencent Cloud environment; the base environment preparation is recorded here to save repeating the work.
I. Environment Requirements
Python: 3.7
pip
CMake: >= 3.6
Java: 1.8
Maven: >= 3.3.0
II. Java
1. Download: https://www.oracle.com/java/technologies/javase/javase-jdk8-downloads.html
2. Extract: tar -xzvf jdk-8u281-linux-x64.tar.gz
3. Edit the environment variable file: sudo vim /etc/profile
export JAVA_HOME=/home/ubuntu/jdk1.8.0_281
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
III. CMake
1. Download: https://cmake.org/download/
2. Extract: tar -xzvf cmake-3.20.0-linux-x86_64.tar.gz
3. Edit the environment variable file: sudo vim /etc/profile
export CMAKE_HOME=/home/ubuntu/cmake-3.20.0-linux-x86_64
IV. Maven
1. Download: http://maven.apache.org/download.cgi
2. Extract: tar -xzvf apache-maven-3.6.3-bin.tar.gz
3. Edit the environment variable file: sudo vim /etc/profile
export MAVEN_HOME=/home/ubuntu/apache-maven-3.6.3
4. Configure the Aliyun mirror: vim apache-maven-3.6.3/conf/settings.xml
<mirror>
    <id>nexus-aliyun</id>
    <name>Nexus aliyun</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
    <mirrorOf>central</mirrorOf>
</mirror>
V. Summary
1. Edit the environment variable file: sudo vim /etc/profile
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$CMAKE_HOME/bin:$MAVEN_HOME/bin:$PATH
2. Apply the changes:
source /etc/profile
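After sourcing the profile, it is worth confirming that each tool resolves on PATH before moving on. A minimal sketch (`check_tool` is a hypothetical helper; the tool names follow the requirements list above):

```shell
# Hypothetical helper: report whether a required tool from the list
# above is resolvable on PATH after `source /etc/profile`.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: OK"
  else
    echo "$1: NOT FOUND"
  fi
}

for tool in java mvn cmake; do
  check_tool "$tool"
done
```

If any tool reports NOT FOUND, re-check the corresponding export lines in /etc/profile.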
VI. Python
1. Install the Python development headers: sudo apt install python-dev
2. Install pip: sudo apt install python-pip
3. Install pip3: sudo apt install python3-pip
4. Install dependency packages: pip install -U --user pip six numpy wheel mock grpcio grpcio-tools
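A quick sanity check that the packages installed above actually import is a minimal sketch (`check_py_mod` is a hypothetical helper; note that the grpcio package imports as `grpc`):

```shell
# Hypothetical helper: verify a Python package installed above imports
# cleanly under the interpreter that will run the jobs.
check_py_mod() {
  if python3 -c "import $1" 2>/dev/null; then
    echo "$1: ok"
  else
    echo "$1: missing"
  fi
}

for mod in six numpy wheel mock grpc; do
  check_py_mod "$mod"
done
```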
VII. Build
0. Install a missing package: sudo apt-get install zlib1g-dev
1. Enter the project directory: cd deep-learning-on-flink/
2. Build, skipping tests: mvn -DskipTests=true clean install
VIII. Hadoop
1. Download: https://hadoop.apache.org/releases.html
2. Extract: tar -xzvf hadoop-2.10.1.tar.gz
3. Create directories for HDFS: sudo mkdir -p /home/ubuntu/hdfs/{name,data,tmp}
4. Edit hadoop-env.sh, replacing ${JAVA_HOME} with the explicit JDK path: vim etc/hadoop/hadoop-env.sh
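Step 4 can be scripted instead of edited by hand. A sketch, assuming hadoop-env.sh still has its stock `export JAVA_HOME=` line (`pin_java_home` is a hypothetical helper; the paths in the usage comment are the ones used earlier in this guide):

```shell
# Hypothetical helper: replace the JAVA_HOME line in hadoop-env.sh
# with an explicit JDK path, as step 4 describes.
pin_java_home() {
  env_file=$1
  jdk_path=$2
  sed -i "s|^export JAVA_HOME=.*|export JAVA_HOME=$jdk_path|" "$env_file"
}

# usage (paths from this guide):
# pin_java_home etc/hadoop/hadoop-env.sh /home/ubuntu/jdk1.8.0_281
```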
5. Edit core-site.xml: vim etc/hadoop/core-site.xml
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/ubuntu/hdfs/tmp</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>*</value>
    </property>
</configuration>
6. Edit hdfs-site.xml: vim etc/hadoop/hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/ubuntu/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/ubuntu/hdfs/data</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
    </property>
</configuration>
7. Edit yarn-site.xml: vim etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:18040</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:18030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:18088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:18025</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:18141</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>4096</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>2048</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>4096</value>
    </property>
</configuration>
8. Copy mapred-site.xml.template to mapred-site.xml, then edit it: vim etc/hadoop/mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>2048</value>
    </property>
    <property>
        <name>mapreduce.map.memory.mb</name>
        <value>2048</value>
    </property>
</configuration>
9. Edit slaves, adding master and slave1
10. Edit hosts, mapping each host's IP to master and slave1: sudo vim /etc/hosts
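For step 10, the /etc/hosts entries look like the following sketch (the IPs are placeholders for each node's private address; if slave2 joins the ZooKeeper ensemble later in this guide, map it here too):

```
10.0.0.1    master
10.0.0.2    slave1
```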
11. Copy the distribution to the other nodes: scp -r hadoop-2.10.1 ubuntu@slave1:/home/ubuntu
12. Start the cluster
./bin/hdfs namenode -format
# Web UI: master_host:50070
./sbin/start-dfs.sh
# Web UI: master_host:18088
./sbin/start-yarn.sh
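After step 12, a quick way to confirm the daemons came up is to look for them in `jps` output. A minimal sketch (`check_daemon` is a hypothetical helper; on the master the daemons below are expected, while a worker would show DataNode and NodeManager instead):

```shell
# Hypothetical helper: report whether a Hadoop daemon appears in jps output.
check_daemon() {
  daemon=$1
  jps_out=$2
  if echo "$jps_out" | grep -qw "$daemon"; then
    echo "$daemon: running"
  else
    echo "$daemon: not running"
  fi
}

jps_out=$(jps 2>/dev/null || true)
for d in NameNode SecondaryNameNode ResourceManager; do
  check_daemon "$d" "$jps_out"
done
```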
IX. ZooKeeper
1. Download: https://zookeeper.apache.org/releases.html
2. Extract: tar -xzvf apache-zookeeper-3.7.0-bin.tar.gz
3. Edit the environment variable file: sudo vim /etc/profile
export ZOOKEEPER_HOME=/home/ubuntu/apache-zookeeper-3.7.0-bin
4. Copy the sample config: cp conf/zoo_sample.cfg conf/zoo.cfg
5. Edit the config file: vim conf/zoo.cfg
# default settings
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
# data persistence path
dataDir=/home/ubuntu/apache-zookeeper-3.7.0-bin/data/1
# cluster hosts: quorum port and leader-election port
server.1=master:2888:3888
server.2=slave1:2889:3889
server.3=slave2:2890:3890
6. Prepare the data persistence directory
# set up the path
mkdir data
cd data
mkdir 1
# create the myid file and record the id as 1
cd 1
vim myid
# content: 1
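Step 6 has to be repeated per server id on each host, so it is convenient to script. A minimal sketch (`make_myid` is a hypothetical helper; the path in the usage comment mirrors the dataDir above):

```shell
# Hypothetical helper: create the dataDir for a given server id and
# write its myid file, matching the server.N entries in zoo.cfg.
make_myid() {
  id=$1
  base=$2
  mkdir -p "$base/$id"
  echo "$id" > "$base/$id/myid"
}

# usage (path from this guide; run with the id matching each host):
# make_myid 1 /home/ubuntu/apache-zookeeper-3.7.0-bin/data
```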
7. Copy the distribution to the other nodes: scp -r apache-zookeeper-3.7.0-bin ubuntu@slave:/home/ubuntu
8. Start: ./bin/zkServer.sh start
X. Passwordless SSH
1. Generate a key pair: ssh-keygen
2. Authorize the public key on the local machine: cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
3. Copy the public key to the other hosts: ssh-copy-id ubuntu@slave_host
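The key steps above can also run non-interactively, which helps when preparing several nodes at once. A sketch using demo file paths (not the real ~/.ssh paths, to avoid clobbering existing keys):

```shell
# Hypothetical non-interactive variant: generate a key pair without
# prompts (-N "" = empty passphrase) and authorize it locally.
# Demo paths; substitute ~/.ssh/id_rsa and ~/.ssh/authorized_keys.
key=./demo_id_rsa
ssh-keygen -t rsa -N "" -f "$key" -q
cat "$key.pub" >> ./demo_authorized_keys
chmod 600 ./demo_authorized_keys
```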
XI. Docker
1. Remove old versions: sudo apt-get remove docker docker-engine docker.io containerd runc
2. Install apt prerequisites: sudo apt-get install apt-transport-https ca-certificates curl gnupg-agent software-properties-common
3. Add Docker's official GPG key (via the USTC mirror): curl -fsSL https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
4. Set up the stable repository: sudo add-apt-repository "deb [arch=amd64] https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu/ $(lsb_release -cs) stable"
5. Update the apt package index: sudo apt-get update
6. Install the latest Docker Engine - Community and containerd: sudo apt-get install docker-ce docker-ce-cli containerd.io
7. Add the current user to the docker group to grant it permission to run docker
sudo groupadd docker           # create the docker group
sudo gpasswd -a ubuntu docker  # add the user ubuntu to the docker group
newgrp docker                  # refresh group membership
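Group membership only takes effect on a new login session (or via `newgrp docker`), so it is worth checking before trying docker without sudo. A minimal sketch (`in_group` is a hypothetical helper):

```shell
# Hypothetical helper: check whether the current user is in a group,
# by scanning the group list reported by `id -nG`.
in_group() {
  id -nG | tr ' ' '\n' | grep -qx "$1"
}

if in_group docker; then
  echo "docker group: active; docker should run without sudo"
else
  echo "docker group: not active yet; re-login or run: newgrp docker"
fi
```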
