Spark configuration files
1. Upload spark-2.4.0-bin-hadoop2.6.tgz to /opt and extract it to /usr/local
tar -zxf /opt/spark-2.4.0-bin-hadoop2.6.tgz -C /usr/local/
2. Go to /usr/local/spark-2.4.0-bin-hadoop2.6/conf
Copy slaves.template: cp slaves.template slaves
Edit slaves: delete the localhost entry, then add:
slave1
slave2
slave3
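The slaves file is just one worker hostname per line, so it can also be generated instead of edited by hand. A minimal sketch, assuming the three hostnames above; write_slaves and CONF_DIR are hypothetical names introduced here, not part of Spark:

```shell
# write_slaves is a hypothetical helper: it writes one hostname per line
# to <conf_dir>/slaves, replacing whatever was there (including localhost).
write_slaves() {
  local conf_dir="$1"
  shift
  printf '%s\n' "$@" > "${conf_dir}/slaves"
}

# CONF_DIR defaults to a scratch directory so the sketch is safe to try;
# on a real install it would be /usr/local/spark-2.4.0-bin-hadoop2.6/conf.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
write_slaves "$CONF_DIR" slave1 slave2 slave3
cat "${CONF_DIR}/slaves"
```

Each hostname must be resolvable from the master, for example through /etc/hosts.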
3. Edit spark-defaults.conf
cp spark-defaults.conf.template spark-defaults.conf
vi spark-defaults.conf
Add:
spark.master spark://master:7077
spark.eventLog.enabled true
spark.eventLog.dir hdfs://master:8020/spark-logs
spark.history.fs.logDirectory hdfs://master:8020/spark-logs
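The event log directory and the history server log directory have to point at the same location, otherwise the history server will not see any finished applications. A small sketch of a consistency check; conf_value and check_log_dirs are hypothetical helpers, and the parser assumes whitespace-separated "key value" lines as used above:

```shell
# conf_value FILE KEY: print the value of KEY from a spark-defaults.conf
# style file (assumes "key value" separated by whitespace).
conf_value() {
  awk -v key="$2" '$1 == key { print $2 }' "$1"
}

# check_log_dirs FILE: succeed only if both log directory settings are
# present and identical.
check_log_dirs() {
  local file="$1"
  local event_dir history_dir
  event_dir="$(conf_value "$file" spark.eventLog.dir)"
  history_dir="$(conf_value "$file" spark.history.fs.logDirectory)"
  if [ -n "$event_dir" ] && [ "$event_dir" = "$history_dir" ]; then
    echo "OK: both settings point at ${event_dir}"
  else
    echo "MISMATCH: eventLog.dir='${event_dir}' history.fs.logDirectory='${history_dir}'"
    return 1
  fi
}

# Demo against a scratch copy of the settings listed above.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
spark.master spark://master:7077
spark.eventLog.enabled true
spark.eventLog.dir hdfs://master:8020/spark-logs
spark.history.fs.logDirectory hdfs://master:8020/spark-logs
EOF
check_log_dirs "$sample"
```

On a real install the argument would be the spark-defaults.conf in the Spark conf directory.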
4. Edit spark-env.sh
cp spark-env.sh.template spark-env.sh
vi spark-env.sh
Add:
JAVA_HOME=/usr/java/jdk1.8.0_151
HADOOP_CONF_DIR=/usr/local/hadoop-2.6.5/etc/hadoop
SPARK_MASTER_HOST=master
SPARK_MASTER_PORT=7077
SPARK_WORKER_MEMORY=512m
SPARK_WORKER_CORES=1
SPARK_EXECUTOR_MEMORY=512m
SPARK_EXECUTOR_CORES=1
SPARK_WORKER_INSTANCES=1
5. Start the Hadoop cluster, then create the log directory in HDFS:
hdfs dfs -mkdir /spark-logs
6. Distribute the Spark installation to the other nodes
scp -r /usr/local/spark-2.4.0-bin-hadoop2.6/ slave1:/usr/local/
scp -r /usr/local/spark-2.4.0-bin-hadoop2.6/ slave2:/usr/local/
scp -r /usr/local/spark-2.4.0-bin-hadoop2.6/ slave3:/usr/local/
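The three scp commands above differ only in the hostname, so they can be folded into a loop. A sketch under the assumption that passwordless SSH to each worker is already set up; distribute_dir and DRY_RUN are hypothetical names, and by default the function only prints the commands it would run:

```shell
# distribute_dir SRC DEST HOST...: copy SRC to DEST on every HOST.
# With DRY_RUN=1 (the default here) the scp commands are only printed.
distribute_dir() {
  local src="$1"
  local dest="$2"
  shift 2
  local host
  for host in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "scp -r ${src} ${host}:${dest}"
    else
      scp -r "${src}" "${host}:${dest}"
    fi
  done
}

distribute_dir /usr/local/spark-2.4.0-bin-hadoop2.6/ /usr/local/ slave1 slave2 slave3
```

Setting DRY_RUN=0 before the call performs the actual copy.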
7. Configure the Spark environment variables on all nodes
vi /etc/profile
Append to the end of the file:
export SPARK_HOME=/usr/local/spark-2.4.0-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin
Run source /etc/profile to apply the changes
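A quick way to confirm that the profile change took effect in the current shell is to check whether $SPARK_HOME/bin is on PATH. A minimal sketch; path_contains is a hypothetical helper, and the fallback value for SPARK_HOME is the install path used in this guide:

```shell
# path_contains DIR: succeed if DIR is one of the entries in PATH.
path_contains() {
  case ":${PATH}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

SPARK_HOME="${SPARK_HOME:-/usr/local/spark-2.4.0-bin-hadoop2.6}"
if path_contains "${SPARK_HOME}/bin"; then
  echo "${SPARK_HOME}/bin is on PATH"
else
  echo "${SPARK_HOME}/bin is not on PATH; re-run: source /etc/profile"
fi
```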
8. Start Spark
Go to /usr/local/spark-2.4.0-bin-hadoop2.6/sbin
Run
./start-all.sh
9. Open the Spark master web UI
http://master:8080
1. Go to the bin directory of the Hive installation and edit the hive script
vi hive
Change sparkAssemblyPath=`ls ${SPARK_HOME}/lib/spark-assembly-*.jar`
to:
sparkAssemblyPath=`ls ${SPARK_HOME}/jars/*.jar`
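The same edit can be scripted with sed instead of vi. This is only a sketch: patch_hive_script is a hypothetical helper, it assumes the line appears in the hive script exactly as quoted above, and it keeps a .bak copy of the original. The demo runs against a scratch file rather than the real script:

```shell
# patch_hive_script FILE: point sparkAssemblyPath at the jars directory.
# The original file is kept as FILE.bak.
patch_hive_script() {
  local script="$1"
  cp "$script" "${script}.bak"
  sed 's|/lib/spark-assembly-\*\.jar|/jars/*.jar|' "${script}.bak" > "$script"
}

# Demo on a scratch file containing the line to be changed.
demo="$(mktemp)"
printf '%s\n' 'sparkAssemblyPath=`ls ${SPARK_HOME}/lib/spark-assembly-*.jar`' > "$demo"
patch_hive_script "$demo"
cat "$demo"
```

On a real install the argument would be the hive script in the bin directory of the Hive installation.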
2. Copy hive-site.xml to /usr/local/spark-2.4.0-bin-hadoop2.6/conf on every node
cp /usr/local/apache-hive-1.2.1-bin/conf/hive-site.xml /usr/local/spark-2.4.0-bin-hadoop2.6/conf/
scp /usr/local/apache-hive-1.2.1-bin/conf/hive-site.xml slave1:/usr/local/spark-2.4.0-bin-hadoop2.6/conf/
scp /usr/local/apache-hive-1.2.1-bin/conf/hive-site.xml slave2:/usr/local/spark-2.4.0-bin-hadoop2.6/conf/
scp /usr/local/apache-hive-1.2.1-bin/conf/hive-site.xml slave3:/usr/local/spark-2.4.0-bin-hadoop2.6/conf/
3. Copy the MySQL driver to /usr/local/spark-2.4.0-bin-hadoop2.6/jars on every node
cp /usr/local/apache-hive-1.2.1-bin/lib/mysql-connector-java-5.1.32-bin.jar /usr/local/spark-2.4.0-bin-hadoop2.6/jars/
scp /usr/local/spark-2.4.0-bin-hadoop2.6/jars/mysql-connector-java-5.1.32-bin.jar slave1:/usr/local/spark-2.4.0-bin-hadoop2.6/jars/
scp /usr/local/spark-2.4.0-bin-hadoop2.6/jars/mysql-connector-java-5.1.32-bin.jar slave2:/usr/local/spark-2.4.0-bin-hadoop2.6/jars/
scp /usr/local/spark-2.4.0-bin-hadoop2.6/jars/mysql-connector-java-5.1.32-bin.jar slave3:/usr/local/spark-2.4.0-bin-hadoop2.6/jars/
4. On all nodes, configure the MySQL driver in /usr/local/spark-2.4.0-bin-hadoop2.6/conf/spark-env.sh
SPARK_CLASSPATH=/usr/local/spark-2.4.0-bin-hadoop2.6/jars/mysql-connector-java-5.1.32-bin.jar
5. Start the MySQL service
service mysqld start
6. Start the Hive metastore service
hive --service metastore &
7. Change the log level on each node:
cp /usr/local/spark-2.4.0-bin-hadoop2.6/conf/log4j.properties.template /usr/local/spark-2.4.0-bin-hadoop2.6/conf/log4j.properties
Edit log4j.properties and set:
log4j.rootCategory=WARN, console
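Since this change has to be repeated on each node, it may be convenient to script it. A sketch, assuming the file contains a single log4j.rootCategory line as in the template; set_root_log_level is a hypothetical helper and the demo uses a scratch file:

```shell
# set_root_log_level FILE LEVEL: rewrite the log4j.rootCategory line so
# that the root logger uses LEVEL with the console appender.
# The original file is kept as FILE.bak.
set_root_log_level() {
  local file="$1"
  local level="$2"
  cp "$file" "${file}.bak"
  sed "s/^log4j\.rootCategory=.*/log4j.rootCategory=${level}, console/" "${file}.bak" > "$file"
}

# Demo on a scratch file that starts at INFO.
demo_props="$(mktemp)"
printf '%s\n' 'log4j.rootCategory=INFO, console' > "$demo_props"
set_root_log_level "$demo_props" WARN
grep '^log4j.rootCategory' "$demo_props"
```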
8. Start the Spark cluster
9. Run spark-sql
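One way to confirm that spark-sql can reach the Hive metastore is to run a single statement with the -e option and check that the Hive databases are listed. A minimal sketch; run_smoke_query is a hypothetical wrapper that only guards against spark-sql missing from PATH, and the query itself is just an example:

```shell
# run_smoke_query: list databases through spark-sql if it is installed.
run_smoke_query() {
  if command -v spark-sql >/dev/null 2>&1; then
    spark-sql -e "SHOW DATABASES;"
  else
    echo "spark-sql not found on PATH"
    return 1
  fi
}

run_smoke_query || true
```

Because spark.master is already set in spark-defaults.conf, no --master option should be needed here.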