[Hadoop Cluster] Part 3: Adding the Tez and Spark Engines to Hive

How to switch Hive's execution engine (run inside the Hive CLI):

set hive.execution.engine=spark;
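The property can also be toggled per session, which is handy for comparing engines without touching any config file. A minimal check, assuming Hive is already on the PATH:

# Print the current engine, then switch it for this session only
hive -e 'set hive.execution.engine; set hive.execution.engine=mr;'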

I. Adding Tez

1. Extract the minimal tarball locally

mkdir /opt/module/tez
tar -zxvf /opt/software/tez-0.10.1-SNAPSHOT-minimal.tar.gz -C /opt/module/tez

2. Create an HDFS directory and upload the full (non-minimal) tarball

hadoop fs -mkdir /tez
hadoop fs -put /opt/software/tez-0.10.1-SNAPSHOT.tar.gz /tez
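A quick sanity check that the upload landed where tez-site.xml will expect it:

hadoop fs -ls /tez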

3. Create tez-site.xml under the Hadoop config directory

vim $HADOOP_HOME/etc/hadoop/tez-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/tez/tez-0.10.1-SNAPSHOT.tar.gz</value>
  </property>
  <property>
    <name>tez.use.cluster.hadoop-libs</name>
    <value>true</value>
  </property>
  <property>
    <name>tez.am.resource.memory.mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>tez.am.resource.cpu.vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>tez.container.max.java.heap.fraction</name>
    <value>0.4</value>
  </property>
  <property>
    <name>tez.task.resource.memory.mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>tez.task.resource.cpu.vcores</name>
    <value>1</value>
  </property>
</configuration>
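tez.lib.uris is expanded against fs.defaultFS from core-site.xml, so the value above resolves to the tarball uploaded in step 2. If Hive or Hadoop commands run from more than one node, the new file must exist on each of them; a sketch using scp (the hadoop103/hadoop104 hostnames are assumptions, adapt to your cluster):

# Hypothetical host list; copy the new config to the other nodes
for host in hadoop103 hadoop104; do
  scp $HADOOP_HOME/etc/hadoop/tez-site.xml $host:$HADOOP_HOME/etc/hadoop/
done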

Then create a Hadoop shell profile so the Tez jars are added to the Hadoop classpath:

vim $HADOOP_HOME/etc/hadoop/shellprofile.d/tez.sh

hadoop_add_profile tez
function _tez_hadoop_classpath
{
  # Append the Hadoop conf dir and the local Tez jars to the classpath
  hadoop_add_classpath "$HADOOP_HOME/etc/hadoop" after
  hadoop_add_classpath "/opt/module/tez/*" after
  hadoop_add_classpath "/opt/module/tez/lib/*" after
}
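The profile takes effect for newly launched hadoop commands, so it can be verified right away (the grep pattern is just illustrative):

# Each entry added by tez.sh should show up here
hadoop classpath | tr ':' '\n' | grep -i tez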

4. Add the following to Hive's hive-site.xml

<property>
 <name>hive.execution.engine</name>
 <value>tez</value>
</property>
<property>
 <name>hive.tez.container.size</name>
 <value>1024</value>
</property>
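After restarting Hive, a quick smoke test should launch a Tez DAG on YARN; a sketch with a hypothetical table name (if it fails, see the fixes below):

# The INSERT should run as a Tez DAG (table name is made up)
hive -e 'create table if not exists tez_smoke(id int); insert into tez_smoke values(1);'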

Resolve the logging conflict (Tez ships an SLF4J binding that clashes with Hive's):

rm /opt/module/tez/lib/slf4j-log4j12-1.7.10.jar
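If your Tez build bundles a different SLF4J version, locate the jar first instead of hard-coding the name:

ls /opt/module/tez/lib | grep slf4j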

A problem you may hit when running queries:

Current usage: 303.3 MB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual memory used. Killing container.

This is YARN's virtual-memory check killing the container (virtual memory, not a virtual machine). Disable the check in yarn-site.xml:

<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
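The change only takes effect once the NodeManagers reload it; distribute yarn-site.xml and restart YARN (host list is an assumption, same as above):

for host in hadoop103 hadoop104; do
  scp $HADOOP_HOME/etc/hadoop/yarn-site.xml $host:$HADOOP_HOME/etc/hadoop/
done
# Run on the ResourceManager node
stop-yarn.sh && start-yarn.sh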

  

II. Adding the Spark Engine

1. Extract locally (and rename the directory to match the SPARK_HOME used below)

tar -zxvf /opt/software/spark-3.0.0-bin-hadoop3.2.tgz -C /opt/module/
# Rename so the path matches the SPARK_HOME set below
mv /opt/module/spark-3.0.0-bin-hadoop3.2 /opt/module/spark

Edit the environment variables:

sudo vim /etc/profile.d/my_env.sh

# SPARK_HOME
export SPARK_HOME=/opt/module/spark
export PATH=$PATH:$SPARK_HOME/bin
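Reload the profile and confirm the binaries resolve before moving on:

source /etc/profile.d/my_env.sh
spark-submit --version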

2. Create Spark's default configuration in Hive's conf directory

vim /opt/module/hive/conf/spark-defaults.conf

spark.master                    yarn
spark.eventLog.enabled          true
spark.eventLog.dir              hdfs://hadoop102:8020/spark-history
spark.executor.memory           1g
spark.driver.memory             1g
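spark.eventLog.dir points at an HDFS directory that Spark expects to exist already; create it up front, or the first query will fail complaining the log directory is missing:

hadoop fs -mkdir /spark-history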

Add the following to hive-site.xml:

<!-- Location of the Spark dependency jars (note: port 8020 must match the NameNode's RPC port) -->
<property>
    <name>spark.yarn.jars</name>
    <value>hdfs://hadoop102:8020/spark-jars/*</value>
</property>
  
<!-- Hive execution engine -->
<property>
    <name>hive.execution.engine</name>
    <value>spark</value>
</property>
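The port note in the snippet above can be checked against the cluster's actual fs.defaultFS (a quick way, using the standard HDFS client tool):

hdfs getconf -confKey fs.defaultFS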

3. Upload the jars from the "pure" (without-hadoop) Spark build to HDFS, so Spark's bundled Hadoop classes don't conflict with the cluster's own

tar -zxvf /opt/software/spark-3.0.0-bin-without-hadoop.tgz

hadoop fs -mkdir /spark-jars

hadoop fs -put spark-3.0.0-bin-without-hadoop/jars/* /spark-jars
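A final sanity check on the upload, then a smoke test on the Spark engine (table name is hypothetical; the first query is slow while the Spark session starts on YARN):

hadoop fs -ls /spark-jars | head
hive -e 'create table if not exists spark_smoke(id int); insert into spark_smoke values(1);'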

 

Final result: the test queries run successfully on the new engines. (Result screenshots omitted.)
