Setting Up a Hive Environment on Linux

Hive resources

Hive website

http://hive.apache.org

Hive documentation

https://cwiki.apache.org/confluence/display/Hive/GettingStarted

Hive downloads

http://archive.apache.org/dist/hive/

Hive GitHub repository

https://github.com/apache/hive

Installing and configuring Hive

Installing Hive

Extract the Hive package into /opt:

[root@hadoop101 software]# tar -zxvf apache-hive-3.1.3-bin.tar.gz -C /opt/

Rename the extracted directory from apache-hive-3.1.3-bin to hive-3.1.3:

[root@hadoop101 opt]# mv apache-hive-3.1.3-bin  hive-3.1.3

Rename /opt/hive-3.1.3/conf/hive-env.sh.template to hive-env.sh:

[root@hadoop101 conf]# mv hive-env.sh.template hive-env.sh 
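
The extract and rename steps above can be folded into one re-runnable script. The following is only a sketch: install_hive is a hypothetical helper, not something shipped with Hive, and the example call assumes the tarball sits in the current directory.

```shell
#!/bin/sh
# Hypothetical helper: unpack a Hive tarball and give it a shorter directory name.
# Usage: install_hive <tarball> <parent-dir> <extracted-name> <final-name>
install_hive() {
    tarball=$1; parent=$2; extracted=$3; final=$4

    # Skip the work if a previous run already produced the target directory.
    if [ -d "$parent/$final" ]; then
        echo "already installed: $parent/$final"
        return 0
    fi

    tar -zxf "$tarball" -C "$parent" || return 1
    mv "$parent/$extracted" "$parent/$final" || return 1

    # hive-env.sh is shipped only as a template in conf/.
    if [ -f "$parent/$final/conf/hive-env.sh.template" ]; then
        mv "$parent/$final/conf/hive-env.sh.template" "$parent/$final/conf/hive-env.sh"
    fi
    echo "installed: $parent/$final"
}

# Example call with the names used in this post (uncomment to run):
# install_hive apache-hive-3.1.3-bin.tar.gz /opt apache-hive-3.1.3-bin hive-3.1.3
```

Running it a second time leaves an existing /opt/hive-3.1.3 untouched instead of failing on mv.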

Configuring hive-env.sh

Set HADOOP_HOME to the Hadoop install directory:

[root@hadoop101 conf]# pwd
/opt/hive-3.1.3/conf
[root@hadoop101 conf]# vi hive-env.sh

export HADOOP_HOME=/opt/hadoop-3.3.6

Set HIVE_CONF_DIR to the Hive configuration directory:

[root@hadoop101 conf]# vi hive-env.sh

export HIVE_CONF_DIR=/opt/hive-3.1.3/conf
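
Instead of editing hive-env.sh by hand in vi, the two export lines can be written by a small script. This is a sketch under assumptions: set_export is a hypothetical helper, and it only recognizes lines that begin with "export NAME=" (commented-out template lines are left alone).

```shell
#!/bin/sh
# Hypothetical helper: make sure "export NAME=VALUE" appears exactly once in a file.
# Usage: set_export <file> <NAME> <VALUE>
set_export() {
    file=$1; name=$2; value=$3
    touch "$file"

    # Drop any earlier uncommented export of the same variable.
    # grep exits non-zero when it prints nothing, which is fine here.
    grep -v "^export $name=" "$file" > "$file.tmp" || true

    # Write back in place so the original file keeps its permissions.
    cat "$file.tmp" > "$file"
    rm -f "$file.tmp"

    echo "export $name=$value" >> "$file"
}

# Example calls with the paths used in this post (uncomment to run):
# set_export /opt/hive-3.1.3/conf/hive-env.sh HADOOP_HOME /opt/hadoop-3.3.6
# set_export /opt/hive-3.1.3/conf/hive-env.sh HIVE_CONF_DIR /opt/hive-3.1.3/conf
```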

Hadoop cluster setup

HDFS and YARN must be running before Hive is used:

[root@hadoop101 hadoop-3.3.6]# sbin/start-dfs.sh
Starting namenodes on [hadoop101]
Last login: Sun Mar 24 13:37:11 EDT 2024 from 10.211.55.2 on pts/0
hadoop101: Warning: Permanently added 'hadoop101,fe80::21c:42ff:fe8a:7e9%enp0s5' (ECDSA) to the list of known hosts.
hadoop101: ERROR: Cannot execute /opt/software/hadoop/hadoop-3.3.6/libexec/hdfs-config.sh.
Starting datanodes
Last login: Sun Mar 24 14:01:15 EDT 2024 on pts/0
localhost: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
localhost: ERROR: Cannot execute /opt/software/hadoop/hadoop-3.3.6/libexec/hdfs-config.sh.
Starting secondary namenodes [hadoop101]
Last login: Sun Mar 24 14:01:20 EDT 2024 on pts/0
hadoop101: ERROR: Cannot execute /opt/software/hadoop/hadoop-3.3.6/libexec/hdfs-config.sh.

[root@hadoop101 hadoop-3.3.6]# sbin/start-yarn.sh
Starting resourcemanager
Last login: Sun Mar 24 14:01:21 EDT 2024 on pts/0
resourcemanager is running as process 28915.  Stop it first and ensure /tmp/hadoop-root-resourcemanager.pid file is empty before retry.
Starting nodemanagers
Last login: Sun Mar 24 14:10:34 EDT 2024 on pts/0
localhost: ERROR: Cannot execute /opt/software/hadoop/hadoop-3.3.6/libexec/yarn-config.sh.

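Both start scripts reported errors. The "Cannot execute" messages name /opt/software/hadoop/hadoop-3.3.6/libexec, not the /opt/hadoop-3.3.6 directory the scripts were launched from, which suggests that HADOOP_HOME (or a related variable in the shell profile) still points at a different location. YARN additionally reports that a resourcemanager is already running as process 28915.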
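
A quick way to test whether a directory is a complete Hadoop install is to look for the two libexec scripts named in the log above. The sketch below does only that; check_hadoop_home is a hypothetical helper, and passing $HADOOP_HOME to it is an assumption about where the wrong path comes from.

```shell
#!/bin/sh
# Hypothetical check: does this directory contain the libexec scripts
# that start-dfs.sh and start-yarn.sh complained about?
# Usage: check_hadoop_home <dir>
check_hadoop_home() {
    home=$1
    for script in libexec/hdfs-config.sh libexec/yarn-config.sh; do
        if [ ! -f "$home/$script" ]; then
            echo "missing: $home/$script"
            return 1
        fi
    done
    echo "ok: $home"
}

# Example (uncomment to run):
# check_hadoop_home "$HADOOP_HOME"
# check_hadoop_home /opt/hadoop-3.3.6
```

If the first call reports a missing file and the second reports ok, correcting the variable in the shell profile is the likely fix.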

Create the /tmp and /user/hive/warehouse directories on HDFS and make both group-writable:

[root@hadoop101 hadoop-3.3.6]# bin/hadoop fs -mkdir /tmp
[root@hadoop101 hadoop-3.3.6]# bin/hadoop fs -mkdir -p /user/hive/warehouse

[root@hadoop101 hadoop-3.3.6]# bin/hadoop fs -chmod g+w /tmp
[root@hadoop101 hadoop-3.3.6]# bin/hadoop fs -chmod g+w /user/hive/warehouse
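
The mkdir and chmod commands above can be wrapped so that they are safe to repeat. This is a sketch, not part of Hadoop: ensure_hdfs_dir and the HADOOP_CMD variable are hypothetical names, and the default of bin/hadoop assumes the script is run from the Hadoop install directory.

```shell
#!/bin/sh
# HADOOP_CMD is an assumption of this sketch: the command used to reach HDFS.
HADOOP_CMD=${HADOOP_CMD:-bin/hadoop}

# Hypothetical helper: create an HDFS directory if needed and make it group-writable.
# Usage: ensure_hdfs_dir <hdfs-path>
ensure_hdfs_dir() {
    dir=$1
    # "fs -test -d" exits 0 when the directory already exists.
    if ! $HADOOP_CMD fs -test -d "$dir"; then
        $HADOOP_CMD fs -mkdir -p "$dir" || return 1
    fi
    $HADOOP_CMD fs -chmod g+w "$dir"
}

# Example, run from the Hadoop install directory (uncomment to run):
# ensure_hdfs_dir /tmp
# ensure_hdfs_dir /user/hive/warehouse
```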

Using MySQL as the Hive metastore database

See the separate post on installing and deploying MySQL on Linux.

Initialize the Hive metastore schema:

[root@hadoop101 hive-3.1.3]# bin/schematool -initSchema -dbType mysql -verbose
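
schematool connects to MySQL over JDBC, so a MySQL Connector/J jar has to be on Hive's classpath; this post does not show that step. The sketch below is a pre-flight check under assumptions: has_mysql_driver is a hypothetical helper, and the mysql-connector*.jar name pattern is a guess at how the driver file is named on your system.

```shell
#!/bin/sh
# Hypothetical pre-flight check: is a MySQL Connector/J jar present in a lib directory?
# Usage: has_mysql_driver <lib-dir>
has_mysql_driver() {
    lib_dir=$1
    for jar in "$lib_dir"/mysql-connector*.jar; do
        # When nothing matches, the pattern stays unexpanded and -f is false.
        if [ -f "$jar" ]; then
            echo "found: $jar"
            return 0
        fi
    done
    echo "no mysql-connector jar in $lib_dir"
    return 1
}

# Example: only initialize the schema when the driver is in place (uncomment to run):
# has_mysql_driver /opt/hive-3.1.3/lib && \
#     /opt/hive-3.1.3/bin/schematool -initSchema -dbType mysql -verbose
```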

Basic Hive operations

Start the Hive CLI:

[root@hadoop101 hive-3.1.3]# bin/hive

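Open a new terminal and start the Hive metastore service. Because hive.metastore.uris is set in hive-site.xml, the Hive CLI needs this service to be running. (The prompts from here on show a different host, hcss-ecs-d6d7, with Hive 3.1.2.)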

[root@hcss-ecs-d6d7 hive-3.1.2]# bin/hive --service metastore

The service runs in the foreground, so this terminal cannot be used for anything else once it has started.

In another new terminal, list the Hive databases:

hive (default)> show databases;
OK
database_name
bigdata
bigdata2
default
Time taken: 0.62 seconds, Fetched: 3 row(s)
hive (default)> 

Start the HiveServer2 service:

[root@hcss-ecs-d6d7 hive-3.1.2]# bin/hive --service hiveserver2

Do not close this terminal.
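
An alternative to keeping one terminal per service is to launch each service in the background with nohup and wait until its port accepts connections. The following is a sketch: wait_for_port is a hypothetical helper, it relies on bash being installed for the /dev/tcp probe, and the log file names in the example are made up. Ports 9083 and 10000 are the ones configured in hive-site.xml in this post.

```shell
#!/bin/sh
# Hypothetical helper: poll a TCP port once per second until it accepts a connection.
# Usage: wait_for_port <host> <port> <timeout-seconds>
wait_for_port() {
    host=$1; port=$2; timeout=$3
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # bash's /dev/tcp pseudo-device opens a TCP connection;
        # the probe succeeds only if something is listening.
        if bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
            echo "up: $host:$port"
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "timeout: $host:$port"
    return 1
}

# Example, run from the Hive install directory (uncomment to run):
# nohup bin/hive --service metastore > metastore.log 2>&1 &
# wait_for_port localhost 9083 60
# nohup bin/hive --service hiveserver2 > hiveserver2.log 2>&1 &
# wait_for_port localhost 10000 120
```

HiveServer2 can take a while to open its port after the process starts, which is why the example gives it a longer timeout.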

Start the beeline client, replacing hostname with the HiveServer2 host (hcss-ecs-d6d7 in this session):

[root@hcss-ecs-d6d7 hive-3.1.2]# bin/beeline -u jdbc:hive2://hostname:10000 -n root
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hive-3.1.2/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-3.3.6/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://hcss-ecs-d6d7:10000
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.2 by Apache Hive
0: jdbc:hive2://hcss-ecs-d6d7:10000> 
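
For scripted checks, beeline can also run a single statement with -e and exit instead of opening the interactive prompt shown above. A small sketch: hive2_url is a hypothetical helper that only builds the JDBC URL, and the default port of 10000 matches hive.server2.thrift.port in this post.

```shell
#!/bin/sh
# Hypothetical helper: build a HiveServer2 JDBC URL from a host and an optional port.
# Usage: hive2_url <host> [port]
hive2_url() {
    echo "jdbc:hive2://$1:${2:-10000}"
}

# Example, run from the Hive install directory (uncomment to run):
# bin/beeline -u "$(hive2_url hcss-ecs-d6d7)" -n root -e 'show databases;'
```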

Testing the connection from DataGrip

The connection succeeded.

Full hive-site.xml configuration

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- JDBC URL of the metastore database (replace hostname with the MySQL host) -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://hostname:3306/metastore?characterEncoding=utf-8&amp;useSSL=false</value>
    </property>
    <!-- JDBC driver class -->
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <!-- Database user name -->
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>user</value>
    </property>
    <!-- Database password -->
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>pwd</value>
    </property>
    <!-- Authorization for the metastore event notification API -->
    <property>
        <name>hive.metastore.event.db.notification.api.auth</name>
        <value>false</value>
    </property>
    <!-- Thrift URI of the metastore service (replace hostname with the metastore host) -->
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://hostname:9083</value>
    </property>

    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>
    <property>
        <name>datanucleus.schema.autoCreateAll</name>
        <value>true</value>
    </property>

    <!-- Default Hive warehouse directory on HDFS -->
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>
    <!-- Host that HiveServer2 binds to (replace hostname with this machine's host name) -->
    <property>
        <name>hive.server2.thrift.bind.host</name>
        <value>hostname</value>
    </property>
    <!-- Port that HiveServer2 listens on -->
    <property>
        <name>hive.server2.thrift.port</name>
        <value>10000</value>
    </property>
    <!-- Print column headers and the current database name in the Hive CLI -->
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
    </property>
</configuration>
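
To confirm which value a running setup will pick up, a property can be read back from hive-site.xml on the command line. The sketch below is deliberately simple: get_hive_property is a hypothetical helper, and it is not a real XML parser. It assumes the layout used above, with each name and value element on its own line and the value following its name.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> that follows a given <name> in hive-site.xml.
# Usage: get_hive_property <file> <property-name>
get_hive_property() {
    file=$1; name=$2
    awk -v name="$name" '
        # Remember that the wanted <name> line has been seen.
        index($0, "<name>" name "</name>") { found = 1; next }
        # The next <value> line belongs to that property.
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$file"
}

# Example (uncomment to run):
# get_hive_property /opt/hive-3.1.3/conf/hive-site.xml hive.metastore.uris
```

An unknown property name prints nothing, so an empty result means the key is not set in the file.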

Posted 2024-04-20 15:38 by gzshd