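1. Installing CDH 5.12.x

Installation methods

CDH can be installed in three ways:

  • parcels: binary packages that bundle the dependency and version information of the CDH components, which makes switching CDH versions easy. CM downloads, distributes, and activates parcels itself rather than going through yum. Cloudera recommends this method.
  • packages: the rpm method. Only one version of an rpm can be installed at a time, so every upgrade or rollback means reinstalling the rpms, with a risk of package incompatibility. CM calls yum to install the rpms. Not recommended.
  • tarball: download the tar packages and install them by hand. yum cannot be used, which is tedious, and CM cannot upgrade CDH online, so upgrades are fully manual. Not recommended.
    Parcels and tarballs are both essentially archives of the CDH components, but Cloudera provides no support for tarballs.
    This guide installs with parcels. It does not use the automatic CDH installation; the parcel files are downloaded by hand.

CM and its agent can themselves be installed in two ways:

  • Automatic installation: configure the Cloudera yum repository, then install the JDK, CM, and the agent directly with yum.

  • Manual installation: install the JDK by hand, download the CM and agent packages, and install CM and the agent from a local yum repository.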

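Prerequisites

  • OS version: CentOS 6.8
  • JDK: 1.8
  • Python: 2.7
  • Installation account: root (CM depends on the root account)
  • Installation path: the CDH default, /opt/cloudera/parcels
  • Database: MySQL; do not use the embedded PostgreSQL (the walkthrough below ends up using an external PostgreSQL instead)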

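Installation steps

  • Install CentOS 6.8 on VirtualBox (omitted)
  • Set up SSH trust: at minimum, the CM node must be able to ssh into the other nodes
  • Adjust Linux system settings
    Adjust ulimit
    Disable iptables
    Disable SELinux
    Disable swap (omitted)
    Disable IPv6 (omitted)
    Set a static IP
    Edit /etc/hosts and add the IPs of all CDH nodes
  • Install JDK 1.8
  • Install Python 2.7
  • Install the MySQL database and create the databases and users
  • Install ntp
  • Install CM
  • Install the agent
  • Install the CDH components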

Installation walkthrough

Edit /etc/hosts

echo "127.0.0.1 localhost" > /etc/hosts
echo "192.168.0.20 CM"     >> /etc/hosts
echo "192.168.0.21 cdh1"   >> /etc/hosts
echo "192.168.0.22 cdh2"   >> /etc/hosts
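The echo lines above overwrite /etc/hosts every time they run. A minimal sketch of an idempotent alternative follows, assuming the three addresses used in this post (with cdh2 at 192.168.0.22, matching the ssh-copy-id step). The function name add_host_entry and the HOSTS_FILE variable are made up for illustration.

```shell
#!/usr/bin/env bash
# Append "IP NAME" to a hosts file only if NAME is not already mapped.
# HOSTS_FILE defaults to /etc/hosts; override it to try the function safely.
HOSTS_FILE="${HOSTS_FILE:-/etc/hosts}"

add_host_entry() {
    local ip="$1" name="$2"
    # Look for NAME as a whole field (after the IP) on any non-comment line.
    if awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$HOSTS_FILE"; then
        return 0    # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$HOSTS_FILE"
}
```

Typical use, run as root on every node: add_host_entry 192.168.0.20 CM; add_host_entry 192.168.0.21 cdh1; add_host_entry 192.168.0.22 cdh2. Running it twice leaves the file unchanged.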

Set up SSH trust

Run on the CM node:

[root@CM ~]# ssh-keygen -t rsa -P ''

Then copy the key to the other nodes:

ssh-copy-id -i ~/.ssh/id_rsa.pub 192.168.0.21
ssh-copy-id -i ~/.ssh/id_rsa.pub 192.168.0.22

The CM node can now ssh into 192.168.0.21 and 192.168.0.22.
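With more than a couple of nodes, the copy step is easier as a loop that also checks the result. This is only a sketch: the NODES list, the run helper, and the DRY_RUN switch are illustrative names, not part of the original setup.

```shell
#!/usr/bin/env bash
# Copy the CM node's public key to each host, then confirm that a
# non-interactive login works. Set DRY_RUN=1 to only print the commands.
NODES="${NODES:-192.168.0.21 192.168.0.22}"

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

distribute_key() {
    local host
    for host in $NODES; do
        run ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "root@$host"
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        run ssh -o BatchMode=yes "root@$host" true \
            || echo "passwordless login to $host failed" >&2
    done
}
```

Calling DRY_RUN=1 distribute_key first shows exactly which commands would run before any key is copied.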

Adjust Linux system settings

1. Adjust ulimit (omitted)
2. Disable iptables

[root@cdh2 ~]# service iptables stop
iptables: Setting chains to policy ACCEPT: filter          [  OK  ]
iptables: Flushing firewall rules:                         [  OK  ]
iptables: Unloading modules:                               [  OK  ]
[root@cdh2 ~]# chkconfig iptables off

3. Disable swap
The VM is short on memory, so this is skipped.
4. Disable IPv6
Too much trouble, skipped.
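The step list mentions ulimit and SELinux, but the walkthrough shows no commands for them. The following is a rough sketch of how those two settings could be made persistent. The function names are invented here, and the limit of 65536 is an illustrative value, not one taken from this post.

```shell
#!/usr/bin/env bash
# Disable SELinux persistently by rewriting the SELINUX= line.
# $1: path to the SELinux config (normally /etc/selinux/config).
disable_selinux() {
    local config="$1" updated
    updated="$(sed 's/^SELINUX=.*/SELINUX=disabled/' "$config")" || return 1
    printf '%s\n' "$updated" > "$config"
}

# Raise the open-file limit for all users, once.
# $1: path to limits.conf (normally /etc/security/limits.conf).
# 65536 is an assumed, illustrative value.
raise_nofile_limit() {
    local limits="$1"
    grep -q '^\* *soft *nofile' "$limits" || echo '* soft nofile 65536' >> "$limits"
    grep -q '^\* *hard *nofile' "$limits" || echo '* hard nofile 65536' >> "$limits"
}
```

The SELinux change takes effect after a reboot; setenforce 0 switches the running system to permissive mode in the meantime.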

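Install JDK 1.8

Download the matching JDK 1.8 build and unpack it under /opt/jdk1.8.
Then set the Java environment variables in /etc/profile.
Note: the transcript below was recorded with JDK 7u80, which is what was installed at first. It was later replaced with JDK 1.8 (see item 2 in the problems section at the end).

[root@CM opt]# tar -zxvf jdk-7u80-linux-x64.tar.gz
[root@CM opt]# rm jdk-7u80-linux-x64.tar.gz
[root@CM jdk1.7.0_80]# vi /etc/profile # add the following
JAVA_HOME=/opt/jdk1.7.0_80
PATH=$JAVA_HOME/bin:$PATH

export JAVA_HOME
export PATH

Then run the java command (for example, java -version) to check that the installation succeeded.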
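A small check like the one below can confirm that a given directory really is a usable JAVA_HOME before CM is pointed at it. The function name check_java_home is made up for this sketch.

```shell
#!/usr/bin/env bash
# Verify that a directory contains an executable bin/java and print the
# first line of its version banner (java -version writes to stderr).
check_java_home() {
    local home="$1"
    if [ ! -x "$home/bin/java" ]; then
        echo "no executable java under $home" >&2
        return 1
    fi
    "$home/bin/java" -version 2>&1 | head -n 1
}
```

For example, check_java_home /opt/jdk1.7.0_80 or check_java_home "$JAVA_HOME" after sourcing /etc/profile.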

Install Python 2.7

For convenience, install Anaconda and then point the system python at it.
The Python 2.6.6 that ships with CentOS 6 also works.

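Install the MySQL/PostgreSQL database

PostgreSQL is used here because the MySQL installation package is far too large.

[root@CM ~]# yum install postgresql-server # after installation the data directory is /var/lib/pgsql
[root@CM ~]# chkconfig --list # find the PostgreSQL service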
…
postgresql         0:off    1:off    2:off    3:off    4:off    5:off    6:off
..
[root@CM ~]# service postgresql initdb # initialize the PostgreSQL database
Initializing database:                                     [  OK  ]
[root@CM ~]# chkconfig postgresql on # start at boot
[root@CM ~]# service postgresql start # start PostgreSQL
Starting postgresql service:                               [  OK  ]
[root@CM ~]# sudo -u postgres psql
could not change directory to "/root"
psql (8.4.20)
Type "help" for help.

postgres=# create user cdh1;
CREATE ROLE
postgres=# select * from pg_users;
ERROR:  relation "pg_users" does not exist
LINE 1: select * from pg_users;
                      ^
postgres=# select * from pg_user;
 usename  | usesysid | usecreatedb | usesuper | usecatupd |  passwd  | valuntil | useconfig 
----------+----------+-------------+----------+-----------+----------+----------+-----------
 postgres |       10 | t           | t        | t         | ******** |          | 
 cdh1     |    16384 | f           | f        | f         | ******** |          | 
(2 rows)

postgres=# alter user cdh1 with password 'cdh1';
ALTER ROLE

postgres=# create database cdh1 owner cdh1 ENCODING 'UTF-8';
CREATE DATABASE

[root@CM ~]# vi  /var/lib/pgsql/data/pg_hba.conf # change it to the following
# "local" is for Unix domain socket connections only
local   all         all                               trust
# IPv4 local connections:
host    all         all         127.0.0.1/32          md5
host    all         all         192.168.0.0/24        md5
# IPv6 local connections:
host    all         all         ::1/128               ident
[root@CM data]# vi postgresql.conf # change to
listen_addresses = '*'
[root@CM ~]# service postgresql restart
Stopping postgresql service:                               [  OK  ]
Starting postgresql service:                               [  OK  ]
[root@CM ~]# psql -d cdh1 -U cdh1
Password for user cdh1: 
psql (8.4.20)
Type "help" for help.

cdh1=> 

Install ntp

[root@CM ~]# yum install ntp
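Installing the package does not start the daemon. A sketch of the usual follow-up on CentOS 6 is shown below; the run helper and DRY_RUN switch are illustrative additions so the commands can be previewed first.

```shell
#!/usr/bin/env bash
# Enable ntpd at boot, start it, and list the peers it is syncing with.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

enable_ntp() {
    run chkconfig ntpd on
    run service ntpd start
    run ntpq -p    # a '*' in the first column marks the selected time source
}
```

Run enable_ntp as root on every node so that the cluster clocks stay in step.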

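Set up a local yum repository

When installing CM with yum, you can either use Cloudera's remote yum repository, or download the CM packages and set up a local yum repository.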

[root@CM ~]# mkdir -p cloudera_software/RPMS/x86_64
[root@CM ~]# mkdir -p cloudera_software/repodata
[root@CM ~]# cd cloudera_software/RPMS/x86_64
[root@CM x86_64]# wget https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.12.0/RPMS/x86_64/cloudera-manager-agent-5.12.0-1.cm5120.p0.120.el6.x86_64.rpm
[root@CM x86_64]# wget https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.12.0/RPMS/x86_64/cloudera-manager-daemons-5.12.0-1.cm5120.p0.120.el6.x86_64.rpm
[root@CM x86_64]# wget https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.12.0/RPMS/x86_64/cloudera-manager-server-5.12.0-1.cm5120.p0.120.el6.x86_64.rpm
[root@CM x86_64]# wget https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.12.0/RPMS/x86_64/cloudera-manager-server-db-2-5.12.0-1.cm5120.p0.120.el6.x86_64.rpm
[root@CM x86_64]# wget https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.12.0/RPMS/x86_64/enterprise-debuginfo-5.12.0-1.cm5120.p0.120.el6.x86_64.rpm
[root@CM x86_64]# #wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/RPMS/x86_64/jdk-6u31-linux-amd64.rpm
[root@CM x86_64]# #wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/RPMS/x86_64/oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
[root@CM x86_64]# cd ../../repodata
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/filelists.xml.gz
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/filelists.xml.gz.asc
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/other.xml.gz
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/other.xml.gz.asc
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/primary.xml.gz
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/primary.xml.gz.asc
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/repomd.xml
[root@CM repodata]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/repodata/repomd.xml.asc
[root@CM repodata]# cd ..
[root@CM cloudera_software]# wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/RPM-GPG-KEY-cloudera
  • Configure the local yum repository
[root@CM cloudera_software]# cd /etc/yum.repos.d/ # add the following
[cloudera-manager-local]
# Packages for Cloudera Manager, Version 5, on RedHat or CentOS 6 x86_64
name=cloudera-manager-local
baseurl=file:///root/cloudera_software/
gpgkey =file:///root/cloudera_software/RPM-GPG-KEY-cloudera
gpgcheck = 1
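The post does not say which file under /etc/yum.repos.d/ receives this section. A sketch that writes it to a dedicated file follows; the function name write_local_repo and the file name cloudera-manager-local.repo are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Write a yum .repo file that points at a local directory of RPMs.
# $1: absolute path of the repository directory
# $2: path of the .repo file to create
write_local_repo() {
    local repo_dir="$1" repo_file="$2"
    cat > "$repo_file" <<EOF
[cloudera-manager-local]
name=cloudera-manager-local
baseurl=file://$repo_dir/
gpgkey=file://$repo_dir/RPM-GPG-KEY-cloudera
gpgcheck=1
EOF
}
```

For example: write_local_repo /root/cloudera_software /etc/yum.repos.d/cloudera-manager-local.repo, followed by yum clean all. If the downloaded repodata does not match the downloaded RPMs, running createrepo on the directory (from the createrepo package) regenerates the metadata.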

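Download the CDH parcel files

Download the parcel files for your version from https://archive.cloudera.com/cdh5/parcels/ .
Here the newest CDH under https://archive.cloudera.com/cdh5/parcels/latest/ is used:

https://archive.cloudera.com/cdh5/parcels/latest/CDH-5.12.0-1.cdh5.12.0.p0.29-el6.parcel
https://archive.cloudera.com/cdh5/parcels/latest/CDH-5.12.0-1.cdh5.12.0.p0.29-el6.parcel.sha1
https://archive.cloudera.com/cdh5/parcels/latest/manifest.json

If any one of these three files is missing, CM will not find the parcel. After adding new files, restart the CM process.

Put them under /opt/cloudera/parcel-repo. Why there? Because that is the default location for CDH parcels, as the screenshot in the original post (not reproduced here) also shows.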

[root@CM ~]# mkdir -p /opt/cloudera/parcel-repo
[root@CM ~]# mkdir -p /opt/cloudera/parcels
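Since the parcel is a large download, it is worth comparing it against the published .sha1 file before copying it into parcel-repo. A minimal sketch, with verify_parcel as an invented helper name:

```shell
#!/usr/bin/env bash
# Compare a parcel's SHA-1 digest with the hash stored in its .sha1 file.
# $1: path to the .parcel file
# $2: path to the matching .sha1 file (hash expected in the first field)
verify_parcel() {
    local parcel="$1" sha_file="$2" expected actual
    expected="$(awk '{ print $1; exit }' "$sha_file")"
    actual="$(sha1sum "$parcel" | awk '{ print $1 }')"
    if [ "$expected" = "$actual" ]; then
        echo "OK: $parcel"
    else
        echo "MISMATCH: $parcel" >&2
        return 1
    fi
}
```

Usage: verify_parcel CDH-5.12.0-1.cdh5.12.0.p0.29-el6.parcel CDH-5.12.0-1.cdh5.12.0.p0.29-el6.parcel.sha1, and copy the files to /opt/cloudera/parcel-repo only if it prints OK.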

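Install CM

Install CM with yum:

[root@CM cloudera_software]# yum install cloudera-manager-agent cloudera-manager-daemons cloudera-manager-server

I first downloaded the RedHat 5 CM packages by mistake and had to delete them and download again. Remember to run yum clean all afterwards.

After this step the following files exist:

[root@CM x86_64]# ls /etc/cloudera-scm-server
db.properties  log4j.properties

Edit db.properties: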

# Copyright (c) 2012 Cloudera, Inc. All rights reserved.
#
# This file describes the database connection.
#

# The database type
# Currently 'mysql', 'postgresql' and 'oracle' are valid databases.
com.cloudera.cmf.db.type=postgresql

# The database host
# If a non standard port is needed, use 'hostname:port'
com.cloudera.cmf.db.host=localhost

# The database name
com.cloudera.cmf.db.name=cmf

# The database user
com.cloudera.cmf.db.user=cdh1

# The database user's password
com.cloudera.cmf.db.password=cdh1

# The db setup type
# By default, it is set to INIT
# If scm-server uses Embedded DB then it is set to EMBEDDED
# If scm-server uses External DB then it is set to EXTERNAL
com.cloudera.cmf.db.setupType=EXTERNAL

Start CM

[root@CM ~]# service cloudera-scm-server start
Starting cloudera-scm-server:                              [FAILED]
[root@CM ~]# more /var/log/cloudera-scm-server/cloudera-scm-server.out
+======================================================================+
|      Error: JAVA_HOME is not set and Java could not be found         |
+----------------------------------------------------------------------+
| Please download the latest Oracle JDK from the Oracle Java web site  |
|  > http://www.oracle.com/technetwork/java/javase/index.html <        |
|                                                                      |
| Cloudera Manager requires Java 1.6 or later.                         |
| NOTE: This script will find Oracle Java whether you install using    |
|       the binary or the RPM based installer.                         |
+======================================================================+
[root@CM ~]# echo $JAVA_HOME
/opt/jdk1.7.0_80

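That message looks plainly wrong, since JAVA_HOME is set in the login shell. However, the service command starts init scripts with a cleaned environment, so the CM start script never sees variables exported from /etc/profile. A search turned up http://community.cloudera.com/t5/Cloudera-Manager-Installation/Error-JAVA-HOME-is-not-set-and-Java-could-not-be-found/td-p/18974/page/3
Following the advice there, set JAVA_HOME for the service itself:

[root@CM ~]# vi /etc/default/cloudera-scm-server
export JAVA_HOME=/opt/jdk1.7.0_80
Then run: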
[root@CM ~]# service cloudera-scm-server start
Starting cloudera-scm-server:                              [  OK  ]

Unbelievable!
The service started successfully, yet 192.168.0.20:7180 could not be reached.
The log shows:

2017-08-20 18:19:25,182 WARN com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#0:com.mchange.v2.resourcepool.BasicResourcePool: com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask@a384493 -- Acquisition Attempt Failed!!! Clearing pending acquires. While trying to acquire a needed new resource, we failed to succeed more than the maximum number of allowed acquisition attempts (5). Last acquisition attempt exception: 
org.postgresql.util.PSQLException: FATAL: database "cmf" does not exist
    at org.postgresql.core.v3.ConnectionFactoryImpl.readStartupMessages(ConnectionFactoryImpl.java:469)
    at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:112)

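Argh, I had assumed CM would create the database automatically. It does not: db.properties names a database called cmf, but only cdh1 had been created earlier.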
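One way to keep the database name from drifting is to derive the CREATE DATABASE statement from db.properties itself. The helper below is a sketch with an invented name (create_db_sql); it only prints the SQL, so nothing is changed until the output is piped into psql.

```shell
#!/usr/bin/env bash
# Print a CREATE DATABASE statement matching the name and user configured
# in a Cloudera Manager db.properties file.
# $1: path to db.properties
create_db_sql() {
    local props="$1" name user
    name="$(sed -n 's/^com\.cloudera\.cmf\.db\.name=//p' "$props")"
    user="$(sed -n 's/^com\.cloudera\.cmf\.db\.user=//p' "$props")"
    if [ -z "$name" ] || [ -z "$user" ]; then
        echo "db.name or db.user missing in $props" >&2
        return 1
    fi
    printf "CREATE DATABASE %s OWNER %s ENCODING 'UTF-8';\n" "$name" "$user"
}
```

With the settings shown earlier, create_db_sql /etc/cloudera-scm-server/db.properties | sudo -u postgres psql would create the cmf database owned by cdh1.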
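On the next start CM reported a pile of missing tables, then created the tables automatically, but a little later it went down again: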

2017-08-20 18:38:48,481 ERROR ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Error while attempting to retrieve repository info for repo https://archive.cloudera.com/sqoop-connectors/parcels/latest/
java.io.IOException: Closed
    at com.ning.http.client.providers.netty.NettyAsyncHttpProvider.doConnect(NettyAsyncHttpProvider.java:873)
    at com.ning.http.client.providers.netty.NettyAsyncHttpProvider.execute(NettyAsyncHttpProvider.java:858)
    at com.ning.http.client.AsyncHttpClient.executeRequest(AsyncHttpClient.java:512)

CM keeps trying to reach the remote parcel repositories.
After yet another restart, it actually came up!

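Install the agent

The agent was already installed in the previous step.
After installation, edit the agent configuration file so that the agent can find CM:

vi /etc/cloudera-scm-agent/config.ini
server_host=192.168.0.20

Start the agent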
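On several nodes the edit is easier to script than to do in vi. A sketch, assuming config.ini already contains a server_host= line; set_server_host is an invented name.

```shell
#!/usr/bin/env bash
# Point an agent config at the CM server by rewriting its server_host line.
# $1: path to config.ini   $2: CM host name or IP (no '/' or '&' characters)
set_server_host() {
    local config="$1" host="$2" updated
    updated="$(sed "s/^server_host=.*/server_host=$host/" "$config")" || return 1
    # Write back through the existing file so its permissions are kept.
    printf '%s\n' "$updated" > "$config"
}
```

Usage: set_server_host /etc/cloudera-scm-agent/config.ini 192.168.0.20, then start the agent with service cloudera-scm-agent start.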

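Log in to CDH

Log in with the default admin/admin.

Do not add new hosts at first, because the other CM services, such as Host Monitor, are not installed yet.

The JDK is already installed, so do not let CM install it again.

Do not use single-user mode.

Choose key authentication.

Select the CM node's pub key.

If everything is configured correctly, the CDH 5.12.1 parcel appears. Click Next and continue until parcel distribution completes.

When it finishes, CM reports that there is no Host Monitor.

Create the cmservice database: create database cmservice owner cdh1 ENCODING 'UTF-8';

At this step the connection to the database failed no matter what. I went through a lot of material and tried many approaches, and none of them worked.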

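Add nodes

In essence, adding a node means:
install the agent on the node and let it connect to CM, at which point the node has joined the cluster. CM then distributes the parcels and installs the services.

Add nodes through CM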

1.

2.

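3. Choosing the second option lets you supply your own yum repository

4. When hosts are added automatically, CM configures the yum repository on the nodes

Let's look at what https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/cloudera-manager.repo contains: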

[cloudera-manager]
# Packages for Cloudera Manager, Version 5, on RedHat or CentOS 6 x86_64                 
name=Cloudera Manager
baseurl=https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/
gpgkey =https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/RPM-GPG-KEY-cloudera    
gpgcheck = 1

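This is simply Cloudera's online yum repository. What it holds is:

nothing more than the agent and CM installation packages.
Refreshing the yum metadata was far too slow, so I gave up on this route.

Alternatively, supply your own yum repository in the step above instead of Cloudera's, which is much faster.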

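Install the agent manually

In this case there is no need to go through the wizard steps above; just install the agent directly on the node.
Copy the agent packages from the CM node, set up a local yum repository, and install:

[root@CM cloudera_software]# scp -r /root/cloudera_software root@cdh2:/root/
[root@cdh2 ~]# vi /etc/yum.repos.d/CentOS-Media.repo # add the following
[cloudera-manager-local]
# Packages for Cloudera Manager, Version 5, on RedHat or CentOS 6 x86_64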
name=cloudera-manager-local
baseurl=file:///root/cloudera_software/
gpgkey =file:///root/cloudera_software/RPM-GPG-KEY-cloudera
gpgcheck = 1
[root@cdh2 ~]# yum install cloudera-manager-agent
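Surprisingly, this pulled a batch of dependencies from the network, though it was quick. How would one set up a LAN repository?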
[root@cdh1 ~]# vi /etc/cloudera-scm-agent/config.ini
server_host=CM
[root@cdh1 ~]# service cloudera-scm-agent start

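Once the agent has started, the node is visible in CM.

At this point, however, it is not yet connected to Host Monitor.
After restarting Host Monitor in CM, everything is normal: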

Install nodes from a remote yum repository

[root@CM cloudera_software]# yum install httpd
………..
[root@CM ~]# vi /etc/httpd/conf/httpd.conf
DocumentRoot "/root/cloudera_software"
……
[root@CM ~]# chown -R apache.apache /root/cloudera_software
# remove the default welcome page
# [root@CM ~]# mv /etc/httpd/conf.d/welcome.conf /etc/httpd/conf.d/welcome.conf.bk
[root@CM ~]# service httpd start


I'll come back to this once I know how to use httpd.
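One possible way to finish this step, offered only as an untested sketch: /root is normally not traversable by the apache user, so instead of pointing DocumentRoot at /root/cloudera_software, copy the tree under httpd's default document root (/var/www/html on CentOS 6) and point the other nodes at it over HTTP. The function names and the repository id cloudera-manager-lan are invented for illustration.

```shell
#!/usr/bin/env bash
# Copy the repository tree under the web server's document root.
# $1: source repository directory   $2: document root
publish_repo() {
    local src="$1" docroot="$2"
    mkdir -p "$docroot"
    cp -r "$src" "$docroot/"
}

# Write a .repo file on a node that points at the HTTP copy.
# $1: host name of the web server   $2: path of the .repo file to create
write_http_repo() {
    local server="$1" repo_file="$2"
    cat > "$repo_file" <<EOF
[cloudera-manager-lan]
name=cloudera-manager-lan
baseurl=http://$server/cloudera_software/
gpgkey=http://$server/cloudera_software/RPM-GPG-KEY-cloudera
gpgcheck=1
EOF
}
```

On the CM node this would be publish_repo /root/cloudera_software /var/www/html followed by service httpd start; on each node, write_http_repo CM /etc/yum.repos.d/cloudera-manager-lan.repo and then yum install cloudera-manager-agent.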

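Once the node shows up under Hosts, choose "Add new hosts to cluster", then "Currently managed hosts", select the newly added host, and click Next to distribute the parcels.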

Services

Problems encountered during installation

1.

WARN [770705234@scm-web-1838:tsquery.TimeSeriesQueryService@503]
com.cloudera.server.cmf.tsquery.TimeSeriesQueryService@1c378752 failed
to locate nozzleHOST_MONITORING
com.cloudera.cmon.MgmtServiceLocatorException: Could not find a
HOST_MONITORING
nozzle from SCM.
at
com.cloudera.cmon.MgmtServiceLocator.getNozzleIPC(MgmtServiceLocator.java:147)
at
com.cloudera.server.cmf.tsquery.NozzleRequest.<init>(NozzleRequest.java:50)

This appeared when hosts were added during installation. If only the CM node is selected, it does not occur.

2.

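Changed the JDK to 1.8 (it was 1.7 before).
Downloaded the PostgreSQL JDBC driver to /usr/share/java, edited the jar path in /etc/default/cloudera-scm-server to include the PostgreSQL JDBC driver, and exported JAVA_HOME in the same file.
Changed the log level: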
export CMF_ROOT_LOGGER="INFO,LOGFILE"

3.

INFO [JvmPauseMonitor:debug.JvmPauseMonitor@236] Detected pause in JVM or host machine (e.g. a stop the world GC, or JVM not scheduled): paused approximately 1039ms: no GCs detected.
+ cat
+======================================================================+
| Error: JAVA_HOME is not set and Java could not be found |
+----------------------------------------------------------------------+
| Please download the latest Oracle JDK from the Oracle Java web site |
| > http://www.oracle.com/technetwork/java/javase/index.html < |
| |
| Cloudera Manager requires Java 1.6 or later. |
| NOTE: This script will find Oracle Java whether you install using |
| the binary or the RPM based installer. |
+======================================================================+
+ exit 1

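Switching to JDK 1.8 made no difference at all.
Set the JDK on the host configuration page in CM.

All of the problems above were resolved after reinstalling the CM agent with Java 1.8!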





posted on 2017-08-30 21:52 by 月饼馅饺子
