Using Ansible to Start and Stop Hadoop Services

1. Environment

One server running CentOS 7.5, used as the Ansible control node;

Three managed nodes: hadoop-master (192.168.1.18), hadoop-slave1 (192.168.1.19), hadoop-slave2 (192.168.1.20)

2. Installing Ansible

[root@centos75 ~]# yum install ansible
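After installation, the version and the config file in use can be confirmed with:

[root@centos75 ~]# ansible --version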

3. Ansible Configuration

3.1 Passwordless SSH between the Ansible server and the managed nodes

(1) Generate a key pair: ssh-keygen -t rsa

(2) Copy the public key to each node:

[root@centos75 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.18

[root@centos75 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.19

[root@centos75 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.20
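With more nodes, the same public key can be pushed in a simple loop (a sketch; each node's root password is still entered interactively):

[root@centos75 ~]# for ip in 192.168.1.18 192.168.1.19 192.168.1.20; do ssh-copy-id -i ~/.ssh/id_rsa.pub root@$ip; done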

3.2 Ansible configuration

(1) Inventory (host list) configuration:

[root@centos75 ~]# vi /etc/ansible/hosts

...

[hadoop]

192.168.1.[18:20]  # Range notation covering the three IPs; each address could also be listed on its own line.
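An equivalent inventory can list each node under a readable alias instead of a range; a sketch using the hostnames from section 1:

[hadoop]
hadoop-master ansible_host=192.168.1.18
hadoop-slave1 ansible_host=192.168.1.19
hadoop-slave2 ansible_host=192.168.1.20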

(2) Edit ansible.cfg:

[root@centos75 ~]# vi /etc/ansible/ansible.cfg

...

host_key_checking = False  # Skip SSH host key checking every time an ansible command runs

...

log_path = /var/log/ansible.log  # Enable logging

...

[accelerate]  # Connection acceleration settings (accelerated mode is deprecated in newer Ansible releases)

#accelerate_port = 5099

accelerate_port = 10000

...

accelerate_multi_key = yes

...

deprecation_warnings = False  # Suppress deprecation warnings to cut down on noise in the output

...
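Aside from the [accelerate] block, the options changed above all live in the [defaults] section, so the edited part of /etc/ansible/ansible.cfg ends up looking roughly like this:

[defaults]
host_key_checking = False
log_path = /var/log/ansible.log
deprecation_warnings = False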

4. Testing

[root@centos75 ~]# ansible all -m ping

192.168.1.20 | SUCCESS => {

"changed": false,

"ping": "pong"

}

192.168.1.18 | SUCCESS => {

"changed": false,

"ping": "pong"

}

192.168.1.19 | SUCCESS => {

"changed": false,

"ping": "pong"

}

The output above shows that every managed node answers the ping module, so the Ansible setup is working.
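The same check can be scoped to just the hadoop group, or narrowed to a single node with --limit:

[root@centos75 ~]# ansible hadoop -m ping
[root@centos75 ~]# ansible hadoop -m ping --limit 192.168.1.18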

5. Usage Examples

(1) Ad-hoc mode:

Update the /etc/hosts file on the three Hadoop cluster nodes:

[root@centos75 ~]# vi hosts

#127.0.1.1  hadoop-master

192.168.1.18  hadoop-master

192.168.1.19  hadoop-slave1

192.168.1.20  hadoop-slave2

# The following lines are desirable for IPv6 capable hosts

::1  ip6-localhost ip6-loopback

fe00::0 ip6-localnet

ff00::0 ip6-mcastprefix

ff02::1 ip6-allnodes

ff02::2 ip6-allrouters

~

[root@centos75 ~]# ansible hadoop -m copy -a "src=/root/hosts dest=/etc/hosts"

192.168.1.20 | SUCCESS => {

"changed": true,

"checksum": "214f72ce3329805c07748997e11313fffb03f667",

"dest": "/etc/hosts",

"gid": 0,

"group": "root",

"md5sum": "127193e1ec4773ce0195636c5ac2bf3a",

"mode": "0644",

"owner": "root",

"size": 298,

"src": "/root/.ansible/tmp/ansible-tmp-1536384515.76-109467000571031/source",

"state": "file",

"uid": 0

}

192.168.1.18 | SUCCESS => {

"changed": true,

"checksum": "214f72ce3329805c07748997e11313fffb03f667",

"dest": "/etc/hosts",

"gid": 0,

"group": "root",

"md5sum": "127193e1ec4773ce0195636c5ac2bf3a",

"mode": "0644",

"owner": "root",

"size": 298,

"src": "/root/.ansible/tmp/ansible-tmp-1536384515.74-269105082907411/source",

"state": "file",

"uid": 0

}

192.168.1.19 | SUCCESS => {

"changed": true,

"checksum": "214f72ce3329805c07748997e11313fffb03f667",

"dest": "/etc/hosts",

"gid": 0,

"group": "root",

"md5sum": "127193e1ec4773ce0195636c5ac2bf3a",

"mode": "0644",

"owner": "root",

"size": 298,

"src": "/root/.ansible/tmp/ansible-tmp-1536384515.75-259083114686776/source",

"state": "file",

"uid": 0

}

The hosts file on each managed node can then be inspected with:

ansible hadoop -m shell -a 'cat /etc/hosts'

ansible hadoop -m shell -a 'ls -lhat /etc/hosts'
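If only a single entry needs to be added rather than replacing the whole file, the lineinfile module is an option; a minimal sketch that makes sure the hadoop-master entry is present:

[root@centos75 ~]# ansible hadoop -m lineinfile -a "path=/etc/hosts line='192.168.1.18  hadoop-master'"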

 

(2) Playbook mode:

Start the Hadoop cluster services:

[root@centos75 ~]# vi hadoop-start.yml

---

# "---" marks the start of a YAML document. It is optional in a playbook,
# which is why omitting it also works, but a playbook must be a single YAML
# document, so it appears at most once, at the top of the file.
- hosts: hadoop
  # Note: "-" must be followed by a space, and so must ":".
  tasks:
    # Note: indent with two spaces per level; TAB characters are not allowed in YAML.
    - name: startup hadoop datanode services
      # Even though hadoop-2.7.3/sbin is on PATH on the cluster nodes, the absolute
      # path is needed here: the shell module runs a non-login shell, so PATH set in
      # the login profile is not picked up. (An alternative using the environment
      # keyword is sketched after this playbook.)
      shell: /root/hadoop-2.7.3/sbin/hadoop-daemon.sh start datanode

- hosts: 192.168.1.18
  tasks:
    - name: startup hadoop namenode services
      shell: /root/hadoop-2.7.3/sbin/hadoop-daemon.sh start namenode
~
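As an alternative to hard-coding the absolute path, the play-level environment keyword can prepend the Hadoop sbin directory to PATH; a sketch, not verified against this cluster:

- hosts: hadoop
  environment:
    PATH: "/root/hadoop-2.7.3/sbin:{{ ansible_env.PATH }}"
  tasks:
    - name: startup hadoop datanode services
      shell: hadoop-daemon.sh start datanode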

[root@centos75 ~]# ansible-playbook hadoop-start.yml

PLAY [hadoop] ******************************************************************

TASK [Gathering Facts] *********************************************************

ok: [192.168.1.20]

ok: [192.168.1.19]

ok: [192.168.1.18]

TASK [startup hadoop datanode services] ****************************************

changed: [192.168.1.19]

changed: [192.168.1.18]

changed: [192.168.1.20]

PLAY [192.168.1.18] ************************************************************

TASK [Gathering Facts] *********************************************************

ok: [192.168.1.18]

TASK [startup hadoop namenode services] ****************************************

changed: [192.168.1.18]

PLAY RECAP *********************************************************************

192.168.1.18  : ok=4  changed=2  unreachable=0  failed=0

192.168.1.19  : ok=2  changed=1  unreachable=0  failed=0

192.168.1.20  : ok=2  changed=1  unreachable=0  failed=0

The running services can be checked on the cluster nodes:

root@hadoop-master:~# jps

8976 DataNode

9231 Jps

9093 NameNode

root@hadoop-slave1:~# jps

7058 Jps

6972 DataNode
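The same check can also be run from the Ansible server, provided jps is on the remote PATH for non-interactive shells (otherwise use its absolute path under the JDK's bin directory):

[root@centos75 ~]# ansible hadoop -m shell -a 'jps'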

Stop the Hadoop cluster services:

[root@centos75 ~]# vi hadoop-stop.yml

---

- hosts: hadoop
  tasks:
    - name: stop hadoop datanode services
      shell: /root/hadoop-2.7.3/sbin/hadoop-daemon.sh stop datanode

- hosts: 192.168.1.18
  tasks:
    - name: stop hadoop namenode services
      shell: /root/hadoop-2.7.3/sbin/hadoop-daemon.sh stop namenode
~

[root@centos75 ~]# ansible-playbook hadoop-stop.yml

PLAY [hadoop] ******************************************************************

TASK [Gathering Facts] *********************************************************

ok: [192.168.1.20]

ok: [192.168.1.19]

ok: [192.168.1.18]

TASK [stop hadoop datanode services] *******************************************

changed: [192.168.1.20]

changed: [192.168.1.19]

changed: [192.168.1.18]

PLAY [192.168.1.18] ************************************************************

TASK [Gathering Facts] *********************************************************

ok: [192.168.1.18]

TASK [stop hadoop namenode services] *******************************************

changed: [192.168.1.18]

PLAY RECAP *********************************************************************

192.168.1.18  : ok=4  changed=2  unreachable=0  failed=0

192.168.1.19  : ok=2  changed=1  unreachable=0  failed=0

192.168.1.20  : ok=2  changed=1  unreachable=0  failed=0
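The start and stop playbooks differ only in the argument passed to hadoop-daemon.sh, so they could be merged into one parameterized playbook; a sketch (hadoop-ctl.yml and the hadoop_op variable are names introduced here for illustration):

---
- hosts: hadoop
  tasks:
    - name: "{{ hadoop_op }} hadoop datanode services"
      shell: /root/hadoop-2.7.3/sbin/hadoop-daemon.sh {{ hadoop_op }} datanode

- hosts: 192.168.1.18
  tasks:
    - name: "{{ hadoop_op }} hadoop namenode services"
      shell: /root/hadoop-2.7.3/sbin/hadoop-daemon.sh {{ hadoop_op }} namenode

Run it with -e to choose the operation:

[root@centos75 ~]# ansible-playbook hadoop-ctl.yml -e hadoop_op=start
[root@centos75 ~]# ansible-playbook hadoop-ctl.yml -e hadoop_op=stop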

As the steps above show, Ansible now provides centralized control over starting and stopping the Hadoop cluster services.
