
Oracle Autonomous Health Framework (AHF)

2022-01-06 20:37  abce

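I recently upgraded AHF because of the log4j security vulnerability. I had not touched Oracle in a long time, and this was my first time using AHF.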

AHF bundles ORAchk, EXAchk, and Trace File Analyzer (TFA). RACcheck has been replaced by ORAchk (RACcheck tool: MOS ID 1268927.1).

AHF is generally updated about every three months (My Oracle Support note 2550798.1).

Before installing, make sure the environment is set up correctly: umask should report 22, 022, or 0022.
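The three accepted values are the same octal mask printed with different padding. As a minimal sketch (the function names here are illustrative and not part of AHF), the check could be done before launching the installer like this:

```python
import os

ACCEPTABLE_UMASK = 0o022  # the shell may print this as 22, 022, or 0022


def current_umask() -> int:
    """Return the process umask without leaving it changed."""
    previous = os.umask(0)  # os.umask sets a new mask and returns the old one
    os.umask(previous)      # put the original mask back immediately
    return previous


def umask_is_acceptable(mask: int) -> bool:
    """True when the mask equals octal 022."""
    return mask == ACCEPTABLE_UMASK


if __name__ == "__main__":
    mask = current_umask()
    verdict = "OK" if umask_is_acceptable(mask) else "unexpected, fix before installing"
    print(f"umask {mask:04o}: {verdict}")
```

Run it as the same user that will run ahf_setup (root), since the umask is a per-process setting inherited from the calling shell.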

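Installing on a RAC cluster requires verifying root user equivalence between the nodes in the cluster. If you would rather not configure root equivalence, you can run the installer locally on each node and then use the tfactl syncnodes command to generate and deploy the relevant SSL certificates.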

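An upgrade is much like a first-time install: run the ahf_setup script as root. If AHF is already present, running the installer again upgrades it in its existing location. When AHF is already installed, a cluster upgrade does not need SSH verification; it uses the existing daemon secure socket to communicate between nodes.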

 

1. Upgrade

[root@testdb001 tfa]# ./ahf_setup 

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_214000_401256_2022_01_04-14_40_36.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 21.4.0 Build Date: 202112200745

AHF is already installed at /opt/oracle.ahf

Installed AHF Version: 20.4.4 Build Date: 202103031514

Do you want to upgrade AHF [Y]|N : Y

AHF will also be installed/upgraded on these Cluster Nodes :

1. testdb002

The AHF Location and AHF Data Directory must exist on the above nodes
AHF Location : /opt/oracle.ahf
AHF Data Directory : /u01/app/grid/oracle.ahf/data

Do you want to install/upgrade AHF on Cluster Nodes ? [Y]|N : Y

Upgrading /opt/oracle.ahf
TFA-00002 Oracle Trace File Analyzer (TFA) is not running

Shutting down AHF Services
Nothing to do !
Shutting down TFA
Removed symlink /etc/systemd/system/multi-user.target.wants/oracle-tfa.service.
Removed symlink /etc/systemd/system/graphical.target.wants/oracle-tfa.service.
. . . . . 
. . . 
Successfully shutdown TFA..

Starting AHF Services
Starting TFA..
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Waiting up to 100 seconds for TFA to be started..
. . . . . 
Successfully started TFA Process..
. . . . . 
TFA Started and listening for commands
No new directories were added to TFA
Directory /u01/app/grid/crsdata/testdb001/trace/chad was already added to TFA Directories.


AHF upgrade completed on testdb001

Upgrading AHF on Remote Nodes :

AHF will be installed on testdb002, Please wait.

AHF will prompt twice to install/upgrade per Remote Node. So total 2 prompts

Do you want to continue Y|[N] : Y

AHF will continue with Upgrading on remote nodes

Upgrading AHF on testdb002 :

[testdb002] Copying AHF Installer
root@testdb002's password: 

[testdb002] Running AHF Installer
root@testdb002's password: 

Do you want AHF to store your My Oracle Support Credentials for Automatic Upload ? Y|[N] : N

AHF is successfully upgraded to latest version

.-----------------------------------------------------------------.
| Host      | TFA Version | TFA Build ID         | Upgrade Status |
+-----------+-------------+----------------------+----------------+
| testdb001 |  21.4.0.0.0 | 21400020211220074549 | UPGRADED       |
| testdb002 |  21.4.0.0.0 | 21400020211220074549 | UPGRADED       |
'-----------+-------------+----------------------+----------------'

Moving /tmp/ahf_install_214000_401256_2022_01_04-14_40_36.log to /u01/app/grid/oracle.ahf/data/testdb001/diag/ahf/

[root@testdb001 tfa]# tfactl status

.--------------------------------------------------------------------------------------------------.
| Host      | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+-----------+---------------+--------+------+------------+----------------------+------------------+
| testdb001 | RUNNING       | 404608 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
| testdb002 | RUNNING       | 334591 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
'-----------+---------------+--------+------+------------+----------------------+------------------'

  

2. Fresh install

[root@abcdb01 tfa]# ./ahf_setup 

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_214000_121264_2022_01_04-14_48_26.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 21.4.0 Build Date: 202112200745

Default AHF Location : /opt/oracle.ahf

Do you want to install AHF at [/opt/oracle.ahf] ? [Y]|N : Y

AHF Location : /opt/oracle.ahf

AHF Data Directory stores diagnostic collections and metadata.
AHF Data Directory requires at least 5GB (Recommended 10GB) of free space.

Choose Data Directory from below options : 

1. /u01/app/grid [Free Space : 149616 MB]
2. Enter a different Location

Choose Option [1 - 2] : 1

AHF Data Directory : /u01/app/grid/oracle.ahf/data

Do you want to add AHF Notification Email IDs ? [Y]|N : N

AHF will also be installed/upgraded on these Cluster Nodes :

1. abcdb02

The AHF Location and AHF Data Directory must exist on the above nodes
AHF Location : /opt/oracle.ahf
AHF Data Directory : /u01/app/grid/oracle.ahf/data

Do you want to install/upgrade AHF on Cluster Nodes ? [Y]|N : Y

Extracting AHF to /opt/oracle.ahf

Configuring TFA Services

Discovering Nodes and Oracle Resources

Not generating certificates as GI discovered

Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

.-----------------------------------------------------------------------------.
| Host    | Status of TFA | PID    | Port | Version    | Build ID             |
+---------+---------------+--------+------+------------+----------------------+
| abcdb01 | RUNNING       | 123355 | 5000 | 21.4.0.0.0 | 21400020211220074549 |
'---------+---------------+--------+------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.--------------------------------------------------------------.
|                 Summary of AHF Configuration                 |
+-----------------+--------------------------------------------+
| Parameter       | Value                                      |
+-----------------+--------------------------------------------+
| AHF Location    | /opt/oracle.ahf                            |
| TFA Location    | /opt/oracle.ahf/tfa                        |
| Orachk Location | /opt/oracle.ahf/orachk                     |
| Data Directory  | /u01/app/grid/oracle.ahf/data              |
| Repository      | /u01/app/grid/oracle.ahf/data/repository   |
| Diag Directory  | /u01/app/grid/oracle.ahf/data/abcdb01/diag |
'-----------------+--------------------------------------------'


Starting orachk scheduler from AHF ...

AHF install completed on abcdb01

Installing AHF on Remote Nodes :

AHF will be installed on abcdb02, Please wait.

AHF will prompt twice to install/upgrade per Remote Node. So total 2 prompts

Do you want to continue Y|[N] : Y

AHF will continue with Installing on remote nodes

Installing AHF on abcdb02 :

[abcdb02] Copying AHF Installer
root@abcdb02's password: 

[abcdb02] Running AHF Installer
root@abcdb02's password: 

AHF binaries are available in /opt/oracle.ahf/bin

AHF is successfully installed

Do you want AHF to store your My Oracle Support Credentials for Automatic Upload ? Y|[N] : N

Moving /tmp/ahf_install_214000_121264_2022_01_04-14_48_26.log to /u01/app/grid/oracle.ahf/data/abcdb01/diag/ahf/

  

Troubleshooting

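After the install, checking the status showed that something was wrong. Across repeated runs of tfactl status, abcdb02 went from RUNNING to NOT RUNNING and then dropped out of the list altogether: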

[root@abcdb01 tfa]# tfactl status

.------------------------------------------------------------------------------------------------.
| Host    | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+---------+---------------+--------+------+------------+----------------------+------------------+
| abcdb01 | RUNNING       | 123355 | 5000 | 21.4.0.0.0 | 21400020211220074549 | RUNNING          |
| abcdb02 | RUNNING       | 297423 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
'---------+---------------+--------+------+------------+----------------------+------------------'
[root@abcdb01 tfa]# tfactl status

.------------------------------------------------------------------------------------------------.
| Host    | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+---------+---------------+--------+------+------------+----------------------+------------------+
| abcdb01 | RUNNING       | 123355 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
| abcdb02 | RUNNING       | 297423 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
'---------+---------------+--------+------+------------+----------------------+------------------'

[root@abcdb01 tfa]# tfactl status

.------------------------------------------------------------------------------------------------.
| Host    | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+---------+---------------+--------+------+------------+----------------------+------------------+
| abcdb01 | RUNNING       | 123355 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
| abcdb02 | NOT RUNNING   | -      |      |            |                      |                  |
'---------+---------------+--------+------+------------+----------------------+------------------'
[root@abcdb01 tfa]# tfactl status

.------------------------------------------------------------------------------------------------.
| Host    | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+---------+---------------+--------+------+------------+----------------------+------------------+
| abcdb01 | RUNNING       | 123355 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
'---------+---------------+--------+------+------------+----------------------+------------------'
[root@abcdb01 tfa]# tfactl toolstatus
TFA-00104 Cannot establish connection with TFA Server. Please check TFA Certificates
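
Spotting this kind of drift by eye is easy to miss, so here is a rough sketch of how the table printed by tfactl status could be checked from a script. It assumes the column names shown in the output above; the function names and the expected-host list are illustrative and not part of TFA.

```python
from typing import Dict, List

# Excerpt of the third status run above, used as sample input.
SAMPLE = """
.------------------------------------------------------------------------------------------------.
| Host    | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+---------+---------------+--------+------+------------+----------------------+------------------+
| abcdb01 | RUNNING       | 123355 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
| abcdb02 | NOT RUNNING   | -      |      |            |                      |                  |
'---------+---------------+--------+------+------------+----------------------+------------------'
"""


def parse_tfactl_table(text: str) -> List[Dict[str, str]]:
    """Turn the ASCII table into one dict per data row, keyed by column header."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, shell prompts, and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first piped line is the header row
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def unhealthy_hosts(text: str, expected_hosts: List[str]) -> Dict[str, str]:
    """Map each expected host that is not RUNNING (or is absent) to a reason."""
    by_host = {row["Host"]: row for row in parse_tfactl_table(text)}
    problems = {}
    for host in expected_hosts:
        row = by_host.get(host)
        if row is None:
            problems[host] = "missing from status output"
        elif row["Status of TFA"] != "RUNNING":
            problems[host] = row["Status of TFA"]
    return problems


if __name__ == "__main__":
    print(unhealthy_hosts(SAMPLE, ["abcdb01", "abcdb02"]))
```

Passing the expected host list explicitly matters here: in the last run above abcdb02 is not reported as down, it is simply absent, and a check that only looks at the rows present would miss it.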

  

Fix: run a sync from the healthy node

[root@abcdb02 tmp]# tfactl syncnodes

Current Node List in TFA : 

1. abcdb02
2. abcdb01

Node List in Cluster :

1. abcdb01
2. abcdb02

Node List to sync TFA Certificates : 
     1  abcdb01

Do you want to update this node list? Y|[N]: Y

Please Enter all the remote nodes you want to sync...

Enter Remote Node List (separated by space) : 1

Node List to sync TFA Certificates : 
     1  1

Unable to ping Host 1. Please verify.

.------------------------------------------------------------------------------------------------.
| Host    | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+---------+---------------+--------+------+------------+----------------------+------------------+
| abcdb02 | RUNNING       | 308376 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
| abcdb01 | RUNNING       | 226067 | 5000 | 21.4.0.0.0 | 21400020211220074549 | COMPLETE         |
'---------+---------------+--------+------+------------+----------------------+------------------'

  

Other notes

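For production databases, it is advisable to turn off TFA's automatic diagnostic collection and analysis (autodiagcollect), so that incidents like those described in the references do not disrupt normal operation of the production database.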

[root@abcdb02 tmp]# tfactl get autodiagcollect
.-------------------------------------------------.
|                     abcdb02                     |
+-----------------------------------------+-------+
| Configuration Parameter                 | Value |
+-----------------------------------------+-------+
| Auto Diagcollection ( autodiagcollect ) | ON    |
'-----------------------------------------+-------'

[root@abcdb02 tmp]# tfactl set autodiagcollect=off
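
To apply this on several nodes, the two commands above can be wrapped in a small script. This is only a sketch: it assumes the output layout of tfactl get autodiagcollect shown above, that tfactl is on the PATH, and that the script runs as root. The function names are illustrative.

```python
import re
import subprocess
from typing import Optional

# Sample of the `tfactl get autodiagcollect` output shown above.
SAMPLE = """
.-------------------------------------------------.
|                     abcdb02                     |
+-----------------------------------------+-------+
| Configuration Parameter                 | Value |
+-----------------------------------------+-------+
| Auto Diagcollection ( autodiagcollect ) | ON    |
'-----------------------------------------+-------'
"""


def autodiagcollect_value(output: str) -> Optional[str]:
    """Extract ON/OFF from the parameter row, or None if the row is not found."""
    match = re.search(r"\(\s*autodiagcollect\s*\)\s*\|\s*(\w+)", output)
    return match.group(1).upper() if match else None


def disable_autodiagcollect_if_on() -> bool:
    """Run the two tfactl commands from the article; True if a change was made."""
    result = subprocess.run(
        ["tfactl", "get", "autodiagcollect"],
        capture_output=True, text=True, check=True,
    )
    if autodiagcollect_value(result.stdout) != "ON":
        return False  # already off, or the output was not recognized
    subprocess.run(["tfactl", "set", "autodiagcollect=off"], check=True)
    return True


if __name__ == "__main__":
    print(autodiagcollect_value(SAMPLE))
```

Only the parsing function is exercised in the `__main__` block; disable_autodiagcollect_if_on() changes the node's configuration and should be called deliberately.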

  

 

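References:

Aftereffects of installing AHF (安装ahf后的后遗症)

A 93 GB orachk.zip filled the root disk; orachk -autostop (orachk.zip超大93G把根盘占满orachk -autostop)

A case of excessive memory usage on a production database system (记一次生产数据库系统内存使用过高的案例)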

Oracle Autonomous Health Framework (AHF) – Including TFA and ORAchk/EXAchk (Doc ID 2550798.1)
