centos7之zabbix只history分表

一、原因分析

   环境介绍:centos7.4.1708  zabbix3.2.6  Mariadb10.3.8

 

  添加了将近500台主机,数据库中的历史数据越来越多了。出现了zabbix自带的housekeeper清理历史数据的时候,造成查数据慢并且触发了很多报警信息(比如Zabbix housekeeper processes more than 75% busy)。

  housekeeper默认一小时清理一次旧的数据。在server配置文件中这两行可以定义:
  HousekeepingFrequency=1   zabbix执行Housekeeping的频率,单位为hours
  MaxHousekeeperDelete=5000  每次最多删除历史数据的行

  但是housekeeper清理过中,会导致数据库负载增高,从而影响读写性能,卡到你怀疑这个东西能不能用。
  所以我们对几个历史数据表做分区表。按照时间(每天)为单位,把历史数据存到各个分区表中,这样做能加快查询速度、快速清理过去一时间的历史数据(删除分区表)

  1、先查看每个表所占容量和行数,可以看到history的表很大

MariaDB [zabbix]> use zabbix;
Database changed
MariaDB [zabbix]> select table_name, (data_length+index_length)/1024/1024 as total_mb, table_rows from information_schema.tables where table_schema='zabbix';
+----------------------------+----------------+------------+
| table_name                 | total_mb       | table_rows |
+----------------------------+----------------+------------+
| users                      |     0.03125000 |          5 |
| maintenances               |     0.01562500 |          0 |
| hosts                      |     0.25000000 |        515 |
| groups                     |     0.03125000 |         22 |
| screens_items              |     0.03125000 |         49 |
| group_prototype            |     0.06250000 |          7 |
| group_discovery            |     0.03125000 |          0 |
| screen_user                |     0.04687500 |          0 |
| screen_usrgrp              |     0.04687500 |          0 |
| slideshows                 |     0.04687500 |          0 |
| slideshow_user             |     0.04687500 |          0 |
| slideshow_usrgrp           |     0.04687500 |          0 |
| interface                  |     0.09375000 |        458 |
| applications               |     0.10937500 |        708 |
| slides                     |     0.04687500 |          0 |
| drules                     |     0.04687500 |          0 |
| dchecks                    |     0.03125000 |          0 |
| items                      |    40.34375000 |       9325 |
| httptest                   |     0.07812500 |          0 |
| httpstep                   |     0.03125000 |          0 |
| httpstepitem               |     0.04687500 |          3 |
| valuemaps                  |     0.03125000 |         17 |
| httptestitem               |     0.04687500 |          2 |
| operations                 |     0.03125000 |         10 |
| scripts                    |     0.06250000 |          3 |
| media_type                 |     0.03125000 |          3 |
| usrgrp                     |     0.03125000 |          8 |
| users_groups               |     0.04687500 |          5 |
| sysmaps_elements           |     0.09375000 |          0 |
| actions                    |     0.04687500 |          6 |
| opmessage                  |     0.03125000 |          8 |
| opmessage_grp              |     0.04687500 |          8 |
| opmessage_usr              |     0.04687500 |          6 |
| opcommand                  |     0.03125000 |          0 |
| opcommand_hst              |     0.04687500 |          0 |
| opcommand_grp              |     0.04687500 |          0 |
| opconditions               |     0.03125000 |          0 |
| optemplate                 |     0.04687500 |          0 |
| conditions                 |     0.03125000 |         35 |
| config                     |     0.04687500 |          0 |
| triggers                   |     1.57812500 |       2139 |
| trigger_depends            |     0.17187500 |        784 |
| functions                  |     1.29687500 |       2165 |
| graphs                     |     0.76562500 |       2580 |
| hostmacro                  |     0.03125000 |          0 |
| graphs_items               |     0.78125000 |       5573 |
| graph_theme                |     0.03125000 |          2 |
| globalmacro                |     0.01562500 |          0 |
| hosts_groups               |     0.15625000 |        831 |
| hosts_templates            |     0.09375000 |        525 |
| services_times             |     0.03125000 |          0 |
| items_applications         |     6.75000000 |       8944 |
| mappings                   |     0.03125000 |        143 |
| media                      |     0.04687500 |          8 |
| rights                     |     0.04687500 |         15 |
| services                   |     0.03125000 |          0 |
| services_links             |     0.04687500 |          0 |
| icon_map                   |     0.04687500 |          0 |
| icon_mapping               |     0.04687500 |          0 |
| sysmaps                    |     0.07812500 |          1 |
| sysmap_usrgrp              |     0.04687500 |          0 |
| sysmaps_link_triggers      |     0.04687500 |          0 |
| sysmaps_links              |     0.06250000 |          0 |
| expressions                |     0.03125000 |          6 |
| sysmap_user                |     0.04687500 |          0 |
| sysmap_url                 |     0.03125000 |          0 |
| alerts                     |    61.65625000 |      88993 |
| history                    |  3072.56250000 |   32407979 |
| maintenances_hosts         |     0.04687500 |          0 |
| acknowledges               |     0.06250000 |          9 |
| maintenances_groups        |     0.04687500 |          0 |
| timeperiods                |     0.01562500 |          0 |
| regexps                    |     0.03125000 |          5 |
| ids                        |     0.01562500 |         50 |
| history_text               |     1.18750000 |       1469 |
| trends                     |    47.17187500 |     641558 |
| history_str                |    34.87500000 |     208635 |
| history_uint               | 10965.95312500 |  111148914 |
| history_log                |     1.18750000 |         38 |
| trends_uint                |   311.56250000 |    4025194 |
| proxy_history              |     0.01562500 |          0 |
| proxy_dhistory             |     0.01562500 |          0 |
| events                     |    11.54687500 |      57790 |
| dhosts                     |     0.03125000 |          0 |
| auditlog                   |     7.01562500 |      27657 |
| auditlog_details           |     1.43750000 |       6605 |
| service_alarms             |     0.04687500 |          0 |
| autoreg_host               |     0.03125000 |          0 |
| graph_discovery            |     0.17187500 |       1952 |
| proxy_autoreg_host         |     0.01562500 |          0 |
| dservices                  |     0.04687500 |          0 |
| escalations                |     0.03125000 |          2 |
| globalvars                 |     0.01562500 |          0 |
| host_inventory             |     0.01562500 |          0 |
| item_discovery             |     6.71875000 |       7137 |
| housekeeper                |     7.03125000 |       1764 |
| images                     |     1.53125000 |        140 |
| profiles                   |     0.32812500 |       1385 |
| host_discovery             |     0.04687500 |          2 |
| interface_discovery        |     0.03125000 |          0 |
| problem                    |     8.51562500 |      22064 |
| sessions                   |     0.50000000 |       2606 |
| trigger_discovery          |     0.06250000 |        447 |
| corr_condition             |     0.03125000 |          0 |
| application_template       |     0.07812500 |        482 |
| item_condition             |     0.03125000 |         22 |
| application_prototype      |     0.04687500 |          1 |
| item_application_prototype |     0.04687500 |          1 |
| application_discovery      |     0.04687500 |          0 |
| opinventory                |     0.01562500 |          0 |
| trigger_tag                |     0.03125000 |          0 |
| event_tag                  |     0.03125000 |          0 |
| problem_tag                |     0.04687500 |          0 |
| event_recovery             |     4.50000000 |      18025 |
| correlation                |     0.01562500 |          0 |
| corr_condition_tagpair     |     0.01562500 |          0 |
| corr_operation             |     0.03125000 |          0 |
| task                       |     0.01562500 |          0 |
| dbversion                  |     0.01562500 |          0 |
| screens                    |     0.04687500 |         13 |
| opgroup                    |     0.04687500 |          0 |
| sysmap_element_url         |     0.03125000 |          0 |
| maintenances_windows       |     0.04687500 |          0 |
| corr_condition_tag         |     0.01562500 |          0 |
| corr_condition_group       |     0.03125000 |          0 |
| corr_condition_tagvalue    |     0.01562500 |          0 |
| task_close_problem         |     0.01562500 |          0 |
+----------------------------+----------------+------------+
127 rows in set (0.005 sec)
View Code

 

  2、下载分区脚本(适合zabbix 3.2,网友大佬写的,非常省事儿)

[root@localhost ~]# wget https://dl.cactifans.com/zabbix/partitiontables_gt_zbx34.sh

  参数更改,还有zabbix-server,如果这里你的服务不对的话改一下。

脚本默认详情数据保留30天,趋势数据保留1个月,如需修改,请修改以下内容:
daily_history_min=30
monthly_history_min=1

脚本默认连接数据库信息,更改成你的:
DBHOST=localhost
DBUSER=zabbix
DBPASS=zabbix

   赋予执行权限:

[root@localhost ~]# chmod +x partitiontables_gt_zbx34.sh

  3、执行分区脚本

备份数据库(我这里用的是xtrabackup):
[root@localhost ~]# mysqldump -uroot -p zabbix>/root/zabbix_$(date +%F).sql
停止zabbix服务:
[root@localhost ~]# systemctl stop zabbix_server
查看数据库状态:
[root@localhost ~]# systemctl status zabbix_server

  执行脚本:

[root@localhost ~]# ./partitiontables_gt_zbx34.sh
Ready to partition tables.
Ready to update permissions of Zabbix user to create routines
#mysql的root帐号和密码:
Enter root DB user: root
Enter root password: 123456
#备份数据库的操作,已经手动备份:
Do you want to backup the database (recommended) (Y/n): n
Are you certain you have a backup (y/N): 
y
Ready to proceed:
Starting yearly partioning at: 2018
and ending at: 2018
With 30 days of daily history
Ready to proceed (Y/n): 
y
Altering table: history
Altering table: history_log
Altering table: history_str
Altering table: history_text
Altering table: history_uint
Altering table: trends
Altering table: trends_uint
Creating monthly partitions for table: trends
Creating monthly partitions for table: trends_uint
Creating daily partitions for table: history
Creating daily partitions for table: history_log
Creating daily partitions for table: history_str
Creating daily partitions for table: history_text
Creating daily partitions for table: history_uint
Ready to apply script to database, this may take a while.(Y/n): 
y
Altering tables
history
history_log
history_str
history_text
history_uint
trends
trends_uint
trends
trends_uint
history
history_log
history_str
history_text
history_uint
Installing procedures
If Zabbix Version = 2.0 
Do you want to update the /etc/zabbix/zabbix_server.conf
#是否关闭管家设置,我这里已经配置过了,所以选择n
to disable housekeeping (Y/n): n
#是否设置计划任务,选择y
Do you want to update the crontab (Y/n): y
The crontab entry can be either in /etc/cron.daily, or added
to the crontab for root
#是否创建计划任务目录,选择y
Do you want to add this to the /etc/cron.daily directory (Y/n): y
#设置清理报告通知邮箱,这里写你的邮箱地址就行,每晚会发邮件给这里:
Enter email of who should get the daily housekeeping reports: gxm@test.com
执行脚本

  如果您顺序运行到这里,那么恭喜你成功了,我这里库已经20GB了,所以跑了1-2小时。

   4、确认定时任务
  上面的脚本会在/etc/cron.daily目录下生成一个名称为zabbixhousekeeping的脚本:

[root@localhost ~]# cat /etc/cron.daily/zabbixhousekeeping 
#!/bin/bash
/usr/local/zabbix/cron.d/housekeeping.sh

  上面的脚本会在/usr/local/zabbix/cron.d/目录下生成一个名称为housekeeping.sh脚本。

[root@localhost ~]# cat /usr/local/zabbix/cron.d/housekeeping.sh 
#!/bin/bash
##这里的邮箱地址用来每晚给你发邮件,上面设置过
MAILTO=gxm@test.com
tmpfile=/tmp/housekeeping$$
date >$tmpfile
/usr/bin/mysql --skip-column-names -B -h localhost -uzabbix -pzabbix zabbix -e "CALL create_zabbix_partitions();" >>$tmpfile 2>&1
/usr/bin/mail -s "Zabbix MySql Partition Housekeeping" $MAILTO <$tmpfile
rm -f $tmpfile

  这样的话,每天凌晨大概3点多会执行,并且用mail程序发送执行结果到上面的你指定的邮箱。可以看到删除过期的数据分区表,并建立新的分区表,我这里因为已经设置过短信报警,所以就不再设置了,如果你收不到报警短信,可以看我前面写过的短信报警笔记。

  5、确认脚本执行效果
  执行以下SQL语句查看histroy表的分区情况:

MariaDB [(none)]> use zabbix;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [zabbix]> select partition_name part,partition_expression expr,partition_description descr,table_rows from information_schema.partitions where table_schema = schema() and table_name='history';
+-----------+---------+------------+------------+
| part      | expr    | descr      | table_rows |
+-----------+---------+------------+------------+
| p20190309 | `clock` | 1552147200 |     814320 |
| p20190310 | `clock` | 1552233600 |     814320 |
| p20190311 | `clock` | 1552320000 |     816816 |
| p20190312 | `clock` | 1552406400 |     755723 |
| p20190313 | `clock` | 1552492800 |     851136 |
| p20190314 | `clock` | 1552579200 |     921024 |
| p20190315 | `clock` | 1552665600 |     980933 |
| p20190316 | `clock` | 1552752000 |    1038024 |
| p20190317 | `clock` | 1552838400 |    1038024 |
| p20190318 | `clock` | 1552924800 |    1038024 |
| p20190319 | `clock` | 1553011200 |    1039896 |
| p20190320 | `clock` | 1553097600 |    1040832 |
| p20190321 | `clock` | 1553184000 |    1040832 |
| p20190322 | `clock` | 1553270400 |    1057680 |
| p20190323 | `clock` | 1553356800 |    1112592 |
| p20190324 | `clock` | 1553443200 |    1112592 |
| p20190325 | `clock` | 1553529600 |    1112592 |
| p20190326 | `clock` | 1553616000 |    1091064 |
| p20190327 | `clock` | 1553702400 |    1250496 |
| p20190328 | `clock` | 1553788800 |    1249872 |
| p20190329 | `clock` | 1553875200 |    1249248 |
| p20190330 | `clock` | 1553961600 |    1250496 |
| p20190331 | `clock` | 1554048000 |    1250496 |
| p20190401 | `clock` | 1554134400 |    1231464 |
| p20190402 | `clock` | 1554220800 |    1096466 |
| p20190403 | `clock` | 1554307200 |    1018368 |
| p20190404 | `clock` | 1554393600 |    1137240 |
| p20190405 | `clock` | 1554480000 |    1095432 |
| p20190406 | `clock` | 1554566400 |    1167504 |
| p20190407 | `clock` | 1554652800 |    1124760 |
| p20190408 | `clock` | 1554739200 |     619944 |
| p20190409 | `clock` | 1554825600 |          0 |
| p20190410 | `clock` | 1554912000 |          0 |
| p20190411 | `clock` | 1554998400 |          0 |
| p20190412 | `clock` | 1555084800 |          0 |
| p20190413 | `clock` | 1555171200 |          0 |
| p20190414 | `clock` | 1555257600 |          0 |
| p20190415 | `clock` | 1555344000 |          0 |
+-----------+---------+------------+------------+
38 rows in set (0.002 sec)
分表查看

  分区表成功,启动zabbix_server服务,这里需要说明一下,分完表以后我的数据库立马小了5个GB,但是housekeeping还是很高,当晚执行一遍计划任务后,第二天就正常了。

  6、取消housekeeper清理

  这里需要注意,最后一定要在zabbix配置文件里面取消下图的设置。

  原来慢试过两个小时清理一遍,每次清理5000条,统统都取消。

  

  这里,开启内部管家的框不选择,只选择下面的“覆盖监控项历史期间”和“覆盖监控项趋势期间”,至于数据存储天数,自己看吧,我这里选择30天,但是我感觉最多7天足矣。

  

 

  

  此笔来自于:http://www.mamicode.com/info-detail-2228907.html

posted @ 2019-04-08 12:30  Charles.L  阅读(1289)  评论(1)    收藏  举报