centos7之zabbix只history分表
一、原因分析
环境介绍:centos7.4.1708 zabbix3.2.6 Mariadb10.3.8
添加了将近500台主机,数据库中的历史数据越来越多了。出现了zabbix自带的housekeeper清理历史数据的时候,造成查数据慢并且触发了很多报警信息(比如Zabbix housekeeper processes more than 75% busy)。
housekeeper默认一小时清理一次旧的数据。在server配置文件中这两行可以定义:
HousekeepingFrequency=1 zabbix执行Housekeeping的频率,单位为hours
MaxHousekeeperDelete=5000 每次最多删除历史数据的行
但是housekeeper清理过中,会导致数据库负载增高,从而影响读写性能,卡到你怀疑这个东西能不能用。
所以我们对几个历史数据表做分区表。按照时间(每天)为单位,把历史数据存到各个分区表中,这样做能加快查询速度、快速清理过去一时间的历史数据(删除分区表)
1、先查看每个表所占容量和行数,可以看到history的表很大
MariaDB [zabbix]> use zabbix; Database changed MariaDB [zabbix]> select table_name, (data_length+index_length)/1024/1024 as total_mb, table_rows from information_schema.tables where table_schema='zabbix'; +----------------------------+----------------+------------+ | table_name | total_mb | table_rows | +----------------------------+----------------+------------+ | users | 0.03125000 | 5 | | maintenances | 0.01562500 | 0 | | hosts | 0.25000000 | 515 | | groups | 0.03125000 | 22 | | screens_items | 0.03125000 | 49 | | group_prototype | 0.06250000 | 7 | | group_discovery | 0.03125000 | 0 | | screen_user | 0.04687500 | 0 | | screen_usrgrp | 0.04687500 | 0 | | slideshows | 0.04687500 | 0 | | slideshow_user | 0.04687500 | 0 | | slideshow_usrgrp | 0.04687500 | 0 | | interface | 0.09375000 | 458 | | applications | 0.10937500 | 708 | | slides | 0.04687500 | 0 | | drules | 0.04687500 | 0 | | dchecks | 0.03125000 | 0 | | items | 40.34375000 | 9325 | | httptest | 0.07812500 | 0 | | httpstep | 0.03125000 | 0 | | httpstepitem | 0.04687500 | 3 | | valuemaps | 0.03125000 | 17 | | httptestitem | 0.04687500 | 2 | | operations | 0.03125000 | 10 | | scripts | 0.06250000 | 3 | | media_type | 0.03125000 | 3 | | usrgrp | 0.03125000 | 8 | | users_groups | 0.04687500 | 5 | | sysmaps_elements | 0.09375000 | 0 | | actions | 0.04687500 | 6 | | opmessage | 0.03125000 | 8 | | opmessage_grp | 0.04687500 | 8 | | opmessage_usr | 0.04687500 | 6 | | opcommand | 0.03125000 | 0 | | opcommand_hst | 0.04687500 | 0 | | opcommand_grp | 0.04687500 | 0 | | opconditions | 0.03125000 | 0 | | optemplate | 0.04687500 | 0 | | conditions | 0.03125000 | 35 | | config | 0.04687500 | 0 | | triggers | 1.57812500 | 2139 | | trigger_depends | 0.17187500 | 784 | | functions | 1.29687500 | 2165 | | graphs | 0.76562500 | 2580 | | hostmacro | 0.03125000 | 0 | | graphs_items | 0.78125000 | 5573 | | graph_theme | 0.03125000 | 2 | | globalmacro | 0.01562500 | 0 | | hosts_groups | 0.15625000 | 831 | | hosts_templates | 0.09375000 | 525 | | services_times | 0.03125000 | 0 | | items_applications | 6.75000000 | 8944 | | mappings | 0.03125000 | 143 | | media | 0.04687500 | 8 | | rights | 0.04687500 | 15 | | services | 0.03125000 | 0 | | services_links | 0.04687500 | 0 | | icon_map | 0.04687500 | 0 | | icon_mapping | 0.04687500 | 0 | | sysmaps | 0.07812500 | 1 | | sysmap_usrgrp | 0.04687500 | 0 | | sysmaps_link_triggers | 0.04687500 | 0 | | sysmaps_links | 0.06250000 | 0 | | expressions | 0.03125000 | 6 | | sysmap_user | 0.04687500 | 0 | | sysmap_url | 0.03125000 | 0 | | alerts | 61.65625000 | 88993 | | history | 3072.56250000 | 32407979 | | maintenances_hosts | 0.04687500 | 0 | | acknowledges | 0.06250000 | 9 | | maintenances_groups | 0.04687500 | 0 | | timeperiods | 0.01562500 | 0 | | regexps | 0.03125000 | 5 | | ids | 0.01562500 | 50 | | history_text | 1.18750000 | 1469 | | trends | 47.17187500 | 641558 | | history_str | 34.87500000 | 208635 | | history_uint | 10965.95312500 | 111148914 | | history_log | 1.18750000 | 38 | | trends_uint | 311.56250000 | 4025194 | | proxy_history | 0.01562500 | 0 | | proxy_dhistory | 0.01562500 | 0 | | events | 11.54687500 | 57790 | | dhosts | 0.03125000 | 0 | | auditlog | 7.01562500 | 27657 | | auditlog_details | 1.43750000 | 6605 | | service_alarms | 0.04687500 | 0 | | autoreg_host | 0.03125000 | 0 | | graph_discovery | 0.17187500 | 1952 | | proxy_autoreg_host | 0.01562500 | 0 | | dservices | 0.04687500 | 0 | | escalations | 0.03125000 | 2 | | globalvars | 0.01562500 | 0 | | host_inventory | 0.01562500 | 0 | | item_discovery | 6.71875000 | 7137 | | housekeeper | 7.03125000 | 1764 | | images | 1.53125000 | 140 | | profiles | 0.32812500 | 1385 | | host_discovery | 0.04687500 | 2 | | interface_discovery | 0.03125000 | 0 | | problem | 8.51562500 | 22064 | | sessions | 0.50000000 | 2606 | | trigger_discovery | 0.06250000 | 447 | | corr_condition | 0.03125000 | 0 | | application_template | 0.07812500 | 482 | | item_condition | 0.03125000 | 22 | | application_prototype | 0.04687500 | 1 | | item_application_prototype | 0.04687500 | 1 | | application_discovery | 0.04687500 | 0 | | opinventory | 0.01562500 | 0 | | trigger_tag | 0.03125000 | 0 | | event_tag | 0.03125000 | 0 | | problem_tag | 0.04687500 | 0 | | event_recovery | 4.50000000 | 18025 | | correlation | 0.01562500 | 0 | | corr_condition_tagpair | 0.01562500 | 0 | | corr_operation | 0.03125000 | 0 | | task | 0.01562500 | 0 | | dbversion | 0.01562500 | 0 | | screens | 0.04687500 | 13 | | opgroup | 0.04687500 | 0 | | sysmap_element_url | 0.03125000 | 0 | | maintenances_windows | 0.04687500 | 0 | | corr_condition_tag | 0.01562500 | 0 | | corr_condition_group | 0.03125000 | 0 | | corr_condition_tagvalue | 0.01562500 | 0 | | task_close_problem | 0.01562500 | 0 | +----------------------------+----------------+------------+ 127 rows in set (0.005 sec)
2、下载分区脚本(适合zabbix 3.2,网友大佬写的,非常省事儿)
[root@localhost ~]# wget https://dl.cactifans.com/zabbix/partitiontables_gt_zbx34.sh
参数更改,还有zabbix-server,如果这里你的服务不对的话改一下。
脚本默认详情数据保留30天,趋势数据保留1个月,如需修改,请修改以下内容: daily_history_min=30 monthly_history_min=1 脚本默认连接数据库信息,更改成你的: DBHOST=localhost DBUSER=zabbix DBPASS=zabbix
赋予执行权限:
[root@localhost ~]# chmod +x partitiontables_gt_zbx34.sh
3、执行分区脚本
备份数据库(我这里用的是xtrabackup): [root@localhost ~]# mysqldump -uroot -p zabbix>/root/zabbix_$(date +%F).sql 停止zabbix服务: [root@localhost ~]# systemctl stop zabbix_server 查看数据库状态: [root@localhost ~]# systemctl status zabbix_server
执行脚本:
[root@localhost ~]# ./partitiontables_gt_zbx34.sh Ready to partition tables. Ready to update permissions of Zabbix user to create routines #mysql的root帐号和密码: Enter root DB user: root Enter root password: 123456 #备份数据库的操作,已经手动备份: Do you want to backup the database (recommended) (Y/n): n Are you certain you have a backup (y/N): y Ready to proceed: Starting yearly partioning at: 2018 and ending at: 2018 With 30 days of daily history Ready to proceed (Y/n): y Altering table: history Altering table: history_log Altering table: history_str Altering table: history_text Altering table: history_uint Altering table: trends Altering table: trends_uint Creating monthly partitions for table: trends Creating monthly partitions for table: trends_uint Creating daily partitions for table: history Creating daily partitions for table: history_log Creating daily partitions for table: history_str Creating daily partitions for table: history_text Creating daily partitions for table: history_uint Ready to apply script to database, this may take a while.(Y/n): y Altering tables history history_log history_str history_text history_uint trends trends_uint trends trends_uint history history_log history_str history_text history_uint Installing procedures If Zabbix Version = 2.0 Do you want to update the /etc/zabbix/zabbix_server.conf #是否关闭管家设置,我这里已经配置过了,所以选择n to disable housekeeping (Y/n): n #是否设置计划任务,选择y Do you want to update the crontab (Y/n): y The crontab entry can be either in /etc/cron.daily, or added to the crontab for root #是否创建计划任务目录,选择y Do you want to add this to the /etc/cron.daily directory (Y/n): y #设置清理报告通知邮箱,这里写你的邮箱地址就行,每晚会发邮件给这里: Enter email of who should get the daily housekeeping reports: gxm@test.com
如果您顺序运行到这里,那么恭喜你成功了,我这里库已经20GB了,所以跑了1-2小时。
4、确认定时任务
上面的脚本会在/etc/cron.daily目录下生成一个名称为zabbixhousekeeping的脚本:
[root@localhost ~]# cat /etc/cron.daily/zabbixhousekeeping #!/bin/bash /usr/local/zabbix/cron.d/housekeeping.sh
上面的脚本会在/usr/local/zabbix/cron.d/目录下生成一个名称为housekeeping.sh脚本。
[root@localhost ~]# cat /usr/local/zabbix/cron.d/housekeeping.sh #!/bin/bash ##这里的邮箱地址用来每晚给你发邮件,上面设置过 MAILTO=gxm@test.com tmpfile=/tmp/housekeeping$$ date >$tmpfile /usr/bin/mysql --skip-column-names -B -h localhost -uzabbix -pzabbix zabbix -e "CALL create_zabbix_partitions();" >>$tmpfile 2>&1 /usr/bin/mail -s "Zabbix MySql Partition Housekeeping" $MAILTO <$tmpfile rm -f $tmpfile
这样的话,每天凌晨大概3点多会执行,并且用mail程序发送执行结果到上面的你指定的邮箱。可以看到删除过期的数据分区表,并建立新的分区表,我这里因为已经设置过短信报警,所以就不再设置了,如果你收不到报警短信,可以看我前面写过的短信报警笔记。
5、确认脚本执行效果
执行以下SQL语句查看histroy表的分区情况:
MariaDB [(none)]> use zabbix; Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Database changed MariaDB [zabbix]> select partition_name part,partition_expression expr,partition_description descr,table_rows from information_schema.partitions where table_schema = schema() and table_name='history'; +-----------+---------+------------+------------+ | part | expr | descr | table_rows | +-----------+---------+------------+------------+ | p20190309 | `clock` | 1552147200 | 814320 | | p20190310 | `clock` | 1552233600 | 814320 | | p20190311 | `clock` | 1552320000 | 816816 | | p20190312 | `clock` | 1552406400 | 755723 | | p20190313 | `clock` | 1552492800 | 851136 | | p20190314 | `clock` | 1552579200 | 921024 | | p20190315 | `clock` | 1552665600 | 980933 | | p20190316 | `clock` | 1552752000 | 1038024 | | p20190317 | `clock` | 1552838400 | 1038024 | | p20190318 | `clock` | 1552924800 | 1038024 | | p20190319 | `clock` | 1553011200 | 1039896 | | p20190320 | `clock` | 1553097600 | 1040832 | | p20190321 | `clock` | 1553184000 | 1040832 | | p20190322 | `clock` | 1553270400 | 1057680 | | p20190323 | `clock` | 1553356800 | 1112592 | | p20190324 | `clock` | 1553443200 | 1112592 | | p20190325 | `clock` | 1553529600 | 1112592 | | p20190326 | `clock` | 1553616000 | 1091064 | | p20190327 | `clock` | 1553702400 | 1250496 | | p20190328 | `clock` | 1553788800 | 1249872 | | p20190329 | `clock` | 1553875200 | 1249248 | | p20190330 | `clock` | 1553961600 | 1250496 | | p20190331 | `clock` | 1554048000 | 1250496 | | p20190401 | `clock` | 1554134400 | 1231464 | | p20190402 | `clock` | 1554220800 | 1096466 | | p20190403 | `clock` | 1554307200 | 1018368 | | p20190404 | `clock` | 1554393600 | 1137240 | | p20190405 | `clock` | 1554480000 | 1095432 | | p20190406 | `clock` | 1554566400 | 1167504 | | p20190407 | `clock` | 1554652800 | 1124760 | | p20190408 | `clock` | 1554739200 | 619944 | | p20190409 | `clock` | 1554825600 | 0 | | p20190410 | `clock` | 1554912000 | 0 | | p20190411 | `clock` | 1554998400 | 0 | | p20190412 | `clock` | 1555084800 | 0 | | p20190413 | `clock` | 1555171200 | 0 | | p20190414 | `clock` | 1555257600 | 0 | | p20190415 | `clock` | 1555344000 | 0 | +-----------+---------+------------+------------+ 38 rows in set (0.002 sec)
分区表成功,启动zabbix_server服务,这里需要说明一下,分完表以后我的数据库立马小了5个GB,但是housekeeping还是很高,当晚执行一遍计划任务后,第二天就正常了。
6、取消housekeeper清理
这里需要注意,最后一定要在zabbix配置文件里面取消下图的设置。
原来慢试过两个小时清理一遍,每次清理5000条,统统都取消。

这里,开启内部管家的框不选择,只选择下面的“覆盖监控项历史期间”和“覆盖监控项趋势期间”,至于数据存储天数,自己看吧,我这里选择30天,但是我感觉最多7天足矣。


此笔来自于:http://www.mamicode.com/info-detail-2228907.html

浙公网安备 33010602011771号