The "Invisible Killer" of Performance Degradation: GBase Catches the "Culprit" Through Temporary Files
During routine operations, Xiao Ming noticed that report generation on one system always slowed down sharply during a particular time window. Preliminary investigation showed that the backend database was returning results more slowly, which in turn dragged down frontend response times.
The backend of this system is a data warehouse built on GBase 8a MPP Cluster. It stores hundreds of terabytes of data and supports many of the company's analytical workloads, so the difficulty of troubleshooting was obvious. After several unsuccessful attempts to resolve the issue himself, Xiao Ming turned to the GBASE technical team for help.
Technical expert Xiao Ji arrived on site soon after receiving the request. From talking with Xiao Ming, he learned that the slowdown always occurred within a specific time window. So the next time performance dropped, Xiao Ji began monitoring key database metrics such as CPU, memory, and disk IO, and quickly found that disk IO was abnormally busy during that period.
Drawing on his experience, Xiao Ji formed a quick judgment: the intermediate result sets of some complex SQL statements were probably so large that a large number of intermediate temporary files were being generated under the /opt/gnode/tmpdata directory. These temporary files can double query response latency, and in severe cases they can fill up the disk and bring the entire cluster to a standstill.
Sure enough, under /opt/gnode/tmpdata Xiao Ji found a large number of temporary files that were still growing. Worse, dozens of complex SQL statements were running concurrently on site, so the traditional approach of walking through show processlist entry by entry would have been like looking for a needle in a haystack.
suse103:/opt/gnode/tmpdata/cache_gbase # pwd
/opt/gnode/tmpdata/cache_gbase
suse103:/opt/gnode/tmpdata/cache_gbase # ll
total 75497472
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000001965940xa3d8500.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000001CCF3A0xa3a61c0.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000002D97070x38ae3c0.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT0000000753F980x9626440.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT0000000FB81C50xbd28000.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000013199B0x17fc7f40.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000026185130x619a6c0.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT000000268BA140x3d14440.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000030F188E0x9626300.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000035DD840x17fc7e00.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000057D15420x38aea00.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT0000005C49BAC0x3d12a00.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT0000005F38C290xbd28140.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000069942570x619bac0.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000069F92D60xa3dbe80.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000078F35830xa3dbd40.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000101965940xa3d8500.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000101CCF3A0xa3a61c0.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000102D97070x38ae3c0.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000130F188E0x9626300.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000135DD840x17fc7e00.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000157D15420x38aea00.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT0000015C49BAC0x3d12a00.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT0000015F38C290xbd28140.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000169942570x619bac0.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000169F92D60xa3dbe80.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000178F35830xa3dbd40.express_tmp
-rw-rw---- 1 gbase gbase 469762048 Nov 8 14:27 GB_MAT00000201965940xa3d8500.express_tmp
-rw-rw---- 1 gbase gbase 469762048 Nov 8 14:27 GB_MAT00000201CCF3A0xa3a61c0.express_tmp
-rw-rw---- 1 gbase gbase 402653184 Nov 8 14:27 GB_MAT00000202D97070x38ae3c0.express_tmp
-rw-rw---- 1 gbase gbase 469762048 Nov 8 14:27 GB_MAT0000020753F980x9626440.express_tmp
-rw-rw---- 1 gbase gbase 436207616 Nov 8 14:27 GB_MAT0000020FB81C50xbd28000.express_tmp
-rw-rw---- 1 gbase gbase 436207616 Nov 8 14:27 GB_MAT00000213199B0x17fc7f40.express_tmp
-rw-rw---- 1 gbase gbase 436207616 Nov 8 14:27 GB_MAT00000226185130x619a6c0.express_tmp
-rw-rw---- 1 gbase gbase 469762048 Nov 8 14:27 GB_MAT00000269F92D60xa3dbe80.express_tmp
-rw-rw---- 1 gbase gbase 469762048 Nov 8 14:27 GB_MAT00000278F35830xa3dbd40.express_tmp
drwxrwx--x 2 gbase gbase 6 Nov 1 17:34 HashJoin
drwxrwx--x 2 gbase gbase 6 Nov 8 13:58 TrashCan
drwxrwx--x 3 gbase gbase 21 Nov 1 17:34 tmp_materialized
suse103:/opt/gnode/tmpdata # pwd
/opt/gnode/tmpdata
suse103:/opt/gnode/tmpdata # du -sh *
97G cache_gbase
suse103:/opt/gnode/tmpdata #
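Before drilling into individual sessions, it helps to confirm that the temporary files are still actively growing rather than being stale leftovers. A minimal shell sketch for that check, assuming the /opt/gnode/tmpdata/cache_gbase path seen above (the 30-second interval is only an illustrative choice):
# Sample tmpdata usage periodically; steadily growing numbers mean some SQL is still spilling to disk
while true; do
    date '+%F %T'
    du -sh /opt/gnode/tmpdata/cache_gbase
    sleep 30
done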
System performance anomaly: GBASE catches the "culprit"
Faced with tens of thousands of temporary files, the key to solving the problem was to quickly pin down the "culprit" SQL.
Xiao Ji acted without hesitation. He first queried the SESSION_MEMORY_USAGE_INFO table in performance_schema. This table plays an important role on the GN layer of GBase 8a: its TEMP_SPACE field records how much tmpdata space a session occupies, and its ID field is the execution ID of the SQL (corresponding to the first field, "ID", in show processlist).
gbase> select * from performance_schema.SESSION_MEMORY_USAGE_INFO where TEMP_SPACE>0;
+---------+-----+---------+------+---------------------+-------------+
| HOST | ID | CURRENT | PEAK | PEAK_TIMESTAMP | TEMP_SPACE |
+---------+-----+---------+------+---------------------+-------------+
| suse103 | 342 | 0 | 0 | 1970-01-01 08:00:00 | 86872424448 |
+---------+-----+---------+------+---------------------+-------------+
1 row in set (Elapsed: 00:00:00.00)
gbase>
By querying this table, Xiao Ji obtained the ID of the SQL that was occupying a large amount of temporary space. He then located the corresponding SQL statement in the gn-layer show processlist.
suse103:~ # gncli -uroot -e"show full processlist" | grep 342
342 root 10.10.10.103:35972 ssbm Query 224 init SELECT /*10.10.10.103_179_2_2016-11-08_14:24:21*/ /*+ TID('78') */ _tmp_rht_n4_179_t9_1_1477982902_s.lo_orderkey AS lo_orderkey, _tmp_rht_n4_179_t9_1_1477982902_s.lo_linenumber AS lo_linenumber,
_tmp_rht_n4_179_t9_1_1477982902_s.lo_custkey AS lo_custkey, _tmp_rht_n4_179_t9_1_1477982902_s.lo_partkey AS lo_partkey,
_tmp_rht_n4_179_t9_1_1477982902_s.lo_suppkey AS lo_suppkey, _tmp_rht_n4_179_t9_1_1477982902_s.lo_orderdate AS lo_orderdate,
_tmp_rht_n4_179_t9_1_1477982902_s.lo_orderpriority AS lo_orderpriority, _tmp_rht_n4_179_t9_1_1477982902_s.lo_shippriority AS lo_shippriority,
_tmp_rht_n4_179_t9_1_1477982902_s.lo_quantity AS lo_quantity, _tmp_rht_n4_179_t9_1_1477982902_s.lo_extendedprice AS lo_extendedprice,
_tmp_rht_n4_179_t9_1_1477982902_s.lo_ordtotalprice AS lo_ordtotalprice, _tmp_rht_n4_179_t9_1_1477982902_s.lo_discount AS lo_discount,
_tmp_rht_n4_179_t9_1_1477982902_s.lo_revenue AS lo_revenue, _tmp_rht_n4_179_t9_1_1477982902_s.lo_supplycost AS lo_supplycost,
_tmp_rht_n4_179_t9_1_1477982902_s.lo_tax AS lo_tax, _tmp_rht_n4_179_t9_1_1477982902_s.lo_commitdate AS lo_commitdate,
_tmp_rht_n4_179_t9_1_1477982902_s.lo_shipmode AS lo_shipmode, _tmp_rht_n4_179_t9_2_1477982902_s.c1 AS c1, _tmp_rht_n4_179_t9_2_1477982902_s.c2 AS c2, _tmp_rht_n4_179_t9_2_1477982902_s.c3 AS c3, _tmp_rht_n4_179_t9_2_1477982902_s.c4 AS c4, _tmp_rht_n4_179_t9_2_1477982902_s.c5 AS c5,
_tmp_rht_n4_179_t9_2_1477982902_s.c6 AS c6, _tmp_rht_n4_179_t9_2_1477982902_s.c7 AS c7, _tmp_rht_n4_179_t9_2_1477982902_s.c8 AS c8,
_tmp_rht_n4_179_t9_2_1477982902_s.c9 AS c9, _tmp_rht_n4_179_t9_2_1477982902_s.c10 AS c10, _tmp_rht_n4_179_t9_2_1477982902_s.c11 AS c11,
_tmp_rht_n4_179_t9_2_1477982902_s.c12 AS c12, _tmp_rht_n4_179_t9_2_1477982902_s.c13 AS c13, _tmp_rht_n4_179_t9_2_1477982902_s.c14 AS c14,
_tmp_rht_n4_179_t9_2_1477982902_s.c15 AS c15, _tmp_rht_n4_179_t9_2_1477982902_s.c16 AS c16, _tmp_rht_n4_179_t9_2_1477982902_s.c17 AS c17 FROM gctmpdb._tmp_rht_n4_179_t9_1_1477982902_s INNER JOIN gctmpdb._tmp_rht_n4_179_t9_2_1477982902_s ON (_tmp_rht_n4_179_t9_1_1477982902_s.lo_linenumber = _tmp_rht_n4_179_t9_2_1477982902_s.c2)
suse103:~ #
Based on specific markers in the node-layer SQL statement (such as the IP address and timestamp) or on the temporary table names, he then looked up the gc-layer SQL session ID and successfully located the specific statement, the "culprit" behind the performance degradation.
suse103:~ # gccli -uroot -e"show full processlist" | grep -v Sleep | grep 179
179 root 127.0.0.1:63945 ssbm Query 571 Sending task to gnodes select * from lineorder a,lineorder b where a.lo_linenumber=b.lo_linenumber
The join key lo_linenumber is far from unique, so this self-join of lineorder against itself multiplies matching rows into an enormous intermediate result that spills to tmpdata. In the end, after communicating with the business team to understand the real business requirement, Xiao Ji helped them rewrite the SQL statement, which resolved the performance degradation and brought the system back to normal.
GBASE Mini-Classroom
When SQL statements with large intermediate result sets are executed on site, a large number of intermediate temporary files are generated under the tmpdata directory, making disk IO busy and hurting overall cluster performance. If many SQL statements are running concurrently, tracking them down one by one with show processlist is inefficient. So how can the SQL responsible for the temporary files under tmpdata be identified quickly and accurately?
In this situation, on the GN layer of GBase 8a you can query the SESSION_MEMORY_USAGE_INFO table in performance_schema.
gbase> select * from performance_schema.SESSION_MEMORY_USAGE_INFO;
+---------+-----+---------+------+---------------------+------------+
| HOST | ID | CURRENT | PEAK | PEAK_TIMESTAMP | TEMP_SPACE |
+---------+-----+---------+------+---------------------+------------+
| suse103 | 344 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 342 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 324 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 289 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 274 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 273 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 241 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 215 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 214 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 213 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 132 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 52 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 31 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 26 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 25 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 17 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
+---------+-----+---------+------+---------------------+------------+
16 rows in set (Elapsed: 00:00:00.00)
gbase> select * from performance_schema.SESSION_MEMORY_USAGE_INFO where TEMP_SPACE>0;
Empty set (Elapsed: 00:00:00.00)
The TEMP_SPACE field indicates the amount of tmpdata space occupied, and the ID field is the execution ID of the SQL (corresponding to the first field, "ID", in show processlist).
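If several sessions report a non-zero TEMP_SPACE at the same time, sorting on that column makes the heaviest consumer stand out immediately. A possible variant, run through gncli as elsewhere in this article (the column list is taken from the output above):
gncli -uroot -e"select HOST, ID, TEMP_SPACE from performance_schema.SESSION_MEMORY_USAGE_INFO where TEMP_SPACE > 0 order by TEMP_SPACE desc"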
The corresponding SQL can be found in the gn-layer show processlist, and from the node-layer SQL you can then trace the corresponding gc-layer SQL.
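Putting the lookup steps together, the procedure used above can be scripted roughly as follows. This is only a sketch based on the commands already shown; the session ID 342 and the gc-layer marker 179 are the values from this particular case and will differ in practice:
# 1) On the GN layer, list sessions that are occupying temp space
gncli -uroot -e"select HOST, ID, TEMP_SPACE from performance_schema.SESSION_MEMORY_USAGE_INFO where TEMP_SPACE > 0"
# 2) Pull the node-layer SQL for the offending session ID (342 in this case)
gncli -uroot -e"show full processlist" | grep 342
# 3) The node-layer SQL carries markers such as the IP address, gc session id and timestamp;
#    use that session id (179 here) to find the original statement on the GC layer
gccli -uroot -e"show full processlist" | grep -v Sleep | grep 179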
