蒋坤的博客
信念 方式 态度
 

排序列表堆(SORTHEAP)等参数引起的db2diag.log不断增大一例

 

服务器配置了4G内存,系统运行初期调整了部分参数,数据库全部分配内存大概在1.5G左右,系统运行稳定,偶有错误出现(两周一次),诸如:未找到****软件包、语句太复杂等等,由于出现的频次不多,于是未予理会。

随着业务的增长,数据的增多服务器开始出现异常情况越来越频繁,主要表现死锁的增加,于是继续调整参数,异常情况消失。运行2月以后偶出现死锁现象,于是对数据库进行监控,发现日志文件db2diag.log异常增大,大小到达4g多,且不断增大,经过漫长的加载终于查看到文件内容:

2009-01-23-16.06.41.484000+480 E199179310H606     LEVEL: Severe (OS)

PID     : 4660                 TID  : 5372        PROC : db2syscs.exe

INSTANCE: DB2                  NODE : 000         DB   : TIMMS

APPHDL  : 0-719                APPID: GA5103A5.M812.01D283080640

FUNCTION: DB2 UDB, oper system services, getPrivateChunksFromOs, probe:100

CALLED  : OS, -, VirtualAlloc

OSERR   : 8 "存储空间不足,无法处理此命令。"

MESSAGE : Private memory and/or virtual address space exhausted. Or ulimit is

          not high enough.

DATA #1 : Requested size, PD_TYPE_MEM_REQUESTED_SIZE, 4 bytes

136544256

 

2009-01-23-16.06.41.484000+480 I199179918H517     LEVEL: Warning

PID     : 4660                 TID  : 5372        PROC : db2syscs.exe

INSTANCE: DB2                  NODE : 000         DB   : TIMMS

APPHDL  : 0-719                APPID: GA5103A5.M812.01D283080640

FUNCTION: DB2 UDB, sort/list services, sqlsAllocateSortMemory, probe:35

DATA #1 : <preformatted>

Not enough memory available for a (private) sort heap of size 50000

bufSize = 136527872, bPrivatePool = 1, bPoolHasSpace = 1

Trying smaller size...

 

2009-01-23-16.06.41.515000+480 E199180437H606     LEVEL: Severe (OS)

PID     : 4660                 TID  : 5372        PROC : db2syscs.exe

INSTANCE: DB2                  NODE : 000         DB   : TIMMS

APPHDL  : 0-719                APPID: GA5103A5.M812.01D283080640

FUNCTION: DB2 UDB, oper system services, getPrivateChunksFromOs, probe:100

CALLED  : OS, -, VirtualAlloc

OSERR   : 8 "存储空间不足,无法处理此命令。"

MESSAGE : Private memory and/or virtual address space exhausted. Or ulimit is

          not high enough.

DATA #1 : Requested size, PD_TYPE_MEM_REQUESTED_SIZE, 4 bytes

117030912

 

2009-01-23-16.06.41.515000+480 E199181045H606     LEVEL: Severe (OS)

PID     : 4660                 TID  : 5372        PROC : db2syscs.exe

INSTANCE: DB2                  NODE : 000         DB   : TIMMS

APPHDL  : 0-719                APPID: GA5103A5.M812.01D283080640

FUNCTION: DB2 UDB, oper system services, getPrivateChunksFromOs, probe:100

CALLED  : OS, -, VirtualAlloc

OSERR   : 8 "存储空间不足,无法处理此命令。"

MESSAGE : Private memory and/or virtual address space exhausted. Or ulimit is

          not high enough.

DATA #1 : Requested size, PD_TYPE_MEM_REQUESTED_SIZE, 4 bytes

117030912

 

2009-01-23-16.06.41.515000+480 I199181653H517     LEVEL: Warning

PID     : 4660                 TID  : 5372        PROC : db2syscs.exe

INSTANCE: DB2                  NODE : 000         DB   : TIMMS

APPHDL  : 0-719                APPID: GA5103A5.M812.01D283080640

FUNCTION: DB2 UDB, sort/list services, sqlsAllocateSortMemory, probe:35

DATA #1 : <preformatted>

Not enough memory available for a (private) sort heap of size 50000

bufSize = 117022720, bPrivatePool = 1, bPoolHasSpace = 1

Trying smaller size...

从上述日志中,可以看到每次日志的写入数据是不一致的,表现在bufSize的值不一致。

第一次:

Not enough memory available for a (private) sort heap of size 50000

bufSize = 136527872, bPrivatePool = 1, bPoolHasSpace = 1

 

第二次:

Not enough memory available for a (private) sort heap of size 50000

bufSize = 117022720, bPrivatePool = 1, bPoolHasSpace = 1

 

从日志中记录中已经可以很直观的看到对于Sortheap大小分配存在严重问题了,于是每有一个查询、更新、插入、删除提交便会出现一次这样的警告,同时也在日志中记录,于是便出现了看到的日志文件不断增大。于是重新对所有的分配进行了一遍计算,发现留给操作系统的内存只有1G,通过任务管理器中发现此时的物理内存可用数为600M左右,而系统缓存也很小的值。

继续调整参数,主要是调整缓冲池、排序堆、Sql语句堆等的值,最后剩余给操作系统的内存增大至1.5G大小,启动业务系统,经过两天观察日志文件再无增大迹象,数据库运行稳定。

数据库的性能调优,就意味着必须调整参数,而参数调整就是一个双刃剑,调整没有什么固定的模式可走,只能依靠对业务、数据量的分析,不断的调整、观察、再调整。每个业务系统的参数调整都很难达到最优化配置,不过能满足业务需求足矣,特别对于业务较大的系统。

对于DB2的参数调整时我们可以参考DB2内存模型的文章,在此文中已经对DB2的整个内存模型做了全面的介绍,同时在这边文章中也可以找到对于上述描述的日志文件中的错误的记录,同时也可以很清楚的了解到上面所提到的bufSize的值不断变化的原因所在,作为数据库共享内存的一个组成部分,而数据库共享内存的剩余值也会随操作SQL的变化而变化,在文章中对于这个部分作了很详细的描述:

Sorting is required when no index satisfies the requested ordering of fetched rows, or the optimizer determines that a sort is less expensive than an index scan. There are two kinds of sorts in DB2, private sorts and shared sorts. Private sorts take place in an agent's private agent memory (which we will discuss in the next section); shared shorts take place in the database's database shared memory.

For private sorts, the database manager configuration parameter sheapthres specifies an instance-wide soft limit on the total amount of memory that can be consumed by private sorts at any given time. When the total private-sort memory consumption for an instance reaches this limit, the memory allocated for additional incoming private-sort requests will be considerably reduced. You will see the following message in the db2diag.log:

"Not enough memory available for a (private) sort heap of size size of sortheap. Trying smaller size..."

If you enable intra-partition parallelism or concentrator, DB2 may choose to perform a shared sort if it determines that method to be more efficient than a private sort. If a shared sort is performed, the sort heap for the sort is allocated in database shared memory. The maximum amount of memory used for shared sorts is controlled by the sheapthres_shr database parameter. This is a database-wide hard limit on the total amount of memory consumed by shared sorts at any given time. When this limit is reached, the application requesting the sort will receive the error SQL0955 (rc2). No further shared-sort memory requests will be allowed until the total shared-sort memory consumption falls below the limit specified by sheapthres_shr.

The following formula calculates approximately how much memory the database shared memory set requires: Database shared memory = (Main bufferpools + 4 hidden bufferpools + database heap + utility heap + locklist + package cache + catalog cache) + (number of estore pages * 100 bytes) + approx. 10% overhead

For databases with intra_parallel enabled or concentrator enable, shared sort memory must be pre-allocated as part of the database shared memory, thus the formula becomes: Database shared memory = (Main bufferpools + 4 hidden bufferpools + database heap + utility heap + locklist + package cache + catalog cache + sheapthres_shr) + (number of estore pages * 100 bytes) + approx. 10% overhead

 

 上文的完整版本的连接:DB2 UDB 内存模型

几年前曾使用DB2自带的自动调整配置程序,期间因为死锁导致系统经常业务系统当机不断的事情发生,也咨询了几次IBM的工程师,不过都没有得到很好的答案,对于置身于开发、调试环境的人才能更好的知道当前的业务系统所需要的,当然如果有专业的数据库开发和管理人员那是最好,不过很多公司也未必有这样专业的人员去做这些事情。经验就是在不断的失败中得到的。

好久没写了,一向对于文字的组织也不太擅长,所以就凑合看看吧。

 

 

posted on 2009-04-14 11:18  jkfree  阅读(1197)  评论(0编辑  收藏  举报