代码改变世界

统计信息自动收集任务失效原因排查

2018-01-29 20:26  AlfredZhao  阅读(1382)  评论(0编辑  收藏  举报

环境:Oracle 11.2.0.3 RAC
问题:统计信息自动收集任务失效原因排查

1.查看自动任务的状态

查看自动任务的状态,确认是enabled状态:
SQL> select client_name,status from dba_autotask_client;

CLIENT_NAME                                                      STATUS
---------------------------------------------------------------- --------
auto optimizer stats collection                                  ENABLED
auto space advisor                                               ENABLED
sql tuning advisor                                               ENABLED

Elapsed: 00:01:03.88

一般来说,这样的结果,就意味着统计信息自动收集是打开的,如今却是没有自动收集。说明确实是有故障,需要进一步深入排查。
若对这些自动任务的状态不理解可以参考之前文章:

2.进一步查看其它信息

根据 Why Auto Optimizer Statistics Collection May Appear to be "Stuck" and Not Running? (文档 ID 1320246.1)

排查以下项:

The 'auto optimizer stats collection' task is enabled in auto task
STATISTICS_LEVEL has already been set to TYPICAL or FULL
dba_autotask_client_history is empty
Statistics on tables are stale.
Some scheduler window is active and LAST_START_DATE is several days ago.

具体操作记录如下:

--STATISTICS_LEVEL:
SQL> show parameter STATISTICS_LEVEL 

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
statistics_level                     string      TYPICAL

--dba_autotask_client_history is empty:
SQL> select * from dba_autotask_client_history;

no rows selected

Elapsed: 00:01:04.79

--Some scheduler window is active and LAST_START_DATE 是20-SEP-12 10.00.00.010777 PM
SELECT window_name, last_start_date, enabled, active FROM dba_scheduler_windows;
SQL> SELECT window_name ,
  2  last_start_date,
  3  enabled ,
  4  active
  5  FROM dba_scheduler_windows;

WINDOW_NAME                    LAST_START_DATE                                                             ENABL ACTIV
------------------------------ --------------------------------------------------------------------------- ----- -----
MONDAY_WINDOW                  22-JAN-18 10.00.00.014299 PM PRC                                            TRUE  FALSE
TUESDAY_WINDOW                 23-JAN-18 10.00.01.076798 PM PRC                                            TRUE  FALSE
WEDNESDAY_WINDOW               24-JAN-18 10.00.00.055066 PM PRC                                            TRUE  FALSE
THURSDAY_WINDOW                20-SEP-12 10.00.00.010777 PM PRC                                            TRUE  TRUE
FRIDAY_WINDOW                  26-JAN-18 10.00.00.725176 PM PRC                                            TRUE  FALSE
SATURDAY_WINDOW                27-JAN-18 06.00.00.010751 AM PRC                                            TRUE  FALSE
SUNDAY_WINDOW                  28-JAN-18 06.00.00.593207 AM PRC                                            TRUE  FALSE
WEEKNIGHT_WINDOW                                                                                           FALSE FALSE
WEEKEND_WINDOW                                                                                             FALSE FALSE

9 rows selected.

Elapsed: 00:00:00.01

可以看到,如今的环境确实完全匹配文档列出的这些特征项。
另外,看下统计信息自动收集任务的WINDOW_GROUP以及对应的MEMBER_NAME:

SQL> select client_name,window_group
  2  from DBA_AUTOTASK_CLIENT
  3  where client_name = 'auto optimizer stats collection';

CLIENT_NAME                                                      WINDOW_GROUP
---------------------------------------------------------------- ----------------------------------------------------------------
auto optimizer stats collection                                  ORA$AT_WGRP_OS

Elapsed: 00:01:05.39
SQL> SELECT * FROM dba_scheduler_group_members
  2  where group_name='ORA$AT_WGRP_OS';

OWNER                          GROUP_NAME                     MEMBER_NAME
------------------------------ ------------------------------ -----------------------------------------------------------------------------------------------------------------------------------
SYS                            ORA$AT_WGRP_OS                 "SYS"."MONDAY_WINDOW"
SYS                            ORA$AT_WGRP_OS                 "SYS"."TUESDAY_WINDOW"
SYS                            ORA$AT_WGRP_OS                 "SYS"."WEDNESDAY_WINDOW"
SYS                            ORA$AT_WGRP_OS                 "SYS"."THURSDAY_WINDOW"
SYS                            ORA$AT_WGRP_OS                 "SYS"."FRIDAY_WINDOW"
SYS                            ORA$AT_WGRP_OS                 "SYS"."SATURDAY_WINDOW"
SYS                            ORA$AT_WGRP_OS                 "SYS"."SUNDAY_WINDOW"

7 rows selected.

Elapsed: 00:00:00.04

可以看到,正常应该是一周7天都有的,但是由于THURSDAY_WINDOW 的窗口ACTIVE是TRUE,导致2012年9月27号(周四)统计信息收集失败。根据MOS描述,会影响其后的窗口都打开失败,进而导致之后所有自动任务的执行都失败。
我实际在我的测试环境验证也是如此,现象如下:

SQL> SELECT window_name, last_start_date, enabled, active FROM dba_scheduler_windows;

WINDOW_NAME                                                  LAST_START_DATE                                                             ENABLED    ACTIVE
------------------------------------------------------------ --------------------------------------------------------------------------- ---------- ----------
MONDAY_WINDOW                                                22-JAN-18 10.00.00.011111 PM EST5EDT                                        TRUE       FALSE
TUESDAY_WINDOW                                               23-JAN-18 10.22.54.068776 PM EST5EDT                                        TRUE       FALSE
WEDNESDAY_WINDOW                                             24-JAN-18 10.00.00.005654 PM EST5EDT                                        TRUE       FALSE
THURSDAY_WINDOW                                              25-JAN-18 10.00.00.009885 PM EST5EDT                                        TRUE       FALSE
FRIDAY_WINDOW                                                26-JAN-18 10.00.00.010077 PM EST5EDT                                        TRUE       FALSE
SATURDAY_WINDOW                                              27-JAN-18 05.00.00.004156 PM EST5EDT                                        TRUE       FALSE
SUNDAY_WINDOW                                                28-JAN-18 01.00.00.010044 PM EST5EDT                                        TRUE       TRUE
WEEKNIGHT_WINDOW                                                                                                                         FALSE      FALSE
WEEKEND_WINDOW                                                                                                                           FALSE      FALSE

9 rows selected.

SQL> EXECUTE DBMS_SCHEDULER.OPEN_WINDOW ('MONDAY_WINDOW','');
BEGIN DBMS_SCHEDULER.OPEN_WINDOW ('MONDAY_WINDOW',''); END;

*
ERROR at line 1:
ORA-27480: window "SUNDAY_WINDOW" is currently open
ORA-06512: at "SYS.DBMS_ISCHED", line 493
ORA-06512: at "SYS.DBMS_SCHEDULER", line 1220
ORA-06512: at line 1


SQL> 

可以看到我测试环境,因为周日的窗口已经是ACTIVE,再尝试打开周一的窗口就会报错,看起来同时就只能有一个窗口ACTIVE,这与MOS的描述也完全一致。

3.解决问题

根据MOS解决建议手工关闭活动的窗口,我这里就是关闭THURSDAY_WINDOW窗口:
SQL> EXECUTE DBMS_SCHEDULER.CLOSE_WINDOW ('THURSDAY_WINDOW');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.14

再次查询状态,确认窗口状态已关闭:

SQL> SELECT window_name ,
  2  last_start_date,
  3  enabled ,
  4  active
  5  FROM dba_scheduler_windows;

WINDOW_NAME                    LAST_START_DATE                                                             ENABL ACTIV
------------------------------ --------------------------------------------------------------------------- ----- -----
MONDAY_WINDOW                  22-JAN-18 10.00.00.014299 PM PRC                                            TRUE  FALSE
TUESDAY_WINDOW                 23-JAN-18 10.00.01.076798 PM PRC                                            TRUE  FALSE
WEDNESDAY_WINDOW               24-JAN-18 10.00.00.055066 PM PRC                                            TRUE  FALSE
THURSDAY_WINDOW                20-SEP-12 10.00.00.010777 PM PRC                                            TRUE  FALSE
FRIDAY_WINDOW                  26-JAN-18 10.00.00.725176 PM PRC                                            TRUE  FALSE
SATURDAY_WINDOW                27-JAN-18 06.00.00.010751 AM PRC                                            TRUE  FALSE
SUNDAY_WINDOW                  28-JAN-18 06.00.00.593207 AM PRC                                            TRUE  FALSE
WEEKNIGHT_WINDOW                                                                                           FALSE FALSE
WEEKEND_WINDOW                                                                                             FALSE FALSE

9 rows selected.

这样就找到了问题,另外因为离统计信息信息自动收集关闭已有多年,目前生产环境运行平稳,所以最终决定保守处理,即:先将生产环境的自动任务都关闭,等在备库测试验证后再考虑生产环境开启自动任务。

Reference

- Why Auto Optimizer Statistics Collection May Appear to be "Stuck" and Not Running? (文档 ID 1320246.1)