代码改变世界

ORACLE中如何找出统计信息过期的表

2022-10-19 14:49  潇湘隐者  阅读(1427)  评论(0编辑  收藏  举报

在ORACLE数据库当中,我们如何判断当前对象的统计信息是否过期/过时(Stale)的呢? 从哪些维度去判断呢? 我们有那些方法呢?下面介绍ORACLE数据库中一些找出过期统计信息的方法,主要有下面几种方法:

1:DBA_TAB_STATISTICS告诉你统计信息过期

主要通过DBA_STATISTICS中的STALE_STATS列来判断统计信息是否过期,当STALE_STATS字段值为YES时,表示统计信息过期了,NO表示统计信息未过期。这个是最简单的方法。那么ORACLE是如何判断统计信息过期的呢?其实数据库判断统计信息是否过期(stale)的依据是当前表的DML操作的记录数是否超过表数据量的10%,如果超过10%这个阈值,统计信息就被认为是过期了。Oracle就认为这些表应该需要重新收集一次统计信息了。

下面是官方文档的部分介绍:

stale_stats says whether the stats are considered fresh or stale, or if the stats will be gathered automatically next time or not. The default settings is 10 percent. If you gather table statistics and then insert/update/delete less than 10 percent of rows the statistics is considered fresh. When you reach 10 percent of modified rows they become stale.

查看统计信息过期的脚本

/*******************************************************************************************

--刷新数据库监控信息:
exec dbms_stats.flush_database_monitoring_info;

For performance reasons, the Oracle Database does not populate these views immediately when the actual modifications occur. 
Run the FLUSH_DATABASE_MONITORING_INFO procedure in the DIMS_STATS PL/SQL package to populate these views 
with the latest information.


--通俗的解释
We don't *immediately* up date the staleness, because if we did, we would have to do it every 
time someone ran a DML statement. So we track it in memory, and flush it to the dictionary from time to time.

********************************************************************************************/
ALTER SESSION SET NLS_DATE_FORMAT='YYYY-MM-DD HH24:MI:SS';
SET LINESIZE 1080
SET PAGESIZE 1000
EXEC DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;
SELECT S.OWNER
     , S.TABLE_NAME
     , S.PARTITION_NAME
     , S.OBJECT_TYPE
     , S.STALE_STATS
     , S.LAST_ANALYZED 
FROM DBA_TAB_STATISTICS S
INNER JOIN DBA_TABLES T ON S.OWNER= T.OWNER AND S.TABLE_NAME =T.TABLE_NAME
WHERE (S.STALE_STATS = 'YES' OR S.LAST_ANALYZED IS NULL)
   -- STALE_STATS = 'YES'   表示统计信息过期:当对象有超过10%的ROWS被修改时
   -- LAST_ANALYZED IS NULL 表示该对象从未进行过统计信息收集
   AND T.TEMPORARY ='N'  --排除临时表
   AND S.OWNER NOT IN ('MDDATA', 'MDSYS', 'ORDSYS', 'CTXSYS', 
                     'ANONYMOUS', 'EXFSYS', 'OUTLN', 'DIP', 
                     'DMSYS', 'WMSYS', 'XDB', 'ORACLE_OCM', 
                     'TSMSYS', 'ORDPLUGINS', 'SI_INFORMTN_SCHEMA',
                     'OLAPSYS', 'SYSTEM', 'SYS', 'SYSMAN',
                     'DBSNMP', 'SCOTT', 'PERFSTAT', 'PUBLIC',
                     'MGMT_VIEW', 'WK_TEST', 'WKPROXY', 'WKSYS')
   -- 系统用户表的统计信息状态不做统计,根据需求打开或关闭
   AND S.TABLE_NAME NOT LIKE 'BIN%'
   -- 回收站中的表不做统计
   ORDER BY OWNER,TABLE_NAME;

2:通过DBA_TABLES的LAST_ANALYZED列判断

这里主要是通过上一次的收集统计信息的时间来判断。其实根据收集统计信息的时间来判断统计信息是否过时,其实有一定的局限性和不合理性。时间维度来判断太过粗糙。例如对于维表,或者一些数据长期没有DML操作的表来说,以LAST_ANALYZED来判断的话,就有失公允。所以这种方法仅供参考,在有些场合完全是无效的。也就是说从统计信息采集的时间维度来判断,根本是不可靠的,仅供参考。

下面例子,统计信息超过7天没有更新的表。

SET LINESIZE 1080;
ALTER SESSION SET NLS_DATE_FORMAT='YYYY-MM-DD HH24:MI:SS';
SELECT OWNER
      ,TABLE_NAME
      ,LAST_ANALYZED 
FROM DBA_TABLES
WHERE LAST_ANALYZED < SYSDATE-7;

3:使用DBMS_STATS.GATHER_SCHEMA_STATS找出统计信息过期的表

使用dbms_stats.gather_schema_stats包找出统计信息过期的对象,如下所示:

--找出某个SCHEMA下统计信息过期的对象

set serveroutput on;

declare
mystaleobjs dbms_stats.objecttab;
begin
-- check whether there is any stale objects
dbms_stats.gather_schema_stats(ownname=>U'&owner', options=>'LIST STALE',objlist=>
mystaleobjs);
for i in 1 .. mystaleobjs.count loop
dbms_output.put_line(mystaleobjs(i).ownname || '.'||mystaleobjs(i).objname);
end loop;
end;
/

--找出整个数据库中统计信息过期的对象

set serveroutput on

DECLARE
ObjList dbms_stats.ObjectTab;
BEGIN
dbms_stats.gather_database_stats(objlist=>ObjList, options=>'LIST STALE');
FOR i in ObjList.FIRST..ObjList.LAST
LOOP
dbms_output.put_line(ObjList(i).ownname || '.' || ObjList(i).ObjName || ' ' || ObjList(i).ObjType || ' ' || ObjList(i).partname);
END LOOP;
END;
/

--找出没有统计信息的表

set serveroutput on;

declare
mystaleobjs dbms_stats.objecttab;
begin
-- check whether there is any stale objects
dbms_stats.gather_schema_stats(ownname=>'&owner', options=>'LIST EMPTY',objlist=>
mystaleobjs);
for i in 1 .. mystaleobjs.count loop
dbms_output.put_line(mystaleobjs(i).ownname || '.'||mystaleobjs(i).objname);
end loop;
end;
/

参考资料:How to List the Objects with Stale Statistics Using dbms_stats.gather_schema_stats options=>'LIST STALE' (Doc ID 457666.1)

4:手工验证统计信息是否过旧

通过系统视图DBA_TAB_MODIFICATIONS来判断统计新是否过期。DBA_TAB_MODIFICATIONS使用来记录表的DML操作,依靠里面的信息确定统计信息是否陈旧。确定表是否需要统计分析的依据。系统后台调用DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO更新,当然也可以手工调用更新信息。它的原理也是数据变化量是否超过10%。

DBA_TAB_MODIFICATIONS describes modifications to all tables in the database that have been modified since the last time statistics were gathered on the tables. Its columns are the same as those in "ALL_TAB_MODIFICATIONS".

This view is populated only for tables with the MONITORING attribute. It is intended for statistics collection over a long period of time. For performance reasons, the Oracle Database does not populate this view immediately when the actual modifications occur. Run the FLUSH_DATABASE_MONITORING_INFO procedure in the DBMS_STATS PL/SQL package to populate this view with the latest information. The ANALYZE_ANY system privilege is required to run this procedure.

DBA_TAB_MODIFICATIONS
ALL_TAB_MODIFICATIONS
USER_TAB_MODIFICATIONS


INSERTS          --插入操作次数
UPDATES          --更新操作次数
DELETES          --删除操作次数
TRUNCATED	   --自从上次分析之后是否被TRUNCATED过。
SET LINESIZE 1080 PAGESIZE 60;
COL TABLE_OWNER FOR A12;
COL IS_PARTITION FOR A3;
COL NAME FOR A16;
ALTER SESSION SET NLS_DATE_FORMAT='YYYY-MM-DD HH24:MI:SS';
SELECT 	M.TABLE_OWNER,
		     'NO' AS IS_PARTITION,
		     M.TABLE_NAME AS NAME,		
		     M.INSERTS,
		     M.UPDATES,
		     M.DELETES,
		     M.TRUNCATED,
		     M.TIMESTAMP AS LAST_MODIFIED,		
		     ROUND((M.INSERTS+M.UPDATES+M.DELETES)*100/NULLIF(T.NUM_ROWS,0),2) AS EST_PCT_MODIFIED,
		     T.NUM_ROWS AS LAST_KNOWN_ROWS_NUMBER,
		     T.LAST_ANALYZED
FROM 	DBA_TAB_MODIFICATIONS M,
		DBA_TABLES T		
WHERE 	M.TABLE_OWNER=T.OWNER
AND		M.TABLE_NAME=T.TABLE_NAME
AND 	M.TABLE_OWNER NOT IN ('SYS','SYSTEM')
AND 	((M.INSERTS+M.UPDATES+M.DELETES)*100/NULLIF(T.NUM_ROWS,0) > 10 OR T.LAST_ANALYZED IS NULL)
UNION
SELECT 	M.TABLE_OWNER,
		    'YES' AS IS_PARTITION,		
		    M.PARTITION_NAME AS NAME,		
		    M.INSERTS,
		    M.UPDATES,
		    M.DELETES,
		    M.TRUNCATED,
		    M.TIMESTAMP AS LAST_MODIFIED,		
		    ROUND((M.INSERTS+M.UPDATES+M.DELETES)*100/NULLIF(P.NUM_ROWS,0),2) AS EST_PCT_MODIFIED,
		    P.NUM_ROWS AS LAST_KNOWN_ROWS_NUMBER,
		    P.LAST_ANALYZED		
FROM 	DBA_TAB_MODIFICATIONS M,		
		DBA_TAB_PARTITIONS P
WHERE 	M.TABLE_OWNER=P.TABLE_OWNER
AND		M.TABLE_NAME=P.TABLE_NAME
AND 	M.PARTITION_NAME = P.PARTITION_NAME
AND 	M.TABLE_OWNER NOT IN ('SYS','SYSTEM')
AND 	((M.INSERTS+M.UPDATES+M.DELETES)*100/NULLIF(P.NUM_ROWS,0) > 10 OR P.LAST_ANALYZED IS NULL)
ORDER BY 8 DESC;

当然你还可以通过其它方式手工验证统计信息是否过旧,例如检查表的行记录信息

检查表的行记录情况

SQL> SELECT OWNER, TABLE_NAME, OBJECT_TYPE, NUM_ROWS      
  2  FROM DBA_TAB_STATISTICS
  3  WHERE OWNER='SCOTT' AND TABLE_NAME='EMP';

OWNER                          TABLE_NAME                     OBJECT_TYPE    NUM_ROWS
------------------------------ ------------------------------ ------------ ----------
SCOTT                          EMP                            TABLE                14

SQL>  SELECT COUNT(*) FROM SCOTT.EMP;

  COUNT(*)
----------
        14

检查DBA_TAB_COL_STATISTICS中表的列的NUM_DISTINCT,LOW_VALUE,HIGH_VALUE,NUM_NULLS值,然后将其与实际值对比,从而判断统计信息
是否过期,不过这种方法虽然最准确,但是比较繁琐耗时,而且有些表的统计信息的采样比例可能不是100%,所以实际操作实施起来也比较麻烦。

SELECT COUNT(DISTINCT COLUMN_NAME), MIN(COLUMN_NAME), MAX(COLUMN_NAME) FROM OWNER_NAME.TABLE_NAME;

 

SELECT * FROM DBA_TAB_COL_STATISTICS WHERE OWNER='&OWNER' AND TABLE_NAME='&TABLE_NAME';