代码改变世界

SQL Server一个特殊的阻塞案例分析2

2019-05-10 11:08  潇湘隐者  阅读(939)  评论(2编辑  收藏  举报

最近发现一个非常奇怪的阻塞问题,如下截图所示(来自监控工具DPA),会话583被会话1036阻塞,而且阻塞发生在tempdb,被阻塞的SQL如下截图所示,会话等待类型为LCK_M_S

 

 

clip_image001

 

 

 

 

clip_image002

 

因为DPA工具不好截取全部信息,使用下面SQL语句获取了阻塞会话的详细信息如下,来自Microsoft SQL Server Management Studio - Transact-SQL IntelliSense的的SQL被阻塞了

 

SELECT  s.session_id ,
        r.status ,
        r.blocking_session_id 'Blk by' ,
        r.wait_type ,
        wait_resource ,
        r.wait_time / ( 1000 * 60 ) 'Wait M' ,
        r.cpu_time ,
        r.logical_reads ,
        r.reads ,
        r.writes ,
        r.total_elapsed_time / ( 1000 * 60 ) 'Elaps M' ,
        SUBSTRING(st.text, ( r.statement_start_offset / 2 ) + 1,
                  ( ( CASE r.statement_end_offset
                        WHEN -1 THEN DATALENGTH(st.text)
                        ELSE r.statement_end_offset
                      END - r.statement_start_offset ) / 2 ) + 1) AS statement_text ,
        COALESCE(QUOTENAME(DB_NAME(st.dbid)) + N'.'
                 + QUOTENAME(OBJECT_SCHEMA_NAME(st.objectid, st.dbid)) + N'.'
                 + QUOTENAME(OBJECT_NAME(st.objectid, st.dbid)), '') AS command_text ,
        r.command ,
        s.login_name ,
        s.host_name ,
        s.program_name ,
        s.last_request_end_time ,
        s.login_time ,
        r.open_transaction_count
FROM    sys.dm_exec_sessions AS s
        JOIN sys.dm_exec_requests AS r ON r.session_id = s.session_id
        CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
WHERE r.session_id =583
ORDER BY r.cpu_time DESC;

 

clip_image001[4]

 

 

而会话1036处于睡眠(sleeping)状态,有一个Open的事务。通过wait_resource的值KEY: 2:1125899909070848 (79c68d10aa42)我们定位到是一个系统表sysschobj.它是sys.triggers这个系统视图的基表。如下所示:

 

 

clip_image004

 

 

 

clip_image005

 

 

SET QUOTED_IDENTIFIER ON
SET ANSI_NULLS ON
GO
CREATE VIEW sys.triggers AS
    SELECT o.name,
        object_id = o.id,
        parent_class = o.pclass,
        parent_class_desc = pc.name,
        parent_id = o.pid,
        type = o.type,
        type_desc = n.name,
        create_date = o.created,
        modify_date = o.modified,
        is_ms_shipped = sysconv(bit, o.status & 1),         -- OBJALL_MSSHIPPED
        is_disabled = sysconv(bit, o.status & 256),         -- OBJTRG_DISABLED
        is_not_for_replication = sysconv(bit, o.status & 512), -- OBJTRG_NOTFORREPL
        is_instead_of_trigger = sysconv(bit, o.status & 1024)     -- OBJTRG_INSTEADOF
    FROM sys.sysschobjs o
    LEFT JOIN sys.syspalnames n ON n.class = 'OBTY' AND n.value = o.type
    LEFT JOIN sys.syspalvalues pc ON pc.class = 'UNCL' AND pc.value = o.pclass
    WHERE o.type IN ('TA','TR') AND o.pclass <> 100
        AND has_access('TR', o.id, o.pid, o.nsclass) = 1
 
GO

 

 

 

使用下面脚本可以查到对应的锁信息(其实,正确的做法应该是加上条件过滤spid),从而可以判断,会话1036可能因为事务中,间接涉及对基表sysschobj的DML操作(有可能是衍生的系统视图),而由于事务没有提交也没有释放,导致在基表上的锁没有释放,从而导致出现这样一个阻塞。个人也很好奇是什么样的SQL会导致这样一个阻塞出现,不过这种阻塞源头是Sleeping状态的会话,已经无法定位阻塞源头SQL,只能通过Profile去跟踪捕获。但是这种阻塞出现时机和场景不清楚,很难一下子捕获到。

 

 
USE tempdb
GO
SELECT  request_session_id AS spid ,
        DB_NAME(resource_database_id) AS dbname ,
        CASE WHEN resource_type = 'OBJECT'
             THEN OBJECT_NAME(resource_associated_entity_id)
             WHEN resource_associated_entity_id = 0 THEN 'n/a'
             ELSE OBJECT_NAME(p.object_id)
        END AS entity_name ,
        index_id ,
        resource_type AS resource ,
        resource_description AS description ,
        request_mode AS mode ,
        request_status AS status
FROM    sys.dm_tran_locks t
LEFT JOIN sys.partitions p
ON      p.partition_id = t.resource_associated_entity_id
WHERE   resource_database_id = DB_ID()

 

 

clip_image006

 

 

今天早上有发现同样的阻塞,猜测是SQL里面有创建临时表或对临时表做DDL或DML操作的迹象。不过依然没有找到源头SQL,只能静候下次机会。找出罪魁祸首。不过让我头痛的是,对于这种特殊阻塞情况,即使是使用Profile跟踪也是相当麻烦,因为不知道它什么时间点出现、出现在那个会话。

 

SELECT  tl.resource_type AS [Resource Type] ,
        DB_NAME(tl.resource_database_id) AS [DB Name] ,
        CASE tl.resource_type
          WHEN 'OBJECT'
          THEN OBJECT_NAME(tl.resource_associated_entity_id,
                           tl.resource_database_id)
          WHEN 'DATABASE' THEN 'DB'
          ELSE CASE WHEN tl.resource_database_id = DB_ID()
                    THEN ( SELECT   OBJECT_NAME(object_id,
                                                tl.resource_database_id)
                           FROM     sys.partitions
                           WHERE    hobt_id = tl.resource_associated_entity_id
                         )
                    ELSE '(Run under DB context)'
               END
        END AS [Object] ,
        tl.resource_description AS [Resource] ,
        tl.request_session_id AS [Session] ,
        tl.request_mode AS [Mode] ,
        tl.request_status AS [Status] ,
        wt.wait_duration_ms AS [Wait (ms)] ,
        qi.sql ,
        qi.query_plan
FROM    sys.dm_tran_locks tl WITH ( NOLOCK )
LEFT OUTER JOIN sys.dm_os_waiting_tasks wt WITH ( NOLOCK )
ON      tl.lock_owner_address = wt.resource_address
        AND tl.request_status = 'WAIT'
OUTER APPLY ( SELECT    SUBSTRING(s.text,
                                  ( er.statement_start_offset / 2 ) + 1,
                                  ( ( CASE er.statement_end_offset
                                        WHEN -1 THEN DATALENGTH(s.text)
                                        ELSE er.statement_end_offset
                                      END - er.statement_start_offset ) / 2 )
                                  + 1) AS sql ,
                        qp.query_plan
              FROM      sys.dm_exec_requests er WITH ( NOLOCK )
              CROSS APPLY sys.dm_exec_sql_text(er.sql_handle) s
              OUTER APPLY sys.dm_exec_query_plan(er.plan_handle) qp
              WHERE     tl.request_session_id = er.session_id
            ) qi
WHERE   tl.request_session_id =134

 

 

clip_image007

 

链接资料

一个特殊的SQL Server阻塞案例分析

SQL Server一个特殊的阻塞案例分析2