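WinDbg Debugging Series 4: Live Debugging

The previous post in this series showed how to use WinDbg to analyze a thread-blocking problem:

WinDbg Debugging Series 3: Thread Blocking

This post continues the series with live debugging: attaching WinDbg to a running process.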

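First, a word on when attaching WinDbg to a live process is appropriate and what to watch out for.

Use cases:

  • Integration test environments: after an exception occurs, inspect the exception together with the call stack and arguments of the thread that raised it;
  • Production environments: debug an exception for a short time to inspect its context and arguments, but do not keep the debugger attached for long.

Caveat: while the debugger holds the process at a break, the process cannot serve requests. New requests stall and front-end callers are affected, so weigh the trade-offs carefully. This is fine in development and test environments, but be cautious in production.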

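The routine for live debugging an attached process:

  1. Press F6 to attach to the process
  2. Load SOS: .loadby sos clr
  3. Enable breaking on CLR exceptions: sxe clr
  4. Resume, then catch and inspect the exception: g, then !pe
  5. Inspect the stack of the faulting thread with !clrstack, and its arguments and locals with !clrstack -a / !dso
  6. Disable breaking on CLR exceptions: sxd clr
  7. Quit and detach: qd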
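If you prefer the console debugger to the WinDbg GUI, steps 1 to 4 can be folded into a single command line. The sketch below only builds that command line; it assumes cdb.exe from the Debugging Tools for Windows is on the PATH, and the function name and the PID 1234 are made up for illustration. It relies on cdb's -p option (attach to a process ID) and -c option (run an initial command string).

```python
# A minimal sketch: build a cdb.exe command line that attaches to a process
# and runs steps 2-4 of the routine. Nothing is launched here.
# Assumption: cdb.exe is on the PATH; build_attach_command is a hypothetical helper.

STARTUP_COMMANDS = ".loadby sos clr; sxe clr; g"   # steps 2, 3 and 4
CLEANUP_COMMANDS = "sxd clr; qd"                   # steps 6 and 7, typed when done


def build_attach_command(pid, debugger="cdb.exe"):
    """Return an argv list: attach to `pid` (-p) and run the startup commands (-c)."""
    return [debugger, "-p", str(pid), "-c", STARTUP_COMMANDS]


print(build_attach_command(1234))
# prints: ['cdb.exe', '-p', '1234', '-c', '.loadby sos clr; sxe clr; g']
```

The list can be handed to subprocess.run on a machine that has the debugger installed. Once the debugger breaks on an exception you still type !pe, !clrstack and friends by hand, then finish with the cleanup commands.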

Hands-on walkthrough:

1. Press F6 to attach to the process

Open WinDbg, press F6, and select the process to debug. You can also type in the process ID.

2. Load SOS: .loadby sos clr

.loadby sos clr

3. Enable breaking on CLR exceptions: sxe clr

sxe clr

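4. Resume, then catch and inspect the exception: g, then !pe

g

Once the debugger breaks on an exception, !pe shows what was thrown. Here it is a SocketException whose message reads "An existing connection was forcibly closed by the remote host":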

0:032> !pe
Exception object: 000000d79aedb7a8
Exception type:   System.Net.Sockets.SocketException
Message:          远程主机强迫关闭了一个现有的连接。
InnerException:   <none>
StackTrace (generated):
<none>
StackTraceString: <none>
HResult: 80004005

The memory address of the exception object:

000000d79aedb7a8
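If you save debugger output to a log, the key fields of a !pe dump are easy to pull out with a few lines of script. This is only a sketch: the parse_pe helper is hypothetical, and the sample text is abridged from the output above (the localized Message line is left out).

```python
import re

# Only the single-line fields of a !pe dump are picked up; multi-line sections
# such as "StackTrace (generated):" are ignored on purpose.
PE_FIELD = re.compile(
    r"^(Exception object|Exception type|Message|InnerException|HResult):\s*(.*)$"
)


def parse_pe(output):
    """Return a dict of the single-line fields found in !pe output."""
    fields = {}
    for line in output.splitlines():
        match = PE_FIELD.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


# Abridged from the !pe output shown in this post.
SAMPLE_PE = """\
0:032> !pe
Exception object: 000000d79aedb7a8
Exception type:   System.Net.Sockets.SocketException
InnerException:   <none>
StackTrace (generated):
<none>
StackTraceString: <none>
HResult: 80004005
"""

info = parse_pe(SAMPLE_PE)
print("!do " + info["Exception object"])
# prints: !do 000000d79aedb7a8
```

The printed line is the command you would paste back into the debugger to dump the exception object itself.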

5. Inspect the stack of the faulting thread with !clrstack, and its arguments and locals with !clrstack -a / !dso

0:032> !clrstack
OS Thread Id: 0x15968 (32)
        Child SP               IP Call Site
000000d7b927ea88 00007ffe61c395fc [HelperMethodFrame: 000000d7b927ea88] 
000000d7b927eb70 00007ffe52e68f0f *** WARNING: Unable to verify checksum for C:\Windows\assembly\NativeImages_v4.0.30319_64\System\47e0be927382f169f5de470fab0ceb7d\System.ni.dll
System.Net.Sockets.NetworkStream.Read(Byte[], Int32, Int32) [f:\dd\NDP\fx\src\net\System\Net\Sockets\NetworkStream.cs @ 513]
000000d7b927ebe0 00007ffe53e59c04 *** WARNING: Unable to verify checksum for C:\Windows\assembly\NativeImages_v4.0.30319_64\mscorlib\5d0c037297cc1a64b52ce43b45c2ac2e\mscorlib.ni.dll
System.IO.BufferedStream.ReadByte() [f:\dd\ndp\clr\src\BCL\system\io\bufferedstream.cs @ 814]
000000d7b927ec10 00007ffe53e2183c System.IO.BinaryReader.ReadByte() [f:\dd\ndp\clr\src\BCL\system\io\binaryreader.cs @ 143]
000000d7b927ec40 00007ffdf6188eb5 RabbitMQ.Client.Impl.Frame.ReadFrom(RabbitMQ.Util.NetworkBinaryReader)
000000d7b927ecb0 00007ffdf6188e13 RabbitMQ.Client.Impl.SocketFrameHandler.ReadFrame()
000000d7b927ed10 00007ffdf6188cfc RabbitMQ.Client.Framing.Impl.Connection.MainLoopIteration()
000000d7b927ed50 00007ffdf6188a9f RabbitMQ.Client.Framing.Impl.Connection.MainLoop()
000000d7b927edb0 00007ffe53e34740 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 954]
000000d7b927ee80 00007ffe53e345d4 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 902]
000000d7b927eeb0 00007ffe53e345a2 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 891]
000000d7b927ef00 00007ffe53ebcf62 System.Threading.ThreadHelper.ThreadStart() [f:\dd\ndp\clr\src\BCL\system\threading\thread.cs @ 111]
000000d7b927f158 00007ffe55325a03 [GCFrame: 000000d7b927f158] 
000000d7b927f4a8 00007ffe55325a03 [DebuggerU2MCatchHandlerFrame: 000000d7b927f4a8] 

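From the output above, the exception was raised on thread 32. Its stack points to a RabbitMQ communication failure: the client connection's MainLoop thread hit the exception while reading a frame from the RabbitMQ server.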
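On a long stack it helps to skip the framework frames and jump straight to the first frame that belongs to application or third-party code. The sketch below does that for saved !clrstack text. The helper names are made up, the sample is abridged from the stack above (the WARNING path is shortened), and treating every "System." frame as framework code is just a rule of thumb.

```python
import re

# "<Child SP> <IP> <call site>" rows of !clrstack output.
FRAME = re.compile(r"^[0-9a-f]{16}\s+[0-9a-f]{16}\s+(.*)$")
# Trailing source location such as " [f:\dd\...\file.cs @ 513]".
SOURCE_SUFFIX = re.compile(r"\s+\[[A-Za-z]:\\.*\]\s*$")


def call_sites(clrstack_text):
    """Return managed call sites in stack order, without source locations."""
    sites = []
    pending = False  # True when a symbol warning pushed the call site to the next line
    for raw in clrstack_text.splitlines():
        line = raw.strip()
        match = FRAME.match(line)
        if match:
            rest = match.group(1)
            if rest.startswith("*** WARNING"):
                pending = True
                continue
        elif pending:
            rest = line
        else:
            continue
        pending = False
        if rest.startswith("["):  # special frames like [HelperMethodFrame: ...]
            continue
        sites.append(SOURCE_SUFFIX.sub("", rest))
    return sites


def first_non_framework(sites):
    """Return the first call site that does not start with 'System.', else None."""
    return next((s for s in sites if not s.startswith("System.")), None)


SAMPLE_STACK = r"""
OS Thread Id: 0x15968 (32)
        Child SP               IP Call Site
000000d7b927ea88 00007ffe61c395fc [HelperMethodFrame: 000000d7b927ea88]
000000d7b927eb70 00007ffe52e68f0f *** WARNING: Unable to verify checksum for System.ni.dll
System.Net.Sockets.NetworkStream.Read(Byte[], Int32, Int32) [f:\dd\NDP\fx\src\net\System\Net\Sockets\NetworkStream.cs @ 513]
000000d7b927ec10 00007ffe53e2183c System.IO.BinaryReader.ReadByte() [f:\dd\ndp\clr\src\BCL\system\io\binaryreader.cs @ 143]
000000d7b927ec40 00007ffdf6188eb5 RabbitMQ.Client.Impl.Frame.ReadFrom(RabbitMQ.Util.NetworkBinaryReader)
000000d7b927ecb0 00007ffdf6188e13 RabbitMQ.Client.Impl.SocketFrameHandler.ReadFrame()
"""

sites = call_sites(SAMPLE_STACK)
print(first_non_framework(sites))
# prints: RabbitMQ.Client.Impl.Frame.ReadFrom(RabbitMQ.Util.NetworkBinaryReader)
```

For this stack the answer is the RabbitMQ client's frame reader, which matches what reading the stack by eye tells us.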

Next, look at the variables on the thread's stack:

!clrstack -a
0:032> !clrstack -a
OS Thread Id: 0x15968 (32)
        Child SP               IP Call Site
000000d7b927ea88 00007ffe61c395fc [HelperMethodFrame: 000000d7b927ea88] 
000000d7b927eb70 00007ffe52e68f0f System.Net.Sockets.NetworkStream.Read(Byte[], Int32, Int32) [f:\dd\NDP\fx\src\net\System\Net\Sockets\NetworkStream.cs @ 513]
    PARAMETERS:
        this = <no data>
        buffer = <no data>
        offset = <no data>
        size = <no data>
    LOCALS:
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>

000000d7b927ebe0 00007ffe53e59c04 System.IO.BufferedStream.ReadByte() [f:\dd\ndp\clr\src\BCL\system\io\bufferedstream.cs @ 814]
    PARAMETERS:
        this (<CLR reg>) = 0x000000d79a3c2f18
    LOCALS:
        <no data>
        <no data>

000000d7b927ec10 00007ffe53e2183c System.IO.BinaryReader.ReadByte() [f:\dd\ndp\clr\src\BCL\system\io\binaryreader.cs @ 143]
    PARAMETERS:
        this = <no data>
    LOCALS:
        <no data>

000000d7b927ec40 00007ffdf6188eb5 RabbitMQ.Client.Impl.Frame.ReadFrom(RabbitMQ.Util.NetworkBinaryReader)
    PARAMETERS:
        reader (<CLR reg>) = 0x000000d79a3c2f70
    LOCALS:
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>
        <no data>

000000d7b927ecb0 00007ffdf6188e13 RabbitMQ.Client.Impl.SocketFrameHandler.ReadFrame()
    PARAMETERS:
        this = <no data>
    LOCALS:
        <no data>
        0x000000d7b927ece0 = 0x0000000000000000
        0x000000d7b927ecd8 = 0x000000d79a3c2f70

000000d7b927ed10 00007ffdf6188cfc RabbitMQ.Client.Framing.Impl.Connection.MainLoopIteration()
    PARAMETERS:
        this (<CLR reg>) = 0x000000d79a3c31b8
    LOCALS:
        <no data>
        <no data>

000000d7b927ed50 00007ffdf6188a9f RabbitMQ.Client.Framing.Impl.Connection.MainLoop()
    PARAMETERS:
        this (0x000000d7b927edb0) = 0x000000d79a3c31b8
    LOCALS:
        0x000000d7b927ed8c = 0x0000000000000000
        <no data>
        <no data>
        <no data>
        <no data>

000000d7b927edb0 00007ffe53e34740 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 954]
    PARAMETERS:
        executionContext = <no data>
        callback = <no data>
        state = <no data>
        preserveSyncCtx = <no data>
    LOCALS:
        <no data>
        <no data>
        <no data>

000000d7b927ee80 00007ffe53e345d4 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 902]
    PARAMETERS:
        executionContext = <no data>
        callback = <no data>
        state = <no data>
        preserveSyncCtx = <no data>

000000d7b927eeb0 00007ffe53e345a2 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 891]
    PARAMETERS:
        executionContext = <no data>
        callback = <no data>
        state = <no data>

000000d7b927ef00 00007ffe53ebcf62 System.Threading.ThreadHelper.ThreadStart() [f:\dd\ndp\clr\src\BCL\system\threading\thread.cs @ 111]
    PARAMETERS:
        this = <no data>

000000d7b927f158 00007ffe55325a03 [GCFrame: 000000d7b927f158] 
000000d7b927f4a8 00007ffe55325a03 [DebuggerU2MCatchHandlerFrame: 000000d7b927f4a8] 

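Some argument addresses are visible, but many show <no data>. So let's try another command, which lists every managed object referenced from this thread's stack: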

!dso

0:032> !dso
OS Thread Id: 0x15968 (32)
RSP/REG          Object           Name
000000D7B927E8D0 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927E948 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927E9A8 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927E9B0 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927E9E0 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927E9F0 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927EA78 000000d79aedb850 System.Text.StringBuilder
000000D7B927EAC0 000000d79aedb850 System.Text.StringBuilder
000000D7B927EAD0 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927EAD8 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927EB20 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927EB30 000000d79a3c2ed8 System.Net.Sockets.NetworkStream
000000D7B927EB60 000000d79a809eb0 System.Byte[]
000000D7B927EB70 000000d799fd1420 System.String    
000000D7B927EB78 000000d79aeda2e8 RabbitMQ.Client.Framing.BasicProperties
000000D7B927EB80 000000d79aeda568 System.Byte[]
000000D7B927EBB0 000000d79a017a90 System.Threading.ContextCallback
000000D7B927EBB8 000000d79a3c2f18 System.IO.BufferedStream
000000D7B927EBC0 000000d79a3c2f70 RabbitMQ.Util.NetworkBinaryReader
000000D7B927EBC8 000000d79a3c3ff0 System.Threading.ThreadHelper
000000D7B927EBF0 000000d79aeda0a8 System.String    HSF-ServiceState
000000D7B927EBF8 000000d799fd1420 System.String    
000000D7B927EC00 000000d79a3c2f70 RabbitMQ.Util.NetworkBinaryReader
000000D7B927EC10 000000d79aeda2e8 RabbitMQ.Client.Framing.BasicProperties
000000D7B927EC20 000000d79aeda2e8 RabbitMQ.Client.Framing.BasicProperties
000000D7B927EC28 000000d79a017a90 System.Threading.ContextCallback
000000D7B927EC30 000000d79a3c6a00 RabbitMQ.Client.Framing.Impl.Model
000000D7B927EC50 000000d79a3c3ff0 System.Threading.ThreadHelper
000000D7B927EC78 000000d79a017a90 System.Threading.ContextCallback
000000D7B927EC80 000000d79a3c2bc0 RabbitMQ.Client.Impl.SocketFrameHandler
000000D7B927EC88 000000d79a3c2f70 RabbitMQ.Util.NetworkBinaryReader
000000D7B927EC90 000000d79a3c3ff0 System.Threading.ThreadHelper
000000D7B927ECB0 000000d79a017a90 System.Threading.ContextCallback
000000D7B927ECC0 000000d79a3c6688 RabbitMQ.Client.Impl.Session
000000D7B927ECC8 000000d79aedb430 RabbitMQ.Client.Impl.Frame
000000D7B927ECD8 000000d79a3c2f70 RabbitMQ.Util.NetworkBinaryReader
000000D7B927ECF0 000000d79a3c31b8 RabbitMQ.Client.Framing.Impl.Connection
000000D7B927ED80 000000d79a017a90 System.Threading.ContextCallback
000000D7B927EDB0 000000d79a3c31b8 RabbitMQ.Client.Framing.Impl.Connection
000000D7B927EDF0 000000d79a3c3f90 System.Threading.Thread
000000D7B927EE48 000000d79a3c3ff0 System.Threading.ThreadHelper
000000D7B927EE50 000000d79a3c4150 System.Threading.ExecutionContext
000000D7B927EE58 000000d79a017a90 System.Threading.ContextCallback
000000D7B927EEE8 000000d79a3c3ff0 System.Threading.ThreadHelper
000000D7B927EEF0 000000d79a3c4150 System.Threading.ExecutionContext
000000D7B927F100 000000d79a3c4018 System.Threading.ThreadStart
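The same object shows up at many stack slots in the !dso listing, so the list of distinct objects is much shorter than it looks. A small sketch for collapsing saved !dso output into one entry per object follows; the unique_objects helper is hypothetical and the sample is a handful of rows taken from the listing above.

```python
import re

# "<stack slot> <object address> <type name> [string value]" rows of !dso output.
DSO_ROW = re.compile(r"^([0-9A-Fa-f]{16})\s+([0-9A-Fa-f]{16})\s+(\S+)")


def unique_objects(dso_output):
    """Return (address, type name) pairs, first occurrence only, in listing order."""
    seen = {}
    for line in dso_output.splitlines():
        match = DSO_ROW.match(line.strip())
        if match:
            _slot, address, type_name = match.groups()
            seen.setdefault(address, type_name)  # keep the first sighting
    return list(seen.items())


# A few rows copied from the !dso output in this post.
SAMPLE_DSO = """\
OS Thread Id: 0x15968 (32)
RSP/REG          Object           Name
000000D7B927E8D0 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927E948 000000d79aedb7a8 System.Net.Sockets.SocketException
000000D7B927EA78 000000d79aedb850 System.Text.StringBuilder
000000D7B927ECC0 000000d79a3c6688 RabbitMQ.Client.Impl.Session
000000D7B927ECF0 000000d79a3c31b8 RabbitMQ.Client.Framing.Impl.Connection
000000D7B927EDB0 000000d79a3c31b8 RabbitMQ.Client.Framing.Impl.Connection
"""

objects = unique_objects(SAMPLE_DSO)
for address, type_name in objects:
    print("!do " + address + "   (" + type_name + ")")
```

Each printed line starts with a !do command for one distinct object, so the six sample rows shrink to four commands.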

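This gives us the memory addresses of all objects on the thread's stack. To inspect each one, use the familiar !do objectAddress; if the MEX extension is loaded, the more convenient !do2 is also available: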

 

0:032> !do2 000000d79a3c6688 
0x000000d79a3c6688 RabbitMQ.Client.Impl.Session
  0000  _shutdownLock                    : 000000d79a3c66d0 (System.Object)
  0008  _sessionShutdown                 : 000000d79a3c6cf8 (System.EventHandler<RabbitMQ.Client.ShutdownEventArgs>)
  0010  <CloseReason>k__BackingField     : NULL
  0018  <CommandReceived>k__BackingField : 000000d79a3c6c50 (System.Action<RabbitMQ.Client.Impl.ISession,RabbitMQ.Client.Impl.Command>)
  0020  <Connection>k__BackingField      : 000000d79a3c31b8 (RabbitMQ.Client.Framing.Impl.Connection)
  0028  <ChannelNumber>k__BackingField   : 1 (System.Int32)
  0030  m_assembler                      : 000000d79a3c6790 (RabbitMQ.Client.Impl.CommandAssembler)
0:032> !do 000000d79a3c6688 
Name:        RabbitMQ.Client.Impl.Session
MethodTable: 00007ffdf6257888
EEClass:     00007ffdf6264ce8
Size:        72(0x48) bytes
File:        C:\TeldApp\HSF\HSF.Host-BillCM-9086\RabbitMQ.Client.dll
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ffe53fd6fd8  4000283        8        System.Object  0 instance 000000d79a3c66d0 _shutdownLock
00007ffdf6250760  4000284       10 ...RabbitMQ.Client]]  0 instance 000000d79a3c6cf8 _sessionShutdown
00007ffe53fd9428  4000285       30         System.Int32  1 instance                1 <ChannelNumber>k__BackingField
00007ffdf62506c0  4000286       18 ...ShutdownEventArgs  0 instance 0000000000000000 <CloseReason>k__BackingField
00007ffdf6272e78  4000287       20 ...RabbitMQ.Client]]  0 instance 000000d79a3c6c50 <CommandReceived>k__BackingField
00007ffdf62549a0  4000288       28 ...g.Impl.Connection  0 instance 000000d79a3c31b8 <Connection>k__BackingField
00007ffdf62713b0  4000289       38 ....CommandAssembler  0 instance 000000d79a3c6790 m_assembler
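To decide which object to dump next, it is handy to list the non-null reference fields of the object just dumped. Here is a rough sketch that reads the Fields table of saved !do output; the reference_fields helper is hypothetical, and it reads columns from the right-hand side because the Type column can be blank or contain spaces.

```python
import re

# A field row starts with a 16-digit MethodTable, then the field token and offset.
FIELD_ROW = re.compile(r"^[0-9a-f]{16}\s+[0-9a-f]+\s+[0-9a-f]+\s")


def reference_fields(do_output):
    """Map field name -> object address for non-null instance reference fields."""
    refs = {}
    for line in do_output.splitlines():
        line = line.strip()
        if not FIELD_ROW.match(line):
            continue
        # Last four columns are VT, Attr, Value, Name.
        vt, attr, value, name = line.split()[-4:]
        # VT == 0 marks a reference type; a zero value is a null reference.
        if attr == "instance" and vt == "0" and int(value, 16) != 0:
            refs[name] = value
    return refs


# Field rows copied from the !do output of the Session object in this post.
SAMPLE_DO = """\
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ffe53fd6fd8  4000283        8        System.Object  0 instance 000000d79a3c66d0 _shutdownLock
00007ffdf6250760  4000284       10 ...RabbitMQ.Client]]  0 instance 000000d79a3c6cf8 _sessionShutdown
00007ffe53fd9428  4000285       30         System.Int32  1 instance                1 <ChannelNumber>k__BackingField
00007ffdf62506c0  4000286       18 ...ShutdownEventArgs  0 instance 0000000000000000 <CloseReason>k__BackingField
00007ffdf6272e78  4000287       20 ...RabbitMQ.Client]]  0 instance 000000d79a3c6c50 <CommandReceived>k__BackingField
00007ffdf62549a0  4000288       28 ...g.Impl.Connection  0 instance 000000d79a3c31b8 <Connection>k__BackingField
00007ffdf62713b0  4000289       38 ....CommandAssembler  0 instance 000000d79a3c6790 m_assembler
"""

refs = reference_fields(SAMPLE_DO)
print("!do " + refs["<Connection>k__BackingField"])
# prints: !do 000000d79a3c31b8
```

The value-type field (ChannelNumber) and the null reference (CloseReason) are dropped, which leaves five addresses worth following.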

Next, look at the Connection object at 000000d79a3c31b8:

0:032> !do 000000d79a3c31b8 
Name:        RabbitMQ.Client.Framing.Impl.Connection
MethodTable: 00007ffdf62549a0
EEClass:     00007ffdf6260b78
Size:        232(0xe8) bytes
File:        C:\TeldApp\HSF\HSF.Host-BillCM-9086\RabbitMQ.Client.dll
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ffe53fd6fd8  4000247        8        System.Object  0 instance 000000d79a3c32a0 m_eventLock
00007ffdf6257c48  4000248       10 ...Client.Impl.Frame  0 instance 000000d79a3c32b8 m_heartbeatFrame
00007ffe53fe7378  4000249       18 ....ManualResetEvent  0 instance 000000d79a3c32f8 m_appContinuation
0000000000000000  400024a       20                       0 instance 0000000000000000 m_callbackException
00007ffe53954820  400024b       28 ...bject, mscorlib]]  0 instance 000000d79a3c51d8 m_clientProperties
00007ffdf62506c0  400024c       30 ...ShutdownEventArgs  0 instance 0000000000000000 m_closeReason
00007ffe53ff9a38  400024d       c2       System.Boolean  1 instance                0 m_closed
0000000000000000  400024e       38                       0 instance 0000000000000000 m_connectionBlocked
00007ffdf6250760  400024f       40 ...RabbitMQ.Client]]  0 instance 000000d79a3c6750 m_connectionShutdown
00007ffe53953ff0  4000250       48 ...tArgs, mscorlib]]  0 instance 0000000000000000 m_connectionUnblocked
00007ffdf622f7d0  4000251       50 ...ConnectionFactory  0 instance 000000d79a3c2698 m_factory
00007ffdf6255cb8  4000252       58 ...mpl.IFrameHandler  0 instance 000000d79a3c2bc0 m_frameHandler
00007ffe53ff40a0  4000253       c8          System.Guid  1 instance 000000d79a3c3280 m_id
00007ffdf625a5d0  4000254       60 ...nt.Impl.ModelBase  0 instance 000000d79a3c3860 m_model0
00007ffe53ff9a38  4000255       c3       System.Boolean  1 instance                1 m_running
00007ffdf6257a48  4000256       68 ....Impl.MainSession  0 instance 000000d79a3c35f0 m_session0
00007ffdf6259050  4000257       70 ...pl.SessionManager  0 instance 000000d79a3c6018 m_sessionManager
00007ffdf6258068  4000258       78 ...RabbitMQ.Client]]  0 instance 000000d79a3c3370 m_shutdownReport
00007ffe53fd5570  4000259       c0        System.UInt16  1 instance               60 m_heartbeat
00007ffe53ff3f28  400025a       d8      System.TimeSpan  1 instance 000000d79a3c3290 m_heartbeatTimeSpan
00007ffe53fd9428  400025b       b8         System.Int32  1 instance                0 m_missedHeartbeats
00007ffe53ff4248  400025c       80 ...m.Threading.Timer  0 instance 000000d79a3c61f0 _heartbeatWriteTimer
00007ffe53ff4248  400025d       88 ...m.Threading.Timer  0 instance 000000d79a3c6428 _heartbeatReadTimer
00007ffe53ff4de8  400025e       90 ...ng.AutoResetEvent  0 instance 000000d79a3c33a8 m_heartbeatRead
00007ffe53ff4de8  400025f       98 ...ng.AutoResetEvent  0 instance 000000d79a3c33f8 m_heartbeatWrite
00007ffe53ff9a38  4000260       c4       System.Boolean  1 instance                0 m_inConnectionNegotiation
00007ffdf6258e90  4000261       a0 ...nsumerWorkService  0 instance 000000d79a3c3448 <ConsumerWorkService>k__BackingField
00007ffe53ff21f0  4000262       bc        System.UInt32  1 instance           131072 <FrameMax>k__BackingField
00007ffdf625ace0  4000263       a8 ...AmqpTcpEndpoint[]  0 instance 0000000000000000 <KnownHosts>k__BackingField
00007ffe53954820  4000264       b0 ...bject, mscorlib]]  0 instance 000000d79a3c5468 <ServerProperties>k__BackingField
0:032> !do2 000000d79a3c31b8 
0x000000d79a3c31b8 RabbitMQ.Client.Framing.Impl.Connection
  0000  m_eventLock                          : 000000d79a3c32a0 (System.Object)
  0008  m_heartbeatFrame                     : 000000d79a3c32b8 (RabbitMQ.Client.Impl.Frame)
  0010  m_appContinuation                    : 000000d79a3c32f8 (System.Threading.ManualResetEvent)
  0018  m_callbackException                  : NULL
  0020  m_clientProperties                   : 000000d79a3c51d8 (System.Collections.Generic.Dictionary<System.String,System.Object>)
  0028  m_closeReason                        : NULL
  0030  m_connectionBlocked                  : NULL
  0038  m_connectionShutdown                 : 000000d79a3c6750 (System.EventHandler<RabbitMQ.Client.ShutdownEventArgs>)
  0040  m_connectionUnblocked                : NULL
  0048  m_factory                            : 000000d79a3c2698 (RabbitMQ.Client.ConnectionFactory)
  0050  m_frameHandler                       : 000000d79a3c2bc0 (RabbitMQ.Client.Impl.SocketFrameHandler)
  0058  m_model0                             : 000000d79a3c3860 (RabbitMQ.Client.Framing.Impl.Model)
  0060  m_session0                           : 000000d79a3c35f0 (RabbitMQ.Client.Impl.MainSession)
  0068  m_sessionManager                     : 000000d79a3c6018 (RabbitMQ.Client.Impl.SessionManager)
  0070  m_shutdownReport                     : 000000d79a3c3370 (RabbitMQ.Util.SynchronizedList<RabbitMQ.Client.ShutdownReportEntry>)
  0078  _heartbeatWriteTimer                 : 000000d79a3c61f0 (System.Threading.Timer)
  0080  _heartbeatReadTimer                  : 000000d79a3c6428 (System.Threading.Timer)
  0088  m_heartbeatRead                      : 000000d79a3c33a8 (System.Threading.AutoResetEvent)
  0090  m_heartbeatWrite                     : 000000d79a3c33f8 (System.Threading.AutoResetEvent)
  0098  <ConsumerWorkService>k__BackingField : 000000d79a3c3448 (RabbitMQ.Client.ConsumerWorkService)
  00a0  <KnownHosts>k__BackingField          : NULL
  00a8  <ServerProperties>k__BackingField    : 000000d79a3c5468 (System.Collections.Generic.Dictionary<System.String,System.Object>)
  00b0  m_missedHeartbeats                   : 0 (System.Int32)
  00b4  <FrameMax>k__BackingField            : 131072 (System.UInt32)
  00b8  m_heartbeat                          : 60 (System.UInt16)
  00ba  m_closed                             : False (System.Boolean)
  00bb  m_running                            : True (System.Boolean)
  00bc  m_inConnectionNegotiation            : False (System.Boolean)
  00c0  m_id                                 : 000000d79a3c3280 1c8d8e96-0c6a-4c62-84b7-9feb241b67ad (System.Guid)
  00d0  m_heartbeatTimeSpan                  : 000000d79a3c3290 00:00:15 (System.TimeSpan)

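The other objects can be inspected the same way to help analyze the problem.

Once you have a rough idea of the cause, disable exception breaking and exit the debugger. Be careful: if you simply close WinDbg, the attached process is terminated along with it. Use the following commands instead:

6. Disable breaking on CLR exceptions: sxd clr
7. Quit and detach: qd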

sxd  clr
qd

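That wraps up live debugging of an attached process. To recap the routine:

  1. Press F6 to attach to the process
  2. Load SOS: .loadby sos clr
  3. Enable breaking on CLR exceptions: sxe clr
  4. Resume, then catch and inspect the exception: g, then !pe
  5. Inspect the stack of the faulting thread with !clrstack, and its arguments and locals with !clrstack -a / !dso
  6. Disable breaking on CLR exceptions: sxd clr
  7. Quit and detach: qd

Hope you find it useful.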

 

周国庆

2018/11/4

posted @ 2018-11-05 07:03  Eric zhou