nova-novncproxy leaves zombie processes, stops serving requests, and stops writing to its log file

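What nova-novncproxy does

nova-novncproxy provides console access to a VM, much like IPMI access to a physical machine. Its advantage is that even when the user's business network is down, the user can still reach the VM over VNC. This makes nova-novncproxy a core Nova component.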

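Problem description

While running nova-novncproxy (Kilo 2015.1.3, websockify 0.6.0) we ran into a tricky problem:
Every week or two, nova-novncproxy accumulates many zombie processes and stops serving requests. The log file /var/log/nova/nova-novncproxy.log shows that the most recent entry is several days old. Restarting nova-novncproxy restores service immediately.
The process tree looks like this: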

 26327 ?        S      0:05  \_ /usr/bin/python /usr/bin/nova-novncproxy --config-file=/etc/nova/nova.conf
 4765 ?        Z      0:00      \_ [nova-novncproxy] <defunct>
 4766 ?        Z      0:00      \_ [nova-novncproxy] <defunct>
 4767 ?        Z      0:00      \_ [nova-novncproxy] <defunct>
 4768 ?        Z      0:00      \_ [nova-novncproxy] <defunct>
 4769 ?        Z      0:00      \_ [nova-novncproxy] <defunct>

Questions:
1. Why are there zombie processes?
2. Where is nova-novncproxy stuck?

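Analysis

The first thing I wanted to know was where exactly nova-novncproxy was blocked.
gdb can attach to the running process, for example: gdb -p 26327. The detailed procedure is described at: http://www.cnblogs.com/dkblog/p/3806277.html
The key steps from that article:
1) Make sure your gdb version is >= 7.
2) Install the python-debuginfo package (for example python-debuginfo-2.6.6-29.el6_2.2.x86_64.rpm). Its version must match the Python you are running exactly; rpm -qa|grep python shows the detailed version of the installed Python. Packages can be found at http://debuginfo.centos.org/6/x86_64/
3) You can then debug with # gdb -p <pid>.

After attaching to the process, enter bt to print the process stack: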

#0  0x00007fbd4bc703f3 in __select_nocancel () at /lib64/libc.so.6
#1  0x00007fbd4461e070 in time_sleep (secs=<optimized out>) at /usr/src/debug/Python-2.7.5/Modules/timemodule.c:948
#2  0x00007fbd4461e070 in time_sleep (self=<optimized out>, args=<optimized out>) at /usr/src/debug/Python-2.7.5/Modules/timemodule.c:206
#3  0x00007fbd4c949aa4 in PyEval_EvalFrameEx (oparg=<optimized out>, pp_stack=0x7ffe0f6a11e0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4098
#4  0x00007fbd4c949aa4 in PyEval_EvalFrameEx (f=f@entry=Frame 0x291c560, for file /usr/lib/python2.7/site-packages/eventlet/hubs/poll.py, line 82, in wait (self=<Hub(next_timers=[], clock=<built-in function time>, debug_exceptions=True, debug_blocking_resolution=1, modify=<built-in method modify of select.epoll object at remote 0x7fbd4cdbfc48>, running=True, debug_blocking=False, listeners={'read': {}, 'write': {}}, timers_canceled=0, greenlet=<greenlet.greenlet at remote 0x285a4b0>, closed=[], stopping=False, timers=[], poll=<select.epoll at remote 0x7fbd4cdbfc48>, secondaries={'read': {}, 'write': {}}, lclass=<type at remote 0x2922410>) at remote 0x28e1f90>, seconds=<float at remote 0x236f158>, readers={...}, writers={...}), throwflag=throwflag@entry=0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:2740
#5  0x00007fbd4c94b0bd in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=argcount@entry=2, kws=0x291ebd0, kwcount=0, defs=0x28e1f68, defcount=1, closure=closure@entry=0x0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:3330
#6  0x00007fbd4c94976f in PyEval_EvalFrameEx (nk=<optimized out>, na=2, n=2, pp_stack=0x7ffe0f6a13e0, func=<function at remote 0x28eac80>) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4194
#7  0x00007fbd4c94976f in PyEval_EvalFrameEx (oparg=<optimized out>, pp_stack=0x7ffe0f6a13e0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4119
#8  0x00007fbd4c94976f in PyEval_EvalFrameEx (f=f@entry=Frame 0x291ea20, for file /usr/lib/python2.7/site-packages/eventlet/hubs/hub.py, line 346, in run (self=<Hub(next_timers=[], clock=<built-in function time>, debug_exceptions=True, debug_blocking_resolution=1, modify=<built-in method modify of select.epoll object at remote 0x7fbd4cdbfc48>, running=True, debug_blocking=False, listeners={'read': {}, 'write': {}}, timers_canceled=0, greenlet=<greenlet.greenlet at remote 0x285a4b0>, closed=[], stopping=False, timers=[], poll=<select.epoll at remote 0x7fbd4cdbfc48>, secondaries={'read': {}, 'write': {}}, lclass=<type at remote 0x2922410>) at remote 0x28e1f90>, a=(), kw={}, wakeup_when=None, sleep_time=<float at remote 0x236f158>), throwflag=throwflag@entry=0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:2740
#9  0x00007fbd4c94b0bd in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, args=args@entry=0x28db928, argcount=1, kws=kws@entry=0x0, kwcount=kwcount@entry=0, defs=defs@entry=0x0, defcount=defcount@entry=0, closure=0x0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:3330
#10 0x00007fbd4c8d7f68 in function_call (func=<function at remote 0x28ea0c8>, arg=(<Hub(next_timers=[], clock=<built-in function time>, debug_exceptions=True, debug_blocking_resolution=1, modify=<built-in method modify of select.epoll object at remote 0x7fbd4cdbfc48>, running=True, debug_blocking=False, listeners={'read': {}, 'write': {}}, timers_canceled=0, greenlet=<greenlet.greenlet at remote 0x285a4b0>, closed=[], stopping=False, timers=[], poll=<select.epoll at remote 0x7fbd4cdbfc48>, secondaries={'read': {}, 'write': {}}, lclass=<type at remote 0x2922410>) at remote 0x28e1f90>,), kw=0x0) at /usr/src/debug/Python-2.7.5/Objects/funcobject.c:526
#11 0x00007fbd4c8b30b3 in PyObject_Call (func=func@entry=<function at remote 0x28ea0c8>, arg=arg@entry=(<Hub(next_timers=[], clock=<built-in function time>, debug_exceptions=True, debug_blocking_resolution=1, modify=<built-in method modify of select.epoll object at remote 0x7fbd4cdbfc48>, running=True, debug_blocking=False, listeners={'read': {}, 'write': {}}, timers_canceled=0, greenlet=<greenlet.greenlet at remote 0x285a4b0>, closed=[], stopping=False, timers=[], poll=<select.epoll at remote 0x7fbd4cdbfc48>, secondaries={'read': {}, 'write': {}}, lclass=<type at remote 0x2922410>) at remote 0x28e1f90>,), kw=kw@entry=0x0) at /usr/src/debug/Python-2.7.5/Objects/abstract.c:2529
#12 0x00007fbd4c8c20a5 in instancemethod_call (func=<function at remote 0x28ea0c8>, arg=(<Hub(next_timers=[], clock=<built-in function time>, debug_exceptions=True, debug_blocking_resolution=1, modify=<built-in method modify of select.epoll object at remote 0x7fbd4cdbfc48>, running=True, debug_blocking=False, listeners={'read': {}, 'write': {}}, timers_canceled=0, greenlet=<greenlet.greenlet at remote 0x285a4b0>, closed=[], stopping=False, timers=[], poll=<select.epoll at remote 0x7fbd4cdbfc48>, secondaries={'read': {}, 'write': {}}, lclass=<type at remote 0x2922410>) at remote 0x28e1f90>,), kw=0x0) at /usr/src/debug/Python-2.7.5/Objects/classobject.c:2602
#13 0x00007fbd4c8b30b3 in PyObject_Call (func=func@entry=<instancemethod at remote 0x26d4eb0>, arg=arg@entry=(), kw=<optimized out>) at /usr/src/debug/Python-2.7.5/Objects/abstract.c:2529
#14 0x00007fbd4c944f07 in PyEval_CallObjectWithKeywords (func=<instancemethod at remote 0x26d4eb0>, arg=(), kw=<optimized out>) at /usr/src/debug/Python-2.7.5/Python/ceval.c:3967
#15 0x00007fbd44825a9c in g_initialstub () at /usr/lib64/python2.7/site-packages/greenlet.so
#16 0x00007fbd448253e6 in g_switch () at /usr/lib64/python2.7/site-packages/greenlet.so
#17 0x0000000000000003 in  ()
#18 0x00000000028e5b08 in  ()
#19 0x00007fbd4c94b0bd in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, args=args@entry=0x28e5b08, argcount=3, kws=kws@entry=0x7fbd4cdff068, kwcount=kwcount@entry=0, defs=defs@entry=0x0, defcount=defcount@entry=0, closure=0x0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:3330
#20 0x00007fbd4c8d805d in function_call (func=<function at remote 0x7fbd411126e0>, arg=(<KeywordArgumentAdapter(logger=<Logger(name='nova.console.websocketproxy', parent=<RootLogger(name='root', parent=None, handlers=[<WatchedFileHandler(stream=<file at remote 0x2867300>, level=0, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<Semaphore(counter=0, _waiters=<collections.deque at remote 0x28dc670>) at remote 0x28db850>, _RLock__count=0) at remote 0x28db810>, encoding=None, dev=64769L, _name=None, ino=3221237281, baseFilename='/var/log/nova/nova-novncproxy.log', mode='a', filters=[], formatter=<ContextFormatter(project='nova', datefmt='%Y-%m-%d %H:%M:%S', version='unknown', _fmt='%(asctime)s.%(msecs)03d %(process)d %(levelname)s %(name)s [-] %(instance)s%(message)s %(funcName)s %(pathname)s:%(lineno)d', conf=<ConfigOpts(_groups={'remote_debug': <OptGroup(title='remote_debug options', _opts={'host': {'opt': <StrOpt(deprecated_for_removal=False, short=None, name='host', dest='host', required=False, _logged_deprecation=False, sample_default=None, deprecated_opts=[], position...(truncated), kw={}) at /usr/src/debug/Python-2.7.5/Objects/funcobject.c:526
#21 0x00007fbd4c8b30b3 in PyObject_Call (func=func@entry=<function at remote 0x7fbd411126e0>, arg=arg@entry=(<KeywordArgumentAdapter(logger=<Logger(name='nova.console.websocketproxy', parent=<RootLogger(name='root', parent=None, handlers=[<WatchedFileHandler(stream=<file at remote 0x2867300>, level=0, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<Semaphore(counter=0, _waiters=<collections.deque at remote 0x28dc670>) at remote 0x28db850>, _RLock__count=0) at remote 0x28db810>, encoding=None, dev=64769L, _name=None, ino=3221237281, baseFilename='/var/log/nova/nova-novncproxy.log', mode='a', filters=[], formatter=<ContextFormatter(project='nova', datefmt='%Y-%m-%d %H:%M:%S', version='unknown', _fmt='%(asctime)s.%(msecs)03d %(process)d %(levelname)s %(name)s [-] %(instance)s%(message)s %(funcName)s %(pathname)s:%(lineno)d', conf=<ConfigOpts(_groups={'remote_debug': <OptGroup(title='remote_debug options', _opts={'host': {'opt': <StrOpt(deprecated_for_removal=False, short=None, name='host', dest='host', required=False, _logged_deprecation=False, sample_default=None, deprecated_opts=[], position...(truncated), kw=kw@entry={}) at /usr/src/debug/Python-2.7.5/Objects/abstract.c:2529
#22 0x00007fbd4c9462f7 in PyEval_EvalFrameEx (nk=<optimized out>, na=<optimized out>, flags=<optimized out>, pp_stack=0x7ffe0f6a1b90, func=<function at remote 0x7fbd411126e0>) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4411
*#23 0x00007fbd4c9462f7 in PyEval_EvalFrameEx (f=f@entry=Frame 0x292b920, for file /usr/lib/python2.7/site-packages/websockify/websocket.py, line 828, in vmsg (self=<NovaWebSocketProxy(wrap_times=[0, 0, 0], target_cfg=None, verbose=False, listen_host='0.0.0.0', tcp_keepintvl=None, ws_connection=False, tcp_keepidle=None, target_host=None, ssl_target=None, web='/usr/share/novnc', listen_port=6080, logger=<KeywordArgumentAdapter(logger=<Logger(name='nova.console.websocketproxy', parent=<RootLogger(name='root', parent=None, handlers=[<WatchedFileHandler(stream=<file at remote 0x2867300>, level=0, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<Semaphore(counter=0, _waiters=<collections.deque at remote 0x28dc670>) at remote 0x28db850>, _RLock__count=0) at remote 0x28db810>, encoding=None, dev=64769L, _name=None, ino=3221237281, baseFilename='/var/log/nova/nova-novncproxy.log', mode='a', filters=[], formatter=<ContextFormatter(project='nova', datefmt='%Y-%m-%d %H:%M:%S', version='unknown', _fmt='%(asctime)s.%(msecs)03d %(process)d %(levelname)s %...(truncated), throwflag=throwflag@entry=0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:2779*
#24 0x00007fbd4c94b0bd in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=argcount@entry=2, kws=0x28fc260, kwcount=0, defs=0x0, defcount=0, closure=closure@entry=0x0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:3330
#25 0x00007fbd4c94976f in PyEval_EvalFrameEx (nk=<optimized out>, na=2, n=2, pp_stack=0x7ffe0f6a1d90, func=<function at remote 0x180e848>) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4194
#26 0x00007fbd4c94976f in PyEval_EvalFrameEx (oparg=<optimized out>, pp_stack=0x7ffe0f6a1d90) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4119
*#27 0x00007fbd4c94976f in PyEval_EvalFrameEx (f=f@entry=Frame 0x28fc050, for file /usr/lib/python2.7/site-packages/websockify/websocket.py, line 996, in start_server (self=<NovaWebSocketProxy(wrap_times=[0, 0, 0], target_cfg=None, verbose=False, listen_host='0.0.0.0', tcp_keepintvl=None, ws_connection=False, tcp_keepidle=None, target_host=None, ssl_target=None, web='/usr/share/novnc', listen_port=6080, logger=<KeywordArgumentAdapter(logger=<Logger(name='nova.console.websocketproxy', parent=<RootLogger(name='root', parent=None, handlers=[<WatchedFileHandler(stream=<file at remote 0x2867300>, level=0, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<Semaphore(counter=0, _waiters=<collections.deque at remote 0x28dc670>) at remote 0x28db850>, _RLock__count=0) at remote 0x28db810>, encoding=None, dev=64769L, _name=None, ino=3221237281, baseFilename='/var/log/nova/nova-novncproxy.log', mode='a', filters=[], formatter=<ContextFormatter(project='nova', datefmt='%Y-%m-%d %H:%M:%S', version='unknown', _fmt='%(asctime)s.%(msecs)03d %(process)d %(level...(truncated), throwflag=throwflag@entry=0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:2740*
#28 0x00007fbd4c949860 in PyEval_EvalFrameEx (nk=<optimized out>, na=1, n=1, pp_stack=0x7ffe0f6a1ef0, func=<function at remote 0x180ecf8>) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4184
#29 0x00007fbd4c949860 in PyEval_EvalFrameEx (oparg=<optimized out>, pp_stack=0x7ffe0f6a1ef0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4119
#30 0x00007fbd4c949860 in PyEval_EvalFrameEx (f=f@entry=Frame 0x28af4f0, for file /usr/lib/python2.7/site-packages/nova/cmd/baseproxy.py, line 72, in proxy (host='0.0.0.0', port=6080), throwflag=throwflag@entry=0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:2740
#31 0x00007fbd4c94b0bd in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=argcount@entry=0, kws=0x159c490, kwcount=2, defs=0x0, defcount=0, closure=closure@entry=0x0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:3330
#32 0x00007fbd4c94976f in PyEval_EvalFrameEx (nk=<optimized out>, na=0, n=4, pp_stack=0x7ffe0f6a20f0, func=<function at remote 0x24e7c80>) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4194
#33 0x00007fbd4c94976f in PyEval_EvalFrameEx (oparg=<optimized out>, pp_stack=0x7ffe0f6a20f0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4119
#34 0x00007fbd4c94976f in PyEval_EvalFrameEx (f=f@entry=Frame 0x159c310, for file /usr/lib/python2.7/site-packages/nova/cmd/novncproxy.py, line 49, in main (), throwflag=throwflag@entry=0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:2740
#35 0x00007fbd4c949860 in PyEval_EvalFrameEx (nk=<optimized out>, na=0, n=0, pp_stack=0x7ffe0f6a2250, func=<function at remote 0x24ec1b8>) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4184
#36 0x00007fbd4c949860 in PyEval_EvalFrameEx (oparg=<optimized out>, pp_stack=0x7ffe0f6a2250) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4119
#37 0x00007fbd4c949860 in PyEval_EvalFrameEx (f=f@entry=Frame 0x1352510, for file /usr/bin/nova-novncproxy, line 10, in <module> (), throwflag=throwflag@entry=0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:2740
#38 0x00007fbd4c94b0bd in PyEval_EvalCodeEx (co=co@entry=0x7fbd4cd828b0, globals=globals@entry={'__builtins__': <module at remote 0x7fbd4cdffb08>, '__file__': '/usr/bin/nova-novncproxy', '__package__': None, 'sys': <module at remote 0x7fbd4cdffbb0>, '__name__': '__main__', 'main': <function at remote 0x24ec1b8>, '__doc__': None}, locals=locals@entry={'__builtins__': <module at remote 0x7fbd4cdffb08>, '__file__': '/usr/bin/nova-novncproxy', '__package__': None, 'sys': <module at remote 0x7fbd4cdffbb0>, '__name__': '__main__', 'main': <function at remote 0x24ec1b8>, '__doc__': None}, args=args@entry=0x0, argcount=argcount@entry=0, kws=kws@entry=0x0, kwcount=kwcount@entry=0, defs=defs@entry=0x0, defcount=defcount@entry=0, closure=closure@entry=0x0) at /usr/src/debug/Python-2.7.5/Python/ceval.c:3330
#39 0x00007fbd4c94b1c2 in PyEval_EvalCode (co=co@entry=0x7fbd4cd828b0, globals=globals@entry={'__builtins__': <module at remote 0x7fbd4cdffb08>, '__file__': '/usr/bin/nova-novncproxy', '__package__': None, 'sys': <module at remote 0x7fbd4cdffbb0>, '__name__': '__main__', 'main': <function at remote 0x24ec1b8>, '__doc__': None}, locals=locals@entry={'__builtins__': <module at remote 0x7fbd4cdffb08>, '__file__': '/usr/bin/nova-novncproxy', '__package__': None, 'sys': <module at remote 0x7fbd4cdffbb0>, '__name__': '__main__', 'main': <function at remote 0x24ec1b8>, '__doc__': None}) at /usr/src/debug/Python-2.7.5/Python/ceval.c:689
#40 0x00007fbd4c9645ff in run_mod (mod=<optimized out>, filename=filename@entry=0x7ffe0f6a3f2c "/usr/bin/nova-novncproxy", globals=globals@entry={'__builtins__': <module at remote 0x7fbd4cdffb08>, '__file__': '/usr/bin/nova-novncproxy', '__package__': None, 'sys': <module at remote 0x7fbd4cdffbb0>, '__name__': '__main__', 'main': <function at remote 0x24ec1b8>, '__doc__': None}, locals=locals@entry={'__builtins__': <module at remote 0x7fbd4cdffb08>, '__file__': '/usr/bin/nova-novncproxy', '__package__': None, 'sys': <module at remote 0x7fbd4cdffbb0>, '__name__': '__main__', 'main': <function at remote 0x24ec1b8>, '__doc__': None}, flags=flags@entry=0x7ffe0f6a24b0, arena=arena@entry=0x1317f40) at /usr/src/debug/Python-2.7.5/Python/pythonrun.c:1374
#41 0x00007fbd4c9657be in PyRun_FileExFlags (fp=fp@entry=0x135fe20, filename=filename@entry=0x7ffe0f6a3f2c "/usr/bin/nova-novncproxy", start=start@entry=257, globals=globals@entry={'__builtins__': <module at remote 0x7fbd4cdffb08>, '__file__': '/usr/bin/nova-novncproxy', '__package__': None, 'sys': <module at remote 0x7fbd4cdffbb0>, '__name__': '__main__', 'main': <function at remote 0x24ec1b8>, '__doc__': None}, locals=locals@entry={'__builtins__': <module at remote 0x7fbd4cdffb08>, '__file__': '/usr/bin/nova-novncproxy', '__package__': None, 'sys': <module at remote 0x7fbd4cdffbb0>, '__name__': '__main__', 'main': <function at remote 0x24ec1b8>, '__doc__': None}, closeit=closeit@entry=1, flags=flags@entry=0x7ffe0f6a24b0) at /usr/src/debug/Python-2.7.5/Python/pythonrun.c:1360
#42 0x00007fbd4c966a49 in PyRun_SimpleFileExFlags (fp=fp@entry=0x135fe20, filename=filename@entry=0x7ffe0f6a3f2c "/usr/bin/nova-novncproxy", closeit=closeit@entry=1, flags=flags@entry=0x7ffe0f6a24b0) at /usr/src/debug/Python-2.7.5/Python/pythonrun.c:952
#43 0x00007fbd4c966f63 in PyRun_AnyFileExFlags (fp=fp@entry=0x135fe20, filename=filename@entry=0x7ffe0f6a3f2c "/usr/bin/nova-novncproxy", closeit=closeit@entry=1, flags=flags@entry=0x7ffe0f6a24b0) at /usr/src/debug/Python-2.7.5/Python/pythonrun.c:756
#44 0x00007fbd4c977b9f in Py_Main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/Python-2.7.5/Modules/main.c:640
#45 0x00007fbd4bba3b15 in __libc_start_main () at /lib64/libc.so.6
#46 0x0000000000400721 in _start ()

A close look at the stack shows that the process is stuck in:
/usr/lib/python2.7/site-packages/websockify/websocket.py, line 996, in start_server
/usr/lib/python2.7/site-packages/websockify/websocket.py, line 828, in vmsg

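It is worth reading the code in /usr/lib/python2.7/site-packages/websockify/websocket.py. The start_server function is a textbook server loop: it creates a socket, listens on it, and starts a new process to handle each incoming connection while the main process keeps polling. When a user's connection closes, the child process exits and the parent receives SIGCHLD, whose handler, self.multiprocessing_SIGCHLD, cleans up the dead child.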
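The SIGCHLD-based cleanup described above can be sketched in a few lines of Python. This is a minimal illustration of the general pattern on a Unix system, not websockify's actual code: the names reap_children and spawn_and_reap are hypothetical, and the children here exit immediately instead of serving a connection.

```python
import os
import signal
import time

reaped = []  # PIDs collected by the SIGCHLD handler


def reap_children(signum, frame):
    # Collect every child that has already exited, without blocking.
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except OSError:      # no child processes left at all
            break
        if pid == 0:         # children exist, but none has exited yet
            break
        reaped.append(pid)


def spawn_and_reap(n=3, timeout=5.0):
    """Fork n short-lived children and let the handler reap them."""
    del reaped[:]
    previous = signal.signal(signal.SIGCHLD, reap_children)
    try:
        pids = []
        for _ in range(n):
            pid = os.fork()
            if pid == 0:
                os._exit(0)  # child: pretend the connection has closed
            pids.append(pid)
        deadline = time.time() + timeout
        while len(reaped) < n and time.time() < deadline:
            time.sleep(0.01)  # parent: stand-in for the polling loop
    finally:
        signal.signal(signal.SIGCHLD, previous or signal.SIG_DFL)
    return sorted(pids) == sorted(reaped)
```

The point to notice is that the reaping happens only when the parent gets a chance to run its handler to completion. If the parent is blocked, exited children are never collected.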

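This analysis answers both of our questions (in reverse order):

  1. nova-novncproxy is stuck in vmsg, the logging function in the websockify library's websocket.py.
  2. Because the main process is stuck in vmsg (writing a log entry), it cannot reap its dead child processes, so they stay in the defunct state.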
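To make the defunct state concrete, here is a small Linux-only sketch that creates a zombie on purpose and then reaps it. It assumes /proc is mounted; proc_state and zombie_then_reap are hypothetical helper names made up for this illustration.

```python
import os
import signal
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the PID is gone."""
    try:
        with open("/proc/%d/stat" % pid) as f:
            data = f.read()
    except (IOError, OSError):
        return None
    # Format is "pid (comm) state ..."; split after the last ')' in case
    # the command name itself contains parentheses.
    return data.rsplit(")", 1)[1].split()[0]


def zombie_then_reap(timeout=5.0):
    # Use the default SIGCHLD disposition so the exited child is kept
    # as a zombie until the parent waits for it.
    previous = signal.signal(signal.SIGCHLD, signal.SIG_DFL)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(0)          # child exits at once
        deadline = time.time() + timeout
        state = proc_state(pid)
        while state != "Z" and time.time() < deadline:
            time.sleep(0.01)     # parent has not called wait yet
            state = proc_state(pid)
        os.waitpid(pid, 0)       # reaping removes the zombie entry
        after = proc_state(pid)
    finally:
        signal.signal(signal.SIGCHLD, previous or signal.SIG_DFL)
    return state, after
```

Between the child's exit and the parent's waitpid call, the child shows state Z, which is what ps prints as <defunct>. A parent that never reaches its wait call leaves the entry there indefinitely.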

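Why it is stuck in vmsg

I am not able to fix this myself, but we now know that the problem lies in the websockify library that nova-novncproxy depends on, so it was time to turn to Google.
Search keywords: websockify hang
The first result was:
https://github.com/kanaka/noVNC/issues/556
Following that link leads to a websockify patch:
https://github.com/kanaka/websockify/pull/219/files
I also confirmed that our nova-novncproxy is the Kilo release, 2015.1.3, and that the nova-novncproxy change mentioned in the first link has already been merged into it.
For now we have updated websockify according to that patch. Since vmsg is known to be problematic, we also set debug and verbose to false in the nova-novncproxy configuration file, which should be somewhat safer.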

https://github.com/kanaka/websockify/pull/219
The reason given there is:
Openstack nova novnc-proxy services uses websockify to provide support
for nova vms using novnc proxy. At present, novnc hangs every couple of
weeks. It only resumes post restart of the novnc-proxy which is not
good. Hence, this code in websockify is updated to get rid of additional
signal calls to avoid novnc going in hang state even though process is
running. Basically, we are getting rid of existing msg and vmsg calls in
the websocket.py. This is kind of quick fix but we will need an
additional way of figuring out the logging to make it easy to trace in
case of any further failures in future.

In other words, the msg and vmsg functions in websocket.py are risky and calls to them should be avoided. The deeper cause still needs further investigation.
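One plausible mechanism, consistent with the patch description but not confirmed here, is that a signal handler which logs can interrupt code that is already inside the logging machinery and then wait for a lock the interrupted code still holds. The sketch below shows only that general hazard. It uses a plain threading.Lock as a stand-in (the lock in the backtrace is an RLock built on an eventlet Semaphore), acquires it without blocking so the demo reports the conflict instead of hanging, and every name in it is hypothetical.

```python
import os
import signal
import threading
import time

log_lock = threading.Lock()  # stand-in for the lock guarding a log handler
observed = []                # what the signal handler saw


def on_signal(signum, frame):
    # A handler that logs needs the same lock as the code it interrupted.
    # Non-blocking acquire: a blocking one would never return here.
    got_it = log_lock.acquire(False)
    observed.append(got_it)
    if got_it:
        log_lock.release()


def write_log_and_get_interrupted(timeout=5.0):
    del observed[:]
    previous = signal.signal(signal.SIGUSR1, on_signal)
    try:
        with log_lock:  # main flow is in the middle of "writing a log line"
            os.kill(os.getpid(), signal.SIGUSR1)
            deadline = time.time() + timeout
            while not observed and time.time() < deadline:
                time.sleep(0.01)  # the handler runs in this same thread
    finally:
        signal.signal(signal.SIGUSR1, previous or signal.SIG_DFL)
    return observed[-1] if observed else None
```

The handler sees the lock as unavailable because the code it interrupted cannot resume until the handler returns. With a blocking acquire the process would hang at that point while still appearing to be running, which resembles the symptom described in the pull request.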

For now we are leaving it running to see whether the problem comes back.

posted on 2016-09-22 16:50  silenceli