Advanced Linux Server Troubleshooting (part 2) - page 2
All about /proc, and how to debug Python
The signal handler trick is great, but it's no help if you want to find out about a process right now and it doesn't have a signal handler.If you read part 1, you've probably already tried running gdb on that python process without much luck:
$ gdb -p 9997 GNU gdb (GDB) 7.1-ubuntu [ ... startup blather snipped ... ] 0x00f03422 in __kernel_vsyscall () (gdb) where #0 0x00f03422 in __kernel_vsyscall () #1 0x00cb1241 in recv () at ../sysdeps/unix/sysv/linux/i386/socket.S:61 #2 0x081301ba in ?? () #3 0x081303b4 in ?? () #4 0x080e0a21 in PyEval_EvalFrameEx () #5 0x080e2807 in PyEval_EvalCodeEx () #6 0x080e0c8b in PyEval_EvalFrameEx () [ lots more like that ]
Not so useful -- you can see we're doing something that involves recv from a socket, which means we're waiting on something networky ... but not why it's happening, so you can prevent it next time.
But gdb has a few more Python-related tricks up its sleeve. You can get quite a bit of extra information by installing Python with debugging symbols: Debian or Ubuntu package it as python-dbg, while Fedora and Redhat call it python-debug. Then, when you run gdb:
(gdb) where
#0 0xb7849430 in __kernel_vsyscall ()
#1 0xb7827241 in recv () from /lib/tls/i686/cmov/libpthread.so.0
#2 0x081301ba in recv (s=0xb73c93b8, cbuf=0xb73d1654 "\001", len=-1079842992,
flags=0) at /usr/include/bits/socket2.h:45
#3 sock_recv_guts (s=0xb73c93b8, cbuf=0xb73d1654 "\001", len=-1079842992,
flags=0) at ../Modules/socketmodule.c:2317
#4 0x081303b4 in sock_recv (s=0xb73c93b8, args=(1,))
at ../Modules/socketmodule.c:2398
#5 0x080e0a21 in call_function (f=
Frame 0x922581c, for file /usr/lib/python2.6/socket.py, line 397, in readline (self=<_fileobject at remote 0xb7471fac>, size=-1, buf=, buffers=[''], data=None, recv=), throwflag=0) at ../Python/ceval.c:3750
#6 PyEval_EvalFrameEx (f=
Frame 0x922581c, for file /usr/lib/python2.6/socket.py, line 397, in readline (self=<_fileobject at remote 0xb7471fac>, size=-1, buf=, buffers=[''], data=None, recv=), throwflag=0) at ../Python/ceval.c:2412
#7 0x080e2807 in PyEval_EvalCodeEx (co=0xb73f77b8, globals=
{'SocketType': , 'getaddrinfo': , 'AI_NUMERICSERV': 1024, 'PACKET_OTHERHOST': 3, 'AI_NUMERICHOST': 4, 'StringIO': , 'IPPORT_RESERVED': 1024, 'AF_IRDA': 23, 'create_connection': , 'MSG_PEEK': 2, 'NETLINK_IP6_FW': 13, 'NETLINK_ROUTE': 0, 'SO_RCVBUF': 8, 'MSG_DONTROUTE': 4, 'EAI_SERVICE': -8, 'TCP_QUICKACK': 12, 'IPV6_RECVDSTOPTS': 58, 'TIPC_CFG_SRV': 0, 'SOCK_SEQPACKET': 5, 'SO_SNDTIMEO': 21, 'SO_ERROR': 4, 'IPPROTO_IPIP': 4, 'herror': , 'BTPROTO_HCI': 1, 'IPV6_RECVRTHDR': 56, 'IPV6_DSTOPTS': 59, 'sslerror': , 'TCP_SYNCNT': 7, 'SOCK_DGRAM': 2, 'EAI_AGAIN': -3, 'SO_TYPE': 3, 'SO_RCVTIMEO': 20, '__file__': '/usr/lib/python2.6/socket.pyc', 'inet_ntop': , 'IPPROTO_RSVP': 46, 'TIPC_DEST_DROPPABLE': 129, 'SO_OOBINLINE': 10, 'MSG_TRUNC': 32, 'SOL_TIPC': 271, 'SSL_ERROR_WANT_X509_LOOKUP': 4, 'SO_LINGER': 13, 'SOCK_RDM': 4, 'I...(truncated), locals=0x0, args=0x92257e0, argcount=1, kws=0x92257e4, kwcount=0, defs=
0xb73fed38, defcount=1, closure=0x0) at ../Python/ceval.c:3000
#8 0x080e0c8b in fast_function (f=
Frame 0x9225694, for file /usr/lib/python2.6/httplib.py, line 349, in _read_status (self=, status='UNKNOWN', will_close='UNKNOWN', chunk_left='UNKNOWN', length='UNKNOWN', strict=0, reason='UNKNOWN', version='UNKNOWN', debuglevel=0, msg=None, chunked='UNKNOWN', _met---Type to continue, or q to quit---
Lots of information about the state of the network socket, especially in #7. If you're a networking expert, that might give you some clues.
Otherwise, you'll still want to know where it's stuck. For that, look at the stack trace for eval_frame or a similarly named function. In this case, PyEval_EvalFrameEx is at position 6:
#6 PyEval_EvalFrameEx (f=
Frame 0x922581c, for file /usr/lib/python2.6/socket.py, line 397, in readline (self=<_fileobject at remote 0xb7471fac>, size=-1, buf=, buffers=[''], data=None, recv=), throwflag=0) at ../Python/ceval.c:2412
From that line, you can get the current Python filename and approximate line number:
(gdb) up 6
#6 PyEval_EvalFrameEx (f=
Frame 0x922581c, for file /usr/lib/python2.6/socket.py, line 397, in readline (self=<_fileobject at remote 0xb7471fac>, size=-1, buf=, buffers=[''], data=None, recv=), throwflag=0) at ../Python/ceval.c:2412
2412 ../Python/ceval.c: No such file or directory.
in ../Python/ceval.c
(gdb) x/s ((PyStringObject*)f->f_code->co_filename)->ob_sval
0xb73f86fc: "/usr/lib/python2.6/socket.py"
(gdb) p f->f_lineno
$2 = 375
So it's in /usr/lib/python2.6/socket.py somewhere near line 375. "Somewhere near" because f->f_lineno, unfortunately, may only give you the line where the Python function starts. Not perfect -- but at least it gets you close.
If you do this a lot, or just don't want to forget it, add this to your .gdbinit file:
define pyframe x/s ((PyStringObject*)f->f_code->co_filename)->ob_sval x/s ((PyStringObject*)f->f_code->co_name)->ob_sval p f->f_lineno end
Then call it like this:
(gdb) pyframe 0xb73f86fc: "/usr/lib/python2.6/socket.py" 0xb74da4b4: "readline" $1 = 375
It's not perfect. But with a combination of these tools, you can usually figure out what you need to know about most processes that go awry on your system.
- Skip Ahead
- 1. All about /proc, and how to debug Python
- 2. All about /proc, and how to debug Python
Solid state disks (SSDs) made a splash in consumer technology, and now the technology has its eyes on the enterprise storage market. Download this eBook to see what SSDs can do for your infrastructure and review the pros and cons of this potentially game-changing storage technology.