July 22, 2014
 
 
RSSRSS feed

Advanced Linux Server Troubleshooting (part 2) - page 2

All about /proc, and how to debug Python

  • December 9, 2010
  • By Akkana Peck
The signal handler trick is great, but it's no help if you want to find out about a process right now and it doesn't have a signal handler.

If you read part 1, you've probably already tried running gdb on that python process without much luck:

$ gdb  -p 9997
GNU gdb (GDB) 7.1-ubuntu
[ ... startup blather snipped ... ]
0x00f03422 in __kernel_vsyscall ()
(gdb) where
#0  0x00f03422 in __kernel_vsyscall ()
#1  0x00cb1241 in recv () at ../sysdeps/unix/sysv/linux/i386/socket.S:61
#2  0x081301ba in ?? ()
#3  0x081303b4 in ?? ()
#4  0x080e0a21 in PyEval_EvalFrameEx ()
#5  0x080e2807 in PyEval_EvalCodeEx ()
#6  0x080e0c8b in PyEval_EvalFrameEx ()
[ lots more like that ]

Not so useful -- you can see we're doing something that involves recv from a socket, which means we're waiting on something networky ... but not why it's happening, so you can prevent it next time.

But gdb has a few more Python-related tricks up its sleeve. You can get quite a bit of extra information by installing Python with debugging symbols: Debian or Ubuntu package it as python-dbg, while Fedora and Redhat call it python-debug. Then, when you run gdb:

(gdb) where
#0  0xb7849430 in __kernel_vsyscall ()
#1  0xb7827241 in recv () from /lib/tls/i686/cmov/libpthread.so.0
#2  0x081301ba in recv (s=0xb73c93b8, cbuf=0xb73d1654 "\001", len=-1079842992, 
    flags=0) at /usr/include/bits/socket2.h:45
#3  sock_recv_guts (s=0xb73c93b8, cbuf=0xb73d1654 "\001", len=-1079842992, 
    flags=0) at ../Modules/socketmodule.c:2317
#4  0x081303b4 in sock_recv (s=0xb73c93b8, args=(1,))
    at ../Modules/socketmodule.c:2398
#5  0x080e0a21 in call_function (f=
    Frame 0x922581c, for file /usr/lib/python2.6/socket.py, line 397, in readline (self=<_fileobject at remote 0xb7471fac>, size=-1, buf=, buffers=[''], data=None, recv=), throwflag=0) at ../Python/ceval.c:3750
#6  PyEval_EvalFrameEx (f=
    Frame 0x922581c, for file /usr/lib/python2.6/socket.py, line 397, in readline (self=<_fileobject at remote 0xb7471fac>, size=-1, buf=, buffers=[''], data=None, recv=), throwflag=0) at ../Python/ceval.c:2412
#7  0x080e2807 in PyEval_EvalCodeEx (co=0xb73f77b8, globals=
    {'SocketType': , 'getaddrinfo': , 'AI_NUMERICSERV': 1024, 'PACKET_OTHERHOST': 3, 'AI_NUMERICHOST': 4, 'StringIO': , 'IPPORT_RESERVED': 1024, 'AF_IRDA': 23, 'create_connection': , 'MSG_PEEK': 2, 'NETLINK_IP6_FW': 13, 'NETLINK_ROUTE': 0, 'SO_RCVBUF': 8, 'MSG_DONTROUTE': 4, 'EAI_SERVICE': -8, 'TCP_QUICKACK': 12, 'IPV6_RECVDSTOPTS': 58, 'TIPC_CFG_SRV': 0, 'SOCK_SEQPACKET': 5, 'SO_SNDTIMEO': 21, 'SO_ERROR': 4, 'IPPROTO_IPIP': 4, 'herror': , 'BTPROTO_HCI': 1, 'IPV6_RECVRTHDR': 56, 'IPV6_DSTOPTS': 59, 'sslerror': , 'TCP_SYNCNT': 7, 'SOCK_DGRAM': 2, 'EAI_AGAIN': -3, 'SO_TYPE': 3, 'SO_RCVTIMEO': 20, '__file__': '/usr/lib/python2.6/socket.pyc', 'inet_ntop': , 'IPPROTO_RSVP': 46, 'TIPC_DEST_DROPPABLE': 129, 'SO_OOBINLINE': 10, 'MSG_TRUNC': 32, 'SOL_TIPC': 271, 'SSL_ERROR_WANT_X509_LOOKUP': 4, 'SO_LINGER': 13, 'SOCK_RDM': 4, 'I...(truncated), locals=0x0, args=0x92257e0, argcount=1, kws=0x92257e4, kwcount=0, defs=
    0xb73fed38, defcount=1, closure=0x0) at ../Python/ceval.c:3000
#8  0x080e0c8b in fast_function (f=
    Frame 0x9225694, for file /usr/lib/python2.6/httplib.py, line 349, in _read_status (self=, status='UNKNOWN', will_close='UNKNOWN', chunk_left='UNKNOWN', length='UNKNOWN', strict=0, reason='UNKNOWN', version='UNKNOWN', debuglevel=0, msg=None, chunked='UNKNOWN', _met---Type  to continue, or q  to quit---

Lots of information about the state of the network socket, especially in #7. If you're a networking expert, that might give you some clues.

Otherwise, you'll still want to know where it's stuck. For that, look at the stack trace for eval_frame or a similarly named function. In this case, PyEval_EvalFrameEx is at position 6:

#6  PyEval_EvalFrameEx (f=
    Frame 0x922581c, for file /usr/lib/python2.6/socket.py, line 397, in readline (self=<_fileobject at remote 0xb7471fac>, size=-1, buf=, buffers=[''], data=None, recv=), throwflag=0) at ../Python/ceval.c:2412

From that line, you can get the current Python filename and approximate line number:

(gdb) up 6
#6  PyEval_EvalFrameEx (f=
    Frame 0x922581c, for file /usr/lib/python2.6/socket.py, line 397, in readline (self=<_fileobject at remote 0xb7471fac>, size=-1, buf=, buffers=[''], data=None, recv=), throwflag=0) at ../Python/ceval.c:2412
2412    ../Python/ceval.c: No such file or directory.
        in ../Python/ceval.c
(gdb) x/s ((PyStringObject*)f->f_code->co_filename)->ob_sval
0xb73f86fc:      "/usr/lib/python2.6/socket.py"
(gdb) p f->f_lineno
$2 = 375

So it's in /usr/lib/python2.6/socket.py somewhere near line 375. "Somewhere near" because f->f_lineno, unfortunately, may only give you the line where the Python function starts. Not perfect -- but at least it gets you close.

If you do this a lot, or just don't want to forget it, add this to your .gdbinit file:

define pyframe
x/s ((PyStringObject*)f->f_code->co_filename)->ob_sval
x/s ((PyStringObject*)f->f_code->co_name)->ob_sval
p f->f_lineno
end

Then call it like this:

(gdb) pyframe
0xb73f86fc:      "/usr/lib/python2.6/socket.py"
0xb74da4b4:      "readline"
$1 = 375

It's not perfect. But with a combination of these tools, you can usually figure out what you need to know about most processes that go awry on your system.

Sitemap | Contact Us