November 28, 2014
 
 

Killing With Linux: A Primer

Stopping Runaway Processes

  • August 3, 2008
  • By Carla Schroder

So there you are, dutifully wading through the documentation for whatever gnarly Linux application you're rassling into submission. You're running commands and editing configuration files and things are working and life is good. Until -- yes, you knew the good times weren't going to last -- until you hit the dreaded "send the process a SIGHUP" instruction.

Unfazed, you motor onwards. What is a SIGHUP and how do you send it? Is it like a bouquet of flowers that you send your sweetheart? You're pretty sure it's not a verbal command, but you try it anyway. Nope, that's not it. Then you examine the keyboard. Hmm, no SIGHUP key. You re-read the man page for the application:
sshd rereads its configuration file when it receives a hangup signal, SIGHUP, by executing itself with the name and options it was started with, e.g., /usr/sbin/sshd

Um. Well.

Coders Vs. Lusers
Man page authors tend to wobble between addressing end users and ace programmers. That's why you see statements like "the do list is executed as long as the last command in list returns a non-zero exit status." Which is as helpful as saying "send the process a SIGHUP". But not to worry, for today we shall peel off the mask of mystery that covers these deep dark subjects.

Signals and Process Control
This falls under the heading of signals and process control. For us ace admins and users, our primary concerns are starting, stopping, and restarting services, and stopping runaway or hung processes with as little disruption as possible. Because signal-handling varies on different operating systems and different command shells, we'll stick to Linux and the bash shell.

Signals are used to communicate with daemons and processes. Any active task is a process, while daemons are background services that lurk in wait to respond to certain events, or to run scheduled tasks. A program must have some sort of signal handler programmed into it to trap and respond to signals. The signal man page describes the various signals and what they do. Signals are sent by the kill command. kill -l displays a list of signals and their numbers.
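To see how signal names and numbers map to each other, the bash built-in kill can translate in both directions. This is only a small sketch that assumes bash; the variable names are made up for illustration.

```shell
#!/bin/bash
# Translate between signal names and numbers with the bash built-in kill.
hup_number=$(kill -l HUP)    # name -> number
term_name=$(kill -l 15)      # number -> name
echo "HUP is signal number $hup_number"
echo "Signal 15 is $term_name"
```

Run plain kill -l for the whole table.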

All daemons and processes have a Process ID (PID), as this ps command shows:

$ ps aux
USER       PID %CPU %MEM TTY  STAT  COMMAND
root         1  0.0  0.1  ?    S    init [2]
105       7783  0.0  0.2  ?    Ss   /usr/bin/dbus-daemon --system
hal       7796  0.0  0.7  ?    Ss   /usr/sbin/hald
postfix   7957  0.0  0.2  ?    S    qmgr -l -t fifo -u -c
nagios    8371  0.0  0.2  ?    SNs  /usr/sbin/nagios /etc/nagios/nagios.cfg

This output is slimmed down; you'll see more lines and columns on your system. If something is sucking up all your CPU or memory, you'll see what it is in the %CPU and %MEM columns. A quicker way to find a runaway process is with the top command, because by default the processes using the most CPU are displayed on top. We can play with this a bit with the yes command:

$ yes carla is teh awesum

This repeats "carla is teh awesum" at high speed until you stop it. It should drive your CPU usage into the red zone:

$ top
...
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12144 carla     25   0 31592  17m  13m R 93.4  3.5   0:50.26 konsole
22236 carla     15   0  2860  468  400 S  4.3  0.1   0:00.97 yes

Interestingly, credit for hammering the CPU goes to Konsole, not yes: yes is running inside Konsole, and Konsole has to draw every line that yes prints. If you drop to a "real" console (Ctrl+Alt+F2) you'll see yes with the big numbers.

There are a number of ways to stop yes. If you go back to the shell it's running in, just hit CTRL+c. Or you can stop it with the kill command in a second shell, either by PID or by name:

$ kill 22236
$ killall yes

CTRL+c sends a SIGINT (2), the interrupt from the keyboard. kill and killall both send a SIGTERM (15) by default. A process can catch SIGTERM and either ignore it or interpret it in its own way, so when it works unpredictably, you can blame the process you're trying to kill.
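Here is a rough sketch of a process that catches SIGTERM and carries on regardless. It assumes bash and a sleep that accepts fractional seconds; the child script and the variable names are invented for the demonstration.

```shell
#!/bin/bash
# A child shell that traps SIGTERM and refuses to exit.
bash -c 'trap "echo caught SIGTERM, ignoring" TERM; while :; do sleep 0.2; done' &
stubborn_pid=$!
sleep 1                        # give the child time to install its trap

kill "$stubborn_pid"           # default signal is SIGTERM (15)
sleep 1
if kill -0 "$stubborn_pid" 2>/dev/null; then   # kill -0 only tests for existence
    survived_term=yes          # still alive: the handler swallowed SIGTERM
else
    survived_term=no
fi

kill -KILL "$stubborn_pid"     # SIGKILL cannot be caught
exit_status=0
wait "$stubborn_pid" 2>/dev/null || exit_status=$?   # 128 + signal number
echo "survived SIGTERM: $survived_term, exit status after SIGKILL: $exit_status"
```

The order matters: try the polite signal first, check whether it worked, and only then reach for the one the process can't refuse.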

Killing a parent process will often, but not always, take its children down as well; children that survive are adopted by init. How do you know what the child processes are? Use the f (forest) option with ps:

$ ps axf
22371 ?        R      2:35  \_ konsole [kdeinit]
22372 pts/3    Ss     0:00  |   \_ /bin/bash
24322 pts/3    S+     0:00  |   |   \_ yes carla is teh awesum
22381 pts/4    Rs     0:00  |   \_ /bin/bash
24323 pts/4    R+     0:00  |   |   \_ ps axf
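If you already know the parent's PID and just want its children without the whole tree, ps can select by parent. The following is a sketch that assumes the procps version of ps found on most Linux systems; the sleep process is a throwaway stand-in.

```shell
#!/bin/bash
# Start a throwaway child, then ask ps for the children of this shell.
shell_pid=$BASHPID             # PID of the shell running this script
sleep 30 &
child_pid=$!

# --ppid selects processes by parent PID; "pid=" prints the column with no header.
children=$(ps -o pid= --ppid "$shell_pid")
echo "children of $shell_pid:" $children

kill "$child_pid"
wait "$child_pid" 2>/dev/null || true   # a non-zero status is expected after a kill
```

The list also includes the short-lived process that ran ps itself, since it too is a child of the shell.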

Meanwhile, Back at the SIGHUP Ranch
SIGHUP is pronounced "sig-hup", and means "signal hangup". How do you send a SIGHUP? There are several ways:

# kill -HUP 'pid'
# killall -HUP 'process-name'
# kill -1 'pid'
# killall -1 'process-name'

So you may use PIDs or names, and signal names or numbers. Why do this instead of restarting the service with /etc/init.d/foo restart? Most of the time you shouldn't: it is preferable to control services with their init files, as these usually include sanity and error checks, and additional functions. The main reason to be familiar with the kill command and signals is to stop hung or runaway processes as cleanly as possible, and not have to reboot or log out.
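What happens on the receiving end? Here is a toy sketch of a script that re-reads its settings when it gets a SIGHUP, which is the general idea behind daemons that reload on hangup. Everything in it (the temporary config file, the greeting setting, the load_config function) is hypothetical, and real daemons each do this their own way.

```shell
#!/bin/bash
# Toy "daemon": re-reads its settings whenever it receives SIGHUP.
config_file=$(mktemp)            # stand-in for a real configuration file
echo "greeting=hello" > "$config_file"

load_config() {
    # Pull the value after "greeting=" out of the file.
    greeting=$(sed -n 's/^greeting=//p' "$config_file")
}

reload_count=0
trap 'load_config; reload_count=$((reload_count + 1))' HUP

load_config                      # initial read at startup
echo "greeting=goodbye" > "$config_file"
kill -HUP "$BASHPID"             # send SIGHUP to this very shell
echo "greeting is now: $greeting (reloads: $reload_count)"
rm -f "$config_file"
```

Without the trap line, the same SIGHUP would simply terminate the script, which is the default action for that signal.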

Killing Processes
As you saw in man signal, there are dozens of ways to control processes. Here are the commonly-used ones:

kill -STOP 'pid'
SIGSTOP (17, 19, or 23, depending on the architecture; 19 on x86) pauses a process without killing it

kill -CONT 'pid'
SIGCONT (19, 18, or 25, depending on the architecture; 18 on x86) resumes a stopped process

kill -KILL 'pid'
SIGKILL (9) forces the process to terminate immediately and performs no cleanup

kill -9 -1
Kill all processes that you own

SIGKILL and SIGSTOP cannot be caught, blocked, or ignored, which makes them your big guns of last resort. The other signals can be.
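To watch SIGSTOP and SIGCONT at work, you can pause a harmless process and peek at its state. This sketch is Linux-only because it reads /proc, and the proc_state helper is invented for the example.

```shell
#!/bin/bash
# Pause and resume a throwaway process, checking its state in /proc (Linux only).
sleep 30 &
pid=$!

proc_state() {
    # Print the one-letter state code from the "State:" line of /proc/PID/status.
    awk '/^State:/ {print $2}' "/proc/$1/status"
}

kill -STOP "$pid"
sleep 0.5
state_stopped=$(proc_state "$pid")    # T means stopped
kill -CONT "$pid"
sleep 0.5
state_resumed=$(proc_state "$pid")    # S means sleeping, which is normal for sleep

echo "after STOP: $state_stopped, after CONT: $state_resumed"
kill "$pid"
wait "$pid" 2>/dev/null || true       # a non-zero status is expected after a kill
```

The same letters show up in the STAT column of ps.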

Bash Kill
The Bash shell contains a kill built-in command, as this shows:

$ type -a kill
kill is a shell builtin
kill is /bin/kill

It's unlikely you'll encounter any conflicts or odd behavior, but if you do, try specifying /bin/kill.
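If you ever need to be sure which kill you are getting, bash can tell you and lets you choose. This is a sketch that assumes bash; the variable names are illustrative, and the external binary may live somewhere other than /bin/kill on your system.

```shell
#!/bin/bash
# Show which "kill" bash would run, and how to pick one explicitly.
kind=$(type -t kill)                  # "builtin" when the shell's own kill wins
external=$(type -P kill || true)      # path of the external program, empty if none
echo "kill resolves to a $kind; external binary: ${external:-none found}"

sleep 30 &
builtin kill "$!"                     # force the built-in version
wait "$!" 2>/dev/null || true         # a non-zero status is expected after a kill
```

To force the external program instead, call it by its full path.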

Be sure to further check out the fascinating and large world of killing in Linux by consulting the references in Resources, because this is your ticket to making nice surgical repairs instead of rebooting every time you have a problem, like some poor crippled operating systems we know.

Resources

  • Chapter 7 "Starting and Stopping Linux", the Linux Cookbook
  • bash (1) - GNU Bourne-Again Shell
  • yes (1) - output a string repeatedly until killed
  • signal (7) - list of available signals
  • ps (1) - report a snapshot of the current processes
  • kill (1) - send a signal to a process
  • killall (1) - kill processes by name
  • pkill (1) - look up or signal processes based on name and other attributes
  • skill (1) - send a signal or report process status
  • xkill (1) - kill a client by its X resource

Article courtesy of Enterprise Networking Planet, originally published April 25, 2006
