Killing With Linux: A Primer
Stopping Runaway Processes
sshd rereads its configuration file when it receives a hangup signal, SIGHUP, by executing itself with the name and options it was started with, e.g., /usr/sbin/sshd
Coders Vs. Lusers
Man page authors tend to wobble between addressing end users and ace programmers. That's why you see statements like "the do list is executed as long as the last command in list returns a non-zero exit status." Which is as helpful as saying "send the process a SIGHUP". But not to worry, for today we shall peel off the mask of mystery that covers these deep dark subjects.
Signals and Process Control
This falls under the heading of signals and process control. For us ace admins and users, our primary concerns are starting, stopping, and restarting services, and stopping runaway or hung processes with as little disruption as possible. Because signal-handling varies on different operating systems and different command shells, we'll stick to Linux and the bash shell.
Signals are used to communicate with daemons and processes. Any active task is a process, while daemons are background services that lurk in wait to respond to certain events, or to run scheduled tasks. A program needs a signal handler to trap a signal and respond to it in its own way; without one, the signal's default action applies, which for many signals means the process is terminated. The signal (7) man page describes the various signals and what they do. You send signals with the kill command, and kill -l displays a list of signals and their numbers.
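To make the name-and-number business a little less abstract, here is a minimal sketch that asks the bash kill built-in to translate in both directions. The variable names are made up for illustration, and the numbers used are the ones signal (7) lists as identical on every Linux architecture.

```bash
#!/bin/bash
# Translate signal numbers to names with the bash kill built-in.
term_name=$(kill -l 15)    # number -> name
hup_name=$(kill -l 1)
kill_name=$(kill -l 9)

# bash also translates the other way, from name to number.
int_number=$(kill -l INT)

echo "15 is SIG$term_name, 1 is SIG$hup_name, 9 is SIG$kill_name"
echo "SIGINT is number $int_number"
```

The names come back without the SIG prefix, which is also how you pass them to kill, as in kill -TERM or kill -HUP.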
All daemons and processes have a Process ID (PID), as this ps command shows:
$ ps aux
USER PID %CPU %MEM TTY STAT COMMAND
root 1 0.0 0.1 ? S init [2]
105 7783 0.0 0.2 ? Ss /usr/bin/dbus-daemon --system
hal 7796 0.0 0.7 ? Ss /usr/sbin/hald
postfix 7957 0.0 0.2 ? S qmgr -l -t fifo -u -c
nagios 8371 0.0 0.2 ? SNs /usr/sbin/nagios /etc/nagios/nagios.cfg
This output is slimmed down; you'll see more lines and columns on your system. If something is sucking up all your CPU or memory, you'll see what it is in the %CPU and %MEM columns. A quicker way to find a runaway process is with the top command, because by default it lists the processes using the most CPU first. We can play with this a bit with the yes command:
$ yes carla is teh awesum
This repeats "carla is teh awesum" at high speed until you stop it. It should drive your CPU usage into the red zone:
$ top
...
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12144 carla 25 0 31592 17m 13m R 93.4 3.5 0:50.26 konsole
22236 carla 15 0 2860 468 400 S 4.3 0.1 0:00.97 yes
Interestingly, credit for hammering the CPU goes to Konsole, not yes, because Konsole has to draw every line that yes prints. If you drop to a "real" console (CTRL+ALT+F2) you'll see yes with the big numbers.
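If you want the same "who is hogging the CPU" answer without an interactive screen, ps can sort its own output. This is only a sketch, and it assumes the procps version of ps found on most Linux systems, since --sort is a procps option. Note also that ps reports CPU use averaged over each process's lifetime, so the ranking can differ from what top shows at any given moment.

```bash
#!/bin/bash
# Show the five processes using the most CPU, highest first.
# -e selects every process, -o picks the columns to print,
# and --sort=-pcpu sorts by %CPU in descending order.
# head -n 6 keeps the header line plus five processes.
top_cpu=$(ps -eo pid,user,pcpu,pmem,comm --sort=-pcpu | head -n 6)
echo "$top_cpu"
```

Swap -pcpu for -pmem to rank by memory instead.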
There are a number of ways to stop yes. If you go back to the shell it's running in, just hit CTRL+c. Or you can stop it with the kill command in a second shell, either by PID or by name:
$ kill 22236
$ killall yes
CTRL+c sends SIGINT (2), the interrupt signal from the keyboard. kill and killall both send SIGTERM (15) by default. A process can catch SIGTERM and then ignore it or handle it in its own way, so when kill behaves unpredictably, you can blame the process you're trying to kill.
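Here is a small sketch of what "catching" SIGTERM looks like from the inside, written with the bash trap built-in. The child process, the log file, and the message are all invented for the demonstration; a real daemon would do its own cleanup in the handler.

```bash
#!/bin/bash
logfile=$(mktemp)

# A child that catches SIGTERM, logs it, and exits cleanly.
(
  trap 'echo "caught SIGTERM, cleaning up" >> "$logfile"; exit 0' TERM
  # Sleep in short slices: bash runs a trap only after the
  # current foreground command has finished.
  while true; do sleep 0.1; done
) &
child=$!

sleep 0.5          # give the child time to install its trap
kill "$child"      # no signal named, so this sends SIGTERM (15)
wait "$child"
status=$?          # exit status chosen by the handler

logged=$(cat "$logfile")
rm -f "$logfile"
echo "child said: $logged (exit status $status)"
```

Because the handler decides what happens, it could just as easily ignore the signal altogether, which is exactly why a plain kill sometimes appears to do nothing.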
Killing a parent process will usually, but not always, kill its children as well. How do you know what the child processes are? Use the f option with ps, which draws the process tree:
$ ps axf
22371 ? R 2:35 \_ konsole [kdeinit]
22372 pts/3 Ss 0:00 | \_ /bin/bash
24322 pts/3 S+ 0:00 | | \_ yes carla is teh awesum
22381 pts/4 Rs 0:00 | \_ /bin/bash
24323 pts/4 R+ 0:00 | | \_ ps axf
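The "but not always" part is easy to see for yourself. The following sketch starts a throwaway parent that launches a sleep as its child, kills the parent, and then checks whether the child is still around. The PID file is just a hypothetical way of passing the child's PID back to the script.

```bash
#!/bin/bash
pidfile=$(mktemp)

# Parent: a subshell that starts a long-running child,
# records the child's PID, then waits for it.
( sleep 300 & echo $! > "$pidfile"; wait ) &
parent=$!

sleep 0.5                     # let the parent start its child
child=$(cat "$pidfile")

kill "$parent"                # SIGTERM goes to the parent only
wait "$parent" 2>/dev/null

# kill -0 sends no signal; it only tests whether the PID still exists.
survived=no
if kill -0 "$child" 2>/dev/null; then
  survived=yes
  echo "child $child outlived parent $parent"
  kill "$child"               # clean up the orphan
fi
rm -f "$pidfile"
```

The signal is delivered to one PID, and nothing in this parent passes it on, so the child is simply adopted by another process and carries on.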
Meanwhile, Back at the SIGHUP Ranch
SIGHUP is pronounced "sig-hup", and means "signal hangup". How do you send a SIGHUP? There are several ways:
# kill -HUP 'pid'
# killall -HUP 'process-name'
# kill -1 'pid'
# killall -1 'process-name'
So you may use PIDs or names, and signal names or numbers. Why do this instead of restarting the service with /etc/init.d/foo restart? In fact it is preferable to control services with their init files, as these usually include sanity and error checks, and additional functions. The main reason to be familiar with the kill command and signals is to stop hung or runaway processes as cleanly as possible, and not have to reboot or log out.
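To see why daemons bother with SIGHUP at all, here is a toy "daemon" that rereads its configuration whenever it gets one. Everything in it (the load_config function, the greeting setting, the temporary files) is invented for illustration; it sketches the general idea and is not how sshd or any particular service does it.

```bash
#!/bin/bash
conf=$(mktemp)
log=$(mktemp)
echo "greeting=hello" > "$conf"

# Toy daemon: loads its config once, then again on every SIGHUP.
(
  load_config() {
    . "$conf"                                   # source the config file
    echo "loaded greeting=$greeting" >> "$log"
  }
  trap load_config HUP
  load_config
  while true; do sleep 0.1; done
) &
daemon=$!

sleep 0.5                          # let it start and load the first config
echo "greeting=goodbye" > "$conf"  # edit the configuration
kill -HUP "$daemon"                # ask the daemon to reread it
sleep 0.5

kill "$daemon"                     # stop the toy daemon with SIGTERM
wait "$daemon" 2>/dev/null

reloads=$(cat "$log")
rm -f "$conf" "$log"
echo "$reloads"
```

The daemon keeps its PID the whole time, which is the appeal: the new settings take effect without a full stop and start.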
As you saw in man signal, there are dozens of signals for controlling processes. Here are the commonly used ones:
kill -STOP 'pid'
SIGSTOP (17, 19, or 23, depending on architecture) pauses a process without killing it
kill -CONT 'pid'
SIGCONT (19, 18, or 25, depending on architecture) resumes a stopped process
kill -KILL 'pid'
SIGKILL (9) forces the process to terminate immediately and performs no cleanup
kill -9 -1
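Kills every process you have permission to signal, which for an ordinary user means all processes you own; think twice before running this as root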
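A quick way to get a feel for STOP, CONT, and KILL is to try them on a harmless sleep and watch its state change. This sketch reads the state letter from /proc, which is Linux-specific; it is the same letter ps shows in its STAT column. The state_of helper is a made-up name for this example.

```bash
#!/bin/bash
# Print the one-letter state of a PID (T = stopped, S = sleeping).
state_of() { awk '/^State:/ {print $2}' "/proc/$1/status"; }

sleep 300 &
pid=$!
sleep 0.2                          # let sleep get started

kill -STOP "$pid"                  # pause it
sleep 0.2
stopped_state=$(state_of "$pid")   # T while stopped

kill -CONT "$pid"                  # let it run again
sleep 0.2
resumed_state=$(state_of "$pid")   # back to S, sleeping normally

kill -KILL "$pid"                  # no cleanup, no appeal
wait "$pid" 2>/dev/null
exit_status=$?                     # 128 + 9 = 137

echo "stopped: $stopped_state, resumed: $resumed_state, exit status: $exit_status"
```

An exit status above 128 means the process died from a signal; subtract 128 to get the signal number.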
SIGKILL and SIGSTOP cannot be caught, blocked, or ignored, but the others can. These two are your big guns of last resort.
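"Last resort" suggests an order of operations: ask nicely with SIGTERM, give the process a moment, and only then reach for SIGKILL. Here is one way that might be sketched in bash. The stop_gracefully function and its timeout argument are hypothetical names, not a standard tool, and the stubborn child exists only to give it something to do.

```bash
#!/bin/bash
# stop_gracefully PID [TIMEOUT]
# Send SIGTERM, wait up to TIMEOUT seconds (default 5), then send SIGKILL.
# Returns 0 if SIGTERM was enough, 1 if SIGKILL was needed.
stop_gracefully() {
  local pid=$1
  local timeout=${2:-5}
  local waited=0

  kill -TERM "$pid" 2>/dev/null || return 0   # nothing to signal
  # kill -0 sends no signal; it only tests whether the PID still exists.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "PID $pid ignored SIGTERM, sending SIGKILL"
      kill -KILL "$pid" 2>/dev/null
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Demo: a stubborn child that ignores SIGTERM.
( trap '' TERM; while true; do sleep 1; done ) &
stubborn=$!
sleep 0.5

stop_gracefully "$stubborn" 2
result=$?
wait "$stubborn" 2>/dev/null
wait_status=$?      # 137 when the child died from SIGKILL
```

The polite attempt costs a couple of seconds at most, and it gives well-behaved programs the chance to clean up that SIGKILL never does.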
The Bash shell contains a kill built-in command, as this shows:
$ type -a kill
kill is a shell builtin
kill is /bin/kill
It's unlikely you'll encounter any conflicts or odd behavior, but if you do, try specifying /bin/kill.
Be sure to further check out the fascinating and large world of killing in Linux by consulting the references in Resources, because this is your ticket to making nice surgical repairs instead of rebooting every time you have a problem, like some poor crippled operating systems we know.
Resources
- Chapter 7 "Starting and Stopping Linux", the Linux Cookbook
- bash (1) - GNU Bourne-Again Shell
- yes (1) - output a string repeatedly until killed
- signal (7) - list of available signals
- ps (1) - report a snapshot of the current processes
- kill (1) - send a signal to a process
- killall (1) - kill processes by name
- pkill (1) - look up or signal processes based on name and other attributes
- skill (1) - send a signal or report process status
- xkill (1) - kill a client by its X resource
Article courtesy of Enterprise Networking Planet, originally published April 25, 2006