Sobell on the Bourne Again Shell and the Linux Command Line

By: Ibrahim Haddad
Monday, November 21, 2005 01:41:06 PM EST
URL: http://www.linuxplanet.com/linuxplanet/interviews/6086/1/

The Fate of the Command Line

LinuxPlanet: Mark, is the command line dead?

No, not at all. For some people, for some tasks, it is easier and more straightforward to use a graphical interface. It really depends on what you want to do and who you are. The difference between a GUI and the command line is like the difference between an automatic and a stick shift. I drive a stick because it gives me more control over the car and gives me more of a feel of what the car is doing, how it is performing.

Of course this discussion assumes that you are working with files at a basic or system administration level. Some applications have GUIs and may have no, or a very primitive, command line interface. It makes no sense to try to run these apps from the command line.

One thing that is nice about the command line is that it gives you access to hundreds of utilities. Right on the command line you can use a pipe to combine utilities to perform a task that no one utility is set up to do. Here is a quote from my ... Linux Commands, Editors, and Shell Programming book that talks about pipes and how they connect processes:

"A process is the execution of a command by Linux. Communication between processes is one of the hallmarks of UNIX/Linux. A pipe (written as a vertical bar, |, on the command line and appearing as a solid or broken vertical line on keyboards) provides the simplest form of this kind of communication. Simply put, a pipe takes the output of one utility and sends that output as input to another utility. Using UNIX/Linux terminology, a pipe takes standard output of one process and redirects it to become standard input of another process. Most of what a process displays on the screen is sent to standard output. If you do not redirect it, this output appears on the screen. Using a pipe, you can redirect the output so that it becomes instead standard input of another utility."

For example, you can combine the ls command, which lists the files in a directory, with the wc -w command, which counts words, to count the files in a directory:

     $ ls | wc -w
     45
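
One caveat: wc -w counts words, so a file name that contains a space is counted more than once. The following is a minimal sketch, assuming a throwaway directory created with mktemp and made-up file names, that compares wc -w with wc -l. The second form counts lines, and ls writes one name per line when its output goes to a pipe. The helper names count_words and count_lines are hypothetical, and a name containing a NEWLINE would still be miscounted.

```shell
#!/bin/bash
# Hypothetical demo directory; the file names are made up for illustration.
demo=$(mktemp -d)
touch "$demo/notes.txt" "$demo/todo list.txt" "$demo/report.txt"

# wc -w counts words, so "todo list.txt" is counted as two entries.
count_words() { ls "$1" | wc -w; }

# wc -l counts lines; ls writes one name per line when its output is a pipe.
count_lines() { ls "$1" | wc -l; }

echo "wc -w: $(count_words "$demo")"   # reports 4 for the three files
echo "wc -l: $(count_lines "$demo")"   # reports 3

rm -rf "$demo"
```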

In the realm of Linux system administration, the GUI tools are often built on top of the command line tools, so you gain no real advantage from the GUI tools, except of course that you get to point and click. Frequently you can do things from the command line that you cannot do from a GUI sys admin tool.

Bourne and Bourne Again

LP: Would you discuss the Bourne Again shell and explain how it compares to the original Bourne Shell?

The shell is the command line interpreter -- it parses the command lines you enter and calls the programs you request, passing to the program the arguments you entered on the command line. The shell is also a high-level programming language. The Bourne Again Shell, or bash, is the default shell on many Linux systems. Other shells are included with most distributions and even more are available for download.

The Bourne Again Shell, written by the GNU project, is a souped-up version of the original Bourne Shell (sh), the first shell available under UNIX as it was released by AT&T. I used to recommend that readers consider using the C Shell (csh) as their interactive shell because it has some important features that are not available in the original Bourne Shell. Today bash has all those features and then some, including command completion, history (so you can edit and repeat previous commands), and job control (which lets you move jobs between the foreground and background). And of course you can write shell scripts (batch files) using bash.

Many Linux system shell scripts start with the line #!/bin/sh. This line causes the script to run under the sh program, which is not the original Bourne Shell, but a link to bash.
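
What sh resolves to can be checked on any given system. The following is a minimal sketch, assuming readlink -f is available (it is part of GNU coreutils); the helper name sh_target is hypothetical. The result varies by distribution: on some current systems, such as Debian and Ubuntu, sh is a link to dash rather than bash.

```shell
#!/bin/bash
# sh_target is a hypothetical helper name, not part of any standard tool.
# It prints the real file that the sh command resolves to on this system.
sh_target() {
    readlink -f "$(command -v sh)"
}

echo "sh resolves to: $(sh_target)"

# BASH_VERSION is set only when the interpreter is bash
# (including bash started under the name sh).
if [ -n "${BASH_VERSION:-}" ]; then
    echo "running under bash $BASH_VERSION"
else
    echo "not running under bash"
fi
```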

"Because of its long and successful history, the original Bourne Shell has been used to write many of the shell scripts that help manage UNIX systems. Some of these scripts appear in Linux as Bourne Again Shell scripts. Although the Bourne Again Shell includes many extensions and features not found in the original Bourne Shell, bash maintains compatibility with the original Bourne Shell so you can run Bourne Shell scripts under bash. On UNIX systems the original Bourne Shell is named sh. On Linux systems sh is a symbolic link to bash ensuring that scripts that require the presence of the Bourne Shell still run. When called as sh, bash does its best to emulate the original Bourne Shell."

LP: Would you recommend that someone new to Linux learn the Bourne Again Shell or the TC Shell?

If you are a hard-core C Sheller, go ahead and use the TC Shell (tcsh). Otherwise I would stick with bash. Almost all administration shell scripts that control a Linux system are written to be run by bash, so if you learn bash you will be able to understand and modify those scripts more easily.

Awk-ward

LP: Why would you use awk?

Good question, especially with many people going straight to Perl. This utility is simple and powerful. Before Perl was around it was one of the workhorses of file manipulation. Today it is still very useful. GNU's version of awk, called gawk, has some new features that make it a very capable tool. Here is the section of my book that talks about how to get gawk to communicate with a coprocess:

"Coprocess: Two-Way I/O

"A coprocess is a process that runs in parallel with another process. Starting with version 3.1, gawk can invoke a coprocess to exchange information directly with a background process. A coprocess can be useful when you are working in a client/server environment, setting up an SQL front end/back end, or exchanging data with a remote system over a network. The gawk syntax identifies a coprocess by preceding the name of the program that starts the background process with a |& operator.

"The coprocess command must be a filter (i.e., it reads from standard input and writes to standard output) and must flush its output whenever it has a complete line rather than accumulating lines for subsequent output. When a command is invoked as a coprocess, it is connected via a two-way pipe to a gawk program so that you can read from and write to the coprocess.

"When used alone the tr utility does not flush its output after each line. The to_upper shell script is a wrapper for tr that does flush its output; this filter can be run as a coprocess. For each line read, to_upper writes the line, translated to uppercase, to standard output. Remove the # before set -x if you want to_upper to display debugging output.

$ cat to_upper
#!/bin/bash
#set -x
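# IFS= and -r keep leading whitespace and backslashes intact
while IFS= read -r arg
do
    # each tr process exits after one line, so its output is flushed at once
    printf '%s\n' "$arg" | tr 'a-z' 'A-Z'
done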

$ echo abcdef | to_upper
ABCDEF

"The g6 program invokes to_upper as a coprocess. This gawk program reads standard input or a file specified on the command line, translates the input to uppercase, and writes the translated data to standard output.

$ cat g6
    {
    print $0 |& "to_upper"
    "to_upper" |& getline hold
    print hold
    }

$ gawk -f g6 < alpha
AAAAAAAAA
BBBBBBBBB
CCCCCCCCC
DDDDDDDDD

"The g6 program has one compound statement, enclosed within braces, comprising three statements. Because there is no pattern, gawk executes the compound statement once for each line of input.

"In the first statement, print $0 sends the current record to standard output. The |& operator redirects standard output to the program named to_upper, which is running as a coprocess. The quotation marks around the name of the program are required. The second statement redirects standard output from to_upper to a getline statement, which copies its standard input to the variable named hold. The third statement, print hold, sends the contents of the hold variable to standard output."

The Utility Known as tr

LP: Would you talk a little more about the tr utility?

Ah, tr. Well, the first thing that comes to mind is that it is the answer to the trivia question, "Name a Linux utility that accepts input only from standard input and never from a file named as an argument on the command line." It is an odd beast that is useful only sometimes--but when it is useful it is very useful. Here is an excerpt that talks about tr:

"The tr utility reads standard input and, for each input character, maps it to an alternate character, deletes the character, or leaves the character alone. This utility reads from standard input and writes to standard output.

"The tr utility is typically used with two arguments, string1 and string2. The position of each character in the two strings is important: Each time tr finds a character from string1 in its input, it replaces that character with the corresponding character from string2.

"With one argument, string1, and the --delete option, tr deletes the characters specified in string1. The option --squeeze-repeats replaces multiple sequential occurrences of characters in string1 with single occurrences (for example, abbc becomes abc).

"You can use a hyphen to represent a range of characters instring1 or string2. The two command lines in the following example produce the same result:

$ echo abcdef | tr 'abcdef' 'xyzabc'
xyzabc
$ echo abcdef | tr 'a-f' 'x-za-c'
xyzabc

"The next example demonstrates a popular method for disguising text, often called ROT13 (rotate 13) because it replaces the first letter of the alphabet with the thirteenth, the second with the fourteenth, and so forth.

$ echo The punchline of the joke is ... |
> tr 'A-M N-Z a-m n-z' 'N-Z A-M n-z a-m'
Gur chapuyvar bs gur wbxr vf ...

"To make the text intelligible again, reverse the order of the arguments to tr:

$ echo Gur chapuyvar bs gur wbxr vf ... |
> tr 'N-Z A-M n-z a-m' 'A-M N-Z a-m n-z'
The punchline of the joke is ...
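
Because the alphabet has 26 letters, rotating by 13 twice returns the original text, so the same command can both disguise and restore. The following is a minimal sketch of that idea; the function name rot13 is hypothetical.

```shell
#!/bin/bash
# rot13 is a hypothetical helper name.
# Both strings hold 52 characters: A-M maps to N-Z, N-Z maps to A-M,
# and likewise for lowercase. Other characters pass through unchanged.
rot13() {
    tr 'A-Za-z' 'N-ZA-Mn-za-m'
}

# Disguise the text.
echo 'The punchline of the joke is ...' | rot13

# Applying the function twice restores the original.
echo 'The punchline of the joke is ...' | rot13 | rot13
```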

"The --delete option causes tr to delete selected characters:

$ echo If you can read this, you can spot the missing vowels! |
> tr --delete 'aeiou'
If y cn rd ths, y cn spt th mssng vwls!

"In the following example, tr replaces characters and reduces pairs of identical characters to single characters:

$ echo tennessee | tr --squeeze-repeats 'tnse' 'srne'
serene

"The next example replaces each sequence of nonalphabetic characters (the complement of all the alphabetic characters as specified by the character class alpha) in the file draft1 with a single NEWLINE character. The output is a list of words, one per line.

$ tr --complement --squeeze-repeats '[:alpha:]' '\n' < draft1

"The final example uses character classes to upshift the string hi there:

$ echo hi there | tr '[:lower:]' '[:upper:]'
HI THERE
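
The long options shown in these examples are GNU spellings. The following sketch uses the short POSIX forms (-d, -s, and -c), which should also work with non-GNU versions of tr, and then pipes the word list through sort and uniq to count word frequency. The helper names words and word_freq are hypothetical.

```shell
#!/bin/bash
# Short-option equivalents of the long options:
# -d = --delete, -s = --squeeze-repeats, -c = --complement

# Delete vowels.
echo 'If you can read this' | tr -d 'aeiou'

# Translate, then squeeze repeated characters.
echo tennessee | tr -s 'tnse' 'srne'

# words (hypothetical name): one word per line from standard input.
words() {
    tr -cs '[:alpha:]' '\n'
}

# word_freq (hypothetical name): lowercase the words, then count
# each distinct word and list the most frequent first.
word_freq() {
    words | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn
}

echo 'The cat and the hat and the bat' | word_freq
```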

Wrapping Up

LP: Any final thoughts?

I would say that the command line is not for everyone. It is for users who want to get their hands dirty and have more control over the beast they are taming. Half the battle is learning what the shell can do. The other half is learning a little about a lot of the utilities that come in a standard Linux distribution. You do not need to know every option of each command; it is enough to know the names of some of the commands and basically what each one does. You can read the man page, the info page, or the command reference section of my book from there. Look up the tac utility and laugh at the origin of its name to get started.

Book Information

A Practical Guide to Linux Commands, Editors, and Shell Programming
Prentice Hall PTR, 2005
ISBN 0-13-147823-0
U.S. $39.99, Canada $55.99
www.sobell.com

About the Book Author

Mark G. Sobell is president of Sobell Associates Inc., a consulting firm that specializes in UNIX/Linux training, support, and custom software development. He is the author of many best-selling UNIX and Linux books including A Practical Guide to Red Hat Linux, Second Edition, from Prentice Hall PTR. His most recent book is A Practical Guide to Linux Commands, Editors, and Shell Programming, published by Prentice Hall PTR. He has more than 25 years of experience working with UNIX and Linux. Go to www.sobell.com for more information on Mark Sobell's books.

Copyright Jupitermedia Corp. All Rights Reserved.