Understanding UNIX shell scripts
Audience: This tutorial assumes you have little knowledge of the UNIX/Linux operating system and its functionality. A basic understanding of general computer concepts will help you follow the exercises in this tutorial.
1. Command-line processing
The UNIX shell is a command-line interpreter that processes each command or combination of commands on a command line when you press Enter.
This example shows only a single command on the command line.
$ sort -r list
You can combine multiple commands on the same command line by creating a composite command.
This example shows a composite command comprising the ls and less commands.
$ ls -l | less
You can use a number of special characters on the command line. A semicolon (;), for example, allows you to place more than one command statement on the same command line. The shell executes the statements one after another, from left to right, and waits for each one to finish before it starts the next.
In this example, the shell executes the cp command statement and then the cat command statement.
$ cp list list.txt; cat list.txt
Special characters you can use to manipulate commands in the command line include:
backslash (\)
greater than (>)
less than (<)
pipe (|)
ampersand (&)
backslash (\)
The backslash (\) character prevents the shell from treating another character as a special character through a process called backslash escaping.
This allows you to split a command statement across multiple lines. When you place the backslash at the end of a line and then press Enter, you can continue the statement on the next line. The backslash prevents the shell from treating the Enter keystroke – or new line character – as a special character.
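This example shows a long echo statement carried across four lines. The greater-than sign at the start of each continuation line is the prompt that the shell displays while it waits for the rest of the statement.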
The code for this is
$ echo Long pieces of text may not always fit onto \
> a single line of the command line interface, so \
> it becomes necessary to split them across multiple \
> lines using backslashes.
greater than (>)
The greater-than character (>) allows you to direct the standard output of a command to a file or a device such as a printer instead of to the terminal screen.
This example sends the output of the ls command to a file called tsmlist.
The code for this is
$ ls -l /home/tsm > tsmlist
less than (<)
The less-than character (<) allows you to send the contents of a file to a command as its standard input.
This example sends input from a file called tsmlist to the sort command.
The code for this is
$ sort -d < tsmlist
pipe (|)
The pipe character (|) allows you to direct the output of one command to the input of another command.
This example pipes the output from the cat command as input to the grep command for further processing.
The code for this is
$ cat tsmlist | grep 'volumes'
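The redirection and pipe characters described so far can be tried together in a scratch directory. The following is a minimal sketch, assuming a POSIX-style shell such as ksh or bash and the widely available mktemp command; the file names fruits and sorted_fruits and the sample data are invented for illustration.

```shell
# Work in a throwaway directory so that no real files are touched.
workdir=$(mktemp -d)

# '>' sends standard output to a file instead of the terminal.
printf 'pear\napple\nfig\n' > "$workdir/fruits"

# '<' feeds the file to sort as standard input; '>' captures the result.
sort < "$workdir/fruits" > "$workdir/sorted_fruits"

# '|' connects the output of sort to the input of grep.
# grep -c counts the lines that contain the letter p.
match_count=$(sort < "$workdir/fruits" | grep -c 'p')

first_line=$(head -n 1 "$workdir/sorted_fruits")
echo "first sorted line: $first_line"
echo "lines containing p: $match_count"

# Remove the scratch directory again.
rm -rf "$workdir"
```

Because sort reads from standard input here, the same command works unchanged whether the data comes from a file or from another command.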
ampersand (&)
An ampersand (&) character at the end of a command statement allows you to run commands in the background.
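This example runs the find command in the background. The shell prints the job number in square brackets followed by the process ID, and then returns the prompt without waiting for the command to finish. Output from the background command, here the single line tsmlist, appears on the terminal after the prompt.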
The code for this is
$ find 'tsmlist' &
[1] 1234
$ tsmlist
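In a script you often need to know when a background command has finished. A minimal sketch, assuming a POSIX-style shell: the special parameter $! holds the process ID of the most recent background command, and the wait built-in blocks until that process ends. The sleep command stands in for any slow command.

```shell
# Start a slow command in the background; the shell continues immediately.
sleep 1 &

# $! holds the process ID of the most recent background command.
bg_pid=$!
echo "started background command with process ID $bg_pid"

# wait blocks until the process finishes and returns its exit status.
wait "$bg_pid"
bg_status=$?
echo "background command finished with exit status $bg_status"
```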
If you want to use special characters in command-line text without the shell recognizing them as special characters, you have to enclose them in quotes or precede them with a backslash (\).
This example shows an echo command in which the echo text contains an ampersand. There’s a backslash in front of the ampersand, which prevents the shell from treating it as a special character.
$ echo Tours \& Travels
Tours & Travels
$
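The three ways of protecting a special character can be compared side by side. This is a small sketch, assuming a POSIX-style shell; the variable name company is chosen only for illustration.

```shell
# A backslash protects the single character that follows it.
escaped=$(echo Tours \& Travels)

# Single quotes protect every character between them.
single=$(echo 'Tours & Travels')

# Double quotes protect the ampersand but still allow $ substitution.
company='Tours & Travels'
double=$(echo "Welcome to $company")

echo "$escaped"
echo "$single"
echo "$double"
```

Single quotes are usually the simplest choice for fixed text; double quotes are needed when the text should include the value of a variable.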
Tokens
The segments into which the shell divides a command line are called tokens. To execute a command line, the shell processes the first token and then each subsequent token in turn.
To begin processing a token, the shell checks whether it’s a keyword, an alias, or an ordinary word.
If the token is a keyword that opens a substructure such as a function, conditional statement, or bracketed group of commands, the shell processes the substructure before moving on to the next token.
If a token is an alias, the shell replaces it with the command to which the alias is mapped.
If a token is an ordinary word such as a command or a filename, the shell processes it directly.
After comparing a token against the list of known keywords and aliases, the shell processes it using several stages of expansion and substitution.
Expansion and substitution take place in the following sequence:
brace expansion
tilde expansion
parameter substitution
command substitution
arithmetic substitution
word splitting
pathname substitution
brace expansion
In brace expansion, the shell looks for braces ({}) – also called curly brackets – in the token. If braces are present, it expands their contents.
For example, the token b{all,ook} expands into ball book.
tilde expansion
In tilde expansion, the shell looks for tildes (~) at the start of the token. A tilde on its own is replaced with the location of the current user’s home directory. A tilde followed by a user name is replaced with the location of that user’s home directory.
For example, depending on the system configuration, the token ~t01zxmh/file2 might expand into /usr/home/t01zxmh/file2.
parameter substitution
In parameter substitution, the shell checks whether the token is a variable name preceded by a dollar sign ($). If it is, the shell replaces the token with the current value of the corresponding variable.
For example, if the value of the SHELL parameter is /bin/ksh, the token $SHELL is replaced with /bin/ksh.
command substitution
In command substitution, the shell checks whether the token is a command enclosed in parentheses and preceded by a dollar sign ($). If it is, the shell processes the command and replaces the token with the command’s output.
For example, the token $(whoami) might be replaced with vincep.
arithmetic substitution
In arithmetic substitution, the shell checks whether the token is an arithmetic expression enclosed in double parentheses and preceded by a dollar sign. If it is, the shell evaluates the expression and replaces the token with the result.
For example, the shell replaces the token $((32/8)) with 4.
word splitting
In word splitting, the shell examines those parts of the command line that have resulted from previous stages of expansion and substitution. If any of these contain spaces, tabs, or new line characters (the default separators, held in the IFS variable), it splits them into separate tokens for processing.
pathname substitution
In pathname substitution, the shell looks for wildcard characters in the token. If it finds asterisks (*), question marks (?), or bracket expressions ([...]), it searches for filenames that match the resulting pattern, by default in the current directory, and substitutes them for the token in sorted order.
For example, depending on the files in the current directory, the token t*.txt might expand into
tdp.txt tivoli.txt tsm.txt.
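Most of these stages can be observed directly by storing their results in variables. The following is a hedged sketch, assuming a POSIX-style shell and the mktemp command; the file names and variable names are illustrative. Brace expansion is left out because it is available in ksh and bash but not in every sh implementation.

```shell
# Scratch directory with a few files to match against.
workdir=$(mktemp -d)
touch "$workdir/tivoli.txt" "$workdir/tsm.txt" "$workdir/tdp.txt" "$workdir/notes.log"

# Tilde expansion: a lone ~ becomes the current user's home directory.
home_dir=~

# Parameter substitution: $name is replaced with the variable's value.
name=vincep
greeting="user is $name"

# Command substitution: $(...) is replaced with the command's output.
base=$(basename "$workdir/tsm.txt" .txt)

# Arithmetic substitution: $((...)) is replaced with the result.
quotient=$((32/8))

# Word splitting: the unquoted value is split at spaces into three words.
words='alpha beta gamma'
set -- $words
word_count=$#

# Pathname substitution: t*.txt is replaced with the matching filenames.
matches=$(cd "$workdir" && echo t*.txt)

echo "home directory: $home_dir"
echo "$greeting"
echo "basename: $base, quotient: $quotient, words: $word_count"
echo "matches: $matches"

rm -rf "$workdir"
```

Note that notes.log is not part of the matches, because it does not fit the pattern t*.txt.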
After performing expansion and substitution, the shell processes subsequent tokens until it reaches the end of a command, denoted by a semicolon or a new line character.
Then it matches the command against its list of known functions, built-in commands, and pathnames.
Once the shell has identified which command it needs to execute, it executes the command to produce output.
It then moves on to the next command, processing its tokens in the same way.
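To see how the shell resolves a particular command name, you can ask it with command -v. This is a small sketch, assuming a POSIX-style shell running non-interactively (so that no aliases are defined); hello is a hypothetical function created only for this illustration.

```shell
# A shell function defined for this example only.
hello() { echo 'hello from a function'; }

# command -v reports a function or a built-in by its name,
# and an external program by its pathname.
function_lookup=$(command -v hello)
builtin_lookup=$(command -v cd)
program_lookup=$(command -v ls)

echo "hello -> $function_lookup"
echo "cd    -> $builtin_lookup"
echo "ls    -> $program_lookup"
```

The pathname printed for ls depends on the system, so the sketch does not assume a particular value.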
2. Command grouping
You can join commands on a command line in such a way that the second command executes only if the first command has executed successfully.
For example, you can use a first command to check whether a file exists and a second command to perform an operation on it if it exists.
To make one command conditional on another, you join the commands using a double ampersand (&&). The command after the && symbols executes only if the command before the && symbols produces a zero exit status – in other words, if it executes successfully.
In this example, the ls command checks whether the userlist file exists. Because it does exist, the ls command executes without errors, so its exit status is zero. This causes the sort command to execute.
$ ls userlist && sort userlist
userlist
BAKER, Daniel
CARUSO, Maria
GARZA, Teresa
LOGAN, Greg
MANEROWSKI, Sarah
NOVAK, Nicholas
NOVIALLO, Glen
OSWALD, Sam
PASCUCCI, Vince
REILLY, Molly
STROTHER, Tanya
WADE, Debora
$
If you delete the userlist file and run the command again, the ls command encounters an error, so its exit status is non-zero. Because the sort command is conditional, the shell doesn’t attempt to execute it.
$ ls userlist && sort userlist
ls: userlist: No such file or directory
$
You use a double pipe (||) to make a command conditional on the unsuccessful execution of the previous command.
In such a case, the second command executes only if the first command has a non-zero exit status.
In this example, the ls command looks for a file called userlist. If it fails to find the file, the touch command creates it.
$ ls userlist || touch userlist
ls: userlist: No such file or directory
$
If the ls command executes successfully, this means that the file already exists. In this case, the touch command doesn’t execute.
$ ls userlist || touch userlist
userlist
$
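The exit status that drives && and || is available in the special parameter $?. The following is a minimal sketch, assuming a POSIX-style shell and the mktemp command; it works in a scratch directory, and the variable names are illustrative.

```shell
workdir=$(mktemp -d)
target="$workdir/userlist"

# The file does not exist yet, so ls fails and || runs touch.
ls "$target" 2>/dev/null || touch "$target"

# Record whether the file now exists.
[ -f "$target" ] && created='yes'

# Now ls succeeds, so && runs the assignment.
ls "$target" >/dev/null && and_result='second command ran'

# ls fails for a missing file, so || runs the assignment;
# at that point $? still holds the non-zero exit status of ls.
ls "$workdir/missing" 2>/dev/null || missing_status=$?

echo "file created: $created"
echo "after &&: $and_result"
echo "exit status of failed ls: $missing_status"

rm -rf "$workdir"
```

The exact non-zero value that ls returns for a missing file varies between systems; only the distinction between zero and non-zero matters to && and ||.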
You can group commands using braces ({}). The shell treats any command block enclosed in braces as if it were a single command. The opening brace must be followed by a space, and the last command in the block must end with a semicolon or a new line before the closing brace.
This allows you to redirect input and output to and from a group of commands.
In this example, the braces group the sort and grep commands into a code block so that the shell sorts its input and then extracts any lines containing the word Mexico.
$ { sort | grep 'Mexico'; }
You can redirect input and output to a command block as if it were a single command. In this example, the code specifies the flights file as input and the mex_flights file as output.
$ { sort | grep 'Mexico'; } < flights > mex_flights
$
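Brace grouping is most useful when several separate commands share one redirection. This is a hedged sketch, assuming a POSIX-style shell and the mktemp command; the flight data and the report file name are invented for illustration.

```shell
workdir=$(mktemp -d)

# Invented sample data standing in for the flights file.
printf 'LH400 Frankfurt\nBA242 Mexico City\nAM058 Mexico City\nAF011 Paris\n' > "$workdir/flights"

# The group reads from flights and writes to mex_flights.
# Note the space after { and the semicolon before }.
{ sort | grep 'Mexico'; } < "$workdir/flights" > "$workdir/mex_flights"

# A single '>' collects the output of both commands in the group.
{ echo 'Flights to Mexico:'; cat "$workdir/mex_flights"; } > "$workdir/report"

report_heading=$(head -n 1 "$workdir/report")
first_flight=$(head -n 1 "$workdir/mex_flights")
mex_count=$(grep -c 'Mexico' "$workdir/report")

cat "$workdir/report"

rm -rf "$workdir"
```

Without the braces, the redirection in the second group would apply only to the cat command, and the heading would go to the terminal.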
You can group commands using round brackets – often called parentheses – instead of braces. This causes the shell to spawn a subshell and execute the command block in the subshell.
Commands that execute in a subshell do not affect what’s happening in the main shell.
This allows you to define variables that exist only for the lifetime of the subshell, and to change the working directory within the subshell without affecting the parent shell.
$ (sort | grep 'Mexico') < massivefile > mex_info
$
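The difference between the two kinds of grouping shows up as soon as the block changes a variable or the working directory. A minimal sketch, assuming a POSIX-style shell; the variable name region is illustrative.

```shell
region='Europe'
start_dir=$(pwd)

# Parentheses: the assignment and the cd happen in a subshell
# and are discarded when the subshell exits.
( region='Mexico'; cd /; echo "inside subshell: $region in $(pwd)" )
after_subshell=$region
dir_after_subshell=$(pwd)

# Braces: the block runs in the current shell, so the assignment persists.
{ region='Mexico'; }
after_braces=$region

echo "after subshell: $after_subshell in $dir_after_subshell"
echo "after braces:   $after_braces"
```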
3. Storing commands in scripts
Command grouping is useful for executing relatively short command-line code that you need to run only once.
However, you may need to run larger pieces of code that include several lines or to use the same piece of code many times.
In such cases, it’s advantageous to store the code in a file.
You can store blocks of shell commands in shell scripts.
The contents of shell scripts are stored as ordinary ASCII text.
By default, a newly created text file can be read and edited but not executed. However, you need to be able to execute shell scripts.
Therefore, you have to assign execute permission to script files, for example with the chmod command.
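As a minimal sketch of that step, the following creates a tiny script, shows that it is not executable at first, and then grants execute permission with chmod. It assumes a POSIX-style shell, the mktemp command, and a scratch directory on a file system that allows programs to be executed; the script name hello.sh is arbitrary.

```shell
workdir=$(mktemp -d)
script="$workdir/hello.sh"

# Write a two-line script into the scratch directory.
printf '%s\n' '#!/bin/sh' 'echo hello from a script' > "$script"

# A new text file normally has no execute permission.
[ -x "$script" ] || before='not executable'

# chmod u+x grants execute permission to the owner of the file.
chmod u+x "$script"
[ -x "$script" ] && after='executable'

# The script can now be run by its pathname.
output=$("$script")

echo "before chmod: $before"
echo "after chmod:  $after"
echo "script output: $output"

rm -rf "$workdir"
```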
The first line in a shell script should be a special line of code that specifies the particular shell program in which the script must run.
This is necessary because some commands run differently in different shell programs.
The shell identifier at the beginning of a shell script consists of a hash followed by an exclamation point (#!) – commonly called a shebang – and the absolute pathname of the shell program.
This example shows the first line of a script that uses the Korn shell.
#! /bin/ksh
This simple example of a script tests whether the directory /usr/shared/tours exists. If it doesn’t, the script creates it. Then it creates a file called tourlist inside this directory and returns a message.
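#! /bin/ksh
# Create the directory only if it does not already exist.
[ -d /usr/shared/tours ] || mkdir /usr/shared/tours
# Create the tourlist file inside the directory.
touch /usr/shared/tours/tourlist
echo "tour directory and tourlist file created."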
Once you’ve created a script and made it executable, you can use it as many times as you like. You can execute it directly from the command line or you can invoke it from inside other scripts.
Summary
You can use special characters to join commands on a single command line, to redirect input and output, to run commands in the background, and to continue a command over multiple lines. You can prevent the shell from recognizing a special character by preceding it with a backslash. When you execute a command line, the shell splits it into tokens and processes each token in turn.
You can group commands using braces or parentheses, which cause the shell to treat the commands as a single command. You can join two commands so that the second command will execute only if the first command executes successfully or only if it executes unsuccessfully.
You can store blocks of commands in a text file called a shell script and make this file executable. You can execute shell scripts directly from the command line and reuse them as often as necessary.