Programming the Shell

J. R. Schmidt

Shell Programming
The main philosophy behind Unix is maximal reusability of code. No one program will do everything that you want it to do, so few programs or applications are written to have multiple functions. Programs are designed to do one thing, and do it very well. With the shell one can use pipes and data redirection to funnel the output of one program into another, and so build up from several programs the precise functionality that you are looking for.

Shell programming is often the preferred route in solving a problem in which C or C++ will be overkill, or else too involved to be practical. Shell scripts are very easy to write and are especially useful for automating very large jobs such as running backups, performing bulk file conversions or performing data formatting on large collections of data files. The basic commands of shell scripting are just the command line programs provided by the operating system.

First of all you will need to learn about some basic commands on the system. The way to learn about commands and their options is through the man pages. The commands that we will use for now are ls, less, cd, wc and ps. To learn about all of the options, assuming that your prompt is "$", type
$ man wc

Do this for all five commands above and test out their use with various options.
The shell is basically a command interpreter that you use to issue commands to the kernel, like command.com on a DOS system. It is however much more, because it is also an interpreted programming language that does not need to be compiled, much like BASIC. In fact almost anything you can do with BASIC you can do with the shell in Unix, except for video output.

We will be using the bash shell under Linux. This shell is now available for any flavor of Unix, and so there will be no problem with portability of this knowledge to other computer systems.

Data redirection in the shell is performed with the > and >> operators. For example, suppose that you want to list all of the files in your current directory, including file attributes and hidden files, but want to save a record of the output in a file. Issue the command
$ ls -al > file.txt
To append more text to the end of an already existing file, issue instead
$ ls -al >> file.txt
If you are planning on using data redirection a lot, issue the command
$ set -C
which sets the noclobber option, preventing files from being overwritten during data redirection.
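The effect of noclobber can be seen in a short sketch, assuming bash. The scratch directory and file names are made up for illustration, and the >| operator (which overrides noclobber for one redirection) is a standard shell feature that is not otherwise covered here.

```shell
#!/bin/bash
# Illustrative only: the scratch directory comes from mktemp.
dir=$(mktemp -d)
clobber_blocked=no
set -C                                    # turn noclobber on
echo "first" > "$dir/file.txt"            # the file is new, so this succeeds
if ! { echo "second" > "$dir/file.txt"; } 2>/dev/null
then
    clobber_blocked=yes                   # the overwrite was refused
fi
echo "third" >> "$dir/file.txt"           # appending is still allowed
before=$(cat "$dir/file.txt")             # holds the lines "first" and "third"
echo "forced" >| "$dir/file.txt"          # >| overrides noclobber on purpose
set +C                                    # turn noclobber off again
echo "overwrite blocked: $clobber_blocked"
```

The braces around the second echo are there only so that the shell's complaint about the refused overwrite can be sent to /dev/null.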
There are three file descriptors for data redirection: 0 is for standard input, 1 is for standard output, and 2 is for standard error. The command
$ command 1> out.txt 2> error.txt
sends output to out.txt and error messages to error.txt.
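As a minimal sketch of descriptor redirection, the function report below is a hypothetical stand-in for any command that writes to both streams, and the file names are illustrative. The 2>&1 form, which sends standard error to wherever standard output currently points, is standard shell syntax not discussed above.

```shell
#!/bin/bash
# report is a hypothetical stand-in for a command that uses both streams.
report() {
    echo "normal output"           # written to standard output (descriptor 1)
    echo "error message" >&2       # written to standard error (descriptor 2)
}
dir=$(mktemp -d)                   # scratch directory, illustrative only
report > "$dir/out.txt" 2> "$dir/error.txt"    # one file per stream
report > "$dir/both.txt" 2>&1                  # both streams into one file
cat "$dir/both.txt"
```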
You can also redirect input to a program. Suppose that you want to use the GNU graphics program graph to draw a plotfile of some data stored in a file called "data". Issue
$ graph < data
Pipes
Pipes can be used to connect processes together. For example we can run the plot2ps command to convert the output from graph into a PostScript file for printing
$ graph < data | plot2ps > data.ps
or we could issue the command ps to see all running processes, and pipe the standard output of this command into the paginator less
$ ps | less
and now scroll through the output one page or one line at a time.
Read the man page for the command grep. This is one of the most useful of all commands in the shell; it searches for strings in a file or set of files. You can use very long strings of pipes to accomplish complicated jobs. For example
$ ps -a | sort | grep -v sh | less
will do the following:
1. list the running processes, sending the standard output to sort, which
2. sorts the lines into order, sending its output to grep, which
3. removes every line containing the string sh (this catches bash as well as sh itself), sending its output to less, which
4. finally displays the processed output one page at a time.
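The same chain can be tried on fixed input, so that the result does not depend on what happens to be running. In this sketch the process names are made-up sample data rather than real ps output.

```shell
#!/bin/bash
# The names fed to the pipeline are sample data, not real ps output.
# sort puts the lines in order; grep -v drops every line that contains "sh".
listing=$(printf '%s\n' emacs sh xterm bash less | sort | grep -v sh)
echo "$listing"
```

Both sh and bash disappear from the list, because grep -v removes any line in which the pattern occurs anywhere.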

A shell script is a file containing a sequence of commands to be interpreted by the shell as a program. It can contain references or calls to any program on the system. Your shell script, call it script.sh, must begin with the statement
#!/bin/sh
which is a special kind of comment that tells the kernel to run the script with /bin/sh, the shell program. You should create a directory called bin in your own home directory, and install your script there. Type the command
$ echo $PATH
the system will print to the screen (the standard output device) the value of your path. Now include your own bin directory in your path by issuing the command
$ export PATH=$PATH:/home/myhomedirectory/bin
in which you substitute the name of your home directory. You make the system recognize your script as being executable by running chmod on it
$ chmod +x script.sh

Shell Syntax

Variables
Variables in shell scripts are declared when they are first used or assigned a value. All variables are stored as strings, even if they have numerical values. Remember that Unix is case sensitive, and that there must be no spaces around the equal sign in an assignment. Run the script below
#!/bin/sh
variable=Greetings
echo $variable
exit 0

The value of a variable is accessed by prepending it with the dollar sign. The command echo causes the system to literally echo or repeat the argument passed to echo. The exit 0 command exits the program after returning a success exit code, 0. The program has no built-in ways to detect whether or not it completed successfully, and so we have built this in. This is the shell equivalent of the "hello-world.c" program that is everybody's first C program.
Quoting controls how the shell treats the text between the quote symbols. There are two types of quotes, single or double. To see how to use them, execute the following script

#!/bin/sh
variable="Greetings Humanoid"
echo $variable
echo "$variable"
echo '$variable'
exit 0

The output will be
Greetings Humanoid
Greetings Humanoid
$variable
The variable variable was assigned the string Greetings Humanoid when it was created. Normally parameters are separated by white space. We want this one parameter to contain whitespace as a character, so we used the double quotes. Double quotes do not prevent variable substitution, so "$variable" is still replaced by its value. Single quotes suppress substitution, so the text within them is taken literally.
Parameter Variables
Parameter variables are:
$1, $2, ...    parameters given to a script.
$*             a list of all parameters given to the script, separated by the field separator (white space by default).
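A small sketch of the parameter variables follows. The function show_args is a hypothetical helper; inside a function the positional parameters refer to the function's own arguments, exactly as they refer to the script's arguments at the top level. The variable $#, which holds the number of parameters, is standard but is not listed above.

```shell
#!/bin/bash
# show_args is a hypothetical helper used only for illustration.
show_args() {
    echo "first:  $1"
    echo "second: $2"
    echo "count:  $#"      # $# holds the number of parameters
    echo "all:    $*"      # $* holds all of them, separated by spaces
}
show_args one two three
```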
Conditions
The ability to test conditions is crucial to nearly all aspects of programming, and the bash shell does have this capability. The simplest is the [] or test command. For example the basic set of file conditionals are
file conditional    result
-d file             true if file is a directory
-e file             true if file exists
-f file             true if file is a regular file
-r file             true if file is readable
-s file             true if file has nonzero size
-u file             true if set-user-id is set on file
-w file             true if file is writable
-x file             true if file is executable

and so the code fragment
if test -f file.cc
then
lots of code
fi
will execute the code following then if file.cc is a regular file. The alternative syntax to accomplish the same thing using [] is
if [ -f file.cc ]
then
...
fi
The statement fi signals the end of the if statement code block. We will often use shell scripts to perform repeated processing of many files, and so the file conditionals will be particularly useful to us. Since the shell programming language is interpreted rather than compiled, it would be inefficient to use it for numerical computation, and so variable conditionals are less important. It would be like using Basic instead of C or Fortran for number crunching; we only do it for quick, small jobs because of the enormous speed difference in execution times.
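The file conditionals can be combined into a simple classifier. This is only a sketch: describe is a hypothetical helper, and the elif keyword it uses is explained under Control Structures.

```shell
#!/bin/bash
# describe is a hypothetical helper that applies the file tests in turn.
describe() {
    if [ -d "$1" ]
    then
        echo "directory"
    elif [ -s "$1" ]
    then
        echo "non-empty file"
    elif [ -f "$1" ]
    then
        echo "empty regular file"
    else
        echo "missing or special"
    fi
}
describe /
describe /no/such/file
```

The order of the tests matters: a directory also has nonzero size, so -d is checked before -s.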
For comparing strings, the conditionals are

comparison            result
string                true if string is not an empty string
string1 = string2     true if they are the same
string1 != string2    true if they are not the same
-n string             true if string is not null (empty)
-z string             true if string is null


The arithmetic conditionals on two expressions expr1 and expr2 are

conditional         result
expr1 -eq expr2     true if expressions are equal
expr1 -ne expr2     true if they are not equal
expr1 -gt expr2     true if first is greater than second
expr1 -ge expr2     true if expr1 ≥ expr2
expr1 -lt expr2     true if expr1 < expr2
expr1 -le expr2     true if expr1 ≤ expr2
! expr1             true if expr1 is false
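The difference between the string and arithmetic conditionals shows up in a short sketch. The function compare is a hypothetical helper that expects two whole numbers; the || operator it uses is described later under And Lists.

```shell
#!/bin/bash
# compare is a hypothetical helper; it expects two whole numbers.
compare() {
    if [ -z "$1" ] || [ -z "$2" ]
    then
        echo "need two numbers"
    elif [ "$1" -eq "$2" ]
    then
        echo "$1 equals $2"
    elif [ "$1" -lt "$2" ]
    then
        echo "$1 is less than $2"
    else
        echo "$1 is greater than $2"
    fi
}
compare 3 10
compare 07 7
```

The second call reports equality because -eq compares numerical values; the string test [ "07" = "7" ] would be false.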


Control Structures
The if statement will test a condition and depending on its truth, execute some statements. A second conditional can be added to be executed under the else portion of the conditional with the elif statement. In the code
if conditions
then statements 1
else
statements 2
fi

the set statements 1 will be executed if the conditional returns true. The set statements 2 will be executed if the conditional returns false. The command elif can be used when writing a script that requires the user to pass options to the script, to report an error if the user passes an unrecognized or unacceptable option. For example, suppose that you write a script that requires the user to answer a question with yes or no. We want the script to exit if anything but yes or no is passed to it

#!/bin/sh
echo "answer yes or no"
read response
if [ $response = "yes" ]
then
statements
elif [ $response = "no" ]
then
statements2
else
echo "that is not a valid response goofball"
exit 1
fi
exit 0

The code exits with exit code 1, indicating an execution error, if any response other than yes or no is entered by the user. Another problem can arise: what if the user just hits "return" when queried for a response by the script? The variable response is then a null string, the first test becomes [ = "yes" ], which is not a valid expression, and the script reports an error message. This can be fixed by quoting the variable, so that the test sees an empty string rather than nothing at all and simply returns false. The fixed script has quotes

#!/bin/sh
echo "answer yes or no"
read response
if [ "$response" = "yes" ]
then
statements
elif [ "$response" = "no" ]
then
statements2
else
echo "that is not a valid response goofball"
exit 1
fi
exit 0

Now the null response of just hitting return is handled like any other response that is neither yes nor no.

Looping with the for command
Execution of a loop is extremely useful for performing repetitious tasks, which is actually what we will be using shell programming for: chopping and processing data files and building complicated LaTeX documents with figures. The syntax for for is

for variable in values
do
statements
done

Suppose that you want to repeatedly copy a set of files to floppy disks. Perhaps you have files text1.ps, text2.ps,... and wish to copy the four consecutive files text3.ps, text4.ps, text5.ps and text6.ps to a dozen floppies for distribution. Insert a floppy and run the following script from the directory containing the files

#!/bin/sh
for file in $( ls text[3456].ps)
do
mcopy $file a:
done
exit 0

What this will do is the following. The ls command will produce a list of the files in question. This list becomes the argument of the in segment of the loop. The mcopy program will allow you to copy files to a floppy without mounting the floppy, and is the method that you should use to transfer files to an MSDOS formatted disk.
To repeat a set of commands for as long as a condition holds, we can use the while statement with syntax
while condition
do
statements
done
For example, suppose that a user is to pass the string "start" to the script. Here is a script that will continue to prompt the user until this string is passed

#!/bin/bash
echo "enter the word start"
read word
while [ "$word" != "start" ]
do
echo "try again"
read word
done
exit 0

Suppose instead that you only want to give the user a finite number of tries; then use the for ... in syntax instead.
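One way to limit the number of tries is sketched below. The function ask_start and the limit of three attempts are assumptions made for illustration; the return statement it uses is covered under Functions.

```shell
#!/bin/bash
# ask_start is a hypothetical helper: it reads up to three words from
# standard input and succeeds as soon as one of them is "start".
ask_start() {
    for attempt in 1 2 3
    do
        echo "enter the word start (attempt $attempt of 3)"
        read word
        if [ "$word" = "start" ]
        then
            return 0
        fi
    done
    echo "too many tries"
    return 1
}
# Feed two answers through a pipe instead of typing them at the keyboard.
printf 'stop\nstart\n' | ask_start
```

Called without the pipe, ask_start prompts at the terminal in the usual way.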
The following script uses arithmetic substitution on a variable to compute the factorial of a number, in this case 10
#!/bin/bash
seed=1
factorial=1
while [ "$seed" -le 10 ]
do
factorial=$(($factorial * $seed))
seed=$(($seed + 1))
done
echo $factorial
exit 0

The until statement will loop over and repeat a set of statements until a condition is met, with syntax

until conditions
do
statements
done
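A minimal sketch of until: the loop body below keeps running as long as the test is false, and stops once n reaches zero. The variable names are illustrative.

```shell
#!/bin/bash
# Count down from 3; the body repeats until the test becomes true.
n=3
countdown=""
until [ "$n" -eq 0 ]
do
    countdown="$countdown$n "
    n=$(($n - 1))
done
echo "${countdown}liftoff"
```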

The case statement is more sophisticated. It plays the role of the switch statement in C/C++, but matches its variable against patterns rather than constants

case variable in
pattern [ | pattern ]...) statements ;;
pattern [ | pattern ]...) statements ;;
...
esac
For example the code

#!/bin/bash

echo "enter the file type"
read type

case "$type" in
"fig" ) echo "Xfig savefile" ;;
"ps" ) echo "Postscript level 2" ;;
"pdf" ) echo "Adobe Portable Document Format" ;;
* ) echo "not recognized" ;;
esac
exit 0

Compare this to how the program file works. Execute file on any file in your home directory. The script above compares the string that was entered with each of the patterns fig, ps and pdf in turn, and prints out an identifier for the matching file type. The wildcard line matches all other possible strings and takes care of the default case. Suppose that you want to relax the case dependency of a response, or accept several responses for the same case. This is handled with wildcards or multiple patterns; note that a wildcard must not be quoted

case "$response" in
"yes" | "y" | "Yes" ) echo "that's an affirmative" ;;
n* | N* ) echo "we'll take that to be a negative" ;;
esac

You can see that any of the commands in /usr/bin can be used as code fragments or subroutines in a shell script, which is what makes shell scripting such a powerful tool. We will see next time that the shell also possesses even more commands of its own, plus the ability to accept user defined functions.
Exercises

1. Read the manpage for the command basename. Test it out on files in your home directory, and see how you can remove the suffix from a filename. Read the manpage for the ps2pdf program, or else just try to execute it on a PostScript file in your home directory. This is a PostScript to PDF translator. Run the xfig program. Draw a few diagrams and save them in your home directory as .fig files.
Now read the manpage for fig2dev. This is a program that converts fig files made by xfig into PostScript files suitable for inclusion in a LaTeX document. Draw a fig file and save it as picture.fig. Execute the command
fig2dev -L ps picture.fig picture.ps
which will convert the file to PostScript for you.
Now draw and save at least three xfig drawings in your home directory. Using the ls command along with basename and fig2dev, write a shell script that reads all the files in your home directory, makes a list of files ending in .fig, and translates them into PostScript files with the same file prefix. For example if you have one.fig, two.fig and three.fig in your home directory, the script will write one.ps, two.ps, and three.ps. You will find this script very useful if you need to produce LaTeX documents containing many line drawings; LaTeX can only include PostScript graphics.

2. Write a shell script that counts the number of files in your home directory and reports the number of executables, the number of readable files and the number of writable files.

3. Read the manpage for the tr command. Run the command
tr a-z A-Z < file > file.out
on a text file in your home directory (tr reads only from standard input, so the file must be redirected into it). Now write a shell script that reads in a text file with file extension .txt only, exiting with an error if it is passed any other file type, and translates all characters in the file to upper case.

And Lists
Suppose that we wish to execute some commands provided that several conditions are met. We can link all of them with and operators. Starting at the left end of the list, each statement is executed in turn, and the list stops with a judgment of false as soon as any one of the conditions is not met

statement && statement && statement && ...

The or list is a list of conditions or statements that are tested in turn from the left; the entire list is evaluated as true as soon as any one of the statements in the list returns true, and the remaining statements are not executed

statement || statement || statement || ...

Multiple statement blocks in a place where only one statement would be allowed must be enclosed in braces, for example

statement && {
statement 1
statement 2
}
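The three forms can be seen working together in the following sketch. The scratch directory and the variable names are illustrative assumptions.

```shell
#!/bin/bash
dir=$(mktemp -d)                        # scratch directory, illustrative only
touch "$dir/data"

# And list: the assignment runs only if both tests succeed.
[ -f "$dir/data" ] && [ -r "$dir/data" ] && and_result="data is a readable file"

# Or list: the assignment runs only if the test fails.
[ -f "$dir/missing" ] || or_result="missing does not exist"

# Braces group several statements after a single test.
[ -d "$dir" ] && {
    count=$(ls "$dir" | wc -l)
    count=$(($count + 0))               # strip any padding wc may add
}
echo "$and_result; $or_result; $count file(s)"
```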

Functions
The ability to accept user-defined functions gives the shell tremendous flexibility. Your scripts can be made modular by defining functions as if they were subroutines and simply calling them. For example

#!/bin/bash
fix() {
new=$(basename $1 .gif)
convert $1 $new.jpg
echo "$1 converted from gif to jpg"
}

for file in $(ls *.gif)
do
fix $file
done


This simple script generates a list of GIF files in your directory, and converts them to JPG files by calling the function fix, a user defined function that invokes the ImageMagick program convert. Functions can have local variables defined within them just as in C/C++ programming.
Functions that return a value use the return statement, much as in C/C++, but the value returned must be a small whole number (the exit status). To hand back a string, have the function print it and capture the output with $(...), or store it in a variable. The function below runs through an infinite loop generated by while true; true is a command that does nothing and always succeeds. It will continue to prompt the user for a response, and will only exit the loop by returning a value if the user answers yes or no to the function's query:

#!/bin/bash
answer() {
echo "answer the question"
while true
do
echo "enter yes or no"
read answer
case "$answer" in
y | yes | Yes ) return 0 ;;
n | no | No ) return 1 ;;
* ) echo "not a valid response, try again" ;;
esac
done
}
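The two ways of getting a result out of a function can be sketched as follows. The helpers is_fig and strip_fig are hypothetical names chosen for illustration.

```shell
#!/bin/bash
# is_fig and strip_fig are hypothetical helpers.

# A numerical result comes back as the exit status, set by return.
is_fig() {
    case "$1" in
        *.fig ) return 0 ;;
        * )     return 1 ;;
    esac
}

# A string result is printed, and the caller captures it with $(...).
strip_fig() {
    basename "$1" .fig
}

if is_fig picture.fig
then
    name=$(strip_fig picture.fig)
    echo "base name is $name"
fi
```

Because the exit status of a function is tested directly by if, a function such as is_fig can be used anywhere a condition is expected.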

The break command, just as in C/C++, can be used to exit from a loop before the control condition has been met. The continue command will make a loop continue at the next iteration, with the loop variable taking on the next value in the list. When loops are nested you can also pass continue a number, the count of enclosing loops at which to resume; continue 2 resumes with the next iteration of the loop that encloses the current one. For example, run the following script

#!/bin/bash
for x in 1 2 3 4 5 6
do
echo $x
continue
echo " x is now $x"
done

In the output, notice that the loop has been continued with the next value of the expansion variable before the line " x is now ..." has been reached; the script has abandoned the current iteration before executing all of its commands, and has resumed execution at the next iteration. The continue statement can be used to exit a given iteration of a loop and execute from the next iteration.
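A short sketch that uses both commands in one loop, with illustrative variable names: continue skips the odd values, and break stops the loop altogether when 5 is reached.

```shell
#!/bin/bash
# Skip odd numbers with continue, and stop altogether at 5 with break.
evens=""
for x in 1 2 3 4 5 6
do
    if [ "$x" -eq 5 ]
    then
        break                  # leave the loop; 6 is never reached
    fi
    if [ $(($x % 2)) -ne 0 ]
    then
        continue               # odd: go straight to the next value
    fi
    evens="$evens$x "
done
echo "even values seen: $evens"
```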
The exec command will replace the shell in which the script is running with a command, or a different script. Any lines in the script after exec has been invoked will not be run, because the shell that the script was running in has been replaced. Another way to exit a running script is through exit. The script is exited with an exit code. For example, if you type exit at your command prompt, your login shell will be terminated and you will be logged off. Exit code 0 is success, and codes from 1 upward indicate errors; the shell itself uses 127 to mean "command not found" and 126 to mean that a file was found but was not executable.
Every time you run a program from your shell, a new child process will be spawned in which to run it. If you want to make the value of a variable available in all of the child processes spawned by your login shell, use the export command. A good example of this is the way in which you expand your binary search path, with
export PATH=$PATH:/home/me/bin

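Mathematical expressions are evaluated with expr. For example
x=`expr $x + 1`
The backquote marks cause x to take on the value printed by the expression. Note the spaces around the operator; expr requires each operand and operator to be a separate argument. The types of mathematical expressions that the shell can evaluate are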

expr1 = expr2     equal
expr1 > expr2     greater than
expr1 >= expr2    greater than or equal to
expr1 < expr2     less than
expr1 <= expr2    less than or equal to
expr1 != expr2    not equal
expr1 + expr2     addition
expr1 - expr2     subtraction
expr1 * expr2     multiplication
expr1 / expr2     division
expr1 % expr2     integer modulo
A more compact notation for expr is the equivalent $((...)).
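The two notations can be compared side by side in a short sketch; the variable names are illustrative.

```shell
#!/bin/bash
x=7
a=`expr $x + 1`          # expr needs spaces around the operator
b=$(($x + 1))            # arithmetic expansion, no external program
c=`expr $x \* 3`         # * must be escaped so the shell does not expand it
d=$(($x % 4))            # integer remainder
echo "$a $b $c $d"
```

With expr, characters such as *, < and > have to be escaped or quoted because the shell would otherwise treat them as wildcards or redirections; inside $((...)) no escaping is needed.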
Keep in mind that compiled languages are the languages of choice for numerical calculations. Nevertheless it is possible to do calculations in shell programming, though why you would want to I couldn't say. Compare the two scripts below that compute 10! by two different means

#!/bin/bash
fact=1
new=1
while [ $new -le 10 ]
do
fact=$(($fact*$new))
new=$(($new + 1))
done
echo $fact
exit 0

and the simpler script
#!/bin/bash
fact=1
for new in 1 2 3 4 5 6 7 8 9 10
do
fact=$(($fact*$new))
done
echo $fact
exit 0

Formatted Output
The printf command is extremely useful since it allows one to format output, almost exactly like its C/C++ counterpart. The syntax is
printf "format string" parameter1 parameter2 ...
with the following conversion descriptors used to format the output (usage is a % followed by a descriptor)
d    output a decimal number
c    output a character
s    output a string
%    output the % character
Remember that strings with spaces will be seen as two separate strings unless enclosed in double quotes. In addition the escape sequences allow data output to be spaced or accompanied by the system bell, exactly as in C.
\\    backslash character
\a    system bell
\b    backspace character
\f    formfeed character
\n    newline character
\r    carriage return
\t    tab character
\v    vertical tab

For example, run the following script

#!/bin/bash
printf "%s %d\t%s\n" "hi there" 15 people
exit 0

The output will be formatted with a tab between "15" and "people". Experiment with some of the other escape sequences.
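As in C, a number placed between the % and the descriptor sets a field width, which makes printf handy for lining up columns. This sketch uses made-up sample data; the widths chosen are arbitrary.

```shell
#!/bin/bash
# The names and counts are made-up sample data.
# %-8s left-justifies a string in 8 columns; %3d right-justifies a number in 3.
table=$(printf "%-8s|%3d\n" apples 5 pears 12)
echo "$table"
```

When there are more parameters than descriptors, printf reuses the format string, so the single format above produces one line per pair of parameters.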

Suppose that your script receives a signal while it is running, such as the interrupt sent when the user types ctrl-C, and that depending on the signal, certain actions are to take place. The trap command is passed the appropriate action, followed by the signal that the action is the desired response to

trap command signal

the list of signals is
signal       description
HUP (1)      hang up
INT (2)      interrupt, or ctrl-C
QUIT (3)     quit, or ctrl-\
ABRT (6)     abort
ALRM (14)    alarm, for handling time-outs
TERM (15)    terminate, used for shutdown


For example, the following script will execute the command to remove a file if ctrl-C or interrupt signal is sent by the user from the keyboard

#!/bin/bash
trap 'rm -f file' INT
exit 0

If a command such as
trap - INT

is later incorporated into the script, this resets interrupt from the keyboard to result in the default behavior; terminating execution of the script.
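To watch a trap fire without touching the keyboard, a shell can send a signal to itself. The following is only a sketch under stated assumptions: it starts a child bash with bash -c, uses TERM rather than INT, and the scratch file is created with mktemp purely for the handler to remove.

```shell
#!/bin/bash
# Sketch: a child shell installs a handler and then sends TERM to itself.
scratch=$(mktemp)                       # temporary file for the handler to remove
output=$(bash -c '
    trap "rm -f \"$1\"; echo caught TERM" TERM
    kill -TERM $$                       # $$ is the process ID of this child shell
    echo "still running"
' child "$scratch")
echo "$output"
```

The last line of output shows that the child carries on after the handler has run; a trapped signal no longer terminates the script unless the handler itself calls exit.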
If you wish to capture the output of a command, called command, and use this output as a variable, the correct syntax is
$(command)
for example, we will use the ls command in the following script to create a list of files whose names will be used as variables for a loop expansion

#!/bin/bash
for file in $(ls *.cc)
do
...
done
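A fuller sketch of command substitution is given below, using made-up file names in a scratch directory so that it can be run anywhere.

```shell
#!/bin/bash
# The file names are made-up sample data written to a scratch directory.
dir=$(mktemp -d)
touch "$dir/one.cc" "$dir/two.cc" "$dir/notes.txt"

count=$(ls "$dir"/*.cc | wc -l)         # output of a pipeline stored in a variable
count=$(($count + 0))                   # strip any padding wc may add
names=""
for file in $(ls "$dir"/*.cc)
do
    names="$names$(basename "$file" .cc) "
done
echo "$count source files: $names"
```

The output of $(ls ...) is split into words at white space, so this form breaks on file names that contain spaces; writing the pattern directly, as in for file in *.cc, avoids the problem.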

This basic syntax together with a pretty good knowledge of the system manpages should allow you to write quite a few extremely powerful shell scripts to perform repetitious tasks or to perform system maintenance.
Exercises

1. In the next few weeks we will write several C/C++ programs and build our own libraries of special functions and subroutines. Suppose that you design a suite of programs that have executables, library files with file suffixes "dot a" such as lib_myown.a, and header files with suffixes "dot h" such as myown.h. You may wish to distribute this package as a compressed, tarred archive within a directory called my_software.
Write an installation script for this software package that, when executed from the directory my_software installed as a subdirectory of your home directory, will
A. test to see if you have bin, lib and include subdirectories in your home directory, and if not, make them.
B. move the executables from within my_software to the bin subdirectory of your home directory, the libraries into lib, and the headers into include.

2. Write a program that tests for the existence of a file called data in your home directory and, if it does not exist, makes it and writes to this file formatted data, including a comment line beginning with a # and data points x, x^2 separated by tabs, for 0 ≤ x ≤ 10 with one data pair per line.

3. The command ps2pdf under Ghostscript 4.0 or higher will convert a PS file (PostScript) to PDF, which can be read by Adobe Acrobat. Write a shell script that will convert each PS file in the directory in which it is executed into a PDF file and sound the system bell when it finishes. Test it.

Solution (sort of)

#!/bin/bash
# shell script for fig --> ps conversion
for file in $(ls *.fig)
do
echo $file
base=$(basename $file .fig)
echo $base
fig2dev.old -L ps $file $base.ps
done
exit 0
4. Construct a shell script that carves up a large data file generated by a C program to simulate some evolving system, into individual time snap-shots, graphs them as Postscript, and mpeg-encodes them with the program mpeg_encode. Read the man pages for this program, it uses a parameter file such as the one below for input.

# test suite parameter file

PATTERN         IBPBIBPBPB
OUTPUT          new.mpg 

YUV_SIZE        288x400 

BASE_FILE_FORMAT        PPM     
INPUT_CONVERT   *
GOP_SIZE        10
SLICES_PER_FRAME  1

INPUT_DIR       .

INPUT
new.*.ppm [1-60] 
END_INPUT

PIXEL           HALF
RANGE           8

PSEARCH_ALG     LOGARITHMIC
BSEARCH_ALG     SIMPLE

IQSCALE         8
PQSCALE         10
BQSCALE         25

REFERENCE_FRAME ORIGINAL

Solution 1

#!/bin/sh
# The value of $1 is the number of frames you want in the mpeg
fix () {
mv -v $1 $2
graph -x -16 16 -y -0.5 1.5 <$2|plot2ps>$2.ps
convert -geometry 288x400 $2.ps new.$2.ppm
rm $2
rm $2.ps
}
count=`cat data|wc -l`
echo "The data file has $count lines" 
m=$1
points=`expr $count / $m`
echo "This is divided into $m files, with $points data points per file"
split -l $points data
n=1
for file in $( ls x* ); do
fix $file $n
n=$(($n+1))
done
# now encode the mpeg size 288x400 into file new.mpg
mpeg_encode file 
# clean up all of the construction mess
rm new.*.ppm
echo "The mpeg file is called new.mpg. It would be a good idea to rename it." 

Using Shell Scripts for Creating Animations
Below is a partially reprinted paper illustrating the use of shell scripts for automating the process of animation from raw data.

Introduction
The Unix operating system provides an incredible number of simple file manipulation utilities and programmable shells in which to execute scripts composed of these basic programs, each performing a single function, with data piped or redirected from one program into another. We will present in this short article a tutorial on the animation of computer simulation output in the Unix environment using only freely available programs and utilities present on nearly any Unix system.

Building an Animation Toolkit

The basic philosophy of the Unix operating system is to maintain maximum code reusability. This essentially means that Unix applications are rarely of a one size fits all nature. Applications tend to be specialized and each one may be very good for a particular job, but few are written to be complete solutions to a wide variety of tasks. This is why the Unix shells are so powerful: they can be used to construct scripts that call on individual programs to perform a job, and then pipe the results into another program to be processed again, until in the end any desired task has been completed in an apparently seamless way.

Unix shell scripts are very nicely suited to the automation of repetitious jobs, such as bulk file conversion. The script itself resembles a DOS batch file, with a few important differences. Unix shells not only are user interfaces for executing commands, they also support variable definitions, loops, conditionals and cases. As a simple example, suppose that we have written a large LaTeX document with several dozen line drawings created by the graphics editor Xfig, and need to convert all of them into PostScript files for inclusion into the document. This can be done by hand but will be time consuming. That time would be better spent writing a script that performs the entire job, and that we could use again under similar circumstances. For example

#!/bin/bash
# A shell script for bulk .fig - > .ps conversion
for file in $(ls *.fig)
do
base=$(basename $file  .fig)
fig2dev  -L  ps  $file  $base.ps
done
exit  0

The main features of the script are the following. The first line is a special comment that indicates that the script is a bash shell script. Bash is used as the default shell on Linux systems. The second line is a typical comment explaining the function of the script. The script itself begins with a line that creates a list of files with file extension .fig. This is generated by the list command ls. A variable file takes its values from this list. Next we run through a loop over the elements in our list. For each .fig file in the list we extract the basename using the basename command, which takes a file name and reports it sans file extension, and store the result in a new variable base. Variables have names and values. To access a variable's value we prepend a dollar sign to its name. The next line performs the file conversion using the command fig2dev, creating a PostScript file with the same basename as the input file. Fig2dev has several output language options that are specified with the -L switch. Finally the end of the loop is denoted and the script is exited normally, without echoing any error statement to the console.
To run this script, one first needs to save it to a named file, such as fig2ps and make it executable

chmod +x fig2ps

Now the script can be used to perform bulk file conversions by invoking it by name in the directory containing the source .fig files.

Bash is not the only shell available to the Unix programmer. Most Unix machines also offer csh, ksh, tcsh, pdksh, and the ultimate shell zsh. Programming in any one of these shells is much the same as in any other, with differences in syntax. Most syntax is the same in bash and pdksh; however the tcsh shell does not support functions. The tcsh shell is more forgiving in regard to the placement of spaces in a line of code setting a variable, whereas bash and pdksh do not accept spaces around the equal sign. Our choice of bash for the examples in this article is motivated by the need to use a shell that supports functions.

Any standard Unix distribution or flavour will provide a vast number of file utilities for file manipulation and processing. In addition to these we need programs that process or produce graphics files from raw data. Any good graphics toolkit should contain the ImageMagick and netpbm program packages. ImageMagick consists of several programs that can be compiled with support for a wide variety of graphics file formats. There is a file format conversion program convert, a viewer display and several other extremely useful tools. ImageMagick can be compiled with mpeg support, and can therefore produce digital motion picture files. However ImageMagick is an XWindows program, and many Unix or Unix-like systems such as NextStep do not use XWindows. We suggest augmenting the toolkit with the standard Berkeley mpeg encoder mpeg_encode because it is very easy to use, is fast and highly configurable, and does not require XWindows.
Netpbm is a general purpose graphics toolkit for manipulation of ppm and pnm files, and many other graphics applications rely on having Netpbm on the system. Together with ghostscript, netpbm is capable of handling all of the file conversions needed to build mpeg or animated gif files without any need for XWindows programs.
In addition we must have a package for converting raw data from a simulation into a plot or graph. There are many very sophisticated tools available for this. As an example we will use gnu graphics, which is a small but powerful command line application that is very well suited to being used from within a shell script.
The PlotMTV program can be run from the command line like graphics and also produces PostScript output. Two and three dimensional vector plots are its forte, which makes it well suited to animations of electromagnetic and hydrodynamic simulations.
All of these programs are freely available in source form and compile on nearly any Unix machine.

The mpeg_encode program is highly configurable through a parameter file that must be passed to the program as an argument when it is called on the command line. Mpeg_encode can read images in several different file formats for encoding, but the best results seem to come from the ppm image format. Below is an example parameter file, which is the default file supplied with the program with a few minor changes.

# mpeg_encode parameter file

PATTERN IBPBIBPBPB
OUTPUT new.mpg

YUV_SIZE 400x400

BASE_FILE_FORMAT PPM
INPUT_CONVERT *
GOP_SIZE 10
SLICES_PER_FRAME 1

INPUT_DIR .

INPUT
new.*.ppm [1-60]
END_INPUT

PIXEL HALF
RANGE 8

PSEARCH_ALG LOGARITHMIC
BSEARCH_ALG SIMPLE

IQSCALE 8
PQSCALE 10
BQSCALE 25

REFERENCE_FRAME ORIGINAL
# end parameter file

The size of the images can be specified, as well as the number of frames to be encoded and the names of the separate images that will comprise the finished animation. Note the input image format new.*.ppm, in which the wildcard takes integer values from 1 to 60. The manual accompanying the source code documents the meaning of all other parameters, but detailed knowledge of these items is not needed to use the program effectively.
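Since the frame range in the INPUT section has to agree with the number of images actually produced, it can be convenient to generate that section rather than edit it by hand. The following is only a sketch: input_block is a hypothetical helper name, and it assumes the frames are named new.1.ppm, new.2.ppm and so on, as in the parameter file above.

```shell
#!/bin/bash
# input_block N: print an INPUT section covering frames new.1.ppm ... new.N.ppm
# (hypothetical helper; the layout is copied from the parameter file above)
input_block () {
  printf 'INPUT\n'
  printf 'new.*.ppm [1-%s]\n' "$1"
  printf 'END_INPUT\n'
}

input_block 30
```

The output of this function could be pasted into the parameter file in place of the existing INPUT section when a different number of frames is wanted.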

An Example: Gaussian Packets Incident on a Potential

Animation of collision processes such as a Gaussian nondispersive packet incident on a finite square well can illustrate phenomena that otherwise are easily lost in the arithmetic of an analytical solution. The first step in building the animation is of course generation of the raw data. Since all of the complex graphics processing will be done by other applications, a completely bare bones program can be used to simply dump data for the graph of the wavefunction into a file for a sequence of time values. As an example, the simple C++ program below does nothing more than write the value of $|\psi(x,t)|^2$ for each x value within a given minimum and maximum range into a file called data for 60 sequential time values that differ incrementally.
[Figure: a typical snapshot of the wave, as plotted by graph from the output of the program below.]

// Gaussian packets incident on a square well
#include <iostream>
#include <fstream>
#include <math.h>
using namespace std;
float Den[61];
float ReR[61];
float ImR[61];
float ReT[61];
float ImT[61];
float ReA[61];
float ImA[61];
float ReB[61];
float ImB[61];
float Amp[61];

int main()
{
ofstream f_out("data");

float dx=0.01;
float L=1.0;
float V=20.0;
float dt=0.2;
float dk=0.1;
float n=10;
for(int m=0;m<=60;m++){
float k=(m-n)*dk;
float lam=sqrt(k*k+V);

// some definitions to avoid repetitious calculations

float k2=k*k;
float lam2=k2+V;
float prod1=2.0*k*lam;
float prod2=lam2+k2;
float C=cos(2.0*L*lam);
float S=sin(2.0*L*lam);
float factor1=prod1*C;
float factor2=prod2*S;

// the reflection and transmission coefficients for each k

 Den[m]=(4.0*k2*lam2)+(V*V*S*S);
 ReR[m]=-V*S*factor2/Den[m];
 ImR[m]=V*S*factor1/Den[m];
 ReT[m]=prod1*factor1/Den[m];
 ImT[m]=prod1*factor2/Den[m];
 ReA[m]=k*(k+lam)*factor1/Den[m];
 ImA[m]=k*(lam+k)*factor2/Den[m];
 ReB[m]=k*(lam-k)*factor1/Den[m];
 ImB[m]=k*(lam-k)*factor2/Den[m];
 Amp[m]=exp(-0.25*(k-2.0)*(k-2.0));
}
for(int Index=0;Index<60;Index++){ // 60 snapshots, matching the 60 frames named in the parameter file
float t=-6.0+Index*dt;
// this code draws the well for each snapshot
f_out<<-15.0<<" "<<0.0<<"\n";
f_out<<-1.0<<" "<<0.0<<"\n";
f_out<<-1.0<<" "<<-9.0<<"\n";
f_out<<1.0<<" "<<-9.0<<"\n";
f_out<<1.0<<" "<<0.0<<"\n";
f_out<<15.0<<" "<<0.0<<"\n";
f_out<<-7.0<<" "<<0.0<<"\n";

// the wave to the left of the well

for(int l=-700;l<=-100;l++){
float x=l*dx;
float RSum=0.0;
float ISum=0.0;
for(int m1=0;m1<=60;m1++){
float k=(m1-n)*dk;
float arg1=k*x-t*fabs(k);
float Rright=cos(arg1);
float Iright=sin(arg1);
float arg2=k*x+2.0*k*L+t*fabs(k);
float Rleft=cos(arg2);
float Ileft=-sin(arg2);
float Repsi=Rright+(ReR[m1]*Rleft)-(ImR[m1]*Ileft);
float Impsi=Iright+(ReR[m1]*Ileft)+(ImR[m1]*Rleft);
RSum=RSum+dk*Repsi*Amp[m1];
ISum=ISum+dk*Impsi*Amp[m1];
}
float psi2=RSum*RSum+ISum*ISum;
f_out<<x<<" "<<psi2<<"\n";
}
// the wave within the well

for(int l1=-99;l1<=99;l1++){
float x=l1*dx;
float RSum=0.0;
float ISum=0.0;
for(int m3=0;m3<=60;m3++){
float k=(m3-n)*dk;
float lam=sqrt(k*k+V);
float arg3=lam*x-k*L-lam*L-t*fabs(k);
float arg4=lam*x+k*L-lam*L+t*fabs(k);
float Rright=cos(arg3);
float Iright=sin(arg3);
float Rleft=cos(arg4);
float Ileft=-sin(arg4);
float Repsi=(ReB[m3]*Rleft)-(ImB[m3]*Ileft)+(ReA[m3]*Rright)-(ImA[m3]*Iright);
float Impsi=(ReB[m3]*Ileft)+(ImB[m3]*Rleft)+(ReA[m3]*Iright)+(ImA[m3]*Rright);
RSum=RSum+dk*Repsi*Amp[m3];
ISum=ISum+dk*Impsi*Amp[m3];
}
float psi2=RSum*RSum+ISum*ISum;
f_out<<x<<" "<<psi2<<"\n";
}

// the wave to the right of the well

for(int l2=100;l2<=700;l2++){
float x=l2*dx;
float RSum=0.0;
float ISum=0.0;
for(int m2=0;m2<=60;m2++){
float k=(m2-n)*dk;
float arg5=k*x-2.0*k*L-t*fabs(k);
float Rright=cos(arg5);
float Iright=sin(arg5);
float Repsi=(ReT[m2]*Rright)-(ImT[m2]*Iright);
float Impsi=(ReT[m2]*Iright)+(ImT[m2]*Rright);
RSum=RSum+dk*Repsi*Amp[m2];
ISum=ISum+dk*Impsi*Amp[m2];
}
float psi2=RSum*RSum+ISum*ISum;
f_out<<x<<" "<<psi2<<"\n";
}
}
f_out.close();
return(0);
}

The program also draws the well for each time value. The wavefunction is a superposition of monochromatic waves and is constructed using the exact values of the transmission and reflection coefficients for each monochromatic component. For example, the wave transmitted beyond the well is

\[
\psi_T(x,t) = \sum_k e^{-\frac{(k-k_0)^2}{2\sigma^2}}
\left[ \frac{(2k\lambda)^2 \cos 2\lambda L + i\,(2\lambda k)(\lambda^2+k^2) \sin 2\lambda L}{(2\lambda k)^2 + (\lambda^2-k^2)^2 \sin^2 2\lambda L} \right]
e^{i(kx - 2kL - c|k|t)}
\]
We are summing over a sampling of k values with a Gaussian amplitude factor and a standard Schrodinger picture phase factor to make the wave time dependent. We have used the dispersion relation
\[ E_k = c|k| \]
rather than
\[ E_k = \frac{\hbar^2 k^2}{2m} \]
so that the wave does not disperse. This way we are seeing only scattering effects in the time evolution of the packet. Within the potential well
\[
V(x) = \begin{cases} 0, & |x| > L \\ -V_0, & |x| \le L \end{cases}
\]
the wavevector $\lambda$ satisfies
\[ \frac{\lambda^2}{2} - V_0 = \frac{k^2}{2} \]
The actual plotting of the data and image processing is accomplished with a shell script that makes calls to standard Unix file commands such as ls, wc, split, mv, and cat, which perform listing, word or line counting, file splitting, file moving, and concatenation to the console or standard output. In addition the script will call graph, convert, and mpeg_encode to plot the figures and encode them into an mpeg stream. The script itself is rather short, and we have augmented it with a few user-friendly features such as a case test to determine if the script has been properly invoked.

#!/bin/bash
# The value of $1 is the number of frames you want in the mpeg.
# You must set this as well in the parameter file.

# fix: rename snapshot $1 to the frame number $2, plot it, and
# convert the plot to a 400x400 ppm image named new.$2.ppm
fix () {
  mv -v "$1" "$2"
  graph -T ps -x -10 10 -y -4 16 < "$2" > "$2.ps"
  convert -geometry 400x400 "$2.ps" "new.$2.ppm"
  # here we are just saving disk space
  rm "$2"
  rm "$2.ps"
}

export m=$1
case "$m" in
  "h" | "H" | "help" | "Help" ) echo "Usage: showtime [number of frames]"; exit 0 ;;
  -* ) echo "That is not a valid number of frames"; exit 1 ;;
  * )
    if [ -e data ]; then
      count=`cat data | wc -l`
      echo "The data file has $count lines"
      m=$1
      points=`expr $count / $m`
      echo "This is divided into $m files, with $points data points per file"
      split -l $points data
      n=1
      for file in $( ls x* ); do
        fix $file $n
        n=$(($n+1))
      done
      # now encode the mpeg size 400x400 into file new.mpg
      mpeg_encode parameters
      # clean up after ourselves
      rm new.*.ppm
      echo "The mpeg file is called new.mpg. It would be a good idea to rename it"
    else
      echo "There is no data file to process."
      exit 1
    fi ;;
esac
exit 0
The first few lines are familiar from our earlier filter script, but the following lines introduce another capability of the Unix shell: user defined functions. Our script uses a function or procedure called fix that performs the repetitious plotting tasks. Fix will plot each snapshot of the wavefunction after the data file has been carved up and will also convert the plots into the ppm format. Fix has two arguments, $1 and $2, which are the names of the input and output files respectively.
To plot the data files, fix invokes graph with the appropriate command line switches for producing a plot with the correct range and domain. It does so by first renaming the input file with the value of the second argument passed to it, which will be an integer that labels where in the stream this particular plot will be encoded. Graph is called with the -T ps switch, so it writes a postscript file of the plotted data, which is next converted into a 400x400 pixel ppm file using convert. On a computer without XWindows, this step can be accomplished with netpbm and ghostscript by substituting

gs -sDEVICE=ppm -sOutputFile=- -dNOPAUSE -q $2.ps -c showpage -c quit | pnmcrop > new.$2.ppm

for

convert -geometry 400x400 $2.ps new.$2.ppm

All of the construction debris made by fix is cleaned up before the procedure is exited.
To call the script, one gives it a name, makes it executable, and then calls it by name together with a parameter equal to the number of frames in the desired mpeg. This equals the number of snapshots that the data file will be cut into. Once the script begins execution it tests to see if this parameter is in an acceptable range. If it is passed a negative number, the script exits with an appropriate error message. If a parameter such as h, H, help or Help is passed to the script, it will exit without error and echo usage instructions to the console.
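The case test described above can be tightened a little. The sketch below is not part of the original script: check_frames is a hypothetical function that classifies an argument in the same spirit, and additionally treats an empty argument, zero, and anything that is not a whole number as invalid.

```shell
#!/bin/bash
# check_frames ARG: classify a command line argument (hypothetical helper).
# Prints "usage" for a help request, "invalid" for an unusable frame count,
# and "ok" for a positive whole number.
check_frames () {
  case "$1" in
    h | H | help | Help ) echo "usage" ;;
    "" | 0 | *[!0-9]* )   echo "invalid" ;;  # empty, zero, negative, or not a whole number
    * )                   echo "ok" ;;
  esac
}

check_frames help   # prints usage
check_frames -5     # prints invalid
check_frames 60     # prints ok
```

The pattern *[!0-9]* matches any argument containing a character that is not a digit, so it catches a leading minus sign as well as stray letters.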

Assuming that an acceptable number of frames has been passed to the script, it inspects the current directory for a file called data. If the file exists, the remaining steps in the program are executed. First a variable "count" is set equal to the number of lines in the data file. This number is extracted by concatenating data to the console and piping it into the wordcount program with the line count switch, which returns the line count. The number of points to be plotted per snapshot is next calculated by dividing this line count by the number of frames passed to the script at invocation. The data file is next split into this many snapshots by the Unix program split. The default behaviour of split is to name the new files xaa, xab, and so forth. This ensures that the Unix ls command will list them in correct sequential order. If they were numerically named, ls would list file "10" before file "2", in other words nonsequentially.
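The division step deserves a small check, because expr discards any remainder and the pieces made by split would then no longer line up with the snapshots. In the C++ program above each snapshot occupies 7 + 601 + 199 + 601 = 1408 lines (the well outline plus the three regions). The sketch below uses a hypothetical function, points_per_frame, and assumes a data file holding 60 such snapshots.

```shell
#!/bin/bash
# points_per_frame COUNT FRAMES: print COUNT / FRAMES, and fail if the
# division leaves a remainder (hypothetical helper, not in the script above).
points_per_frame () {
  if [ $(( $1 % $2 )) -ne 0 ]; then
    echo "line count $1 is not a multiple of $2" >&2
    return 1
  fi
  echo $(( $1 / $2 ))
}

points_per_frame 84480 60    # prints 1408
```

If the function fails, the number of frames passed to the script does not match the number of snapshots in the data file.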
In order to encode the snapshots sequentially they must be renamed numerically, since this is how mpeg_encode expects to find them, and they must be graphed. This is accomplished in the following way. First all files in the current directory beginning with the character "x" are listed lexicographically, which is also sequentially according to the names given them by split. A loop is initiated that runs over the files in this list, passing each to fix along with the index of the loop, an integer, as second argument. Fix does its work, graphing each snapshot and renaming the plots in a fashion understandable by mpeg_encode.
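The split and rename sequence can be tried on a toy file without any of the graphics programs. This is a sketch only: renumber is a hypothetical function that performs just the renaming part of fix, and the six-line data file is stand-in data created in a scratch directory.

```shell
#!/bin/bash
# renumber DIR: rename the pieces xaa, xab, ... found in DIR to 1, 2, ...
# in the order the shell lists them (hypothetical helper).
renumber () {
  ( cd "$1" || exit 1
    n=1
    for file in x*; do
      mv "$file" "$n"
      n=$(( n + 1 ))
    done )
}

dir=$(mktemp -d)                          # scratch directory
printf '%s\n' a b c d e f > "$dir/data"   # six lines of stand-in data
( cd "$dir" && split -l 2 data )          # creates xaa, xab, xac
renumber "$dir"
ls "$dir"                                 # the pieces are now named 1, 2 and 3
```

The parentheses run the directory change in a subshell, so the caller's working directory is left alone.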
We are left with a working directory filled with the plots ready for encoding. Mpeg_encode is called with its parameter file to complete the final job of building the motion picture from the sequence of plots, and after this is done, the script exits without error.
This last step can be performed by ImageMagick itself, with no call to mpeg_encode if ImageMagick has been compiled with mpeg support. In this case, we can substitute the line that invokes mpeg_encode with

convert -loop 1 -size 400x400 -interlace line new.*.ppm new.mpg

The animated gif format, gif89, is another option that may be more desirable if the final animation is going to be viewed with a web browser. Most modern web browsers support this file format, and fortunately so does ImageMagick. The command line in the script for this type of output file would be

convert -loop 1 -size 400x400 -interlace line new.*.ppm new.gif

Animated gif files tend to be much larger than mpeg files, which may or may not be an important consideration if one needs to be able to work with the results within a web environment.
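One detail to watch when handing the frames to convert through the wildcard new.*.ppm: the shell expands the wildcard in lexicographic order, so new.10.ppm comes before new.2.ppm, the same ordering issue noted earlier for ls. A possible way around this, sketched here with a hypothetical helper called numeric_order, is to sort the names on their numeric field first.

```shell
#!/bin/bash
# numeric_order: read file names of the form new.N.ppm on standard input
# and print them sorted by the integer N, the second dot-separated field
# (hypothetical helper, not part of the original script).
numeric_order () {
  sort -t . -k 2,2n
}

printf '%s\n' new.10.ppm new.2.ppm new.1.ppm | numeric_order
# prints new.1.ppm, new.2.ppm, new.10.ppm, one per line
```

In the script the sorted list could then stand in for the wildcard, for example as $(ls new.*.ppm | numeric_order) on the convert command line, assuming the file names contain no spaces.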

For certain types of graphics applications it may be necessary to create images with transparent backgrounds. Indexed graphics formats such as Gif89 support this. To create an image with a transparent background, first create it on a uniform monocolor background, say for example white (#FFFFFF in RGB) in the ppm format. The netpbm toolkit contains the program ppmtogif which accepts command line directives to render a certain color transparent. For example

ppmtogif -trans "#FFFFFF" image.ppm > image.gif

will perform the needed conversion. This is very handy for creating web applications in which an image or animation is desired such that the background can show through the image.


File translated from TEX by TTH, version 1.93.
On 10 Jun 1999, 17:47.