It isn't easy to see how to read a file line-by-line in a shell script. And while you can write a file line-by-line by using the file-appending operator >> (two right angle brackets) with each command that should add to the file, there's a more efficient way to do that as well.
The trick is to open the file and associate a file descriptor number (3, 4, ..., 9) with it. UNIX keeps a file pointer, like a bookmark in a book, that tells it where the next read or write should be in each open file. For example, if you open a file for reading and read the first line, the file pointer will stay at the start of the second line. The next read from that same open file will move the pointer to the start of the third line. This trick only works with files that stay open; each time you open a file, the file pointer is set to the start of the file. [1] The Bourne shell exec command (45.7) can open a file and associate a file descriptor with it. For example, this exec command makes the standard input of all following commands come from the file formfile:
[1] The file-appending operator >> sets the pointer to the end of the file before the first write.
    ...all commands read their stdin from default place
    exec < formfile
    ...all commands will read their stdin from formfile
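The file-pointer behavior described above can be seen in a minimal sketch, assuming a POSIX-style shell where read accepts a redirection such as <&3. The demo file name and its contents are made up for illustration.

```shell
#!/bin/sh
# Hypothetical demo file; the name and contents are illustrative only.
demo=${TMPDIR:-/tmp}/fpdemo.$$
printf 'first\nsecond\nthird\n' > "$demo"

exec 3< "$demo"     # open once; fd 3 keeps its own file pointer
read a <&3          # reads the first line
read b <&3          # the pointer has moved on, so this reads the second line
exec 3<&-           # close fd 3

# Re-opening the file for each read puts the pointer back at the start:
read c < "$demo"
read d < "$demo"    # reads the first line again

echo "open once: $a, $b"
echo "re-opened: $c, $d"
rm -f "$demo"
```

The two reads through fd 3 return different lines because the file stays open between them; the two reads that each reopen the file both return its first line.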
There's another way to rearrange file descriptors: put the redirection on the last line of a while loop, an if statement, or a case statement. For example, all commands in the while loop below will take their standard inputs from the file formfile. The standard input outside the while loop isn't changed:
    ...all commands read their stdin from default place
    while ...
    do
        ...all commands will read their stdin from formfile
    done < formfile
    ...all commands read their stdin from default place
I call those "redirected-I/O loops." Those and other Bourne shell structures have some problems (45.23), but they're usually worth the work to solve.
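Here is a small sketch of a redirected-I/O loop, with a made-up demo file. It assumes a POSIX-style shell, where the loop runs in the current shell and the variables set inside it are still visible afterward; some older Bourne shells run such a loop in a subshell, so don't count on that everywhere.

```shell
#!/bin/sh
# Hypothetical demo file; the name and contents are illustrative only.
demo=${TMPDIR:-/tmp}/loopdemo.$$
printf 'Name:\nCity:\n' > "$demo"

count=0
labels=
while read label
do
    # Every command in the loop body shares the loop's stdin (the demo file)
    count=`expr $count + 1`
    labels="$labels$label "
done < "$demo"
# Out here, stdin is whatever it was before the loop started.

echo "read $count lines: $labels"
rm -f "$demo"
```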
We'll use all that to make a shell script for filling in forms. The script, formprog, reads an empty form file like this one, line by line:
    Name:
    Address:
    City:
    State/Province:
    Phone:
    FAX:
    Project: Corporate Decision
    Comments:
If a line has just a label, like Name:, the script will prompt you to fill it in. If you do, the script will add the completed line to an output file; otherwise, no output line is written.
If a form line is already completed, like:
Project: Corporate Decision
the script doesn't prompt you; it just writes the line to the output file:
    % formprog formfile completed
    Name: Jerry Peek
    Address: 123 Craigie St.
    City: Cambridge
    State/Province: MA
    Phone: (617)456-7890
    FAX:
    Project: Corporate Decision
    Comments:
    % cat completed
    Name: Jerry Peek
    Address: 123 Craigie St.
    City: Cambridge
    State/Province: MA
    Phone: (617)456-7890
    Project: Corporate Decision
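The rule described above (prompt for a bare label, copy a completed line) depends on how read splits a line: the first word goes into the first variable and the rest of the line into the second. The sketch below shows that split with a pair of nested case statements. The classify function is a hypothetical helper for illustration only; it is not part of formprog.

```shell
#!/bin/sh
# classify is a hypothetical helper, not part of formprog.
# $1 = first word of a form line, $2 = the rest of the line
classify() {
    case "$1" in
    ?*:)    # first word ends with a colon, so it is a label
        case "$2" in
        ?*) echo complete ;;    # label plus text: copy the line as-is
        *)  echo prompt ;;      # label only: ask the user to fill it in
        esac
        ;;
    *)  echo bad ;;             # no label at all
    esac
}

# read puts the first word in label and everything after it in text
read label text <<EOF
Project: Corporate Decision
EOF
kind1=`classify "$label" "$text"`

read label text <<EOF
Name:
EOF
kind2=`classify "$label" "$text"`

echo "$kind1 $kind2"
```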
Here's the formprog script. The line numbers are for reference only; don't type them into the file. There's more explanation after the script:
     1  #!/bin/sh
     2  # formprog - fill in template form from $1, leave completed form in $2
     3  # TABSTOPS ARE SET AT 4 IN THIS SCRIPT
     4
     5  template="$1" completed="$2" errors=/tmp/formprog$$
     6  myname=`basename $0`    # BASENAME OF THIS SCRIPT (NO LEADING PATH)
     7  trap 'rm -f $errors; exit' 0 1 2 15
     8
     9  # READ $template LINE-BY-LINE, WRITE COMPLETED LINES TO $completed:
    10  exec 4<&0    # SAVE ORIGINAL stdin (USUALLY TTY) AS FD 4
    11  while read label text
    12  do
    13      case "$label" in
    14      ?*:)    # FIRST WORD ENDS WITH A COLON; LINE IS OKAY
    15          case "$text" in
    16          ?*) # SHOW LINE ON SCREEN AND PUT INTO completed FILE:
    17              echo "$label $text"
    18              echo "$label $text" 1>&3
    19              ;;
    20          *)  # FILL IT IN OURSELVES:
    21              echo -n "$label "
    22              exec 5<&0    # SAVE template FILE FD; DO NOT CLOSE!
    23              exec 0<&4    # RESTORE ORIGINAL stdin TO READ ans
    24              read ans
    25              exec 0<&5    # RECONNECT template FILE TO stdin
    26              case "$ans" in
    27              "") ;;    # EMPTY; DO NOTHING
    28              *)  echo "$label $ans" 1>&3 ;;
    29              esac
    30              ;;
    31          esac
    32          ;;
    33      *)  echo "$myname: bad $1 line: '$label $text'" 1>&2; break;;
    34      esac
    35  done <"$template" 2>$errors 3>"$completed"
    36
    37  if [ -s $errors ]; then
    38      /bin/cat $errors 1>&2
    39      echo "$myname: should you remove '$completed' file?" 1>&2
    40  fi
Line 10 uses the 4<&0 operator (45.21) to save the location of the original standard input - usually your terminal, but not always - as file descriptor 4. [2] (We'll need to read that original stdin in line 24.)
[2] We can't assume that standard input is coming from a terminal. If we do, it prevents you from running formprog this way:
    % command-generator-program | formprog
    % formprog < command-file
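The save-and-restore idea behind line 10 can be tried on its own. In this minimal sketch, read_first_line is a hypothetical helper (not part of formprog) and the demo file is made up; whatever standard input the script started with is parked on file descriptor 4 and put back afterward.

```shell
#!/bin/sh
# read_first_line is a hypothetical helper, not part of formprog.
# It reads the first line of the file named in $1 into $line while
# keeping the caller's stdin safe on fd 4.
read_first_line() {
    exec 4<&0       # save original stdin (terminal, pipe, or file) as fd 4
    exec 0< "$1"    # stdin now comes from the named file
    read line       # reads from the file, not from the original stdin
    exec 0<&4       # put the original stdin back on fd 0
    exec 4<&-       # close the spare descriptor
}

demo=${TMPDIR:-/tmp}/fddemo.$$
printf 'from the file\n' > "$demo"
read_first_line "$demo"
echo "$line"
rm -f "$demo"
```

Because the original descriptor is only duplicated, never reopened by name, the same steps work whether stdin was a terminal, a pipe, or a redirected file.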
During lines 11-35 of the redirected-I/O while loop: all commands' standard input comes from the file named in $template, all standard error goes to the $errors file, and anything written to file descriptor 3 is added to the $completed file. UNIX keeps file pointers for all those open files - so each read and write is done just past the end of the previous one.
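The three redirections on the loop's last line can be sketched in isolation. The file names and the keep/skip input below are invented for the example; only the shape of the done line mirrors formprog.

```shell
#!/bin/sh
# Hypothetical scratch directory and input; names are illustrative only.
dir=${TMPDIR:-/tmp}/redir.$$
mkdir "$dir"
printf 'keep one\nskip two\nkeep three\n' > "$dir/in"

while read word rest
do
    case "$word" in
    keep) echo "$rest" 1>&3 ;;                  # goes to the file open on fd 3
    *)    echo "ignored: $word $rest" 1>&2 ;;   # goes to the stderr file
    esac
done < "$dir/in" 2> "$dir/err" 3> "$dir/out"

out=`cat "$dir/out"`
err=`cat "$dir/err"`
rm -rf "$dir"

echo "fd 3 got:"
echo "$out"
echo "stderr got: $err"
```

Each output file is opened once, when the loop starts, so successive writes to fd 3 land one after another without any need for the >> operator.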
Here's what happens each time the loop is executed:
The read command (44.13) in line 11 reads the next line from its standard input - that's the open $template file.
The case (44.5) in lines 15-31 checks the text from the $template file:
If the text has both a label (ending with a colon (:)) and some other text (stored in $text), the complete line is written to two places. Line 17 writes the line to the standard output - which is probably your screen (it's not redirected by the script, anyway). Line 18 writes the line to file descriptor 3, the open $completed file.
If the text has just a label, line 21 writes the label to standard output (usually your terminal) without a newline. We want to read the answer, at line 24, but there's a problem: on some Bourne shells, the read command can only read from file descriptor 0 and won't let you use operators like <&4 on its command line. So, in line 22, we save a copy of the open $template file descriptor and the location of the open file pointer in file descriptor 5. Line 23 changes standard input so the read in line 24 will read from the right place (usually the terminal). Line 25 adjusts standard input so the next read at the top of the loop (line 11) will come from the $template file. If line 24 doesn't read an answer, line 27 does not write a line. Otherwise, line 28 writes the line to file descriptor 3, the open $completed file.
If the template label doesn't end with a colon, line 33 writes a message to stderr (file descriptor 2). These messages, together with messages to stderr from any other command in the loop, are redirected into the $errors file.
After the loop, if the test (44.20) in line 37 sees any text in the $errors file, the text is displayed in line 38 and the script prints a warning.
The loop keeps reading and writing line by line until the read at the top of the loop reaches the end-of-file of $template.
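On shells where read does accept a redirection (POSIX-style shells do; as noted above, some older Bourne shells don't), the three exec commands around the read in lines 22-25 could be replaced by a single read ans <&4. The sketch below is a cut-down variant under that assumption, not the script's own method. The fill_in function and the file names are hypothetical, and the prompting and error handling are left out to keep it short.

```shell
#!/bin/sh
# fill_in is a hypothetical, cut-down variant of formprog's loop.
# $1 = template file, $2 = output file; answers are read from stdin.
fill_in() {
    exec 4<&0                   # save the caller's stdin as fd 4
    while read label text
    do
        case "$text" in
        ?*) echo "$label $text" 1>&3 ;;     # line is already completed
        *)  read ans <&4                    # answer comes from the saved stdin
            case "$ans" in
            "") ;;                          # empty answer: write nothing
            *)  echo "$label $ans" 1>&3 ;;
            esac
            ;;
        esac
    done < "$1" 3> "$2"
    exec 4<&-                   # close the saved copy
}

# Demo with canned answers in place of a terminal:
dir=${TMPDIR:-/tmp}/fillin.$$
mkdir "$dir"
printf 'Name:\nFAX:\nProject: Corporate Decision\n' > "$dir/template"
printf 'Jerry Peek\n\n' > "$dir/answers"
fill_in "$dir/template" "$dir/completed" < "$dir/answers"
result=`cat "$dir/completed"`
rm -rf "$dir"
echo "$result"
```

The template stays connected to standard input for the whole loop, so there is no descriptor 5 to juggle; the cost is that the script no longer runs on shells with the read restriction.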