[Chapter 1] 1.4 Using Pipes to Create a New Tool

It's easy enough to imagine a trivial use of pipes (1.3). For example, whenever the output of a command is longer than will fit on a single screen, you might want to pipe to a pager program such as more (25.3), which shows the output a screenful at a time, and waits for you to press a key before it shows the next screen. If you were a writer like me, and wanted to check each "which" that you wrote to find out if any of them should have been "that," you might use the search program grep (27.1) and type:

% grep '[Ww]hich' chapter1 | more

(Article 13.1 has more about pipes.)
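To try the search without a real chapter file, here is a minimal sketch. The sample text and the temporary directory are assumptions made up for illustration, and the pager is left off so that the commands can run unattended:

```shell
# Build a tiny sample file (hypothetical contents) to search.
tmpdir=$(mktemp -d)
cat > "$tmpdir/chapter1" <<'EOF'
Which tool should you use?
The pager, which shows one screen at a time, is handy.
That is the one that I prefer.
EOF

# [Ww] is a bracket expression that matches either W or w,
# so lines containing "Which" or "which" are both selected.
matches=$(grep '[Ww]hich' "$tmpdir/chapter1")
printf '%s\n' "$matches"

# grep -c prints the number of matching lines instead of the lines.
count=$(grep -c '[Ww]hich' "$tmpdir/chapter1")
echo "matching lines: $count"
```

At a terminal you would normally end the first pipeline with | more, as in the command above, once the output grows past a screenful.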

However, if you want to see how pipes can be really useful, you need to be a little more ambitious, or maybe just have a more specialized problem.

For example, the troff (43.13) formatting package (used in our office for typesetting some of our books) includes an indexing feature that allows the user to enter indexing commands of the following form:

.XX "topic, subtopic"

When the document is printed, the formatting package collects these entries, adds page numbers, and assembles the index. It is important that all entries be consistent. For example, if at one point the user makes the entry:

.XX "Indexing, introduction to"

and at another point:

.XX "Index, introduction to"

the program will generate two separate entries rather than merging them into one entry with two page references.

In order to check the consistency of index entries, one could enter the following command:

% cat files | grep .XX | sort -u | more

In this command, files is a list of the files to be checked. grep searches through that text for a specified string or pattern. [1] sort -u (36.6) puts the lines selected by grep in alphabetical order and removes duplicate lines.

[1] The pattern is a regular expression (26.4) in which a dot (.) stands for "any character." To be precise, use the command grep '^\.XX' instead.
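The difference between the two patterns in the footnote is easy to see on a small sample. This is only a sketch: the file name ch01 and its contents, including the MAXX line, are invented for the demonstration.

```shell
# Hypothetical sample: two real index entries plus one ordinary
# text line that happens to contain "XX" after another character.
tmpdir=$(mktemp -d)
cat > "$tmpdir/ch01" <<'EOF'
.XX "Indexing, introduction to"
See the MAXX macro for details.
.XX "Index, introduction to"
EOF

# Unanchored pattern: the dot matches any character, so "AXX"
# inside "MAXX" matches as well as the real ".XX" requests.
loose=$(grep -c '.XX' "$tmpdir/ch01")

# Anchored pattern with an escaped dot: only lines that begin
# with a literal ".XX" are counted.
strict=$(grep -c '^\.XX' "$tmpdir/ch01")

echo "loose pattern:  $loose lines"
echo "strict pattern: $strict lines"
```

On this sample the loose pattern selects all three lines, while the anchored one selects only the two index entries.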

The pipeline is started with the cat (25.2) command, which simply types the files' contents so that the input to the pipeline will be a single, continuous stream of text. (Otherwise grep will print the name of the file in which the string is found, which will keep the lines from being sorted correctly. In some versions of grep, the -h option can be used to suppress filenames. To see if this works on your UNIX system, type grep -h .XX files, omitting cat and the pipe.)
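The effect of the filename prefixes can be sketched with two small files. The names ch01 and ch02 and their contents are assumptions for illustration, and the stricter pattern from the footnote is used in place of .XX:

```shell
# Two hypothetical chapter files that share one identical entry.
tmpdir=$(mktemp -d)
printf '%s\n' '.XX "Indexing, introduction to"' > "$tmpdir/ch01"
printf '%s\n' '.XX "Indexing, introduction to"' \
              '.XX "Index, introduction to"' > "$tmpdir/ch02"

# Given two file arguments, grep prefixes each matching line with
# its filename, so the identical entries no longer compare equal
# and sort -u cannot merge them.
with_names=$(grep '^\.XX' "$tmpdir/ch01" "$tmpdir/ch02" | sort -u)
printf '%s\n' "$with_names"

# Reading one continuous stream from cat leaves the lines bare,
# so sort -u collapses the true duplicate.
merged=$(cat "$tmpdir/ch01" "$tmpdir/ch02" | grep '^\.XX' | sort -u)
printf '%s\n' "$merged"
```

The first listing keeps three lines and the second keeps two. In the second, the remaining pair ("Index" versus "Indexing") is the kind of inconsistency the check is meant to expose; the order in which the two appear depends on your locale's sorting rules.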

This is a very specific, and normally very tedious, job that needs to be done. Because UNIX provides general-purpose tools and an easy way of using them together in a kind of assembly line, you have a relatively simple way to get it done.

But...

"Ugh!" you say, "That's just what I hate about UNIX. All these long filenames and options I can't remember. Who wants to type all that stuff!"

Precisely. That's why UNIX makes it so easy to create custom commands, in the form of aliases (10.2), shell functions (10.9), and shell scripts (1.5).
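As a rough idea of what such a custom command might look like, here is a sketch of a Bourne-style shell function that wraps the pipeline. The name ixcheck is hypothetical, the sample files are invented, and the function uses the stricter pattern from the footnote; it also leaves out more so that the caller can decide whether to page the output.

```shell
# ixcheck (hypothetical name): list the unique index entries
# found in the files named as arguments.
ixcheck() {
    # cat merges the files into one stream so that grep does not
    # prefix matches with filenames; sort -u then drops duplicates.
    cat "$@" | grep '^\.XX' | sort -u
}

# Try it on two small sample files (made-up contents).
tmpdir=$(mktemp -d)
printf '%s\n' '.XX "Indexing, introduction to"' 'Some body text.' \
    > "$tmpdir/ch01"
printf '%s\n' '.XX "Indexing, introduction to"' \
              '.XX "Index, introduction to"' > "$tmpdir/ch02"

ixcheck "$tmpdir/ch01" "$tmpdir/ch02"
```

With the function defined, checking a set of chapters comes down to typing something like ixcheck ch* | more. C shell users would reach for an alias or a shell script instead, since csh has no functions.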

- TOR

