If you've worked with databases, you'll probably know what to do with the UNIX join command; see your online manual page. If you don't have a database (as far as you know!), you'll still probably have a use for join: combining or "joining" two column-format files. join searches certain columns in the files; when it finds columns that match one another, it "glues the lines together" at that column. This is easiest to show with an example.
I needed to summarize the information in thousands of email messages under the MH mail system. MH made that easy: its scan command gave me almost all the information I wanted about each message, in the format I wanted. But I also had to use wc -l (29.6) to count the number of lines in each message. I ended up with two files, one with scan output and the other with wc output. One field in both files was the message number; I used sort (36.1) to sort the files on that field. I used awk '{print $1 "," $2}' to massage wc output into comma-separated fields. Then I used join to "glue" the matching lines together on the message-number field. (Next I fed the file to a PC running dBASE, but that's another story.)
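As a rough sketch of the sort-and-awk step, here is how a two-column, whitespace-separated file could be sorted on its first field and turned into comma-separated fields. The file name counts.raw and its contents are made up for illustration, and the sketch assumes the message number is already the first column; the intermediate data from the real job isn't shown here.

```shell
# Hypothetical input: "message-number line-count", unsorted
printf '0003 187\n0001 11\n0002 5\n' > counts.raw

# Sort on the (zero-padded) message number, then join the two
# whitespace-separated fields with a comma
sort counts.raw | awk '{print $1 "," $2}' > wcfile

cat wcfile
```

Because the message numbers are zero-padded, a plain lexical sort puts them in the order join expects.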
Here's the file that I told scan to output. The columns (message number, email address, comment, name, and date sent) are separated with commas (,):
    0001,andrewe@isc.uci.edu,,Andy Ernbaum,19901219
    0002,bc3170x@cornell.bitnet,,Zoe Doan,19910104
    0003,zcode!postman@uunet.uu.net,,Head Honcho,19910105
    ...
Here's the file from wc and awk with the message number and number of lines:
    0001,11
    0002,5
    0003,187
    ...
Then, this join command joined the two files at their first columns (-t, tells join that the fields are comma-separated):
    % join -t, scanfile wcfile
The output file looked like:
    0001,andrewe@isc.uci.edu,,Andy Ernbaum,19901219,11
    0002,bc3170x@cornell.bitnet,,Zoe Doan,19910104,5
    0003,zcode!postman@uunet.uu.net,,Head Honcho,19910105,187
    ...
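To try this yourself, the following sketch recreates just the three sample records shown above in two small files and runs the same join command on them. Saving the result in a file named joined is only a convenience for this example.

```shell
# Recreate the scan output shown above (first three records only)
cat > scanfile <<'EOF'
0001,andrewe@isc.uci.edu,,Andy Ernbaum,19901219
0002,bc3170x@cornell.bitnet,,Zoe Doan,19910104
0003,zcode!postman@uunet.uu.net,,Head Honcho,19910105
EOF

# Recreate the wc/awk output (message number, line count)
cat > wcfile <<'EOF'
0001,11
0002,5
0003,187
EOF

# Both files are already sorted on field 1, which join requires.
# -t, makes the comma the field separator for input and output.
join -t, scanfile wcfile > joined
cat joined
```

Note that the empty comment field (the two commas in a row) survives the join, and that the message number appears only once in each output line.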
Of course, join can do a lot more than this simple example shows. See your online manual page.
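As one small taste of that, here is a sketch using two standard join options: -a, which also prints lines from one file that have no match in the other, and -v, which prints only those unmatched lines. The files names and counts and their contents are invented for this illustration.

```shell
# Hypothetical data: message 0002 has no entry in the counts file
printf '0001,alice\n0002,bob\n0003,carol\n' > names
printf '0001,11\n0003,187\n' > counts

# -a 1: print joined lines, plus unpaired lines from the first file
join -t, -a 1 names counts > all_names
cat all_names

# -v 1: print only the lines from the first file that found no match
join -t, -v 1 names counts > missing
cat missing
```

The first command keeps the 0002 line even though it has no line count; the second shows just that line, which is handy for finding records that fell through the cracks.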