[This article was written for SunOS. Many versions of tar don't have some or all of these features. Some do it in a different way. Check your tar manual page, or use the GNU tar (19.6) that we provide on the disc. -JP]
On some systems, make (28.13) creates filenames starting with a comma (,) to keep track of dependencies. Various editors create backup files whose names end with a percent sign (%) or a tilde (~). I often keep the original copy of a program with the .orig extension and old versions with a .old extension.
I often don't want to save these files on my backups. There may be some binary files that I don't want to archive, but don't want to delete either.
A solution is to use the X flag to tar (20.1). [Check your tar manual page for the F and FF options, too. -JIK ] This flag specifies that the matching argument to tar is the name of a file that lists files to exclude from the archive. Here is an example:
% find project ! -type d -print | \
  egrep '/,|%$|~$|\.old$|SCCS|/core$|\.o$|\.orig$' > Exclude
% tar cvfX project.tar Exclude project
In this example, find (17.1) lists all files in the directories, but does not print the directory names explicitly. If you have a directory name in an excluded list, it will also exclude all the files inside the directory. egrep (27.5) is then used as a filter to exclude certain files from the archive. Here, egrep is given several regular expressions to match certain files. This expression seems complex but is simple once you understand a few special characters:
/
The slash is not a special character. However, since no filename can contain a slash, it matches the beginning of a filename, as output by the find command.
|
The vertical bar separates each regular expression.
$
The dollar sign is one of the two regular expression "anchors" and specifies the end of the line, or filename in this case. The other anchor, which specifies the beginning of the line, is ^ (caret). But because we are matching filenames output by find, the only filenames that can match ^ are those in the top directory.
\.
Normally the dot matches any character in a regular expression. Here, we want to match the actual character . (dot), which is why the backslash is used to quote or escape the normal meaning.
A breakdown of the patterns and examples of the files that match these patterns is given here:
Pattern | Matches Files | Used by |
---|---|---|
/, | starting with , | make dependency files |
%$ | ending with % | textedit backup files |
~$ | ending with ~ | emacs backup files |
\.old$ | ending with .old | old copies |
SCCS | in SCCS directory | Source Code Control System (20.13) |
/core$ | with name of core | core dump (52.9) |
\.o$ | ending with .o | object files |
\.orig$ | ending with .orig | original version |
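To see how the patterns in the table behave, you can feed a few made-up pathnames through the same expression. This is only a sketch: the pathnames are invented, the function name excluded is hypothetical, and grep -E stands in for egrep on systems that no longer ship egrep.

```sh
#!/bin/sh
# Hypothetical helper: pass through only the pathnames that the
# article's expression would put in the Exclude file.
excluded() {
    grep -E '/,|%$|~$|\.old$|SCCS|/core$|\.o$|\.orig$'
}

# Invented sample pathnames, in the form find would print them.
printf '%s\n' \
    'project/,depend' \
    'project/main.c' \
    'project/main.c~' \
    'project/main.o' \
    'project/SCCS/s.main.c' \
    'project/core' \
    'project/score' \
    'project/main.c.orig' | excluded
```

Here project/main.c and project/score are not printed: neither ends in one of the listed suffixes, and /core$ needs a slash directly before core. Note that SCCS is not anchored, so it matches any pathname containing those four letters.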
Instead of specifying which files are to be excluded, you can specify which files to archive using the -I option. As with the exclude flag, specifying a directory tells tar to include (or exclude) the entire directory. You should also note that the syntax of the -I option is different from the typical tar flag. The next example archives all C files and makefiles. It uses egrep's () grouping operators to make the $ anchor character apply to all patterns inside the parentheses:
% find project -type f -print | \
  egrep '(\.[ch]|[Mm]akefile)$' > Include
% tar cvf project.tar -I Include
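The effect of the parentheses is easy to check on a few invented pathnames. In the sketch below, grep -E stands in for egrep, and the function names and sample paths are assumptions made for illustration.

```sh
#!/bin/sh
# With grouping: the $ anchor applies to both alternatives.
grouped()   { grep -E '(\.[ch]|[Mm]akefile)$'; }
# Without grouping: $ binds only to [Mm]akefile, so \.[ch] matches anywhere.
ungrouped() { grep -E '\.[ch]|[Mm]akefile$'; }

# Invented sample pathnames.
samples() {
    printf '%s\n' \
        'project/main.c' \
        'project/defs.h' \
        'project/Makefile' \
        'project/notes.html' \
        'project/main.c.orig'
}

echo 'grouped:';   samples | grouped
echo 'ungrouped:'; samples | ungrouped
```

The grouped form selects only main.c, defs.h, and Makefile. The ungrouped form also lets notes.html and main.c.orig through, because .h and .c occur in the middle of those names.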
I suggest using find to create the include or exclude file. You can edit it afterward, if you wish. One caution: extra spaces at the end of any line will cause that file to be ignored.
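Since trailing spaces are easy to introduce while editing, it may be worth stripping them before handing the list to tar. A minimal sketch, assuming the list is named Include as above; the function name strip_trailing and the output name Include.clean are hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: remove spaces and tabs at the end of every line.
strip_trailing() {
    sed 's/[[:space:]]*$//'
}

# Usage sketch: write a cleaned copy and inspect it before replacing Include.
if [ -f Include ]; then
    strip_trailing < Include > Include.clean
fi
```

This would also mangle a file whose name really does end in a space, so look over the cleaned copy before using it.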
One way to debug the output of the find command is to use /dev/null (13.14) as the output file:
% tar cvfX /dev/null Exclude project
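Another way to check the filter, without running tar at all, is to invert it and look at the files that would be left. This is a sketch only: the function name would_archive is hypothetical, and grep -Ev stands in for egrep -v.

```sh
#!/bin/sh
# Hypothetical helper: list the files under directory $1 that survive
# the exclude filter (the inverse of what goes into Exclude).
would_archive() {
    find "$1" ! -type d -print |
        grep -Ev '/,|%$|~$|\.old$|SCCS|/core$|\.o$|\.orig$'
}

# Usage sketch: preview the project directory if it exists here.
if [ -d project ]; then
    would_archive project
fi
```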
There are times when you want to make an archive of several directories. You may want to archive a source directory and another directory like /usr/local. The natural, but wrong, way to do this is to use the command:
% tar cvf /dev/rmt8 project /usr/local
NOTE: When using tar, you must never specify a directory name starting with a slash (/). This will cause problems when you restore a directory, as you will see later (20.10).
The proper way to handle the incorrect example above is to use the -C flag:
% tar cvf /dev/rmt8 project -C /usr local
This will archive /usr/local/... as local/.... Article 20.10 has more information.
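The effect of -C can be tried safely on scratch directories instead of a tape. The sketch below assumes a tar that accepts -C more than once when creating an archive (GNU tar does); every directory and filename in it is invented, and the archive is an ordinary file rather than /dev/rmt8.

```sh
#!/bin/sh
# Build two invented trees standing in for ./project and /usr/local.
work=$(mktemp -d)
mkdir -p "$work/home/project" "$work/usr/local"
echo 'int main(void) { return 0; }' > "$work/home/project/main.c"
echo 'placeholder' > "$work/usr/local/README"

# Change directory before each argument so only relative names are stored.
tar -cf "$work/both.tar" -C "$work/home" project -C "$work/usr" local

# List the stored names; none should start with a slash.
tar -tf "$work/both.tar"

# Remove the scratch directory when finished:
#   rm -rf "$work"
```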
For the above options to work when you extract files from an archive, the pathname given in the include or exclude file must exactly match the pathname on the tape.
Here's a sample run. I'm extracting from a file named appe.tar. Of course, this example applies to tapes, too:
% tar tf appe.tar
appe
code/appendix/font_styles.c
code/appendix/xmemo.c
code/appendix/xshowbitmap.c
code/appendix/zcard.c
code/appendix/zcard.icon
Next, I create an exclude file, named exclude, that contains the lines:
code/appendix/zcard.c
code/appendix/zcard.icon
Now, I run the following tar command:
% tar xvfX appe.tar exclude
x appe, 6421 bytes, 13 tape blocks
x code/appendix/font_styles.c, 3457 bytes, 7 tape blocks
x code/appendix/xmemo.c, 10920 bytes, 22 tape blocks
x code/appendix/xshowbitmap.c, 20906 bytes, 41 tape blocks
code/appendix/zcard.c excluded
code/appendix/zcard.icon excluded
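Because the pathnames must match exactly, it can help to compare the exclude file against the archive's table of contents before extracting. A minimal sketch using fixed-string, whole-line matching; the function name unmatched is hypothetical, and the filenames in the usage lines follow the sample run.

```sh
#!/bin/sh
# Hypothetical helper: print the lines of the exclude file ($2) that do
# not appear verbatim in the table of contents of the archive ($1).
unmatched() {
    listing=$(mktemp) || return 1
    tar tf "$1" > "$listing"
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file
    grep -Fxv -f "$listing" "$2"
    rm -f "$listing"
}

# Usage sketch: any line printed here will not be recognized by tar.
if [ -f appe.tar ] && [ -f exclude ]; then
    unmatched appe.tar exclude
fi
```

An entry such as ./code/appendix/zcard.c would be reported, for example, when the archive stores the name as code/appendix/zcard.c.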
If you're archiving the current directory (.) instead of starting at a subdirectory, remember to start with two pathnames in the Exclude file: the archive that tar creates and the Exclude file itself. That keeps tar from trying to archive its own output!
% cat > Exclude
./somedir.tar
./Exclude
[CTRL-d]
% find . -type f -print | \
  egrep '/,|%$|~$|\.old$|SCCS|/core$|\.o$|\.orig$' >>Exclude
% tar cvfX somedir.tar Exclude .
In that example, we used cat > (25.2) to create the file quickly; you could use a text editor instead. Notice that the pathnames in the Exclude file start with ./; that's what the tar command expects when you tell it to archive the current directory (.). The long find/egrep command line uses the >> operator (13.1) to add other pathnames to the end of the Exclude file.
Or, instead of adding the archive and exclude file's pathnames to the exclude file, you can move those two files somewhere out of the directory tree that tar will read.
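That second approach might look like the sketch below. The function name build_exclude and the /tmp pathnames are assumptions for illustration, and grep -E again stands in for egrep.

```sh
#!/bin/sh
# Hypothetical helper: write the exclude list for the current directory
# to the file named by $1, which should live outside the tree being archived.
build_exclude() {
    find . -type f -print |
        grep -E '/,|%$|~$|\.old$|SCCS|/core$|\.o$|\.orig$' > "$1"
}

# Usage sketch (pathnames are assumptions): keep both files in /tmp.
#   build_exclude /tmp/Exclude
#   tar cvfX /tmp/somedir.tar /tmp/Exclude .
```

With the archive and the exclude file both outside the current directory, neither needs an entry of its own in the list.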