Article 26.4 gives a tutorial introduction to regular expressions. This article is intended for those of you who just need a quick listing of regular expression syntax as a refresher from time to time. It also includes some simple examples. The characters in Table 26.6 have special meaning only in search patterns.
Pattern | What Does it Match? |
---|---|
. | Match any single character except newline. |
* | Match any number (or none) of the single character that immediately precedes it. The preceding character can also be a regular expression. For example, since . (dot) means any character, .* means "match any number of any character." |
^ | Match the following regular expression at the beginning of the line. |
$ | Match the preceding regular expression at the end of the line. |
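[ ] | Match any one of the enclosed characters. A hyphen (-) indicates a range of consecutive characters. A circumflex (^) as the first character in the brackets reverses the sense: it matches any one character not in the list. |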
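\{n,m\} | Match a range of occurrences of the single character that immediately precedes it. The preceding character can also be a regular expression. \{n\} matches exactly n occurrences, \{n,\} matches at least n occurrences, and \{n,m\} matches any number of occurrences between n and m. |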
\ | Turn off the special meaning of the character that follows. |
\( \) | Save the pattern enclosed between \( and \) into a special holding space. Up to nine patterns can be saved on a single line. They can be "replayed" in substitutions by the escape sequences \1 to \9. |
\< \> | Match characters at beginning (\<) or end (\>) of a word. |
+ | Match one or more instances of preceding regular expression. |
? | Match zero or one instances of preceding regular expression. |
| | Match the regular expression specified before or after. |
( ) | Apply a match to the enclosed group of regular expressions. |
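A few of the search metacharacters above can be tried out with grep. The following is only a minimal sketch: the sample lines and the helper names `sample` and `count` are invented for illustration, and the patterns are single-quoted so the shell leaves them alone.

```shell
# Hypothetical sample data, invented for illustration.
sample() {
  printf '%s\n' 'bag' 'big bugs' 'baaag' 'bg' '.so manual' '12345'
}

# Count the sample lines that match a basic regular expression.
count() {
  sample | grep -c "$1"
}

count 'b.g'          # 2: . matches any single character (bag, big)
count 'ba*g'         # 3: * allows zero or more of the preceding a
count '^b'           # 4: ^ anchors the match at the start of the line
count 'g$'           # 3: $ anchors the match at the end of the line
count '^\.'          # 1: \ turns off the special meaning of .
count '[0-9]\{5\}'   # 1: \{5\} requires exactly five digits in a row
```

Each call prints the number of matching lines, which makes it easy to see how one metacharacter changes what is selected.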
The characters in Table 26.7 have special meaning only in replacement patterns.
Pattern | What Does it Do? |
---|---|
\ | Turn off the special meaning of the character that follows. |
\n | Restore the nth pattern previously saved by \( and \). n is a number from 1 to 9. |
& | Re-use the search pattern as part of the replacement pattern. |
~ | Re-use the previous replacement pattern in the current replacement pattern. |
\u | Convert first character of replacement pattern to uppercase. |
\U | Convert replacement pattern to uppercase. |
\l | Convert first character of replacement pattern to lowercase. |
\L | Convert replacement pattern to lowercase. |
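The replacement metacharacters that sed understands can be sketched as follows. The input strings are invented for illustration; the case-conversion escapes (\u, \U, \l, \L) and ~ are left out here because they are not part of standard sed.

```shell
# & re-uses the text matched by the search pattern.
echo 'report' | sed 's/.*/mv & &.old/'
# prints: mv report report.old

# \1 and \2 replay the text saved by the first and second \( \) pairs.
echo 'Die or do' | sed 's/\([Dd]ie\) or \([Dd]o\)/\2 or \1/'
# prints: do or Die

# \& turns off the special meaning of &, giving a literal ampersand.
echo 'AT and T' | sed 's/ and /\&/'
# prints: AT&T
```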
When used with grep or egrep, regular expressions are surrounded by quotes. (If the pattern contains a $, you must use single quotes; e.g., 'pattern'.) When used with ed, ex, sed, and awk, regular expressions are usually surrounded by / (although any delimiter works). Table 26.8 has some example patterns.
Pattern | What Does it Match? |
---|---|
bag | The string bag. |
^bag | bag at beginning of line. |
bag$ | bag at end of line. |
^bag$ | bag as the only word on line. |
[Bb]ag | Bag or bag. |
b[aeiou]g | Second letter is a vowel. |
b[^aeiou]g | Second letter is a consonant (or uppercase or symbol). |
b.g | Second letter is any character. |
^...$ | Any line containing exactly three characters. |
^\. | Any line that begins with a . (dot). |
^\.[a-z][a-z] | Same, followed by two lowercase letters (e.g., troff requests). |
^\.[a-z]\{2\} | Same as previous, grep or sed only. |
^[^.] | Any line that doesn't begin with a . (dot). |
bugs* | bug, bugs, bugss, etc. |
"word" | A word in quotes. |
"*word"* | A word, with or without quotes. |
[A-Z][A-Z]* | One or more uppercase letters. |
[A-Z]+ | Same, egrep or awk only. |
[A-Z].* | An uppercase letter, followed by zero or more characters. |
[A-Z]* | Zero or more uppercase letters. |
[a-zA-Z] | Any letter. |
[^0-9A-Za-z] | Any symbol (not a letter or a number). |
[567] | One of the numbers 5, 6, or 7. |
egrep or awk pattern: | |
five|six|seven | One of the words five, six, or seven. |
80[23]?86 | One of the numbers 8086, 80286, or 80386. |
compan(y|ies) | One of the words company or companies. |
ex or vi pattern: | |
\<the | Words like theater or the. |
the\> | Words like breathe or the. |
\<the\> | The word the. |
sed or grep pattern: | |
0\{5,\} | Five or more zeros in a row. |
[0-9]\{3\}-[0-9]\{2\}-[0-9]\{4\} | US social security number (nnn-nn-nnnn). |
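Some of the example patterns can be checked from the command line. This is a rough sketch under a few assumptions: the sample lines and the `sample_lines` helper are invented, and `grep -E` stands in for egrep (it accepts the same extended patterns).

```shell
# Hypothetical sample lines, invented for illustration.
sample_lines() {
  printf '%s\n' 'an 80286 board' 'one company' 'two companies' \
    'ID 123-45-6789' 'bugs'
}

# Extended patterns (the egrep or awk rows) need grep -E.
sample_lines | grep -E '80[23]?86'       # prints: an 80286 board
sample_lines | grep -E 'compan(y|ies)'   # prints both company lines

# Basic pattern with \{n\} intervals (the sed or grep rows).
sample_lines | grep '[0-9]\{3\}-[0-9]\{2\}-[0-9]\{4\}'   # prints: ID 123-45-6789

# Single quotes keep the shell from touching the $ anchor.
sample_lines | grep 'bugs*$'             # prints: bugs
```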
The following examples show the metacharacters available to sed or ex. (ex commands begin with a colon.) Where a space would otherwise be hard to see, it is marked by ␣; a TAB is marked by tab.
Command | Result |
---|---|
s/.*/( & )/ | Redo the entire line, but add parentheses. |
s/.*/mv & &.old/ | Change a wordlist into mv commands. |
/^$/d | Delete blank lines. |
:g/^$/d | ex version of previous. |
/^[␣tab]*$/d | Delete blank lines, plus lines containing only spaces or TABs. |
:g/^[␣tab]*$/d | ex version of previous. |
s/␣␣*/␣/g | Turn one or more spaces into one space. |
:%s/␣␣*/␣/g | ex version of previous. |
:s/[0-9]/Item &:/ | Turn a number into an item label (on the current line). |
:s | Repeat the substitution on the first occurrence. |
:& | Same. |
:sg | Same, but for all occurrences on the line. |
:&g | Same. |
:%&g | Repeat the substitution globally. |
:.,$s/Fortran/\U&/g | Change word to uppercase, on current line to last line. |
:%s/.*/\L&/ | Lowercase entire file. |
:s/\<./\u&/g | Uppercase first letter of each word on current line (useful for titles). |
:%s/yes/No/g | Globally change a word to No. |
:%s/Yes/~/g | Globally change a different word to No (previous replacement). |
s/die or do/do or die/ | Transpose words. |
s/\([Dd]ie\) or \([Dd]o\)/\2 or \1/ | Transpose, using hold buffers to preserve case. |
- from O'Reilly & Associates' UNIX in a Nutshell (SVR4/Solaris)
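The sed forms of a few commands from the table can be wrapped in small shell functions to try them on real input. This is a sketch only: the function names `delete_blank`, `squeeze_spaces`, and `label_item` are hypothetical, and the TAB character is built with printf so that it stays visible in the script.

```shell
# A literal TAB, built with printf so it stays visible in the script.
tab=$(printf '\t')

# Delete empty lines plus lines containing only spaces or TABs.
delete_blank() {
  sed "/^[ $tab]*\$/d"
}

# Turn each run of one or more spaces into a single space.
squeeze_spaces() {
  sed 's/  */ /g'
}

# Turn the first digit on each line into an item label.
label_item() {
  sed 's/[0-9]/Item &:/'
}

printf 'one\n\n \t\ntwo\n' | delete_blank   # prints "one" and "two"
echo 'a   b  c' | squeeze_spaces            # prints "a b c"
echo '7 apples' | label_item                # prints "Item 7: apples"
```

Note that the space-squeezing pattern needs two spaces before the `*`: the first space must be present, and the second may repeat zero or more times.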