3.9 Summary
- Regular expressions describe sets of strings to be matched.
In awk, regular expression constants are written enclosed
between slashes: /…/.
- Regexp constants may be used standalone in patterns and
in conditional expressions, or as part of matching expressions
using the ‘~’ and ‘!~’ operators.
- Escape sequences let you represent nonprintable characters and
also let you represent regexp metacharacters as literal characters
to be matched.
- Regexp operators provide grouping, alternation, and repetition.
- Bracket expressions give you a shorthand for specifying sets
of characters that can match at a particular point in a regexp.
Within bracket expressions, POSIX character classes let you specify
certain groups of characters in a locale-independent fashion.
- Regular expressions match the leftmost longest text in the string being
matched. This matters for cases where you need to know the extent of
the match, such as for text substitution and when the record separator
is a regexp.
- Matching expressions may use dynamic regexps (i.e., string values
treated as regular expressions).
- gawk’s IGNORECASE variable lets you control the
case sensitivity of regexp matching. In other awk
versions, use tolower() or toupper().
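
The points above can be tried from a shell. The commands below are a minimal sketch, not part of the original text: the sample input strings and the variable name pat are made up for illustration, and the outputs shown in comments assume a POSIX-conforming awk (some older implementations lack support for character classes such as [[:digit:]]).

```sh
# Standalone regexp constant used as a pattern.
printf 'foo\nbar\nfood\n' | awk '/foo/'
# prints "foo" and "food"

# Matching expression with the !~ operator.
printf 'foo 1\nbar 2\n' | awk '$1 !~ /^b/ { print $2 }'
# prints "1"

# Escape sequence: \. matches a literal dot, not any character.
printf 'a.b\naxb\n' | awk '/a\.b/'
# prints "a.b"

# Grouping and repetition: one or more copies of "ab", nothing else.
printf 'abab\nab\naba\n' | awk '/^(ab)+$/'
# prints "abab" and "ab"

# Bracket expression with a POSIX character class.
echo 'abc123' | awk '{ gsub(/[[:digit:]]+/, "#"); print }'
# prints "abc#"

# Leftmost longest: both alternatives match at position 3,
# and the longer one ("foobar") is the one chosen.
echo 'xxfoobar' | awk '{ match($0, /foo|foobar/); print RSTART, RLENGTH }'
# prints "3 6"

# Dynamic regexp: the string in pat (hypothetical name) is used as a regexp.
printf '42\nx7\n' | awk -v pat='^[0-9]+$' '$0 ~ pat'
# prints "42"

# Portable case-insensitive match using tolower().
printf 'Hello\nWORLD\n' | awk 'tolower($0) ~ /hello/'
# prints "Hello"

# gawk only: the same effect with IGNORECASE (not run here).
# printf 'Hello\nWORLD\n' | gawk 'BEGIN { IGNORECASE = 1 } /hello/'
```

The tolower() form works in any awk because it normalizes the text before matching, whereas IGNORECASE changes how gawk itself performs the match.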