Here are the regular expressions available
in the Reference List.
| Expression |
Syntax |
Description |
| Any character |
. |
Matches any single character except a line
break. |
| Zero or more |
* |
Matches zero or more occurrences of the
preceding expression, making all possible
matches. |
| One or more |
+ |
Matches at least one occurrence of the
preceding expression. |
| Beginning of line |
^ |
Anchors the match string to the beginning of
a line. |
| End of line |
$ |
Anchors the match string to the end of a
line. |
| Beginning of word |
< |
Matches only when a word begins at this
point in the text. |
| End of word |
> |
Matches only when a word ends at this point
in the text. |
| Line break |
\n |
Matches a platform-independent line break.
In a Replace expression, inserts a line break. |
| Any one character in the set |
[] |
Matches any one of the characters within the
[]. To specify a range of characters, list the
starting and ending character separated by a
dash (-), as in [a-z]. |
| Any one character not in the set |
[^...] |
Matches any character not in the set of
characters following the ^. |
| Or |
| |
Matches either the expression before or the
one after the OR symbol (|). Mostly used within
a group. For example,
(sponge|mud) bath matches "sponge bath"
and "mud bath." |
| Escape |
\ |
Matches the character that follows the
backslash (\) as a literal. This allows you to
find the characters used in regular expression
notation, such as { and ^. For example,
\^ Searches for the ^ character. |
| Tagged expression |
{} |
Matches text tagged with the enclosed
expression. |
| C/C++ Identifier |
:i |
Matches the expression ([a-zA-Z_$][a-zA-Z0-9_$]*). |
| Quoted string |
:q |
Matches the expression
(("[^"]*")|('[^']*')). |
| Space or Tab |
:b |
Matches either space or tab characters. |
| Integer |
:z |
Matches the expression ([0-9]+). |
The list of all regular expressions that are
valid in Find and Replace operations is
longer than can be displayed in the Reference List.
You can also insert any of the following regular
expressions into a Find what string:
| Expression |
Syntax |
Description |
| Minimal — zero or more |
@ |
Matches zero or more occurrences of the
preceding expression, matching as few characters
as possible. |
| Minimal — one or more |
# |
Matches one or more occurrences of the
preceding expression, matching as few characters
as possible. |
| Repeat n times |
^n |
Matches n occurrences of the
preceding expression. For example,
[0-9]^4 matches any 4-digit sequence. |
| Grouping |
() |
Groups a subexpression. |
| nth tagged text |
\n |
In a Find or Replace
expression, indicates the text matched by the
nth tagged expression, where n is a
number from 1 to 9. In a Replace expression,
\0 inserts the entire matched text. |
| Right-justified field |
\(w,n) |
In a Replace expression, right-justifies the
nth tagged expression in a field at least
w characters wide. |
| Left-justified field |
\(-w,n) |
In a Replace expression, left-justifies the
nth tagged expression in a field at least
w characters wide. |
| Prevent match |
~(X) |
Prevents a match when X appears at this
point in the expression. For example,
real~(ity) matches the "real" in "realty"
and "really," but not the "real" in "reality." |
| Alphanumeric character |
:a |
Matches the expression ([a-zA-Z0-9]). |
| Alphabetic character |
:c |
Matches the expression ([a-zA-Z]). |
| Decimal digit |
:d |
Matches the expression ([0-9]). |
| Hexadecimal digit |
:h |
Matches the expression ([0-9a-fA-F]+). |
| Rational number |
:n |
Matches the expression
(([0-9]+.[0-9]*)|([0-9]*.[0-9]+)|([0-9]+)). |
| Alphabetic string |
:w |
Matches the expression ([a-zA-Z]+). |
| Escape |
\e |
Unicode U+001B. |
| Bell |
\g |
Unicode U+0007. |
| Backspace |
\h |
Unicode U+0008. |
| Tab |
\t |
Matches a tab character, Unicode U+0009. |
| Unicode character |
\x#### or \u#### |
Matches a character given by Unicode value
where #### is hexadecimal digits. You can
specify a character outside the Basic
Multilingual Plane (that is, a surrogate) with
the ISO 10646 code point or with two Unicode
code points giving the values of the surrogate
pair. |
The following table lists the syntax for
matching by standard Unicode character properties. The
two-letter abbreviation is the same as listed in the
Unicode character properties database. These may be
specified as part of a character set. For example, the
expression [:Nd:Nl:No] matches any kind of digit.
| Expression |
Syntax |
Description |
| Uppercase letter |
:Lu |
Matches any one capital letter. For example,
:Luhe matches "The" but not "the". |
| Lowercase letter |
:Ll |
Matches any one lower case letter. For
example,
:Llhe matches "the" but not "The". |
| Title case letter |
:Lt |
Matches characters that combine an uppercase
letter with a lowercase letter, such as Nj and
Dz. |
| Modifier letter |
:Lm |
Matches letters or punctuation, such as
commas, cross accents, and double prime, used to
indicate modifications to the preceding letter. |
| Other letter |
:Lo |
Matches other letters, such as gothic letter
ahsa. |
| Decimal digit |
:Nd |
Matches decimal digits such as 0-9 and their
full-width equivalents. |
| Letter digit |
:Nl |
Matches letter digits such as roman numerals
and ideographic number zero. |
| Other digit |
:No |
Matches other digits such as old italic
number one. |
| Open punctuation |
:Ps |
Matches opening punctuation such as open
brackets and braces. |
| Close punctuation |
:Pe |
Matches closing punctuation such as closing
brackets and braces. |
| Initial quote punctuation |
:Pi |
Matches initial double quotation marks. |
| Final quote punctuation |
:Pf |
Matches single quotation marks and ending
double quotation marks. |
| Dash punctuation |
:Pd |
Matches the dash mark. |
| Connector punctuation |
:Pc |
Matches the underscore or underline mark. |
| Other punctuation |
:Po |
Matches (,), ?, ", !, @, #, %, &, *, \, (:),
(;), ', and /. |
| Space separator |
:Zs |
Matches blanks. |
| Line separator |
:Zl |
Matches the Unicode character U+2028. |
| Paragraph separator |
:Zp |
Matches the Unicode character U+2029. |
| Non-spacing mark |
:Mn |
Matches non-spacing marks. |
| Combining mark |
:Mc |
Matches combining marks. |
| Enclosing mark |
:Me |
Matches enclosing marks. |
| Math symbol |
:Sm |
Matches +, =, ~, |, <, and >. |
| Currency symbol |
:Sc |
Matches $ and other currency symbols. |
| Modifier symbol |
:Sk |
Matches modifier symbols such as circumflex
accent, grave accent, and macron. |
| Other symbol |
:So |
Matches other symbols, such as the copyright
sign, pilcrow sign, and the degree sign. |
| Other control |
:Cc |
Matches end of line. |
| Other format |
:Cf |
Formatting control character such as the
bi-directional control characters. |
| Surrogate |
:Cs |
Matches one half of a surrogate pair. |
| Other private-use |
:Co |
Matches any character from the private-use
area. |
| Other not assigned |
:Cn |
Matches characters that do not map to a
Unicode character. |
In addition to the standard Unicode
character properties, the following additional
properties may be specified as part of a character set.
| Expression |
Syntax |
Description |
| Alpha |
:Al |
Matches any one character. For example,
:Alhe matches words such as "The",
"then", and "reached". |
| Numeric |
:Nu |
Matches any one number or digit. |
| Punctuation |
:Pu |
Matches any one punctuation mark, such as ?,
@, ', and so on. |
| White space |
:Wh |
Matches all types of white space, including
publishing and ideographic spaces. |
| Bidi |
:Bi |
Matches characters from right-to-left
scripts such as Arabic and Hebrew. |
| Hangul |
:Ha |
Matches Korean Hangul and combining Jamos. |
| Hiragana |
:Hi |
Matches hiragana characters. |
| Katakana |
:Ka |
Matches katakana characters. |
| Ideographic/Han/Kanji |
:Id |
Matches ideographic characters, such as Han
and Kanji. |
|