This section describes the regular expression syntax supported by AdvancedSearch. It does not explain how to use regular expressions; for that, consult one of the many books on the subject (for example, Mastering Regular Expressions by Jeffrey Friedl, published by O'Reilly) or the plentiful resources freely available on the web.
Supported Syntax
| Syntax | Description |
| --- | --- |
| Character Representations | |
| \f | Form feed (equivalent to \x0c) |
| \n | Newline (equivalent to \x0a) |
| \r | Carriage return (equivalent to \x0d) |
| \t | Tab (equivalent to \x09) |
| \a | Alarm (equivalent to \x07) |
| \e | Escape (equivalent to \x1b) |
| \xnn | ASCII character represented by hexadecimal number nn |
| \x{nnnn} | Unicode character represented by hexadecimal number nnnn |
| Character Classes | |
| […] | Any character between the brackets |
| [^…] | Any character except those between the brackets |
| - | A range of characters, e.g. "a" through "z" as in [a-z] |
| . | Any character (period symbol) |
| \w | Any word character (equivalent to [a-zA-Z0-9_]) |
| \W | Any non-word character (equivalent to [^a-zA-Z0-9_]) |
| \s | Any whitespace character (equivalent to [\t\n\r\f]) |
| \S | Any non-whitespace character (equivalent to [^\t\n\r\f]) |
| \d | Any digit (equivalent to [0-9]) |
| \D | Any non-digit (equivalent to [^0-9]) |
| Repetition | |
| {n,m} | Match previous item at least n times but no more than m times |
| {n,} | Match previous item at least n times |
| {n} | Match previous item exactly n times |
| ? | Match previous item zero or one times (equivalent to {0,1}) |
| + | Match previous item one or more times (equivalent to {1,}) |
| * | Match previous item zero or more times (equivalent to {0,}) |
| {}? | As for {} but match as few times as possible |
| ?? | As for ? but match as few times as possible (equivalent to {0,1}?) |
| +? | As for + but match as few times as possible (equivalent to {1,}?) |
| *? | As for * but match as few times as possible (equivalent to {0,}?) |
| Modes | |
| i | Case-insensitive pattern matching |
| m | Treat as multiple lines (^ and $ match internal \n) |
| s | Treat as single line (^ and $ do not match internal \n) |
| Extended Regular Expression | |
| (?#…) | Treat … as comment |
| (?ims) | Turn on listed modes for rest of subexpression |
| (?-ims) | Turn off listed modes for rest of subexpression |
| Grouping | |
| (…) | Group subpattern and store in \1,\2,…,\9 |
| \| | Alternation (vertical bar): match either the subpattern before it or the one after it |
| \n | Match characters as stored in subpattern number n (n is a digit from 1 to 9) |
| Anchors | |
| ^ | Match start of line |
| $ | Match end of line |
| \b | Match word boundary (i.e. the position between a word character (\w) and a non-word character (\W)) |
| \B | Match a position that is not a word boundary |
| \A | Match start of string (differs from ^ only if m mode is turned on) |
| \Z | Match end of string (differs from $ only if m mode is turned on) |
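Several of the entries above can be tried out in any Perl-style regular expression engine. The sketch below uses Python's `re` module as a stand-in, which is an assumption: AdvancedSearch's own engine may differ in detail, so the sketch sticks to constructs that Python shares with the table (Python's `re` does not accept `\e` or `\x{nnnn}`, for instance). The sample strings and variable names are made up for illustration.

```python
import re

# Character representation: \x41 is the ASCII character "A".
hex_match = re.search(r"\x41", "xAy").group(0)

# Character classes: \d is shorthand for the range [0-9].
digits = re.findall(r"\d+", "order 66, aisle 7")
same_digits = re.findall(r"[0-9]+", "order 66, aisle 7")

# Repetition: greedy + takes as much as it can, lazy +? as little as it can.
greedy = re.search(r"<.+>", "<b>bold</b>").group(0)
lazy = re.search(r"<.+?>", "<b>bold</b>").group(0)

# Counted repetition with word-boundary anchors: whole words of 2 or 3 characters.
short_words = re.findall(r"\b\w{2,3}\b", "a an the then")

# Grouping and backreference: \1 matches the text captured by group 1.
repeated_word = re.search(r"\b(\w+) \1\b", "it was was fine").group(1)

# Alternation inside a group: "cat" or "dog", optionally followed by "s".
pets = [m.group(0) for m in re.finditer(r"\b(cat|dog)s?\b", "cats, a dog and a catalog")]

print(hex_match, digits, greedy, lazy, short_words, repeated_word, pets)
```

Under Python's `re`, the greedy pattern returns the whole of `<b>bold</b>` while the lazy one stops at `<b>`, and the alternation example skips `catalog` because `\b` requires a word boundary after the match.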
Note
By default, m mode is switched off and s mode is switched on. Whether i mode is on depends on the options the user selects when starting the search.
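The effect of m mode on the anchors can be illustrated with a short sketch. It again uses Python's `re` module as an assumed stand-in for the AdvancedSearch engine, with the modes switched on through inline `(?m)` and `(?i)` at the start of the pattern. Python's own `s` flag has a different meaning (it lets `.` match a newline), so it is not used here. The sample text is hypothetical.

```python
import re

text = "first line\nsecond line"

# m mode off (the default): ^ and $ match only at the ends of the whole string.
starts_default = re.findall(r"^\w+", text)
ends_default = re.findall(r"\w+$", text)

# m mode on: ^ and $ also match either side of the internal \n.
starts_multiline = re.findall(r"(?m)^\w+", text)
ends_multiline = re.findall(r"(?m)\w+$", text)

# \A and \Z keep matching only at the string ends, even with m mode on.
string_start = re.findall(r"(?m)\A\w+", text)
string_end = re.findall(r"(?m)\w+\Z", text)

# i mode on, plus an inline (?#...) comment that the engine ignores.
any_case = re.findall(r"(?i)search(?# matches any capitalisation)", "Search SEARCH search")

print(starts_default, starts_multiline, string_start)
print(ends_default, ends_multiline, string_end)
print(any_case)
```

With m mode on, `^\w+` finds both `first` and `second`, whereas `\A\w+` still finds only `first`, which is the difference the table describes for `\A` and `\Z`.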