Perl-compatible regular expressions are supported by many ORF features, such as the sender and recipient email address whitelists and blacklists, keyword and attachment filtering, URL domain blacklisting, and the Log Viewer Find feature.
This help section provides a brief introduction to regular expressions and their implementation in ORF. The topic is too complex for this help file to teach you how to write regular expressions, but it can point you in the right direction.
Regular expressions will be familiar to Unix/Linux/BSD administrators and software developers, as they are widely used in both worlds. They are a powerful string-matching toolkit that lets you define complex text masks such as "any word beginning with the letter "t" or the sequence "zorro", followed by a sequence of at least 5, but at most 8 digits" (this expression is .*\b(t|zorro)\d{5,8}\b.* by the way).
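To see how the expression quoted above behaves, here is a minimal sketch using Python's built-in re module, which shares this basic syntax with PCRE. The sample strings and the helper name `matches` are made up for illustration, and the case-insensitive flag is an assumption that mirrors ORF's default matching mode.

```python
import re

# The expression quoted in the text, compiled case-insensitively.
PATTERN = re.compile(r".*\b(t|zorro)\d{5,8}\b.*", re.IGNORECASE)

def matches(text):
    """Return True if the expression matches somewhere in the text."""
    return PATTERN.search(text) is not None

# Hypothetical sample strings.
print(matches("order t12345 shipped"))  # "t" starts a word, followed by 5 digits
print(matches("Zorro12345678"))         # "zorro" (any case), followed by 8 digits
print(matches("zorro1234"))             # only 4 digits, so no match
print(matches("not12345"))              # "t" is not at the start of a word
```

The last case shows why the \b matters: without it, the "t" inside "not" would satisfy the expression.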
ORF uses case-insensitive regular expression matching, except where case sensitivity can be configured.
ORF uses the PCRE engine written by Philip Hazel. This engine provides high compatibility with Perl 5's regular expression engine and is used by projects like Python, Apache, PHP or Postfix.
The PCRE man pages are available at http://www.pcre.org/pcre.txt
Find the most common regex wildcards below:
Wildcard | Matches | Negative | Matches |
---|---|---|---|
. | Any character | ||
^ | Beginning of a string | ||
$ | End of string | ||
\w | Any alphanumeric character or underscore | \W | Any character which is not alphanumeric or an underscore |
\s | Any whitespace character | \S | Any character which is not a whitespace |
\d | Any digit | \D | Any character which is not a digit |
\b | The beginning or end of a word | \B | A position that is NOT the beginning or end of a word |
Used alone, each of the latter four wildcards matches a single occurrence. For example, \s matches a single whitespace character. The same applies to their negative versions: \D matches a single character which is not a digit.
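The single-occurrence behavior can be checked with a short sketch. It uses Python's built-in re module as a stand-in for ORF's engine, and the sample strings and variable names are made up for illustration.

```python
import re

sample = "Order 66 shipped"

# Each wildcard consumes exactly one character per match.
digits = re.findall(r"\d", sample)        # every digit is a separate match
spaces = re.findall(r"\s", sample)        # every space is a separate match
non_digits = re.findall(r"\D", "a1b2")    # the characters that are not digits

# \b and \B match positions rather than characters,
# so they are paired with a literal "s" here.
word_initial_s = re.findall(r"\bs", "sea shells")  # "s" at the start of a word
inner_s = re.findall(r"\Bs", "sea shells")         # "s" inside a word

print(digits, spaces, non_digits, word_initial_s, inner_s)
```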
As you can see above, the dot character (.) is a wildcard. But what if you want to match the dot character itself? In this case, it has to be "escaped" with a backslash: \. matches the dot character, while . alone matches any character.
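The difference between the escaped and unescaped dot is easy to miss in domain names. The sketch below uses Python's built-in re module and the placeholder domain example.com; both are assumptions for illustration, not part of ORF.

```python
import re

unescaped = re.compile(r"example.com")   # the dot matches any character
escaped = re.compile(r"example\.com")    # the dot matches only a literal "."

loose_hit = unescaped.search("exampleXcom") is not None   # unintended match
strict_miss = escaped.search("exampleXcom") is None       # correctly rejected
strict_hit = escaped.search("example.com") is not None    # literal dot matches

print(loose_hit, strict_miss, strict_hit)

# re.escape adds the backslashes for you when the text should be taken literally.
print(re.escape("example.com"))
```

When an expression is meant to match a fixed domain or file extension, escaping every dot avoids this kind of accidental match.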
If you want to match more than a single occurrence of any character or wildcard, you can do so by adding any of the following:
Repetition wildcard | Meaning |
---|---|
* | Zero or more repetitions |
+ | One or more repetitions |
? | Zero or one repetition |
{n} | n times |
{n,m} | Repeat at least n, but no more than m times |
{n,} | Repeat at least n times |
For example the expression
Will match both [email protected] and [email protected] but not [email protected]
By using the pipe character, you can define an OR relation in your expression. For example the expression
Will match both [email protected] and [email protected] but not [email protected]
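The OR relation can be sketched in the same way. The addresses, the pattern, and the variable names below are hypothetical, and Python's built-in re module stands in for ORF's engine.

```python
import re

# Hypothetical pattern: either "sales" or "support" at example.com.
grouped = re.compile(r"^(sales|support)@example\.com$", re.IGNORECASE)

# Without parentheses the pipe splits the WHOLE expression in two:
# "^sales" OR "support@example\.com$".
ungrouped = re.compile(r"^sales|support@example\.com$", re.IGNORECASE)

sales_ok = grouped.search("sales@example.com") is not None
support_ok = grouped.search("support@example.com") is not None
billing_ok = grouped.search("billing@example.com") is not None

# An unrelated address slips through the ungrouped version.
grouped_stray = grouped.search("salesperson@other.org") is not None
ungrouped_stray = ungrouped.search("salesperson@other.org") is not None

print(sales_ok, support_ok, billing_ok, grouped_stray, ungrouped_stray)
```

Wrapping the alternatives in parentheses keeps the pipe from applying to the rest of the expression.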
The above should be sufficient for constructing basic regular expressions. For more information about advanced regex techniques (such as positive and negative lookarounds, grouping, matching character classes, etc.), see the links below.