Single characters are represented by literal characters or by operators. A special character of an operator that is preceded by a backslash \ is interpreted as a literal character. This applies in particular to the backslash \ itself, so the regular expression \\ matches the single character \. If the backslash is followed by a literal character, the backslash is ignored.
A literal character is a character that is not a special character, or a special character preceded by a backslash \, or enclosed between \Q ... \E. As a search string, a literal character matches the same single character exactly.
Upper/lower case sensitivity can be controlled for the corresponding commands or methods.
The following table shows some results of the test program, if the value ' ' is passed to ignore_case.
Pattern | Text | Match |
A | A | X |
A | a | - |
\. | . | X |
A | AB | - |
AB | AB | X |
The regular expression 'AB' is a concatenation of two expressions for single characters.
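The rows above can be reproduced as a rough sketch outside ABAP. The snippet uses Python's re module as a stand-in for the test program, which is an assumption: Python's regex syntax is not identical to the ABAP syntax, although literal characters and backslash escapes behave the same way for these patterns. The helper name matches and its ignore_case parameter are hypothetical and only mirror the table.

```python
import re

def matches(pattern: str, text: str, ignore_case: bool = False) -> bool:
    """Hypothetical helper: True if the whole text matches the pattern."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(pattern, text, flags) is not None

# (pattern, text) pairs taken from the table above.
rows = [
    ("A", "A"),
    ("A", "a"),
    (r"\.", "."),
    ("A", "AB"),
    ("AB", "AB"),
]

# Case-sensitive comparison, corresponding to ignore_case = ' '.
results = [matches(p, t) for p, t in rows]

for (p, t), ok in zip(rows, results):
    print(p, t, "X" if ok else "-")
```

Calling matches("A", "a", ignore_case=True) in this sketch returns True, which illustrates how the case setting changes the second row.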
The operators for single characters are made up of the special characters ., [, ], ^ and -, whereby the last two only act as special characters at specific positions within [ ]. The special characters can be made into literal characters using the prefix \.
The special character . is a placeholder for any single character. The operator \C has the same effect as the special character .. A regular expression . or \C matches exactly one single character.
The following table shows some results of the test program, whereby any value can be passed to ignore_case.
Pattern | Text | Match |
. | A | X |
\C | a | X |
. | AB | - |
.. | AB | X |
The regular expression '..' is a concatenation of two expressions for single characters.
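As a minimal sketch, the placeholder behaviour can be tried with Python's re module, used here only as a stand-in for the test program. The operator \C is left out because Python's re module does not provide it. In Python, . also does not match a line break unless re.DOTALL is set; whether this corresponds to the ABAP behaviour is not checked here.

```python
import re

# (pattern, text) pairs; the "." rows follow the table above.
rows = [(".", "A"), (".", "a"), (".", "AB"), ("..", "AB")]

# fullmatch requires the whole text to match the pattern.
results = [re.fullmatch(p, t) is not None for p, t in rows]

for (p, t), ok in zip(rows, results):
    print(p, t, "X" if ok else "-")
```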
The special characters [ ] can be set around any number of literal characters or names for character classes (see below), and thus define a set of literal characters. A regular expression [...] matches exactly one single character that is listed as a literal character within the brackets, or which is contained in a specified character class. At least one literal character or one name for a character class (see below) must be contained within the brackets. A character [ or ], which is positioned directly after the opening bracket, is interpreted as a literal character. Some of the special characters that start with a backslash, such as \A or \Q, lose their special function within sets, and are interpreted as the simple literal character A or Q.
The following table shows some results of the test program.
Pattern | Text | Match |
[ABC] | B | X |
[ABC] | ABC | - |
[AB][CD] | AD | X |
[\d] | 9 | X |
The regular expression [AB][CD] is a concatenation of two expressions for single characters.
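The set rows can be sketched in the same way with Python's re module, which is an assumed stand-in and not the ABAP test program. The last row is an addition for illustration: it shows a ] placed directly after the opening bracket being treated as a literal character, which Python handles in the same way for this pattern.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

rows = [
    ("[ABC]", "B"),
    ("[ABC]", "ABC"),     # three characters cannot match one set
    ("[AB][CD]", "AD"),   # two sets concatenated
    (r"[\d]", "9"),       # shorthand class inside a set
    ("[]A]", "]"),        # ] directly after [ is a literal
]

results = [matches(p, t) for p, t in rows]

for (p, t), ok in zip(rows, results):
    print(p, t, "X" if ok else "-")
```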
If the character ^ is the first character in a self-defined set for single characters and is listed directly after [, it acts as a special character and negates the rest of the set of literal characters or character classes. A regular expression [^... ] matches exactly one single character that is not listed within the brackets as a literal character, or is not contained in a specified character class. A character ^ that is not listed directly after [ acts as a literal character.
The following table shows some results of the test program.
Pattern | Text | Match |
[^ABC] | B | - |
[^ABC] | Y | X |
[^ABC] | ABC | - |
[^A][^B] | BA | X |
[A^B] | ^ | X |
The regular expression [^A][^B] is a concatenation of two expressions for single characters.
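Negated sets can be sketched with Python's re module as well, again as an assumed stand-in for the test program. The snippet shows that ^ only negates when it directly follows the opening bracket and is an ordinary literal anywhere else in the set.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

rows = [
    ("[^ABC]", "B"),     # B is excluded by the negated set
    ("[^ABC]", "Y"),     # Y is not listed, so it matches
    ("[^A][^B]", "BA"),  # first char is not A, second is not B
    ("[A^B]", "^"),      # ^ not directly after [ is a literal
]

results = [matches(p, t) for p, t in rows]

for (p, t), ok in zip(rows, results):
    print(p, t, "X" if ok else "-")
```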
If the character - is between two literal characters, it acts as a special character and defines a range between the literal characters. The range is the set of characters bounded by the two literal characters in the code page of the current operating system. A regular expression [...-... ] matches exactly one single character that is within the defined range. A character - that is not between two literal characters acts as a literal character. A literal character cannot be part of two ranges; for example, 'a-z-Z' is not a valid regular expression.
The following table shows some results of the test program.
Pattern | Text | Match |
[A-Za-z0-9] | B | X |
[A-Za-z0-9] | 5 | X |
[A-Za-z0-9] | # | - |
[A-Za-z0-9] | - | - |
[A-Za-z0-9-] | - | X |
In the last expression, the trailing - is not between two literal characters and therefore does not act as a special character.
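The range rows can be sketched with Python's re module, used as an assumed stand-in. One difference should be kept in mind: Python orders ranges by Unicode code point, not by the code page of the operating system, so only the ASCII ranges from the table are used here.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

rows = [
    ("[A-Za-z0-9]", "B"),
    ("[A-Za-z0-9]", "5"),
    ("[A-Za-z0-9]", "#"),
    ("[A-Za-z0-9]", "-"),   # - is not part of any of the three ranges
    ("[A-Za-z0-9-]", "-"),  # trailing - is a literal member of the set
]

results = [matches(p, t) for p, t in rows]

for (p, t), ok in zip(rows, results):
    print(p, t, "X" if ok else "-")
```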
Within sets for single characters defined with [ ], predefined platform-independent and language-independent character classes can be specified:
Character classes only act within [ ] as specified. A regular expression [:digit:] does not define the set of all digits, but instead defines a character set consisting of :, d, g, i and t. To specify the set of all digits, the regular expression [[:digit:]] is used.
The following table shows some results from the test program, if the value ' ' is passed to ignore_case.
Pattern | Text | Match |
[[:alnum:]] | a | X |
[[:alnum:]] | ; | - |
[[:alpha:]] | 1 | - |
[[:digit:][:punct:]] | 4 | X |
[[:digit:][:punct:]] | . | X |
[[:lower:]] | â | X |
[[:upper:]] | Ä | X |
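Python's re module does not support the [:name:] class syntax, so the following sketch swaps in a different technique: it approximates a few of the classes with Python string predicates. The mapping POSIX_CLASSES and the helper in_any_class are hypothetical names, and the predicates are only an approximation of the class definitions. The last two lines illustrate the point made above that [:digit:] without the outer brackets is an ordinary set of the characters :, d, g, i and t.

```python
import re
import string

# Hypothetical, approximate mapping from class names to Python predicates.
POSIX_CLASSES = {
    "alnum": str.isalnum,
    "alpha": str.isalpha,
    "digit": str.isdigit,
    "lower": str.islower,
    "upper": str.isupper,
    "punct": lambda ch: ch in string.punctuation,
}

def in_any_class(ch: str, *names: str) -> bool:
    """True if the single character ch belongs to at least one named class."""
    return len(ch) == 1 and any(POSIX_CLASSES[name](ch) for name in names)

checks = [
    in_any_class("a", "alnum"),
    in_any_class(";", "alnum"),
    in_any_class("1", "alpha"),
    in_any_class("4", "digit", "punct"),
    in_any_class(".", "digit", "punct"),
    in_any_class("\u00e2", "lower"),  # a with circumflex
    in_any_class("\u00c4", "upper"),  # A with diaeresis
]

# Without outer brackets, [:digit:] is a plain set of the characters : d g i t.
letter_g_in_plain_set = re.fullmatch(r"[:digit:]", "g") is not None
digit_in_plain_set = re.fullmatch(r"[:digit:]", "4") is not None
```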
For frequently used character sets, specific operators are available as abbreviations:
Character set | Abb. | Meaning |
[[:digit:]] | \d | Placeholder for a digit |
[^[:digit:]] | \D | Placeholder for a non-digit |
[[:lower:]] | \l | Placeholder for a lower-case letter |
[^[:lower:]] | \L | Placeholder for a character that is not a lower-case letter |
[[:space:]] | \s | Placeholder for a blank character |
[^[:space:]] | \S | Placeholder for characters other than blank characters |
[[:upper:]] | \u | Placeholder for an upper-case letter |
[^[:upper:]] | \U | Placeholder for a character that is not an upper-case letter |
[[:word:]] | \w | Placeholder for an alphanumeric character including underscore _ |
[^[:word:]] | \W | Placeholder for a non-alphanumeric character except for underscore _ |
If upper/lower case is not taken into account in the ABAP statements FIND and REPLACE or when generating an object of the class CL_ABAP_REGEX, then \l and \u are equivalent to [[:alpha:]], and \L and \U are equivalent to [^[:alpha:]]. The special characters \w, \u, \l, \d, \s can also be listed within sets [...]. Use of the special characters \W, \U, \L, \D, \S within sets is not permitted and triggers an exception CX_SY_INVALID_REGEX.
The following table shows some results of the test program, if the value ' ' is passed to ignore_case.
Pattern | Text | Match |
\d | 4 | X |
\D | ; | X |
\l | u | X |
\l | U | - |
\L | s | - |
\s | (blank) | X |
\S | # | X |
\u | U | X |
\U | , | X |
\w | A | X |
\w | 8 | X |
\W | : | X |
\W | _ | - |
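Some of the abbreviations can be tried with Python's re module, which is an assumed stand-in for the test program. Python provides \d, \D, \s, \S, \w and \W, but not \l, \L, \u and \U as character classes, so those rows are omitted. The final check uses the range [a-z] as a rough substitute for \l to illustrate that a lower-case set also matches an upper-case letter once case is ignored.

```python
import re

def matches(pattern: str, text: str, ignore_case: bool = False) -> bool:
    """Hypothetical helper: True if the whole text matches the pattern."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(pattern, text, flags) is not None

rows = [
    (r"\d", "4"),
    (r"\D", ";"),
    (r"\s", " "),
    (r"\S", "#"),
    (r"\w", "A"),
    (r"\w", "8"),
    (r"\W", ":"),
    (r"\W", "_"),  # underscore counts as a word character
]

results = [matches(p, t) for p, t in rows]

# [a-z] stands in for \l, which Python's re module does not offer.
lower_case_sensitive = matches("[a-z]", "U")
lower_ignore_case = matches("[a-z]", "U", ignore_case=True)

for (p, t), ok in zip(rows, results):
    print(p, repr(t), "X" if ok else "-")
```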
The operators [..] and [==] are reserved for later language enhancements and trigger the exception CX_SY_INVALID_REGEX if used in sets.