Single Character Strings

Single characters are represented by literal characters or operators. If a special character of an operator is preceded by a backslash \, it is interpreted as a literal character. This also applies to the backslash \ itself, so the regular expression \\ matches the single character \. If the backslash is followed by a literal character, the backslash is ignored.

Literal characters

A literal character is either a character that is not a special character, or a special character that is preceded by a backslash \ or enclosed between \Q ... \E. As a search pattern, a literal character matches exactly that same single character.

Note

Case sensitivity can be controlled in the corresponding statements or methods.

Examples

The following table shows some results of the test program, if the value ' ' is passed to ignore_case.

Pattern	Text	Match
A	A	X
A	a	-
\.	.	X
A	AB	-
AB	AB	X

The regular expression 'AB' is a concatenation of two expressions for single characters.
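
The whole-string matching behind this table can be reproduced with a short sketch. The test program itself is not shown here, so the helper matches below is a hypothetical stand-in written in Python. Python's re module is a different engine from the one used in ABAP, and only behavior that both share is exercised. The ignore_case parameter is assumed to correspond to Python's re.IGNORECASE flag.

```python
import re

def matches(pattern: str, text: str, ignore_case: bool = False) -> bool:
    """Hypothetical helper: True if the pattern matches the whole text."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(pattern, text, flags) is not None

# (pattern, text) pairs from the table, plus the escaped backslash.
rows = [("A", "A"), ("A", "a"), (r"\.", "."), ("A", "AB"), ("AB", "AB"), (r"\\", "\\")]
results = [matches(p, t) for p, t in rows]
print(results)  # [True, False, True, False, True, True]

# With case ignored, A also matches a.
print(matches("A", "a", ignore_case=True))  # True
```

The \Q ... \E notation is left out because Python's re module does not support it.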

Operators for single characters

These operators are made up of the special characters ., [, ], ^ and -, whereby the last two only act as special characters at specific positions within [ ]. The special characters can be made into literal characters using the prefix \.

Placeholders for single characters

The special character . is a placeholder for any single character. The operator \C has the same effect. A regular expression . or \C matches exactly one single character.

Examples

The following table shows some results of the test program. Any value can be passed to ignore_case.

Pattern	Text	Match
.	A	X
\C	a	X
.	AB	-
..	AB	X

The regular expression '..' is a concatenation of two expressions for single characters.
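
As a rough illustration, the placeholder can be tried out in Python. This is a sketch under assumptions: the helper matches is hypothetical, Python's re module stands in for the ABAP engine, and \C is omitted because Python does not support it. Note also that in Python the period does not match a line break unless re.DOTALL is set.

```python
import re

def matches(pattern: str, text: str, ignore_case: bool = False) -> bool:
    """Hypothetical helper: True if the pattern matches the whole text."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(pattern, text, flags) is not None

# One period stands for exactly one character, two periods for exactly two.
rows = [(".", "A"), (".", "AB"), ("..", "AB"), (".", "")]
results = [matches(p, t) for p, t in rows]
print(results)  # [True, False, True, False]

# The result does not depend on the case setting.
print(matches(".", "a", ignore_case=True) == matches(".", "a"))  # True
```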

Self-defined sets for single characters

The special characters [ ] can be set around any number of literal characters or names for character classes (see below), and thus define a set of literal characters. A regular expression [...] matches exactly one single character that is listed as a literal character within the brackets, or which is contained in a specified character class. At least one literal character or one name for a character class (see below) must be contained within the brackets. A character [ or ], which is positioned directly after the opening bracket, is interpreted as a literal character. Some of the special characters that start with a backslash, such as \A or \Q, lose their special function within sets, and are interpreted as the simple literal character A or Q.

Examples

The following table shows some results of the test program.

Pattern	Text	Match
[ABC]	B	X
[ABC]	ABC	-
[AB][CD]	AD	X
[\d]	9	X

The regular expression [AB][CD] is a concatenation of two expressions for single characters.
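
The set rules above can be sketched in Python as follows. The helper matches is a hypothetical stand-in for the test program, and Python's re module is not the ABAP engine; the rows below use only set syntax that both interpret in the same way, including a ] placed directly after the opening bracket.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the pattern matches the whole text."""
    return re.fullmatch(pattern, text) is not None

rows = [
    ("[ABC]", "B"),       # one character from the set
    ("[ABC]", "ABC"),     # a set never matches more than one character
    ("[AB][CD]", "AD"),   # two sets concatenated
    (r"[\d]", "9"),       # abbreviation for digits inside a set
    ("[]A]", "]"),        # ] directly after the opening bracket is literal
]
results = [matches(p, t) for p, t in rows]
print(results)  # [True, False, True, True, True]
```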

Negation of a self-defined set for single characters

If the character ^ is the first character in a self-defined set for single characters and is listed directly after [, it acts as a special character and negates the rest of the set of literal characters or character classes. A regular expression [^... ] matches exactly one single character that is not listed within the brackets as a literal character, or is not contained in a specified character class. A character ^ that is not listed directly after [ acts as a literal character.

Examples

The following table shows some results of the test program.

Pattern	Text	Match
[^ABC]	B	-
[^ABC]	Y	X
[^ABC]	ABC	-
[^A][^B]	BA	X
[A^B]	^	X

The regular expression [^A][^B] is a concatenation of two expressions for single characters.
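
A few of these rows can be checked with a small Python sketch. As before, the helper matches is hypothetical and Python's re module only stands in for the ABAP engine; the position-dependent meaning of ^ shown here is the same in both.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the pattern matches the whole text."""
    return re.fullmatch(pattern, text) is not None

rows = [
    ("[^ABC]", "B"),      # B is excluded by the negated set
    ("[^ABC]", "Y"),      # Y is not listed, so it matches
    ("[^A][^B]", "BA"),   # first character is not A, second is not B
    ("[A^B]", "^"),       # ^ not directly after [ is a literal character
]
results = [matches(p, t) for p, t in rows]
print(results)  # [False, True, True, True]
```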

Ranges in a self-defined set for single characters

If the character - is between two literal characters, it acts as a special character and defines a range between them. The range is the set of characters bounded by the two literal characters in the code page of the current operating system. A regular expression [...-... ] matches exactly one single character that is within the defined range. A character - that is not between two literal characters acts as a literal character. A literal character cannot be part of two ranges; for example, 'a-z-Z' is not a regular expression.

Examples

The following table shows some results of the test program.

Pattern	Text	Match
[A-Za-z0-9]	B	X
[A-Za-z0-9]	5	X
[A-Za-z0-9]	#	-
[A-Za-z0-9]	-	-
[A-Za-z0-9-]	-	X

In the last expression, the trailing - is not between two literal characters and therefore does not act as a special character.
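
The table can be reproduced with the following Python sketch. It rests on assumptions: the helper matches is hypothetical, and Python's re module orders ranges by Unicode code point rather than by the code page of the operating system. For the ASCII ranges used here, the results are the same.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the pattern matches the whole text."""
    return re.fullmatch(pattern, text) is not None

rows = [
    ("[A-Za-z0-9]", "B"),
    ("[A-Za-z0-9]", "5"),
    ("[A-Za-z0-9]", "#"),
    ("[A-Za-z0-9]", "-"),    # - only defines ranges here, it is not in the set
    ("[A-Za-z0-9-]", "-"),   # trailing - is a literal character
]
results = [matches(p, t) for p, t in rows]
print(results)  # [True, True, False, False, True]
```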

Character classes

Within sets for single characters defined with [ ], predefined platform-independent and language-independent character classes can be specified:

[:alnum:]
Set of all alphanumeric characters

[:alpha:]
Set of all upper and lower case letters including language-specific special characters (umlauts, accents, diphthongs)

[:blank:]
Blank characters and horizontal tabs

[:cntrl:]
Set of all control characters

[:digit:]
Set of all digits 0 to 9

[:graph:]
Set of all graphic special characters

[:lower:]
Set of all lower case letters including language-dependent special characters (umlauts, accents, diphthongs)

[:print:]
Set of all displayable characters

[:punct:]
Set of all punctuation characters

[:space:]
Set of all whitespace characters (blanks, tabs, line feeds, and carriage returns)

[:unicode:]
Set of all characters with a character representation larger than 255 (only in Unicode systems)

[:upper:]
Set of all upper case letters including language-dependent special characters (umlauts, accents, diphthongs)

[:word:]
Set of all alphanumeric characters including underscore _.

[:xdigit:]
Set of all hexadecimal digits (0-9, A-F and a-f)

Note

Character classes only act within [ ] as specified. A regular expression [:digit:] does not define the set of all digits, but instead defines a character set consisting of :, d, g, i and t. To specify the set of all digits, the regular expression [[:digit:]] is used.
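
The pitfall can be made visible with a short Python sketch. Python's re module has no named character classes at all, so it always reads [:digit:] as a plain list of characters, which happens to be the same reading that is described in this note. The variable name pitfall is chosen for illustration only.

```python
import re

# Without the outer brackets, this is just the set of the characters : d i g t.
pitfall = re.compile("[:digit:]")

digit_matches = pitfall.fullmatch("5") is not None
letter_matches = pitfall.fullmatch("d") is not None
colon_matches = pitfall.fullmatch(":") is not None

print(digit_matches)   # False: the digit 5 is not in the set
print(letter_matches)  # True: the letter d is listed in the set
print(colon_matches)   # True: the colon is listed in the set
```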

Examples

The following table shows some results from the test program, if the value ' ' is passed to ignore_case.

Pattern	Text	Match
[[:alnum:]]	a	X
[[:alnum:]]	;	-
[[:alpha:]]	1	-
[[:digit:][:punct:]]	4	X
[[:digit:][:punct:]]	.	X
[[:lower:]]	â	X
[[:upper:]]	Ä	X
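
Python's re module does not support the notation [[:alnum:]], so the following sketch only approximates some of the classes with Python's Unicode-aware string methods. The mapping POSIX_CLASSES and the helper in_classes are hypothetical and are not part of ABAP or of any library; the exact membership of each class in ABAP may differ from these stand-ins.

```python
import string

# Rough stand-ins for some of the classes (hypothetical mapping).
POSIX_CLASSES = {
    "alnum": str.isalnum,
    "alpha": str.isalpha,
    "digit": lambda c: c in string.digits,
    "lower": str.islower,
    "upper": str.isupper,
    "punct": lambda c: c in string.punctuation,
    "space": str.isspace,
}

def in_classes(char: str, *names: str) -> bool:
    """True if the single character belongs to at least one named class."""
    return len(char) == 1 and any(POSIX_CLASSES[n](char) for n in names)

# (text, class names listed inside the brackets), following the table.
rows = [
    ("a", ["alnum"]), (";", ["alnum"]), ("1", ["alpha"]),
    ("4", ["digit", "punct"]), (".", ["digit", "punct"]),
    ("â", ["lower"]), ("Ä", ["upper"]),
]
results = [in_classes(c, *names) for c, names in rows]
print(results)  # [True, False, False, True, True, True, True]
```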

Abbreviations for character sets

For frequently used character sets, specific operators are available as abbreviations:

Character set	Abbreviation	Meaning
[[:digit:]]	\d	Placeholder for a digit
[^[:digit:]]	\D	Placeholder for a non-digit
[[:lower:]]	\l	Placeholder for a lower-case letter
[^[:lower:]]	\L	Placeholder for a character that is not a lower-case letter
[[:space:]]	\s	Placeholder for a blank character
[^[:space:]]	\S	Placeholder for a character that is not a blank character
[[:upper:]]	\u	Placeholder for an upper-case letter
[^[:upper:]]	\U	Placeholder for a character that is not an upper-case letter
[[:word:]]	\w	Placeholder for an alphanumeric character including underscore _
[^[:word:]]	\W	Placeholder for a character that is neither alphanumeric nor an underscore _

Note

If upper/lower case is not taken into account in the ABAP statements FIND and REPLACE or when generating an object of the class CL_ABAP_REGEX, then \l and \u are equivalent to [[:alpha:]], and \L and \U are equivalent to [^[:alpha:]]. The special characters \w, \u, \l, \d, \s can also be listed within sets [...]. Use of the special characters \W, \U, \L, \D, \S within sets is not permitted and triggers an exception CX_SY_INVALID_REGEX.

Examples

The following table shows some results of the test program, if the value ' ' is passed to ignore_case.

Pattern	Text	Match
\d	4	X
\D	;	X
\l	u	X
\l	U	-
\L	s	X
\s		X
\S	#	X
\u	U	X
\U	,	X
\w	A	X
\w	8	X
\W	:	X
\W	_	-
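
Some of these abbreviations also exist in Python, which allows a partial check of the table. This is a sketch under assumptions: the helper matches is hypothetical, and Python's re module differs from the ABAP engine in several points. In particular, \l, \u, \L and \U are not available in Python and are therefore left out, Python's \d also accepts digits from other scripts, and Python permits \D, \S and \W inside sets.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the pattern matches the whole text."""
    return re.fullmatch(pattern, text) is not None

rows = [
    (r"\d", "4"), (r"\D", ";"),
    (r"\s", " "), (r"\S", "#"),
    (r"\w", "A"), (r"\w", "8"),
    (r"\W", ":"), (r"\W", "_"),   # the underscore counts as a word character
]
results = [matches(p, t) for p, t in rows]
print(results)  # [True, True, True, True, True, True, True, False]
```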

Equivalence classes

The operators [..] and [==] are reserved for later language enhancements and trigger the exception CX_SY_INVALID_REGEX if used in sets.