Character strings are represented by concatenations or operators.
Concatenations are valid regular expressions that are written after each other. If r and s are regular expressions, the concatenation rs matches all character strings that can be formed by concatenating a character string that matches r with a character string that matches s.
The following table shows some results of the test program.
Pattern | Text | Match |
H[aeu]llo | Hallo | X |
H[aeu]llo | Hello | X |
H[aeu]llo | Hullo | X |
H[aeu]llo | Hollo | - |
H[aeu]llo is the concatenation of five regular expressions for single characters.
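The table above can be reproduced with a short sketch. It uses Python's `re` module as a stand-in for the test program, which is an assumption: the engine differs from the one described here, although the two agree for this simple pattern. `re.fullmatch` is chosen because the table treats a pattern as matching only when it covers the entire text. The names `pattern`, `matches`, and `results` are illustrative.

```python
import re

# Assumption: Python's re module stands in for the test program.
# The pattern is a concatenation of five single-character expressions:
# H, [aeu], l, l, o.
pattern = re.compile(r"H[aeu]llo")


def matches(text: str) -> bool:
    """Return True if the entire text matches the pattern."""
    return pattern.fullmatch(text) is not None


results = {text: matches(text) for text in ("Hallo", "Hello", "Hullo", "Hollo")}

for text, ok in results.items():
    print(f"H[aeu]llo | {text} | {'X' if ok else '-'} |")
```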
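Operators are made up of the special characters {, }, *, +, ?, |, (, ) and \. The special characters can be made into literal characters using the prefix \ or by enclosing them in \Q ... \E.
The operators {n}, {n,m}, *, + and ? (where n and m are natural numbers, including zero) can be written directly after a regular expression r and generate concatenations rrr... of that regular expression. r{n} stands for exactly n repetitions of r, r{n,m} for at least n and at most m repetitions, r{n,} for at least n repetitions, r? for zero or one repetition, r* for zero or more repetitions, and r+ for one or more repetitions.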
The following table shows some results from the test program.
Pattern | Text | Match |
Hel{2}o | Hello | X |
H.{4} | Hello | X |
.{0,4} | Hello | - |
.{4,} | Hello | X |
.+H.+e.+l.+l.+o.+ | Hello | - |
x*Hx*ex*lx*lx*ox* | Hello | X |
l+ | ll | X |
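As a rough cross-check of the repetition operators, the rows above can be run through Python's `re` module. This is a sketch under the assumption that `re.fullmatch` mirrors the whole-text comparison of the test program; the names `cases` and `quantifier_results` are illustrative.

```python
import re

# Assumption: re.fullmatch stands in for the whole-text comparison
# performed by the test program.
cases = [
    (r"Hel{2}o", "Hello"),            # l{2} is exactly two l
    (r"H.{4}", "Hello"),              # H followed by exactly four characters
    (r".{0,4}", "Hello"),             # at most four characters, text has five
    (r".{4,}", "Hello"),              # at least four characters
    (r".+H.+e.+l.+l.+o.+", "Hello"),  # + demands at least one character each time
    (r"x*Hx*ex*lx*lx*ox*", "Hello"),  # * also accepts zero occurrences of x
    (r"l+", "ll"),                    # one or more l
]

quantifier_results = [re.fullmatch(p, t) is not None for p, t in cases]

for (p, t), ok in zip(cases, quantifier_results):
    print(f"{p} | {t} | {'X' if ok else '-'} |")
```

The fifth and sixth rows show the difference between + and *: every .+ needs at least one character, whereas every x* is satisfied by an empty string.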
The operators ( ... ) and (?: ... ) group concatenations of regular expressions into a single entity and thus determine the scope of other operators, such as * or |, that act on this entity. On their own, the regular expressions (r) and (?:r) match the same character strings as the regular expression r.
The following table shows some results of the test program.
Pattern | Text | Match |
Tral+a | Tralala | - |
Tr(al)+a | Tralala | X |
Tr(?:al)+a | Tralala | X |
In the first expression, the operator + acts only on the literal character l. In the second and third expressions, it acts on the subgroup al.
The operator ( ... ) acts in the same way as (?: ... ) in the formation of subgroups. In addition, when the regular expression is compared with a character string, the substrings that match the subgroups ( ... ) of the expression are stored sequentially in registers. In this process, an operator \1, \2, \3, ... is assigned to each subgroup. This operator can be used within the expression after its subgroup, where it acts as a placeholder for the character string stored in the corresponding register. In text replacements, the special characters $1, $2, $3, ... can be used to access the last assignment to the register.
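The effect of grouping on the scope of + can be sketched as follows, again with Python's `re` module as an assumed stand-in for the test program. The sketch also prints the captured groups, which differ between ( ... ) and (?: ... ) even though both patterns accept the same text. The names `group_patterns`, `group_results`, and the two `*_groups` variables are illustrative.

```python
import re

# Assumption: Python's re module stands in for the test program.
group_patterns = [r"Tral+a", r"Tr(al)+a", r"Tr(?:al)+a"]

# Without a group, + repeats only the single character l.
# With a group, + repeats the whole subgroup al.
group_results = {
    p: re.fullmatch(p, "Tralala") is not None for p in group_patterns
}

# ( ... ) stores the last substring matched by the subgroup,
# (?: ... ) groups without storing anything.
capturing_groups = re.fullmatch(r"Tr(al)+a", "Tralala").groups()
non_capturing_groups = re.fullmatch(r"Tr(?:al)+a", "Tralala").groups()

print(group_results)
print(capturing_groups, non_capturing_groups)
```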
The number of subgroups and registers is only limited by the capacity of the platform.
The addition SUBMATCHES of the statements FIND and REPLACE, and the column of the same name in the results table filled using the addition RESULTS, can be used to access the content of all registers for a found location. The class CL_ABAP_MATCHER contains the method GET_SUBMATCH for this purpose.
The following table shows some results of the test program.
Pattern | Text | Match |
(["']).+\1 | "Hello" | X |
(["']).+\1 | 'Hello' | X |
(["']).+\1 | 'Hello" | - |
The concatenation (["']).+\1 matches all text strings of which the first character is " or ' and the last character is the same as the first. For the two matching texts, a call matcher->get_submatch( index = 1 ) returns the values " and ', respectively. The third text does not match.
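The register mechanism can be illustrated with a small sketch in Python's `re` module, used here as an assumed stand-in. `Match.group(1)` plays the role of the first register. Note that Python writes register references in replacement texts as \1, \2, ... rather than $1, $2, .... The names `quoted`, `texts`, `submatches`, and `unquoted` are illustrative.

```python
import re

# Assumption: Python's re module stands in for the matcher described above.
# \1 refers back to whatever the first subgroup ( ... ) matched.
quoted = re.compile(r"""(["']).+\1""")

texts = ['"Hello"', "'Hello'", "'Hello\""]

# Collect the content of register 1, or None when the text does not match.
submatches = []
for text in texts:
    m = quoted.fullmatch(text)
    submatches.append(m.group(1) if m else None)

print(submatches)

# In a replacement, the registers are available as well. Here a second
# subgroup captures the text between the quotes, and only that part is kept.
unquoted = re.sub(r"""(["'])(.+)\1""", r"\2", '"Hello"')
print(unquoted)
```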
The character string (? ... ) is generally reserved for later language enhancements, and with the exception of the already supported operators (?:... ), (?=... ) and (?!...), triggers the exception CX_SY_INVALID_REGEX.
The operator | can be written between two regular expressions r and s and generates a single regular expression r|s, which matches every character string that matches either r or s.
Concatenations and other operators bind more tightly than |, which means that r|st and r|s+ are equivalent to r|(?:st) and r|(?:s+), and not to (?:r|s)t and (?:r|s)+.
The following table shows some results of the test program.
Pattern | Text | Match |
H(e|a|u)llo | Hello | X |
H(e|a|u)llo | Hollo | - |
He|a|ullo | Hallo | - |
He|a|ullo | ullo | X |
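The binding strength of | relative to concatenation can be checked with a brief sketch, assuming once more that Python's `re.fullmatch` behaves like the test program for these patterns. The names `alt_cases`, `alt_results`, and `explicit` are illustrative.

```python
import re

# Assumption: Python's re module stands in for the test program.
alt_cases = [
    (r"H(e|a|u)llo", "Hello"),  # the group limits | to the second character
    (r"H(e|a|u)llo", "Hollo"),
    (r"He|a|ullo", "Hallo"),    # without a group: He, or a, or ullo
    (r"He|a|ullo", "ullo"),
]

alt_results = [re.fullmatch(p, t) is not None for p, t in alt_cases]

# Concatenation binds more tightly than |, so He|a|ullo behaves like
# (?:He)|(?:a)|(?:ullo): it accepts exactly these three texts.
explicit = r"(?:He)|(?:a)|(?:ullo)"
same_behavior = all(
    (re.fullmatch(r"He|a|ullo", t) is not None)
    == (re.fullmatch(explicit, t) is not None)
    for t in ("He", "a", "ullo", "Hallo", "Hello")
)

print(alt_results, same_behavior)
```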
The operators \Q ... \E form a character string of literal characters from all enclosed characters. Special characters have no effect in this character string.
The following table shows some results of the test program.
Pattern | Text | Match |
.+\w\d | Special: \w\d | - |
.+\\w\\d | Special: \w\d | X |
.+\Q\w\d\E | Special: \w\d | X |
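The first two rows above can be reproduced in Python's `re` module, used as an assumed stand-in. The third row cannot be reproduced directly, because Python does not support \Q ... \E. The sketch therefore substitutes `re.escape`, which builds the escaped form of a literal string and is a different technique with a comparable effect. The names `text`, `literal_part`, and `literal_results` are illustrative.

```python
import re

# Assumption: Python's re module stands in for the test program.
text = r"Special: \w\d"

# Python has no \Q ... \E, so re.escape is used as a substitute:
# it prefixes every special character of the literal with a backslash.
literal_part = re.escape(r"\w\d")

literal_results = {
    # \w and \d act as operators: the text would have to end in a digit.
    "operators": re.fullmatch(r".+\w\d", text) is not None,
    # Each backslash is escaped by hand, so \w\d is taken literally.
    "escaped by hand": re.fullmatch(r".+\\w\\d", text) is not None,
    # The escaped literal is generated instead of written by hand.
    "re.escape": re.fullmatch(r".+" + literal_part, text) is not None,
}

print(literal_part)
print(literal_results)
```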