One of the principal uses of regular expressions is searching for (and subsequently replacing) substrings in character strings. In general, a user is interested in a specific selection of character strings that match a regular expression. In ABAP, searches using regular expressions are implemented with the addition REGEX of the statement FIND, whereby the found substrings are determined without overlaps according to the leftmost-longest rule.
First, the substring is determined that is furthest to the left in the character string and matches the regular expression ("leftmost"). If several matching substrings start at this position, the longest one is chosen ("longest"). This procedure is then repeated for the remainder of the character string after the found location.
For the regular expression d*od*, five substrings are found in doobedoddoo: do at offset 0, o at offset 2, dodd at offset 5, o at offset 9 and o at offset 10.
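The search described above can be sketched as follows. This is a minimal illustration rather than an authoritative listing: the program name and variable names are hypothetical, and it assumes a release in which the REGEX addition of FIND and the dictionary types match_result_tab and match_result are available.

```abap
REPORT z_regex_leftmost_longest. " hypothetical program name

DATA result_tab  TYPE match_result_tab.
DATA result_line TYPE match_result.

" Search for all non-overlapping matches of d*od* in the sample text.
FIND ALL OCCURRENCES OF REGEX 'd*od*'
     IN 'doobedoddoo'
     RESULTS result_tab.

" Each row of result_tab describes one match by its offset and length.
LOOP AT result_tab INTO result_line.
  WRITE: / 'Offset', result_line-offset,
           'Length', result_line-length.
ENDLOOP.
```

According to the description above, result_tab should contain five rows, with the offset and length pairs 0 and 2, 2 and 1, 5 and 4, 9 and 1, and 10 and 1.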
The following operators support searching in character strings. These operators are made up of the special characters ^, $, \, <, >, (, ), =, and !. The special characters can be made into literal characters using the prefix \.
The operators ^ and $ act as anchor characters for the offset before the first character of a line and the offset after the last character of a line. If the character string to be searched contains control characters such as a line feed, it is interpreted as consisting of several lines.
The operators \A and \Z have the same effect as ^ and $, but always refer to the whole character string instead of to single lines.
The operator pairs ^, $ and \A, \Z differ in behavior only if the character string contains control characters such as line feeds. Within ABAP programs, such control characters normally occur only when externally generated data records are imported.
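The difference between the line anchors and the string anchors can be made visible with a character string that contains a line feed. The following is a sketch under stated assumptions: the program name, the variable names, and the sample text are hypothetical, and the line feed is taken from the constant cl_abap_char_utilities=>newline.

```abap
REPORT z_regex_line_anchor_demo. " hypothetical program name

DATA text       TYPE string.
DATA line_count TYPE i.
DATA text_count TYPE i.

" Assumed sample: two lines separated by a line feed.
CONCATENATE `Smile` cl_abap_char_utilities=>newline `Smile` INTO text.

" ^ is described as anchoring at the start of every line.
FIND ALL OCCURRENCES OF REGEX '^Smile' IN text MATCH COUNT line_count.

" \A is described as anchoring only at the start of the whole string.
FIND ALL OCCURRENCES OF REGEX '\ASmile' IN text MATCH COUNT text_count.

WRITE: / 'Matches for ^Smile: ', line_count,
       / 'Matches for \ASmile:', text_count.
```

If the operators behave as described, the first search should report two matches and the second only one.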
The following search finds Smile at the start of the first line and at the end of the last line of the internal table text_tab.
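The listing for this search is not reproduced here. A minimal sketch that is consistent with the description might look as follows; the program name, the number of rows, and their content are assumptions made for illustration.

```abap
REPORT z_regex_anchor_demo. " hypothetical program name

DATA text_tab    TYPE TABLE OF string.
DATA result_tab  TYPE match_result_tab.
DATA result_line TYPE match_result.

" Assumed content: three rows that each contain only the word Smile.
DO 3 TIMES.
  APPEND `Smile` TO text_tab.
ENDDO.

" \A anchors at the start of the whole table content, \Z at its end.
FIND ALL OCCURRENCES OF REGEX '\A(?:Smile)|(?:Smile)\Z'
     IN TABLE text_tab
     RESULTS result_tab.

" For table searches, the component line holds the row of the match.
LOOP AT result_tab INTO result_line.
  WRITE: / 'Line', result_line-line,
           'Offset', result_line-offset.
ENDLOOP.
```

With the assumed content, the matches should be reported for the first and the last row only, because the rows in between satisfy neither \A nor \Z.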
The operator \< matches at the start of a word and the operator \> matches at the end of a word. The operator \b matches at both the start and the end of a word. A word is defined as an uninterrupted sequence of alphanumeric characters.
The following search finds the three words One, two and 3. Instead of the expression \<[[:alnum:]]+\>, \b[[:alnum:]]+\b can also be used.
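The listing for this search is not reproduced here. The following sketch is consistent with the description; the program name, the variable names, and the sample text are assumptions, chosen so that the text contains the three words named above. The built-in function substring is used to extract each match and assumes a release in which string functions are available.

```abap
REPORT z_regex_word_demo. " hypothetical program name

DATA text        TYPE string.
DATA word        TYPE string.
DATA result_tab  TYPE match_result_tab.
DATA result_line TYPE match_result.

" Assumed sample text containing the words One, two and 3.
text = `One, two, 3`.

" \< and \> restrict each match to a complete word.
FIND ALL OCCURRENCES OF REGEX '\<[[:alnum:]]+\>'
     IN text
     RESULTS result_tab.

" Extract every matched word using its offset and length.
LOOP AT result_tab INTO result_line.
  word = substring( val = text
                    off = result_line-offset
                    len = result_line-length ).
  WRITE / word.
ENDLOOP.
```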
The operator (?=...) defines a regular expression s as a subsequent condition for a previous regular expression r. The regular expression r(?=s) has the same effect in a search as the regular expression r, if the regular expression s matches the substring that immediately follows the substring found with r.
The operator (?!...) acts in the same way as (?=...), with the difference that r(?!s) matches the substring for r only if s does not match the substring that immediately follows it.
The substring matched by the lookahead condition s is not part of the match found by r(?=s).
The following search finds the substring la at offset 7.
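The listing for this search is not reproduced here. The sketch below is consistent with the description; the program name, the variable names, and the sample text `Shalalala!` are assumptions. A second search with the negated condition is added purely for comparison.

```abap
REPORT z_regex_lookahead_demo. " hypothetical program name

DATA text        TYPE string.
DATA result_tab  TYPE match_result_tab.
DATA result_line TYPE match_result.

" Assumed sample text: la occurs at offsets 3, 5 and 7.
text = `Shalalala!`.

" Positive condition: la only if it is directly followed by !.
FIND ALL OCCURRENCES OF REGEX '(?:la)(?=!)'
     IN text
     RESULTS result_tab.

LOOP AT result_tab INTO result_line.
  WRITE: / 'la followed by ! at offset', result_line-offset,
           'with length', result_line-length.
ENDLOOP.

" Negated condition: la only if it is not directly followed by !.
FIND ALL OCCURRENCES OF REGEX '(?:la)(?!!)'
     IN text
     RESULTS result_tab.

LOOP AT result_tab INTO result_line.
  WRITE: / 'la not followed by ! at offset', result_line-offset.
ENDLOOP.
```

With the assumed text, the first search should find la at offset 7 with length 2, which shows that the exclamation mark is not part of the match. The second search should find la at offsets 3 and 5.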