Class

# RegEx

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

## Description

Performs search and replace operations using Perl-compatible regular expressions. The <span class="title-ref">RegEx</span> class uses version 8.33 of the PCRE library.

## Properties

<div class="rst-class">

table-centered_columns_3_and_4

</div>

| Name                                             | Type                                                       | Read-Only | Shared |
|--------------------------------------------------|------------------------------------------------------------|-----------|--------|
| `Options<regex.options>`                         | `RegExOptions</api/text/regular_expressions/regexoptions>` |           |        |
| `ReplacementPattern<regex.replacementpattern>`   | `String</api/data_types/string>`                           |           |        |
| `SearchPattern<regex.searchpattern>`             | `String</api/data_types/string>`                           |           |        |
| `SearchStartPosition<regex.searchstartposition>` | `Integer</api/data_types/integer>`                         |           |        |

## Methods

<div class="rst-class">

table-centered_column_4

</div>

| Name                     | Parameters                                                                                                      | Returns                                                | Shared |
|--------------------------|-----------------------------------------------------------------------------------------------------------------|--------------------------------------------------------|--------|
| `Replace<regex.replace>` |                                                                                                                 | `String</api/data_types/string>`                       |        |
|                          | targetString As `String</api/data_types/string>`, \[searchStartPosition As `Integer</api/data_types/integer>`\] | `String</api/data_types/string>`                       |        |
| `Search<regex.search>`   |                                                                                                                 | `RegExMatch</api/text/regular_expressions/regexmatch>` |        |
|                          | targetString As `String</api/data_types/string>`, \[searchStartPosition As `Integer</api/data_types/integer>`\] | `RegExMatch</api/text/regular_expressions/regexmatch>` |        |

## Property descriptions

<div id="regex.options">

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

</div>

<div class="rst-class">

forsearch

</div>

RegEx.Options

**Options** As `RegExOptions</api/text/regular_expressions/regexoptions>`

> These options are various states which you can set for the Regular Expressions engine. See the `RegExOptions</api/text/regular_expressions/regexoptions>` class.
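>
> As a minimal sketch of setting an option before a search, the following turns on the CaseSensitive property of `RegExOptions</api/text/regular_expressions/regexoptions>`. The sample text and the counting loop are illustrative only:
>
> ``` xojo
> Var re As New RegEx
> re.SearchPattern = "software"
> re.Options.CaseSensitive = True ' match only the exact case used in SearchPattern
>
> Var match As RegExMatch = re.Search("How much software can a Software Developer make?")
>
> Var count As Integer
> While match <> Nil
>   count = count + 1
>   match = re.Search ' resume after the previous match
> Wend
>
> MessageBox(count.ToString) ' "1": the capitalized "Software" no longer matches
> ```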

<div id="regex.replacementpattern">

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

</div>

<div class="rst-class">

forsearch

</div>

RegEx.ReplacementPattern

**ReplacementPattern** As `String</api/data_types/string>`

> This is the replacement string, which can include references to substrings matched previously, via the standard '\1' or '\$1' notation common in regular expressions.
>
> This pattern is used either with the Replace method or passed to the `RegExMatch</api/text/regular_expressions/regexmatch>` class when Search returns, and subsequently used with Replace if no parameters are specified.
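>
> A short sketch of backreferences in a replacement pattern, assuming a "Last, First" input format chosen purely for illustration:
>
> ``` xojo
> Var re As New RegEx
> re.SearchPattern = "(\w+), (\w+)" ' two subexpressions: last name, first name
> re.ReplacementPattern = "$2 $1"   ' emit them in the opposite order
>
> Var swapped As String = re.Replace("Doe, Jane")
>
> MessageBox(swapped) ' "Jane Doe"
> ```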

<div id="regex.searchpattern">

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

</div>

<div class="rst-class">

forsearch

</div>

RegEx.SearchPattern

**SearchPattern** As `String</api/data_types/string>`

> This is the pattern you are currently searching for.

<div id="regex.searchstartposition">

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

</div>

<div class="rst-class">

forsearch

</div>

RegEx.SearchStartPosition

**SearchStartPosition** As `Integer</api/data_types/integer>`

> The byte offset at which you want to start the search if the optional TargetString parameter to *Replace* is not specified.
>
> Keep in mind: If you set it, it will only be used if you don't specify a TargetString, since setting a new TargetString resets the value.
>
> **The offset is zero-based!** That is, to start at the beginning of the string, use 0.
>
> If you want to set a start value past the first character, and if the string uses an encoding where not every character is exactly one byte long, such as UTF-8, you need to convert the string's character position into its byte offset.
>
> Here's a way to convert a (1-based) character position into the (0-based) byte position:
>
> ``` xojo
> aRegEx.SearchStartPosition = theString.Left(characterPosition-1).Bytes
> ```
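>
> The conversion can be sketched end to end as follows. The sample string and the chosen character position are illustrative, and the byte count in the comment assumes UTF-8 text with precomposed accented characters:
>
> ``` xojo
> Var source As String = "naïve café, café noir"
> Var characterPosition As Integer = 12 ' 1-based position just after the first "café,"
>
> ' Convert to a 0-based byte offset. This is 13 for UTF-8 text because
> ' "ï" and "é" each occupy two bytes.
> Var byteOffset As Integer = source.Left(characterPosition - 1).Bytes
>
> Var re As New RegEx
> re.SearchPattern = "café"
>
> Var match As RegExMatch = re.Search(source, byteOffset)
> If match <> Nil Then
>   MessageBox(match.SubExpressionString(0)) ' "café" (the second occurrence)
> End If
> ```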

## Method descriptions

<div id="regex.replace">

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

</div>

<div class="rst-class">

forsearch

</div>

RegEx.Replace

**Replace** As `String</api/data_types/string>`

> Finds *SearchPattern* in the last used *targetString*, starting at the last *SearchStartPosition*, and replaces the matched text with *ReplacementPattern*. Returns the resulting `String</api/data_types/string>`.

**Replace**(targetString As `String</api/data_types/string>`, \[searchStartPosition As `Integer</api/data_types/integer>`\]) As `String</api/data_types/string>`

> Finds *SearchPattern* in *targetString*, starting at *searchStartPosition* if provided, and replaces the matched text with *ReplacementPattern*. Returns the resulting `String</api/data_types/string>`.
>
> This code does a simple remove of HTML tags from source HTML:
>
> ``` xojo
> Var re As New RegEx
> re.SearchPattern = "<[^<>]+>"
> re.ReplacementPattern = ""
> re.Options.ReplaceAllMatches = True
>
> Var html As String = "<p>Hello.</p>"
> Var plain As String = re.Replace(html)
>
> MessageBox(plain) ' "Hello."
> ```
>
> This code finds all occurrences of the word "a" and replaces them all with "the":
>
> ``` xojo
> Var re As New RegEx
> re.SearchPattern = "\ba\b"
> re.ReplacementPattern = "the"
> re.Options.ReplaceAllMatches = True
>
> Var origText As String = "a bus drove on a street in a town"
> Var newText As String = re.Replace(origText)
>
> MessageBox(newText) ' "the bus drove on the street in the town"
> ```
>
> This code replaces the second occurrence only:
>
> ``` xojo
> Var re As New RegEx
> re.SearchPattern = "\ba\b"
> re.ReplacementPattern = "the"
>
> Var sampleText As String = "a bus drove on a street in a town"
> Var match As RegExMatch = re.Search(sampleText) ' Find the first SearchPattern in the text
>
> If match <> Nil Then
>   sampleText = re.Replace ' Find the second SearchPattern in the text and replace it
> End If
>
> MessageBox(sampleText) ' "a bus drove on the street in a town"
> ```
>
> This code uses the same <span class="title-ref">RegEx</span> on several strings:
>
> ``` xojo
> Var sources() As String = Array("<b>this</b>", "<i>that</i>", "<strong>the other</strong>")
>
> Var re As New RegEx
> re.SearchPattern = "<[^<>]+>"
> re.ReplacementPattern = ""
> re.Options.ReplaceAllMatches = True
>
> For sourceIndex As Integer = 0 To sources.LastIndex
>   sources(sourceIndex) = re.Replace(sources(sourceIndex))
> Next sourceIndex
>
> ' sources now contains
> ' {"this", "that", "the other"}
> ```

<div id="regex.search">

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

</div>

<div class="rst-class">

forsearch

</div>

RegEx.Search

**Search** As `RegExMatch</api/text/regular_expressions/regexmatch>`

> Resumes searching in the previously provided TargetString (see the `Notes<regex.notes>`).

**Search**(targetString As `String</api/data_types/string>`, \[searchStartPosition As `Integer</api/data_types/integer>`\]) As `RegExMatch</api/text/regular_expressions/regexmatch>`

> Finds *SearchPattern* in *targetString*, beginning at *searchStartPosition* if provided.
>
> If it succeeds, it returns a `RegExMatch</api/text/regular_expressions/regexmatch>`; otherwise it returns Nil. Both parameters are optional; if *targetString* is omitted, it assumes the previous *targetString*, so you will want to pass a *targetString* the first time you call Search. If you call Search with a *targetString* and omit *searchStartPosition*, zero is assumed. If you call Search with no parameters after initially passing a *targetString*, it assumes the previous *targetString* and begins the search where the previous call left off. This is the easiest way to find the next occurrence of *SearchPattern* in *targetString*.
>
> The `RegExMatch</api/text/regular_expressions/regexmatch>` will remember the ReplacementPattern specified at the time of the search.
>
> This example finds occurrences of "software" in the supplied string. Note that <span class="title-ref">RegEx</span> searches are case-insensitive by default so "software" is found twice:
>
> ``` xojo
> Var re As New RegEx
> Var match As RegExMatch
>
> re.SearchPattern = "software"
> match = re.Search("How much software can a Software Developer make?")
>
> Var result As String
>
> Do
>   If match <> Nil Then
>     result = match.SubExpressionString(0)
>     MessageBox(result)
>   End If
>
>   match = re.Search
> Loop Until match Is Nil
> ```
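>
> Subexpressions can be read from the returned match in the same way. The sketch below assumes that index 0 of SubExpressionString is the whole match and indexes 1 and up are the parenthesized groups; the date text is illustrative:
>
> ``` xojo
> Var re As New RegEx
> re.SearchPattern = "(\d{4})-(\d{2})-(\d{2})"
>
> Var match As RegExMatch = re.Search("Released on 2013-05-28.")
> If match <> Nil Then
>   Var yearPart As String = match.SubExpressionString(1)  ' "2013"
>   Var monthPart As String = match.SubExpressionString(2) ' "05"
>   Var dayPart As String = match.SubExpressionString(3)   ' "28"
>
>   MessageBox(yearPart + "/" + monthPart + "/" + dayPart) ' "2013/05/28"
> End If
> ```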

## Notes

This section describes the syntax of regular expressions.

| Pattern      | Description                                                                                                                           |
|--------------|---------------------------------------------------------------------------------------------------------------------------------------|
| .            | Matches any character except newline.                                                                                                 |
| \[a-z0-9\]   | Matches any single character of set.                                                                                                  |
| \[^a-z0-9\]  | Matches any single character not in set.                                                                                              |
| \d           | Matches a digit. Same as \[0-9\].                                                                                                     |
| \D           | Matches a non-digit. Same as \[^0-9\].                                                                                                |
| \w           | Matches an alphanumeric (word) character - \[a-zA-Z0-9\_\].                                                                           |
| \W           | Matches a non-word character - \[^a-zA-Z0-9\_\].                                                                                      |
| \s           | Matches a whitespace character (space, tab, newline, etc.).                                                                           |
| \S           | Matches a non-whitespace character.                                                                                                   |
| \n           | Matches a newline (line feed).                                                                                                        |
| \r           | Matches a return.                                                                                                                     |
| \t           | Matches a tab.                                                                                                                        |
| \f           | Matches a formfeed.                                                                                                                   |
| \b           | Matches a word boundary. Use \[\b\] to match a backspace.                                                                             |
| \0           | Matches a null character.                                                                                                             |
| \000         | Also matches a null character because of the following:                                                                               |
| \\*nnn*      | Matches an ASCII character of that octal value.                                                                                       |
| \x*nn*       | Matches an ASCII character of that hexadecimal value.                                                                                 |
| \c*X*        | Matches an ASCII control character.                                                                                                   |
| \\*metachar* | Matches the meta-character (e.g. \\\\ matches a literal backslash).                                                                   |
| (abc)        | Used to create subexpressions. Remembers the match for later backreferences. Referenced by replacement patterns that use \1, \2, etc. |
| \1, \2,…     | Matches whatever the first (second, and so on) set of parentheses matched.                                                            |
| x?           | Matches 0 or 1 x's, where x is any of above.                                                                                          |
| x\*          | Matches 0 or more x's.                                                                                                                |
| x+           | Matches 1 or more x's.                                                                                                                |
| x{m,n}       | Matches at least m x's, but no more than n. Also x{m} (matches exactly m x's) and x{m,} (matches at least m x's).                     |
| abc          | Matches all of a, b, and c in order.                                                                                                  |
| a\|b\|c      | Matches one of a, b, or c.                                                                                                            |
| \b           | Matches a word boundary (outside \[\] only).                                                                                          |
| \B           | Matches a non-word boundary.                                                                                                          |
| ^            | Anchors match to the beginning of a line or string.                                                                                   |
| \$           | Anchors match to the end of a line or string.                                                                                         |
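
Several entries from the table can be combined in one pattern. As a hedged sketch, the following uses \b, \w, the + quantifier, a subexpression and a \1 backreference to find a doubled word; the sample sentence is illustrative:

``` xojo
Var re As New RegEx
re.SearchPattern = "\b(\w+)\s+\1\b" ' a word, whitespace, then the same word again

Var match As RegExMatch = re.Search("This is is a test")
If match <> Nil Then
  MessageBox(match.SubExpressionString(0)) ' "is is"
End If
```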

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

### Replacement patterns

The following expressions can only apply to the replacement pattern:

| Pattern  | Description                                                                                         |
|----------|-----------------------------------------------------------------------------------------------------|
| \$\`     | Replaced with the entire target string before match.                                                |
| \$&      | The entire matched area; this is identical to \0 and \$0.                                           |
| \$'      | Replaced with the entire target string following the matched text.                                  |
| \$0-\$50 | Replaced with the text matched by the corresponding subexpression. Evaluates to nothing if the subexpression corresponding to the number doesn't exist. |
| \0-\50   | Same as \$0-\$50.                                                                                   |
| \x*nn*   | Replaced with the character represented by *nn* in Hex, e.g., &#8482;is &#8482;.                    |
| \n*nn*   | Replaced with the character represented by *nn* in Octal.                                           |
| \c*X*    | Replaced with the character that is the control version of *X*, e.g., \cP is DLE, data link escape. |
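
As an illustration of a replacement-only expression, this sketch wraps every run of digits in brackets by using \$& for the matched text. The sample sentence is an assumption made for the example:

``` xojo
Var re As New RegEx
re.SearchPattern = "\d+"
re.ReplacementPattern = "[$&]" ' $& stands for the entire matched text
re.Options.ReplaceAllMatches = True

Var marked As String = re.Replace("Order 66 shipped 12 items")

MessageBox(marked) ' "Order [66] shipped [12] items"
```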

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

### Double-byte systems

If you are working with a double-byte system such as Japanese, <span class="title-ref">RegEx</span> cannot operate on the characters directly. You should first convert all double-byte text to UTF-8 using the built-in Text Converter functions. See the `TextConverter</api/text/encoding_text/textconverter>` class for an example of how to use these functions.

All text that will be processed by <span class="title-ref">RegEx</span> should be converted. This includes SearchPattern, ReplacementPattern, and TargetString. The result of the Search or Search and Replace will be a UTF-8 string, so you will need to convert it back to its original form using the Text Converter functions. Both Search and Search and Replace operations work on all platforms, provided that this conversion takes place.
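
The round trip might look like the sketch below. Note that it swaps in String.ConvertEncoding with the Encodings module rather than the Text Converter functions named above, and it assumes Encodings.ShiftJIS is available in your Xojo version. The Japanese sample text and all variable names are illustrative:

``` xojo
' Simulate Shift-JIS input; real code would typically read this from a file.
Var original As String = "日本語のテキスト"
Var sjisText As String = original.ConvertEncoding(Encodings.ShiftJIS)

' Convert to UTF-8 before handing the text to RegEx.
Var utf8Text As String = sjisText.ConvertEncoding(Encodings.UTF8)

Var re As New RegEx
re.SearchPattern = "テキスト"     ' pattern literals in source code are already UTF-8
re.ReplacementPattern = "文字列"

Var utf8Result As String = re.Replace(utf8Text)

' Convert the result back to the original encoding.
Var sjisResult As String = utf8Result.ConvertEncoding(Encodings.ShiftJIS)
```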

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

<div id="/api/text/regular_expressions/regex/regular_expression_examples">

## Regular expression examples

</div>

The basic idea of regular expressions is that they enable you to find and replace text that matches a set of conditions you specify. They extend normal Search and Replace with pattern searching.

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

### Wildcards

Some special characters are used to match a class of characters:

| Wildcard | Matches                                                      |
|----------|--------------------------------------------------------------|
| .        | Any single character except a line break, including a space. |

If you use "." as the search pattern, you will select the first character in the target string and, if you repeat the search, you will find each successive character, except for Return characters.

The following wildcards match by position in a line:

| Wildcard | Matches                                                           | Example                                           |
|----------|-------------------------------------------------------------------|---------------------------------------------------|
| ^        | Beginning of a line (unless used in a character class; see below) | "^Phone:" finds lines that begin with "Phone:".   |
| \$       | End of a line (unless used in a character class)                  | ".\$" finds the last character in the current line. |
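
A small sketch of the ^ anchor, using two single-line strings so the result does not depend on line-ending settings. The phone numbers are placeholders:

``` xojo
Var re As New RegEx
re.SearchPattern = "^Phone:"

Var atStart As RegExMatch = re.Search("Phone: 555-0100")
Var inMiddle As RegExMatch = re.Search("Work Phone: 555-0100")

' The first string begins with "Phone:", so it matches. The second
' contains "Phone:" later in the line, so the anchored pattern fails.
If atStart <> Nil And inMiddle Is Nil Then
  MessageBox("Only the line that begins with ""Phone:"" matched.")
End If
```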

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

### Character classes

A character class allows you to specify a set or range of characters. You can choose to match either the characters in the class or any character not in it. The set of characters is enclosed in brackets. To match characters that are not in the class, place a caret (^) immediately after the opening bracket. Here are some examples:

| Character Class | Matches                                                                                                                                                                 |
|-----------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| \[aeiou\]       | Any one of the characters a, e, i, o, u.                                                                                                                                |
| \[^aeiou\]      | Any character except a, e, i, o, u.                                                                                                                                     |
| \[a-e\]         | Any character in the range a-e, inclusive                                                                                                                               |
| \[a-zA-Z0-9\]   | Any alphanumeric character. Note: Case-sensitivity is controlled by the CaseSensitive property of the `RegExOptions</api/text/regular_expressions/regexoptions>` class. |
| \[\[\]          | Finds a \[.                                                                                                                                                             |
| \[\]\]          | Finds a \]. To find a closing bracket, place it immediately after the opening bracket.                                                                                  |
| \[a-e^\]        | Finds a character in the range a-e or the caret character. To find the caret character, place it anywhere except as the first character after the opening bracket.      |
| \[a-c-\]        | Finds a character in the range a-c or the - sign. To match a -, place it at the beginning or end of the set.                                                            |
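
As a sketch of a negated character class, the following replaces each run of non-alphanumeric characters with a hyphen. The input text and the variable name are illustrative:

``` xojo
Var re As New RegEx
re.SearchPattern = "[^a-zA-Z0-9]+" ' one or more characters outside the set
re.ReplacementPattern = "-"
re.Options.ReplaceAllMatches = True

Var slug As String = re.Replace("Hello, World! 2024")

MessageBox(slug) ' "Hello-World-2024"
```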

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

### Non-printing characters

You can use the following notation to find non-printing characters:

| Special Character | Matches               |
|-------------------|-----------------------|
| \r                | Line break (return)   |
| \n                | Newline (line feed)   |
| \t                | Tab                   |
| \f                | Formfeed (page break) |
| \xNN              | Hex code NN.          |
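
For illustration, a short sketch of the hex notation, assuming a made-up target string. Hex 41 is the code for the letter "A", so the pattern below finds the first "A":

``` xojo
Var rg As New RegEx
rg.SearchPattern = "\x41" // hex 41 is the character code of "A"

Var m As RegExMatch = rg.Search("xyzABC")
If m <> Nil Then
  System.DebugLog(m.SubExpressionString(0)) // logs "A"
End If
```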

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

### Other special characters

The following patterns each match any single character from a predefined class of characters:

| Special Character | Matches                                                            |
|-------------------|--------------------------------------------------------------------|
| \s                | Any whitespace character (space, tab, return, linefeed, form feed) |
| \S                | Any non-whitespace character.                                      |
| \w                | Any "word" character (a-z, A-Z, 0-9, and \_)                       |
| \W                | Any "non-word" character (All characters not included by \w).      |
| \d                | Any digit \[0-9\].                                                 |
| \D                | Any non-digit character.                                           |
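
A small sketch of these shorthand classes, using a hypothetical sentence as the target. Four \d placeholders around a literal colon pick out a time of day:

``` xojo
Var rg As New RegEx
rg.SearchPattern = "\d\d:\d\d" // two digits, a colon, two digits

Var m As RegExMatch = rg.Search("Meeting at 09:30 sharp")
If m <> Nil Then
  System.DebugLog(m.SubExpressionString(0)) // logs "09:30"
End If
```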

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

### Repetition characters

Repetition characters are modifiers that allow you to repeat a specified pattern.

| Repetition Character | Matches                                         | Examples                                                  |
|----------------------|-------------------------------------------------|-----------------------------------------------------------|
| \*                   | Zero or more occurrences of the preceding item. | d\* finds no characters, or one or more consecutive "d"s. |
| +                    | One or more occurrences of the preceding item.  | d+ finds one or more consecutive "d"s.                    |
| ?                    | Zero or one occurrence of the preceding item.   | d? finds no characters or one "d".                        |

Note that because \* and ? can match zero occurrences of the pattern, they always succeed but may not select any text. You can use them to mark part of a pattern as optional, as in the dollar-amount example under Combining patterns.
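
As a sketch of an optional character, the pattern below uses ? to accept both spellings of a word. The word list is invented for illustration:

``` xojo
Var rg As New RegEx
rg.SearchPattern = "colou?r" // the "u" may appear zero or one time

For Each target As String In Array("color", "colour", "colr")
  Var m As RegExMatch = rg.Search(target)
  If m <> Nil Then
    System.DebugLog(target + " -> " + m.SubExpressionString(0))
  Else
    System.DebugLog(target + " -> no match")
  End If
Next
```

Here "color" and "colour" both match in full, while "colr" does not match because the "o" before the optional "u" is still required.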

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

### Greediness

The "?" is also used as a "greediness" modifier for a subpattern in a regular expression. By default, greediness is controlled by the Greedy property of the `RegExOptions</api/text/regular_expressions/regexoptions>` class, but you can override it for an individual \* or + by placing a "?" directly after it, which reverses the current "greediness" setting. That is, if Greedy is `True</api/language/true>`, a ? after a \* or + causes it to match the minimum number of times possible; if Greedy is False, it causes it to match the maximum number of times possible. For example:

| Target String | Greedy | Regular Expression | Result         |
|---------------|--------|--------------------|----------------|
| aaaa          | True   | (a+?)(a+)          | \$1=a, \$2=aaa |
| aaaa          | False  | (a+?)(a+)          | \$1=aaa, \$2=a |
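
The two rows of the table can be reproduced with a sketch like the following. It uses one RegEx object per Greedy setting; the variable names are hypothetical, and the expected values in the comments are the ones given in the table above:

``` xojo
Var target As String = "aaaa"
Var m As RegExMatch

// Greedy = True: the "?" makes the first a+ match as little as possible.
Var greedyOn As New RegEx
greedyOn.Options.Greedy = True
greedyOn.SearchPattern = "(a+?)(a+)"
m = greedyOn.Search(target)
If m <> Nil Then
  // Per the table: $1=a, $2=aaa
  System.DebugLog(m.SubExpressionString(1) + " / " + m.SubExpressionString(2))
End If

// Greedy = False: the "?" reverses the setting, so the first a+ becomes greedy.
Var greedyOff As New RegEx
greedyOff.Options.Greedy = False
greedyOff.SearchPattern = "(a+?)(a+)"
m = greedyOff.Search(target)
If m <> Nil Then
  // Per the table: $1=aaa, $2=a
  System.DebugLog(m.SubExpressionString(1) + " / " + m.SubExpressionString(2))
End If
```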

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

### Extension mechanism

RegEx also supports the regular expression extension mechanism used in Perl. For instance:

| Construct     | Meaning                                                                                                                                                                            |
|---------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| (?#text)      | Comment                                                                                                                                                                            |
| (?:pattern)   | For grouping without creating backreferences                                                                                                                                       |
| (?=pattern)   | A zero-width positive look-ahead assertion. For example, \w+(?=\t) matches a word followed by a tab, without including the tab in \$&.                                             |
| (?!pattern)   | A zero-width negative look-ahead assertion. For example foo(?!bar) matches any occurrence of "foo" that isn't followed by "bar".                                                   |
| (?\<=pattern) | A zero-width positive look-behind assertion. For example, (?\<=\t)\w+ matches a word that follows a tab, without including the tab in \$&. Works only for fixed-width look-behind. |
| (?\<!pattern) | A zero-width negative look-behind assertion. For example (?\<!bar)foo matches any occurrence of "foo" that does not follow "bar". Works only for fixed-width look-behind.          |
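
A minimal sketch of a positive look-ahead, assuming an invented target string. The comma must follow the word for the match to succeed, but it is not part of the matched text:

``` xojo
Var rg As New RegEx
rg.SearchPattern = "\w+(?=,)" // a word that is immediately followed by a comma

Var m As RegExMatch = rg.Search("apples, pears")
If m <> Nil Then
  System.DebugLog(m.SubExpressionString(0)) // logs "apples" (without the comma)
End If
```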

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

### Subexpressions

You can use parentheses within your search patterns to isolate portions of the matched string. You do this when you need to refer to subsections of the matched string in your replacement string. For example, you would do this if you need to replace only a portion of the matched string or insert other text into the matched string.

Here is an example. To match any date followed by the letters "B.C.", you can use the pattern "\d+\sB.C." (one or more digits, followed by a whitespace character, followed by the letters "B.C."). This matches dates such as 33 B.C., 1742 B.C., etc. However, if you want your replacement pattern to leave the year alone but replace the letters with something else, use parentheses. The search pattern "(\d+)\s(B.C.)" does this. Note that an unescaped "." matches any character, so these patterns also match text such as "33 BxCy"; to match only literal periods, escape each one with a backslash.

When you write your replacement pattern, you can refer to the year only with the variable \1 and the letters with \2.

If you write "(\d+)\s(B.C.\|A.D.\|BC\|AD)", then \2 would contain whichever form of the letters was matched.
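
As a sketch of subexpressions in a replacement, the following keeps the year (\1) and replaces the letters. The sentence and the replacement text "BCE" are made up for illustration, and the periods are escaped so that they match only literal periods:

``` xojo
Var rg As New RegEx
rg.SearchPattern = "(\d+)\s(B\.C\.)" // \1 = the year, \2 = the letters
rg.ReplacementPattern = "\1 BCE"      // keep the year, swap the letters

Var result As String = rg.Replace("Rome was founded in 753 B.C.")
System.DebugLog(result) // logs "Rome was founded in 753 BCE"
```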

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

### Combining patterns

Much of the power of regular expressions comes from combining these elementary patterns to make up complex searches. Here are some examples:

| Pattern                | Matches                                                    |
|------------------------|------------------------------------------------------------|
| \\\$?\[0-9,\]+\\.?\d\* | Matches dollar amounts with an optional dollar sign.       |
| \d+\sB.C.              | One or more digits followed by a space, followed by "B.C." |
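
A short sketch of the dollar-amount pattern, assuming it is written with an escaped dollar sign and an escaped decimal point. The target sentence is hypothetical:

``` xojo
Var rg As New RegEx
// Optional "$", then digits and commas, then an optional "." and any trailing digits.
rg.SearchPattern = "\$?[0-9,]+\.?\d*"

Var m As RegExMatch = rg.Search("Total due: $1,234.56 today")
If m <> Nil Then
  System.DebugLog(m.SubExpressionString(0)) // logs "$1,234.56"
End If
```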

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

### The alternation operator

The alternation operator (\|) allows you to match any of a number of patterns using the logical "or" operator. Place it between two existing patterns to match either pattern. You can use more than one alternation operator in a pattern:

| Pattern                   | Matches                                                                   |
|---------------------------|---------------------------------------------------------------------------|
| \she\s\|\sshe\s           | " he " or " she "                                                         |
| cat\|dog\|possum          | "cat", "dog", or "possum"                                                 |
| (\[0-9,\]+\s(B.C.\|A.D.)) | Years of the form "yearNum B.C. or A.D." e.g., "2,175 B.C." or "215 A.D." |
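
For illustration, a minimal sketch of alternation with three alternatives, using an invented sentence. Search returns the first place in the target where any of the alternatives matches:

``` xojo
Var rg As New RegEx
rg.SearchPattern = "cat|dog|possum"

Var m As RegExMatch = rg.Search("The dog chased a possum.")
If m <> Nil Then
  System.DebugLog(m.SubExpressionString(0)) // logs "dog", the first alternative found
End If
```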

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

### Search and replace

In a replacement pattern, special patterns stand for the text that was matched. Using them, you can reuse the matched text in the replacement, for example to prepend or append other text to it.

| Pattern      | Description                                                                           |
|--------------|---------------------------------------------------------------------------------------|
| \$&          | Contains the entire matched pattern.                                                  |
| \1, \2, etc. | Contains the matched subpatterns, defined by use of parentheses in the search string. |
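
As a sketch of reusing the whole match, the following wraps every number in square brackets. The target string is made up, and ReplaceAllMatches is set so that every match is replaced, not only the first:

``` xojo
Var rg As New RegEx
rg.SearchPattern = "\d+"
rg.ReplacementPattern = "[$&]" // $& is the entire matched text
rg.Options.ReplaceAllMatches = True

Var result As String = rg.Replace("7 of 12")
System.DebugLog(result) // logs "[7] of [12]"
```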

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

### Credits

Xojo uses a modified version of the PCRE library package, open source software written by Philip Hazel and copyrighted by the University of Cambridge, England.

The source to this library is available [here](https://sourceforge.net/projects/pcre/files/).

## Sample code

The following `Button's</api/user_interface/desktop/desktopbutton>` Pressed event handler allows you to search the text in `TextArea1</api/user_interface/desktop/desktoptextarea>` using the search pattern entered into `TextField1</api/user_interface/desktop/desktoptextfield>` and display the results of the search in a `Label</api/user_interface/desktop/desktoplabel>`:

``` xojo
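Var rg As New RegEx
Var myMatch As RegExMatch

// Use the text in TextField1 as the search pattern
rg.SearchPattern = TextField1.Text
// Search returns Nil when the pattern is not found
myMatch = rg.Search(TextArea1.Text)

If myMatch <> Nil Then
  // Subexpression 0 is the entire matched text
  Label1.Text = myMatch.SubExpressionString(0)
Else
  Label1.Text = "Text not found!"
End If

// An invalid search pattern raises a RegExException
Exception err As RegExException
  MessageBox(err.Message)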
```
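
To collect every match instead of only the first, a loop along these lines may help. This is a sketch under the assumption that calling Search with no arguments continues in the same target string after the previous match; the target string and variable names are hypothetical:

``` xojo
Var rg As New RegEx
rg.SearchPattern = "\d+"

Var found() As String
Var m As RegExMatch = rg.Search("3 apples, 12 pears, 150 plums")
While m <> Nil
  found.Add(m.SubExpressionString(0))
  // Assumption: Search with no arguments resumes after the previous match
  m = rg.Search
Wend

System.DebugLog(String.FromArray(found, ", "))
```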

## Compatibility

|                       |     |
|-----------------------|-----|
| **Project Types**     | All |
| **Operating Systems** | All |

<div class="seealso">

`Object</api/data_types/additional_types/object>` parent class; `RegExMatch</api/text/regular_expressions/regexmatch>`, `RegExOptions</api/text/regular_expressions/regexoptions>` classes; `RegExException</api/exceptions/regexexception>` Error.

</div>
