RegEx is the camel-case abbreviation of “regular expressions”. These are a powerful way to search for strings and make replacements. For example, the term itself might be written “RegEx”, “regex” or “Regular Expressions”. With a normal string search function you have to run three separate searches. With a regex you only need to type “(RegEx|regex|Regular Expressions)”. What do those parentheses and vertical bars mean? They are so-called metacharacters:
| . | Matches any single character except line breaks such as carriage return or line feed. Exactly which characters are excluded depends on your configuration. “e.p” matches “exp” and “emp”. |
| ^ | RegEx must be found at string starting position. “^Mail” matches “Mailer” but not “ReMail”. |
| $ | RegEx must be found at string end position. “Mail$” matches “ReMail” but not “Mailer”. |
| \ | Escape character. “\.” matches a real dot and not any character. “\\” matches a backslash. |
| [ ] | Characters inside the brackets define a set, of which exactly one character must match. For example “p[aoe]st” matches “post”, “past” and “pest”, but not “poast” and “pst”. Ranges shorten the notation: [0-9] matches any digit, [a-z] any lowercase letter and [A-Z] any uppercase letter. Two things to mention: firstly, metacharacters lose their special meaning inside the brackets (. is a dot and not any character), and secondly, you have to put the minus symbol first inside the brackets to include a real minus. For example, every character of “Hello-World!”, “number4866” and “(nothing!)” is matched by “[-a-zA-Z0-9()!]”. |
| ( ) | Save the match inside the parentheses in a buffer. For example “(lukas) (prokop)” saves “lukas” in buffer #1 and “prokop” in buffer #2; in most implementations buffer #0 holds the whole match. These buffers can be useful when using RegEx in replacement contexts. |
| \< \> | Start/end of a word. Words are delimited by punctuation marks or spaces. For example “\<reg” matches “registry”, but not “preg_match”. “home\>” matches “at home”, but not “homepage” or “homer”. |
| | | OR-Selection. Use the term to the right or left. “(home|page)” matches “home” and “page”. |
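The table entries can be tried out in any language with a regex library. The following is a minimal sketch using Python's `re` module, chosen here only for illustration; the sample sentence is made up. Note that Python has no separate start-of-word and end-of-word metacharacters and uses `\b` for both.

```python
import re

# Alternation: one pattern instead of three separate searches.
spellings = re.compile(r"(RegEx|regex|Regular Expressions)")
text = "Some write RegEx, others regex or Regular Expressions."
found = spellings.findall(text)

# Dot: any single character except a line break (Python's default).
dot_hits = [w for w in ["exp", "emp", "ep"] if re.fullmatch(r"e.p", w)]

# Anchors: ^ ties the match to the start, $ to the end of the string.
starts = [w for w in ["Mailer", "ReMail"] if re.search(r"^Mail", w)]
ends = [w for w in ["Mailer", "ReMail"] if re.search(r"Mail$", w)]

# Character class: exactly one of the listed characters must match.
candidates = ["post", "past", "pest", "poast", "pst"]
class_hits = [w for w in candidates if re.fullmatch(r"p[aoe]st", w)]

# Word boundaries: Python spells both word start and word end as \b.
word_start = [w for w in ["registry", "preg_match"] if re.search(r"\breg", w)]
word_end = [w for w in ["at home", "homepage", "homer"] if re.search(r"home\b", w)]

print(found, dot_hits, starts, ends, class_hits, word_start, word_end)
```

`re.fullmatch` requires the whole string to match, while `re.search` looks for the pattern anywhere in the string, which is why the anchor examples use `re.search`.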
Those metacharacters can be combined into powerful regular expressions. Visit regular-expressions.info for information about the different uses of RegEx and their implementations in programming languages.
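As a rough illustration of buffers in a replacement context and of combining several metacharacters, here is a short Python sketch. It assumes Python's `re` module, which numbers groups from 1 and reserves group 0 for the whole match; other tools may count differently. The input sentence and the file names are invented for the example.

```python
import re

# Pattern from the table: two groups separated by a space.
pattern = re.compile(r"(lukas) (prokop)")
sentence = "written by lukas prokop"
match = pattern.search(sentence)

whole = match.group(0)   # the entire match
first = match.group(1)   # contents of the first pair of parentheses
second = match.group(2)  # contents of the second pair

# In the replacement string, \1 and \2 refer back to the saved groups.
swapped = pattern.sub(r"\2, \1", sentence)

# Combination: start anchor, character class, escaped dot,
# alternation and end anchor in a single pattern.
files = ["a.txt", "b.log", "ab.txt", "c.txt.bak", "7.txt"]
combined = [f for f in files if re.search(r"^[a-z]\.(txt|log)$", f)]

print(whole, first, second)
print(swapped)
print(combined)
```

The combined pattern only accepts names consisting of one lowercase letter, a literal dot and one of the two extensions, so it rejects longer names and trailing suffixes.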
Besides metacharacters, another part of RegEx are quantifiers: characters which define how often the preceding term may occur:
| * | Term may occur zero or more times. |
| + | Term may occur one or more times. |
| ? | Term may occur zero times or once. (Vim writes this quantifier as “\=”.) |
| {n,m} | Term may occur between n and m times (including m and n itself). |
| {n} | Term may occur exactly n times. |
| {,m} | Term may occur between 0 and m (incl. m) times. |
| {n,} | Term may occur at least n times. |
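The quantifiers can be compared side by side by applying each one to the same list of words. This is a small sketch in Python's `re` syntax, which is an assumption of the example: there, zero-or-one is written `?`, and the lower bound is spelled out as `{0,2}` for portability. The helper `matching` and the word list are made up for illustration.

```python
import re

def matching(pattern, candidates):
    """Return the candidates that the pattern matches completely."""
    return [c for c in candidates if re.fullmatch(pattern, c)]

words = ["ct", "cat", "caat", "caaat", "caaaat"]

star = matching(r"ca*t", words)          # zero or more "a"
plus = matching(r"ca+t", words)          # one or more "a"
optional = matching(r"ca?t", words)      # zero or one "a"
between = matching(r"ca{2,3}t", words)   # two to three "a"
exact = matching(r"ca{2}t", words)       # exactly two "a"
at_most = matching(r"ca{0,2}t", words)   # up to two "a"
at_least = matching(r"ca{3,}t", words)   # three or more "a"

for name, hits in [("*", star), ("+", plus), ("?", optional),
                   ("{2,3}", between), ("{2}", exact),
                   ("{0,2}", at_most), ("{3,}", at_least)]:
    print(name, hits)
```

Each quantifier applies only to the term directly in front of it, here the single letter “a”; to repeat a longer term, wrap it in parentheses first.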
In the next session we will learn how to use regexes in Vim.
