characters | description | matches |
---|---|---|
. | not newline | any character except line terminators (LF, CR, LS, PS). |
\t | tab (HT) | a horizontal tab character (same as \u0009). |
\n | newline (LF) | a newline (line feed) character (same as \u000A). |
\v | vertical tab (VT) | a vertical tab character (same as \u000B). |
\f | form feed (FF) | a form feed character (same as \u000C). |
\r | carriage return (CR) | a carriage return character (same as \u000D). |
\cletter | control code | a control code character whose code unit value is the same as the remainder of dividing the code unit value of letter by 32. For example: \ca is the same as \u0001, \cb the same as \u0002, and so on... |
\xhh | ASCII character | a character whose code unit value is given by the two hexadecimal digits hh. For example: \x4c is the same as L, and \x23 is the same as #. |
\uhhhh | unicode character | a character whose code unit value is given by the four hexadecimal digits hhhh. |
\0 | null | a null character (same as \u0000). |
\int | backreference | the result of the submatch whose opening parenthesis is the int-th one in the pattern (int must begin with a digit other than 0). See groups below for more info. |
\d | digit | a decimal digit character (same as [[:digit:]]). |
\D | not digit | any character that is not a decimal digit character (same as [^[:digit:]]). |
\s | whitespace | a whitespace character (same as [[:space:]]). |
\S | not whitespace | any character that is not a whitespace character (same as [^[:space:]]). |
\w | word | an alphanumeric or underscore character (same as [_[:alnum:]]). |
\W | not word | any character that is not an alphanumeric or underscore character (same as [^_[:alnum:]]). |
\character | character | the character itself, taken literally rather than with its special meaning within a regular expression. Any character can be escaped except those which form any of the special character sequences above. Needed for: ^ $ \ . * + ? ( ) [ ] { } | |
[class] | character class | the target character is part of the class (see character classes below) |
[^class] | negated character class | the target character is not part of the class (see character classes below) |
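
A few of these escapes can be tried out with `std::regex`, whose default grammar is ECMAScript. The following is only a minimal sketch; the function names `looks_like_decimal` and `is_capital_l` are hypothetical helpers made up for illustration.

```cpp
#include <regex>
#include <string>

// True if the whole of `text` is digits, optionally followed by a
// literal dot and more digits (for example "42" or "3.14").
bool looks_like_decimal(const std::string& text) {
    // \d is the digit class; \. escapes the dot so that it stands for
    // itself instead of "any character except line terminators".
    static const std::regex pattern(R"(\d+(\.\d+)?)");
    return std::regex_match(text, pattern);
}

// True if `text` is exactly the character written as \x4c, that is, "L".
bool is_capital_l(const std::string& text) {
    static const std::regex pattern(R"(\x4c)");
    return std::regex_match(text, pattern);
}
```

Raw string literals (`R"(...)"`) keep the backslashes readable; in an ordinary string literal each one would have to be doubled.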

characters | times | effects |
---|---|---|
* | 0 or more | The preceding atom is matched 0 or more times. |
+ | 1 or more | The preceding atom is matched 1 or more times. |
? | 0 or 1 | The preceding atom is optional (matched either 0 times or once). |
{int} | int | The preceding atom is matched exactly int times. |
{int,} | int or more | The preceding atom is matched int or more times. |
{min,max} | between min and max | The preceding atom is matched at least min times, but not more than max. |
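
As a rough illustration of the quantifiers above, the sketch below uses `?` and `{min,max}` with `std::regex_match`, which requires the whole input to match. The helper names `is_color_word` and `is_short_hex` are assumptions made for this example.

```cpp
#include <regex>
#include <string>

// "u?" makes the letter u optional, so both spellings are accepted.
bool is_color_word(const std::string& text) {
    static const std::regex pattern("colou?r");
    return std::regex_match(text, pattern);
}

// "{2,4}" asks for at least two and at most four hexadecimal digits.
bool is_short_hex(const std::string& text) {
    static const std::regex pattern("[[:xdigit:]]{2,4}");
    return std::regex_match(text, pattern);
}
```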

characters | description | effects |
---|---|---|
(subpattern) | Group | Creates a backreference. |
(?:subpattern) | Passive group | Does not create a backreference. |
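
The difference between the two kinds of group shows up in the submatches that a match produces. The sketch below assumes a simple `YYYY-MM-DD` date format; `date_pattern`, `extract_year` and `has_doubled_word` are hypothetical names chosen for illustration.

```cpp
#include <regex>
#include <string>

// Year and day are captured; the month sits in a passive group and
// therefore does not get a submatch number.
const std::regex& date_pattern() {
    static const std::regex pattern(R"((\d{4})-(?:\d{2})-(\d{2}))");
    return pattern;
}

// Returns the first submatch (the year), or an empty string if the
// whole of `date` does not match.
std::string extract_year(const std::string& date) {
    std::smatch m;
    if (std::regex_match(date, m, date_pattern())) {
        return m[1].str();
    }
    return "";
}

// \1 refers back to whatever the first group matched, so this finds a
// word repeated after a single space.
bool has_doubled_word(const std::string& text) {
    static const std::regex pattern(R"(\b(\w+) \1\b)");
    return std::regex_search(text, pattern);
}
```

`std::regex::mark_count()` reports the number of capturing groups, which is 2 for `date_pattern()` because the passive group is not counted.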

characters | description | condition for match |
---|---|---|
^ | Beginning of line | Either it is the beginning of the target sequence, or follows a line terminator. |
$ | End of line | Either it is the end of the target sequence, or precedes a line terminator. |
\b | Word boundary | The previous character is a word character and the next is a non-word character (or vice-versa). Note: The beginning and the end of the target sequence are considered here as non-word characters. |
\B | Not a word boundary | The previous and next characters are both word characters or both are non-word characters. Note: The beginning and the end of the target sequence are considered here as non-word characters. |
(?=subpattern) | Positive lookahead | The characters following the assertion must match subpattern, but no characters are consumed. |
(?!subpattern) | Negative lookahead | The characters following the assertion must not match subpattern, but no characters are consumed. |
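
Lookaheads test what follows without making it part of the match. The sketch below is one possible illustration; `first_called_name` and `has_q_without_u` are hypothetical helpers, not part of any library.

```cpp
#include <regex>
#include <string>

// Returns the first run of word characters that is immediately followed
// by '(', or an empty string. The parenthesis is checked by the positive
// lookahead but is not included in the returned text.
std::string first_called_name(const std::string& code) {
    static const std::regex pattern(R"re(\b\w+(?=\())re");
    std::smatch m;
    if (std::regex_search(code, m, pattern)) {
        return m.str();
    }
    return "";
}

// True if `text` contains a 'q' that is not followed by 'u'.
bool has_q_without_u(const std::string& text) {
    static const std::regex pattern("q(?!u)");
    return std::regex_search(text, pattern);
}
```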

character | description | effects |
---|---|---|
| | Separator | Separates two alternative patterns or subpatterns. |
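
Because the separator applies to everything on either side of it, alternatives are usually wrapped in a group to limit their reach. A minimal sketch, assuming a hypothetical helper `has_image_extension` and an arbitrary list of extensions:

```cpp
#include <regex>
#include <string>

// True if `name` ends in ".jpg", ".jpeg", ".png" or ".gif".
// The passive group keeps the alternatives between the dot and the
// end-of-line assertion; the comparison is case-sensitive.
bool has_image_extension(const std::string& name) {
    static const std::regex pattern(R"(\.(?:jpe?g|png|gif)$)");
    return std::regex_search(name, pattern);
}
```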

class | description | notes |
---|---|---|
[:classname:] | character class | Uses the regex traits' isctype member, with the class type obtained by applying the lookup_classname member to classname, to test for a match. |
[.classname.] | collating sequence | Uses the regex traits' lookup_collatename to interpret classname. |
[=classname=] | character equivalents | Applies the regex traits' transform_primary to the result of regex_traits::lookup_collatename for classname, and uses that to check for matches. |
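
The lookup described for `[:classname:]` can also be performed by hand on a `std::regex_traits<char>` object. The sketch below is only an approximation of what a regex implementation does internally, and `in_named_class` is a hypothetical helper name.

```cpp
#include <regex>
#include <string>

// True if `c` belongs to the class called `name` (for example "digit"
// or "w") according to a default-constructed traits object, which uses
// the global locale.
bool in_named_class(char c, const std::string& name) {
    std::regex_traits<char> traits;
    // lookup_classname turns the name into a class type (a bitmask)...
    const auto mask = traits.lookup_classname(name.begin(), name.end());
    // ...and isctype tests a character against that class type.
    return traits.isctype(c, mask);
}
```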

class | description | equivalent (with regex_traits, default locale) |
---|---|---|
[:alnum:] | alpha-numerical character | isalnum |
[:alpha:] | alphabetic character | isalpha |
[:blank:] | blank character | isblank |
[:cntrl:] | control character | iscntrl |
[:digit:] | decimal digit character | isdigit |
[:graph:] | character with graphical representation | isgraph |
[:lower:] | lowercase letter | islower |
[:print:] | printable character | isprint |
[:punct:] | punctuation mark character | ispunct |
[:space:] | whitespace character | isspace |
[:upper:] | uppercase letter | isupper |
[:xdigit:] | hexadecimal digit character | isxdigit |
[:d:] | decimal digit character | isdigit |
[:w:] | word character | isalnum, or the underscore character |
[:s:] | whitespace character | isspace |
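
Class names are only valid inside a bracket expression, which is why they appear with doubled brackets in a pattern. A minimal sketch, assuming the hypothetical helpers `is_identifier` and `first_token`:

```cpp
#include <regex>
#include <string>

// True if `text` starts with a letter or underscore and continues with
// letters, digits or underscores only.
bool is_identifier(const std::string& text) {
    static const std::regex pattern("[[:alpha:]_][[:alnum:]_]*");
    return std::regex_match(text, pattern);
}

// Returns the first run of non-whitespace characters, or an empty
// string if there is none. [^[:space:]] negates the whole class.
std::string first_token(const std::string& text) {
    static const std::regex pattern("[^[:space:]]+");
    std::smatch m;
    if (std::regex_search(text, m, pattern)) {
        return m.str();
    }
    return "";
}
```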