Java Regular Expression Reference
Standard Regex Meta-Characters
Character | Matches |
---|---|
\\ | The backslash character (\) |
\0n | Octal character value 0n |
\xn | Hexadecimal character value 0xn |
\x{n...n} | Hexadecimal character value 0xn...n |
\t | Tab character |
\v | Vertical whitespace character (Java 8 and later: [\n\x0B\f\r\x85\u2028\u2029]; use \x0B for the vertical tab alone) |
\n | Newline character |
\r | Carriage-return character |
\f | Form-feed character |
\a | Alert (bell) character |
\e | Escape character |
\cX | Control character corresponding to X |
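
As a rough illustration of the escapes above, the sketch below checks a few of them against single characters. The class and method names (`EscapeDemo`, `matches`) are made up for this example; the only library call is `Pattern.matches`. Note that inside a Java string literal every regex backslash has to be doubled.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Regex \\ (written "\\\\" in Java source) against one backslash.
        System.out.println(matches("\\\\", "\\"));
        // Regex \t against a tab character.
        System.out.println(matches("\\t", "\t"));
        // Hexadecimal 0x41 and octal 0101 both denote 'A'.
        System.out.println(matches("\\x41", "A"));
        System.out.println(matches("\\0101", "A"));
        // The braces form accepts code points above 0xFFFF.
        System.out.println(matches("\\x{1F600}", "\uD83D\uDE00"));
        // Control-A is the character with value 1.
        System.out.println(matches("\\cA", "\u0001"));
    }
}
```

Each call in `main` should print `true`.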
Regex Predefined Character Classes
Character Class | Matches |
---|---|
. | Any character (except line terminators, unless the DOTALL flag is set) |
\d | Any digit character (0-9) |
\D | Any non-digit character (not 0-9) |
\s | Any whitespace character |
\S | Any non-whitespace character |
\w | Any word character |
\W | Any non-word character |
[xgn] | x, g, or n |
[^xgn] | Not x, g, or n |
[a-zA-Z] | All uppercase and lowercase letters |
[xgn[aei]] | x, g, n, a, e, or i (union) |
[n-z&&[xgn]] | x or n (intersection) |
[a-z&&[^bc]] | All lowercase letters but not b or c (subtraction) |
[a-z&&[^g-i]] | All lowercase letters but not g, h, or i (subtraction) |
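
The union, intersection, and subtraction rows above can be tried out with a small filter. This is only a sketch: `CharClassDemo` and `keep` are names invented for the example, and the sample string is arbitrary.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: keeps only the characters of input
    // that match the given single-character class.
    static String keep(String charClass, String input) {
        Pattern p = Pattern.compile(charClass);
        StringBuilder sb = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (p.matcher(String.valueOf(c)).matches()) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String sample = "abcghixnz";
        System.out.println(keep("[xgn[aei]]", sample));    // union: agixn
        System.out.println(keep("[n-z&&[xgn]]", sample));  // intersection: xn
        System.out.println(keep("[a-z&&[^bc]]", sample));  // subtraction: aghixnz
        System.out.println(keep("[a-z&&[^g-i]]", sample)); // subtraction: abcxnz
    }
}
```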
Regex POSIX Character Classes
Java Pattern | POSIX Pattern | Matches |
---|---|---|
\p{Lower} | [:lower:] | All lowercase letters [a-z] |
\p{Upper} | [:upper:] | All uppercase letters [A-Z] |
\p{ASCII} | | All ASCII characters |
\p{Alpha} | [:alpha:] | All uppercase and lowercase letters [A-Za-z] |
\p{Digit} | [:digit:] | All digits [0-9] |
\p{Alnum} | [:alnum:] | All letters and digits [A-Za-z0-9] |
\p{Punct} | [:punct:] | All punctuation [!"#$%&'()*+,\-./:;<=>?@\[\]^_`{|}~\\] |
\p{Graph} | [:graph:] | All letters, digits, and punctuation |
\p{Print} | [:print:] | All graph characters and space |
\p{Blank} | [:blank:] | Space or tab [\t ] |
\p{Cntrl} | [:cntrl:] | All control characters |
\p{XDigit} | [:xdigit:] | All hexadecimal digits [A-Fa-f0-9] |
\p{Space} | [:space:] | All whitespace characters [ \t\n\x0B\f\r] |
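
A short sketch of the POSIX classes in use follows. The helper names are hypothetical; the calls are the standard `String.matches` and `String.replaceAll`. By default these classes cover US-ASCII only, so an accented letter such as é does not match `\p{Alpha}` unless the pattern is compiled with `Pattern.UNICODE_CHARACTER_CLASS`.

```java
class PosixClassDemo {
    // True if s consists only of hexadecimal digits.
    static boolean isHex(String s) {
        return s.matches("\\p{XDigit}+");
    }

    // Removes every ASCII punctuation character.
    static String stripPunct(String s) {
        return s.replaceAll("\\p{Punct}", "");
    }

    // Collapses each run of spaces and tabs into one space.
    static String collapseBlanks(String s) {
        return s.replaceAll("\\p{Blank}+", " ");
    }

    // True if s is a single ASCII letter (default, non-Unicode mode).
    static boolean isAsciiLetter(String s) {
        return s.matches("\\p{Alpha}");
    }

    public static void main(String[] args) {
        System.out.println(isHex("CAFE42"));               // true
        System.out.println(isHex("xyz"));                  // false
        System.out.println(stripPunct("Hello, world!"));   // Hello world
        System.out.println(collapseBlanks("a \t  b"));     // a b
        System.out.println(isAsciiLetter("\u00e9"));       // false
    }
}
```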
Regex Quantifiers
Symbol | Greedy Usage | Reluctant Usage | Matches |
---|---|---|---|
? | X? | X?? | Preceding construct 0 or 1 time |
* | X* | X*? | Preceding construct 0 or more times |
+ | X+ | X+? | Preceding construct 1 or more times |
{n} | X{n} | X{n}? | Preceding construct exactly n times |
{n,} | X{n,} | X{n,}? | Preceding construct n or more times |
{n,m} | X{n,m} | X{n,m}? | Preceding construct at least n but no more than m times |
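
The difference between the greedy and reluctant columns is easiest to see side by side. In this sketch, `QuantifierDemo` and `firstMatch` are illustrative names only; a greedy quantifier takes as much input as it can, while the reluctant form takes as little as still lets the match succeed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: the first match of regex in input, or null.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String html = "<b>one</b><b>two</b>";
        // Greedy: runs to the last '>' in the input.
        System.out.println(firstMatch("<.+>", html));   // <b>one</b><b>two</b>
        // Reluctant: stops at the first '>'.
        System.out.println(firstMatch("<.+?>", html));  // <b>
        // Bounded repetition, greedy and reluctant.
        System.out.println(firstMatch("\\d{2,3}", "12345"));   // 123
        System.out.println(firstMatch("\\d{2,3}?", "12345"));  // 12
    }
}
```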
Regex Escaping & Quoting
Pattern | Usage | Meaning |
---|---|---|
\ | \. | Matches the following character literally when the escape does not form a meta-character; in the example, a literal dot |
\Q and \E | \Qliteral text.\E | Treats everything between \Q and \E as literal text, ignoring the special meanings of characters |
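
The following sketch compares an unescaped dot, an escaped dot, and a quoted literal. `QuotingDemo` and `matchesLiteral` are names chosen for the example; `Pattern.quote` is the standard library method that wraps a string in `\Q` and `\E`, which is convenient when the literal comes from user input.

```java
import java.util.regex.Pattern;

class QuotingDemo {
    // Hypothetical helper: true if input equals the literal,
    // with the literal quoted so no character is special.
    static boolean matchesLiteral(String literal, String input) {
        return Pattern.compile(Pattern.quote(literal)).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println("1x5".matches("1.5"));          // true: dot matches any character
        System.out.println("1x5".matches("1\\.5"));        // false: escaped dot is literal
        System.out.println("1.5".matches("\\Q1.5\\E"));    // true: quoted text is literal
        System.out.println(matchesLiteral("1.5", "1x5"));  // false
        System.out.println(Pattern.quote("1.5"));          // \Q1.5\E
    }
}
```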
Regex Logical Operators
Operator | Usage | Meaning |
---|---|---|
| | A|B | Matches A or B |
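
Alternation has the lowest precedence of the regex operators, so a group is usually needed to limit its scope. The sketch below, with made-up names and sample words, shows the difference.

```java
class AlternationDemo {
    // True if s is exactly "cat" or "dog".
    static boolean isPet(String s) {
        return s.matches("cat|dog");
    }

    // The group restricts the alternation to the middle letter.
    static boolean isGrey(String s) {
        return s.matches("gr(?:a|e)y");
    }

    public static void main(String[] args) {
        System.out.println(isPet("dog"));             // true
        System.out.println(isGrey("gray"));           // true
        System.out.println(isGrey("grey"));           // true
        // Without the group the alternatives are "gra" and "ey".
        System.out.println("grey".matches("gra|ey")); // false
        System.out.println("ey".matches("gra|ey"));   // true
    }
}
```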
Regex Groups & Back-references
Pattern | Usage | Meaning |
---|---|---|
() | (A) | Creates a capturing group, matching and capturing A |
(?<name>) | (?<name>To Match) | Creates a named capturing group with the given name |
(?:) | (?:To Match)+ | Creates a non-capturing group |
\n | (A)\1 | Matches the same text as the previously captured group n (groups are numbered left to right) |
\k<name> | \k<test> | Matches the same text as the previously captured group with the given name |
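
As a sketch of groups and back-references in practice, the example below finds a doubled word with a named back-reference and rearranges a date using numbered groups in the replacement string. The class name, method names, and sample inputs are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Returns the first word that is immediately repeated, or null.
    static String firstRepeatedWord(String text) {
        Pattern p = Pattern.compile("\\b(?<word>\\w+)\\s+\\k<word>\\b");
        Matcher m = p.matcher(text);
        return m.find() ? m.group("word") : null;
    }

    // Rewrites yyyy-mm-dd as dd/mm/yyyy; $1..$3 refer to the
    // capturing groups in the order their parentheses open.
    static String swapDate(String iso) {
        return iso.replaceAll("(\\d{4})-(\\d{2})-(\\d{2})", "$3/$2/$1");
    }

    public static void main(String[] args) {
        System.out.println(firstRepeatedWord("This is is a test")); // is
        System.out.println(swapDate("2024-01-31"));                 // 31/01/2024
        // A numbered back-reference requires the same text again.
        System.out.println("abab".matches("(ab)\\1"));              // true
    }
}
```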