RE | Description |
---|---|
. | Any character except new line |
\d | Digit (0 - 9) |
\D | Not a Digit |
\w | Word character (a-z, A-Z, 0-9, _) |
\W | Not a word character |
\s | Whitespace (space, tab, newline) |
\S | Not whitespace |
\b | Word boundary, e.g. \bha matches 'ha' only at the start of a word (start of string or a non-word character before it) |
\B | Not a word boundary |
^ | Beginning of a string |
$ | End of a string |
[] | Matches any one of the characters in the brackets; most metacharacters need no \ inside. A hyphen between two characters is a range: [1-7] == [1234567] != [-17] |
[^ ] | Matches any character not in the brackets, [^a-c] == any character except a, b, c |
\| | Either or (alternation) |
() | Group |
Quantifiers | |
* | 0 or more |
+ | 1 or more |
? | 0 or 1 |
{3} | Exact number |
{3,4} | Range of numbers {minimum,maximum}, written without a space after the comma |
Meta Characters | |
.[{()^$?*+ | need to be escaped with \ (backslash) to match literally |
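
The entries above can be tried out directly in Python. The following is a minimal sketch, assuming a made-up `sample` sentence (not from the original notes), that exercises `\d`, `\b`, a character class, the `+` and `{n}` quantifiers, and metacharacter escaping:

```python
import re

# Hypothetical sample sentence, used only for illustration
sample = "Order 42 shipped to 10 Main St. on 2024-01-15 (cost: $3.50)."

# \d+ : one or more digits in a row
numbers = re.findall(r"\d+", sample)

# \d{4}-\d{2}-\d{2} : exact repetition counts, here a YYYY-MM-DD date
dates = re.findall(r"\d{4}-\d{2}-\d{2}", sample)

# \b[A-Z]\w* : a word boundary, one capital letter, then 0 or more word characters
capitalized = re.findall(r"\b[A-Z]\w*", sample)

# $ and . are metacharacters, so they are escaped to match literally
prices = re.findall(r"\$\d+\.\d{2}", sample)

# re.escape() adds the needed backslashes to a literal string
escaped = re.escape("3.50")

print(numbers)      # ['42', '10', '2024', '01', '15', '3', '50']
print(dates)        # ['2024-01-15']
print(capitalized)  # ['Order', 'Main', 'St']
print(prices)       # ['$3.50']
print(escaped)      # 3\.50
```

Raw strings (`r"..."`) are used for every pattern so that Python does not interpret the backslashes before the regex engine sees them.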

Text | Regular Expression |
---|---|
abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ 1234567890 | |
Ha HaHa | \bHa (using word boundary) |
abhishekpathak.com | \w+\.com |
321-555-4321 123.555.1234 | \d{3}[-.]\d{3}[-.]\d{4} |
Mr. Schafer Mr Smith Ms David Mrs. Robinson Mr. T | M(r\|s\|rs)\.?\s[A-Z]\w* |
cat mat pat bat | [^b]at |
CoreyMSchafer@gmail.com corey.schafer@university.edu corey-321-schafer@my-work.net | [a-zA-Z0-9.-]+@[a-zA-Z-]+\.(com\|edu\|net) |
Most kinds of email addresses | [a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+ |
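
As a quick check, the sample texts from this table can be run against their patterns. This is a sketch rather than a definitive test: the helper `all_matches` is a hypothetical name introduced here, and the patterns are written with `\d` for digits and with the literal dot escaped as `\.`.

```python
import re

def all_matches(pattern, text):
    """Hypothetical helper: return the full text of every match."""
    return [m.group(0) for m in re.finditer(pattern, text)]

phones = "321-555-4321 123.555.1234"
names = "Mr. Schafer Mr Smith Ms David Mrs. Robinson Mr. T"
words = "cat mat pat bat"
emails = (
    "CoreyMSchafer@gmail.com corey.schafer@university.edu "
    "corey-321-schafer@my-work.net"
)

# Three digits, a dash or dot, three digits, a dash or dot, four digits
found_phones = all_matches(r"\d{3}[-.]\d{3}[-.]\d{4}", phones)

# Mr / Ms / Mrs, an optional literal dot, whitespace, a capitalized name
found_names = all_matches(r"M(r|s|rs)\.?\s[A-Z]\w*", names)

# Any character except 'b', followed by 'at'
found_words = all_matches(r"[^b]at", words)

# Addresses ending in .com, .edu or .net
found_emails = all_matches(r"[a-zA-Z0-9.-]+@[a-zA-Z-]+\.(com|edu|net)", emails)

print(found_phones)  # ['321-555-4321', '123.555.1234']
print(found_names)   # ['Mr. Schafer', 'Mr Smith', 'Ms David', 'Mrs. Robinson', 'Mr. T']
print(found_words)   # ['cat', 'mat', 'pat']
print(found_emails)  # all three addresses
```

`re.finditer` with `group(0)` is used instead of `re.findall` because `findall` returns only the captured group when a pattern contains parentheses, which would give `'r'`, `'s'`, `'rs'` instead of the full names.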
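```python
import re

# Raw strings keep backslashes literal, which is what regex patterns need
print("\tTab")   # prints a tab character followed by Tab
print(r"\tTab")  # prints \tTab

search_in_text = """
abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
1234567890
"""

# Replace the placeholder with the pattern you want to test
pattern = re.compile(r'regular_expression_here')
searches = pattern.finditer(search_in_text)

# Each item is a match object showing the span and the matched text
for search in searches:
    print(search)
```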
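
Each item yielded by `finditer` is a match object, and printing it is only one way to use it. The sketch below assumes the phone-number sample text and adds capturing groups purely for illustration. It shows how `span()`, `group()` and `groups()` expose the position and pieces of each match.

```python
import re

text = "321-555-4321 123.555.1234"

# Parentheses create groups: area code, exchange, line number
pattern = re.compile(r"(\d{3})[-.](\d{3})[-.](\d{4})")

results = []
for match in pattern.finditer(text):
    start, end = match.span()            # index range of the match in text
    assert text[start:end] == match.group(0)
    results.append((match.span(), match.group(0), match.groups()))

for span, whole, parts in results:
    print(span, whole, parts)
# (0, 12) 321-555-4321 ('321', '555', '4321')
# (13, 25) 123.555.1234 ('123', '555', '1234')

# Raw string check: "\t" is one tab character, r"\t" is two characters
print(len("\tTab"), len(r"\tTab"))  # 4 5
```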