RegexOne note
Special characters: \ / ( ) [ ] { } + * ? | $ ^ .
These have a special meaning in a pattern, so to match one of them literally you need to put a backslash (\) before it.
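A minimal sketch of the escaping rule, using Python's standard re module. The sample strings are made up for illustration; re.escape is shown as one convenient way to add the backslashes automatically.

```python
import re

# Without escaping, "." is a wildcard and matches any character.
unescaped = re.findall(r"3.14", "3.14 3x14")

# With a backslash, "\." matches only a literal period.
escaped = re.findall(r"3\.14", "3.14 3x14")

# re.escape() adds the backslashes for you when the text is not known in advance.
price_pattern = re.escape("$5.00 (approx)")
found = re.search(price_pattern, "total: $5.00 (approx)")

print(unescaped)      # ['3.14', '3x14']
print(escaped)        # ['3.14']
print(found.group())  # $5.00 (approx)
```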
Greedy matching
As the official Python documentation says, a greedy quantifier matches as much text as it can.
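To see what "as much as it can" means in practice, here is a small sketch in Python. The HTML-like sample string is an assumption chosen only to make the effect visible.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# ".*" is greedy: it runs to the last ">" it can reach, not the first one.
greedy = re.search(r"<.*>", html).group()

# "\d+" also takes every digit available, not just one.
number = re.search(r"\d+", "abc12345").group()

print(greedy)  # <b>bold</b> and <i>italic</i>
print(number)  # 12345
```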
abc : Matches the characters a, b, c in that exact order.
\d : Matches any digit, same as [0-9].
\D : Matches any non-digit character, same as [^0-9].
. : Any character (except a newline by default).
\. : A literal period (a special character must be preceded by a backslash, e.g. \? or \.).
[abc] : Only a, b, or c.
[^abc] : Not a, b, or c.
[a-z] : Characters a to z.
[0-9] : Numbers 0 to 9.
\w : Any alphanumeric character or underscore, same as [a-zA-Z_0-9].
\W : Any non-alphanumeric character, same as [^a-zA-Z_0-9].
{m} : m repetitions.
{m,n} : m to n repetitions.
* : Zero or more repetitions, also written {0,}.
+ : One or more repetitions, also written {1,}.
? : Optional character, also written {0,1}.
\s : Any whitespace (space, tab, newline, and so on), same as [ \t\n\r\f\v].
\S : Any non-whitespace character, same as [^ \t\n\r\f\v].
^...$ : Start and end of the string.
(...) : Capture group.
(a(bc)) : Capture sub-group.
(.*) : Capture all.
(abc|def) : Matches abc or def.
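The entries above can be tried out with a short Python sketch. All sample strings (the order text, the e-mail address, and so on) are invented for illustration, and only functions from the standard re module are used.

```python
import re

# Character sets: [abc] accepts exactly one of the listed characters.
set_hits = re.findall(r"[abc]an", "can man ban fan")

# \d with {m,n}: runs of two to three digits only.
digit_runs = re.findall(r"\d{2,3}", "Order 66: 3 apples, 12 pears")

# \w+ keeps letters, digits, and underscores together.
words = re.findall(r"\w+", "snake_case, CamelCase!")

# ? makes the preceding character optional.
spellings = re.findall(r"colou?r", "color colour colouur")

# \s+ treats any run of spaces, tabs, or newlines as one separator.
fields = re.split(r"\s+", "a b\tc\nd")

# ^...$ with nested capture groups; groups are numbered by opening parenthesis.
match = re.match(r"^(\w+)@((\w+)\.(com|org))$", "alice@example.org")
parts = match.groups()

print(set_hits)    # ['can', 'ban']
print(digit_runs)  # ['66', '12']
print(words)       # ['snake_case', 'CamelCase']
print(spellings)   # ['color', 'colour']
print(fields)      # ['a', 'b', 'c', 'd']
print(parts)       # ('alice', 'example.org', 'example', 'org')
```

Note that {m,n} is written without a space; Python treats "{2, 3}" as literal text rather than a quantifier.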
Anchors
^ : Matches the beginning of a string. Example: ^(I|You) matches I or You at the start of a string.
$ : Normally matches the empty string at the end of a string or just before a newline at the end of a string. Example: (\.edu|\.org|\.com)$ matches .edu, .org, or .com at the end of a string.
\b : Matches a "word boundary", the beginning or end of a word. Example: s\b matches s characters at the end of words.
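The three anchor examples can be checked with a short Python sketch. The test sentences are assumptions made up for this note, and the \w+s\b pattern is a small extension of the s\b example so that the whole word is returned.

```python
import re

# ^ only matches at the start, so "You" in the middle of a string does not count.
at_start = re.search(r"^(I|You)", "You are here")
in_middle = re.search(r"^(I|You)", "Are You here")

# $ matches at the very end, and also just before a trailing newline.
domain = re.search(r"(\.edu|\.org|\.com)$", "www.python.org").group()
before_newline = re.search(r"\.org$", "www.python.org\n")

# s\b matches an s only when it ends a word; \w+ in front returns the whole word.
plural_like = re.findall(r"\w+s\b", "cats chase mice across fields")

print(at_start.group())            # You
print(in_middle)                   # None
print(domain)                      # .org
print(before_newline is not None)  # True
print(plural_like)                 # ['cats', 'across', 'fields']
```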
Some tricks in class
- If you want non-greedy matching, add ? after * or + to turn off the greedy behavior of matching as much as possible. (But this does not work in the sed command; use the perl command instead.)
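A small Python sketch comparing the greedy and non-greedy forms side by side. The sample string is an assumption; the patterns differ only by the extra ?.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: one match that swallows everything up to the last ">".
greedy_tags = re.findall(r"<.*>", html)

# Non-greedy: "*?" stops at the first ">" it finds, so each tag is separate.
lazy_tags = re.findall(r"<.*?>", html)

# The same suffix works on "+": "\d+?" takes as few digits as possible.
lazy_digit = re.search(r"\d+?", "12345").group()

print(greedy_tags)  # ['<b>bold</b> and <i>italic</i>']
print(lazy_tags)    # ['<b>', '</b>', '<i>', '</i>']
print(lazy_digit)   # 1
```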