RegexOne notes

Special characters: \ / ( ) [ ] { } + * ? | $ ^ . These have a special meaning inside a pattern, so to match one of them literally you need to put a backslash (\) before it.
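A minimal sketch of what escaping changes, using Python's re module. The sample strings are made up for illustration; re.escape() is the standard-library helper that adds the backslashes for you.

```python
import re

# Hypothetical sample text containing several special characters.
literal = "1+1=2 (really?)"

# Used directly as a pattern, "+", "(", ")" and "?" act as operators,
# so the pattern does not match its own literal text.
unescaped = re.search(literal, literal)

# re.escape() puts a backslash before every special character.
escaped = re.search(re.escape(literal), literal)

# Escaping a single character by hand works the same way.
periods = re.findall(r"\.", "a.b.c")

print(unescaped)          # None
print(escaped.group())    # 1+1=2 (really?)
print(periods)            # ['.', '.']
```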

Greedy Way

As the official Python documentation puts it, a greedy quantifier matches as much text as it can.
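A small sketch of greedy behaviour in Python, assuming two made-up input strings. The point is only that the quantifier keeps going until it cannot match any further.

```python
import re

# .* is greedy: it runs to the last "c" it can reach, not the first one.
greedy = re.search(r"a.*c", "abcabc").group()

# \d+ takes every consecutive digit, not just the first.
digits = re.search(r"\d+", "order 12345 shipped").group()

print(greedy)   # abcabc
print(digits)   # 12345
```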

abc Matches the characters a, b, c in exactly that order
\d Any digit [0-9]
\D Any non-digit character [^0-9]
. Any character (except a newline, by default)
\. Period (a special character must be preceded by a backslash to match it literally, e.g. \? or \.)
[abc] Only a, b, or c
[^abc] Not a, b, nor c
[a-z] Characters a to z
[0-9] Numbers 0 to 9
\w Any Alphanumeric character [a-zA-Z_0-9]
\W Any Non-alphanumeric character [^a-zA-Z_0-9]
{m} m Repetitions
{m,n} m to n Repetitions (write it without a space after the comma)
* Zero or more repetitions, also denoted by {0,}
+ One or more repetitions, also denoted by {1,}
? Optional character, also denoted by {0,1}
\s Any Whitespace (matches any single whitespace character, such as a space, tab, or newline) [ \t\n\r\f\v]
\S Any Non-whitespace character [^ \t\n\r\f\v]
^...$ Start and end of the string
(...) Capture Group
(a(bc)) Capture Sub-group
(.*) Capture all
(abc|def) Matches abc or def
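The entries above can be tried out with a short Python sketch. All input strings (the phone number, the words) are hypothetical examples chosen for illustration, not part of the original notes.

```python
import re

# \d with {m}: three digit groups, each captured with (...).
phone = re.fullmatch(r"(\d{3})-(\d{3})-(\d{4})", "415-555-1234")
phone_parts = phone.groups()

# [abc] matches only a, b, or c; [^abc] matches everything else.
only_abc = re.findall(r"[abc]", "cabbage")
not_abc = re.findall(r"[^abc]", "cabbage")

# \w+ grabs runs of letters, digits, and underscores.
words = re.findall(r"\w+", "snake_case and kebab-case")

# {m,n}: between two and three repetitions.
two_or_three = re.fullmatch(r"a{2,3}", "aaa") is not None
too_many = re.fullmatch(r"a{2,3}", "aaaa") is not None

# ? makes the preceding character optional.
optional = [bool(re.fullmatch(r"colou?r", w)) for w in ("color", "colour")]

# \s+ matches any run of whitespace (space, tab, newline, ...).
fields = re.split(r"\s+", "a \t b\nc")

# Nested groups are numbered by their opening parenthesis.
nested = re.fullmatch(r"(a(bc))", "abc").groups()

# | picks one alternative or the other.
either = re.findall(r"abc|def", "abcdef")

print(phone_parts)   # ('415', '555', '1234')
print(only_abc)      # ['c', 'a', 'b', 'b', 'a']
print(not_abc)       # ['g', 'e']
print(words)         # ['snake_case', 'and', 'kebab', 'case']
print(nested)        # ('abc', 'bc')
```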

Anchors

^: Matches the beginning of a string. Example: ^(I|You) matches I or You at the start of a string.
$: Normally matches the empty string at the end of a string or just before a newline at the end of a string. Example: (\.edu|\.org|\.com)$ matches .edu, .org, or .com at the end of a string.
\b: Matches a "word boundary", the beginning or end of a word. Example: s\b matches s characters at the end of words.
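The three anchors can be checked with a quick Python sketch. The sentences and the domain name are invented for the example.

```python
import re

# ^ only matches at the start of the string.
at_start = re.search(r"^(I|You)", "You are here")
not_at_start = re.search(r"^(I|You)", "Are You here")

# $ matches at the end, and also just before a trailing newline.
suffix = re.search(r"(\.edu|\.org|\.com)$", "example.org")
suffix_newline = re.search(r"(\.edu|\.org|\.com)$", "example.org\n")

# \b marks a word boundary, so s\b only accepts an s that ends a word.
plural_words = re.findall(r"\w+s\b", "cats sit on sofas")

print(at_start.group())        # You
print(not_at_start)            # None
print(suffix.group())          # .org
print(suffix_newline.group())  # .org
print(plural_words)            # ['cats', 'sofas']
```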

Some tricks from class

If you want non-greedy matching
Add ? after * or + to turn off greedy matching, which otherwise matches as much as it can. (But this does not work in the sed command, whose regex syntax has no non-greedy quantifiers; use the perl command instead.)
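A short Python sketch of the difference, assuming a made-up HTML-like string. Python's re module supports the non-greedy ? suffix directly.

```python
import re

# Hypothetical input with several tags on one line.
text = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last ">" in the string, giving one big match.
greedy_tags = re.findall(r"<.+>", text)

# Non-greedy: .+? stops at the first ">" it meets, giving one match per tag.
lazy_tags = re.findall(r"<.+?>", text)

print(greedy_tags)  # ['<b>bold</b> and <i>italic</i>']
print(lazy_tags)    # ['<b>', '</b>', '<i>', '</i>']
```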

posted on 2022-07-01 20:08 Jack404
