代码改变世界

一起谈.NET技术,.NET 中的正则表达式

2011-09-02 00:21  狼人:-)  阅读(257)  评论(0编辑  收藏  举报

前两天面试一个程序员,自己说工作中用到过正则表达式,也比较熟悉,问他要使用正则表达式需要引用那个命名空间,使用哪些类,居然吱吱唔唔答不上来,让他写一个验证电话号码的正则表达式也写不出来,实在是很奇怪这种程序员是怎么工作了两三年的。

言归正传,下面介绍下.net中正则表达式中的使用。

要在.net中使用正则表达式,需要引用System.Text.RegularExpressions 命名空间。新建一个正则表达式类:

string pattern = "some_pattern"; //正则表达式字符串
Regex regex = new Regex(pattern);

使用正则表达式匹配字符串

string input = "some_input";
Match match
= regex.Match(input);
MatchCollection matches
= regex.Matches(input);
bool isMatch = regex.IsMatch(input);

Match方法返回单个的精确匹配结果,Matches返回所有的匹配结果的一个Match类的集合,IsMatch方法返回是否能够匹配输入字符串的一个bool结果。

Match类是一个保持匹配结果的类,它有一个成员Groups,是一个保存Group class的集合类。

Group 表示单个捕获组的结果。由于存在数量词,一个捕获组可以在单个匹配中捕获零个、一个或更多的字符串,因此 Group 提供 Capture 对象的集合。

Capture 表示单个成功捕获中的一个子字符串。

Group从Capture继承,表示单个捕获组的最后一个字符串。

即对于一个Group 类的实例对象group:

int captureCount = group.Captures.Count;

则group.Value与group.Captures[captureCount - 1].Value是相等的。

以下是几个正则表达式的使用样例:

使用正则表达式检查字符串是否具有表示货币值的正确格式。

 

代码
using System;
using System.Text.RegularExpressions;

public class Test
{
public static void Main ()
{
// Define a regular expression for currency values.
Regex rx = new Regex(@"^-?\d+(\.\d{2})?$");

// Define some test strings.
string[] tests = {"-42", "19.99", "0.001", "100 USD",
".34", "0.34", "1,052.21"};

// Check each test string against the regular expression.
foreach (string test in tests)
{
if (rx.IsMatch(test))
{
Console.WriteLine(
"{0} is a currency value.", test);
}
else
{
Console.WriteLine(
"{0} is not a currency value.", test);
}
}
}
}
// The example displays the following output to the console:
// -42 is a currency value.
// 19.99 is a currency value.
// 0.001 is not a currency value.
// 100 USD is not a currency value.
// .34 is not a currency value.
// 0.34 is a currency value.
// 1,052.21 is not a currency value.

 

使用正则表达式检查字符串中重复出现的词。

 

 

using System;
using System.Text.RegularExpressions;

public class Test
{

public static void Main ()
{

// Define a regular expression for repeated words.
Regex rx = new Regex(@"\b(?<word>\w+)\s+(\k<word>)\b",
RegexOptions.Compiled
| RegexOptions.IgnoreCase);

// Define a test string.
string text = "The the quick brown fox fox jumped over the lazy dog
dog.";

// Find matches.
MatchCollection matches = rx.Matches(text);

// Report the number of matches found.
Console.WriteLine("{0} matches found in:\n {1}",
matches.Count,
text);

// Report on each match.
foreach (Match match in matches)
{
GroupCollection groups
= match.Groups;
Console.WriteLine(
"'{0}' repeated at positions {1} and {2}",
groups[
"word"].Value,
groups[
0].Index,
groups[
1].Index);
}

}

}
// The example produces the following output to the console:
// 3 matches found in:
// The the quick brown fox fox jumped over the lazy dog dog.
// 'The' repeated at positions 0 and 4
// 'fox' repeated at positions 20 and 25
// 'dog' repeated at positions 50 and 54

使用 Capture 对象在控制台中显示每个正则表达式匹配项组的成员。

 

代码
string text = "One fish two fish red fish blue fish";
string pat = @"(?<1>\w+)\s+(?<2>fish)\s*";
// Compile the regular expression.
Regex r = new Regex(pat, RegexOptions.IgnoreCase);
// Match the regular expression pattern against a text string.
Match m = r.Match(text);
while (m.Success)
{
// Display the first match and its capture set.
System.Console.WriteLine("Match=[" + m + "]");
CaptureCollection cc
= m.Captures;
foreach (Capture c in cc)
{
System.Console.WriteLine(
"Capture=[" + c + "]");
}
// Display Group1 and its capture set.
Group g1 = m.Groups[1];
System.Console.WriteLine(
"Group1=[" + g1 + "]");
foreach (Capture c1 in g1.Captures)
{
System.Console.WriteLine(
"Capture1=[" + c1 + "]");
}
// Display Group2 and its capture set.
Group g2 = m.Groups[2];
System.Console.WriteLine(
"Group2=["+ g2 + "]");
foreach (Capture c2 in g2.Captures)
{
System.Console.WriteLine(
"Capture2=[" + c2 + "]");
}
// Advance to the next match.
m = m.NextMatch();
}
// The example displays the following output:
// Match=[One fish ]
// Capture=[One fish ]
// Group1=[One]
// Capture1=[One]
// Group2=[fish]
// Capture2=[fish]
// Match=[two fish ]
// Capture=[two fish ]
// Group1=[two]
// Capture1=[two]
// Group2=[fish]
// Capture2=[fish]
// Match=[red fish ]
// Capture=[red fish ]
// Group1=[red]
// Capture1=[red]
// Group2=[fish]
// Capture2=[fish]
// Match=[blue fish]
// Capture=[blue fish]
// Group1=[blue]
// Capture1=[blue]
// Group2=[fish]
// Capture2=[fish]