Groups in Java Regular Expressions

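A group is a part of a regular expression delimited by parentheses, and it can be referred to later by its group number. Group 0 is the whole expression, group 1 is the group enclosed by the first pair of parentheses, and so on. Thus, the expression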

A(B(C))D

contains three groups: group 0 is ABCD, group 1 is BC, and group 2 is C.
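The numbering above can be checked with a small sketch. The class name GroupNumbering and the helper groupsOf are hypothetical names introduced for illustration; only Pattern, Matcher, group(int) and groupCount() come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupNumbering {
  // Hypothetical helper: returns groups 0..groupCount() when the
  // whole input matches the regex, or an empty list otherwise.
  static List<String> groupsOf(String regex, String input) {
    Matcher m = Pattern.compile(regex).matcher(input);
    List<String> groups = new ArrayList<>();
    if (m.matches()) {
      // groupCount() does not count group 0, hence the <= bound
      for (int i = 0; i <= m.groupCount(); i++) {
        groups.add(m.group(i));
      }
    }
    return groups;
  }

  public static void main(String[] args) {
    // Group 0 is the whole match, group 1 is (B(C)), group 2 is (C)
    System.out.println(groupsOf("A(B(C))D", "ABCD")); // prints [ABCD, BC, C]
  }
}
```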

Usage example:

//: strings/Groups.java
import java.util.regex.*;

public class Groups {
  public static final String POEM =
    "Twas brillig, and the slithy toves\n" +
    "Did gyre and gimble in the wabe.\n" +
    "All mimsy were the borogoves,\n" +
    "And the mome raths outgrabe.\n\n" +
    "Beware the Jabberwock, my son,\n" +
    "The jaws that bite, the claws that catch.\n" +
    "Beware the Jubjub bird, and shun\n" +
    "The frumious Bandersnatch.";
  public static void main(String[] args) {
    Matcher m =
      Pattern.compile("(?m)(\\S+)\\s+((\\S+)\\s+(\\S+))$")
        .matcher(POEM);
    while(m.find()) {
      // groupCount() excludes group 0, hence the <= bound
      for(int j = 0; j <= m.groupCount(); j++)
        System.out.print("[" + m.group(j) + "]");
      System.out.println();
    }
  }
} /* Output:
[the slithy toves][the][slithy toves][slithy][toves]
[in the wabe.][in][the wabe.][the][wabe.]
[were the borogoves,][were][the borogoves,][the][borogoves,]
[mome raths outgrabe.][mome][raths outgrabe.][raths][outgrabe.]
[Jabberwock, my son,][Jabberwock,][my son,][my][son,]
[claws that catch.][claws][that catch.][that][catch.]
[bird, and shun][bird,][and shun][and][shun]
[The frumious Bandersnatch.][The][frumious Bandersnatch.][frumious][Bandersnatch.]
*///:~

Analysis:

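First, in the regex "(?m)(\\S+)\\s+((\\S+)\\s+(\\S+))$", (?m) turns on multiline mode, in which $ matches at the end of every line rather than only at the end of the input. The $ therefore requires whatever the preceding part of the pattern matched to sit at the end of a line.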
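The effect of (?m) on $ can be seen in a minimal sketch. The class MultilineDollar, the helper findAll, and the two-line sample text are assumptions made for illustration and are not part of the original example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultilineDollar {
  // Hypothetical helper: collects every match of regex in input.
  static List<String> findAll(String regex, String input) {
    Matcher m = Pattern.compile(regex).matcher(input);
    List<String> hits = new ArrayList<>();
    while (m.find()) {
      hits.add(m.group());
    }
    return hits;
  }

  public static void main(String[] args) {
    String text = "one two\nthree four";
    // Without (?m), $ only matches at the end of the input
    System.out.println(findAll("\\S+$", text));     // prints [four]
    // With (?m), $ also matches just before each line terminator
    System.out.println(findAll("(?m)\\S+$", text)); // prints [two, four]
  }
}
```

The inline flag (?m) is equivalent to passing Pattern.MULTILINE to Pattern.compile.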
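The purpose of this regex is to capture the last three words of each line. The output shows five groups per match: (\\S+)\\s+((\\S+)\\s+(\\S+))$ contains four pairs of parentheses, and adding group(0), the whole expression, makes five. This is also why the loop condition is j <= m.groupCount(): groupCount() returns 4, because it does not count group 0.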
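Groups 0 through 4 are numbered by the same rule as in the A(B(C))D example above, counting opening parentheses from left to right, so the details are not repeated here. The output confirms the numbering.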
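As a rough check of that numbering rule, the sketch below applies the same pattern (without (?m), since the input is a single line) to the first line of the poem and records where each group starts. The class GroupOffsets and the helper describe are hypothetical names. start(int) is the standard Matcher method that returns a group's start index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupOffsets {
  // Hypothetical helper: for the first match, lists each group as
  // "number@startIndex:text", from group 0 up to groupCount().
  static List<String> describe(String regex, String input) {
    Matcher m = Pattern.compile(regex).matcher(input);
    List<String> out = new ArrayList<>();
    if (m.find()) {
      for (int i = 0; i <= m.groupCount(); i++) {
        out.add(i + "@" + m.start(i) + ":" + m.group(i));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    String line = "Twas brillig, and the slithy toves";
    // Groups 2 and 3 start at the same index; group 2 gets the lower
    // number because its opening parenthesis comes first in the pattern.
    for (String s : describe("(\\S+)\\s+((\\S+)\\s+(\\S+))$", line)) {
      System.out.println(s);
    }
  }
}
```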
posted @ 2020-09-16 22:14  模糊计算士