
Groovy: Regular Expressions

2013-10-02 01:35  Rollen Holt

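Groovy supports regular expressions through the ~"pattern" syntax, which compiles the given pattern string into a Java Pattern object. Groovy also provides the =~ operator (creates a Matcher) and the ==~ operator (returns a boolean that tells whether the whole string matches the pattern).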

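When the pattern contains groups, matcher[index] is a List: the first element is the whole match and the remaining elements are the strings matched by each group. When the pattern has no groups, matcher[index] is simply the matched String.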

import java.util.regex.Matcher
import java.util.regex.Pattern
// ~ creates a Pattern from a String
def pattern = ~/foo/
assert pattern instanceof Pattern
assert pattern.matcher("foo").matches() // matches() returns true
assert !pattern.matcher("foobar").matches() // matches() returns false, because it must match the whole String

// =~ creates a Matcher; in a boolean context it is "true" if it has at least one match, "false" otherwise.
assert "cheesecheese" =~ "cheese"
assert "cheesecheese" =~ /cheese/
assert "cheese" == /cheese/ /* they are both string syntaxes */
assert !("cheese" =~ /ham/)

// ==~ tests whether the whole String matches the pattern
assert "2009" ==~ /\d+/ // ==~ returns true
assert !("holla" ==~ /\d+/) // ==~ returns false

// let's create a Matcher
def matcher = "cheesecheese" =~ /cheese/
assert matcher instanceof Matcher

// let's do some replacement
def cheese = ("cheesecheese" =~ /cheese/).replaceFirst("nice")
assert cheese == "nicecheese"
assert "color" == "colour".replaceFirst(/ou/, "o")

cheese = ("cheesecheese" =~ /cheese/).replaceAll("nice")
assert cheese == "nicenice"

// simple group demo
// You can also match a pattern that includes groups. First create a matcher object,
// either using the Java API, or more simply with the =~ operator. Then, you can index
// the matcher object to find the matches. matcher[0] returns a List representing the
// first match of the regular expression in the string. The first element is the string
// that matches the entire regular expression, and the remaining elements are the strings
// that match each group.
// Here's how it works:
def m = "foobarfoo" =~ /o(b.*r)f/
assert m[0] == ["obarf", "bar"]
assert m[0][1] == "bar"

// Although a Matcher isn't a list, it can be indexed like a list. In Groovy 1.6
// this includes using a collection as an index:

matcher = "eat green cheese" =~ "e+"

assert "ee" == matcher[2]
assert ["ee", "e"] == matcher[2..3]
assert ["e", "ee"] == matcher[0, 2]
assert ["e", "ee", "ee"] == matcher[0, 1..2]

matcher = "cheese please" =~ /([^e]+)e+/
assert ["se", "s"] == matcher[1]
assert [["se", "s"], [" ple", " pl"]] == matcher[1, 2]
assert [["se", "s"], [" ple", " pl"]] == matcher[1..2]
assert [["chee", "ch"], [" ple", " pl"], ["ase", "as"]] == matcher[0, 2..3]
// Matcher defines an iterator() method, so it can be used, for example,
// with collect() and each():
matcher = "cheese please" =~ /([^e]+)e+/
matcher.each { println it }
matcher.reset()
assert matcher.collect { it } ==
    [["chee", "ch"], ["se", "s"], [" ple", " pl"], ["ase", "as"]]
// The semantics of the iterator were changed by Groovy 1.6.
// In 1.5, each iteration would always return a string of the entire match, ignoring groups.
// In 1.6, if the regex has any groups, it returns a list of Strings as shown above.

// there is also the regular-expression-aware iterator grep()
assert ["foo", "moo"] == ["foo", "bar", "moo"].grep(~/.*oo$/)
// which can also be written with the findAll() method
assert ["foo", "moo"] == ["foo", "bar", "moo"].findAll { it ==~ /.*oo/ }
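
The group handling shown above can also be combined with closure parameters. The following is a minimal sketch rather than part of the original examples: it assumes Groovy 1.6 or later, where iterating over a Matcher whose pattern has groups yields one List per match, and the input string and the names text, kv, pairs, whole, key and value are made up for illustration.

```groovy
import java.util.regex.Matcher

// Hypothetical input; every name in this sketch is illustrative only.
def text = "width=10, height=20"
Matcher kv = text =~ /(\w+)=(\d+)/

// Each match is a List [wholeMatch, group1, group2]; a closure that declares
// several parameters receives the elements of that List one by one.
def pairs = [:]
kv.each { whole, key, value ->
    pairs[key] = value.toInteger()
}

assert pairs == [width: 10, height: 20]

// =~ succeeds if the pattern is found anywhere in the String,
// while ==~ requires the whole String to match.
assert text =~ /\d+/
assert !(text ==~ /\d+/)
```

Declaring one closure parameter per group avoids indexing into the match List by hand, at the cost of having to keep the parameter count in step with the number of groups in the pattern.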

More Examples

Capitalize the first word on each line:

def before='''
apple
orange
y
banana
'''

def expected='''
Apple
Orange
Y
Banana
'''

assert expected == before.replaceAll(/(?m)^\w+/,
    { it[0].toUpperCase() + ((it.size() > 1) ? it[1..-1] : '') })

Capitalize every word in a string:

assert "It Is A Beautiful Day!" ==
    ("it is a beautiful day!".replaceAll(/\w+/,
    { it[0].toUpperCase() + ((it.size() > 1) ? it[1..-1] : '') }))

Use .toLowerCase() to make the rest of each word lowercase:

assert "It Is A Very Beautiful Day!" ==
    ("it is a VERY beautiful day!".replaceAll(/\w+/,
    { it[0].toUpperCase() + ((it.size() > 1) ? it[1..-1].toLowerCase() : '') }))
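
The examples above pass a closure to replaceAll() for patterns without groups. As a hedged sketch of the same technique with groups, the snippet below assumes the usual Groovy behaviour in which the closure receives the whole match followed by one argument per group; the date string and the names iso, reordered, whole, year, month and day are hypothetical.

```groovy
// Hypothetical date string, used only for illustration.
def iso = "2013-10-02"

// With groups in the pattern, the closure is called with the whole match
// first and then one argument per group, in order.
def reordered = iso.replaceAll(/(\d{4})-(\d{2})-(\d{2})/) { whole, year, month, day ->
    day + "/" + month + "/" + year
}

assert reordered == "02/10/2013"
```

Because the replacement is computed in code rather than written as a replacement string, no $-style backreferences are involved here.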

Gotchas

How to use backreferences with String.replaceAll()

GStrings may not behave the way you expect:

def replaced = "abc".replaceAll(/(a)(b)(c)/, "$1$3")

This produces a compilation error similar to the following:

[] illegal string body character after dollar sign:

solution: either escape a literal dollar sign "\$5" or bracket the value expression "${5}" @ line []

Solution:

Use ' or / to delimit the replacement string:

def replaced = "abc".replaceAll(/(a)(b)(c)/, '$1$3')
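
As a short sketch of the workaround, the assertions below check the result of the single-quoted form and show two related variants that are not in the original text: escaping each dollar sign inside a double-quoted string, and using java.util.regex.Matcher.quoteReplacement() when a literal dollar sign is wanted in the output. The variable name groups is illustrative only.

```groovy
import java.util.regex.Matcher

def groups = /(a)(b)(c)/

// Single quotes: no GString interpolation, so $1 and $3 reach the
// regex engine as backreferences to groups 1 and 3.
assert "abc".replaceAll(groups, '$1$3') == "ac"

// Double quotes also work once every dollar sign is escaped.
assert "abc".replaceAll(groups, "\$1\$3") == "ac"

// To emit a literal "$1" instead of the group, quote the replacement string.
assert "abc".replaceAll(groups, Matcher.quoteReplacement('$1')) == '$1'
```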