Oracle 中的正则函数

保存在这里,方便自己以后查看 (_) (╯▽╰)

 

这个是函数中将会用到的模式串的介绍:

clip_image001

 

Oracle 中的正则函数一共有5个:

REGEXP_COUNT -- 统计子串出现的次数

REGEXP_INSTR -- 查找子串在母串中的位置

REGEXP_LIKE -- 模糊查询子串

REGEXP_REPLACE -- 替换

REGEXP_SUBSTR -- 截取

 

以下是官方关于这5个函数可能会用到的参数的介绍:

·        source_char is a character expression that serves as the search value. It is commonly a character column and can be of any of the datatypes CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB.

 

·        pattern is the regular expression. It is usually a text literal and can be of any of the datatypes CHAR, VARCHAR2, NCHAR, or NVARCHAR2. It can contain up to 512 bytes. If the datatype of pattern is different from the datatype of source_char, Oracle Database converts pattern to the datatype of source_char. For a listing of the operators you can specify in pattern, please refer to Appendix C, "Oracle Regular Expression Support".

 

·        position is a positive integer indicating the character of source_char where Oracle should begin the search. The default is 1, meaning that Oracle begins the search at the first character of source_char.

 

·        occurrence is a positive integer indicating which occurrence of pattern in source_char Oracle should search for. The default is 1, meaning that Oracle searches for the first occurrence of pattern.

 

·        return_option lets you specify what Oracle should return in relation to the occurrence:

o     If you specify 0, then Oracle returns the position of the first character of the occurrence. This is the default.

o     If you specify 1, then Oracle returns the position of the character following the occurrence.

·        match_parameter is a text literal that lets you change the default matching behavior of the function. You can specify one or more of the following values for match_parameter:

o     'i' specifies case-insensitive matching.

o     'c' specifies case-sensitive matching.

o     'n' allows the period (.), which is the match-any-character character, to match the newline character. If you omit this parameter, the period does not match the newline character.

o     'm' treats the source string as multiple lines. Oracle interprets ^ and $ as the start and end, respectively, of any line anywhere in the source string, rather than only at the start or end of the entire source string. If you omit this parameter, Oracle treats the source string as a single line.

o     'x' ignores whitespace characters. By default, whitespace characters match themselves.

1. REGEXP_COUNT(<source_string>, <pattern>[[, <start_position>], [<match_parameter>]])

-- 从'123123123123123'串中查找'123'子串的个数,起始查找位置为1
select regexp_count('123123123123123', '123', 1, 'i') from dual;
5

2. REGEXP_INSTR(<source_string>, <pattern>

   [[, <start_position>][, <occurrence>][, <return_option>][, <match_parameter>][,<sub_expression>]])

-- 从'500 Oracle Parkway, Redwood Shores, CA'串中查找 至少包含一个非空格字符的子串的位置
-- 起始查找位置为1,查找匹配的第6个子串的位置
select regexp_instr('500 Oracle Parkway, Redwood Shores, CA', '[^ ]+', 1, 6) from dual;

→ 37-- 从'500 Oracle Parkway, Redwood Shores, CA'串中查找

-- 包含s或r或p字符,后面跟着6个字母的子串,不区分大小写('Parkway','Redwood'匹配)
-- 起始查找位置为3,查找匹配的第2个子串('Redwood'匹配)
-- 返回的位置是紧接着匹配子串后面的字符的位置('Redwood'后面空格的位置)

select regexp_instr('500 Oracle Parkway, Redwood Shores, CA',
                    '[s|r|p][[:alpha:]]{6}',
                    3,
                    2,
                    1,
                    'i')
  from dual;

→ 28

3. REGEXP_LIKE(<source_string>, <pattern>, <match_parameter>)

-- 从指定列中查询出含有2个连续空格的字段
select source_string
  from (select 'hello, one space!' as source_string from dual
        union all
        select 'hello, two space! ' as source_string from dual
        union all
        select 'hello, two sequential space!  ' as source_string from dual)
 where regexp_like(source_string, '[[:space:]]{2}');

 

-- 从指定列中查询出first_name列以'Ste'开头,'en'结尾,中间包含字母'v'或'ph'的字段
select first_name, last_name
  from (select 'Steven' as first_name, 'King' as last_name from dual
        union all
        select 'Steven' as first_name, 'Markle' as last_name from dual
        union all
        select 'Stephen' as first_name, 'Stiles' as last_name from dual)
 where regexp_like(first_name, '^Ste(v|ph)en$')
 order by first_name, last_name;

4. REGEXP_REPLACE(<source_string>, <pattern>, <replace_string>

   [, <position>[, <occurrence>[, <match_parameter>]]])

-- 将字符串中连续的多个空格替换成一个空格
select regexp_replace('500   Oracle     Parkway,    Redwood  Shores, CA',
                      '( ){2,}',
                      ' ')
  from dual;

 

-- 把'xxx.xxx.xxxx'格式的电话号码转换为'(xxx) xxx-xxxx'格式
with T1 as(
     select '515.123.4567' as phone_number from dual
     union all
     select '515.123.4568' as phone_number from dual
     union all
     select '(515) 123 4569' as phone_number from dual
     union all
     select '13207730591' as phone_number from dual)
select regexp_replace(phone_number,
                      '([[:digit:]]{3})\.([[:digit:]]{3})\.([[:digit:]]{4})',
                      '(\1) \2-\3')
  from T1;

 

-- 给字段的每个字符后面加上空格
with T2 as(
     select 'Argentina' as country_name from dual
     union all
     select 'Australia' as country_name from dual
     union all
     select 'Belgium' as country_name from dual
     union all
     select 'Brazil' as country_name from dual
     union all
     select 'Canada' as country_name from dual)
select regexp_replace(country_name,
                      '(.)',
                      '\1 ')
  from T2;

 

 

5. REGEXP_SUBSTR(<source_string>, <pattern>[, <position> [, <occurrence>[, <match_parameter>]]])

-- 截取字符串中两个','之间的子串,子串只能开头和结尾含有',',中间至少要含有一个非','字符   
select regexp_substr('500 Oracle Parkway, Redwood Shores, CA', ',[^,]+,')   
  from dual;   

  

-- 截取字符串中包含 'http://' 子串,后面跟着3-4个 '字母/数字 0-1个.',再跟着0-1个 '/' 的子串  
select regexp_substr('http://www.oracle.com/products',   
                     'http://([[:alnum:]]+\.?){3,4}/?')   
  from dual;   

     

with T3 as(select '1:3,4:6,8:10,3:4,7:6,11:12' as source_string from dual)   
-- 从source_string列中查找不含':'的子串,起始查找位置为1,返回第1个子串   
select regexp_substr(source_string,'[^:]+', 1, 1) from T3   
union all  
-- 从source_string列中查找不含':'的子串,起始查找位置为3,返回第2个子串 
select regexp_substr(source_string,'[^:]+', 3, 2) from T3;

posted @ 2013-01-14 11:26  全威儒  阅读(1617)  评论(0编辑  收藏  举报