Regex in MATLAB
- regexprep: finds and replaces text in a string.
regexp
Definition:
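Searches a string for text that matches a regular expression; matching is case-sensitive.
- startIndex = regexp(str,expression)
Returns the starting index of each substring of str that matches the character pattern specified by the regular expression. If there are no matches, startIndex is an empty array.
- [startIndex,endIndex] = regexp(str,expression)
Returns both the starting and ending indices of each match.
- out = regexp(str,expression,outkey)
Returns the output specified by outkey. For example, if outkey is 'match', regexp returns the substrings that match the expression instead of their starting indices.
- [out1,...,outN] = regexp(str,expression,outkey1,...,outkeyN)
Returns several outputs at once, one per outkey, in the order the keywords are given.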
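A minimal sketch of the two-output and multi-outkey forms listed above. The input string is an illustrative assumption, not taken from the original notes.

```matlab
% Illustrative input (an assumption, not from the original text).
str = 'cat coat cut';
expression = 'c[aeiou]+t';

% Two-output form: start and end index of every match.
[startIndex, endIndex] = regexp(str, expression);

% Slice each matched substring out of str using the index pairs.
for k = 1:numel(startIndex)
    fprintf('%d-%d: %s\n', startIndex(k), endIndex(k), ...
        str(startIndex(k):endIndex(k)));
end

% Several outkeys at once: outputs follow the order of the keywords.
[matches, starts] = regexp(str, expression, 'match', 'start');
```

Here startIndex should be [1 5 10] and endIndex [3 8 12], so each index pair brackets one of 'cat', 'coat', 'cut'.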
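outkey:
- 'start': starting indices of all matches;
- 'end': ending indices of all matches;
- 'tokenExtents': starting and ending indices of all tokens;
- 'match': text of each substring that matches the expression;
- 'tokens': text of each captured token;
- 'names': name and text of each named token;
- 'split': text of the non-matching substrings of str, that is, the pieces separated by the matches of expression.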
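To make the difference between 'tokens' and 'tokenExtents' concrete, here is a small sketch. The key=value input and the variable names tok and ext are assumptions chosen for illustration.

```matlab
% Illustrative input: one match containing two capture groups.
str = 'key=value';
expression = '(\w+)=(\w+)';

[tok, ext] = regexp(str, expression, 'tokens', 'tokenExtents');

% tok{1} holds the token text for the first (and only) match.
disp(tok{1});   % 1x2 cell: 'key' and 'value'

% ext{1} holds one [start end] row per token of that match.
disp(ext{1});   % [1 3; 5 9]
```

Each match gets its own cell, so with several matches tok and ext grow in parallel, and row k of ext{m} locates tok{m}{k} inside str.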
examples:
Basic index matching
str = 'bat cat can car coat court CUT ct CAT-scan';
expression = 'c[aeiou]+t';
startIndex = regexp(str,expression)
startIndex = 1×2
5 17
Matching several strings at once
str = {'Madrid, Spain','Romeo and Juliet','MATLAB is great'};
capExpr = '[A-Z]';
capStartIndex = regexp(str,capExpr);
celldisp(capStartIndex)
capStartIndex{1} =
1 9
capStartIndex{2} =
1 11
capStartIndex{3} =
1 2 3 4 5 6
Returning the matched text ('match')
str = 'EXTRA! The regexp function helps you relax.';
expression = '\w*x\w*';
matchStr = regexp(str,expression,'match');
celldisp(matchStr)
matchStr{1} =
regexp
matchStr{2} =
relax
Non-matching text ('split')
str = 'She sells sea shells by the seashore.';
expression = '[Ss]h.';
[match,noMatch] = regexp(str,expression,'match','split')
match = 1×3 cell array
{'She'} {'she'} {'sho'}
combinedStr = strjoin(noMatch,match)
combinedStr = 'She sells sea shells by the seashore.'
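The round trip above works because 'split' always returns one more piece than 'match' returns matches. A short sketch on an assumed input, with the hypothetical names seps and parts:

```matlab
% Illustrative input; note the double space after 'one'.
str = 'one  two three';

% 'match' gives the separators, 'split' gives the text between them.
[seps, parts] = regexp(str, '\s+', 'match', 'split');

% parts has numel(seps) + 1 elements, so strjoin can interleave them.
rebuilt = strjoin(parts, seps);
disp(isequal(rebuilt, str));   % displays 1 (true)
```

When strjoin receives a cell array as its delimiter, it expects exactly one fewer delimiter than pieces, which is what the 'match' and 'split' pair provides.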
Capturing HTML tags
str = '<title>My Title</title><p>Here is some text.</p>';
expression = '<(\w+).*>.*</\1>';
[tokens,matches] = regexp(str,expression,'tokens','match');
celldisp(tokens)
celldisp(matches)
tokens{1}{1} =
title
tokens{2}{1} =
p
matches{1} =
<title>My Title</title>
matches{2} =
<p>Here is some text.</p>
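Enclosing \w+ in parentheses captures the name of the HTML tag in a token. The \1 in the closing tag is a backreference: it must match the same text the first token captured, so each closing tag has to agree with its opening tag.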
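As a sketch of what the backreference buys you, the assumed input below contains one well-formed element and one whose closing tag does not agree with its opening tag. The simplified pattern (using [^<]* for the element body) is an illustrative variant, not the pattern from the example above.

```matlab
% Illustrative input: <b>...</b> is balanced, <i>...</b> is not.
str = '<b>bold</b><i>oops</b>';

% \1 must repeat whatever (\w+) captured in the opening tag.
expression = '<(\w+)>[^<]*</\1>';

[tokens, matches] = regexp(str, expression, 'tokens', 'match');

disp(matches);        % only '<b>bold</b>' matches
disp(tokens{1}{1});   % b
```

The second element is skipped because the token captured there is 'i' while the closing tag is 'b'.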
Assigning matches to named tokens ('names')
str = '01/11/2000 20-02-2020 03/30/2000 16-04-2020';
expression = ['(?<month>\d+)/(?<day>\d+)/(?<year>\d+)|'...
'(?<day>\d+)-(?<month>\d+)-(?<year>\d+)'];
tokenNames = regexp(str,expression,'names');
for k = 1:length(tokenNames)
disp(tokenNames(k))
end
month: '01'
day: '11'
year: '2000'
month: '02'
day: '20'
year: '2020'
month: '03'
day: '30'
year: '2000'
month: '04'
day: '16'
year: '2020'
(?<name>\d+) finds one or more numeric digits and assigns the result to the token indicated by name.
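Because 'names' returns a structure array with one element per match, the named fields can be collected across matches with ordinary struct indexing. A minimal sketch, with an assumed input and the hypothetical field names key and value:

```matlab
% Illustrative input: two key=value pairs.
str = 'width=640 height=480';
expression = '(?<key>[a-z]+)=(?<value>\d+)';

s = regexp(str, expression, 'names');   % 1x2 struct array

% Gather one field across all matches, then convert to numbers.
keys = {s.key};                   % {'width','height'}
values = str2double({s.value});   % [640 480]

fprintf('%s is %d\n', keys{1}, values(1));
```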
regexpi
Used the same way as regexp, but matching is case-insensitive.
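A brief comparison, reusing the string from the first regexp example. The variable names are assumptions for illustration; the 'ignorecase' option is shown as an alternative way to get the same behaviour from regexp.

```matlab
str = 'bat cat can car coat court CUT ct CAT-scan';
expression = 'c[aeiou]+t';

% Case-sensitive: the upper-case words are skipped.
csMatches = regexp(str, expression, 'match');    % {'cat','coat'}

% Case-insensitive: CUT and CAT are found as well.
ciMatches = regexpi(str, expression, 'match');   % {'cat','coat','CUT','CAT'}

% The same result from regexp with the 'ignorecase' option.
optMatches = regexp(str, expression, 'match', 'ignorecase');
```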
regexprep
- newStr = regexprep(str,expression,replace)
Replaces the text in str that matches expression with the text described by replace. The regexprep function returns the updated text in newStr.
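A small sketch of the basic call, along with two of regexprep's options ('once' and 'ignorecase'). The inputs and variable names are assumptions for illustration.

```matlab
% Illustrative input.
str = 'a1b22c333';

% Default: every match is replaced.
allReplaced = regexprep(str, '\d+', '#');           % 'a#b#c#'

% 'once': only the first match is replaced.
firstOnly = regexprep(str, '\d+', '#', 'once');     % 'a#b22c333'

% 'ignorecase': match regardless of letter case.
caseFolded = regexprep('Abc ABC abc', 'abc', 'x', 'ignorecase');   % 'x x x'
```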
examples
Replacing with a captured token
str = 'I walk up, they walked up, we are walking up.';
expression = 'walk(\w*?) up';
replace = 'ascend$1';
newStr = regexprep(str,expression,replace)
newStr = 'I ascend, they ascended, we are ascending.'
Calling a function in the replacement (this appears to work only in MATLAB; it does not apply in other tools such as Notepad++)
str = 'here are two sentences. neither is capitalized.';
expression = '(^|\.)\s*.';
replace = '${upper($0)}';
newStr = regexprep(str,expression,replace)
newStr = 'Here are two sentences. Neither is capitalized.'
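The replace expression calls the upper function on the current match ($0). Here the match is the optional period, any whitespace after it, and the first character of the sentence, so only that letter changes.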
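The same ${...} mechanism can call other functions on the matched text. Two short sketches on assumed inputs: one capitalises the first letter of every word, the other does arithmetic on each number found.

```matlab
% \< anchors at the start of a word, so $0 is each word's first letter.
titled = regexprep('hello big world', '\<\w', '${upper($0)}');
disp(titled);    % Hello Big World

% $0 arrives as text: convert it, compute, and convert back to text.
doubled = regexprep('3 apples and 10 pears', '\d+', ...
    '${num2str(2*str2double($0))}');
disp(doubled);   % 6 apples and 20 pears
```

The command inside ${...} has to return text, which is why the numeric result is passed through num2str.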
