A Regex Example for Matching Complex Expressions with Parentheses

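  In a project I needed to store rules in a database, and decided to use a regular expression to split each rule expression into clauses and classify them. A sample expression: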

Operation Mode(Operation Mode_2) (Approve CR1) equals Accept And ( Need Physical Access(Need Physical Access) (Create CR) does not equal Switch Site Or ( Need Physical Access(Need Physical Access) (Create CR) equals Switch Site And Region(Region) (Create CR) equals R10 ) ) And ( Change Type(ChangeType_CreateCR) (Create CR) belongs to PM / CM CR,BTS_Modernization,FDD/TDD,Cell_Upgrade Or Change Type(ChangeType_CreateCR) (Create CR) belongs to PAT and FAT and FAC_NSA,Health Check for TX,Survey_Visit NSA,Site Cleanup,Site Battery Test,BSC/RNC Software Upgrade,OTDR/Fusion,APN Creation/Modification,Firewall,Facility_Non Telecom_NSA Or Change Type(ChangeType_CreateCR) (Create CR) belongs to WiMAX MW Upgrade,RAN MW Upgrade,FDD and WIMAX,TDD and WIMAX,FDD/TDD and WIMAX,Change time_Not implemented CR Or Change Type(ChangeType_CreateCR) (Create CR) belongs to SiteLevel_QOS test_QOS implementation,RAN CRs,TXCore_NSA without physical change,TXCore_NSA including physical change,CoreCS_NSA without physical change,CorePS_NSA without physical change,CorePS_NSA including physical change,IP NSA_without physical change,IP NSA_including physical change,CoreCS_NSA including physical change Or Change Type(ChangeType_CreateCR) (Create CR) belongs to LAC / TAK Modification,Site Sharing,Rehoming Site,BSC/RNC_Unit check/replace,RAN LOS Checking,IP_Installation / Integration / Dismantling,Net Security Access,PAT/FAT/FAC_SA_SiteLevel,SiteLevel_installation_Change_DismantlingEquipment,SiteLevel_SoftwareUpgrade,SiteLevel_Anti Theft Belt Installation,Connecting External Fiber To Internal Fiber In Site,SiteLevel_PDB Cabling And Fuse Change,BSC/RNC_OSS Upgrade,TX Core_V LAN Adding,WiMAX_Core Level,BSC/RNC_NSA_installation / integration / dismantling Or Change Type(ChangeType_CreateCR) (Create CR) belongs to Site Level_Rerouting/Relocation,BSC/RNC_Redundancy Board Check,WiMAX_Site or TX Level including physical change,WiMAX_Site or TX Level without physical change,PM/CM_WiMAX SA Or Change 
Type(ChangeType_CreateCR) (Create CR) belongs to TX Core_OSS Upgrade,IP_OSS Upgrade,SiteLevel_Clean up_NSA,SiteLevel_Clean up_SA,TX Access_Without physical change,TX Access_Including physical change,SiteLevel_Software changes,BSC/RNC Software changes,SDM Or Change Type(ChangeType_CreateCR) (Create CR) belongs to BSS-BTS Level,Core-CS,Core-PS,TX-Core,IP-SA,WiMAX-BSS and TX,WiMAX-Core,Survey/Visit Including Tower Activity Or Change Type(ChangeType_CreateCR) (Create CR) belongs to Go_Live,Gi APN Creation/Modification,MW Performance,Sunshade Installation,BTS Halt Or Change Type(ChangeType_CreateCR) (Create CR) belongs to IP_RAN,TX_SA,Fiber,Gi Lan,PM_SiteLevel_SA Or Change Type(ChangeType_CreateCR) (Create CR) belongs to Optimization_High SA,Roaming ) 

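  The goal: split out the clauses that sit between the logical operators "And" and "Or", for example: Operation Mode(Operation Mode_2) (Approve CR1) equals Accept; Need Physical Access(Need Physical Access) (Create CR) does not equal Switch Site; and so on, so that they can be processed later.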

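  The idea behind the regex: each clause in the string above consists of a keyword such as "Operation Mode" + a parenthesized group + a second parenthesized group + a relation keyword such as equals + a list of items.

  So: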

  1. First, collect the keywords that open a clause, called K1: (Operation Mode|Need Physical Access|Change Type|Region)

  2. Next, collect the relation keywords such as equals, called K2: ( belongs to | equals | contains | does not equal | does not contain | does not belong to )

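  3. Next, write the pattern for the first of the two consecutive parenthesized groups, called S1. Its contents take a few forms, such as (string + space + string).

  The corresponding S1 is: ((\((\w* ?\w*)*\))|(\((\w*_\w*)*\))) Here (\w* ?\w*) matches the two forms (string + space + string) and (single string), and (\w* ?\w*)* repeats it to cover contents made of several such pieces, e.g. (Need Physical Access). The alternative (\w*_\w*)* is meant for the (string_string) form; note that \w already includes the underscore, so the first alternative accepts this form as well.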

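  4. Next, write the pattern for the second of the two consecutive parenthesized groups, called S2. Only one form occurs here: (string + space + string).

  This form is already covered by S1, so let S2 = S1.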
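
  As a quick check of the pieces so far, the following Python sketch concatenates K1, S1, S2 and K2 and matches the head of a clause. Python is an assumption here (the post does not name a regex engine), and match_head and S1_SIMPLE are hypothetical names. S1_SIMPLE is a shorter pattern that should accept the same parenthesized groups on these inputs, because \w already covers the underscore.

```python
import re

# Building blocks copied from steps 1-4 above.
K1 = r"(Operation Mode|Need Physical Access|Change Type|Region)"
K2 = (r"( belongs to | equals | contains | does not equal "
      r"| does not contain | does not belong to )")
S1 = r"((\((\w* ?\w*)*\))|(\((\w*_\w*)*\)))"
S2 = S1

# Hypothetical simplification (not in the original post): any run of
# word characters and spaces between a pair of parentheses.
S1_SIMPLE = r"\([\w ]*\)"

HEAD = re.compile(K1 + S1 + " " + S2 + K2)


def match_head(clause):
    """Return the 'keyword(...) (...) relation ' prefix of a clause, or None."""
    m = HEAD.match(clause)
    return m.group(0) if m else None


if __name__ == "__main__":
    clause = "Operation Mode(Operation Mode_2) (Approve CR1) equals Accept"
    print(repr(match_head(clause)))
    for group in ["(Need Physical Access)", "(Operation Mode_2)", "(Create CR)"]:
        # Both patterns should accept every parenthesized group in the sample.
        print(group, bool(re.fullmatch(S1, group)), bool(re.fullmatch(S1_SIMPLE, group)))
```

  What remains after the matched head is the value part, which is what step 5 deals with.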

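  5. Finally, capture the value that follows the relation keyword. In the sample, that value can be followed by "string And (", "string Or (", "And", or "Or". A negated-substring pattern, ((?!string to exclude).)*, handles this. One caveat: some values themselves contain And or Or, so the match should stop only at an And or Or that is followed by a substring matched by K1 (that is, the start of the next clause). The S3 expression is:
  S3:((?!( And \(|Or \(| \)| And (K1S1 S1)| Or (K1S1 S1))).)*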
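
  The behaviour of S3 can be tried in isolation. The sketch below is a minimal Python illustration (Python is an assumption, and build_s3, take_value and S3_SPACED are hypothetical names). It builds S3 from K1 and S1 as written above and applies it to a value that itself contains "And". It also shows one detail of the expression as written: the alternative Or \( has no leading space, so a value that ends right before " Or (" keeps a trailing space. S3_SPACED adds that space, which is a change from the original.

```python
import re

K1 = r"(Operation Mode|Need Physical Access|Change Type|Region)"
S1 = r"((\((\w* ?\w*)*\))|(\((\w*_\w*)*\)))"
HEAD = K1 + S1 + " " + S1  # how the next clause begins


def build_s3(or_paren):
    """Build ((?!stop).)* where 'stop' lists every way a value can end."""
    stop = (r"( And \(|" + or_paren + r"| \)"
            r"| And (" + HEAD + r")| Or (" + HEAD + r"))")
    return r"((?!" + stop + r").)*"


S3 = build_s3(r"Or \(")          # as written in the text
S3_SPACED = build_s3(r" Or \(")  # hypothetical tweak: leading space added


def take_value(text, pattern=S3):
    """Consume characters from the start of text until a stop condition."""
    return re.match(pattern, text).group(0)


if __name__ == "__main__":
    value = ("SiteLevel_PDB Cabling And Fuse Change,BSC/RNC_OSS Upgrade "
             "Or Change Type(ChangeType_CreateCR) (Create CR) belongs to Go_Live")
    # "And Fuse" is not followed by a K1 keyword, so it stays in the value.
    print(repr(take_value(value)))
    nested = "Switch Site Or ( Region(Region) (Create CR) equals R10 )"
    print(repr(take_value(nested)))             # trailing space kept
    print(repr(take_value(nested, S3_SPACED)))  # trailing space dropped
```

  Stripping the captured value afterwards would be another way to deal with the trailing space without touching the expression.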

  The final expression is therefore K1S1 S2 K2 S3,

  that is: (Operation Mode|Need Physical Access|Change Type|Region)((\((\w* ?\w*)*\))|(\((\w*_\w*)*\))) ((\((\w* ?\w*)*\))|(\((\w*_\w*)*\)))( belongs to | equals | contains | does not equal | does not contain | does not belong to )((?!( And \(|Or \(| \)| And ((Operation Mode|Need Physical Access|Change Type|Region)((\((\w* ?\w*)*\))|(\((\w*_\w*)*\))) ((\((\w* ?\w*)*\))|(\((\w*_\w*)*\))))| Or ((Operation Mode|Need Physical Access|Change Type|Region)((\((\w* ?\w*)*\))|(\((\w*_\w*)*\))) ((\((\w* ?\w*)*\))|(\((\w*_\w*)*\)))))).)*
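
  To make the long expression easier to check, it can be assembled from its named parts and run over the start of the sample. The sketch below assumes Python's re module (the post does not say which engine was used); split_clauses and SAMPLE are hypothetical names, and SAMPLE is only the first part of the expression shown earlier. Each match is stripped, since a value that ends right before " Or (" is captured with a trailing space.

```python
import re

K1 = r"(Operation Mode|Need Physical Access|Change Type|Region)"
K2 = (r"( belongs to | equals | contains | does not equal "
      r"| does not contain | does not belong to )")
S1 = r"((\((\w* ?\w*)*\))|(\((\w*_\w*)*\)))"
S2 = S1
HEAD = K1 + S1 + " " + S1  # keyword plus the two parenthesized groups
S3 = (r"((?!( And \(|Or \(| \)"
      r"| And (" + HEAD + r")| Or (" + HEAD + r"))).)*")

# K1S1 S2 K2 S3, assembled instead of typed out in full.
CLAUSE = re.compile(K1 + S1 + " " + S2 + K2 + S3)

SAMPLE = (
    "Operation Mode(Operation Mode_2) (Approve CR1) equals Accept And ( "
    "Need Physical Access(Need Physical Access) (Create CR) does not equal Switch Site Or ( "
    "Need Physical Access(Need Physical Access) (Create CR) equals Switch Site And "
    "Region(Region) (Create CR) equals R10 ) )"
)


def split_clauses(expression):
    """Return every clause found in the expression, in order of appearance."""
    return [m.group(0).strip() for m in CLAUSE.finditer(expression)]


if __name__ == "__main__":
    for clause in split_clauses(SAMPLE):
        print(clause)
```

  Building the pattern from variables also means a new keyword only has to be added to K1 once, rather than in three places in the expanded expression.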

  Matching result: [screenshot not preserved]

Hope this is a useful reference.

 

posted on 2018-04-06 17:52 by Cultivate
