Converting an XML message into a delimiter-separated message with sed

The original XML text is as follows:

<?xml version="1.0" encoding="utf-8"?>
<Message>
  <Header>
    <Version>2000000</Version>
    <MessageClass>5</MessageClass>
    <MessageType>7</MessageType>
    <SenderId>9999999964020001</SenderId>
    <ReceiverId>9999999964011001</ReceiverId>
    <MessageId>3280260</MessageId>
  </Header>
  <Body ContentType="1">
    <ClearTargetDate>2017-03-22</ClearTargetDate>
    <ServiceProviderId>9999999934030001</ServiceProviderId>
    <IssuerId>9999999964011001</IssuerId>
    <MessageId>406843026</MessageId>
    <Count>1</Count>
    <Amount>110.00</Amount>
    <Transaction>
      <TransId>1</TransId>
      <Time>2017-03-21T20:40:36</Time>
      <Fee>110.00</Fee>
      <Service>
        <ServiceType>1</ServiceType>
        <Description>曹庄|宿州</Description>
        <Detail>1|04|3401|804|33|20170321 204036|03|3401|1105|1|20170321 182056</Detail>
      </Service>
      <ICCard>
        <CardType>22</CardType>
        <NetNo>6401</NetNo>
        <CardId>1638220100098530</CardId>
        <License>宁B63222</License>
        <TransNo>104</TransNo>
        <PreBalance>2157.60</PreBalance>
        <PostBalance>2047.60</PostBalance>
      </ICCard>
      <Validation>
        <TAC>9439DAD2</TAC>
        <TransType>09</TransType>
        <TerminalNo>0134000030BC</TerminalNo>
        <TerminalTransNo>0018002D</TerminalTransNo>
      </Validation>
      <OBU>
        <NetNo>C4FE</NetNo>
        <OBUId>0000000200031918</OBUId>
        <OBEState>0001</OBEState>
        <License>宁B63222</License>
      </OBU>
    </Transaction>
  </Body>
</Message>

 

The values inside the Transaction element now need to be converted into the following delimiter-separated format:

1|||2017-03-21T20:40:36|||110.00|||1|||曹庄|宿州|||1|04|3401|804|33|20170321204036|03|3401|1105|1|20170321182056||||||22|||6401|||1638220100098530|||宁B63222|||104|||2157.60|||2047.60||||||9439DAD2|||09|||0134000030BC|||0018002D||||||C4FE|||0000000200031918|||0001|||宁B63222|||

 

Here are the steps I followed.

1. Replace the newlines so that the whole XML file becomes a single line of text, and redirect the result to a file named 1

cat ***.xml | tr "\n" " " > 1

 

The result is as follows:

<?xml version="1.0" encoding="utf-8"?> <Message>   <Header>     <Version>2000000</Version>     <MessageClass>5</MessageClass>     <MessageType>7</MessageType>     <SenderId>9999999964020001</SenderId>     <ReceiverId>9999999964011001</ReceiverId>     <MessageId>3280260</MessageId>   </Header>   <Body ContentType="1">     <ClearTargetDate>2017-03-22</ClearTargetDate>     <ServiceProviderId>9999999934030001</ServiceProviderId>     <IssuerId>9999999964011001</IssuerId>     <MessageId>406843026</MessageId>     <Count>1</Count>     <Amount>110.00</Amount>     <Transaction>       <TransId>1</TransId>       <Time>2017-03-21T20:40:36</Time>       <Fee>110.00</Fee>       <Service>         <ServiceType>1</ServiceType>         <Description>曹庄|宿州</Description>         <Detail>1|04|3401|804|33|20170321 204036|03|3401|1105|1|20170321 182056</Detail>       </Service>       <ICCard>         <CardType>22</CardType>         <NetNo>6401</NetNo>         <CardId>1638220100098530</CardId>         <License>宁B63222</License>         <TransNo>104</TransNo>         <PreBalance>2157.60</PreBalance>         <PostBalance>2047.60</PostBalance>       </ICCard>       <Validation>         <TAC>9439DAD2</TAC>         <TransType>09</TransType>         <TerminalNo>0134000030BC</TerminalNo>         <TerminalTransNo>0018002D</TerminalTransNo>       </Validation>       <OBU>         <NetNo>C4FE</NetNo>         <OBUId>0000000200031918</OBUId>         <OBEState>0001</OBEState>         <License>宁B63222</License>       </OBU>     </Transaction>   </Body> </Message>
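
As a small illustration of step 1, the sketch below applies the same tr call to a three-line sample held in a shell variable. The sample and the variable names sample and flat are made up for illustration; reading a real file with input redirection (tr "\n" " " < file) should behave the same way as piping it through cat.

```shell
# Hypothetical three-line sample standing in for the real XML file.
sample='<Fee>
  <Amount>110.00</Amount>
</Fee>'

# tr replaces every newline with one space, so the text collapses to a single line.
# printf '%s\n' supplies the trailing newline that a normal text file would have.
flat=$(printf '%s\n' "$sample" | tr '\n' ' ')

# The brackets make the trailing space (from the final newline) visible.
# prints [<Fee>   <Amount>110.00</Amount> </Fee> ]
printf '[%s]\n' "$flat"
```

The indentation of each line survives as a run of spaces, which is why step 2 is needed afterwards.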

 

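2. Remove the spaces
Note that this deletes every space, including those inside tags and values: 20170321 204036 in Detail becomes 20170321204036, as in the target format above.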

sed 's/ //g' 1 > 2

 

The result is as follows:

<?xmlversion="1.0"encoding="utf-8"?><Message><Header><Version>2000000</Version><MessageClass>5</MessageClass><MessageType>7</MessageType><SenderId>9999999964020001</SenderId><ReceiverId>9999999964011001</ReceiverId><MessageId>3280260</MessageId></Header><BodyContentType="1"><ClearTargetDate>2017-03-22</ClearTargetDate><ServiceProviderId>9999999934030001</ServiceProviderId><IssuerId>9999999964011001</IssuerId><MessageId>406843026</MessageId><Count>1</Count><Amount>110.00</Amount><Transaction><TransId>1</TransId><Time>2017-03-21T20:40:36</Time><Fee>110.00</Fee><Service><ServiceType>1</ServiceType><Description>曹庄|宿州</Description><Detail>1|04|3401|804|33|20170321204036|03|3401|1105|1|20170321182056</Detail></Service><ICCard><CardType>22</CardType><NetNo>6401</NetNo><CardId>1638220100098530</CardId><License>宁B63222</License><TransNo>104</TransNo><PreBalance>2157.60</PreBalance><PostBalance>2047.60</PostBalance></ICCard><Validation><TAC>9439DAD2</TAC><TransType>09</TransType><TerminalNo>0134000030BC</TerminalNo><TerminalTransNo>0018002D</TerminalTransNo></Validation><OBU><NetNo>C4FE</NetNo><OBUId>0000000200031918</OBUId><OBEState>0001</OBEState><License>宁B63222</License></OBU></Transaction></Body></Message>
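
If the spaces inside values ever need to be kept, one possible variant is to delete only the spaces that sit between two tags. The sketch below compares both commands on a made-up one-line sample; the variable names are hypothetical. This variant is not what the steps in this post use, and it would leave the space in the Detail timestamps, so the final output would differ from the target format shown earlier.

```shell
# Hypothetical one-line sample, shaped like the output of step 1.
line='<Body ContentType="1">   <Detail>20170321 204036</Detail> </Body>'

# The command from step 2: deletes every space, inside tags and values too.
all_removed=$(printf '%s\n' "$line" | sed 's/ //g')

# Variant: delete only the runs of spaces between a ">" and the next "<".
between_only=$(printf '%s\n' "$line" | sed 's/> *</></g')

printf '%s\n' "$all_removed" "$between_only"
```

The first result reads <BodyContentType="1"><Detail>20170321204036</Detail></Body>, while the second keeps both the attribute spacing and the space inside the timestamp.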

 

3. Remove the unneeded XML at the head and tail, keeping only the contents of the Transaction element

sed 's/.*<Transaction>//g;s/<\/OBU>.*<\/Message>//g' 2 > 3

 

The result is as follows:

<TransId>1</TransId><Time>2017-03-21T20:40:36</Time><Fee>110.00</Fee><Service><ServiceType>1</ServiceType><Description>曹庄|宿州</Description><Detail>1|04|3401|804|33|20170321204036|03|3401|1105|1|20170321182056</Detail></Service><ICCard><CardType>22</CardType><NetNo>6401</NetNo><CardId>1638220100098530</CardId><License>宁B63222</License><TransNo>104</TransNo><PreBalance>2157.60</PreBalance><PostBalance>2047.60</PostBalance></ICCard><Validation><TAC>9439DAD2</TAC><TransType>09</TransType><TerminalNo>0134000030BC</TerminalNo><TerminalTransNo>0018002D</TerminalTransNo></Validation><OBU><NetNo>C4FE</NetNo><OBUId>0000000200031918</OBUId><OBEState>0001</OBEState><License>宁B63222</License>
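
The command in step 3 fits a message with a single Transaction element, as in this example. Because .* is greedy, a message with several Transaction elements would keep only the last one. The sketch below shows that effect on a made-up two-transaction line, along with one possible variant that first puts each transaction on its own line. The sample data and variable names are hypothetical, and the variant assumes, like the original command, that OBU is the last child of every Transaction.

```shell
# Hypothetical input: step 2 output containing two Transaction elements.
line='<Body><Count>2</Count><Transaction><TransId>1</TransId><OBU><NetNo>C4FE</NetNo></OBU></Transaction><Transaction><TransId>2</TransId><OBU><NetNo>C4FF</NetNo></OBU></Transaction></Body></Message>'

# Step 3 as written: the greedy ".*" runs up to the LAST <Transaction>.
last_only=$(printf '%s\n' "$line" |
  sed 's/.*<Transaction>//g;s/<\/OBU>.*<\/Message>//g')

# Variant: break the line after every </Transaction> (backslash followed by a
# literal newline in the replacement), then trim each resulting line and print
# only the lines that really held a transaction.
per_line=$(printf '%s\n' "$line" |
  sed 's/<\/Transaction>/&\
/g' |
  sed -n 's/.*<Transaction>//; s/<\/OBU><\/Transaction>$//p')

printf 'last only:\n%s\nper line:\n%s\n' "$last_only" "$per_line"
```

Steps 4 and 5 work line by line, so they could be applied unchanged to the per-line output.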

 

4. Replace each closing tag </***> with |||

sed 's/<\/[^>]*>/|||/g' 3 > 4

  

The result is as follows:

<TransId>1|||<Time>2017-03-21T20:40:36|||<Fee>110.00|||<Service><ServiceType>1|||<Description>曹庄|宿州|||<Detail>1|04|3401|804|33|20170321204036|03|3401|1105|1|20170321182056||||||<ICCard><CardType>22|||<NetNo>6401|||<CardId>1638220100098530|||<License>宁B63222|||<TransNo>104|||<PreBalance>2157.60|||<PostBalance>2047.60||||||<Validation><TAC>9439DAD2|||<TransType>09|||<TerminalNo>0134000030BC|||<TerminalTransNo>0018002D||||||<OBU><NetNo>C4FE|||<OBUId>0000000200031918|||<OBEState>0001|||<License>宁B63222|||
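
The order of the two tag substitutions matters: the pattern <[^>]*> used for opening tags also matches closing tags, so the closing tags have to be turned into ||| first. The sketch below shows both orders on a made-up fragment; the variable names are hypothetical.

```shell
# Hypothetical fragment shaped like the output of step 3.
frag='<TransId>1</TransId><Service><ServiceType>1</ServiceType></Service>'

# Closing tags become ||| first, then the remaining (opening) tags are removed.
right=$(printf '%s\n' "$frag" | sed 's/<\/[^>]*>/|||/g;s/<[^>]*>//g')

# Removing "<[^>]*>" straight away also eats the closing tags,
# so no delimiter is left between the values.
wrong=$(printf '%s\n' "$frag" | sed 's/<[^>]*>//g')

printf '%s\n' "$right" "$wrong"
```

Here right is 1|||1|||||| (the six bars come from </ServiceType> followed by </Service>), while wrong is just 11.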

 

5. Remove the opening tags <***>

sed 's/<[^>]*>//g' 4 > 5

 

The result is as follows:

1|||2017-03-21T20:40:36|||110.00|||1|||曹庄|宿州|||1|04|3401|804|33|20170321204036|03|3401|1105|1|20170321182056||||||22|||6401|||1638220100098530|||宁B63222|||104|||2157.60|||2047.60||||||9439DAD2|||09|||0134000030BC|||0018002D||||||C4FE|||0000000200031918|||0001|||宁B63222|||
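
To check a record like this, it can be split back into fields on the three-bar delimiter. The sketch below uses awk with a regular-expression field separator on a shortened, made-up record; A|B stands in for the Description value, and the variable names are hypothetical. It is only meant to show that single bars inside a value stay intact and that a run of six bars yields one empty field.

```shell
# Hypothetical shortened record in the same shape as the final output.
record='1|||2017-03-21T20:40:36|||110.00|||1|||A|B|||1|04|3401||||||22|||'

# -F takes a regular expression; [|][|][|] matches exactly three bars.
desc=$(printf '%s\n' "$record" | awk -F'[|][|][|]' '{ print $5 }')
detail=$(printf '%s\n' "$record" | awk -F'[|][|][|]' '{ print $6 }')

# NF counts the fields, including the empty one after the trailing delimiter.
nf=$(printf '%s\n' "$record" | awk -F'[|][|][|]' '{ print NF }')

printf 'desc=%s detail=%s fields=%s\n' "$desc" "$detail" "$nf"
```

Field 5 comes back as A|B and field 6 as 1|04|3401, which is presumably why a three-bar delimiter was chosen over a single bar.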

 

And with that, the job is done.

 

Putting all the commands together:

cat ***.xml | tr "\n" " " > 1
sed 's/ //g;s/.*<Transaction>//g;s/<\/OBU>.*<\/Message>//g;s/<\/[^>]*>/|||/g;s/<[^>]*>//g' 1 > 2
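
The same commands can also be chained in one pipeline, without the intermediate files 1 and 2. The sketch below wraps them in a shell function and runs it on a small made-up message written to a temporary file. The function name xml_to_delimited and the sample data are hypothetical, and the same single-Transaction assumption applies.

```shell
# xml_to_delimited is a hypothetical helper; $1 is the path of the XML file.
xml_to_delimited() {
  tr '\n' ' ' < "$1" |
    sed 's/ //g;s/.*<Transaction>//g;s/<\/OBU>.*<\/Message>//g;s/<\/[^>]*>/|||/g;s/<[^>]*>//g'
}

# Small made-up message with the same nesting as the real one.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<Message>
  <Body ContentType="1">
    <Transaction>
      <TransId>1</TransId>
      <Service>
        <ServiceType>1</ServiceType>
      </Service>
      <OBU>
        <NetNo>C4FE</NetNo>
      </OBU>
    </Transaction>
  </Body>
</Message>
EOF

result=$(xml_to_delimited "$tmp")
rm -f "$tmp"

# prints 1|||1||||||C4FE|||
printf '%s\n' "$result"
```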

 

posted @ 2017-04-28 14:39  MiaoCunFa