Extracting HTTP addresses from a text file with an awk script

Today I wrote an awk script that extracts HTTP addresses from text. It assumes an address contains only digits, letters, slashes, and dots. The script is as follows:

 

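#! /bin/awk -f

{
    httpIndex = index($0, "http://")
    if (httpIndex > 0)
    {
        # First pass: locate the address in the whole line.
        match($0, /http:\/\/[[:alnum:].\/]+/)
        # substr() takes a length as its third argument, so this call can
        # return extra characters after the address when RSTART > 1.
        httpstr = substr($0, RSTART, RSTART + RLENGTH - 1)
        # Second pass: the address now starts at position 1 of httpstr.
        match(httpstr, /http:\/\/[[:alnum:].\/]+/)                # line 8
        httpstr = substr(httpstr, RSTART, RSTART + RLENGTH - 1)   # line 9
        print httpstr
    }
}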

 

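I tested the script on a text file of 1,000 lines and found that the code on line 8 and line 9 is needed; without it, some addresses are followed by stray characters such as a space. The reason is that the third argument of substr() is a length, not an end position, so RSTART + RLENGTH - 1 is too long whenever the address does not start at the beginning of the line. The second pass trims the excess because the address starts at position 1 of the extracted string.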
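
Since match() already reports the length of the match in RLENGTH, the address can also be cut out in a single pass with substr($0, RSTART, RLENGTH). The following is a minimal sketch, assuming the same character rule as the script above. The shell wrapper and the function name extract_http are hypothetical and are only there so the awk program can be run from a pipe. It prints only the first address on each line.

```shell
# extract_http: print the first http:// address found on each input line.
# The function name is hypothetical; the character class follows the rule
# stated in the post (letters, digits, dots, slashes).
extract_http() {
  awk '
    # match() sets RSTART (start position) and RLENGTH (match length).
    match($0, /http:\/\/[[:alnum:].\/]+/) {
      # The third argument of substr() is a length, so RLENGTH is used directly.
      print substr($0, RSTART, RLENGTH)
    }
  ' "$@"
}

printf '%s\n' 'see http://example.com/a/b.html for details' | extract_http
# prints: http://example.com/a/b.html
```

Using match() as the pattern also skips lines where "http://" is not followed by any allowed character, so no empty lines are printed for them.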

          

 

posted @ 2014-09-25 22:56  ordi