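A while ago I needed to do some simple word segmentation in PHP, so I went looking and found httpcws (http://blog.s135.com/httpcws_v100/). Once installed it is very easy to use. Sample code follows: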

function fenci($text){
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    // Request the segmentation result with GET
    //$url = "http://127.0.0.1:1985?w=".urlencode(iconv("UTF-8","GB2312",$text));
    //echo "<p>url:$url</p>";
    //curl_setopt($ch, CURLOPT_URL, $url);

    // Request the segmentation result with POST
    curl_setopt($ch, CURLOPT_URL, "http://127.0.0.1:1985");
    curl_setopt($ch, CURLOPT_POSTFIELDS, urlencode(iconv("UTF-8","GB2312",$text)));

    // Convert the GB2312 response back to UTF-8
    $result = iconv("GB2312","UTF-8",curl_exec($ch));
    curl_close($ch);

    // The words come back separated by spaces
    $wordArr = explode(' ', $result);
    return $wordArr;
}
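The last step of the function, splitting the response on spaces, can leave empty entries in the array if the response contains double spaces or a trailing newline (I am assuming that may happen; I have not checked every response). A minimal sketch of a more forgiving split, where `splitWords` is a hypothetical helper of my own and not part of httpcws:

```php
<?php
// Hypothetical helper (not part of httpcws): turn a whitespace-separated
// segmentation result into a clean list of words.
function splitWords(string $result): array {
    // Split on any run of whitespace and drop empty pieces.
    $parts = preg_split('/\s+/u', trim($result), -1, PREG_SPLIT_NO_EMPTY);
    // preg_split returns false if the input is not valid UTF-8.
    return $parts === false ? [] : $parts;
}

// The sample string stands in for a response already converted to UTF-8.
print_r(splitWords("今天 天气  不错 \n"));
```

It could be used in place of the plain `explode(' ', $result)` call once the response has been converted back to UTF-8.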

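The one thing to point out: when making the request you must convert the encoding of the Chinese text (UTF-8 to GB2312 here, and back again for the response); otherwise the result will very likely come back garbled.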
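To see why the conversion matters, it helps to look at the bytes that actually go over the wire. The sketch below assumes the script file itself is saved as UTF-8 and that the iconv extension is available; `encodeForHttpcws` and `decodeFromHttpcws` are hypothetical helper names of my own, not functions provided by httpcws.

```php
<?php
// Hypothetical helper: convert UTF-8 text to GB2312 and URL-encode it,
// the same way the request in this post is built.
function encodeForHttpcws(string $utf8Text): string {
    $gb = iconv("UTF-8", "GB2312", $utf8Text);
    if ($gb === false) {
        // iconv fails on characters that GB2312 cannot represent.
        throw new RuntimeException("text cannot be converted to GB2312");
    }
    return urlencode($gb);
}

// Hypothetical helper: convert a GB2312 response back to UTF-8.
function decodeFromHttpcws(string $gbText): string {
    $utf8 = iconv("GB2312", "UTF-8", $gbText);
    if ($utf8 === false) {
        throw new RuntimeException("response is not valid GB2312");
    }
    return $utf8;
}

$text = "中文";
echo urlencode($text), "\n";         // UTF-8 bytes:  %E4%B8%AD%E6%96%87
echo encodeForHttpcws($text), "\n";  // GB2312 bytes: %D6%D0%CE%C4
```

The same two characters produce completely different byte sequences in the two encodings, so text sent in the wrong encoding is read as different characters, and the reply looks garbled for the same reason if it is not converted back.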

 

 

posted on 2013-08-01 09:20  demin7926