Notes

Find the root of anything, clear what blocks it, and the difficulty resolves itself.

Pullword word segmentation tool

Posted on 2019-01-13 09:54 by 草妖
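    # Requires `import requests` at module level. This is a method of a class
    # that the post does not show.
    def get_response(self, txt):
        """Hot-word tool: send each non-empty line of the file `txt` to Pullword and return the words."""
        datas = []
        request_lists = []
        # Keep only the non-empty lines of the input file
        with open(txt, 'r', encoding='utf8') as f:
            for line in f:
                data_one = line.strip()
                if data_one:
                    datas.append(data_one)
        url = 'http://www.pullword.com/process.php'
        headers = {
            "Connection": "keep-alive",
            "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
        }
        for data in datas:
            form_data = {
                'source': data,
                'param1': 1,
                'param2': 0
            }
            try:
                response = requests.post(url, headers=headers, data=form_data, timeout=10)
            except requests.RequestException:
                print("Request failed for hot word {}...".format(data))
            else:
                content = response.text
                # Keep what follows 'SAMEORIGIN' when it is present; otherwise use
                # the whole body instead of raising IndexError
                _, found, tail = content.partition('SAMEORIGIN')
                content = (tail if found else content).strip()  # strip surrounding whitespace
                contents = [w for w in content.split('\r\n') if w]  # list of words, empty entries dropped
                request_lists.extend(contents)  # merge into the overall result
        return request_lists  # return the list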
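
The response handling in the method above can be pulled out into a small pure function, which makes it possible to check the parsing without calling the service. This is only a minimal sketch: the name `parse_pullword_body` is hypothetical and not part of the original post, and the sample body is made up for illustration. It assumes, as the method implies, that the body is a list of words separated by line breaks, possibly preceded by text ending in `SAMEORIGIN`.

```python
def parse_pullword_body(body, marker="SAMEORIGIN"):
    """Return the non-empty words found in a response body.

    Hypothetical helper: if `marker` occurs, only the text after its first
    occurrence is parsed; otherwise the whole body is parsed.
    """
    _, found, tail = body.partition(marker)
    payload = tail if found else body
    # splitlines() handles both '\r\n' and '\n'; blank lines are dropped
    return [word.strip() for word in payload.splitlines() if word.strip()]


# Made-up sample body, for illustration only
sample = "X-Frame-Options: SAMEORIGIN\r\n\r\n分词\r\n工具\r\n"
print(parse_pullword_body(sample))  # prints ['分词', '工具']
```

Splitting the parsing from the HTTP call keeps a missing marker from turning into an exception inside the request loop, and the helper can be reused if the response format turns out to differ from what is assumed here.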