Frequency analysis of Sina Weibo post elements (Word, Screen Name) with Python
CODE:
#!/usr/bin/python
# -*- coding: utf-8 -*-
'''
Created on 2014-7-9
@author: guaguastd
@name: weiboFrequencyAnalysis.py
'''
if __name__ == '__main__':
    # log in and get the API client used to access the Sina Weibo API
    from sinaWeiboLogin import sinaWeiboLogin
    sinaWeiboApi = sinaWeiboLogin()

    # import sinaWeibo
    from sinaWeibo import extractWeiboEntities

    # import sinaWeiboStatuses
    from sinaWeiboStatuses import publicTimeline

    # import sinaWeiboFrequency
    from sinaWeiboFrequency import weiboFrequencyAnalysis

    # get the 5 newest weibo posts from the public timeline
    weiboNum = 5
    statuses = publicTimeline(sinaWeiboApi, weiboNum)

    # split the posts into texts, screen names and words
    status_texts, screen_names, words = extractWeiboEntities(statuses)

    # print one frequency table per entity type
    for label, data in (('Word', words),
                        ('Screen Name', screen_names)):
        weiboFrequencyAnalysis(label, data, weiboNum)

RESULT:
+------------------------------------------+-------+
| Word                                     | Count |
+------------------------------------------+-------+
| http://t.cn/8snKY0S                      |   1   |
| [围观]CANNCI千姿百袋2014新款牛皮菱格女包 |   1   |
| 时尚潮流单肩包                           |   1   |
| 浪漫RI系「喜欢请赞                       |   1   |
| ✲✲✲✲✲✲                                   |   1   |
+------------------------------------------+-------+
+--------------------+-------+
| Screen Name        | Count |
+--------------------+-------+
| 马傻强             |   1   |
| 手机用户2360148561 |   1   |
| 潮流爆款搭V        |   1   |
| star爱上泡面猫     |   1   |
| 美容潮搭健康       |   1   |
+--------------------+-------+
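
The script above only calls weiboFrequencyAnalysis; the sinaWeiboFrequency module itself is not shown in this post. As a rough idea of what such a helper might do, here is a minimal sketch that counts items with collections.Counter and renders a two-column text table. The names frequency_rows and format_frequency_table are hypothetical and are not taken from the original module, which may well use a table library instead of hand-built strings.

```python
from collections import Counter


def frequency_rows(items, top_n):
    """Return the top_n most common (item, count) pairs (hypothetical helper)."""
    return Counter(items).most_common(top_n)


def format_frequency_table(label, items, top_n):
    """Render item frequencies as a bordered text table (hypothetical helper)."""
    rows = frequency_rows(items, top_n)
    # Widths are measured in characters, so wide CJK glyphs may look misaligned.
    item_width = max([len(label)] + [len(str(item)) for item, _ in rows])
    count_width = max([len("Count")] + [len(str(count)) for _, count in rows])
    border = "+-" + "-" * item_width + "-+-" + "-" * count_width + "-+"
    lines = [
        border,
        "| " + label.ljust(item_width) + " | " + "Count".rjust(count_width) + " |",
        border,
    ]
    for item, count in rows:
        lines.append(
            "| " + str(item).ljust(item_width)
            + " | " + str(count).rjust(count_width) + " |"
        )
    lines.append(border)
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data, not real Weibo output.
    sample = ["a", "b", "a", "c", "a", "b"]
    print(format_frequency_table("Word", sample, 2))
```

With only 5 posts from the public timeline, almost every word and screen name appears once, which is why every count in the result is 1; a larger weiboNum would be needed for the ranking to say much.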