# 组合数据类型练习，英文词频统计实例

1. 列表实例：由字符串创建一个作业评分列表，做增删改查询统计遍历操作。例如，查询第一个3分的下标，统计1分的同学有多少个，3分的同学有多少个等。
s=list('132132123131')
print('作业评分列表',s)
s.append('3')
print('增加',s)
s.pop()
print('删除最后一个',s)
s[3]='3'
print('将第三个改为3',s)
print('第一次出现3分的下标',s.index('3'))
print('1分的同学有：',s.count('1'))
print('3分的同学有：',s.count('3'))
2. 字典实例：建立学生学号成绩字典，做增删改查遍历操作。
x={'201720':'88','201721':'90','201722':'78'}
print(x)
x['201723']=99
print('增加201723学号成绩',x)
x.pop('201720')
x['201722']=82
print('修改201722学号成绩',x['201722'])
print('查找201733学号成绩',x.get('201733','无数据'))
print(x.items())
3. 列表，元组，字典，集合的遍历。总结列表，元组，字典，集合的联系与区别。

l=list('1233332121')
t=tuple('654321123456')
x={'201720':'88','201721':'90','201722':'78'}
s=set('123321332122')
print("列表：",l)
for i in l:
print(i,end=' ')
print("\n")
print("元组：",t)
for i in t:
print(i,end=' ')
print("\n")
print("字典：",x)
for i in x:
print(i,end='\t')
print(x[i],end='\n')
print("集合：",)
for i in s:
print(i,end=' ')

4.英文词频统计实例

1. 待分析字符串
2. 分解提取单词
1. 大小写 txt.lower()
2. 分隔符'.,:;?!-_’
3. 计数字典
1. 排除语法型词汇，代词、冠词、连词

4. 排序list.sort()

5.输出TOP(10)

news='''The committee has published two books: The Contemporary and Modern History of Three East Asian Countries in 2005 and A Modern History of East Asia Beyond The Boundaries in 2012.
Committee members attending a history seminar in Nanjing, east China's Jiangsu province, told Xinhua Sunday that work on a third book has begun and is expected to be completed in 2020.
Li Xizhu, a fellow of the Institute of Modern History, Chinese Academy of Social Sciences, said scholars from the three countries have reached consensus on the focus of the book.
"It is to address the differences in how we, the three countries, see history and to respond to the current debate on historical issues," Li said.
Ueyama Yurika, a Japanese member, said the committee will create contents in line with education practice in each country's context so that the textbooks can be used more widely.
Scholars agree that a correct perception of history is the foundation for reconciliation in East Asia.
Japanese scholar Kasahara Tokushi said history textbooks in Japan contain fewer and increasingly more obscure contents on the 1937 Nanjing Massacre.'''
for i in ''',.?!"''':
news=news.replace(i,' ')

words=news.split(' ')
print (words)
d={}
keys = set(words)
for i in keys:
d[i]=words.count(i)
print(d)
print(d.values())
l=list(d.values())
l.sort()
print(l)

5.文本操作

fo=open('/Users/Administrator/Desktop/test.txt','r')
fo.close()
exc={'the','a','to','of','and','on','in','that'}
news =news.lower()
for i in ''',.?!"''':
news=news.replace(i,' ')

print(news)
words=news.split(' ')
print(words)
d={}
keys = set(words)
for r in exc:
keys.remove(r)
for i in keys:
d[i]=words.count(i)
wc=list(d.items())
wc.sort(key=lambda x:x[1],reverse=True)
for i in range(10):
print(wc[i])

posted on 2017-09-21 15:53  吴林鸿  阅读(115)  评论(0编辑  收藏  举报