Composite data type exercises and an English word-frequency counting example (Part 1)
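1. Dictionary example: build a dictionary that maps student IDs to scores, then add, delete, update, look up, and iterate over its entries.
# Build the dictionary of student IDs and scores
d={'01':99,'02':100,'03':97,'04':80,'05':77,'06':100}
print(d,'\n')
# Add an entry
d['07']=66
print('Score dictionary after adding:',d,'\n')
# Delete an entry
d.pop('04')
print('Score dictionary after deleting:',d,'\n')
# Update an entry
d['05']=80
print('Score dictionary after updating:',d,'\n')
# Look up an entry
print('Score of student 02:',d['02'],'\n')
# Iterate over the entries
for i in d:
    print('{}\t{}'.format(i,d[i]))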
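
The lookup `d['02']` and the deletion `d.pop('04')` above both raise `KeyError` if the student ID is not in the dictionary. A minimal sketch of the more forgiving variants follows; the dictionary `scores` and the missing ID `'09'` are made-up sample data for illustration, not part of the exercise.

```python
# Hypothetical sample data, mirroring the score dictionary above
scores = {'01': 99, '02': 100, '03': 97}

# get() returns the given default instead of raising KeyError
missing = scores.get('09', None)

# pop() with a default also avoids KeyError for an absent key
removed = scores.pop('09', None)

# items() yields (key, value) pairs, so no second lookup is needed
lines = ['{}\t{}'.format(sid, score) for sid, score in scores.items()]

print(missing, removed)
print('\n'.join(lines))
```

Whether a silent default is appropriate depends on the task: when a missing ID signals a data error, letting `KeyError` surface may be the better choice.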

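2. Iterating over lists, tuples, dictionaries, and sets.
Summarize how lists, tuples, dictionaries, and sets are related and how they differ.
# List
li=list('123123123113')
print('Iterating over the list:')
print(li)
for i in li:
    print(i)
# Tuple
tu=tuple('123123123113')
print('Iterating over the tuple:')
print(tu)
for i in tu:
    print(i)
# Dictionary
d={'01':99,'02':100,'03':97,'04':80,'05':77,'06':100}
print('Iterating over the dictionary:')
print(d)
for i in d:
    print(i,d[i])
# Set
s=set([1,2,3,1,2,3,1,2,3,1,1,3])
print('Iterating over the set:')
print(s)
for i in s:
    print(i)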

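List: an ordered sequence. Elements can be added and removed at any time, there is no fixed length, and the elements may be of different types.
Tuple: very similar to a list, but once it is created its elements cannot be reassigned, added, or removed.
Dictionary: stores data as key-value pairs. Keys must be hashable, which in practice means immutable objects such as strings, numbers, or tuples.
Set: holds no duplicate values, so iteration never yields the same value twice, and the elements are unordered.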
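
The differences summarized above can be checked directly. The following is a small sketch, not part of the original exercise; the variable names `tuple_mutable` and `list_key_ok` and the tuple key `('01', 'math')` are introduced here only for illustration.

```python
chars = '123123123113'

li = list(chars)    # ordered, mutable, keeps duplicates
tu = tuple(chars)   # ordered, immutable, keeps duplicates
s = set(chars)      # unordered, duplicates removed

li.append('4')      # a list can grow in place

try:
    tu[0] = '9'     # a tuple rejects item assignment
    tuple_mutable = True
except TypeError:
    tuple_mutable = False

# Dictionary keys must be hashable: a tuple works, a list does not
d = {('01', 'math'): 99}
try:
    d[['01', 'math']] = 99
    list_key_ok = True
except TypeError:
    list_key_ok = False

print(len(li), len(tu), sorted(s))
print(tuple_mutable, list_key_ok)
```

Sorting the set before printing gives a stable display, since a set itself makes no promise about iteration order.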
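3. English word-frequency counting example
1. The string to analyze
2. Split the text and extract the words
   1. Normalize case: txt.lower()
   2. Separators: '.,:;?!-_'
   3. Build the word list
3. Build the word-count dictionary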
text='''Tyler was born infected with HIV:his mother was also infected.From the very beginning of his life,he was dependent on medications to enable him to survive.When he was five,he had a tube surgically inserted in a vein in his chest.This tube was connected to a pump,which he carried in a small backpack on his back.Medications were hooked up to this pump and were continuously supplied through this tube to his bloodstream.At times,he also needed supplemented oxygen to support his breathing.'''
# Convert all uppercase letters to lowercase
# (the variable is named text so that it does not shadow the built-in str)
text=text.lower()
print('After converting to lowercase:'+text+'\n')
# Replace the other separators (,.?!:) with spaces
for i in ',.?!:':
    text=text.replace(i,' ')
print('After replacing the separators with spaces:'+text+'\n')
# Split into individual words; split() with no argument drops empty strings
words=text.split()
print('Split result:',words,'\n')
# Count how often each distinct word occurs
word_set=set(words)
dic={}
for i in word_set:
    dic[i]=words.count(i)
# Sort the (word, count) pairs by count, highest first
items=list(dic.items())
items.sort(key=lambda x:x[1],reverse=True)
print(items,'\n')
print('Top 10 words by frequency:')
for i in range(10):
    word,count=items[i]
    print('{}\t{}'.format(word,count))
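
The counting loop above rescans the whole word list once per distinct word. As an alternative sketch (not the method used in this exercise), the standard library's `collections.Counter` does the counting in a single pass and `re.findall` handles the splitting. The shortened sample text and the pattern `[a-z]+` are assumptions made for illustration; that pattern treats every non-letter, including apostrophes and digits, as a separator.

```python
import re
from collections import Counter

# Shortened sample text, used here only for illustration
text = 'Tyler was born infected with HIV:his mother was also infected.'

# Lowercase first, then extract runs of letters;
# anything that is not a-z acts as a separator
words = re.findall(r'[a-z]+', text.lower())

# Counter builds the word -> count mapping in one pass
counts = Counter(words)

# most_common(n) returns the n highest counts as (word, count) pairs
for word, count in counts.most_common(3):
    print('{}\t{}'.format(word, count))
```

`most_common` replaces both the manual sort and the `range(10)` loop, and it does not fail when the text has fewer distinct words than requested.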

