Python基本数据类型-set(集合)
集合(set)是由一个或数个形态各异的大小整体组成的,构成集合的事物或对象称作元素或是成员。
基本功能是进行成员关系测试和删除重复元素。
由不同元素组成的集合,集合中是一组无序排列的可hash值,可以作为字典的key
可以使用大括号 { } 或者 set() 函数创建集合,注意:创建一个空集合必须用 set() 而不是 { },因为 { } 是用来创建一个空字典。
创建格式:
parame = {value01,value02,...}
或者
set(value)
作用:去重,关系运算
#定义:
知识点回顾
可变类型是不可hash类型
不可变类型是可hash类型
#定义集合:
集合:可以包含多个元素,用逗号分割,
集合的元素遵循三个原则:
1:每个元素必须是不可变类型(可hash,可作为字典的key)
2:没有重复的元素
3:无序
可变集合:不能用作字典的key,也不能做其他集合的元素
不可变集合:与上面相反
注意集合的目的是将不同的值存放到一起,不同的集合间用来做关系运算,无需纠结于集合中单个值 #优先掌握的操作: #1、长度len #2、成员运算in和not in #3、|合集 #4、&交集 #5、-差集 #6、^对称差集 #7、== #8、父集:>,>=
#9、子集:<,<=
s=set('abc') #只能创建单个字符串 s1={1,2,3,4} print(s) #{'a', 'c', 'b'} print(s1) #{1, 2, 3, 4} #不可变集合 fs=frozenset(s) print(fs) #frozenset({'a', 'c', 'b'}) #长度 print(len(s)) #3
''' 2、访问集合:由于集合本身是无序的,所以不能为集合创建索引或切片, 只能循环遍历或只用in、not in来访问或判断集合元素 ''' print(2 in s) print(4 in s) print(4 not in s) # 3、更新集合,add()、update()、remove() s.add('uuu') # 添加一个元素 print(s) s.update('wawa')#添加单个元素 s.update([12, 'll']) # 添加多个元素 print(s) s.remove(2) # 删除 s.pop()#随机删除 s.clear()#清空 del s#删除集合本身 print(s) '''4、集合类型操作符 in 、not in 集合等价与不等价(==、!=) 子集、超集 ''' print(set('tom') < set('tomyyyy')) print(set('tom') == set('tommmm')) # and和起来 print(set('tom') and set('tomx')) # or取交集 print(set('tom') or set('tomx')) ''' set重点 交集 并集 差集 ''' a = set([1, 2, 3, 4, 5]) b = set([4, 5, 6, 7, 8]) # intersection()交集 print(a.intersection(b)) print(a & b) # 并集union print(a.union(b)) print(a | b) # 差集difference print(a.difference(b)) # a里有b没有 print(b.difference(a)) # b里有a没有 print(a - b) # 对称差集 symmetric=对称,也叫反向交集 print(a.symmetric_difference(b)) print(a ^ b) # 父集、超集 print(a.issuperset(b)) print(a > b) # 子集 print(a.issubset(b)) print(a < b)
循环遍历集合
s1={1,2,3,4}
for j in s1:
print(j)
一.关系运算
有如下两个集合,pythons是报名python课程的学员名字集合,linuxs是报名linux课程的学员名字集合
pythons={'tom','miqi','yuanhao','wupeiqi','gangdan','biubiu'}
linuxs={'wupeiqi','tom','gangdan'}
1. 求出即报名python又报名linux课程的学员名字集合
2. 求出所有报名的学生名字集合
3. 求出只报名python课程的学员名字
4. 求出没有同时这两门课程的学员名字集合
# 有如下两个集合,pythons是报名python课程的学员名字集合,linuxs是报名linux课程的学员名字集合 pythons={'tom','miqi','yuanhao','wupeiqi','gangdan','biubiu'} linuxs={'wupeiqi','tom','gangdan'} # 求出即报名python又报名linux课程的学员名字集合 print(pythons & linuxs) # 求出所有报名的学生名字集合 print(pythons | linuxs) # 求出只报名python课程的学员名字 print(pythons - linuxs) # 求出没有同时这两门课程的学员名字集合 print(pythons ^ linuxs)
二.去重
1. 有列表l=['a','b',1,'a','a'],列表元素均为可hash类型,去重,得到新列表,且新列表无需保持列表原来的顺序
2.在上题的基础上,保存列表原来的顺序
3.去除文件中重复的行,肯定要保持文件内容的顺序不变
4.有如下列表,列表元素为不可hash类型,去重,得到新列表,且新列表一定要保持列表原来的顺序
l=[
{'name':'egon','age':18,'sex':'male'},
{'name':'alex','age':73,'sex':'male'},
{'name':'egon','age':20,'sex':'female'},
{'name':'egon','age':18,'sex':'male'},
{'name':'egon','age':18,'sex':'male'},
]
#去重,无需保持原来的顺序 l=['a','b',1,'a','a'] print(set(l)) #去重,并保持原来的顺序 #方法一:不用集合 l=[1,'a','b',1,'a'] l1=[] for i in l: if i not in l1: l1.append(i) print(l1) #方法二:借助集合 l1=[] s=set() for i in l: if i not in s: s.add(i) l1.append(i) print(l1) #同上方法二,去除文件中重复的行 import os with open('db.txt','r',encoding='utf-8') as read_f,\ open('.db.txt.swap','w',encoding='utf-8') as write_f: s=set() for line in read_f: if line not in s: s.add(line) write_f.write(line) os.remove('db.txt') os.rename('.db.txt.swap','db.txt') #列表中元素为可变类型时,去重,并且保持原来顺序 l=[ {'name':'egon','age':18,'sex':'male'}, {'name':'alex','age':73,'sex':'male'}, {'name':'egon','age':20,'sex':'female'}, {'name':'egon','age':18,'sex':'male'}, {'name':'egon','age':18,'sex':'male'}, ] # print(set(l)) #报错:unhashable type: 'dict' s=set() l1=[] for item in l: val=(item['name'],item['age'],item['sex']) if val not in s: s.add(val) l1.append(item) print(l1) #定义函数,既可以针对可以hash类型又可以针对不可hash类型 def func(items,key=None): s=set() for item in items: val=item if key is None else key(item) if val not in s: s.add(val) yield item print(list(func(l,key=lambda dic:(dic['name'],dic['age'],dic['sex']))))
三、set()函数用法,查看用
class set(object): """ set() -> new empty set object set(iterable) -> new set object Build an unordered collection of unique elements. """ def add(self, *args, **kwargs): # real signature unknown """ Add an element to a set. This has no effect if the element is already present. """ pass def clear(self, *args, **kwargs): # real signature unknown """ Remove all elements from this set. """ pass def copy(self, *args, **kwargs): # real signature unknown """ Return a shallow copy of a set. """ pass def difference(self, *args, **kwargs): # real signature unknown """ 相当于s1-s2 Return the difference of two or more sets as a new set. (i.e. all elements that are in this set but not the others.) """ pass def difference_update(self, *args, **kwargs): # real signature unknown """ Remove all elements of another set from this set. """ pass def discard(self, *args, **kwargs): # real signature unknown """ 与remove功能相同,删除元素不存在时不会抛出异常 Remove an element from a set if it is a member. If the element is not a member, do nothing. """ pass def intersection(self, *args, **kwargs): # real signature unknown """ 相当于s1&s2 Return the intersection of two sets as a new set. (i.e. all elements that are in both sets.) """ pass def intersection_update(self, *args, **kwargs): # real signature unknown """ Update a set with the intersection of itself and another. """ pass def isdisjoint(self, *args, **kwargs): # real signature unknown """ Return True if two sets have a null intersection. """ pass def issubset(self, *args, **kwargs): # real signature unknown """ 相当于s1<=s2 Report whether another set contains this set. """ pass def issuperset(self, *args, **kwargs): # real signature unknown """ 相当于s1>=s2 Report whether this set contains another set. """ pass def pop(self, *args, **kwargs): # real signature unknown """ Remove and return an arbitrary set element. Raises KeyError if the set is empty. """ pass def remove(self, *args, **kwargs): # real signature unknown """ Remove an element from a set; it must be a member. If the element is not a member, raise a KeyError. """ pass def symmetric_difference(self, *args, **kwargs): # real signature unknown """ 相当于s1^s2 Return the symmetric difference of two sets as a new set. (i.e. all elements that are in exactly one of the sets.) """ pass def symmetric_difference_update(self, *args, **kwargs): # real signature unknown """ Update a set with the symmetric difference of itself and another. """ pass def union(self, *args, **kwargs): # real signature unknown """ 相当于s1|s2 Return the union of sets as a new set. (i.e. all elements that are in either set.) """ pass def update(self, *args, **kwargs): # real signature unknown """ Update a set with the union of itself and others. """ pass def __and__(self, *args, **kwargs): # real signature unknown """ Return self&value. """ pass def __contains__(self, y): # real signature unknown; restored from __doc__ """ x.__contains__(y) <==> y in x. """ pass def __eq__(self, *args, **kwargs): # real signature unknown """ Return self==value. """ pass def __getattribute__(self, *args, **kwargs): # real signature unknown """ Return getattr(self, name). """ pass def __ge__(self, *args, **kwargs): # real signature unknown """ Return self>=value. """ pass def __gt__(self, *args, **kwargs): # real signature unknown """ Return self>value. """ pass def __iand__(self, *args, **kwargs): # real signature unknown """ Return self&=value. """ pass def __init__(self, seq=()): # known special case of set.__init__ """ set() -> new empty set object set(iterable) -> new set object Build an unordered collection of unique elements. # (copied from class doc) """ pass def __ior__(self, *args, **kwargs): # real signature unknown """ Return self|=value. """ pass def __isub__(self, *args, **kwargs): # real signature unknown """ Return self-=value. """ pass def __iter__(self, *args, **kwargs): # real signature unknown """ Implement iter(self). """ pass def __ixor__(self, *args, **kwargs): # real signature unknown """ Return self^=value. """ pass def __len__(self, *args, **kwargs): # real signature unknown """ Return len(self). """ pass def __le__(self, *args, **kwargs): # real signature unknown """ Return self<=value. """ pass def __lt__(self, *args, **kwargs): # real signature unknown """ Return self<value. """ pass @staticmethod # known case of __new__ def __new__(*args, **kwargs): # real signature unknown """ Create and return a new object. See help(type) for accurate signature. """ pass def __ne__(self, *args, **kwargs): # real signature unknown """ Return self!=value. """ pass def __or__(self, *args, **kwargs): # real signature unknown """ Return self|value. """ pass def __rand__(self, *args, **kwargs): # real signature unknown """ Return value&self. """ pass def __reduce__(self, *args, **kwargs): # real signature unknown """ Return state information for pickling. """ pass def __repr__(self, *args, **kwargs): # real signature unknown """ Return repr(self). """ pass def __ror__(self, *args, **kwargs): # real signature unknown """ Return value|self. """ pass def __rsub__(self, *args, **kwargs): # real signature unknown """ Return value-self. """ pass def __rxor__(self, *args, **kwargs): # real signature unknown """ Return value^self. """ pass def __sizeof__(self): # real signature unknown; restored from __doc__ """ S.__sizeof__() -> size of S in memory, in bytes """ pass def __sub__(self, *args, **kwargs): # real signature unknown """ Return self-value. """ pass def __xor__(self, *args, **kwargs): # real signature unknown """ Return self^value. """ pass __hash__ = None
四、数据类型总结
按存储空间的占用分(从低到高
数字 字符串 集合:无序,即无序存索引相关信息 元组:有序,需要存索引相关信息,不可变 列表:有序,需要存索引相关信息,可变,需要处理数据的增删改 字典:无序,需要存key与value映射的相关信息,可变,需要处理数据的增删改
按存值个数区分
| 标量/原子类型 | 数字,字符串 |
| 容器类型 | 列表,元组,字典 |
按可变不可变区分
| 可变 | 列表,字典 |
| 不可变 | 数字,字符串,元组 |
按访问顺序区分
| 直接访问 | 数字 |
| 顺序访问(序列类型) | 字符串,列表,元组 |
| key值访问(映射类型) | 字典 |
实例
#!/usr/bin/python3 student = {'Tom', 'Jim', 'Mary', 'Tom', 'Jack', 'Rose'} print(student) # 输出集合,重复的元素被自动去掉 # 成员测试 if 'Rose' in student : print('Rose 在集合中') else : print('Rose 不在集合中') # set可以进行集合运算 a = set('abracadabra') b = set('alacazam') print(a) print(a - b) # a 和 b 的差集 print(a | b) # a 和 b 的并集 print(a & b) # a 和 b 的交集 print(a ^ b) # a 和 b 中不同时存在的元素
以上实例输出结果:
{'Mary', 'Jim', 'Rose', 'Jack', 'Tom'}
Rose 在集合中
{'b', 'a', 'c', 'r', 'd'}
{'b', 'd', 'r'}
{'l', 'r', 'a', 'c', 'z', 'm', 'b', 'd'}
{'a', 'c'}
{'l', 'r', 'z', 'm', 'b', 'd'}

浙公网安备 33010602011771号