Python基本数据类型-set(集合)

集合(set)是由一个或数个形态各异的大小整体组成的,构成集合的事物或对象称作元素或是成员。

基本功能是进行成员关系测试和删除重复元素。

由不同元素组成的集合,集合中是一组无序排列的可hash值,可以作为字典的key

可以使用大括号 { } 或者 set() 函数创建集合,注意:创建一个空集合必须用 set() 而不是 { },因为 { } 是用来创建一个空字典。

创建格式:

parame = {value01,value02,...}
或者
set(value)

作用:去重,关系运算
#定义:
            知识点回顾
            可变类型是不可hash类型
            不可变类型是可hash类型

#定义集合:
            集合:可以包含多个元素,用逗号分割,
            集合的元素遵循三个原则:
             1:每个元素必须是不可变类型(可hash,可作为字典的key)
             2:没有重复的元素
             3:无序
可变集合:不能用作字典的key,也不能做其他集合的元素

 不可变集合:与上面相反

 
注意集合的目的是将不同的值存放到一起,不同的集合间用来做关系运算,无需纠结于集合中单个值
 

#优先掌握的操作:
#1、长度len
#2、成员运算in和not in
#3、|合集
#4、&交集
#5、-差集
#6、^对称差集
#7、==
#8、父集:>,>= 
#9、子集:<,<=
 
s=set('abc') #只能创建单个字符串
s1={1,2,3,4}
print(s) #{'a', 'c', 'b'}
print(s1) #{1, 2, 3, 4}

#不可变集合
fs=frozenset(s)
print(fs) #frozenset({'a', 'c', 'b'})


#长度
print(len(s)) #3
'''
2、访问集合:由于集合本身是无序的,所以不能为集合创建索引或切片,
只能循环遍历或只用in、not in来访问或判断集合元素
'''
print(2 in s)
print(4 in s)
print(4 not in s)

# 3、更新集合,add()、update()、remove()
s.add('uuu')  # 添加一个元素
print(s)
s.update('wawa')#添加单个元素
s.update([12, 'll'])  # 添加多个元素
print(s)
s.remove(2)  # 删除
s.pop()#随机删除
s.clear()#清空
del s#删除集合本身
print(s)

'''4、集合类型操作符
in 、not in
集合等价与不等价(==、!=)
子集、超集
'''
print(set('tom') < set('tomyyyy'))
print(set('tom') == set('tommmm'))

# and和起来
print(set('tom') and set('tomx'))

# or取交集
print(set('tom') or set('tomx'))

'''
set重点
交集
并集
差集
'''
a = set([1, 2, 3, 4, 5])
b = set([4, 5, 6, 7, 8])
# intersection()交集
print(a.intersection(b))
print(a & b)

# 并集union
print(a.union(b))
print(a | b)

# 差集difference
print(a.difference(b))  # a里有b没有
print(b.difference(a))  # b里有a没有
print(a - b)

# 对称差集 symmetric=对称,也叫反向交集
print(a.symmetric_difference(b))
print(a ^ b)

# 父集、超集
print(a.issuperset(b))
print(a > b)

# 子集
print(a.issubset(b))
print(a < b)

 


循环遍历集合
s1={1,2,3,4}
for j in s1:
    print(j)

 

一.关系运算
  有如下两个集合,pythons是报名python课程的学员名字集合,linuxs是报名linux课程的学员名字集合
  pythons={'tom','miqi','yuanhao','wupeiqi','gangdan','biubiu'}
  linuxs={'wupeiqi','tom','gangdan'}
  1. 求出即报名python又报名linux课程的学员名字集合
  2. 求出所有报名的学生名字集合
  3. 求出只报名python课程的学员名字
  4. 求出没有同时这两门课程的学员名字集合
# 有如下两个集合,pythons是报名python课程的学员名字集合,linuxs是报名linux课程的学员名字集合
pythons={'tom','miqi','yuanhao','wupeiqi','gangdan','biubiu'}
linuxs={'wupeiqi','tom','gangdan'}
# 求出即报名python又报名linux课程的学员名字集合
print(pythons & linuxs)
# 求出所有报名的学生名字集合
print(pythons | linuxs)
# 求出只报名python课程的学员名字
print(pythons - linuxs)
# 求出没有同时这两门课程的学员名字集合
print(pythons ^ linuxs)
 二.去重

   1. 有列表l=['a','b',1,'a','a'],列表元素均为可hash类型,去重,得到新列表,且新列表无需保持列表原来的顺序
   2.在上题的基础上,保存列表原来的顺序
   3.去除文件中重复的行,肯定要保持文件内容的顺序不变
   4.有如下列表,列表元素为不可hash类型,去重,得到新列表,且新列表一定要保持列表原来的顺序
l=[
    {'name':'egon','age':18,'sex':'male'},
    {'name':'alex','age':73,'sex':'male'},
    {'name':'egon','age':20,'sex':'female'},
    {'name':'egon','age':18,'sex':'male'},
    {'name':'egon','age':18,'sex':'male'},
]
#去重,无需保持原来的顺序
l=['a','b',1,'a','a']
print(set(l))

#去重,并保持原来的顺序
#方法一:不用集合
l=[1,'a','b',1,'a']

l1=[]
for i in l:
    if i not in l1:
        l1.append(i)
print(l1)
#方法二:借助集合
l1=[]
s=set()
for i in l:
    if i not in s:
        s.add(i)
        l1.append(i)

print(l1)


#同上方法二,去除文件中重复的行
import os
with open('db.txt','r',encoding='utf-8') as read_f,\
        open('.db.txt.swap','w',encoding='utf-8') as write_f:
    s=set()
    for line in read_f:
        if line not in s:
            s.add(line)
            write_f.write(line)
os.remove('db.txt')
os.rename('.db.txt.swap','db.txt')

#列表中元素为可变类型时,去重,并且保持原来顺序
l=[
    {'name':'egon','age':18,'sex':'male'},
    {'name':'alex','age':73,'sex':'male'},
    {'name':'egon','age':20,'sex':'female'},
    {'name':'egon','age':18,'sex':'male'},
    {'name':'egon','age':18,'sex':'male'},
]
# print(set(l)) #报错:unhashable type: 'dict'
s=set()
l1=[]
for item in l:
    val=(item['name'],item['age'],item['sex'])
    if val not in s:
        s.add(val)
        l1.append(item)

print(l1)


#定义函数,既可以针对可以hash类型又可以针对不可hash类型
def func(items,key=None):
    s=set()
    for item in items:
        val=item if key is None else key(item)
        if val not in s:
            s.add(val)
            yield item

print(list(func(l,key=lambda dic:(dic['name'],dic['age'],dic['sex']))))

三、set()函数用法,查看用
class set(object):
    """
    set() -> new empty set object
    set(iterable) -> new set object
    
    Build an unordered collection of unique elements.
    """
    def add(self, *args, **kwargs): # real signature unknown
        """
        Add an element to a set.
        
        This has no effect if the element is already present.
        """
        pass

    def clear(self, *args, **kwargs): # real signature unknown
        """ Remove all elements from this set. """
        pass

    def copy(self, *args, **kwargs): # real signature unknown
        """ Return a shallow copy of a set. """
        pass

    def difference(self, *args, **kwargs): # real signature unknown
        """
        相当于s1-s2
        
        Return the difference of two or more sets as a new set.
        
        (i.e. all elements that are in this set but not the others.)
        """
        pass

    def difference_update(self, *args, **kwargs): # real signature unknown
        """ Remove all elements of another set from this set. """
        pass

    def discard(self, *args, **kwargs): # real signature unknown
        """
        与remove功能相同,删除元素不存在时不会抛出异常
        
        Remove an element from a set if it is a member.
        
        If the element is not a member, do nothing.
        """
        pass

    def intersection(self, *args, **kwargs): # real signature unknown
        """
        相当于s1&s2
        
        Return the intersection of two sets as a new set.
        
        (i.e. all elements that are in both sets.)
        """
        pass

    def intersection_update(self, *args, **kwargs): # real signature unknown
        """ Update a set with the intersection of itself and another. """
        pass

    def isdisjoint(self, *args, **kwargs): # real signature unknown
        """ Return True if two sets have a null intersection. """
        pass

    def issubset(self, *args, **kwargs): # real signature unknown
        """ 
        相当于s1<=s2
        
        Report whether another set contains this set. """
        pass

    def issuperset(self, *args, **kwargs): # real signature unknown
        """
        相当于s1>=s2
        
         Report whether this set contains another set. """
        pass

    def pop(self, *args, **kwargs): # real signature unknown
        """
        Remove and return an arbitrary set element.
        Raises KeyError if the set is empty.
        """
        pass

    def remove(self, *args, **kwargs): # real signature unknown
        """
        Remove an element from a set; it must be a member.
        
        If the element is not a member, raise a KeyError.
        """
        pass

    def symmetric_difference(self, *args, **kwargs): # real signature unknown
        """
        相当于s1^s2
        
        Return the symmetric difference of two sets as a new set.
        
        (i.e. all elements that are in exactly one of the sets.)
        """
        pass

    def symmetric_difference_update(self, *args, **kwargs): # real signature unknown
        """ Update a set with the symmetric difference of itself and another. """
        pass

    def union(self, *args, **kwargs): # real signature unknown
        """
        相当于s1|s2
        
        Return the union of sets as a new set.
        
        (i.e. all elements that are in either set.)
        """
        pass

    def update(self, *args, **kwargs): # real signature unknown
        """ Update a set with the union of itself and others. """
        pass

    def __and__(self, *args, **kwargs): # real signature unknown
        """ Return self&value. """
        pass

    def __contains__(self, y): # real signature unknown; restored from __doc__
        """ x.__contains__(y) <==> y in x. """
        pass

    def __eq__(self, *args, **kwargs): # real signature unknown
        """ Return self==value. """
        pass

    def __getattribute__(self, *args, **kwargs): # real signature unknown
        """ Return getattr(self, name). """
        pass

    def __ge__(self, *args, **kwargs): # real signature unknown
        """ Return self>=value. """
        pass

    def __gt__(self, *args, **kwargs): # real signature unknown
        """ Return self>value. """
        pass

    def __iand__(self, *args, **kwargs): # real signature unknown
        """ Return self&=value. """
        pass

    def __init__(self, seq=()): # known special case of set.__init__
        """
        set() -> new empty set object
        set(iterable) -> new set object
        
        Build an unordered collection of unique elements.
        # (copied from class doc)
        """
        pass

    def __ior__(self, *args, **kwargs): # real signature unknown
        """ Return self|=value. """
        pass

    def __isub__(self, *args, **kwargs): # real signature unknown
        """ Return self-=value. """
        pass

    def __iter__(self, *args, **kwargs): # real signature unknown
        """ Implement iter(self). """
        pass

    def __ixor__(self, *args, **kwargs): # real signature unknown
        """ Return self^=value. """
        pass

    def __len__(self, *args, **kwargs): # real signature unknown
        """ Return len(self). """
        pass

    def __le__(self, *args, **kwargs): # real signature unknown
        """ Return self<=value. """
        pass

    def __lt__(self, *args, **kwargs): # real signature unknown
        """ Return self<value. """
        pass

    @staticmethod # known case of __new__
    def __new__(*args, **kwargs): # real signature unknown
        """ Create and return a new object.  See help(type) for accurate signature. """
        pass

    def __ne__(self, *args, **kwargs): # real signature unknown
        """ Return self!=value. """
        pass

    def __or__(self, *args, **kwargs): # real signature unknown
        """ Return self|value. """
        pass

    def __rand__(self, *args, **kwargs): # real signature unknown
        """ Return value&self. """
        pass

    def __reduce__(self, *args, **kwargs): # real signature unknown
        """ Return state information for pickling. """
        pass

    def __repr__(self, *args, **kwargs): # real signature unknown
        """ Return repr(self). """
        pass

    def __ror__(self, *args, **kwargs): # real signature unknown
        """ Return value|self. """
        pass

    def __rsub__(self, *args, **kwargs): # real signature unknown
        """ Return value-self. """
        pass

    def __rxor__(self, *args, **kwargs): # real signature unknown
        """ Return value^self. """
        pass

    def __sizeof__(self): # real signature unknown; restored from __doc__
        """ S.__sizeof__() -> size of S in memory, in bytes """
        pass

    def __sub__(self, *args, **kwargs): # real signature unknown
        """ Return self-value. """
        pass

    def __xor__(self, *args, **kwargs): # real signature unknown
        """ Return self^value. """
        pass

    __hash__ = None
View Code

 

 

四、数据类型总结

   按存储空间的占用分(从低到高

数字
字符串
集合:无序,即无序存索引相关信息
元组:有序,需要存索引相关信息,不可变
列表:有序,需要存索引相关信息,可变,需要处理数据的增删改
字典:无序,需要存key与value映射的相关信息,可变,需要处理数据的增删改

按存值个数区分

标量/原子类型 数字,字符串
容器类型 列表,元组,字典

 

 

按可变不可变区分

可变 列表,字典
不可变 数字,字符串,元组

 

 

按访问顺序区分

直接访问 数字
顺序访问(序列类型) 字符串,列表,元组
key值访问(映射类型) 字典









实例
#!/usr/bin/python3 student = {'Tom', 'Jim', 'Mary', 'Tom', 'Jack', 'Rose'} print(student) # 输出集合,重复的元素被自动去掉 # 成员测试 if 'Rose' in student : print('Rose 在集合中') else : print('Rose 不在集合中') # set可以进行集合运算 a = set('abracadabra') b = set('alacazam') print(a) print(a - b) # a 和 b 的差集 print(a | b) # a 和 b 的并集 print(a & b) # a 和 b 的交集 print(a ^ b) # a 和 b 中不同时存在的元素

以上实例输出结果:

{'Mary', 'Jim', 'Rose', 'Jack', 'Tom'}
Rose 在集合中
{'b', 'a', 'c', 'r', 'd'}
{'b', 'd', 'r'}
{'l', 'r', 'a', 'c', 'z', 'm', 'b', 'd'}
{'a', 'c'}
{'l', 'r', 'z', 'm', 'b', 'd'}
posted @ 2019-06-03 11:08  ilspring  阅读(171)  评论(0)    收藏  举报